EFFECTIVE RETRIEVAL OF RELEVANT WEB DOCUMENTS: A QUERY EXPANSION APPROACH USING FORMAL CONCEPT ANALYSIS
DOI:
https://doi.org/10.33003/fjs-2025-0903-3242
Keywords:
Formal concept, Precision, Query expansion, Query, Recall, Relevant document, Vector transformation
Abstract
In information retrieval, mismatch between the vocabulary of search queries and the vocabulary of documents is a common challenge for web users, hindering information access and retrieval. This issue is often attributed to the ambiguous representation of user information needs as queries, which leads to the retrieval of many irrelevant documents, particularly for non-skilled web users. This paper aims to improve user query representation for effective retrieval of relevant documents from the web. To achieve this, a query expansion strategy was employed to identify terms with meanings similar to the user’s initial query term. The similarity between each expanded term and the initial query term was determined by calculating the cosine of the angle between the vector representing the document vocabulary and the vector representing the query term. Thereafter, Formal Concept Analysis (FCA) was employed to analyze and present the results. Findings from the analysis revealed that concatenating similar terms with the initial user query terms resulted in a 0.16% improvement in retrieval precision, from 0.64% with the initial query to 0.80% with the expanded query terms.
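The cosine-based selection of expansion terms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the term-by-document occurrence vectors, the `expand_query` helper, and the 0.7 similarity threshold are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def expand_query(query_term, term_vectors, threshold=0.7):
    """Append vocabulary terms whose cosine similarity to the query term
    meets the threshold (hypothetical helper; the threshold is assumed)."""
    q = term_vectors[query_term]
    scored = [
        (term, cosine(q, vec))
        for term, vec in term_vectors.items()
        if term != query_term
    ]
    # Most similar candidates first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [query_term] + [term for term, score in scored if score >= threshold]


# Toy data (assumed): each term is a vector of occurrences across four documents.
term_vectors = {
    "car": [1, 1, 0, 1],
    "automobile": [1, 1, 0, 0],
    "banana": [0, 0, 1, 0],
}

expanded = expand_query("car", term_vectors)
print(expanded)
```

In this toy vocabulary only "automobile" co-occurs with "car" often enough to pass the assumed threshold, so it is the only term concatenated to the initial query.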
FUDMA Journal of Sciences