EFFECTIVE RETRIEVAL OF RELEVANT WEB DOCUMENTS: A QUERY EXPANSION APPROACH USING FORMAL CONCEPT ANALYSIS

  • Abdullahi Bn Umar, Federal University of Education Kano
Keywords: Formal concept, Precision, Query expansion, Query, Recall, Relevant document, Vector transformation

Abstract

In information retrieval, mismatch between the vocabulary of search queries and the vocabulary of documents is a common challenge for web users and hinders access to information. The problem is often attributed to ambiguous representation of the user’s information need as a query, which leads to the retrieval of many irrelevant documents, particularly for non-skilled web users. This paper aims to improve the representation of user queries so that relevant documents are retrieved from the web more effectively. To achieve this, a query expansion strategy was employed to identify terms with meanings similar to the user’s initial query term. The similarity between each candidate expansion term and the initial query term was determined by calculating the cosine of the angle between the vector representing the document vocabulary and the vector representing the query term. Thereafter, Formal Concept Analysis (FCA) was employed to analyze and present the results. The analysis showed that concatenating similar terms with the initial query terms resulted in a 0.16% improvement, as evident in the retrieval precision of 0.64% with the initial query and 0.80% with the expanded query terms.
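The pipeline outlined in the abstract (cosine similarity between term vectors, expansion of the query with similar terms, and evaluation by precision) can be illustrated with a minimal Python sketch. The abstract does not give the vector transformation, the similarity threshold, or the data used, so the toy vectors, the 0.7 threshold, and the function names below are assumptions made purely for illustration and are not the author's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def expand_query(query_term, term_vectors, threshold=0.7):
    """Concatenate the query term with every other term that is similar enough.

    `threshold` is a hypothetical cut-off, not a value taken from the paper.
    """
    q_vec = term_vectors[query_term]
    similar = [
        term
        for term, vec in term_vectors.items()
        if term != query_term and cosine_similarity(q_vec, vec) >= threshold
    ]
    return [query_term] + sorted(similar)


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical toy data: each term is described by its occurrence
# in four documents (1 = the term occurs in that document).
term_vectors = {
    "car": [1, 1, 0, 0],
    "automobile": [1, 1, 1, 0],
    "banana": [0, 0, 0, 1],
}

expanded = expand_query("car", term_vectors)
print(expanded)

# Precision for a made-up result list: three of four retrieved documents are relevant.
print(precision(["d1", "d2", "d3", "d4"], {"d1", "d2", "d3"}))
```

In this toy setting "automobile" is kept because its vector points in nearly the same direction as that of "car", while "banana" shares no documents with it and is dropped. Comparing `precision` for the results of the initial and the expanded query is the kind of before-and-after measurement the abstract reports.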

Published
2025-03-31
How to Cite
Umar, A. B. (2025). EFFECTIVE RETRIEVAL OF RELEVANT WEB DOCUMENTS: A QUERY EXPANSION APPROACH USING FORMAL CONCEPT ANALYSIS. FUDMA JOURNAL OF SCIENCES, 9(3), 340-353. https://doi.org/10.33003/fjs-2025-0903-3242