Information Retrieval and Natural Language Processing
Main concepts and methods of information retrieval, applied to web-based projects. Also covers natural language processing pipelines and retrieval-augmented generation.
Offered at the University of Tirana, Faculty of Natural Sciences, Department of Informatics.
Overview
Information retrieval is concerned with providing access to information held in collections of documents. This course covers the main concepts and methods of information retrieval and applies them in concrete projects that retrieve information from the web. Because text-based information retrieval relies on natural language processing, those methods are covered as well: students build natural language processing pipelines on datasets collected from the web.
Learning Objectives
- 1. Introduction to the vector space model and cosine similarity
- 2. Introduction to indexes and retrieval
- 3. Learn to create and use NLP corpora
- 4. Introduction to recommender systems
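To give a feel for the first two objectives, here is a minimal sketch of ranking documents with cosine similarity in a vector space model, using a small inverted index to select candidates. It is an illustration under stated assumptions, not course material: the whitespace tokenizer, the raw term-frequency weighting, the toy documents, and every function name (`tokenize`, `cosine_similarity`, `build_inverted_index`, `search`) are hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(query, docs, index):
    """Rank documents that share at least one term with the query."""
    q_vec = Counter(tokenize(query))
    candidates = set()
    for term in q_vec:
        candidates |= index.get(term, set())
    scored = [
        (cosine_similarity(q_vec, Counter(tokenize(docs[d]))), d)
        for d in candidates
    ]
    # Highest score first; ties broken by document id for stable output.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Toy collection (assumed data, for illustration only).
docs = {
    "d1": "information retrieval on the web",
    "d2": "natural language processing pipelines",
    "d3": "web information and language",
}
index = build_inverted_index(docs)
results = search("web retrieval", docs, index)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

The inverted index keeps the search from scoring documents that share no term with the query, so `d2` is never compared. A realistic system would likely replace raw term counts with a weighting such as TF-IDF and use a proper tokenizer.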
Syllabus
Literature
- C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
- D. Altinok. Mastering spaCy. Packt Publishing, 2021.
- K. Hoxha and A. Baxhaku. An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition. Cybernetics and Information Technologies, 18(1): 95-108, 2018.
- D. Roy and M. Dutta. A systematic review and research perspective on recommender systems. Journal of Big Data, 2022.
- M. Arslan et al. A Survey on RAG with LLMs. Procedia Computer Science, 2024.
Interested in this course?
This course is offered at the University of Tirana, Faculty of Natural Sciences. For enrollment information and scheduling, please contact the department or reach out directly.