Graduate · 6 ECTS

Information Retrieval and Natural Language Processing

Main concepts and methods related to information retrieval applied to web-based projects. Covers natural language processing pipelines and retrieval-augmented generation.

Offered at the University of Tirana, Faculty of Natural Sciences, Department of Informatics.

Vector Space Model · TF-IDF · NLP Pipelines · Recommender Systems · RAG

Overview

Information retrieval is concerned with providing access to information held in collections of documents. This course discusses the main concepts and methods of information retrieval and applies them in concrete projects focused on retrieving information from the web. Because text-based information retrieval relies on natural language processing methods, these are covered as well. In particular, natural language processing pipelines are built on datasets collected from the web.

Learning Objectives

  • 1. Introduction to the vector space model and cosine similarity
  • 2. Introduction to indexes and retrieval
  • 3. Learn to create and use NLP corpora
  • 4. Introduction to recommender systems

Syllabus

Week 1: Introduction to information retrieval models and architectures
Week 2: Term dictionary. Word stemming. TF-IDF indexing
Week 3-4: Vector space model. Cosine-based similarity
Week 5: Evaluation of information retrieval systems
Week 6: Probabilistic information retrieval
Week 7-8: Implementation of a document search engine prototype
Week 9: Natural language processing pipelines (NLP pipelines)
Week 10: Supervised NLP taggers
Week 11: Creation of labeled corpora for NLP
Week 12: Introduction to recommender systems
Week 13: Topic detection and tracking
Week 14-15: Retrieval-augmented text generation (RAG)
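
To give a flavor of the topics in weeks 2 to 4, the following is a minimal sketch of TF-IDF weighting and cosine-based ranking in plain Python. It is an illustration only, not course material: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `search`), the whitespace tokenizer without stemming, the toy documents, and the choice of idf = log(N / df) are all assumptions made for this example, and other weighting variants are equally common.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; no stemming or stop-word removal.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: idf = log(N / df), one of several common variants.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({term: tf[term] * idf[term] for term in tf})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs):
    """Rank documents by cosine similarity to the query, best first."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(query))
    # Query terms that never occur in the collection get weight 0.
    q_vec = {term: q_tf[term] * idf.get(term, 0.0) for term in q_tf}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)


# Hypothetical toy collection for illustration.
docs = [
    "information retrieval on the web",
    "recommender systems suggest items",
    "cosine similarity in the vector space model",
]
ranking = search("vector space", docs)
best_score, best_index = ranking[0]
print(docs[best_index])
```

With this toy collection, only the third document shares terms with the query, so it is ranked first and the other two receive a score of 0. A search engine prototype would typically replace the per-query vector construction with an inverted index built once over the collection.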

Literature

  • C. D. Manning, P. Raghavan and H. Schütze. Introduction to Information Retrieval. Cambridge University Press
  • D. Altinok. Mastering spaCy. Packt Publishing, 2021
  • K. Hoxha, A. Baxhaku. An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition. Cybernetics and Information Technologies 18.1: 95-108
  • D. Roy and M. Dutta. A systematic review and research perspective on recommender systems. Journal of Big Data, 2022
  • M. Arslan et al. A Survey on RAG with LLMs. Procedia Computer Science, 2024

Interested in this course?

This course is offered at the University of Tirana, Faculty of Natural Sciences. For enrollment information and scheduling, please contact the department or reach out directly.
