Predicting relevance in scientific information retrieval

Supervisors: George Giannakopoulos, Artemis Dampa

Description:

The objective of this thesis is to propose and implement methods that improve relevance prediction in scientific information retrieval. During research and literature reviews, researchers often retrieve large numbers of documents, datasets, and other resources, which must be managed in order to extract valuable, domain-specific knowledge. To address this challenge, the thesis will focus primarily, though not exclusively, on the effective representation of scientific documents. This can be achieved by incorporating prior knowledge to better capture the complex, specialized nature of scientific texts. The thesis will also explore unbiased methods for corpus collection, annotation, and relevance evaluation, as well as relevance classification techniques that integrate both explicit and implicit user feedback. Finally, query expansion based on natural language processing will be used to automatically extend search queries with related terms, synonyms, and conceptually relevant keywords. The ultimate aim is to develop a more accurate and efficient system for retrieving and evaluating scientific literature, enhancing researchers’ ability to access the most relevant information.

Qualifications required: Python programming, machine learning and deep learning algorithms.
Qualifications desired: Machine learning and deep learning toolkits (e.g., PyTorch), natural language processing (NLP).

ggianna [at] iit [dot] demokritos [dot] gr