Time to Hire a Robot Psychologist? Evaluating a Corporate RAG Application
Odorizzi, Andrea; Mazzini, Gianluca
2024
Abstract
In this work, we assess the quality of an AI Retrieval-Augmented Generation (RAG) application. After building this AI web application, we designed and generated a synthetic dataset of questions and answers based on a set of documents. We then compared the performance of two distinct language models powering the application, according to a predefined set of metrics, when responding to the dataset's questions. In this project, we opted to use open Large Language Models and ran them locally, without relying on any cloud-based service.


