The voice of COVID-19: Breath and cough recording classification with temporal decision trees and random forests
Federico Manzella (first author, member of the Collaboration Group)
Giovanni Pagliarini (second author, software)
Guido Sciavicco (penultimate author, conceptualization)
2023
Abstract
Symbolic learning is the logic-based approach to machine learning, and its mission is to provide algorithms and methodologies to extract logical information from data and express it in an interpretable way. Interval temporal logic has recently been proposed as a suitable tool for symbolic learning, specifically via the design of an interval temporal logic decision tree extraction algorithm. To improve their performance, interval temporal decision trees can be embedded into interval temporal random forests, mimicking the corresponding schema at the propositional level. In this article we consider a dataset of cough and breath sample recordings of volunteer subjects, labeled with their COVID-19 status, originally collected by the University of Cambridge. By interpreting such recordings as multivariate time series, we study the problem of their automated classification using interval temporal decision trees and forests. While this problem has been approached with the same dataset as well as with other datasets, in all cases non-symbolic learning methods (usually deep learning-based) have been applied to solve it. In this article we apply a symbolic approach, and show that it not only outperforms the state of the art obtained with the same dataset, but also yields results superior to those of most non-symbolic techniques applied to other datasets. As an added bonus, thanks to the symbolic nature of our approach, we are also able to extract explicit knowledge to help physicians characterize typical COVID-positive cough and breath.

File | Size | Format |
---|---|---|
pubblicato.pdf (open access). Description: publisher's full text. Type: full text (publisher's version). License: PUBLIC - Public with copyright | 956.35 kB | Adobe PDF |
Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.