SFERA: Research Products Archive of the University of Ferrara

Analysis of Unstructured Text-Based Data Using Machine Learning Techniques: The Case of Pediatric Emergency Department Records in Nicaragua

Lorenzoni G. (first author); Bressan S. (second author); Lanera C.; Azzolina D.; Da Dalt L.; Gregori D. (last author)

2021

Abstract

Free-text information is still widely used in emergency department (ED) records. Machine learning (ML) techniques are useful for analyzing narratives, but they have been applied mostly to English-language data sets. Against this background, the performance of an ML classification task on a Spanish-language database of ED visits was tested. ED visits collected in the EDs of nine hospitals in Nicaragua were analyzed, using the Spanish-language, free-text discharge diagnoses. Five hundred random forests were trained on a set of bootstrap samples of the whole data set (1,789 ED visits) to perform the classification task. For each bootstrap sample, once the optimal parameter value had been identified, the final validated model was trained on the whole bootstrapped data set and tested. The classification accuracies had a median of 0.783 (95% CI [0.779, 0.796]). Machine learning techniques appear to be a promising way to exploit the unstructured information reported in ED records in low- and middle-income Spanish-speaking countries.
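The resampling scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: the records and labels are invented toy examples, a simple word-count voter stands in for the random forest so that the sketch needs only the standard library, each replicate is assumed to be tested on its out-of-bag records (the abstract does not say what the test set was), and the interval is a plain percentile interval of the replicate accuracies, which may differ from how the reported interval was computed. Parameter tuning is omitted.

```python
import random
import statistics
from collections import Counter, defaultdict

# Hypothetical toy records (free-text diagnosis, label); NOT data from the study.
RECORDS = [
    ("neumonia adquirida en la comunidad", "respiratory"),
    ("bronquiolitis aguda", "respiratory"),
    ("crisis de asma", "respiratory"),
    ("neumonia grave", "respiratory"),
    ("diarrea aguda con deshidratacion", "gastrointestinal"),
    ("gastroenteritis aguda", "gastrointestinal"),
    ("vomitos y diarrea", "gastrointestinal"),
    ("deshidratacion por diarrea", "gastrointestinal"),
    ("fractura de radio", "injury"),
    ("trauma craneal leve", "injury"),
    ("herida en la mano", "injury"),
    ("fractura de tibia", "injury"),
]


def train(sample):
    """Stand-in classifier (not a random forest): word/label co-occurrence counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in sample:
        label_counts[label] += 1
        for word in text.split():
            word_counts[word][label] += 1
    return word_counts, label_counts


def predict(model, text):
    """Predict the label whose training words best match the text."""
    word_counts, label_counts = model
    votes = Counter()
    for word in text.split():
        votes.update(word_counts.get(word, {}))
    if not votes:
        # No known word: fall back to the most frequent training label.
        return label_counts.most_common(1)[0][0]
    return votes.most_common(1)[0][0]


def bootstrap_accuracies(records, n_replicates=500, seed=0):
    """Train on each bootstrap sample; test on its out-of-bag records (assumption)."""
    rng = random.Random(seed)
    n = len(records)
    accuracies = []
    for _ in range(n_replicates):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(idx)
        out_of_bag = [records[i] for i in range(n) if i not in in_bag]
        if not out_of_bag:
            continue  # nothing left to test on for this replicate
        model = train([records[i] for i in idx])
        correct = sum(predict(model, text) == label for text, label in out_of_bag)
        accuracies.append(correct / len(out_of_bag))
    return accuracies


def summarize(accuracies, level=0.95):
    """Return the median and a simple percentile interval of the accuracies."""
    ordered = sorted(accuracies)
    alpha = (1 - level) / 2
    low = ordered[int(alpha * (len(ordered) - 1))]
    high = ordered[int((1 - alpha) * (len(ordered) - 1))]
    return statistics.median(ordered), low, high


if __name__ == "__main__":
    median, low, high = summarize(bootstrap_accuracies(RECORDS))
    print(f"median={median:.3f}, interval=[{low:.3f}, {high:.3f}]")
```

With real data, the stand-in `train` and `predict` functions would be replaced by a tuned random forest fitted to a numerical representation of the text; the resampling loop and the summary step would stay the same.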

Year of publication: 2021
DOI: https://dx.doi.org/10.1177/1077558719844123
Journal title: MEDICAL CARE RESEARCH AND REVIEW
All authors: Lorenzoni, G.; Bressan, S.; Lanera, C.; Azzolina, D.; Da Dalt, L.; Gregori, D.
Appears in types: 03.1 Journal article

Files in this product:

File	Access	Type	License	Size	Format
2019_Medical Care Research and Review.pdf	Archive managers only	Full text (publisher's version)	NOT PUBLIC - Private/restricted access	435.41 kB	Adobe PDF

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2485163
