SFERA: Archive of the Research Products of the University of Ferrara

J48S: a Sequence Classification Approach to Speech Analysis based on Decision Trees

Brunello, Andrea; Marzano, Enrico; Montanari, Angelo; Sciavicco, Guido

2018

Abstract

Sequences play a major role in the extraction of information from data. In business intelligence, for example, they can be used to track the evolution of customer behavior over time or to model relevant relationships. In this paper, we focus on the domain of contact centers, where sequential data typically take the form of oral or written interactions and word sequences often play a major role in text classification, and we investigate the connections between sequential data and text mining techniques. The main contribution of the paper is a new machine learning algorithm, called J48S, that associates semantic knowledge with telephone conversations. The proposed solution is based on the well-known C4.5 decision tree learner and natively handles a mix of static data (numeric or categorical) and sequential data (such as texts) for classification purposes. Evaluated in a real business setting, the algorithm is shown to provide classification performance competitive with classical approaches, while generating highly interpretable models and reducing the data preparation effort.
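The idea of letting one decision tree node choose between a static test and a sequential test can be illustrated with a minimal sketch. This is not the J48S algorithm itself, whose details the abstract does not give. It assumes plain information gain as the split criterion, a hand-picked list of candidate word patterns, and hypothetical names (`best_split`, `contains_subsequence`, the `duration` attribute, the toy conversations) chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(labels, mask):
    """Entropy reduction obtained by splitting `labels` with a boolean mask."""
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def contains_subsequence(words, pattern):
    """True if the tokens of `pattern` occur in `words` in order (gaps allowed)."""
    it = iter(words)
    return all(token in it for token in pattern)


def best_split(rows, labels, numeric_keys, patterns):
    """Return (gain, description) of the best test over both attribute kinds."""
    candidates = []
    # Static attributes: binary tests of the form `value <= threshold`.
    for key in numeric_keys:
        for threshold in sorted({row[key] for row in rows}):
            mask = [row[key] <= threshold for row in rows]
            candidates.append((information_gain(labels, mask), f"{key} <= {threshold}"))
    # Sequential attribute: binary tests of the form `text contains pattern`.
    for pattern in patterns:
        mask = [contains_subsequence(row["text"].split(), pattern) for row in rows]
        candidates.append((information_gain(labels, mask), "text contains " + " > ".join(pattern)))
    return max(candidates, key=lambda c: c[0])


# Toy data (invented for this sketch): call duration plus a transcript.
rows = [
    {"duration": 30, "text": "hello i want to cancel my contract"},
    {"duration": 45, "text": "please cancel the contract today"},
    {"duration": 40, "text": "i want to renew my contract"},
    {"duration": 35, "text": "thanks for the new contract"},
]
labels = ["churn", "churn", "stay", "stay"]
patterns = [("cancel", "contract"), ("want", "to")]

gain, description = best_split(rows, labels, ["duration"], patterns)
print(gain, description)  # prints: 1.0 text contains cancel > contract
```

On this toy data no duration threshold separates the classes cleanly, while the pattern test does, so the node would split on the text. Because both kinds of test are scored with the same criterion, the transcripts need no separate feature-extraction step before tree induction, and each test remains readable as a rule.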

Year of publication: 2018
ISBN: 9783319999715
Appears in categories: 04.2 Contributions in conference proceedings (in volume)

Files in this item:

File	Description	Type	License	Size	Format
J48S.pdf (archive managers only)	Pre-print	Pre-print	NOT PUBLIC - Private/restricted access	287.04 kB	Adobe PDF
J48S A Sequence Classification Approach to Text Analysis Based on Decision Trees.pdf (archive managers only)	Publisher's full text	Full text (publisher's version)	NOT PUBLIC - Private/restricted access	599.47 kB	Adobe PDF

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2392501
