Evolutionary Explainable Rule Extraction from (Modal) Random Forests
Michele Ghiotti; Federico Manzella; Giovanni Pagliarini; Guido Sciavicco; Ionel Eduard Stan
2023
Abstract
Symbolic learning is the subfield of machine learning concerned with learning predictive models whose knowledge is represented in logical form, such as decision trees and decision lists. Ensemble learning methods, such as random forests, are commonly used to improve the performance of decision trees; unfortunately, tree ensembles are challenging to interpret. Moreover, to deal with unstructured (e.g., temporal or spatial) data, decision trees and random forests have recently been generalized to use modal logics, which makes them harder to interpret than their propositional counterparts. A methodology for extracting simple rules from propositional random forests, based on a sequence of optimization steps, was recently proposed. In this work, we generalize this approach in two directions: from propositional to modal logic, and from a sequence of optimization steps to a single multi-objective optimization problem. Although confined to the temporal domain, our experimental results, based on open-source implementations and public data, show that our method is robust and able to extract small, accurate, and informative decision lists even for complex classification problems.