Three-objective constrained evolutionary instance selection for classification: Wrapper and filter approaches

Sciavicco G.
Last author
2022

Abstract

The large volume of data produced today by new technologies hinders machine learning algorithms, because of both memory requirements and execution times. Processes that reduce both the quantity and the size of the data are therefore increasingly important. One such process is instance selection. In this paper we propose three-objective constrained optimization models that formulate wrapper and filter instance selection methods (separately) for classification problems; the models are solved with multi-objective evolutionary algorithms and multi-objective differential evolution. In the proposed wrapper method, an objective that minimizes the generalization error of the classifier is added to the usual ones. The proposed filter method simultaneously optimizes the correlation, redundancy and consistency of the datasets. Instance retention constraints are imposed on the optimization models so that, in big data scenarios, no more than a maximum percentage of samples set by the decision maker is retained. The experiments were designed to compare (1) the NSGA-II and MODE algorithms, (2) two- and three-objective optimization models, (3) two different constraint handling techniques, and (4) the proposed evolutionary approaches against 12 non-evolutionary approaches used in the literature. The proposed wrapper and filter instance selection methods have been applied in a real-world business engineering application and validated on three public datasets to make the results easier to replicate. The experimental results show that the three-objective constrained evolutionary techniques proposed in this paper outperform both the non-evolutionary techniques and the two-objective evolutionary approaches used in the literature.
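The instance retention constraint and the multi-objective comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes a candidate solution is a binary mask over the training instances and that all three objectives are minimized. It uses feasibility-first constrained dominance, a common technique with NSGA-II; the abstract does not name the two constraint handling techniques it compares, so this choice is an assumption. All function names are hypothetical.

```python
from typing import Sequence


def retention_violation(mask: Sequence[int], max_fraction: float) -> float:
    """Amount by which the retained fraction exceeds the allowed maximum.

    `mask[i] == 1` means instance i is kept. Returns 0.0 when the
    constraint is satisfied. (Hypothetical helper, not from the paper.)
    """
    retained = sum(mask) / len(mask)
    return max(0.0, retained - max_fraction)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization: a is no worse on every
    objective and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def constrained_dominates(
    obj_a: Sequence[float], viol_a: float,
    obj_b: Sequence[float], viol_b: float,
) -> bool:
    """Feasibility-first comparison (assumed technique, see lead-in)."""
    if viol_a == 0.0 and viol_b > 0.0:
        return True            # feasible beats infeasible
    if viol_a > 0.0 and viol_b == 0.0:
        return False
    if viol_a > 0.0 and viol_b > 0.0:
        return viol_a < viol_b  # smaller violation wins
    return dominates(obj_a, obj_b)  # both feasible: plain Pareto dominance


# Two candidate selections over four instances, at most 50% retained.
mask_a = [1, 1, 0, 0]
mask_b = [1, 1, 1, 0]
viol_a = retention_violation(mask_a, 0.5)
viol_b = retention_violation(mask_b, 0.5)
# Illustrative objective vectors only; real values would come from a classifier.
better = constrained_dominates((0.20, 0.50, 0.25), viol_a,
                               (0.10, 0.75, 0.15), viol_b)
```

Here `mask_a` satisfies the constraint and `mask_b` does not, so `mask_a` is preferred even though `mask_b` has lower values on two of the illustrative objectives.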
Jimenez, F.; Sanchez, G.; Palma, J.; Sciavicco, G.
Files in this item:
File: 1-s2.0-S0952197621003791-main.pdf
Access: open access
Description: Publisher's full text
Type: Full text (publisher's version)
License: Creative Commons
Size: 3.02 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2471755