An MLOps Framework for GAN-based Fault Detection in Bonfiglioli’s EVO Plant
Dahdal, Simon; Colombi, Lorenzo; Brina, Matteo; Gilli, Alessandro; Tortonesi, Mauro; Stefanelli, Cesare
2024
Abstract
In Industry 5.0, the scarcity of data on defective components in smart manufacturing leads to imbalanced datasets. This imbalance poses a significant challenge to the development of robust Machine Learning (ML) models, which typically require a rich variety of data for effective training, and it limits both the models’ accuracy and their applicability across diverse industrial scenarios. To tackle this issue, our research investigates Deep Generative Models (DGMs), with a special focus on Generative Adversarial Networks (GANs), for the generation of synthetic data. The approach aims to rectify dataset imbalances and thereby improve the training of ML models. We demonstrate how synthetic data can substantially improve the performance and reliability of ML models in industrial settings. Furthermore, the paper presents an innovative MLOps pipeline and architecture designed to incorporate DGMs into the entire ML development cycle. The solution goes beyond automation: it is self-optimizing and capable of making necessary corrections, and it is specifically engineered to address the dual challenges of data imbalance and scarcity, thus enabling more precise and dependable ML applications in smart manufacturing.