A Chaos Engineering Approach for Improving the Resiliency of IT Services Configurations

Poltronieri, Filippo; Tortonesi, Mauro; Stefanelli, Cesare
2022

Abstract

Testing the resiliency of complex IT services deployed in hybrid cloud scenarios is a challenging task that requires expensive and possibly destructive operations. An interesting approach lies in Chaos Engineering, a set of practices for testing the resiliency of software systems running in a production environment. However, Chaos Engineering is an expensive practice that requires setting up complicated operations, which further increases the complexity of management. To reduce this complexity, Chaos Engineering can benefit from non-destructive approaches such as the definition of realistic digital twins. A digital twin is a virtual replica of a real system on which to experiment with management configurations. This paper pursues this research avenue by extending our previous efforts to integrate Chaos Engineering techniques into an IT services management framework called ChaosTwin. ChaosTwin leverages novel methodologies and tools capable of identifying and promptly reacting to unexpected failures. Finally, to implement autonomous fault management, ChaosTwin defines scaling and migration policies that can quickly search for more resilient placements of software components in case of system failures. We believe that ChaosTwin can provide useful guidance to service providers in finding cost-effective service configurations capable of minimizing the negative effects of unpredictable events.
ISBN: 9781665406017

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2501908