Variable metric proximal stochastic gradient methods with additional sampling

Krklec Jerinkic, N.; Porta, F.; Ruggiero, V.; Trombini, I.
2025

Abstract

Regularized empirical risk minimization problems arise in a variety of applications, including machine learning, signal processing, and image processing. Proximal stochastic gradient algorithms are a standard approach to solving these problems because of their low per-iteration computational cost and relatively simple implementation. This paper introduces a class of proximal stochastic gradient methods built on three key elements: a variable metric underlying the iterations, a stochastic line search governing the decrease properties, and an incremental mini-batch size technique based on additional sampling. Convergence results for the proposed algorithms are proved under different hypotheses on the function to minimize. No assumption is required regarding the Lipschitz continuity of the gradient of the differentiable part of the objective function. Possible strategies to automatically select the parameters of the suggested scheme are discussed. Numerical experiments on both binary classification and nonlinear regression problems show the effectiveness of the suggested approach compared to other state-of-the-art proximal stochastic gradient methods.
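The sketch below is a minimal, illustrative Python implementation of the ingredients named in the abstract: a (diagonal) variable metric in the proximal gradient step, a backtracking stochastic line search on the sampled objective, and an additional-sampling test that may enlarge the mini-batch. It is not the authors' algorithm; the l1-regularized logistic loss, all function names, and all parameter values are assumptions chosen only to make the example self-contained.

```python
# Illustrative sketch only: generic variable-metric proximal stochastic gradient
# step with stochastic line search and additional sampling. Not the authors'
# exact method; problem, names, and parameters are assumptions.
import numpy as np

def prox_l1(x, thresh):
    """Proximal operator of thresh * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def stochastic_grad(w, X, y, batch):
    """Mini-batch gradient of the logistic loss, labels in {-1, +1}."""
    Xb, yb = X[batch], y[batch]
    coeff = -yb / (1.0 + np.exp(yb * (Xb @ w)))
    return Xb.T @ coeff / len(batch)

def batch_obj(w, X, y, batch, lam):
    """Sampled regularized objective on the given mini-batch."""
    Xb, yb = X[batch], y[batch]
    return np.mean(np.log1p(np.exp(-yb * (Xb @ w)))) + lam * np.sum(np.abs(w))

def vm_prox_sg_step(w, X, y, batch, D, alpha, lam, rho=0.5, sigma=1e-4, max_back=20):
    """One variable-metric proximal step with backtracking on the sampled objective."""
    g = stochastic_grad(w, X, y, batch)
    f_old = batch_obj(w, X, y, batch, lam)
    for _ in range(max_back):
        # Scaled gradient step followed by a diagonally scaled proximal step.
        w_trial = prox_l1(w - alpha * g / D, alpha * lam / D)
        # Simplified sufficient-decrease test on the same mini-batch.
        if batch_obj(w_trial, X, y, batch, lam) <= f_old - sigma * np.dot(g, w - w_trial):
            return w_trial, alpha
        alpha *= rho              # backtrack on the step length
    return w, alpha               # no sufficient decrease found for this sample

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = np.sign(X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200))
w, D = np.zeros(10), np.ones(10)  # D: diagonal variable metric (identity here)
batch_size, lam, alpha = 8, 1e-2, 1.0
for k in range(100):
    batch = rng.choice(len(y), size=batch_size, replace=False)
    w_new, alpha = vm_prox_sg_step(w, X, y, batch, D, alpha, lam)
    # Additional sampling: evaluate progress on an independent sample; if the
    # sampled objective did not decrease there, enlarge the mini-batch.
    check = rng.choice(len(y), size=batch_size, replace=False)
    if batch_obj(w_new, X, y, check, lam) > batch_obj(w, X, y, check, lam):
        batch_size = min(2 * batch_size, len(y))
    w = w_new
```

In this sketch the metric D is kept fixed at the identity for simplicity; in a variable metric scheme it would be updated along the iterations (for example, from diagonal curvature estimates), which is one of the elements the paper studies.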
Files in this record:
No files are associated with this record.

Documents in SFERA are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2596811

Citations
  • PMC: ND
  • Scopus: 0
  • Web of Science: 0