SFERA Archivio dei prodotti della Ricerca dell'Università di Ferrara

Authorship recognition and author names disambiguation are main issues affecting the quality and reliability of bibliographic records retrieved from digital libraries, such as Web of Science, Scopus, Google Scholar and many others. So far, these problems have been faced using methods mainly based on text-pattern-recognition for specific datasets, with high-level degree of errors. In this paper, we propose a different approach using neural networks to learn features automatically for solving authorship recognition and disambiguation of author names. The network learns for each author the set of co-writers, and from this information recovers authorship of papers. In addition, the network can be trained taking into account other features, such as author affiliations, keywords, projects and research areas. The network has been developed using the TensorFlow framework, and run on recent Nvidia GPUs and multi-core Intel CPUs. Test datasets have been selected from records of Scopus digital library, for several groups of authors working in the fields of computer science, environmental science and physics. The proposed methods achieves accuracies above 99% in authorship recognition, and is able to effectively disambiguate homonyms. We have taken into account several network parameters, such as training-set and batch size, number of levels and hidden units, weights initialization, back-propagation algorithms, and analyzed also their impact on accuracy of results. This approach can be easily extended to any dataset and any bibliographic records provider (Table presented.).

Authorship recognition and disambiguation of scientific papers using a neural networks approach

Schifano S. F.^Primo;Sgarbanti T.;Tomassetti L.^Ultimo

2018

Abstract

Authorship recognition and author names disambiguation are main issues affecting the quality and reliability of bibliographic records retrieved from digital libraries, such as Web of Science, Scopus, Google Scholar and many others. So far, these problems have been faced using methods mainly based on text-pattern-recognition for specific datasets, with high-level degree of errors. In this paper, we propose a different approach using neural networks to learn features automatically for solving authorship recognition and disambiguation of author names. The network learns for each author the set of co-writers, and from this information recovers authorship of papers. In addition, the network can be trained taking into account other features, such as author affiliations, keywords, projects and research areas. The network has been developed using the TensorFlow framework, and run on recent Nvidia GPUs and multi-core Intel CPUs. Test datasets have been selected from records of Scopus digital library, for several groups of authors working in the fields of computer science, environmental science and physics. The proposed methods achieves accuracies above 99% in authorship recognition, and is able to effectively disambiguate homonyms. We have taken into account several network parameters, such as training-set and batch size, number of levels and hidden units, weights initialization, back-propagation algorithms, and analyzed also their impact on accuracy of results. This approach can be easily extended to any dataset and any bibliographic records provider (Table presented.).

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2018
			
	Parole chiave
	
				Neural Networks, Database
			
	Appare nelle tipologie:
	
				04.1 Contributi in atti di convegno (in Rivista)

File in questo prodotto:

File	Dimensione	Formato
ISGC 2018 & FCDD_007.pdf accesso aperto Descrizione: Full text editoriale Tipologia: Full text (versione editoriale) Licenza: Creative commons Dimensione 916.54 kB Formato Adobe PDF Visualizza/Apri	916.54 kB	Adobe PDF	Visualizza/Apri

I documenti in SFERA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11392/2423424

Citazioni

ND

2

ND

social impact