This paper explores the possibility of classifying journal articles by exploiting multiple information sources, instead of relying on only one information source at a time. In particular, the Similarity Network Fusion (SNF) technique is used to merge the different layers of information about articles when they are organized as a multiplex network. The proposed method is tested on a case study consisting of the articles published in the Cambridge Journal of Economics. The information about articles is organized in a two-layer multiplex: the first layer contains similarities among articles based on their full text, and the second layer contains similarities based on their cited references. The unsupervised similarity network fusion process combines the two layers into a new single-layer network. Distance correlation and partial distance correlation indexes are then used to estimate the contribution of each layer of information to the structure of the fused network. Finally, a clustering algorithm is applied to the fused network to obtain a classification of articles. The classification obtained through SNF is evaluated from an expert point of view, by inspecting whether it can be interpreted and labelled with reference to the research programs and methodologies adopted in economics. Moreover, the classification obtained on the fused network is compared with the two classifications obtained when cited references and contents are considered separately. Overall, the classification obtained on the fused network appears to be fine-grained enough to represent the extreme heterogeneity characterizing the contributions published in the Cambridge Journal of Economics.
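
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified two-layer version of the general SNF scheme, written with NumPy only. The function names (`full_kernel`, `local_kernel`, `fuse_two_layers`), the neighbourhood size `k`, the iteration count `n_iter` and the toy matrices are all assumptions introduced for illustration. Each layer is assumed to be a symmetric, non-negative article-by-article similarity matrix.

```python
import numpy as np

def full_kernel(W):
    """Normalise W so that off-diagonal entries of each row sum to 1/2
    and the diagonal is fixed at 1/2 (rows without neighbours stay zero)."""
    W = np.asarray(W, dtype=float)
    off = W - np.diag(np.diag(W))            # drop self-similarities
    row_sums = off.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0            # guard against isolated nodes
    P = off / (2.0 * row_sums)
    np.fill_diagonal(P, 0.5)
    return P

def local_kernel(W, k):
    """Keep each node's k most similar neighbours, then row-normalise."""
    W = np.asarray(W, dtype=float)
    off = W - np.diag(np.diag(W))
    S = np.zeros_like(off)
    idx = np.argsort(-off, axis=1)[:, :k]    # k largest similarities per row
    rows = np.arange(off.shape[0])[:, None]
    S[rows, idx] = off[rows, idx]
    row_sums = S.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return S / row_sums

def fuse_two_layers(W_text, W_refs, k=2, n_iter=20):
    """Hypothetical two-layer fusion: each layer's full kernel is repeatedly
    diffused through the other layer's local (k-nearest-neighbour) kernel."""
    P1, P2 = full_kernel(W_text), full_kernel(W_refs)
    S1, S2 = local_kernel(W_text, k), local_kernel(W_refs, k)
    for _ in range(n_iter):
        P1_next = S1 @ P2 @ S1.T             # text layer updated via references
        P2_next = S2 @ P1 @ S2.T             # reference layer updated via text
        P1, P2 = full_kernel(P1_next), full_kernel(P2_next)
    fused = (P1 + P2) / 2.0                  # average the two diffused layers
    return (fused + fused.T) / 2.0           # enforce symmetry

# Toy data (invented): 6 articles in two groups of 3.
W_text = np.full((6, 6), 0.1)
W_text[:3, :3] = 0.9
W_text[3:, 3:] = 0.9
np.fill_diagonal(W_text, 1.0)

W_refs = np.full((6, 6), 0.2)
W_refs[:3, :3] = 0.8
W_refs[3:, 3:] = 0.8
np.fill_diagonal(W_refs, 1.0)

fused = fuse_two_layers(W_text, W_refs)
```

The resulting `fused` matrix plays the role of the single-layer network on which a clustering algorithm would then be run. The choice of clustering algorithm and of the similarity measures used to build each layer is outside the scope of this sketch.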

Baccini, A., Barabesi, L., Cioni, M., Petrovich, E., Pignalosa, D. (2023). Fine-grained classification of journal articles by relying on multiple layers of information through similarity network fusion: the case of the Cambridge Journal of Economics [10.48550/arxiv.2305.00026].

Fine-grained classification of journal articles by relying on multiple layers of information through similarity network fusion: the case of the Cambridge Journal of Economics

Alberto Baccini; Lucio Barabesi; Martina Cioni; Eugenio Petrovich; Daria Pignalosa
2023-01-01

Files in this record:
File: arxiv-preprint-2305.00026.pdf
Access: open access
Type: Pre-print
License: Creative Commons
Size: 2.21 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1231794