During the Italian research assessment exercise, the national agency ANVUR performed an experiment to assess the agreement between grades obtained through informed peer review (IR) and those obtained through bibliometrics. A sample of research outputs was evaluated with both methods, and concordance was analyzed using weighted Cohen's kappas. According to ANVUR, the results indicated an overall "more than adequate" agreement which "fully justifies" the choice of using both techniques jointly in the assessment. However, according to available statistical guidelines for kappa values, the degree of agreement has to be interpreted, for all research fields, as poor or, in a few cases, at most fair. The only notable exception is Area 13 (economics and statistics) and its sub-areas, which show moderate agreement. Yet a statistical meta-analysis rejects the hypothesis that the kappas from Area 13 share the same distribution as those from the other areas. Indeed, scrutiny of the experimental protocol adopted by the Area 13 panel reveals substantial modifications with respect to the protocols of all the other areas, to the point that the results for Area 13 must be considered fatally flawed. The evidence of poor to fair concordance supports the conclusion that IR and bibliometrics do not produce similar results. As a consequence, the final results of the Italian research assessment possibly depend on the mix of instruments used to evaluate research outputs. The conclusion reached by ANVUR must be reversed: the available evidence does not justify at all the joint use of both techniques within the same research assessment exercise.
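To make the concordance measure concrete, the following is a minimal sketch of a weighted Cohen's kappa for two sets of ordered grades, followed by a verbal interpretation of the value. Everything specific in it is an assumption for illustration: the function names, the toy grades, and the choice of linear disagreement weights are hypothetical, and the cut-offs follow one commonly cited scale (0.20 / 0.40 / 0.60 / 0.80). The abstract does not state which weights ANVUR used or which guideline the authors applied.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters over ordered categories.

    `categories` lists the grades from best to worst (or worst to best);
    only their order matters. `weights` is "linear" or "quadratic".
    """
    k = len(categories)
    n = len(ratings_a)
    if k < 2 or n == 0 or n != len(ratings_b):
        raise ValueError("need >= 2 categories and two equal, non-empty rating lists")
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: observed[i][j] is the share of items
    # graded i by the first method and j by the second.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the two most distant grades.
        return (abs(i - j) / (k - 1)) ** power

    obs = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - obs / exp


def agreement_band(kappa):
    """Verbal label for a kappa value (assumed cut-offs, see lead-in)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


if __name__ == "__main__":
    # Toy data: six outputs graded on a three-level scale by each method.
    peer_review = [0, 0, 1, 1, 2, 2]
    bibliometrics = [0, 1, 1, 2, 2, 0]
    kappa = weighted_kappa(peer_review, bibliometrics, categories=[0, 1, 2])
    print(round(kappa, 3), agreement_band(kappa))  # prints: 0.25 fair
```

In this toy case half of the outputs receive the same grade from both methods, yet the chance-corrected agreement is only 0.25, which illustrates why raw agreement rates and kappa values can lead to different verbal judgements.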

Baccini, A., De Nicolao, G. (2015). Do they agree? Bibliometric evaluation vs informed peer review in the Italian research assessment exercise.

Do they agree? Bibliometric evaluation vs informed peer review in the Italian research assessment exercise

BACCINI, ALBERTO
2015-01-01


Use this identifier to cite or link to this document: https://hdl.handle.net/11365/983838