Three Journal Similarity Metrics and their Application to Biomedical Journals




Public Library of Science


In this paper, we present three novel journal similarity metrics. The first, the MeSH odds ratio, measures the topical similarity of any pair of journals based on the major MeSH headings assigned to their articles in MEDLINE. The second, the author odds ratio, uses the 2009 Author-ity author name disambiguation dataset as a gold standard. It gives a straightforward, intuitive answer to the question: given two articles in PubMed that share the same author name (last name, first initial), how does knowing only the journals in which the articles were published predict the relative likelihood that they were written by the same person rather than by different persons? The third, the article pair odds ratio, detects the tendency of authors to publish repeatedly in the same journal, as well as in specific pairs of journals. The metrics can be applied not only to estimate the similarity of a pair of journals but also to provide novel profiles of individual journals. For example, for each journal, one can define the MeSH cloud as the number of other journals that are topically more similar to it than expected by chance, and the author cloud as the number of other journals that share more authors with it than expected by chance. These metrics for journal pairs and individual journals are provided as public datasets that can be readily studied and used by others.
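As a rough illustration of the author odds ratio and the author cloud described above, the following Python sketch compares the odds that two same-name articles share an author within a given journal pair against the odds across all journal pairs. It is not the authors' implementation: the function names, the count-based inputs, and the use of a plain threshold of 1.0 in place of a test for "more than expected by chance" are all assumptions made for this example.

```python
from collections import defaultdict


def odds_ratio(same_in_pair, diff_in_pair, same_overall, diff_overall):
    """Odds of 'same person' within one journal pair, relative to the odds overall.

    All four arguments are hypothetical counts of same-name article pairs.
    """
    if diff_in_pair == 0 or same_overall == 0 or diff_overall == 0:
        raise ValueError("counts used as denominators must be non-zero")
    return (same_in_pair / diff_in_pair) / (same_overall / diff_overall)


def author_cloud(pair_odds_ratios, threshold=1.0):
    """Count, for each journal, the other journals whose odds ratio exceeds threshold.

    Assumption: a fixed threshold stands in for a proper significance test.
    """
    cloud = defaultdict(int)
    for (journal_a, journal_b), ratio in pair_odds_ratios.items():
        if journal_a != journal_b and ratio > threshold:
            cloud[journal_a] += 1
            cloud[journal_b] += 1
    return dict(cloud)


# Hypothetical counts: 30 same-person and 10 different-person pairs for one
# journal pair, against 100 same-person and 400 different-person pairs overall.
example_ratio = odds_ratio(30, 10, 100, 400)

# Hypothetical odds ratios for three journal pairs.
example_cloud = author_cloud({("A", "B"): 12.0, ("A", "C"): 0.5, ("B", "C"): 2.0})
```

An odds ratio above 1 in this sketch means that knowing the two journals makes "same person" more likely than it is for an arbitrary pair of same-name articles. The MeSH cloud could be counted the same way from MeSH odds ratios.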


Supplementary materials.


Bioinformatics, Odds ratios, Metrics, Medical Subject Headings (MeSH)

US National Institutes of Health (R01LM010817, P01AG039347).


CC BY 4.0 (Attribution), ©2014 The Authors


D'Souza, J. L., and N. R. Smalheiser. 2014. "Three journal similarity metrics and their application to biomedical journals." PLOS ONE 9(12): e115681.