PLOS ONE | DOI:10.1371/journal.pone.0127390 | May 18 | Consistency of Databases

From each database we constructed three bibliometric networks using the following three network paradigms (categories): P → P, directed paper citation network (nodes: papers, links: one paper citing another); A → A, directed author citation network (nodes: authors, links: one author cites another in at least one of his/her papers); A - A, undirected co-authorship network (nodes: authors, links: co-authorship of at least one paper). This gives us a total of 6+6+6 = 18 networks (12 directed and 6 undirected), to which we devote the rest of this paper. Our aim is to study the consistency among the networks within each category in terms of their topologies, from which we draw conclusions on the consistency among the databases. In Table 1 we summarize the basic properties of the 18 examined networks. Numbers of nodes and links differ considerably, but are often larger than 10^4. WCC is the fraction of nodes contained in the largest connected component (weak connectivity for directed networks, see Methods). With the exception of the DBLP P → P network, it generally contains at least 80% of nodes (DBLP database consis.

Organizing and tracking bibliometric data, such as Web of Science, arXiv, PubMed and so on. In addition, none of the datasets is free from errors, mostly occurring due to different referencing styles or typos in authors' names (in particular names using non-English characters), which often cause incorrectly recorded collaborations and citations.
This in practice means that every bibliometric study in itself unavoidably carries some degree of bias, resulting from the choice of the database. On top of this comes the fact that different fields often have different collaboration and citation cultures, which further complicates the issue of objectively comparing different scientific fields. However, researchers in bibliometrics usually work relying on the database at their disposal. Finding additional data is often difficult and sometimes expensive. We here conduct a detailed analysis of the consistency among six major scientific databases, employing three different paradigms (categories) of bibliometric networks (paper citation, author citation and collaboration). This amounts to a significant methodological and empirical extension of our earlier paper: more datasets and network paradigms are considered, and findings are confirmed by complementary analyses. Our results include an approximate quantification of consistency among the six databases that holds within each network category. Our study aims at being useful to colleagues when choosing the most suitable network paradigm.

Results

We obtained the data on co-authorships and citations from the following six databases: American Physical Society (APS), Web of Science (WoS), DBLP, PubMed, Cora and arXiv. Since some databases are very large (e.g. WoS), we were unable to include them entirely. Nevertheless, we made sure that the dataset from each database is representative of it in terms of papers and citations (see Methods).
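
To make the three network paradigms (paper citation, author citation, co-authorship) and the WCC fraction concrete, the following is a minimal sketch in Python, not the procedure used in the study. The toy records, the names `papers`, `build_networks` and `wcc_fraction`, and the choice to drop author self-citations and citations to papers outside the sample are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: paper id -> (list of authors, list of cited paper ids).
papers = {
    "p1": (["alice", "bob"], []),
    "p2": (["bob", "carol"], ["p1"]),
    "p3": (["dave"], ["p1", "p2"]),
    "p4": (["erin"], []),
}


def build_networks(papers):
    """Return the link sets of the P -> P, A -> A and A - A networks."""
    paper_citation = set()   # directed links: (citing paper, cited paper)
    author_citation = set()  # directed links: (citing author, cited author)
    coauthorship = set()     # undirected links: frozenset of two authors
    for pid, (authors, cited) in papers.items():
        # Every pair of distinct authors of the same paper is a co-authorship.
        for a, b in combinations(sorted(set(authors)), 2):
            coauthorship.add(frozenset((a, b)))
        for cid in cited:
            if cid not in papers:
                continue  # assumption: ignore citations leaving the sample
            paper_citation.add((pid, cid))
            for a in authors:
                for b in papers[cid][0]:
                    if a != b:  # assumption: author self-citations are dropped
                        author_citation.add((a, b))
    return paper_citation, author_citation, coauthorship


def wcc_fraction(nodes, links):
    """Fraction of nodes in the largest connected component.

    Link direction is ignored, which corresponds to weak connectivity
    for the directed networks.
    """
    nodes = set(nodes)
    if not nodes:
        return 0.0
    neighbours = defaultdict(set)
    for link in links:
        u, v = tuple(link)
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, largest = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        largest = max(largest, size)
    return largest / len(nodes)


authors = {name for names, _ in papers.values() for name in names}
pp_links, aa_links, co_links = build_networks(papers)
print("P -> P:", len(pp_links), "links, WCC =", wcc_fraction(papers, pp_links))
print("A -> A:", len(aa_links), "links, WCC =", wcc_fraction(authors, aa_links))
print("A - A :", len(co_links), "links, WCC =", wcc_fraction(authors, co_links))
```

Storing co-authorship links as `frozenset` pairs keeps them order-free, so the same collaboration is never counted twice, while the two citation networks keep ordered pairs to preserve direction. Isolated nodes (here the paper `p4` and its author) still count in the denominator of the WCC fraction.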