JournalTOCs Blog

News and Opinions about current awareness on new research

Archive for the ‘Database Content’ Category

The JournalTOCs big metadata


Every day we create approximately 5,000 new records; most of them are metadata for journal articles published in the previous 24 hours. The following image represents the metadata that JournalTOCs has collected so far.

JournalTOCs Big Metadata


The table shown on the left is a sample of the data behind this big metadata. It gives the number of new articles found per day in the journal TOC RSS feeds in March 2013.

Roughly 70% of that metadata has been gathered in the two years since JournalTOCs was launched as a public service in May 2011. As of today, this metadata covers 1,795 publishers, 10,200 Premium users from licensed institutions, 22,050 journals, over 100,000 tracked research interests (derived from the followed journals that users, with either free or Premium registrations, visit frequently) and nearly 8 million articles published in the last 5 years. This big metadata is more than a matter of size. It can be an opportunity to find insights into new and emerging types of research, to support or create library management systems, and to help answer questions about research publications. JournalTOCs offers ways to seize this opportunity. It uses web services and standard harvesting protocols to open the door to the possibilities offered by this big metadata, including:

+ Harvesting the metadata of all the journals indexed by JournalTOCs, which includes title, ISSNs, access rights, subject classification, publisher, number of followers, date of the last published issue, the URL of the journal's RSS feed and the journal homepage

+ Harvesting the complete database of the metadata of 8 million articles, including all the content collected from their RSS feeds

+ Querying the metadata of specific journals by ISSN or keywords in the journal title

+ Searching for articles in the current issues or the backfile issues
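As a rough illustration of the journal-level metadata listed above, the sketch below models one journal record and the two journal lookups mentioned (by ISSN, or by keyword in the title). The class, field and function names and the sample values are all hypothetical; they are not the actual JournalTOCs schema or web service interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JournalRecord:
    # Hypothetical field names mirroring the metadata listed above.
    title: str
    issn: str
    publisher: str
    access_rights: str
    rss_url: str
    homepage: str
    subjects: List[str] = field(default_factory=list)
    followers: int = 0
    eissn: Optional[str] = None
    last_issue_date: Optional[str] = None


def _normalise_issn(issn: str) -> str:
    """Drop the hyphen and upper-case the check character."""
    return issn.replace("-", "").upper()


def find_by_issn(records, issn):
    """Return the first record whose print or electronic ISSN matches."""
    wanted = _normalise_issn(issn)
    for rec in records:
        for candidate in (rec.issn, rec.eissn):
            if candidate and _normalise_issn(candidate) == wanted:
                return rec
    return None


def find_by_title_keyword(records, keyword):
    """Return all records whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [rec for rec in records if needle in rec.title.lower()]


# Invented sample data, for illustration only.
SAMPLE = [
    JournalRecord(
        title="Example Journal of Library Science",
        issn="0000-0001",
        publisher="Example Press",
        access_rights="Subscription",
        rss_url="http://example.org/ejls/rss",
        homepage="http://example.org/ejls",
        subjects=["Library Science"],
        followers=12,
    ),
    JournalRecord(
        title="Open Example Letters",
        issn="0000-0002",
        eissn="0000-000X",
        publisher="Example Society",
        access_rights="Open Access",
        rss_url="http://example.org/oel/rss",
        homepage="http://example.org/oel",
    ),
]

if __name__ == "__main__":
    print(find_by_issn(SAMPLE, "0000000x"))
    print([rec.title for rec in find_by_title_keyword(SAMPLE, "example")])
```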

Learn how JournalTOCs can bring this big metadata to you

Written by Santiago Chumbe

May 31st, 2013 at 8:05 am

Percentage of scholarly publishers that have adopted the Recommendations on RSS Feeds


It is now two years since the ticTOCs Best Practice Recommendation group, headed by CrossRef and consisting of members from Talis, Nature Publishing Group, Oxford University Press and Heriot-Watt University, published the “Recommendations on RSS Feeds for Scholarly Publishers.”

RSS feeds are designed to be aggregated and reused by other services and software applications. In general RSS feeds should always be created with this in mind. The Recommendations are in full agreement with this principle.

Back in 2009, two practices were noticed by the ticTOCs Project:

  1. there was a wide variation amongst the journal TOC RSS feeds produced by scholarly publishers, and
  2. in most cases the feeds’ content carried very limited information on the articles, often only the title and the link to the article’s webpage.

Variations in the way publishers implement RSS feeds effectively preclude the consistent and automated aggregation of feeds. At the same time, having little content to offer limits the reusability and value of feeds for other services that want to create interesting applications by combining them. The Recommendations were created to help publishers avoid the problems caused by those two practices, and to advocate good practice in the production and provision of TOC RSS feeds for scholarly journals.

There are signs that the Recommendations are gradually being embraced, but how many scholarly publishers have really implemented them in their journal TOC RSS feeds? There’s no way to get an exact number, but we can get a good idea of the progress being made by looking at the number of journals that use the four RSS 1.0 modules recommended by the group, namely the Admin, Content, Dublin Core and PRISM modules.

Today we have examined the RSS feeds of the journals collected by JournalTOCs to get an approximate picture of how many publishers are making the move. Currently 17,112 journals from 917 publishers are being indexed by JournalTOCs.

Interestingly, no journal uses the Admin module in its RSS feed, and only a few hundred subscription journals make use of the Content module. However, those two modules are not particularly relevant from the reusability perspective: the Admin module is intended to let consumers of a feed report errors they encounter in it, and the Content module is used to include HTML-formatted content for browsers. The modules that can really give us a good indication of the Recommendations’ uptake are the Dublin Core and PRISM modules.
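A minimal sketch of how such a check could be done is shown below: it parses a feed and reports which of the four modules have elements in it, judging by XML namespace. This is an illustration under assumptions, not the code JournalTOCs actually runs. The function name and the sample feed are invented, and PRISM is matched by namespace prefix because it has been published under several versioned namespace URIs.

```python
import xml.etree.ElementTree as ET

# Namespace URIs associated with the four RSS 1.0 modules.
# PRISM is matched by prefix to cover its versioned namespaces.
MODULE_NAMESPACES = {
    "admin": "http://webns.net/mvcb/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/",
}


def modules_used(feed_xml: str) -> set:
    """Return the names of the modules whose elements appear in the feed."""
    root = ET.fromstring(feed_xml)
    used = set()
    for element in root.iter():
        tag = element.tag
        # Namespaced tags look like "{namespace-uri}localname".
        if not isinstance(tag, str) or not tag.startswith("{"):
            continue
        namespace = tag[1:].split("}", 1)[0]
        for name, uri in MODULE_NAMESPACES.items():
            if namespace.startswith(uri):
                used.add(name)
    return used


# Invented RSS 1.0 feed with one item that uses Dublin Core and PRISM.
SAMPLE_FEED = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">
  <item rdf:about="http://example.org/article/1">
    <title>An example article</title>
    <link>http://example.org/article/1</link>
    <dc:creator>A. Author</dc:creator>
    <prism:publicationName>Example Journal</prism:publicationName>
  </item>
</rdf:RDF>"""

if __name__ == "__main__":
    print(sorted(modules_used(SAMPLE_FEED)))
```

Only element names are inspected here, so a feed that declares a module namespace without using any of its elements is not counted as using that module.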

8,025 journals are using the Dublin Core module, the PRISM module or both, but only 3,673 of those journals are using both.

Looking at the same figures by publisher, 425 publishers are using the Dublin Core module, the PRISM module or both, and 295 of them use both.

Regarding Open Access journals, there are 2,660 of them in JournalTOCs. Of these, 708 have implemented either the Dublin Core or the PRISM module, but only 288 use both.
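The shares implied by the counts quoted above can be re-derived with a few lines of Python. This is only a convenience sketch: the dictionary layout and the function names are assumptions made for illustration, while the counts are the ones given in this post.

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


# Counts quoted in this post: totals indexed by JournalTOCs, those using
# Dublin Core and/or PRISM ("either"), and those using both modules.
COUNTS = {
    "journals": {"total": 17112, "either": 8025, "both": 3673},
    "publishers": {"total": 917, "either": 425, "both": 295},
    "open access journals": {"total": 2660, "either": 708, "both": 288},
}


def summarise(counts: dict) -> dict:
    """Return the 'either' and 'both' percentages for every group."""
    return {
        group: {
            "either": share(c["either"], c["total"]),
            "both": share(c["both"], c["total"]),
        }
        for group, c in counts.items()
    }


if __name__ == "__main__":
    for group, shares in summarise(COUNTS).items():
        print(f"{group}: {shares['either']}% either, {shares['both']}% both")
```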

Publishers and journals are using Dublin Core and PRISM modules

In conclusion: there is still a long way to go. Only about 32% of the publishers (295 of 917) are using the two main modules and have, to some extent, adopted the Recommendations. This is equivalent to 22% of the journals. For real progress to be made, two things should happen: (1) Elsevier, Springer-Verlag and Taylor and Francis together publish over 6,000 journals, so a significant step forward will only be made when those three large publishers adopt the Recommendations. (2) More Open Access journals need to implement the Recommendations; at present an inexplicably low number have done so. Without proper orientation and guidance, the publishers of OA journals have so far been unable to grasp the benefits of adopting best practices and using standard modules for their RSS feeds.

Written by Santiago Chumbe

October 29th, 2011 at 3:01 pm



Last month all the TOC RSS feeds from both University of California Press (UCAL) and University of Chicago Press (UCPRESS) stopped working. As a consequence, the current content of almost 100 journal TOCs went out of date on JournalTOCs.

UCAL and UCPRESS explained to us that, since they had recently moved all their journals to JSTOR, they were no longer producing their own TOC RSS feeds. So the problem was JSTOR 😉

When we visited the JSTOR website we noticed that JSTOR was systematically not producing TOC RSS feeds for any of its hosted publishers! There were some exceptions, but even in those cases the TOC RSS feeds were not easily available, as an account had to be registered before RSS alerts could be requested.

Fortunately, UCAL, UCPRESS and JSTOR staff were very proactive and happy to help. After an exchange of a few emails and tweets, JSTOR quickly grasped how important it is for publishers to have a TOC RSS feed for each of the journals it hosts. Exactly two weeks ago, John Holm, the JSTOR Technical Support Specialist, told us that TOC RSS functionality was going to be included in JSTOR's next platform update, scheduled to take place late this month.

Today, John Muenning, UCPRESS Publishing Technology Manager, advised us that JSTOR has reinstated the TOC RSS feeds for University of Chicago Press journals. We easily found the TOC RSS links for all the journals hosted at JSTOR under “Journal Tracking” on the right rail of the journal pages (TOCs and content) for each title.

Well done JSTOR!

JSTOR is now systematically producing freely available TOC RSS feeds for all the journals and publishers hosted on its platform.

The JournalTOCs harvester has started to detect the new TOCs from UCAL and UCPRESS via JSTOR, and we expect that their current content will soon be up to date on JournalTOCs again.

Written by Santiago Chumbe

March 23rd, 2011 at 9:34 am

Érudit journals are added to JournalTOCs


We are pleased to announce that all the journals published by the multi-institutional consortium Érudit have been added to JournalTOCs today.

Érudit is a non-profit publishing consortium comprising the Université de Montréal, the Université Laval and the Université du Québec à Montréal. Érudit forms the “Quebec node” of Synergies, Canada's platform for the dissemination of research. The Érudit platform provides access to several types of documents in the humanities and social sciences, as well as the natural science disciplines.


Established in 1998 as a digital publishing site at the Université de Montréal, Érudit launched its online platform in 2002, based on its own XML standard (the Érudit Article schema), which was developed to ensure the best conditions for the use and preservation of its digital documents.

Érudit uses the convenient OPML file format to maintain a permanently up-to-date list of all its journals. OPML matters to JournalTOCs, and to any RSS aggregator, because it allows changes in the list of journals (new journals, title changes, etc.) to be detected dynamically. Whenever Érudit updates its OPML file, JournalTOCs is updated too, which prevents the information on Érudit journals from growing stale in the JournalTOCs database. On the other hand, we have contacted Érudit to encourage them to enhance the quality of the metadata included in their journal TOC RSS feeds. In general, we advise publishers to follow the TicTOCs-CrossRef Recommendations for scholarly TOC RSS feeds.
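To make the idea concrete, here is a minimal sketch of how an aggregator could compare two versions of a publisher's OPML file to spot new journals, dropped journals and title changes. It assumes the common OPML convention of `outline` elements carrying `xmlUrl` and `title` (or `text`) attributes. The function names, sample documents and example.org URLs are invented for illustration and do not describe how JournalTOCs or Érudit actually implement this.

```python
import xml.etree.ElementTree as ET


def feeds_in_opml(opml_xml: str) -> dict:
    """Map each feed URL found in an OPML document to its journal title."""
    root = ET.fromstring(opml_xml)
    feeds = {}
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # outlines without xmlUrl are just grouping folders
            feeds[url] = outline.get("title") or outline.get("text") or ""
    return feeds


def diff_feeds(known: dict, current: dict) -> dict:
    """Compare the stored feed list with the freshly fetched one."""
    shared = set(known) & set(current)
    return {
        "added": sorted(set(current) - set(known)),
        "removed": sorted(set(known) - set(current)),
        "retitled": sorted(u for u in shared if known[u] != current[u]),
    }


# Invented OPML snapshots: journal B disappears, journal C appears,
# and journal A changes its title.
OPML_OLD = """<opml version="1.0"><head><title>Journals</title></head><body>
  <outline text="Journal A" title="Journal A" type="rss" xmlUrl="http://example.org/a.xml"/>
  <outline text="Journal B" title="Journal B" type="rss" xmlUrl="http://example.org/b.xml"/>
</body></opml>"""

OPML_NEW = """<opml version="1.0"><head><title>Journals</title></head><body>
  <outline text="Journal A (new series)" title="Journal A (new series)" type="rss" xmlUrl="http://example.org/a.xml"/>
  <outline text="Journal C" title="Journal C" type="rss" xmlUrl="http://example.org/c.xml"/>
</body></opml>"""

if __name__ == "__main__":
    print(diff_feeds(feeds_in_opml(OPML_OLD), feeds_in_opml(OPML_NEW)))
```

Keying the comparison on the feed URL rather than the title is a design choice of this sketch: it lets a title change show up as a rename instead of as one removal plus one addition.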

The 88 journals currently published by Érudit will be available online from JournalTOCs from 1 March 2011.

Written by Santiago Chumbe

February 24th, 2011 at 1:28 pm

All journals published by RMIT are added to JournalTOCs


We are delighted to announce that today all 354 journals published by RMIT Publishing have joined JournalTOCs. The TOCs of 27 of these journals are already available from the JournalTOCs website via IngentaConnect RSS feeds, and the TOCs of the remaining 327 journals will be available via RMIT's own RSS feeds from next Tuesday, 1 March 2011.

Journals published, through the Informit brand of RMIT Publishing, by 249 institutions, including all the universities of Australia and New Zealand, will be making their TOCs available via JournalTOCs.

RMIT journals added to JournalTOCs

RMIT Publishing is the leading provider of online research specialising in content from Australia, New Zealand and the Asia Pacific region. First established in 1989 within the library at RMIT University, it is a wholly owned subsidiary of RMIT University, based in Melbourne, Australia. With a broad range of content across the humanities and applied sciences, RMIT Publishing serves a wide user community of over 500 regional partners.


Written by Santiago Chumbe

February 23rd, 2011 at 12:43 am