OPERAS Tools Research and Development White Paper, July 2018
- Method and Criteria for Choosing Tools
- Peer Review Tools
- Authoring Tools
- Publishing Tools
- Annex 1: “101 Innovations in Scholarly Communication” Project
- Annex 2: Comparison Table of Annotation Tools (by Hypothesis)
- Annex 3: Poster of the Tools Research and Development Working Group presented at the OPERAS Conference “Open Scholarly Communication in Europe. Addressing the Coordination Challenge”, 31 May – 1 June 2018, Athens
This white paper was prepared by the Tools (R&D) Working Group, one of the seven Working Groups launched by the OPERAS research infrastructure. The Working Group’s goal was to draw up a list of tools, and of the development work still needed, to improve their usability for the OPERAS partners.
The approach in OPERAS emphasizes the importance of building the open science scholarly communication infrastructure in Social Sciences and Humanities on community driven tools. In this perspective, the development of Open Source tools and the setup of a toolbox appear to be appropriate answers to the existing needs and evolutions in scholarly publishing.
Following a first discussion in the Working Group, participants discussed the partners’ practices and needs to help focus the Working Group objectives on three functions:
- Peer review: interest in emerging practices such as open peer review, peer review tracking
- Authoring: interest in simple and all-in-one services, especially online and collaborative authoring
- Publishing: in particular, simple tools needed by small academic journals
The main results of the Working Group are:
- Notes on observed trends
- A common approach and criteria for choosing tools
- A list of relevant tools, detailing features and functionalities
- An analysis of the current needs of the partners
For Peer Review, the reviewing workflow is implemented in most Open Source software, such as Open Journal Systems (OJS), but further development is still needed to match commercial software services. Similarly, the review tracking data available via services such as Publons is currently not open. The emerging trend for Open Peer Review represents an innovative area, both in terms of usage and tools.
For Authoring, we see a bloom of new online and collaborative tools. Promising Open Source software for editing structured scholarly content is being developed and is close to production, alongside commercial tools such as Authorea or Overleaf. One of the main challenges, in this case, is to obtain a continuous production environment through interoperability.
For Publishing, several Open Source software solutions are already used in production. However, as the level of service expected from a publication service rises and includes a growing number of third-party services, the community is considering ways of pooling its efforts in order to match the state of the art of commercial solutions.
The OPERAS partners are willing to go beyond this working group and consider engaging in follow-up projects, notably to help create a resource centre dedicated to providing the community with current information and support on scholarly communication software and tools, and to contribute to the effort in developing Open Source tools.
This white paper has been created by the Tools (R&D) Working Group, one of the seven Working Groups set up for the OPERAS research infrastructure. The Working Group goal was to create a list of tools, noting any development that needs to be made, to improve their usability for the OPERAS partners.
This Working Group is linked with other Working Groups, in particular with the Best Practices Working Group.
The technical mapping done by OPERAS encompasses many functions of the scholarly communications process.
Following a first discussion in the Working Group, participants discussed the partners’ practices and needs which helped to focus the Working Group objectives on three functions:
- Peer review: interest in emerging practices such as open peer review, peer review tracking
- Authoring: interest in simple and all-in-one services, especially online and collaborative authoring
- Publication (dissemination): in particular, simple tools for small academic journals
It is important to stress that the activities of OPERAS members are very diverse, as the landscape study and technical mapping have shown. A first distinction can be made between the partners who have traditional editing activities and those who are providers of publishing services. The first case requires, for instance, specific tools for peer reviewing. In the second case, publishing activity goes from editing to dissemination, and this leads to another distinction. Whereas editing is mainly related to copy-editing and typesetting in a broad sense, dissemination raises specific questions regarding both hosting and distribution. Apart from authoring, reviewing, editing and disseminating tools, other tools that are part of the scholarly communication process without being related to the production of the published document also need to be examined.
Tool types and characteristics can be very different, as a tool can be:
- A technical brick (software libraries or frameworks) which still needs to be adapted and / or integrated into a wider application to be utilised (e.g. Texture)
- A software application which still needs to be configured, installed and maintained (e.g. OJS, Lodel)
- A ready-to-use software as a service fulfilling one or more functions of the scholarly communication activities (e.g. Scholastica, Publons)
Note that software-as-a-service tools include both end-user applications and technical services (APIs) intended to be used by other software over the network.
The table below proposes a generic non-exhaustive typology of the tools considered in this paper.
|Table 1: Generic Typology of Tools|
|Tool||Openness||Type||Functions|
|EditorialManager||not open||service||PR workflow, publishing|
|F1000 research||not open||service||Open PR, publication|
|Hypothes.is||open||service||authoring, Open PR, publication|
|Libero (elife Continuum)||open||application / component||publishing|
|ManuscriptsApp||to be opened||application||authoring|
|OJS||open||application||PR workflow, publishing|
|OMP||open||application||PR workflow, publishing, books|
|Peer Community in||not open||service||Open PR|
|PeerageOfScience||not open||service||PR tracking|
|Publons||not open||service||PR tracking|
|Rua||open||application||PR workflow, books|
|ScholarOne Manuscripts||not open||service||PR workflow|
|SciELO||open||application / component||publishing|
|Science Open||not open||service||post publication PR|
All functionalities may be more or less integrated or available in separate software products which may need more or less custom development or configuration, so as to interoperate and form a complete publication chain.
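The typology in Table 1 can also be held as a small data structure and queried by function. The following is only a minimal sketch: the rows are copied from Table 1, while the field names `openness`, `kind` and `functions` and the helper `select` are labels chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    openness: str     # "open", "not open" or "to be opened", as in Table 1
    kind: str         # "service", "application", "application / component"
    functions: tuple  # functions covered, as listed in Table 1


# A subset of the rows of Table 1 (values copied from the table).
TOOLS = [
    Tool("OJS", "open", "application", ("PR workflow", "publishing")),
    Tool("OMP", "open", "application", ("PR workflow", "publishing", "books")),
    Tool("Rua", "open", "application", ("PR workflow", "books")),
    Tool("Publons", "not open", "service", ("PR tracking",)),
    Tool("ScholarOne Manuscripts", "not open", "service", ("PR workflow",)),
    Tool("SciELO", "open", "application / component", ("publishing",)),
]


def select(tools, function, openness=None):
    """Return the names of tools covering `function`, optionally filtered by openness."""
    return [
        tool.name
        for tool in tools
        if function in tool.functions
        and (openness is None or tool.openness == openness)
    ]


print(select(TOOLS, "PR workflow", openness="open"))
```

With the rows above, the query lists the open applications that implement a peer review workflow; the same structure could be extended with the criteria discussed later in this paper.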
The tools considered in this Working Group address functions that are part of the researcher workflow according to R. C. Schonfeld (Schonfeld, 2017): mostly writing, collaborating, reviewing, and publishing. Nevertheless, as recent trends indicate (Schonfeld, 2018), the major commercial players in the academic publishing sector are integrating more and more functions and services to cover the workflow more completely, and this presents the researcher community with the serious risk of being locked into a particular suite of tools.
This lock-in risk also justifies one of the key assumptions of the OPERAS consortium: scholarly publishing is part of research activity, and the SSH community (including the OPERAS partners) should have a certain control over tools and contribute to tool development. In other words, we believe that scholarly communications tools should be community driven. This is why the Working Group is called “Tools (R&D)” and not simply “Tools”. This is also why we will have a particular focus on Open Source tools, as they can (at least potentially) be adapted and extended by the community. However, we will also mention closed source software that is widely used or has interesting features.
The CEO of Hindawi publishing, Paul Peters, stresses the risks of relying on proprietary scholarly communications infrastructure and promotes the move towards an open scholarly infrastructure, which will be challenging. In his view, “in order to prevent private companies from owning and controlling this infrastructure a radically open approach to its development is required” (Peters, 2017). The proposition is to ensure simultaneously Open Source, Open Data, Open Integrations, and Open Contracts. In fact, not only should the data be open, but also the infrastructure managing them and the way the services are implemented (Neylon, 2015). The publishing software vendor and OA journal publisher Scholastica (Scholastica, 2017 and 2018) stresses the importance for the academic community of having a consistent toolbox in order to take back control of the publication process.
More precisely, ‘open source’ is not in itself a guarantee: the startup that produced the software may be bought by a larger company, and the licence may change overnight to a closed source one (Pooley, 2017). Although a community may still fork the initial (Open Source) code, in practice this means that the “true” openness criterion is that the tool should be managed by an open community.
Alongside the governance issues, the use of many publishing tools in many environments also raises interoperability challenges. Such challenges are common to virtually all tools: how to enable a user to move data and documents (easily) from one tool, platform or environment to another. Interoperability will be partly addressed by another Working Group dedicated to Standards. However, the question will also be considered here as it represents a specific aspect of the practical issues faced by many publishers, especially small ones.
The goal of the Working Group is to identify the tools that OPERAS partners want to focus on and then work together to adopt them and adapt them to partner needs.
The main results expected from this work are:
- Notes on observed trends
- A common approach and criteria for choosing tools
- A list of relevant tools, detailing features and functionalities
- An analysis of the current needs of the partners
However, this working group does not aspire to provide a complete tool catalogue or a detailed comparison / benchmark of the tools best suited to each purpose. There is indeed a wide variety of actors, and a large diversity of business needs; thus, it is a real challenge to give recommendations, as the suitability of a tool for a particular need depends heavily on each particular situation. Furthermore, such work requires real investment in human resources, which may take place in OPERAS RI but is not possible within the framework of this Working Group.
We propose following this simple method:
- The first step is to have a clear idea of the requirements. This is often not easy, and the Best Practices Working Group can help to clarify the requirements. In the case of OPERAS, the question may be not only to cover the particular needs of one partner, but to find an Open Source tool that can be reused and adapted to cover the needs of several organisations, at least the needs of the publishers. The requirements can be summarised as a list of criteria, which can be grouped into technical, functional, usage and governance.
- It is also necessary to know the “tool landscape” (or market) in order to select candidate tools to examine more closely. This is where a list of tools grouped by function can be useful. The business need is often complex and not limited to a single well-defined functionality. The available software or services may not cover the need completely or, on the contrary, may cover much more than what is needed.
- It is then possible to compare candidate tools to assess which is best suited to our needs and to evaluate what further development still needs to be done to meet requirements.
Some criteria are common to all tools; others are of course specific (features). More importantly, the border between tools is not always clear. As noted above, many services or platforms include several tools, so an authoring or publishing tool might be tied to one platform (or worse, cannot be used outside this platform); or several tools may be part of a software suite, designed to work together, and cannot be used separately (at least not without a large adaptation effort, when they are Open Source).
The technical analysis helps to narrow down the choices and can address these questions:
- What type of tool is it? A technical brick, an application software, a running service?
- Is the tool mature?
- Is the tool based on open standards? E.g. which structured document formats are supported? Does the tool follow NISO standards?
- On which technology (e.g. language, framework) is it based?
- Is the tool part of an integrated tool suite (risk of vendor lock-in)?
- How does the tool perform (e.g. response time…)?
The questions about usage should help to define the service provided by the tool and to assess its quality:
- Is the tool easy to use?
- Is it easy for a newcomer to understand what the tool really does (this is far from always the case, when first visiting a tool’s website)?
- How large is the user community?
- What is the scope of the tool: e.g. for which kind of publication is it intended: journals, books, both, other kind of documents?
- Is it well documented? Is it internationalised?
- Is the tool accessible via an existing platform with a good quality of service? Or does it need to be installed and operated by the user’s organisation?
Governance criteria are key for assessing the long-term sustainability of the tool:
- What is the software licence?
- Who owns the software? Is the tool owned by a private company, an institution, or is it governed by a community?
- Are the governance rules defined somewhere?
- Does the tool have a roadmap? Is the development active?
- Is the software publisher a member of an industry coalition (such as AAK for annotation, etc.)?
Functional criteria (features)
Of course, features are very dependent on the kind of tool (peer review, authoring, publication).
A feature list can be established based upon the feature lists of existing software and services published on each tool’s website. See “Annex 2: Comparison Table of Annotation Tools (by Hypothesis)” for an example using annotation tools.
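The four groups of criteria (technical, usage, governance, functional) can be combined into a simple comparison of candidate tools. The sketch below is one possible way of doing so and is not a method prescribed by the Working Group: the weights, the 0 to 5 scale, the candidate names `tool_a` and `tool_b` and their scores are all hypothetical, and each organisation would set its own values according to its requirements.

```python
# Criteria groups named in the text; the weights are illustrative assumptions.
WEIGHTS = {"technical": 0.25, "usage": 0.25, "governance": 0.3, "functional": 0.2}


def weighted_score(scores, weights=WEIGHTS):
    """Combine per-group scores (0 to 5) into a single weighted figure.

    Raises ValueError if one of the criteria groups has not been assessed.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"unassessed criteria groups: {sorted(missing)}")
    return sum(weights[group] * scores[group] for group in weights)


def rank(candidates, weights=WEIGHTS):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical assessments of two unnamed candidates (not real evaluations).
candidates = {
    "tool_a": {"technical": 4, "usage": 3, "governance": 5, "functional": 2},
    "tool_b": {"technical": 3, "usage": 5, "governance": 2, "functional": 4},
}

print(rank(candidates))
```

A single figure hides a lot of context, so such a score is best read as a starting point for discussion between partners rather than as a recommendation.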
Structured and uniform peer-review practices are a prerequisite for creating a standard for scholarly material that works across platforms, academic sub-disciplines, publishers and geographic regions. The aim to increase the accountability of research within humanities and social sciences is dependent on publishers and libraries continuing to develop services for authors and readers and making them digitally accessible and searchable to a greater extent, especially if they want to catch up with publishing within sciences, technology and medicine and with the quickly developing journal platforms. This section is meant to outline available tools for managing peer review and spot the gaps or challenges with available services.
First and foremost, there are standards describing the outline of the peer review process, such as the international guidelines for editors, reviewers and authors provided by the Committee on Publication Ethics (COPE), or national initiatives like the Belgian GPRC mark (Guaranteed Peer Reviewed Content). In this report, we focus on the more technical aspects of peer review that support such standards: systems for peer review (including open peer review), peer review tracking, and tools for standardising the paper submission workflow to different publishers (to ensure a smooth review process).
The process of peer review can be anonymised, partly anonymised or completely open, depending on the academic subject area and the scholarly community available within that realm. The different types of peer review and their application within academic disciplines have been well described in the article ‘A multi-disciplinary perspective on emergent and future innovations in peer review’ (Tennant, J et al., 2017). Most proprietary and Open Source platforms for managing academic journals, such as OJS, include a peer review module as part of their core services. These systems allow editors and management to maintain a structured process and to create an archive of editorial processing to enable transparency. The most commonly used systems for peer review management, such as Editorial Manager or ScholarOne, include sophisticated modules for reporting on user activity, automating processes and measuring the quality of submitted reviews. However, more powerful reporting about editorial activity in Open Source software seems to be on the wishlist of many editors. There are reporting plugins for tools like OJS, but they require a lot of manual work to be useful for analysis.
The peer review process for books is handled slightly differently from that for journals, as this is an evaluation process that in the past was managed at the discretion of the academic publishers. With the growing movement among university presses, where a lot of emphasis has been on creating spaces for Open Access monographs, there has also been a push for developing tools for peer review of such publications. Examples of systems supporting the evaluation process are the Public Knowledge Project platform Open Monograph Press (OMP), and the Rua platform provided by Ubiquity Press. Both systems provide management platforms for the entire editorial process related to monographs and edited volumes, including a module for conducting structured peer review, as well as the production and distribution of electronic books. Both OMP and Rua are Open Source systems available for free download and adaptation by users. Publishers are already experimenting with annotation tools like Hypothesis to provide a more transparent editing process and open peer review for books.
To get further acquainted with ideas on how to develop better peer review tools, please consult documentation from the Peer Review Transparency Workshop. This group of scholarly publishers, academic librarians, and IT experts is working to establish peer review standards and possible peer review labels comparable to the CC license labels.
Funders of research and academic institutions are currently aiming towards a higher level of transparency within the scholarly communications arena, to take back some of the control over the current quality assurance process for articles from publishers who have been criticised for not doing a proper job. This process has also been the purview of commercial publishers for a long while, as they have had the resources to develop tools for improving procedures. Following the development of more open practices within scholarly communication, such as open access to publications and research data, as well as the increased use of preprint servers to release early stage works for critique, it seems natural to also open up the peer review process to scrutiny. This is called open peer review (OPR) or post-publication review. Most academic publishers already have systems in place to manage peer review, but few have yet opened up the peer review process for readers to access the information from the process.
Open peer review means that the item is published online first, and reviewers are invited to publish their comments online. Usually, this procedure also includes versioning of the item to allow the author to submit subsequent revisions based on the reviewer comments. Using open peer review could potentially address several perceived problems with the current practice of scholarly quality control, such as unreliability and inconsistency, as well as a lack of incentives for peer reviewers (Ross-Hellauer, 2017).
Current platforms offering open peer review include F1000 Research and ScienceOpen, where the open review procedure is an integral part of the publishing service. Entire open peer review networks are emerging, for example https://peercommunityin.org, where the creators aim to develop an open community for researchers interested in OPR, to develop best practices, and to provide a list of potential experts who can be invited. There are also subject-specific networks, such as the first community in evolutionary biology https://evolbiol.peercommunityin.org, where authors can upload their preprints and get comments from peers before they submit to journals. The Open Review Toolkit enables anyone to convert a book manuscript written in Markdown into a website that can be used for open review. Developed at Princeton and relying on Pandoc and Hypothesis, this software takes a book manuscript (currently only in Markdown, which is quite limiting), converts it to HTML, and enables open peer review of that document. Other examples of innovations or platforms for developing and opening up the peer review process are ‘Peerage of Science’, ‘Publons’ and ‘F1000 Research’. These three services are described in the article ‘What’s next for peer review?’ (Research Information, 2016).
Anonymised peer review is a challenge for many editors, as anonymisation has to be done for each individual article uploaded by an author, and most of the work is therefore done manually in the software used for writing. A tool for anonymisation would need to check references for self-citation, review the linkage data in the actual text, and inspect the user settings of each document for information that would reveal the author’s identity. A truly anonymised work is in practice extremely hard to achieve, especially in small academic fields where many researchers already know each other from conferences or other networking. There nevertheless seems to be a need for such a tool, so this could be something to consider, for example by building an OJS plug-in that checks whether the author’s name is mentioned in the submitted material. Another useful tool to preserve author/reviewer integrity would be an automated check for conflicts of interest between authors and reviewers (answering questions such as: have they collaborated on the same project or worked together in the same department?).
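The two checks suggested here, looking for the author’s name in the submitted material and flagging a shared department between author and reviewer, can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function names are hypothetical, it is not an existing OJS plug-in, the names and affiliations are invented, and a real tool would also need to inspect document metadata and handle name variants.

```python
import re


def find_identity_leaks(text, author_names):
    """Return (name, line_number) pairs where an author name appears in the text.

    Matching is case-insensitive and restricted to whole words.
    """
    leaks = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name in author_names:
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                leaks.append((name, line_no))
    return leaks


def shared_affiliations(author_affiliations, reviewer_affiliations):
    """Return affiliations common to author and reviewer (a possible conflict of interest).

    Affiliations are compared after lower-casing and collapsing whitespace.
    """
    def normalise(value):
        return " ".join(value.lower().split())

    authors = {normalise(a) for a in author_affiliations}
    reviewers = {normalise(r) for r in reviewer_affiliations}
    return sorted(authors & reviewers)


# Invented example data.
manuscript = (
    "As we showed earlier (Dupont, 2015), the corpus is small.\n"
    "See also Martin (2012)."
)
print(find_identity_leaks(manuscript, ["Dupont"]))
print(shared_affiliations(
    ["Dept. of History,  Example University"],
    ["dept. of history, example university"],
))
```

A check of this kind can only flag passages for an editor to review: a self-citation is not always a leak, and a shared affiliation is not always a conflict.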
Peer review is a critical mechanism for the scholarly communications landscape to function. The added value of peers who donate their time to evaluate potential publications for consistency and accuracy is enormous. Most of this work is done by researchers without any guarantee of recognition or reward, as the work is considered to be intrinsic in what it means to be an academic. In recent years there has been an ongoing discussion within academia that questions the added value for those who spend considerable time commenting on the work of others. Digital practices in publishing, however, offer more opportunities to address the lack of information on the number of reviews completed per year by each researcher. We have, therefore, seen an emerging trend of tools being developed to better track and ease peer review activity (Tattersall, 2014). This would, however, demand that systems for peer review be aligned with tools that recognise users through unique identifiers, such as ORCID. The integration of data about peer review activity is already being used by F1000, American Geophysical Union (AGU) and Publons. By June 2017, these services had added information to 9,800 ORCID records, to appear on users’ personal pages. The challenge with tracking peer review on a wider basis is that it requires a digital workflow standard that not all systems deliver at the moment.
Many of the tools we found in this category seem to be proprietary in one way or another, apart from the platforms for managing the editorial process for books. There seems to be an open market for tools to enhance the editing process, where the paid-for services appear to be most used for the time being. Publons is, for example, free to use for researchers, but publishers have to pay for the service to be integrated in their systems as well as for extracting data, which many smaller organisations may not be able to afford. The partners would ideally like to have a similar but more open tool, one that ensures data can be collected for all research output. In fact, the main challenge is to collect information in a reusable database of peer reviewers; if the data were available, the software itself could be developed with an Open Source licence. Such a large database would, however, need to take into consideration the integrity of its users in relation to GDPR legislation, as well as the risk of exhausting reviewers, who may have to turn down too many invitations to review. Both ethical and practical guidelines should therefore be developed to meet the requirements of the GDPR on how such data should be used and processed.
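The figure discussed here, the number of completed reviews per year per researcher, is simple to compute once review records carry a unique identifier. The sketch below assumes a hypothetical record layout (`orcid`, `status`, `completed_on`) and uses placeholder identifiers; it does not reflect the data model of Publons, ORCID or any other service.

```python
from collections import Counter


def reviews_per_year(records):
    """Count completed reviews per (identifier, year).

    Each record is a dict with the hypothetical keys "orcid", "status" and
    "completed_on" (an ISO date string, only read for completed reviews).
    """
    counts = Counter()
    for record in records:
        if record["status"] != "completed":
            continue
        year = int(record["completed_on"][:4])
        counts[(record["orcid"], year)] += 1
    return counts


# Placeholder identifiers, not real ORCID iDs.
records = [
    {"orcid": "0000-0000-0000-0001", "status": "completed", "completed_on": "2017-03-14"},
    {"orcid": "0000-0000-0000-0001", "status": "completed", "completed_on": "2017-11-02"},
    {"orcid": "0000-0000-0000-0001", "status": "declined", "completed_on": ""},
    {"orcid": "0000-0000-0000-0002", "status": "completed", "completed_on": "2018-01-20"},
]

for (orcid, year), total in sorted(reviews_per_year(records).items()):
    print(orcid, year, total)
```

The counting itself is trivial; the difficulty noted in the text lies in obtaining such records in a common, open format from every review system.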
The web publication domain is very active: the W3C has a dedicated Publishing Working Group and Open Source software is flourishing. In fact, within recent years, a large number of native web authoring tools have been developed, even within the academic environment.
This seems to be a promising and important trend, as it may greatly facilitate the authors’ work and transform the editing process. A key feature that goes along with online authoring is access to collaborative features (synchronisation, version control, etc.), as, in principle, any authorized user can edit a document concurrently with another user.
From a broader perspective, collaborative editing capabilities can impact the whole publishing workflow. The authoring software is usually still Microsoft Word: peer review is done on the Word document and managed by a workflow that produces a PDF publication. However, online collaborative tools could greatly modify this process by enabling online writing or typesetting, and especially collaborative peer review. Publishing functions (see “4. Publishing Tools”) may also be impacted and become more seamless when linked to an online authoring tool. In that sense, online and easy-to-use tools could be a critical opportunity to move “away from PDFs” and the traditional publishing process (Scholastica et al., 2017).
In fact, online tools operate through a specific workflow, especially as far as formatting or typesetting is concerned. In a traditional workflow, when the article or the book is ready for publication, it is usually converted to an exchange format. The exchange format often uses a markup language such as XML/TEI for SSH, JATS for medicine and biology (and for SSH, in the case of SciELO), LaTeX for maths and physics. As online tools are natively based on these exchange formats, the conversion challenge is largely solved. However, it is still necessary to ensure interoperability between different exchange formats, especially when tweaked versions are used.
With these kinds of technologies, it is then possible to achieve or envisage interoperability use cases such as:
- The conversion between structured formats Markdown, LaTeX, XML (TEI, JATS…), Word/Office styles
- Exchange with Peer Review tools and with publication tools (e.g. FidusWriter and OJS, Lodel and OJS etc.)
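As an illustration of the first use case, conversion between structured formats is the kind of task handled by Pandoc, mentioned earlier in connection with the Open Review Toolkit. The sketch below only builds a Pandoc command line and does not run it. The choice of Pandoc, the helper name `pandoc_command` and the file names are assumptions made for this example; which formats can be read and which can only be written should be checked against the Pandoc manual.

```python
import shlex

# Pandoc format names for the exchange formats listed above.
# Reader and writer support differs per format; check the Pandoc manual.
PANDOC_FORMATS = {
    "markdown": "markdown",
    "latex": "latex",
    "jats": "jats",
    "tei": "tei",
    "word": "docx",
}


def pandoc_command(source, source_format, target, target_format):
    """Build (but do not run) a Pandoc command line for one conversion."""
    return [
        "pandoc",
        "--standalone",
        "--from", PANDOC_FORMATS[source_format],
        "--to", PANDOC_FORMATS[target_format],
        "--output", target,
        source,
    ]


# Hypothetical file names.
command = pandoc_command("article.md", "markdown", "article.xml", "jats")
print(shlex.join(command))
```

On a machine where Pandoc is installed, the list could be passed to `subprocess.run(command, check=True)`. A generic conversion of this kind will not cover tweaked versions of a format, which is where the interoperability effort described above is still needed.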
In fact, interoperability between various tools is of major importance for the community, as it makes it possible to build a continuous production environment. The example of FidusWriter providing formatted content used in the OJS Peer Review process could, in turn, inspire integration of the same Peer Review process with Lodel’s XML-TEI file generation.
Tools which exemplify the new trends mentioned in this introduction, in particular Open Source software, are listed and briefly described in what follows.
MarkDown & LaTeX
In-browser editors for scholarly publishing are not yet very mature, but development is active:
- Manifold is developed and used by the University of Minnesota. It is designed for publishing books.
- ProseMirror is an Open Source toolkit for building collaborative text editors, used in two projects in the scholarly publication community:
- FidusWriter is funded as a German research project and has interesting plugin features which include an OJS plugin and the ProseMirror editor.
- MIT’s PubPub editor is both an Open Source editor and a publishing community.
- Sciflow is free for single users but not for organizations. It offers advanced editing and collaborative tools. It is also based on the ProseMirror toolkit and uses the HTMLBook format as its pivot format (https://www.sciflow.net/en/faq). It uses Open Source bricks at the moment and should be fully Open Source in the future.
All three rely on MarkDown and support LaTeX.
- ManuscriptsApp is an authoring tool for Mac users, which will go Open Source in 2018. At that time it will move to web editing capabilities, and thus will also be usable from Windows and Linux machines.
Texture, from the Substance Consortium (SciELO, PKP, Érudit, eLife), has first-class support for structured XML documents (JATS), but the tool is still in beta. A beta version of Texture 1.0 was released in March 2018, and the official 1.0 release is planned for September 2018. The Texture desktop application for Mac or Windows is based on the Electron framework. Further JATS XML tools to support authoring, production, and display are available (for example on JATSWiki).
The California-based Collaborative Knowledge Foundation (CoKo) aims to build an Open Source framework of editing and publication tools, which has so far led to the development of a book production platform, Editoria (xpub being the journal publication platform). Editoria is built on PubSweet, the tool framework developed by CoKo, which includes a collaborative editor, Wax (still under development), derived from Texture.
Closed source tools may be somewhat more mature, e.g. the proprietary platform Leanpub. Specializing in book authoring, the Leanpub web editor allows direct export to PDF, EPUB and Mobi. The service also comes with a storefront for selling the books.
As an example of a commercial offering, here is the official Authorea feature list:
- Service: Hosted installation, 24×7 support;
- Data management: Host Data for tables and figures, Mint a DOI, Version control (Git);
- Authoring: History view, Templates for leading conferences, institutions, and journals, Collaborate and manage co-authors, Comments, Equations editor, Interactive figures;
- Publishing: Multiple markup languages (add blocks of Markdown and LaTeX to your document as needed), Advanced export and journal styles, Direct submissions to a growing number of journals.
Overleaf offers similar services and is based on LaTeX and Rich Text. Given its range of services, Authorcafé seems to be more a platform than a specific tool, but it does not describe its technical environment in detail.
Some commercial tools assist authors in finding the best journal for publishing their research and/or in adapting their article to the journal’s submission rules, such as:
- APA Style Central: http://apastylecentral.apa.org (no OA/Free version)
- Manuscript Matcher: http://endnote.com/product-details/manuscript-matcher (free version only as trial)
- F1000 Workspace: https://f1000workspace.com/ (free version only as trial)
- Overleaf (LaTeX writer): https://www.overleaf.com (available in gratis version)
Reference management tools are a key aspect of authoring and detailed information can be found on a dedicated Wikipedia page.6 Among the Open Source tools, some are already well known, like for instance Zotero or BibSonomy. The already mentioned FidusWriter also includes as one of its main features the reference management. There are currently innovative tools such as recite (beta version), that allows to check the consistency of the references against the text’s content.
Related to another aspect of researchers’ authoring activity, MECA7 is a proposed mechanism (ZIP folder with JATS-like XML files) to simplify transfer of manuscripts across publishers. Participating organizations (and systems) include Clarivate Analytics (ScholarOne), Aries Systems (Editorial Manager), eJournal Press (GEMS), HighWire (BenchPress). Although the use case for MECA is in STM and Biology, and the SSH context is quite different, the Working Group found that this kind of solution may be of interest in our context.
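To make the idea of a "ZIP folder with JATS-like XML files" more concrete, here is a minimal sketch in Python. It is not an implementation of the MECA specification: the file names (`manifest.xml`, `article.xml`, `manuscript.pdf`), the manifest element names and the helper functions are illustrative assumptions only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(title: str, manuscript_pdf: bytes) -> bytes:
    """Return the bytes of a ZIP archive bundling a manuscript with its metadata."""
    # JATS-like article metadata, heavily simplified for illustration.
    article = ET.Element("article")
    front = ET.SubElement(article, "front")
    meta = ET.SubElement(front, "article-meta")
    group = ET.SubElement(meta, "title-group")
    ET.SubElement(group, "article-title").text = title

    # Hypothetical manifest listing the other files in the package.
    manifest = ET.Element("manifest")
    for name in ("article.xml", "manuscript.pdf"):
        ET.SubElement(manifest, "item", {"href": name})

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", ET.tostring(manifest, encoding="unicode"))
        archive.writestr("article.xml", ET.tostring(article, encoding="unicode"))
        archive.writestr("manuscript.pdf", manuscript_pdf)
    return buffer.getvalue()


def list_package(data: bytes) -> list[str]:
    """Return the sorted file names contained in a package."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return sorted(archive.namelist())
```

A receiving system would unzip the package, read the manifest and import the metadata and files it understands. The appeal of this kind of mechanism is that both sides only need to agree on the package layout, not on each other's internal data model.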
As said in the introduction, this Working Group does not aspire to provide a complete tool catalogue, nor a detailed comparison / benchmark of the tools most adapted to each purpose.
We have mentioned the “101 Innovations in Scholarly communication” project, where a pioneering approach was developed for comparing and articulating the various tools used in the research workflow, at a very broad level. On a more focused scope, we have also added in the appendix an interesting example of a comparison table for collaborative web annotation tools, which shows a possible approach that the Working Group might adopt in a follow-up project.
Based on the general criteria listed in “2.1 Peer Review Tools Overview”, and the analysis thereof, here is a draft proposal of a comparison table for Open Source tools.
A way to go further could be simply to let each software development community complete its column.
|Table 2: Comparison of Open Source Tools|
|Type of Attribute||Criteria / Feature||Texture||Libero||Editoria||…|
|1. general||alternative to|
|1. general||remarks||very open, a lot of code on github including the lens viewer: https://github.com/elifesciences/lens|
|2. functional||type of tool||editor based on the substance js library||publishing+hosting platform||writing/editing platform|
|3. usage||maturity||alpha||operational for elifescience||v 1.0 since 2017|
|3. usage||user base||tests||https://elifesciences.org/ journal||online|
|4. technical||code||http://github.com/substance/texture||Based on Pubsweet|
|4. technical||standards compliance?|
|4. technical||pivot format||JATS||JATS||XML|
|5. governance||ownership||substance consortium||elife Science||Univ of California press|
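The idea of letting each software community complete its own column can be sketched as a small data structure. The Python below is only an illustration under stated assumptions: the `CRITERIA` list is a subset of the table above, and the function names and the validation rule are hypothetical, not part of any existing OPERAS tool.

```python
# Criteria that a community may fill in (subset of Table 2, for illustration).
CRITERIA = ["type of tool", "maturity", "pivot format", "ownership"]

# One column per tool, seeded with values taken from Table 2.
table: dict[str, dict[str, str]] = {
    "Texture": {"maturity": "alpha", "pivot format": "JATS"},
    "Editoria": {"maturity": "v 1.0 since 2017", "pivot format": "XML"},
}


def complete_column(table: dict[str, dict[str, str]], tool: str,
                    answers: dict[str, str]) -> None:
    """Merge a community's answers into its column, rejecting unknown criteria."""
    unknown = set(answers) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    table.setdefault(tool, {}).update(answers)


def render_row(table: dict[str, dict[str, str]], criterion: str) -> str:
    """Render one criterion as a row, with tools in alphabetical order."""
    cells = [table[tool].get(criterion, "") for tool in sorted(table)]
    return "|" + "||".join([criterion, *cells]) + "|"
```

Restricting contributions to a fixed list of criteria keeps the columns comparable, which is the main risk when each community describes its own tool.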
Once a manuscript has been reviewed and typesetting is complete, it is quite straightforward to make the publication available on the web; any CMS would do that (e.g. WordPress), and the content could then be found on search engines or from the web sites known to the researchers of the particular discipline. This was the state of the practice maybe 10 or 15 years ago, notably in SSH. However, the standard services expected from a scholarly publishing platform are now much more demanding and tend to increase each year. A good example of a publishing tool widely used for this purpose by smaller journals is OJS, which can either be self-hosted or be hosted by PKP. OJS includes more features dedicated to academic publishing than a generic CMS like WordPress, for example an end-to-end submission-to-publication workflow, tools to easily organise content into issues before publication, automated export of articles for DOI registration, and OAI-PMH endpoints.
The publication functions of a hosting platform include content management (version control, status), quality checks, metadata annotation (authors, affiliations, keywords), bibliographic reference management, linking citations and metadata to identifiers and registries (ORCID, Funder Registry, DOI…), format production (PDF, EPUB, print, HTML, XML…), metrics and altmetrics, fee processing (which may be relevant even for open access, e.g. for APCs or for a freemium model where HTML is free and PDF is not), and so on.
An important feature of the publishing process is making the content discoverable beyond the publishing website. This is often done through active distribution (pushing to indexes like Crossref or PubMed) or through passive methods such as OAI-PMH and presenting metadata for Google Scholar. Some indexes (Web of Science, Scopus) harvest content manually, so a site just needs logically structured pages. As seen above, the functionalities may be more or less integrated, or available in separate software products or services, which may need more or less custom development or configuration to interoperate and form a complete publication chain.
Publishing software comparison criteria and definitions:
As proposed for authoring tools (“3.5 Toward a Comparison Between Tools”), a first classification step can be based on existing feature lists, e.g. OJS/OMP features, and also those of commercial software such as PubFactory, Scholastica or Literatum.
A comparison table would then contain information on these aspects:
- Website: Search, Browse (by type, by date, by collection, by most popular, by similar items etc), Entry Display (including support for different file formats), Social Networking and Collaboration Tools, Other Plug-in Tools, Personalization and Custom Publishing, Distribution
- Administration management: E-commerce integration, Revenue Model Support, Access Control, CMS, Content Ingestion / Publication Management, Library Features, Reporting, Digital preservation
It is interesting to note that the COAR (Confederation of Open Access Repositories) association has worked on Next Generation Repositories in a Working Group which published a report in 2017.8 A dedicated website has been created: http://ngr.coar-repositories.org/.
This prospective work is interesting because (1) many functionalities are common to repositories (archives) and publishing platforms, (2) it relies on a preliminary use case study, and (3) it is implementation-oriented and envisages a modular standards-based architecture. It would certainly be relevant to do similar work for publishing platforms, but it is important to note that open archives are essentially public or non-profit organisations, whereas publishing platforms are not.
Here are the user stories for NGR, which are also largely relevant for a publishing platform:
- Discovering metadata that describes a scholarly resource
- Discovering the identifier of a scholarly resource
- Discovering usage rights
- Commenting, annotating, and peer-review
- Automated recommender systems for repositories
- Providing a social notification feed
- Data mining
- Supporting researchers’ workflows
3. System Management:
- Recognizing the user
- Resource syncing and notification
- Comparing usage
However, there are additional features unique to a publishing platform:
4. More advanced peer review workflows:
- Creation and curation of external reviewer database
- Configurable review workflows, with paths for each decision type
- Reporting, and automatically chasing reviewers where reviews are delayed
- Communication and decision archiving
5. Editorial work:
- Copyediting workflow tracking and reporting
- Typesetting workflow tracking and reporting
- Archiving communication
- Automating quality checks and providing enrichment tools (e.g. anti-plagiarism checks, XML validation, looking up DOIs for reference lists)
6. Distribution and marketing:
- Exporting content to mirrors and archives
- Integrating with social media
- Announcements and content alerts (to readers, authors, institutions, funders)
7. Version of record:
- Collating and presenting metrics from different sources
- Providing final citation
- Version management (e.g. through Crossmark)
8. APC finance management:
- Waiver workflow and reporting (discretionary waivers, country waivers, article-type waivers)
- Billing workflow and reporting (e.g. billing author’s institution, invoice splitting, pre-payment tracking)
- Integration with third party payment and accounting systems
Some academic tools/software lists exist, either with a broad scope such as Utrecht University Library’s work9, or focused on Open Access publication, as for instance, the list established by Radical OA in the UK or the one published recently by the Scholarly Kitchen.
It is interesting to note that the life science publisher eLife has made a call to the community of Open Source publishing tools.10 Also, two conferences addressed Open Source tools in Spring 2018: the Library Publishing Coalition Pre-Conference on open tools in Minneapolis (https://librarypublishing.org/owned-by-the-academy-preconference/) and the Open Source Bazaar pre-conference at the Society for Scholarly Publishing Annual Meeting in Chicago. On the same topic, a group called Joint Roadmap for Open Science Tools (JROST) has just been launched.11 Its objective is to come up with joint roadmaps for Open Source tool providers.
- OpenEdition’s publishing software, Lodel.
- MIT’s PubPub is a new collaborative editing and publication software designed for academic communities.
- Hyrax is a web front-end for the Samvera Open Source digital repository framework (formerly known as Fedora/Hydra); the Samvera community seems quite active in the USA. The platform is developed in Ruby on Rails. It also includes a discovery tool called Blacklight, which is a web front-end for the Solr search engine. The majority of use cases lie in academic library and repository applications; however, Samvera has recently been adapted by the University of Michigan for setting up Fulcrum, a publishing platform; Heliotrope is the name of the software adaptation of Hyrax to meet publishing needs. It is interesting to note that the LeverPress project of a peer-reviewed, open access, scholarly, digitally native press is also based on Fulcrum.
- Birkbeck Center for Technology and Publishing’s Janeway journal platform.
- The SciELO platform (https://github.com/scieloorg): Open Source, well documented but large and complex.
- eLife, a UK-based non-profit biomedical publisher, has developed interesting Open Source software (https://github.com/elifesciences/):
As previously stated, the publishing process is part of an ecosystem of interdependent services and platforms, which are not all part of the core CMS software but are provided by external service providers. Therefore, the publishing platform has to provide “hooks” enabling those services to be available (or third party software to be installed and to interoperate with the platform).
Given the wide range and the sometimes high specificity of these types of tools, it would be outside the scope of this paper to propose an exhaustive list. However, here are some examples of the functionalities and the challenges related to publications’ integration:
- Dissemination through data identifier:
- DOI registration agencies such as Crossref allow for the published objects (books, chapters, articles) and also parts of the content (supplementary material, figures, tables) to be identified. After DOI registration, Crossref supports publishers adding them to reference lists either via the Crossref API, via authoring tools in production (e.g. eXtyles) or just by pasting the reference list into Crossref’s Simple Text Query.
- Dissemination through author identifiers:
- Dissemination through funder identifier:
- Dissemination to indexes and mirrors (indexes are the most important discovery route, Gardner and Inger 2016)
- Google Scholar: pulls books and articles from the HTML metadata (DublinCore) on the publisher’s site
- Subject-specific repositories: there are about as many subject-specific repositories as there are subjects; some notable examples are given here:
- PubMed / PubMed Central (Biomedical journals only): Publisher pushes JATS-based article package to PubMed Central FTP (OJS supports PubMed export for journals which do not have JATS)
- PsycINFO (Psychology): a database of abstracts which can be licensed through providers such as EBSCO
- HeinOnline (Law): publisher emails PDFs on completion of issue
- Country/language-specific repositories or indexes include:
- Latindex (journals from Latin America, the Caribbean, Spain, Portugal only): publisher pushes metadata via online form.
- CNKI (journals from everywhere, which are then localised for a Chinese audience): publisher pushes PubMed XML metadata to FTP.
- Oasisbr (portal aggregating metadata from all Brazilian institutional archives).
- DOAJ (journals only): publisher pushes DOAJ XML metadata to DOAJ either manually via upload form or via API
- DOAB (books only): publisher pushes metadata via online form or file upload
- OAPEN (books only): publisher pushes metadata spreadsheet and book PDF to FTP, OAPEN has onward distribution for example to DOAB, ExLibris Primo, Worldcat, Portico
- Scopus (journals and books): publisher either pushes PDFs to Scopus FTP or (for journals) Scopus pulls content from publisher’s article browse pages (no technical requirements, other than logical browse pages)
- Web of Science/Web of Knowledge (journals and books): WoS/WoK pulls content from article browse pages, books and book series are submitted for evaluation
- JSTOR Open (books): publisher pushes metadata and PDFs to JSTOR FTP
- Institutional repository: often this is done manually by the author or the repository manager; any more automated solution (e.g. via SWORD) depends on the repository
- Portico (journals and books): publisher pushes metadata and PDFs to Portico FTP
- CLOCKSS/LOCKSS (journals and books): pulls metadata and content from LOCKSS manifest pages on the publisher site
- Commenting and annotation, the many third-party tools for this include:
- Google Analytics – supported by most publishing systems; reports on views, downloads, interactions, and user flow
- Crossref Event Data – currently in Beta; enabled through API; reports on social media coverage, citations, annotations via Hypothes.is, etc.
- HIRMEOS – project currently under development to complete by end 2018; will report on social media coverage, citations, annotations via Hypothes.is, views and downloads
Tools and systems for improving the quality and speed of the peer review process should also be considered as a key success factor for the future of scholarly communication. This is especially important for the academic books sector, where quality assessment processes need to be made more transparent, perhaps via systems for open peer review, and streamlined to serve the research community better.
We see a number of online collaborative tools for authoring from both Open Source and proprietary providers. Commercial tools appear to be more mature but with a level of interoperability hard to assess. Adoption of authoring tools is complicated by differences between technical flows, content types and discipline: various conversions are needed as documents move through the editing and typesetting processes. Interoperability between tools will therefore be a key factor to consider for the future.
We observe a large number of publishing systems, and a trend towards more Open Source development of these systems. At the same time, they need to integrate with an ever-increasing set of third party tools/enhancements and discoverability services (see the examples above). It is outside the scope of this white paper to evaluate which tools and services are available immediately, and the exact integration method that each would require, but anyone evaluating a publishing platform should consider these. However, a general recommendation would be to simplify the distribution process through a service/tool which could receive a feed of data and files, and would automatically distribute them to all the appropriate locations for that publisher/journal.
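The recommended distribution service can be pictured as a simple dispatcher: each destination declares which content types it accepts and how it is delivered to, and the service routes every incoming publication accordingly. The Python sketch below is one possible design, not an existing tool; all class and function names are hypothetical, and the delivery functions stand in for the FTP uploads, API calls or form submissions described earlier.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Publication:
    identifier: str
    kind: str  # "journal" or "book"
    files: list[str] = field(default_factory=list)


@dataclass
class Target:
    name: str
    accepts: set[str]  # content kinds this destination takes
    deliver: Callable[[Publication], None]  # e.g. FTP push or API call


def distribute(publication: Publication, targets: list[Target]) -> list[str]:
    """Send a publication to every matching target; return the names served."""
    delivered = []
    for target in targets:
        if publication.kind in target.accepts:
            target.deliver(publication)
            delivered.append(target.name)
    return delivered
```

With such a registry, adding a new index means registering one more target rather than changing the publishing platform itself.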
- JATS: Journal Article Tag Suite (https://en.wikipedia.org/wiki/Journal_Article_Tag_Suite)
- SSH: Social Sciences and Humanities
- TEI: Text Encoding Initiative (https://en.wikipedia.org/wiki/Text_Encoding_Initiative)
- XML: Extensible Markup Language (https://en.wikipedia.org/wiki/XML)
List of Images
Figure 1: The Scholarly Communication Process (© Laetitia Martin)
Figure 1 (Annex 1): 3 General Goals (Good Efficient Open)
List of Tables
Table 1: Generic Typology of Tools
Table 2: Comparison of Open Source Tools
Annex 1: “101 Innovations in Scholarly Communication” Project
The “101 Innovations in Scholarly communication” project has a much broader scope than ours, but with similar goals of identifying and assessing tools for the scholarly workflow in an Open Science context. The project ended in 2016, but the effort is continuing through the Force11 Scholarly Commons working group (https://www.force11.org/scholarly-commons/).
The project encompasses all tools used by researchers for their scholarly work, whereas our Working Group focuses on authoring, publication tools and peer review. An interesting aspect of the work is that it proposes three general goals (Good Efficient Open) which are also relevant for OPERAS.14
Also interesting is the interactive sheet on tools often used together (results from a survey): https://tinyurl.com/ycyxtbb7.
Annex 2: Comparison Table of Annotation Tools (by Hypothesis)
As an example of what we intend to do for peer review, authoring or publishing tools, the table below compares the attributes and features of 6 annotation tools. It was compiled by Heather Staines for Hypothes.is.
|Table 1 (Annex 2): Comparison of Annotation Tools|
|Works everywhere||Yes||No||No||Only for personal notes||Yes||No|
|Open source||Yes||No||partially (front end)||No||Yes||?|
|W3C standard – data model||Yes||No||In progress||Claimed||Yes||?|
|W3C standard – protocol||In progress||No||In progress||No||No||?|
|Groups||Yes||Yes (Open, Closed, or Secret)||Channels||Yes (but unclear how this could work with annotator vetting)||No|
|Share an annotation||Yes||No||Yes||Share seems to be for articles only||No||?|
|Replies||Yes||Not on annotations||Yes||Yes||Yes||?|
|Direct links||Yes||Not on annotations||Yes||No||No||No?|
|Annotate over publisher content||Yes||No||No (widget)||Yes||Yes||No|
|Publisher Moderation||Yes||No||In progress||No||No||No|
|Search||Yes||Yes (but doesn’t seem to be limited to annotations)||Yes (but only own annotations)||No (only people)||Yes||Yes: across articles|
|Advanced search||No||yes (publisher article/full text)|
|HTML<>PDF cross format||Yes||No||No||Claimed, not verified||No||No|
|Runs the industry conference||Yes||No||No||No||No||No|
|Member of AAK coalition||Yes||No||Yes||Yes||Yes||No|
|Customization to fit publisher platform||Yes||N/A||Yes (widget)||No||Yes||No|
|Annotation License (Public)||CC-BY-2.0|
|Indexed (Crossref Event Data)||No|
|Different highlight colors||No – planned||No||no||No?||No||?|
|Follow||No – planned||No||articles (not people)||Yes (person)||No||Yes: Friends|
|Social Login||No – planned||No||Yes||Yes, LinkedIn||Yes: Facebook and Google||Yes: Yahoo and OpenID|
|Image Annotation||No – planned||No||No||No||No|
Annex 3: Poster of the Tools Research and Development Working Group presented at the OPERAS Conference “Open Scholarly Communication in Europe. Addressing the Coordination Challenge”, 31 May – 1 June 2018, Athens
This White Paper has been prepared by the OPERAS Tools Research and Development Working Group under a CC BY 4.0 license
Associazione Italiana Scienza Aperta (AISA)
Francesca di Donato
Institute of Literary Research of the Polish Academy of Sciences (IBL PAN)
Luxembourg Centre for Contemporary and Digital History (C²DH)
Stockholm University Press
University of Turin (UniTo)
- OPERAS Consortium. (2017, October 12). Technical Mapping of OPERAS Consortium – Annex to OPERAS Design Study. Zenodo. https://doi.org/10.5281/zenodo.1009561.
- See for example http://www.niso.org/standards-committees/ebmd or http://www.niso.org/standards-committees/odi.
- This criterion is often not very critical nowadays, but can be important depending on the use case.
- See https://scholarlykitchen.sspnet.org/2017/08/17/meca-new-manuscript-exchange-initiative. See also this presentation for more details.
- Links to the presentations: https://elifesci.org/ossoapbox
- See: http://blogs.lse.ac.uk/impactofsocialsciences/2015/11/11/101-innovations-in-scholarly-communication/ and this presentation: https://figshare.com/articles/_/5627503.