What is Scholarly Publishing’s Hoverboard? What’s on Our Wish List?

Several events these past few weeks have had me thinking about my wish list for scholarly publishing’s future. That is, which of today’s problems, frictions, irritants and opportunities would I put at the top of a list to be resolved?

I would like to hear what’s on your list!   I’ll be giving a talk in January and would like some ‘far-out’ additions to my own list.

I was asked to catalog my own wish list as part of my ALPSP/CCC “Masters of Publishing” webinar recently.   I observed at the time that nobody had ever asked for my wish list before!   And then with last week’s Back to the Future anniversary, I was thinking that undoubtedly some things on my list would seem “quaint” thirty years from now (e.g., the way 2015’s Marty McFly was fired by fax). That thought almost kept me from writing anything that might be found via The Wayback Machine years from now. But if HighWire’s original 1995 ‘design’ for the Journal of Biological Chemistry homepage is there, what should I be afraid of?

My wish list has a lot of what telecommunications and supply-chain terminology calls “last mile” problems that I’d like to see solved, and some I’d call “first mile” problems too. These are barriers to moving content into the workflow (submission, peer review, media management) and getting it out (linking, discovery, access). Industries seem most successful at central processes running at scale – like freeways in transportation – but neglect the on-ramps and off-ramps, the first and last mile, so to speak. My list focuses largely on the technology-publishing intersection. If your role is as a publisher, you might well have a list that centers on sustainability of new business models, for example. I’m interested in a list broader than my own, so please contribute yours!

Here’s my list, in approximate “workflow” order:

A “common app” for manuscript submissions

Authors should only have to format and keyboard metadata for a manuscript once for submitting to multiple journals in series.

Experiments in peer review

There’s a lot of room for improvement in peer review – the process is time consuming and exhausting for all parties – and we need to discover which changes actually are improvements through experimentation and data gathering, not just religious debate.

Embeddable data objects, in sciences and in digital humanities

The complex and sometimes interactive data and media objects that are being developed in labs and studies do not fit easily into today’s flat publishing containers, or must be multiply-versioned for HTML and PDF containers.

Can we end ‘composition lag’ and go direct to HTML?

Can we eliminate the lag and cost of composition? So many tools today are built on the lingua franca of the web, HTML. Yet we drive all of scholarly publishing’s documents through a time-consuming and expensive markup process, still based on markup from 20+ years ago. Could we take another path?

No-lag dissemination: instant preprints

So many of the handoffs across systems have lag times, with the largest lag in the front-end process of review-then-publishing.   Can this “information float” be reduced by broad preprint adoption in many disciplines?

Article as a hub, with links to the “gray literature”

Authors create many public objects that make research articles more accessible to a broad audience beyond the experts: blog posts, podcasts, tweets, journal clubs, lab or departmental web pages, meeting presentations, etc. Yet these are not formally linked to from the research literature.

Annotation that can be shared across people and media

Annotation exists already, of course, but it is personal only, and only via proprietary tools or formats. We are still in the Web 1.0 world when it comes to annotation. Yet critical notes are part of the way scholars advance (or sharpen) each other’s work.

Sub-article referencing and linking

I should be able to cite and link to a section or paragraph of an article, not just to the top of an article. We reference pages in books, so why not something at least as granular in the journal literature?

Better indexing of book sections

Discovery of published scholarly articles is highly evolved. But discovery of scholarly books (and book chapters) still needs work. This is challenging for many reasons, but it’s a wish list, after all!

Better indexing of images

Let’s make it possible to search scholarly images by searching figure legends, or text in a figure or table, or closed captions in a video. Google already provides a basic web-image search. Perhaps if publishers would provide Google Scholar with rights to display low-resolution article images – the visual equivalent of a snippet – we could have a scholarly version of image search.

Better off-campus access to institutional subscriptions

Readers frequently do literature study off campus, outside the network bounds of institutional subscriptions.   It is a tedious multi-step process to gain access, and typically it is easier to get to a free version than the version of record, so of course that’s what people do.  We can do better.

Almost all of these wishes, if fulfilled, would provide benefits to authors, to readers, and to publishers – that is, all the stakeholders in our ecosystem would benefit!

There is work being done on many of these wishes – the work for some is open, for others is proprietary. Much is experimental or limited in scope – or simply unknown to me. I encourage readers to cite – or even plug! – efforts to accomplish these.

I would like to hear what’s on your list!   I’ll be giving a talk in January and would like some far-out additions to my own list.

(The Hoverboard in 2015 is real, but a bit limited.)

Do researchers use journal title to decide what articles to read?

(This is the first in an occasional series on findings from HighWire’s researcher interviews.)

In our series of over sixty interviews, we found remarkable consistency of answers to our questions about researcher workflow, even across disciplines. But on one question, we found a number of researchers answered ‘yes’, and a number ‘no’:

“Do you use journal name to decide what articles to read?”

There was no obvious demographic or other factor that let us see how to rationalize the positive and negative responses. E.g., it wasn’t always the postdocs who answered one way, and the lab directors who answered the other.

Finally, towards the end of our interviews, one person, a postdoc in neuroscience, answered,

“It depends.”

This cued the automatic follow-up question:

“It depends on what?”

And then we heard:

“On what the article’s about.”

We then understood what seems obvious now: if the reader is an expert in the subject of the article, then it matters less what journal it appears in than it does when the article’s subject is farther away from a reader’s expertise.

When reading “to be a well-informed scientist,” our particular interviewee said he “relies on Science, Nature and Cell to tell me what’s important or interesting.”

The logic for this researcher is that a journal’s peer review process is not essential if he himself is qualified to have been a reviewer on the article: “I can judge for myself, and prefer to,” we were told. Further, for an expert, a journal’s selectivity can be a barrier, since it removes both signal and noise, and slows the dissemination of information by the time taken in the review and revision process. The expert may wish to do his own “signal processing” (i.e., filtering for both novelty and importance), to gain access to a less-filtered and more rapid information stream.

We also heard that researchers felt they must read the materials in their own area of expertise, and may read the literature of general interest. The big challenge is how to deal with the huge amount of material in between these two poles: expert filtering.

An “expert self-filter” mode seems consistent with my own experience as a reader of the research literature: when I’m reading my Google Scholar alerts and recommendations – which are for my areas of expertise – I look at titles before I look at journal names, suggesting that the journal brand is not the key signal, but one signal among others.

An “expert mode,” in which I ignore other experts’ advice, is also, perhaps strangely, consistent with my use of a car’s navigation system: when I am driving in my home area, I use the navigation system only to give me an ETA so I can tell my spouse when I’ll be home – I’m an “expert” because I know the local roads better than a mapping program. But when I’m driving in another city, I follow the navigation system slavishly.

In other researcher interviews, we did not, I hasten to add, hear experts telling us they have stopped reading journals in their area of expertise and instead rely solely on database searches and keyword alerts. While this behavior would seem to be consistent with what we had heard, it is also possible that readers continue to rely on major, broad-subject-focused journals to keep generally informed in their larger subject, as their own research is likely highly specialized. If we saw fewer people reading email TOCs of major subject-focused journals, this might be a sign of a behavior shift. (We are studying this now.)

Preprint servers such as bioRxiv from Cold Spring Harbor Laboratory Press, which are attracting rising interest in the life sciences, could be a tool to address the expert’s need to tap an unfiltered source that is further “upstream” from the published literature. As well, “megajournals” such as PLOS One, and newer journals such as Royal Society Open Science, F1000Research, and PeerJ Preprints – all based on the principles of “objective review” rather than using novelty as a filtering criterion – could address the expert’s interest in access to less-filtered, more rapid sources.

Online Indexing of Scholarly Publications: Part 2, What Happens When Finding Everything is So Easy?

The transformation in discovery – and its consequences – was the topic of the opening keynote at the September 2015 ALPSP Annual Meeting. Anurag Acharya – co-founder of Google Scholar – spoke and answered questions for an hour. That’s forever in our sound-bite culture, but the talk was both inspirational – about what we had collectively accomplished – and exciting and challenging – about the directions ahead. Anurag’s talk and the Q&A are online as a video and as audio in parts one and two.

This post is in two parts: Part One covered Anurag’s presentation of what we have accomplished. The present post, Part Two, covers the consequences. Anurag has agreed to address questions about this post that readers put in the comments.

Here is my take on the key topics from Anurag’s keynote.

In Part One, I highlighted the factors that have transformed  scholarly communication over the last 10-15 years:

  • Search is the new browse
  • Full text indexing of current articles plus significant backfiles joined with relevance ranking to change how we looked and what we did.
  • “Articles stand on their own merit”
  • “Bring all researchers to the frontier”
  • “So much more you can actually read”

In Part Two of this post, I cover Anurag’s view of What Happens When Finding Everything is So Easy?

Online Indexing of Scholarly Publications: Part 1, What We All Have Accomplished

“Let no one tell you that ‘Scholarly communication hasn’t changed’”

HighWire conducted its first extensive user studies in 2002. Since then, several things have completely altered the workflow of the researcher:

  • full text of most current journal articles is centrally indexed;
  • back archives of a significant fraction of the full text research literature are online, and centrally indexed as well.

“Centrally indexed” was a watershed point. In 2002, Google’s web search (i.e., google.com) started indexing the full text of journal literature – including the portion behind paywalls – starting with HighWire and its publishing partners. HighWire saw the use of journal article content go up by one and in some cases two orders of magnitude following this! And then, in 2004, Google Scholar (scholar.google.com) was born, recognizing that the workflow and goal of a researcher are not best supported by a general-purpose internet search engine, no matter how good its ranking algorithms are.

Now, a decade after our first user studies, users report to us that “Finding is easy; reading is hard.”

This transformation in discovery – and its consequences – was the topic of the opening keynote at the September 2015 ALPSP Annual Meeting. Anurag Acharya – co-founder of Google Scholar – spoke and answered questions for an hour. That’s forever in our sound-bite culture, but the talk was both inspirational – about what we had collectively accomplished – and exciting and challenging – about the directions ahead. Anurag’s talk and the Q&A are online as a video and as audio in parts one and two.

This post is in two parts: the present Part One covers Anurag’s presentation of what we have accomplished. Part Two, to be posted on Monday, October 12, covers the consequences. Anurag has agreed to address questions that readers put in the comments.

Here is my take on the key topics from Anurag’s talk.