Publications

Journal Publication
Shawn M. Jones, Michael L. Nelson, and Herbert Van de Sompel.
International Journal on Digital Libraries. March 2018. Volume 19, Issue 1.

In this paper, we explore the use of Memento with the Internet Archive as a means of avoiding spoilers in fan wikis. We conduct two experiments: one to determine the probability of encountering a spoiler when using Memento with the Internet Archive for a given wiki page, and a second to determine which date prior to an episode to choose when trying to avoid spoilers for that specific episode.

Journal Publication
Shawn M. Jones, Herbert Van de Sompel, Harihar Shankar, Martin Klein, Richard Tobin, and Claire Grover.
PLOS One. December 2016. Volume 11, Issue 12.

A reader who visits a web-at-large resource by following a URI reference in an article, some time after its publication, is led to believe that the resource’s content is representative of what the author originally referenced. However, due to the dynamic nature of the web, that may very well not be the case. We reuse a dataset from a previous study in which several authors of this paper were involved, and investigate to what extent the textual content of web-at-large resources referenced in a vast collection of Science, Technology, and Medicine (STM) articles published between 1997 and 2012 has remained stable since the publication of the referencing article.

Conference Poster
Herbert Van de Sompel, Martin Klein, and Shawn M. Jones.
Proceedings of the 25th International Conference Companion on World Wide Web. 2016.

We quantify the extent to which references to papers in scholarly literature use persistent HTTP URIs that leverage the Digital Object Identifier infrastructure. We find a significant number of references that do not, speculate why authors would use brittle URIs when persistent ones are available, and propose an approach to alleviate the problem.

Technical Report
Shawn M. Jones and Harihar Shankar.
arXiv:1602.06223 [cs.DL] 2016.

In the course of conducting a study with almost 700,000 web pages, we encountered issues acquiring mementos and extracting text from them. The acquisition of memento content via HTTP is expected to be a relatively painless exercise, but we have found cases to the contrary. For the benefit of others acquiring mementos across many web archives, we document those experiences here.

Preprint
Shawn M. Jones and Michael L. Nelson.
arXiv:1506.06279 [cs.DL] 2015.

Enterprising readers might browse a fan wiki in a web archive to view a page as it existed before a specific episode date and thereby avoid spoilers. We find that when accessing fan wiki pages in the Internet Archive there is as much as a 66% chance of encountering a spoiler.

Master's Thesis
Avoiding Spoilers on MediaWiki Fan Sites Using Memento.
Shawn M. Jones.
Old Dominion University. 2015.

Enterprising readers might browse a fan wiki in a web archive to view a page as it existed before a specific episode date and thereby avoid spoilers. Analyzing data collected from fan wikis and the Internet Archive, we quantify how the current heuristic for choosing an archived web page based on a date is inadequate for avoiding spoilers. We find that when accessing fan wiki pages in the Internet Archive there is as much as a 66% chance of encountering a spoiler.

Technical Report
Bringing Web Time Travel to MediaWiki: An Assessment of the Memento MediaWiki Extension.
Shawn M. Jones, Michael L. Nelson, Harihar Shankar, and Herbert Van de Sompel.
arXiv:1406.3876 [cs.DL] 2014.

We have implemented the Memento MediaWiki Extension Version 2.0, which brings the Memento Protocol to MediaWiki, used by Wikipedia and the Wikimedia Foundation. Test results show that the extension has a negligible impact on performance.