In this paper, we explore the use of Memento with the Internet Archive as a means of avoiding spoilers in fan wikis. We conduct two experiments: one to determine the probability of encountering a spoiler when using Memento with the Internet Archive for a given wiki page, and a second to determine which date prior to an episode to choose when trying to avoid spoilers for that specific episode.
A reader who visits a web at large resource by following a URI reference in an article, some time after its publication, is led to believe that the resource’s content is representative of what the author originally referenced. However, due to the dynamic nature of the web, that may very well not be the case. We reuse a dataset from a previous study in which several authors of this paper were involved, and investigate to what extent the textual content of web at large resources referenced in a vast collection of Science, Technology, and Medicine (STM) articles published between 1997 and 2012 has remained stable since the publication of the referencing article.
We quantify the extent to which references to papers in scholarly literature use persistent HTTP URIs that leverage the Digital Object Identifier infrastructure. We find a significant number of references that do not, speculate why authors would use brittle URIs when persistent ones are available, and propose an approach to alleviate the problem.
In the course of conducting a study with almost 700,000 web pages, we encountered issues acquiring mementos and extracting text from them. The acquisition of memento content via HTTP is expected to be a relatively painless exercise, but we have found cases to the contrary. For the benefit of others acquiring mementos across many web archives, we document those experiences here.
Enterprising readers might browse the wiki in a web archive so as to view the page prior to a specific episode date and thereby avoid spoilers. We find that when accessing fan wiki pages in the Internet Archive there is as much as a 66% chance of encountering a spoiler.
Enterprising readers might browse the wiki in a web archive so as to view the page prior to a specific episode date and thereby avoid spoilers. We quantify how the current heuristic used for choosing an archived web page based on a date is inadequate for avoiding spoilers, analyzing data collected from fan wikis and the Internet Archive. We find that when accessing fan wiki pages in the Internet Archive there is as much as a 66% chance of encountering a spoiler.
We have implemented the Memento MediaWiki Extension Version 2.0, which brings the Memento Protocol to MediaWiki, used by Wikipedia and the Wikimedia Foundation. Test results show that the extension has a negligible impact on performance.