Symposium on Saving the Web at the Library of Congress
On June 16, 2016, the Library of Congress hosted a one-day symposium entitled Saving the Web: The Ethics and Challenges of Preserving What’s on the Internet.
While analyzing mementos in a recent experiment, we discovered problems processing archived content. Many web archives augment the mementos they serve with additional archive-specific information, including HTML, text, and JavaScript. We were attempting to compare content acro...
I was fortunate to present a poster at the 25th International World Wide Web Conference, held April 11-15, 2016. Though my primary mission was to represent both the WS-DL and the LANL Prototyping Group, I gained a better appreciation for the state of the art...
by Herbert Van de Sompel, Martin Klein, Shawn M. Jones
We quantify the extent to which references to papers in scholarly literature use persistent HTTP URIs that leverage the Digital Object Identifier infrastructure. We find a significant number of references that do not, speculate why authors would use brittle URIs when persisten...
Recently, we conducted an experiment using mementos for almost 700,000 web pages from more than 20 web archives. These web pages spanned much of the life of the web (1997-2012). Much has been written about acquiring and extracting text from live web pages, but we believe that ...
by Shawn M. Jones, Harihar Shankar
Text extraction from web pages has many applications, including web crawling optimization and document clustering. Though much has been written about the acquisition of content from live web pages, content acquisition of archived web pages, known as mementos, remains a relativ...
The LANL Library Prototyping Team recently received correspondence from a member of the Wikipedia team requesting Python code that could find the best URI-M for an archived web page based on the date of the page revision. Collaborating with Wikipedia, Harihar Shankar, Herbert ...
After successfully defending my Master’s Thesis, I accepted a position as a Graduate Research Assistant at Los Alamos National Laboratory (LANL) Library’s Digital Library Research and Prototyping Team. I now work directly for Herbert Van de Sompel, in collaboration with my ad...
by Shawn M. Jones, Michael L. Nelson
A variety of fan-based wikis about episodic fiction (e.g., television shows, novels, movies) exist on the World Wide Web. These wikis provide a wealth of information about complex stories, but if readers are behind in their viewing they run the risk of encountering “spoilers” ...
by Shawn M. Jones
A variety of fan-based wikis about episodic fiction (e.g., television shows, novels, movies) exist on the World Wide Web. These wikis provide a wealth of information about complex stories, but if readers are behind in their viewing they run the risk of encountering “spoilers” –...