Recent Research Work

Software Project

Raintale is a utility for publishing social media stories from groups of archived web pages (mementos). Raintale uses MementoEmbed to extract memento information and then publishes a story to the given storyteller, which can be a static file or an online social media service.

Blog Post
Hypercane Part 1: Intelligent Sampling of Web Archive Collections
Web Science and Digital Libraries Research Group Blog

Yasmin AlNoamany experimented with summarizing a web collection by choosing a small number of exemplars and then visualizing them with social media storytelling. This is in contrast to approaches that try to account for all members of the collection. When I took over the Dark and Stormy Archives project from her in 2017, the goal was to improve upon her excellent work. Her existing code relied heavily upon the Storify platform to render its stories. Storify was discontinued in May 2018. We discovered that other platforms rendered mementos poorly, so we developed MementoEmbed to render individual surrogates and later Raintale to render whole stories. We discovered that cards are probably the best surrogate for stories. We now publish stories to the DSA-Puddles web site on a regular basis. Up to this point, we have relied upon sources such as Nwala's StoryGraph or human selection to generate the list of mementos rendered in our stories. The document selection is key to the entire process. What tool can we rely on to automate the selection of mementos for these stories and other purposes? Hypercane.

Social Cards Probably Provide For Better Understanding Of Web Archive Collections
From the Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Used by a variety of researchers, web archive collections have become invaluable sources of evidence. If a researcher is presented with a web archive collection that they did not create, how do they know what is inside so that they can use it for their own research? Search engine results and social media links are represented as surrogates, small easily digestible summaries of the underlying page. Search engines and social media have a different focus, and hence produce different surrogates than web archives. We hypothesize that groups of surrogates together are useful for summarizing a collection. We want to help users answer the question of "What does the underlying collection contain?" But which surrogate should we use? We evaluate six different surrogate types against each other. We find that the type of surrogate does not influence the time to complete the task we presented to the participants. Of particular interest are social cards, surrogates typically found on social media, and browser thumbnails, screen captures of web pages rendered in a browser. At p=0.0569 and p=0.0770, respectively, we find that social cards and social cards paired side-by-side with browser thumbnails probably provide better collection understanding than the surrogates currently used by the popular Archive-It web archiving platform. We measure user interactions with each surrogate and find that users interact with social cards less than other types. The results of this study have implications for our web archive summarization work, live web curation platforms, social media, and more.

Featured Research Work

Improving Understanding of Web Archive Collections Through Storytelling - PhD Candidacy Exam
Presented at Old Dominion University

With web archives, journalists find evidence and information to back up their stories, historians store information for later users, and social scientists can study the actions of humans during specific time periods. These different groups gain value not only from creating their own collections but from using the collections of others. As users, we currently have no efficient way of understanding what is in each collection without manually reviewing all of its items. While past work has used mementos for studying how web resources change over time or evaluated the changes to various industries, there is still theoretical work to be done in improving the usability of web archive collections. Our goal is to help collection creators and the public at large to make better use of these collections through improvements to collection understanding. We build upon the work of AlNoamany by using visualizations from social media storytelling. In this work, we provide background on the problem, analyze previous work in this area, and highlight our preliminary work before providing a plan for future research.

Blog Post
Raintale -- A Storytelling Tool For Web Archives
Web Science and Digital Libraries Research Group Blog

Raintale is the latest entry in the Dark and Stormy Archives project. Our goal is to provide research studies and tools for combining web archives and social media storytelling. Raintale provides the storytelling capability. It has been designed to visualize a small number of mementos selected from an immense web archive collection, allowing a user to summarize and visualize the whole collection or a specific aspect of it.

The Many Shapes of Archive-It
From the Proceedings of the 15th International Conference on Digital Preservation

We examine different collections at the web archive collection service Archive-It. From here we demonstrate the use of several different structural features that can be used to predict the type of collection.

I have worked in industry for more than 18 years, participating in many aspects of systems and software engineering.

Now I am a Graduate Research Assistant at Los Alamos National Laboratory working for the Research Library Prototyping Team. I am also a Ph.D. candidate at Old Dominion University majoring in Computer Science under the guidance of Dr. Michael L. Nelson. My focus area is digital preservation, specifically web archiving.

Above, you can find out more information about my journey through academia.

Below, you can follow me on social networks.