preprint

Lost in OCR Translation? Vision-Based Approaches to Robust Document Retrieval

by Alexander Most, Joseph Winjum, Ayan Biswas, Shawn M. Jones, Nishath Rajiv Ranasinghe, Dan O'Malley, Manish Bhattarai

Retrieval-Augmented Generation (RAG) has become a popular technique for enhancing the reliability and utility of Large Language Models (LLMs) by grounding responses in external documents. Traditional RAG systems rely on Optical Character Recognition (OCR) to first process scan...

Avoiding Spoilers in Fan Wikis of Episodic Fiction

by Shawn M. Jones, Michael L. Nelson

A variety of fan-based wikis about episodic fiction (e.g., television shows, novels, movies) exist on the World Wide Web. These wikis provide a wealth of information about complex stories, but if readers are behind in their viewing they run the risk of encountering “spoilers” ...
