Automated fact-checking at scale is a challenging task that has only recently been studied systematically. Large, noisy document collections such as the web or news articles make the task more difficult. We examine the performance of a three-stage automated fact-checking system using various evidence retrieval and selection methods. We demonstrate that hybrid passage retrieval combining sparse and dense representations leads to much higher evidence recall in a noisy setting. We also propose two sentence selection approaches: an embedding-based selection using a dense retrieval model, and a sequence labeling approach for context-aware selection. The embedding-based selection achieves very high recall across two different datasets, while the sequence labeling model achieves higher precision and improves verification accuracy compared to context-agnostic sentence selection approaches. Using the same three-stage architecture, we build Quin, a large-scale fact-checking system for the COVID-19 pandemic.
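The hybrid retrieval idea above can be sketched as score fusion: passages are scored independently by a sparse ranker (e.g. BM25) and a dense (embedding) ranker, the two score lists are normalized onto a common scale, and a weighted sum produces the final ranking. The min-max normalization and the weight `alpha` below are illustrative assumptions, not necessarily the fusion scheme used in the system:

```python
# Minimal sketch of hybrid passage retrieval by score fusion.
# `alpha` and min-max normalization are illustrative assumptions.

def minmax(scores):
    """Rescale a {passage_id: score} dict to [0, 1] so that
    sparse and dense scores become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}

def hybrid_rank(sparse_scores, dense_scores, alpha=0.5):
    """Combine normalized sparse (e.g. BM25) and dense (embedding)
    scores with weight alpha; return passage ids, best first."""
    s, d = minmax(sparse_scores), minmax(dense_scores)
    fused = {pid: alpha * s.get(pid, 0.0) + (1 - alpha) * d.get(pid, 0.0)
             for pid in set(s) | set(d)}
    return sorted(fused, key=fused.get, reverse=True)

# Toy example: p2 ranks first because it scores well in BOTH channels,
# even though p1 tops the sparse list and p3 is strong only densely.
sparse = {"p1": 12.3, "p2": 9.8, "p3": 0.5}   # hypothetical BM25 scores
dense = {"p1": 0.21, "p2": 0.87, "p3": 0.83}  # hypothetical cosine scores
print(hybrid_rank(sparse, dense))  # → ['p2', 'p1', 'p3']
```

The benefit in a noisy setting is that a passage missed by one channel (e.g. no word overlap for BM25) can still be recovered by the other, which is what drives the recall gains reported above.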
Passage retrieval is a critical yet often neglected component of fact-checking and question answering systems. Most systems rely only on traditional sparse retrieval, which can significantly hurt recall, especially when the relevant passages share few words with the query sentence. In this work, we show that simple training of a dense retriever is sufficient to outperform traditional sparse representations in both question answering and fact-checking. Our model is incorporated into a real-world semantic search engine that returns snippets containing evidence related to questions and claims about the COVID-19 pandemic.
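At inference time, a dense retriever of the kind described above reduces to nearest-neighbor search in embedding space: the query and each passage are encoded as vectors, and passages are ranked by similarity to the query vector. The tiny hand-written vectors below stand in for a trained encoder (hypothetical values for illustration); a real system would embed text with the learned model and use an approximate nearest-neighbor index:

```python
# Minimal sketch of dense retrieval: rank passages by cosine similarity
# between their embeddings and the query embedding. The vectors here are
# hypothetical stand-ins for the output of a trained encoder.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dense_retrieve(query_vec, passage_vecs, k=1):
    """Return the ids of the k passages closest to the query embedding."""
    ranked = sorted(passage_vecs,
                    key=lambda pid: cosine(query_vec, passage_vecs[pid]),
                    reverse=True)
    return ranked[:k]

query = [0.9, 0.1, 0.2]
passages = {
    "s1": [0.1, 0.9, 0.3],  # shares words with the query but is semantically off
    "s2": [0.8, 0.2, 0.1],  # paraphrases the query: close in embedding space
}
print(dense_retrieve(query, passages))  # → ['s2']
```

This is the mechanism behind the recall advantage over sparse retrieval: a paraphrased passage with little word overlap still lands near the query in embedding space.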