TREC-COVID: Rationale and Structure of an Information Retrieval Shared Task for COVID-19

Kirk Roberts, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, Kyle Lo, Ian Soboroff, Ellen Voorhees, Lucy Lu Wang, and William R Hersh
JAMIA  2020

Tl;DR: This article presents a brief description of the rationale and structure of TREC-COVID, a still-ongoing IR evaluation. TREC-COVID is creating a new paradigm for search evaluation in rapidly evolving crisis scenarios.

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

Tom Hope, Jason Portenoy*, Kishore Vasan*, Jonathan Borchardt*, Eric Horvitz, Daniel S. Weld, Marti A. Hearst, and Jevin D. West
preprint  2020

Tl;DR: SciSight is a novel framework for exploratory search of COVID-19 research that integrates two key capabilities: first, exploring interactions between biomedical facets (e.g., proteins, genes, drugs, diseases, patient characteristics); and second, discovering groups of researchers and how they are connected.

High-Precision Extraction of Emerging Concepts from Scientific Literature

Daniel King, Doug Downey, and Daniel S. Weld
SIGIR  2020

Tl;DR: A novel, unsupervised method for extracting scientific concepts from papers, based on the intuition that each scientific concept is likely to be introduced or popularized by a single paper that is disproportionately cited by subsequent papers mentioning the concept.

Building a Better Search Engine for Semantic Scholar

Sergey Feldman
blog  2020

Tl;DR: 2020 is the year of search for Semantic Scholar, a free, AI-powered research tool for scientific literature, based at the Allen Institute for AI. One of our biggest endeavors this year is to improve the relevance of our search engine, and my mission beginning at the start of the year was to figure out how to use about 3 years of search log data to build a better search ranker.

SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search

Sean MacAvaney, Arman Cohan, and Nazli Goharian
preprint  2020

Tl;DR: We present SLEDGE, a search system that utilizes SciBERT to effectively re-rank articles related to SARS-CoV-2. SLEDGE achieves state-of-the-art results on the TREC-COVID search round 1 benchmark.

CORD-19: The Covid-19 Open Research Dataset

Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Darrin Eide, Kathryn Funk, Rodney Kinney, Ziyang Liu, William Merrill, Paul Mooney, Dewey Murdick, Devvret Rishi, Jerry Sheehan, and 10 more...
ACL, NLP-COVID workshop   2020

Tl;DR: The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers.

TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection

Ellen M. Voorhees, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, William R. Hersh, Kyle Lo, Kirk Roberts, Ian Soboroff, and Lucy Lu Wang
preprint  2020

Tl;DR: TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic.

GrapAL: Querying Semantic Scholar's Literature Graph

Christine Betts, Joanna L. Power, and Waleed Ammar
NAACL, Demo   2019

Tl;DR: We introduce GrapAL (Graph database of Academic Literature), a versatile tool for exploring and investigating scientific literature which satisfies a variety of use cases and information needs requested by researchers.

Content-Based Citation Recommendation

Chandra Bhagavatula, Sergey Feldman, Russell Power, and Waleed Ammar
NAACL  2018

Tl;DR: We embed a given query document into a vector space, then use its nearest neighbors as candidates, and rerank the candidates using a discriminative model trained to distinguish between observed and unobserved citations.