A digital library and cyberinfrastructure facilitating the discovery and utilization of data & knowledge in published documents

>12,500,000 documents

~200,000 added this month

~50,000 added this week

~8,000 added in the last 24 hours

Enabling TDM

In collaboration with our UW Library staff team members, xDD negotiates agreements with publishers that allow programmatic downloading and mining of published content.

All documents are securely stored on an access-controlled server at the heart of our digital library infrastructure (xDD team members and our collaborators do not have access to original content via our infrastructure). UW-Madison's Center for High Throughput Computing supplies the computational power for processing documents with NLP, OCR, and other software tools useful for TDM tasks. This setup also allows new tools to be deployed quickly against all existing documents.


End-user Workflow

Have an idea

A question that can be answered by mining the scientific literature. More than 12.5 million documents are currently available.

Explore the xDD API

Generate summaries of journal coverage, search for and analyze relevant terms from document full text.
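
As a rough illustration of this step, the sketch below only builds query URLs; it does not contact any server. The base URL, the route names ("snippets", "journals"), and the "term" parameter are assumptions made for the example and are not confirmed by this page, so check the xDD API documentation for the actual routes and parameters before using them.

```python
from urllib.parse import urlencode

# Assumption: illustrative base URL, not confirmed by this page.
BASE_URL = "https://xdd.wisc.edu/api"


def build_query_url(route: str, **params: str) -> str:
    """Build a query URL for a hypothetical xDD API route.

    Parameters are sorted by name so the resulting URL is deterministic.
    """
    query = urlencode(sorted(params.items()))
    url = f"{BASE_URL}/{route}"
    return f"{url}?{query}" if query else url


# Hypothetical full-text term search.
snippets_url = build_query_url("snippets", term="stromatolite")

# Hypothetical journal coverage summary (no parameters).
journals_url = build_query_url("journals")

print(snippets_url)
print(journals_url)
```

The resulting URLs could then be fetched with any HTTP client and the response inspected before building further analysis on top of it.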


We are always looking to support creative new work and to form new collaborations on projects that lead to new scholarly work. All xDD output is licensed under CC-BY-NC.

Get in touch

Whether you're a publisher interested in contributing your content to our infrastructure, a scientist interested in collaboration, or just curious to know more, let us know!


The Team

The xDD team is based at the University of Wisconsin-Madison and is made up of domain experts in both the Geosciences and Computer Sciences, librarians, and infrastructure developers, with participation from undergraduate, graduate, and postdoctoral researchers.

Project Leads

Shanan Peters

Project Lead


Miron Livny

Project Lead

Morgridge Institute for Research

Shivaram Venkataraman

Project Lead

Computer Science

Brian Bockelman

Infrastructure Lead

Morgridge Institute for Research

Ian Ross

Lead Developer

Computer Science

Core Team

Aimee Glassel

Academic Librarian


Brian Aydemir


Morgridge Institute for Research

Cannon Lock

Web Developer

Morgridge Institute for Research

Daven Quinn

Research Scientist (Geoscience)

Jeff Peterson

Systems Administrator

Morgridge Institute for Research

Matt Westphall

Research Cyberinfrastructure Specialist

University of Wisconsin-Madison

Mohil Patel

Software Developer

Graduate Student

Xuxiang Sun

Software Developer

Undergraduate Student

Yuxiao Qu

Research Software Engineer

Morgridge Institute for Research


Andrew Zaffos

Senior Research Scientist

Arizona Geological Survey

Benjamin M. Gyori

Research Fellow

Harvard Medical School

Brandon Serna

Software Developer


Ce Zhang

Assistant Professor

ETH Zurich

Chao Liu

Postdoctoral App Builder

Carnegie Institute

Christopher Ré

GeoDeepDive former project lead

Stanford Computer Science

Clayton Morrison

Associate Professor

University of Arizona School of Information

Daniel Wieferich

Physical Scientist


Erika Ito

Data Integrator

KoBold Metals

John Czaplewski

Developer (former)

Jon Husson

Assistant Professor

University of Victoria

Julia Wilcots

App Builder

Princeton Graduate Student

Michael Cafarella

Principal Research Scientist


Pascale Proulx

Director, Visual Analytics Research

Uncharted Software

Theo Rekatsinas

xDD former Project Lead


Valerie Syverson

Postdoctoral Researcher