A digital library and cyberinfrastructure facilitating the discovery and utilization of data & knowledge in published documents

>12,500,000 documents

~200,000 added this month

~50,000 added this week

~8,000 added in the last 24 hours

Enabling TDM

Working with the UW Library staff on our team, xDD negotiates agreements with publishers that allow programmatic downloading and text and data mining (TDM) of published content.

All documents are securely stored on an access-controlled server at the heart of our digital library infrastructure (xDD team members and our collaborators do not have access to original content via our infrastructure). UW-Madison's Center for High Throughput Computing supplies the computational power for processing documents with NLP, OCR, and other software tools useful for TDM tasks. The same capacity lets new tools be deployed quickly against all existing documents.


End-user Workflow

Have an idea

Start with a question that can be answered by mining the scientific literature. Over 13 million documents are currently available.

Explore the xDD API

Generate summaries of journal coverage, and search for and analyze relevant terms in document full text.
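As a rough illustration of what exploring the API might look like, the Python sketch below builds query URLs for a full-text term search and for a journal listing. The base URL, the route names (`snippets`, `journals`), the `term` parameter, and the search term are assumptions made for this example, not details taken from this page; check them against the xDD API documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the public xDD API. Confirm it in the API documentation.
XDD_API_BASE = "https://xdd.wisc.edu/api"


def build_query_url(route, **params):
    """Return the URL for an API route with URL-encoded query parameters."""
    url = f"{XDD_API_BASE}/{route}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_json(url, timeout=30):
    """Fetch a URL and decode its JSON body. This performs a network request."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical queries: route and parameter names are illustrative assumptions.
snippets_url = build_query_url("snippets", term="stromatolite")
journals_url = build_query_url("journals")

print(snippets_url)
print(journals_url)
```

Calling `fetch_json(snippets_url)` would send the request. The shape of the returned JSON is not described on this page, so inspect a response before writing code that depends on particular fields.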


We are always looking to support creative work and to form collaborations on projects that lead to new scholarship. All xDD output is licensed under CC-BY-NC.

Get in touch

Whether you're a publisher interested in contributing your content to our infrastructure, a scientist interested in collaboration, or just curious to know more, let us know!


The Team

The xDD team is based at the University of Wisconsin-Madison and is made up of domain experts in both the Geosciences and Computer Sciences, librarians, and infrastructure developers, with participation from undergraduate, graduate, and postdoctoral researchers.

Project Leads

Shanan Peters

Project Lead


Miron Livny

Project Lead

Morgridge Institute for Research

Theo Rekatsinas

Project Lead

ETH Zurich

Shivaram Venkataraman

Project Lead

Computer Science

Brian Bockelman

Infrastructure Lead

Morgridge Institute for Research

Ian Ross

Lead Developer

Computer Science

Core Team

Daven Quinn

Research Scientist (Geoscience)

Brian Aydemir


Morgridge Institute for Research

Carl Edquist


Computer Sciences

Jeff Peterson

Systems Administrator

Morgridge Institute for Research

Aimee Glassel

Academic Librarian


Cannon Lock

Web Developer

Morgridge Institute for Research

Mohil Patel

Software Developer

Graduate Student

Xuxiang Sun

Software Developer

Undergraduate Student


Christopher Ré

GeoDeepDive former project lead

Stanford Computer Science

Jon Husson

Assistant Professor

University of Victoria

Andrew Zaffos

Senior Research Scientist

Arizona Geological Survey

Julia Wilcots

App Builder

Princeton Graduate Student

John Czaplewski

Developer (former)

Valerie Syverson

Postdoctoral Researcher


Ce Zhang

Assistant Professor

ETH Zurich

Erika Ito

Data Integrator

KoBold Metals

Chao Liu

Postdoctoral App Builder

Carnegie Institute

Daniel Wieferich

Physical Scientist


Brandon Serna

Software Developer