Where to Find Millions of Books and How to “Read” Them: HathiTrust and HTRC

Speaker: Ryan Dubnicek, Digital Humanities Specialist, HathiTrust Research Center

This workshop will introduce attendees to the HathiTrust Research Center’s (HTRC) tools and services for computational text analysis of the massive HathiTrust Digital Library. The HTRC draws on the scope and scale of the HathiTrust corpus to let researchers perform text data mining. Topics covered include:

  • How the HTRC makes HathiTrust volumes available for text mining.
  • How to identify relevant volumes and build worksets (collections) of content for analysis.
  • How to use HTRC off-the-shelf tools for text analysis and visualization.
  • How to access HathiTrust data and metadata via provided APIs, request procedures, and open datasets.

Digital Scholarship Conversation@IAS, www.ias.edu/digital-scholarship, ds@ias.edu