Introduction to the Constellate Text and Data Mining Resources

Constellate is a text and data mining (TDM) service that provides access to nearly all text content in JSTOR as well as significant content from Portico and several smaller collections. Beyond providing access to these data, Constellate lets users perform basic text analyses directly on its web platform and more complex analyses using various Python packages through its cloud-based Jupyter notebook interface.

In the first session of this do-it-yourself series, users will learn how to log in to access the enhanced platform features enabled by Vanderbilt’s subscription, build a dataset, and perform standard text analyses on it. In the second session, we will demonstrate how to start an example Jupyter notebook on Constellate’s server, then walk through a Python-based workflow for analyzing a corpus built from Constellate data. The freely available notebooks provided by Constellate may be useful as examples even if you ultimately intend to analyze your own text corpus.

No background is necessary for the first session. Familiarity with Python and Jupyter notebooks is helpful, but not required, for the second session. Please register in advance using the registration link for each session you plan to attend.

Sponsors: Center for Digital Humanities, The Digital Commons, and Digital Scholarship and Communications (DiSC)
Facilitator: Steven J. Baskauf, Data Science & Data Curation Specialist, DiSC
Session 1: Wednesday, September 14, 2:30pm – 3:30pm. Register here.
Session 2: Wednesday, September 21, 2:30pm – 3:30pm. Register here.
Location: 1101 19th Ave S., Room 118