Data Science, Text Mining, and Natural Language Processing

In 2008, Professor Guldi became the first person to hold a position in the field of digital history when she took a post by that name at the University of Chicago. She has since taught and published widely on the promise of “digital history,” the set of methodological questions about how computational means can help detect change over time. The History Manifesto (Cambridge University Press, 2014), written in collaboration with historian David Armitage, is a sweeping reflection on how the accumulation of digitized texts in the contemporary world requires new methods in the humanities. Guldi and Armitage explored how historians might become the designers of text-mining tools built specifically for examining nuanced and complex historical questions.

Professor Guldi’s latest book, The Dangerous Art of Text Mining (Cambridge University Press, June 2023), brings together many of the papers and talks she has given on the delicate reconciliation of powerful modern technologies with the rigorous, nuanced approach of traditional historians.

Professor Guldi founded and directs the digital humanities minor at SMU. As a member of the faculty of the committee on data science at SMU, she teaches courses in Python and R. She has also led an interdisciplinary hack-a-thon program called Think-Play-Hack: https://blog.smu.edu/think-play-hack/people/

Professor Guldi is the PI of a $1,000,000 NSF grant for applying text mining to questions from history. She is the director of Democracy Lab, a venue in which graduate and undergraduate students apply digital methods to the study of democratic debates. Student projects include text mining how debates about the environment, gender, and sexuality have changed over years and decades, drawing on the debates of the U.S. Congress, the Dallas and Houston city council minutes, the U.K. Parliament, and Reddit.

 

Relevant Articles and Talks


“Critical Search,” Johns Hopkins Workshop in Digital History, Johns Hopkins University, Baltimore, MD, July 31, 2020.

“The Dangerous Art of Text Mining,” Keynote, Conference, Roy Rosenzweig Center for History and New Media, October 4, 2020.

“The Dangerous Art of Text Mining,” Keynote, Stanford Data Practices Conference, October 2, 2020.

“Topic Modeling the History of Infrastructure in Nineteenth-century Great Britain,” Technology and Culture (forthcoming 2019)

“Critical Search: A Procedure for Guided Reading in Large-Scale Textual Corpora,” Journal of Cultural Analytics (December 2018). Includes code and data.

“Topic Modeling the History of Technology,” University of Oklahoma, April 13, 2018.

“The Digital Humanities in 2025” and “An Introduction to Topic Modeling,” keynotes for the Digital Humanities Road Tour, various universities across Finland, January 27-February 3, 2018.

“Frontiers of Text Mining,” Amazon Ideas Conference, Seattle, WA, Oct 16-18, 2018.

“From Critical Thinking to Critical Search: Working between Microhistory and Macrohistory with Big Data,” Conference on Humanities and the Arts in the Age of Big Data, University of Illinois Urbana-Champaign, Oct. 4-5, 2018.

“Topic Modeling,” Talk at the Fondren Library Prism Panel, Southern Methodist University, October 23, 2017.

“Topic Modeling Workshop” (Guldi and Johnson-Roberson), MITH, University of Maryland, College Park, November 3, 2012. http://vimeo.com/53078693