CIDR Python Introduction to Text Analysis
Date and Time
February 8, 2022
2:00pm to 4:00pm
Location
Zoom
Audience
Faculty/Staff
Students
Event Sponsor
Stanford University Libraries
Contact
muzzall@stanford.edu
This two-part workshop series will introduce you to working with text data using the spaCy and textacy Python libraries. You will learn to streamline text preprocessing and to apply word tokenization, sentence segmentation, part-of-speech tagging, and named entity recognition, both to a single document (a short excerpt from H.G. Wells's A Short History of the World) and across a corpus of legal documents.
Prerequisites: The Introduction to Python CIDR workshop or equivalent experience.
- Materials: https://github.com/sul-cidr/Workshops/tree/master/Text_Analysis_with_Python
- This workshop will use Jupyter Notebooks. Make sure you have the Anaconda Distribution with Python 3.6+ installed before the workshop starts: https://www.anaconda.com/products/individual