"On the Books in South Carolina: Mining for Jim Crow Laws" by Kate F. Boyd, Vandana Srivastava et al.
 

Document Type

Paper

Abstract

On the Books in South Carolina: Mining for Jim Crow Laws is a collections-as-data and machine learning project of the University of South Carolina Libraries (USC), funded through a subaward from the University of North Carolina at Chapel Hill (UNC) and made possible by The Andrew W. Mellon Foundation, covering May 2022 through December 2024. Following the steps UNC took in the first year of its grant, the USC project created a text corpus of acts passed by the South Carolina state legislature from Reconstruction through the Civil Rights Movement (1868-1968). The USC team then used machine learning techniques to build a model that classifies each law as either Jim Crow or not.

Products from the project include a website, a Tableau dashboard, a text file, and a CSV file listing all acts from the period identified as likely to include Jim Crow language. All programming work was done in Python and is documented on the project’s GitHub. This white paper describes the methods and workflows used to create the corpus, the machine learning techniques applied to identify Jim Crow language in South Carolina law, and the project’s findings and limitations.

Rights

© 2025, The Authors. This work is licensed under CC BY-NC-SA 4.0.

APA Citation

Boyd, K., Srivastava, V., Frear, C., DuPre, L., Gupta, N., & Donaldson, B. (2025). On the Books in South Carolina: Mining for Jim Crow Laws [White Paper].

PredLabels_corpus-final-with-review-column.xlsx (65937 kB)
