Grant will expand University Libraries’ use of machine learning to identify historically racist laws

January 19, 2022

Black and white photographs of Pauli Murray and protesters with a page of typed laws behind the photos

Since 2019, experts at the University of North Carolina at Chapel Hill’s University Libraries have investigated the use of machine learning to identify racist laws from North Carolina’s past. Now a grant of $400,000 from The Andrew W. Mellon Foundation will allow them to extend that work to two more states. The grant will also fund research and teaching fellowships for scholars interested in using the project’s outputs and techniques.

On the Books: Jim Crow and Algorithms of Resistance began with a question from a North Carolina social studies teacher: Was there a comprehensive list of all the Jim Crow laws that had ever been passed in the state?

Finding little beyond scholar and activist Pauli Murray’s 1951 book “States’ laws on race and color,” a team of librarians, technologists and data experts set out to fill the gap. The group created machine-readable versions of all North Carolina statutes from 1866 to 1967. Then, with subject expertise from scholarly partners, they trained an algorithm to identify racist language in the laws.
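For readers curious about what "training an algorithm" can look like in practice, the following is a minimal sketch of supervised text classification, assuming a simple Naive Bayes model written with only the Python standard library. It is not the project's actual model or code. The class name `NaiveBayes`, the labels `jim_crow` and `not_jim_crow`, and the six training sentences are all hypothetical; the sentences were invented for illustration in the style of period statutes and are not quotations from any law.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)  # number of documents per label
        self.word_counts = {label: Counter() for label in self.doc_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary: every word seen in any training document.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokenize(text):
                if token in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, illustrative training sentences (NOT real statute text).
train_texts = [
    "separate schools shall be provided for the white and colored races",
    "marriage between white and colored persons is prohibited",
    "separate waiting rooms for the white and colored races",
    "the county shall levy a tax for the repair of public roads",
    "the town may issue bonds for a public water supply",
    "the clerk shall record all deeds in the county register",
]
train_labels = ["jim_crow"] * 3 + ["not_jim_crow"] * 3

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("separate coaches for white and colored passengers"))
print(model.predict("the county shall levy a tax for public roads"))
```

A toy model like this only counts words, so it would miss laws that discriminate without explicit racial terms. That limitation is one reason a project of this kind relies on subject expertise from scholars to label training examples and review results.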

“We identified so many laws,” said Amanda Henley, principal investigator for On the Books and head of digital research services at the University Libraries. “There are laws that initiated segregation, which led to the creation of additional laws to maintain and administer the segregation. Many of the laws were about school segregation.” Other topics included indigenous populations, taxes, health care and elections, Henley said. The model eventually uncovered nearly 2,000 North Carolina laws that could be classified as Jim Crow.

Henley said that On the Books is an example of “collections as data”—digitized library collections formatted specifically for computational research. In this way, they serve as rich sources of data for innovative research.

The next phase of On the Books will build on what the team has learned through two activities:

  • The grant will directly support several research and teaching fellows. Research fellows will pursue their own projects, making use of On the Books products: laws, workflows, scripts and the project website. Teaching fellows will develop and deliver college-level instructional modules featuring materials from On the Books. Calls for proposals will be issued in spring 2022.
  • A portion of the funding will also be regranted. Recipients will use On the Books workflows, scripts and tools to identify Jim Crow language in laws from two other states and will receive detailed guidance and mentoring from the UNC-Chapel Hill team. The call for proposals for partner states will be issued in January 2022.

“We’ve gained a tremendous amount of knowledge through this project – everything from how to prepare data sets for this kind of analysis, to training computers to distinguish between ‘Jim Crow’ and ‘not Jim Crow,’ to creating educational modules so others can use these findings. We’re eager to share what we’ve learned and help others build upon it,” said Henley.

On the Books began in 2019 as part of the national Collections as Data: Part to Whole project, funded by The Andrew W. Mellon Foundation. Subsequent funding from the ARL Venture Fund and from the University Libraries’ internal IDEA Action grants allowed the work to continue. The newest grant from The Mellon Foundation will conclude at the end of 2023.

Learn more about On the Books through the project website and the 2021 article, “Algorithms of resistance reveal extent of North Carolina’s Jim Crow laws.”