ASIS&T 2013 Annual Meeting 
Montréal, Québec, Canada | November 1-5, 2013

 
Data Mining for “Big Archives” Analysis: a Case Study

Maria Esteva, University of Texas at Austin
Weijia Xu, University of Texas at Austin
Jeffrey Tang, University of Texas at Austin
Karthik Padmanabhan, University of Texas at Austin

Tuesday, 10:30am


Summary

The use of computational methods for archival analysis has not been thoroughly explored by the archival community. In this paper we present a case of archival analysis using a combination of data mining methods. The team of researchers, composed of archivists and computer scientists, used a collection of declassified Department of State cables as a case study. The methods implemented included Support Vector Machine (SVM) and Association Rule Mining. Combined in an analysis workflow, the results of the different methods allowed the team to understand how security classification changed over time and to generate descriptions for the cables in each security class based on patterns found inductively in the collection. The application of data mining methods to archival analysis has the potential to change the way in which big digital archival collections are processed.