ASIST 2012 Annual Meeting 
Baltimore, MD, October 26-30, 2012 

Mining Classifications from Social-Ecological Databases
Scott Jensen, Miao Chen, Xiaozhong Liu, Beth Plale and David Leake

Monday, 6:30pm


Social-ecological research is characteristic of long-tail science, with many region-specific studies of social and ecological phenomena that collectively yield a large volume of highly heterogeneous, small data sets. This variability makes it difficult to determine the applicability of a particular data set to a new research question, hindering the reuse of data that has often been collected through extensive effort. In this paper we present results of automatic classification of social-ecological data into categories defined by a domain model called the SES Framework. We have applied our methods to the classification of a relational database containing over 18 years of research on forest systems. Our preliminary results suggest that decision-tree-based classifiers combined with textual features perform well at this task. Furthermore, social-ecological data sets appear to exhibit distinctive classification features: the results are promising even for classes that make up a relatively small portion of the database.