ASIST 2012 Annual Meeting
Baltimore, MD, October 26-30, 2012
Mining Classifications from Social-Ecological Databases
Scott Jensen, Miao Chen, Xiaozhong Liu, Beth Plale and David Leake
Social-ecological research is characteristic of long-tail science: many region-specific studies of social and ecological phenomena collectively yield a large volume of small, highly heterogeneous data sets. This heterogeneity makes it difficult to determine whether a particular data set is applicable to a new research question, hindering the reuse of data that has often been collected through extensive effort. In this paper we present results from the automatic classification of social-ecological data into categories defined by a domain model called the SES Framework. We applied our methods to the classification of a relational database containing over 18 years of research on forest systems. Our preliminary results suggest that decision tree-based classifiers combined with textual features perform well at this task. Furthermore, social-ecological data sets are found to exhibit distinct classification features, in that the results are promising even for classes that comprise a relatively small portion of the database.