Follow the Data: How Astronomers Use and Reuse Data
Ashley Sands, Christine L. Borgman, Laura Wynholds and Sharon Traweek

Libraries are collecting many types of materials other than traditional, formal publications. Many research libraries have begun to curate, preserve, and provide access to datasets. Our research assesses what new infrastructures, divisions of labor, knowledge, and expertise are necessary for the proper care of data. Between May 2011 and February 2012, we conducted fourteen interviews that focused on the use of Sloan Digital Sky Survey (SDSS) data. SDSS is a multi-faceted, multi-phased, data-driven telescope project with hundreds of collaborators and thousands of users of its open data. The Follow the Data interview protocol identifies a single publication authored by each interviewee and uses it as a lens, looking backward and forward to identify data uses leading into and out of the publication.

We have four preliminary findings from this research. Astronomy has three distinctive kinds of publications: traditional research articles, technical papers, and data description papers. First, these distinct publication types reflect different data citation practices in astronomy. Whether or not these practices scale has implications for information scientists working to establish data citation standards. Second, individual research articles may draw upon multiple distinct datasets, including catalogs, source lists, data releases, value-added catalogs, cross-match catalogs, simulation outputs, data papers, and technical papers, as well as data contained within science papers. Third, astronomers locate and obtain data in multiple ways, using both formal and informal methods. Fourth, personal contact still ranks as an important method for obtaining data despite open technologies. ASIS&T members should understand how the science is changing and the more nuanced ways in which information retrieval and data curation need to be considered. Our continued research into large sky surveys, particularly SDSS data practices, will be used to inform the design of open data projects and digital libraries.