Bulletin, June/July 2006

Growing Vocabularies for Plant Identification and Scientific Learning


by Jane Greenberg, Bryan Heidorn, Stephen Seiberling and Alan S. Weakley

Jane Greenberg is associate professor and director of the SILS Metadata Research Center at the University of North Carolina at Chapel Hill. She can be reached at janeg<at>ils.unc.edu. 

Bryan Heidorn is associate professor at the University of Illinois at Urbana-Champaign. He can be reached at pheidorn<at>uiuc.edu.

Stephen Seiberling and Alan S. Weakley are affiliated with the Herbarium at the University of North Carolina at Chapel Hill. They can be reached, respectively, at sseiber<at>email.unc.edu and weakley<at>unc.edu.

Today, institutions and partnerships of national and international stature such as the Royal Botanic Gardens, Kew and the San Diego Zoo are digitizing scientific specimens and targeting students at all levels of learning, both academic and lifelong. They want to reach audiences beyond seasoned scientists and enrich science curricula. Web connectivity and digitization alone, however, are not sufficient for student access to primary scientific resources, because the student’s knowledge base – and thus working vocabulary – differs greatly from the scientist’s professional vocabulary. Student vocabulary is generally a mix of common terminology and newly learned scientific terms, whereas scientists communicate with a fine-grained, descriptively rich vocabulary and scientific terminology.

The student/scientist vocabulary gap must be bridged in order for students to take advantage of digital initiatives containing scientific specimens. The University of North Carolina Plant Language Team (U-PLanT) has been addressing this need over the last few years by developing a series of vocabulary tools for students learning scientific vocabulary and conducting plant identification. Through Project OpenKey, U-PLanT and the University of Illinois at Urbana-Champaign (UIUC) have also begun to study the complexity of descriptive vocabulary for plants. This article reports on U-PLanT’s vocabulary solutions and activities directed at closing the student/scientist vocabulary gap. 

Plant Identification Vocabulary Needs
Plant identification is a common part of school curricula because of the historical and present significance of plants and their critical role in harnessing the sun’s energy to support animal life. The object of plant identification is to determine the plant’s scientific name (species taxon). Plant identification is generally conducted with a tool known as a plant key.

Plant classification can be traced back more than 2000 years to the Greek philosopher Aristotle (384-322 BC), who classified plants by characteristics such as shape and habitat. More familiar is the hierarchical Linnaean system emphasizing plant morphology, developed in the mid-1700s by the Swedish naturalist Carl von Linné. Modern plant taxonomy has an intricate vocabulary organized under the seven-level hierarchy of kingdom, phylum, class, order, family, genus and species.

Plant identification also requires descriptive vocabulary – terms providing interpretative information representing plant characters and character states. A plant character is a property that can be observed, measured, counted or examined in some fashion. Examples include growth habit, leaf shape, leaf margin (the edge of the leaf) and stem type. Character states are expressions or states of plant characters – they are the possible values for plant characters and reveal the fine-grained terminology that botanists use to describe plants. For example, a student may describe a leaf margin as being “smooth edged” or “toothed,” whereas a botanist would identify different levels of serration as illustrated in Example 1. Descriptive vocabulary is required for identifying and distinguishing specimens and for learning plant taxonomy. Plant taxonomy and descriptive vocabulary present challenges central to U-PLanT’s work. 

Example 1. Leaf Margin Examples
[Illustration not reproduced; labeled margin type: Doubly Serrate]
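The relationship between a plant character, its character states and a student's everyday wording can be sketched as a small data structure. The Python sketch below is illustrative only: the class name, the field names and the mapping from student terms to candidate states are assumptions made for this example and are not taken from U-PLanT's tools. The state "entire" is the standard botanical term for a smooth leaf margin.

```python
from dataclasses import dataclass, field


@dataclass
class PlantCharacter:
    """An observable plant property together with its allowed character states."""
    name: str
    states: tuple[str, ...]
    # Hypothetical mapping from everyday student wording to candidate states.
    student_terms: dict[str, tuple[str, ...]] = field(default_factory=dict)

    def candidates(self, student_term: str) -> tuple[str, ...]:
        """Return the scientific states a student's term might correspond to."""
        return self.student_terms.get(student_term.strip().lower(), ())


# Illustrative values only; a real vocabulary would list many more states.
leaf_margin = PlantCharacter(
    name="leaf margin",
    states=("entire", "serrate", "doubly serrate"),
    student_terms={
        "smooth edged": ("entire",),
        "toothed": ("serrate", "doubly serrate"),
    },
)

print(leaf_margin.candidates("Toothed"))
```

A single student term such as "toothed" maps to several candidate states here, which is one way to picture the gap between common and fine-grained vocabulary.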

U-PLanT: Research and Development
U-PLanT is a partnership of UNC’s Herbarium and the School of Information and Library Science, the North Carolina Botanical Garden and, more recently, the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign. U-PLanT formed to address vocabulary issues central to the educational use of digital specimens for plant identification. Three key projects catalyzing U-PLanT’s formation are BOTNET, a digital herbarium for plant specimens; the Plant Information Center (PIC), a digital learning center; and Project OpenKey, a collaboration between UNC and UIUC providing access to botanical resources through polyclave plant keys visually capturing the way botanical experts identify species.

These three projects have focused on plant identification, specifically North Carolina trees. OpenKey, which was initiated at UIUC, has also focused on the identification of Illinois prairie plants. Primary research and development questions underlying these projects are:
     1. What tools can help students learn botanical terminology?
     2. What steps aid vocabulary development for plant keys
         supporting plant identification?
     3. What principles guide descriptive vocabulary development for
         plant identification?

Research Methods and Inquiry
A number of research projects have been conducted through BOTNET, PIC and OpenKey. Several of these research projects addressed vocabulary issues more intimately than others, although all of the research has helped U-PLanT to better understand vocabulary issues specific to science education and digital initiatives containing primary resources. Research activities have included the following:

  • Interviews with botanical experts about plant vocabulary
  • Meta-analyses of a range of plant vocabulary sources
  • Experiments testing the feasibility of the Dublin Core metadata standard for specimen representation
  • Three PIC usability studies
  • Three metadata creation studies involving non-experts
  • An analysis of the general public’s botany-related frequently asked questions (FAQs), logged at the North Carolina Botanical Garden.

Development Solutions
U-PLanT has implemented several solutions to help address the student/scientist vocabulary gap and make primary scientific resources accessible to students. These solutions include 1) a suite of vocabulary tools, 2) a process model outlining vocabulary development steps and 3) the identification of guiding principles for vocabulary development.

A suite of vocabulary tools. U-PLanT’s efforts have led to the development of a series of vocabulary tools. Vocabulary terms and definitions presented in these tools have been obtained from close to 20 bibliographic sources and from U-PLanT members with expertise in botany. Development has been incremental, and all of the tools are Web accessible via www.ibiblio.org/pic or www.ibiblio.org/openkey. The tools are listed here.

  1. Technical Plant Glossary. A highly technical plant glossary linking to a digital version of Vascular Plant Systematics by Radford, Dickison, Massey & Bell, published in 1976 by Harper and Row.
  2. Student Botanical Dictionary. A dictionary that includes 157 terms fundamental to studying botany. 
  3. UNC-OpenKey Glossary of Botanical Terms. A glossary with close to 600 terms defining vocabulary used in the Common Trees of the North Carolina Piedmont Polyclave Plant Key, produced as part of Project OpenKey. Terms have been extracted from the UNC-UIUC Conceptual Table of Descriptive Vocabulary Terms – www.isrl.uiuc.edu/~openkey/docs/OpenKey_Taxon_Data_Sheet.doc.
  4. Botanical Dictionary. A comprehensive dictionary containing approximately 2000 terms for higher education botany students. 

A process model. U-PLanT has identified seven general steps underlying its vocabulary development, which form the basis of a process model. These steps continue to guide the iterative development of U-PLanT vocabularies as digitization extends to new sets of taxa. These steps may be useful to other digitization efforts focusing on student access to primary scientific resources.

Descriptions of the seven steps defining the process model follow below:

  1. Identify. Identify existing vocabularies that are useful to project goals. The cliché, “Why reinvent the wheel?” is applicable to vocabulary development.
  2. Evaluate. Use practical and economic measures to identify what is useful and what is not useful in existing vocabulary tools. Vocabulary sources requiring limited revision should be adapted to project needs. Recognize, however, when it is more efficient to build a new tool, rather than expend resources on vocabularies requiring too much revision.
  3. Modify. Enhance, extend and delete vocabulary in existing tools to meet project needs. Existing vocabulary tools will likely require modification to fit project goals, particularly when dealing with primary resources.
  4. Transform. Make the vocabulary suitable for the access environment by, for instance, encoding it in X/HTML or XML for Web access.
  5. Implement. Make the vocabulary tool operational after transformation to the desirable format(s). 
  6. Test/evaluate. Once implemented, evaluate the vocabulary’s functionality. The usability studies noted in U-PLanT’s research and development activities above have permitted the evaluation of vocabulary tool effectiveness. 
  7. Revise. Vocabulary development is organic and revisions are required due to new discoveries and general collection growth. 
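As a rough illustration of the Transform step, the Python sketch below encodes a tiny glossary as XML using the standard library. The element names (glossary, entry, term, definition) and the function name are hypothetical choices for this example, not the schema used by U-PLanT; the two definitions are paraphrased from this article.

```python
import xml.etree.ElementTree as ET

# Two sample entries with definitions paraphrased from this article.
GLOSSARY = {
    "leaf margin": "The edge of the leaf.",
    "plant character": (
        "A property that can be observed, measured, counted "
        "or examined in some fashion."
    ),
}


def glossary_to_xml(entries):
    """Serialize a term-to-definition mapping as an XML string (hypothetical schema)."""
    root = ET.Element("glossary")
    for term in sorted(entries):
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "term").text = term
        ET.SubElement(entry, "definition").text = entries[term]
    return ET.tostring(root, encoding="unicode")


xml_text = glossary_to_xml(GLOSSARY)
print(xml_text)
```

Keeping the vocabulary in a plain mapping and generating the markup from it means the same source could later be transformed to other formats without editing the terms themselves.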

Example 2. Leaf Complexity

2a. Flowering dogwood
(Cornus florida)
Leaf complexity=simple

2b. Green ash
(Fraxinus pennsylvanica)
Leaf complexity=compound


Guiding principles for descriptive vocabulary development. U-PLanT in collaboration with UIUC has identified principles driving the creation of descriptive plant vocabulary. One guiding principle requires that all characters be coded using the most specific scientific terminology available. Consider the plant character leaf complexity: each flowering dogwood leaf (Example 2a) develops from a separate bud, so leaf complexity is simple, whereas a green ash leaf (Example 2b) is considered compound because, although the specimen appears to have multiple leaves, these are leaflets of a single leaf developed from a single bud.
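Coding each taxon with specific character states is what allows a polyclave (multi-access) key to narrow the candidates from any combination of observed characters, in any order. The Python sketch below shows the idea under stated assumptions: the table holds only four trees and two characters, the values are illustrative rather than drawn from the OpenKey data, and the function name is hypothetical.

```python
# Illustrative character data only; a real key records many more taxa and characters.
TAXA = {
    "Cornus florida": {"leaf complexity": "simple", "leaf arrangement": "opposite"},
    "Fraxinus pennsylvanica": {"leaf complexity": "compound", "leaf arrangement": "opposite"},
    "Acer rubrum": {"leaf complexity": "simple", "leaf arrangement": "opposite"},
    "Quercus alba": {"leaf complexity": "simple", "leaf arrangement": "alternate"},
}


def narrow(taxa, observations):
    """Return, in sorted order, the taxa matching every observed character state."""
    return sorted(
        name
        for name, characters in taxa.items()
        if all(characters.get(char) == state for char, state in observations.items())
    )


# Observations can be supplied in any order and in any combination.
print(narrow(TAXA, {"leaf complexity": "compound"}))
print(narrow(TAXA, {"leaf arrangement": "opposite", "leaf complexity": "simple"}))
```

In this sketch a taxon with no recorded value for an observed character is excluded; a production key might instead keep such taxa as possible matches.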

Each vocabulary entry must also be unambiguous and discrete. Unambiguous means there can be only one meaning for any particular word, while discrete means concepts have well-defined boundaries, making it evident whether or not a specimen exhibits a particular character state. In some cases a specimen may possess more than one character state value. For example, a flower may be either “red” or “blue” or both “red” and “blue.” The definitions of red and blue should be clear so that these values are unambiguous and discrete. Despite these principles, some characters, such as “height,” take continuous values.

The last guiding principle is that vocabulary must have consistency – that is, the same term must have the same meaning each time it is used. U-PLanT and UIUC have been able to accomplish this through a process of continuous refinement and negotiation. 
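Some of these principles lend themselves to simple automated checks. The Python sketch below is one possible approach, not a description of U-PLanT's process: it flags terms that appear with more than one definition and coded values that fall outside a character's declared states. The function names and the sample definitions are placeholders invented for this example.

```python
from collections import defaultdict


def find_inconsistent_terms(entries):
    """Return terms that appear with more than one distinct definition."""
    definitions = defaultdict(set)
    for term, definition in entries:
        definitions[term.strip().lower()].add(definition.strip().lower())
    return sorted(term for term, defs in definitions.items() if len(defs) > 1)


def undeclared_states(allowed, coded):
    """Return coded values that are not among the character's declared states."""
    return sorted(set(coded) - set(allowed))


# Placeholder definitions for illustration only.
entries = [
    ("simple", "a leaf with a single undivided blade"),
    ("compound", "a leaf divided into leaflets"),
    ("simple", "unbranched"),  # same term, second meaning: should be flagged
]

print(find_inconsistent_terms(entries))
print(undeclared_states(("simple", "compound"), ["simple", "complex"]))
```

Checks like these only surface candidates for review; deciding which meaning a term should carry remains a matter of refinement and negotiation among the people maintaining the vocabulary.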

U-PLanT was formed to address the student/scientist vocabulary gap and to facilitate student access to digital plant specimens. This paper identifies vocabulary challenges related to specimen identification and a series of solutions. Vocabulary development has been integral to the functionality and success of U-PLanT’s projects – BOTNET, PIC and Project OpenKey. In the larger world, vocabulary is essential to any educational digital initiative providing access to scientific specimens or other primary resources.

As education becomes increasingly global via the Web, people are increasingly calling for a single Web resource to describe all life on earth, as Edward O. Wilson, among others, notes in his 2003 article, “The Encyclopedia of Life,” in Trends in Ecology & Evolution. If we want to move in this direction, engage students in the process of scientific discovery and inform people about the natural world on a global scale, we need to continue to study vocabulary challenges and share our knowledge and experience. 

A fuller version of this paper was presented at the Dublin Core 2005 conference, Madrid, Spain, and can be found at www.slais.ubc.ca/PEOPLE/faculty/tennis-p/dcpapers/paper14.pdf.

We acknowledge the Institute of Museum and Library Services (IMLS) for funding support. Thank you also to Evelyn Daniel, SILS/UNC; Peter White, North Carolina Botanical Garden and UNC/Biology; and Kenneth R. Robertson, Illinois Natural History Survey for their contributions to U-PLanT.