Editor’s Note: This paper has been edited for the Bulletin from Mr. Lu’s much longer submission to the SIG/III 2002 International Paper Competition, in which it was awarded first place. The original article contains numerous references to support the statistics and data presented here, but space precludes including them in the Bulletin. Mr. Lu presented his paper at the 2002 ASIST Annual Meeting.

Creating Special Literature Resource Databases in Western China Under a Digital Environment
by Lu, Ji

Lu, Ji is affiliated with the Yunnan Province Library in Kunming, P. R. China 650031. He can be reached by e-mail at luji368@msn.com

The new technological revolution, with digitization, networking and information as its major features, is sweeping across the globe and exerting great influence upon social, political, economic and cultural activities as well as people’s daily lives. Sharing global resources requires many diverse databases. Since the end of the 1980s, building on their faster economic development, Southern, Eastern, Northern and Central China have been digitizing traditional Chinese information resources and have made some progress. There were only 806 (mainly bibliographic) databases in 1991, but the number had increased to over 8000 by 2000.

However, Western China, which is rich in historical and cultural resources, has lagged behind in information digitization and has been unable to turn its resource advantage into a development advantage. Western China has started construction of digital libraries, but only some traditional catalogs (with various forms of entries) and a limited number of small graphical and textual databases on local tourism or minority ethnic group traditions have been made available. These databases also suffer from uneven quality, unnecessary duplication and low rates of access and use. There is no authorized agency to coordinate information resource acquisition or digital database construction in Western China.

This paper focuses on the special characteristics and position of Western China literature resources; their distribution in major libraries in Western China; existing problems in collecting, organizing and providing access to these special resources; governmental policies and investments for information resource development; and possible approaches for developing the special resource digital databases in Western China.

Western China Literature Resources

Western China is composed of the following 12 provinces, municipalities and autonomous regions: Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang, Sichuan, Chongqing, Yunnan, Guizhou, Tibet, Guangxi and Inner Mongolia. This vast area includes 5.38 million square kilometers (56 percent of the country) and 358.46 million people (in 1999), or 23 percent of the total.

Western China boasts very rich natural and cultural resources. Among China’s 56 diverse nationalities, 44 of them are in Western China. Western China is also the location of such important and symbolic historical cultural remains as the Mogao Grottoes in Dunhuang, the terra cotta figures of warriors in the First Emperor’s Mausoleum, the ancient Loulan Kingdom, the Yuanmou Man site, the Potala Palace and the southern and northern Silk Routes.

Western China literatures, as discussed in this paper, refer to the following resources:

  • Local literatures – the information of specific localities that serves as the carrier of their culture, comprehensively recording and maintaining the historical conditions and events related to local politics, economy, culture, education and other important matters.
  • The literature of local nationalities or ethnic groups collected and maintained in the public libraries.
  • Other multimedia literatures to be developed in Western China.

The second category, the literature of nationalities (or ethnic groups), is the totality of the specially featured literature resources about their history, geography, humanities, natural surroundings, economic conditions, culture and so forth, formed and accumulated in specific historical periods and specific regions. Because of the special features of the social histories, economies and cultures of the minority nationalities, their historical literatures can be divided into several basic types:

Orally Transmitted Literatures: A few ethnic groups did not historically develop writing scripts of their own, and their understanding of the natural world and their society was passed on from generation to generation, usually by word of mouth. Even those ethnic groups who had their own scripts commonly resorted to oral transmission for passing on information, experience and knowledge.

Simple Graphic Symbol Literatures: A few of the minority nationalities used simple graphic patterns to record and transfer cultural information. For instance, those on woodcuts, bamboo carvings, stone carvings, gravestone inscriptions, sculptures or bronze ware reflect the production, life, historical events and historical figures, religious legends and religious dances of the minority nationalities. This kind of literature can often be collected and maintained by means of replicating, rubbing or photographic copying.

Textual Literatures of Minority Nationalities: Some ethnic groups used or are still using their own writing scripts for recording and transferring their cultural information. For example, among the 26 nationalities in Yunnan, there are 11 nationalities that formerly used 24 writing scripts. Following the reform of their writing systems, there are currently 21 systems available for these 11 minority groups to use. Historical literatures written in such scripts are rather plentiful, forming, for instance, the Dongba literature in the Dongba pictographic characters of the Naxi people, the Yi literature, the Tibetan literature, the Mongolian literature and the Bai literature.

Literatures Containing Cultural Information about Specific Ethnic Groups Written in Chinese or the Writing Systems of Other Nationalities: In China the writing system of the majority Han nationality is dominant, and it is directly and indirectly used for recording the cultural information of minority nationalities or ethnic groups. The huge volume of such writings is very valuable.

There are also many other materials that are objects of study for scholars interested in the distinctive cultures, history and geography of Western China. These include archeological or other preserved sites and areas, as well as artistic and historical artifacts such as the oracle bone scripts, bamboo tablets, wooden tablets, silk books and their hand-sheets, block-printed copies, ancient calligraphy works and paintings, and inscription rubbings.

Special Literature Resources in Major Libraries in Western China

According to incomplete statistics, libraries in China hold more than 2.2 million volumes of “rare” books made before 1794; 26.45 million volumes of “ancient” books published before 1911; and 6300 periodicals, magazines and newspapers published before 1949. Some of these materials have been microfilmed in libraries for protection and use. Statistics from the end of 1999 show that 36.3 million frames had been shot nationwide, covering 2349 rare books, 2160 newspapers and 8325 periodicals. Other materials will be preserved using digital techniques. However, only the National Library and the local libraries of Shanghai City, Zhejiang Province, Guangdong Province and Shenzhen City currently do this microfilming.

In Western China, only Guangxi Library has established a certain number of database entries, including 560,000 catalog records, and a few small-scale, specialized full-text databases. However, provinces and autonomous regions that are extremely rich in literature resources, such as Sichuan, Shaanxi, Gansu, Guizhou, Yunnan and Inner Mongolia, have not yet started to digitize their literature information. Take Inner Mongolia Autonomous Region Library as an example. Its collections of Mongolian literature are extremely plentiful and form an independent system. Among them are 7800 volumes of ancient Mongolian books and 71,200 volumes of Mongolian books published after the founding of the People’s Republic of China, in addition to 6300 volumes of Slav-Mongolian books, 14,600 volumes of ancient Tibetan books and 3400 volumes of ancient Manchu books. These collections include the full sets of the Tibetan version of the “Ganjur” scripture and the Mongolian versions of both the “Ganjur” and “Danjur” scriptures. Currently, there are 44,730 volumes of Tibetan and Chinese local literature books in Inner Mongolia that are carefully kept in the library. Through decades of collation and processing, the book collections in the major libraries in Western China have formed comparatively complete systems of considerable scale with special features. They are among the most important collections of Western China literature held by libraries and are our first choice for digitization.

At present, however, the level of construction and management of China’s libraries is still low in comparison with that of the developed countries, and this gap is most prominent in Western China. By the end of 2001 in China, there were altogether 2689 public libraries above the county level, of which 963 were in the 12 provinces and regions in Western China (2001 statistics). They held 136.8 million of the roughly 400 million books in the whole country. In these areas there were 13 provincial libraries keeping 24.78 million of the 132 million books in all the provincial libraries in China. The problems of inadequate book collections, low book quality, poor library facilities and shortage of book purchase funds at various levels of libraries have not yet been solved. In the year 2000, 738 or 27.6 percent of the libraries in China did not purchase even one new book for a whole year; 70 percent of these libraries are located in Western China.

Governmental Policies and Investments in Information Resource Development

As the pace of economic globalization increases, the Chinese government has been emphasizing the importance and urgency of speeding up information system construction to raise China’s comprehensive competitiveness. The government has promulgated relevant rules and regulations and has instructed the ministries and commissions concerned to work out detailed implementation guidelines and applicable measures. The government of China has started to realize that digital libraries are key to construction of the digitalized China.

Starting with the Ninth Five-Year Plan, the Chinese central government has increased its investment in the construction of information resource databases and has made special-purpose financial appropriations available. In concert with the central government, local governments have allotted large amounts of funding for basic research and development of information databases. By the end of 2001 the central government and the local governments involved had invested 8.16 billion and 14.916 billion Yuan RMB respectively (about 1 billion and 1.8 billion US dollars) to improve hardware in local libraries and support digitization. The National Library and the Shanghai Library launched digital library projects in succession, and the provinces and regions in Western China also responded positively. Nonetheless, almost all of this support has been concentrated in Eastern China. Only one project has been supported in Western China, at Yunnan Provincial Library, with funding in excess of 50 million Yuan (about 6.05 million USD).

In May 2000 the Chinese Ministry of Culture (Proposals Concerning Western China Development Strategies and Strengthening Western China’s Cultural Construction) put forward 15 suggestions to promote the construction of a public library network system and digital libraries in Western China. In August 2000 the Ministry of Science and Technology pointed out the importance of digitization in Western China (Proposals Concerning Scientific and Technological Work in Western China Development). The General Planning for the Western China Development in the Period of the Tenth Five-Year Plan clearly pointed out the importance of vigorously pushing forward information system construction in the large and medium-sized cities, perfecting the computer information networks and developing the public information service platforms (Leading Team for Western China Development, State Council. General Planning for Western China Development During the Tenth “Five-Year Plan,” July 2002). “Proposals Concerning Several Policies and Measures for Western China Development” (Leading Team for Western China Development, State Council, August 2001) also emphasized the favorable policies adopted for the use of the state special-purpose subsidy funds both for cultural facility maintenance and for the cultural units at and above the county level in the border areas in Western China. In addition, the provinces and regions in Western China, when formulating their development guidelines for the Tenth Five-Year Plan, all took information system construction as the priority for their near-term work, with resource digitization again at the fore.

In May 2002 the Ministry of Culture and the Ministry of Finance issued the “Circular Regarding Application of the All-China Cultural Information Resource Sharing Projects.” The circular also pointed out that the first batch of the 25-million-Yuan (3 million USD) special purpose funds for the “Sharing Projects” arranged by the central government as well as the relevant complementary funds to be allotted by the local governments for the year should mainly be used to support Western China and other underdeveloped areas for grassroots center construction. As of May 2002, 12 libraries in Western China had been enlisted among the member libraries covered by the digitization projects.

Creating Literature Resource Databases with the Characteristics of Western China

In China the state information infrastructure has taken shape, and the main communication network has also been established, including the data network, the optical fiber trunk network, the ATM network, the SDH (synchronous digital hierarchy) network and the optical fiber linkage network. The wide-band networks under construction in some large and medium-sized cities in Western China will provide the necessary communication platform for digitizing Western China literature resources.

According to Xu Wenbo, the head of China Digital Library Development Strategy Group, the targets to be achieved for the construction of databases in the China Digital Library Project include that the system should be distributed with uniform standards, be able to work on unified network platform and be expandable (Xu, W. Thoughts on Creating Digital Libraries. May 2002. Available at www.ccnt.com.cn/library/luntan/show.htm?id=20010302002).

The digitization of Western China information resources should stress selection and quality rather than scope or amount of development in the initial phase. It has a clearly defined target: to enrich Chinese network resources so that the country’s characteristics can be more fully presented through the global Chinese resource networks. The establishment of a special resource database in Western China is an indispensable link in the formation of the all-China Chinese database groups and is similar to the construction of special databases in other regions or localities. Such databases in Western China cannot be implemented independently of the general framework of the state distributed asynchronous system of Chinese resources.

The Selection of Database Resources

Since the Western China provinces and regions are vast in area, with relatively poor communications and unevenly developed library facilities, each library involved in special database development and construction should assign relevant personnel to investigate local resources as its conditions allow. They should study and collect relevant information from such sources as cultural organizations, art groups, museums, nationality communities, religious communities and geological departments, as well as the villages in all the districts, prefectures and cities. The construction of the resource databases should be phased and layered to establish the multiple hierarchies of the database protection systems step by step. Participating libraries and enterprises should start with small special projects that can eventually converge into the state-level special information resource database group services. Some steps necessary to ensure eventual merger include the following:

  • Each library should finish its investigations within a specific period and work out “White Paper Books for the Construction of Special Local and Minority Nationalities Literature Database.” At that point, libraries should develop plans for “Special Resource Catalogs” and proceed to build up the catalog databases for special books in their own collections. Finally, they should consolidate the raw materials and run online checks for duplicate records to avoid repeated construction and ensure the completeness of the materials collected.
  • Based on the China Combined Catalogs of the Local Literatures, China Catalogs of Rare Books, and the Catalogs of Literatures for Minority Nationalities, as worked out by each library with support from the literature information organizations that have completed catalog database construction for standard books, the special book catalog databases of Western China should be organized and established as soon as possible. These catalogs can be the basis of the local combined catalogs, can perfect and standardize the construction of the catalog data, promote inter-library loan and resource sharing and greatly reduce duplication of effort in database construction.
  • The projects should make full use of the database resources already established or currently under construction by various literature information organizations. Asynchronous database platforms for the special resources should be constructed, and unnecessary duplication of effort by libraries should be avoided, so as to form a number of special literature resource database groups in a short period of time. If a project for a specialized database has been started, special attention is needed to ensure that the general design method and framework planning for this digital system are consistent with the state construction plan for digital libraries.
  • Each library should investigate and analyze the original data available from the special full-text databases in existence or under the current construction as well as from electronic publications. There should be an appropriate way to use and catalog those non-digital media resources from radio broadcasting stations and television stations as well as from research units of various kinds, artistic groups and personages. The relevant software such as text-retrieval systems or optical character recognition could be used to do the necessary transformation and processing. The first step would be to make a great effort to produce multimedia databases with high quality content that relate to ecological tourism, ethnic culture, rare species, special minerals, flowers, butterflies, and ethnic dances, plays, operas and costume as the trial project to introduce the resources of a certain area.
  • The resources on the Internet should be effectively organized and utilized in support of the key projects and the key disciplinary development direction to build up all the specialized databases.
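
The duplicate-record check mentioned in the first step above can be illustrated with a minimal sketch. It assumes that each library exports catalog records as simple dictionaries; the field names, the sample records and the matching rule (title, author and year after normalization) are hypothetical and are not taken from any catalog standard cited in this paper.

```python
import unicodedata


def match_key(record):
    """Build a comparison key from title, author and year.

    NFKC normalization folds full-width and compatibility characters;
    casefolding and whitespace removal absorb trivial cataloging differences.
    """
    parts = []
    for field in ("title", "author", "year"):
        text = unicodedata.normalize("NFKC", str(record.get(field, "")))
        parts.append("".join(text.casefold().split()))
    return "|".join(parts)


def find_duplicates(records):
    """Return groups of two or more records that share the same match key."""
    groups = {}
    for record in records:
        groups.setdefault(match_key(record), []).append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical records contributed by three libraries.
records = [
    {"library": "A", "title": "Dongba Manuscripts", "author": "Anon.", "year": 1900},
    {"library": "B", "title": "dongba  manuscripts", "author": "ANON.", "year": "1900"},
    {"library": "C", "title": "Yi Medical Texts", "author": "Anon.", "year": 1900},
]
duplicates = find_duplicates(records)
for group in duplicates:
    print([record["library"] for record in group])
```

A rule this simple would only flag candidates for a cataloger to review; editions, reprints and variant transcriptions of minority-language titles would need more careful matching.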

Organization of Databases

Scientific, standardized and normalized digital database construction is the fundamental element of network information dissemination. Standardization in libraries in Western China has lagged behind that in other parts of the country, especially with respect to seeking, collecting, cataloging, indexing and managing local literatures. The literatures of nationalities and their ancient books, in most cases, were handled by different libraries in different ways. Up to now, nearly all the classification systems seen or heard of in other parts of China such as “Four-Division Classification,” “Liu Guojun Classification,” “Pi Gaopin Classification” and the “Dewey Decimal Classification” have been adopted to classify the collections of special literatures in the libraries of Western China, including those at the provincial level.

To digitize the ancient book resources that are national legacies, the first problem that must therefore be solved is the normalization and standardization of cataloging methods. In China, the State Standard Commission should provide the solutions ahead of the digitization schedule. For instance, it should work out classifications for ancient books and literature, as well as cataloging and indexing methods for ethno-nationality literature, for unified use across China.

In addition to meeting the literature processing standards, database construction must also comply with the relevant network transfer protocols such as X.25, TCP/IP, ATM and DTM, and must use standard digital literature formats such as MARC, JPEG, GIF, PNG, PDF, MPEG-X, TXT, RealMedia and MOV. It also needs to employ such mature network information processing tools as XML and the Z39.50 information retrieval protocol, which are in global use.
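
As a rough illustration of why a shared XML representation helps data interchange, the sketch below serializes one descriptive record to XML and reads it back using Python's standard library. The element names and the sample record are hypothetical; they do not follow MARC or any official schema, which a real project would adopt instead.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialize a flat record (field name -> value) to an XML string."""
    root = ET.Element("record")
    for name, value in record.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Parse the XML produced by record_to_xml back into a dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Hypothetical record; the field names are assumptions for illustration only.
sample = {
    "title": "Naxi Dongba Scripture",
    "script": "Dongba",
    "holding_library": "Example Library",
    "image_format": "JPEG",
}
xml_text = record_to_xml(sample)
restored = xml_to_record(xml_text)
print(xml_text)
```

Because both sides agree on the element names, a record written by one library can be read by another without knowing anything about the originating system.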

Special database construction involves a great amount of ancient literature and various kinds of unique and rare copies of books, and the information carrier forms include bronze-ware and stoneware, bamboo and wooden tablets, silk books, hand-copied books and mimeographed books. Therefore, in the process of digitization, besides the problems that the libraries in Western China may encounter with modern literature – technical equipment, standard format selection, copyright and protection – they may also face special problems. These include adopting and stipulating data format standards for the relevant ancient books, switching between databases for literature written in traditional Chinese characters and those in modern ones, automatic word segmentation, and expression choice and classification for ancient Chinese articles. Additionally, in the digitization of the local ethno-nationality literatures, normalized and standardized software for processing nationality or ethnic writing scripts (ancient characters) is also indispensable.
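
Two of the special problems named above, switching between traditional and modern characters and automatic segmentation, can be sketched in a few lines. The conversion table and lexicon below are tiny and purely illustrative, and forward maximum matching is used here only as one well-known baseline technique, not as the method any Chinese project is known to have adopted. Real conversion is not always one-to-one, and ancient Chinese texts are considerably harder to segment than this example suggests.

```python
# Illustrative table only; production tables hold thousands of entries.
TRAD_TO_SIMP = str.maketrans(
    {"雲": "云", "圖": "图", "書": "书", "館": "馆", "獻": "献"}
)


def to_simplified(text):
    """Map traditional characters to simplified ones, leaving others as-is."""
    return text.translate(TRAD_TO_SIMP)


def segment(text, lexicon, max_len=4):
    """Forward maximum matching.

    At each position, take the longest string found in the lexicon;
    fall back to a single character when nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical lexicon of modern (simplified) terms.
LEXICON = {"云南", "图书馆", "文献"}
query = to_simplified("雲南圖書館文獻")
tokens = segment(query, LEXICON)
print(query, tokens)
```

Normalizing a query to one character set before segmentation lets a single index serve readers who type in either form.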

The compulsory adoption and implementation of international, state, professional and de facto standards play a fundamental role in standardizing database construction in Western China and in raising its quality. Compulsory adoption can prevent the repeated construction caused by incompatibility, reduce production costs and lay the foundations for future data interchange between libraries and countries. As technology continues to change, new standards and criteria will also be established. Therefore, all aspects of literature digitization, network design, system integration, and software and hardware configuration should follow strict standards, and adequate margin should be left for future expansion and updating.

Requirements for Expansion and Appraisal

Special literature databases are important components of the general databases. Their functional modules interface directly with database users, and their kernel data table spaces and data files should have very strong expansion capabilities. In designing the databases, adequate consideration should be given to database space planning and management flexibility. Rational configuration of the user schema, data types, data integrity and data dictionaries will be very important to effective database management in the future.
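
The role of data types, integrity rules and a data dictionary can be shown with a small sketch using SQLite from Python's standard library. The table names, column names and the list of carrier types are hypothetical choices made for this illustration; a production system would use whatever database platform and schema the national plan specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    -- Data dictionary: the controlled list of writing scripts.
    CREATE TABLE script (
        code TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Items must reference a known script and a permitted carrier type.
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        script_code TEXT NOT NULL REFERENCES script(code),
        carrier TEXT NOT NULL
            CHECK (carrier IN ('paper', 'bamboo', 'stone', 'silk', 'bronze'))
    );
    INSERT INTO script (code, name) VALUES ('dongba', 'Dongba pictographs');
    """
)


def try_insert(title, script_code, carrier):
    """Insert one item; return False if an integrity rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item (title, script_code, carrier) VALUES (?, ?, ?)",
                (title, script_code, carrier),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Creation Scripture", "dongba", "paper")
rejected = try_insert("Unlabeled Fragment", "unknown", "paper")
item_count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
print(accepted, rejected, item_count)
```

Adding a newly supported script then means inserting one row into the dictionary table rather than changing application code, which is one simple way to leave room for expansion.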

After the digitization of the literatures and their protogenic cultural resources, the final products will be released in the form of networked databases. There is no significant difference between these databases and other information systems with respect to the standards for assessing them. All require powerful capabilities, operational stability, efficient use of system resources, the ability to run on various operating system platforms, open source code and prompt updating. The databases are created for use, after all, so they must also demonstrate very good retrievability and usability.

Distributions of the Special Databases

Generally, each digital product has two choices for its final safekeeping. The first approach is centralized management as in the case of the All-China Higher Education Literature Resource Guarantee System. The second is the distributed management approach. Each member library is itself the product storage place, and it uses the networks to complete interconnection and data sharing. The limitation of this mode is that, due to the unbalanced local software and hardware basis, some libraries’ choice of the raw material for their databases and their implementation of specifications and processing may not be of satisfactory quality. The centralized management approach is just the opposite of the distributed management mode in terms of the nature, advantages and disadvantages.

The distributed management approach should be adopted because it is much more suitable for current and future distributed database development trends. Distributed computation for constructing digital library resource databases in China is needed because of both technological considerations and the actual conditions in China. There is no need for supercomputers to run central databases, and there are no special demands on client terminals. A distributed approach enables users to draw on the idle time and storage space available on thousands of computers within the networks to complete library processing work that requires great amounts of computation. Distributed database objects can exist anywhere on the networks and are accessible to remote clients through invoked services. The programming languages and compilers used by distributed objects are transparent to users, as are the actual location of the objects on the networks and the operating systems in use. Meanwhile, distributed objects are dynamic and can be moved anywhere on the networks.
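
The idea that each member library stores its own product while clients see one combined resource can be sketched as a simple federated search. The class, the library names and the in-memory record lists below are hypothetical stand-ins; in a real deployment each call to `search` would be a network request, for example through a retrieval protocol such as Z39.50.

```python
class MemberLibrary:
    """Stand-in for one member library's local database (hypothetical)."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def search(self, term):
        """Return local records whose title contains the term, ignoring case."""
        needle = term.casefold()
        return [r for r in self._records if needle in r["title"].casefold()]


def federated_search(libraries, term):
    """Query every member library and merge the hits into one sorted list.

    The caller never needs to know where a record is stored; the holding
    library is attached to each hit only for display.
    """
    hits = []
    for library in libraries:
        for record in library.search(term):
            hits.append({"holding_library": library.name, **record})
    return sorted(hits, key=lambda hit: (hit["title"], hit["holding_library"]))


libraries = [
    MemberLibrary("North", [{"title": "Tibetan Ganjur"}]),
    MemberLibrary("South", [{"title": "Mongolian Ganjur"}, {"title": "Bai Folk Songs"}]),
]
results = federated_search(libraries, "ganjur")
for hit in results:
    print(hit["holding_library"], hit["title"])
```

The sketch also makes the stated limitation visible: the merged result is only as good as the records each member library supplies, which is why shared cataloging standards matter under this mode.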


Resource creation is the most sensitive part of digital library construction, and it is also where the value of the digital information platforms lies. The development of digital libraries can

    • Promote the optimization of each library’s resource structures and spur the construction of databases that bear their special features.
    • Exert effective protection for literature resources and prevent them from being lost.
    • Rescue large numbers of protogenic resources on the verge of extinction.
    • Converge and integrate various kinds of media resources spread around different areas and regions, thus strengthening the concept of resource sharing.
    • Promote the standardization and normalization of the literature technology.
    • Effect the unprecedented large-scale application of network and computer technology in library construction and development.

Based on these considerations, the author of this paper, through his observation and analysis of the existing conditions in Western China, including problems encountered while participating personally in library construction here, has formed the following proposals:

    • More investment should be made.
    • Unified construction planning should be worked out.
    • Efforts should be made to converge and integrate the raw material of the resources.
    • Technical methods should be standardized.
    • Construction should be stepped and layered.

Thus, through a comprehensive effort we may be better able to seize the opportunities and meet the challenges encountered in the construction of digital libraries in Western China.

We should be aware that Western China has very rich natural, cultural and ecological resources as well as the literature derived from them, especially the protogenic resources. Nevertheless, with the further development and transformation of the economy, society and cultures, and especially with the growing scope and depth of exchange among people, the protogenic resources in Western China are facing the same threats of loss and extinction that exist in other developing countries. Therefore, we should use all modern means to record, maintain, develop and promote the excellent Western China resources, to turn the region’s resource advantages into development advantages and to add universality to the great richness of Chinese information resources.

