Adding contextual information enhances the content and value of communications, yet it can also introduce risk and threaten privacy. A common piece of contextual information is location, but context extends to identity, user profile, e-mail address, time and more. Understanding context from the standpoint of privacy awareness requires a systematic conceptualization of privacy, context and personally identifiable information, as well as of the ways information flows, from processing and creation through transfer and acceptance. Numerous examples illustrate the potential chain of connections that can be revealed between a personal subject and context. Such information, made explicit, can undermine privacy policies. Integrating context- and location-aware services into software should therefore be approached cautiously and with a full understanding of the implications. The diagrammed scenarios provided can inform these considerations and the building of software specifications.

contextual information
privacy
personal information
location-based services
software engineering
information flow



Awareness of Context and Privacy

by Sabah Al-Fedaghi

Context awareness refers to the linking of changes in the environment with computing systems. Context is an important factor in improving human-computer communication. In ubiquitous computing, users work in a more dynamic context and can access services in a wide range of possible situations. A better understanding of context helps application designers determine which context-aware behaviors to support in their applications.

After reviewing many definitions of context, Dey et al. proposed the following definition: 

Context is any information that can be used to characterize the situation of an entity. An entity is a person, place or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves. [Italics added] [1, Section 2.2]

According to Drost,

This definition is not really consistent, because context is not information. From the broadest view possible, context is anything that could improve or influence the behavior of the application according to the environment in which the user operates. However [this] definition [is] one of the most widely used. [2, p. 4]

In context-aware applications, context information serves as input when a service is delivered. This information can be segregated into categories, and a categorization of context types helps application designers uncover the pieces of context that will most likely be useful in their applications. Many such categorizations have been proposed. Ryan et al. [3] segregate primary context into four categories: location, environment, identity and time. Dey et al. replace environment with activity, since environment “is a synonym for context and does not add to our investigation of context. Activity, on the other hand, answers a fundamental question of what is occurring in the situation” [1, Section 2.3]. From primary context we can derive secondary context (for example, identity may lead to e-mail address, birthdate, friends and location, which may lead in turn to other people in the same location and to activities occurring nearby). Again, according to Dey et al., 

This characterization helps designers choose context to use in their applications, structure the context they use and search out other relevant context. The four primary pieces of context indicate the types of information necessary for characterizing a situation[,] and their use as indices provide[s] a way for the context to be used and organized. [1, Section 2.3]
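To make this categorization concrete, the following minimal Python sketch (all names and data are illustrative assumptions, not drawn from the cited works) represents the four primary context types and shows how secondary context, such as an e-mail address or co-located people, might be derived from primary context.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PrimaryContext:
    """The four primary context types: identity, location, activity and time."""
    identity: str
    location: str
    activity: str
    time: str


# Hypothetical directory used to derive secondary context from identity.
USER_DIRECTORY: Dict[str, Dict[str, object]] = {
    "bob": {"email": "bob@example.com", "birthdate": "1980-05-01",
            "friends": ["alice", "carol"]},
}


def secondary_from_identity(ctx: PrimaryContext) -> Dict[str, object]:
    """Identity (primary) can index e-mail address, birthdate and friends (secondary)."""
    return USER_DIRECTORY.get(ctx.identity, {})


def secondary_from_location(ctx: PrimaryContext,
                            people_at: Dict[str, List[str]]) -> List[str]:
    """Location (primary) can index other people at the same place (secondary)."""
    return [p for p in people_at.get(ctx.location, []) if p != ctx.identity]


if __name__ == "__main__":
    ctx = PrimaryContext(identity="bob", location="ward1",
                         activity="consultation", time="2011-11-05T10:00")
    print(secondary_from_identity(ctx))
    print(secondary_from_location(ctx, {"ward1": ["bob", "alice"]}))
```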

In this paper we concentrate on studying the notion of context in the area of privacy. According to Tatli [4], “Considering the risks, the users should be in the position to control their location privacy. Even though location is the most used context, there are other context data that increase functionality.” He notes that previous works on context privacy consider a small subset of context like location, date and time, while Drost notes that “There have been various investigations on privacy in context aware systems, but many address only one threat in the field of privacy or they regard only one type of contextual information” [2, p. 2]. Topics discussed in these works include separating location and identity [5] and applying privacy rules to applications [6].

Schmidt et al. [7] proposed a context data model (Figure 1) for context data stemming from the user himself and his surroundings. Tatli [4] enhanced Schmidt et al.’s model with two categories: protected context and evaluated context. Protected context includes user identity, user profile, physical conditions and location. Evaluated context includes user morale, infrastructure, social environment, user tasks and time. According to Tatli,

The context data in the protected context group are distributed to other principals in order to increase the functionality and therefore require privacy protection. Any context data in this group can get benefit of blurring. The context data in the evaluated context group are not sent to other principals, but affect user’s privacy concerns and therefore are used to evaluate the privacy of the context data. [Italics added] [4, Section 4]


Figure 1. Schmidt et al.’s user context data model [4]

The resultant privacy dependence of context data in this privacy-aware model is illustrated in Figure 2.

Figure 2. Context interaction in privacy-aware data model [4]
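As a rough illustration of this model, the Python sketch below (the field names and the blurring rule are assumptions made for illustration, not part of Schmidt et al.’s or Tatli’s proposals) separates protected context, which is distributed and may be blurred, from evaluated context, which stays local and only informs the privacy evaluation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ProtectedContext:
    """Distributed to other principals, so it requires privacy protection;
    any of these fields can benefit from blurring."""
    user_identity: str
    user_profile: str
    physical_conditions: str
    location: Tuple[float, float]  # (latitude, longitude)


@dataclass
class EvaluatedContext:
    """Not sent to other principals; used only to evaluate how private the
    protected context data should be."""
    user_morale: str
    infrastructure: str
    social_environment: str
    user_tasks: str
    time: str


def blur_location(location: Tuple[float, float],
                  decimals: int = 2) -> Tuple[float, float]:
    """One possible blurring technique: reduce coordinate precision."""
    lat, lon = location
    return (round(lat, decimals), round(lon, decimals))
```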

This review of these models is not meant to be exhaustive; rather, it is intended to give a flavor of some approaches that conceptualize a privacy-aware context. In this paper, rather than proposing yet another enhancement of these models, we elect to develop an entirely different approach, since these models of privacy-aware context do not provide a systematic conceptualization of the notions of context and privacy. As further preparation, the next section scrutinizes the underlying concepts utilized in these models. 

Underlying Concepts in Context-Aware Models
As mentioned previously, in context-aware applications, context is defined as “any information that can be used to characterize the situation of an entity” [1]. This raises the issue of the difference between information used to characterize the situation of an entity and information used to characterize the entity itself. A situation and context are sometimes defined in terms of each other, for example, a situation is “the context that a person or organization is operating within at a specific point in time” [8, n.p.]. 

The terms context and situation are widely used in different study areas. In psychology, Harwood et al. [9] define “situation awareness” as having four components: where (spatial awareness), what (identity awareness), when (temporal awareness) and who (responsibility or automation awareness). Burke [10] conceptualized a situation as a scene (as in theatre) where agents enact behaviors (acts). Yet a situation is mostly viewed as a dynamic structure that initiates actions and unfolds them in a plot. In the area of context diagrams, “The 'context' of any situation is the external environment in which it exists… In process terms, the context contains other processes that provide inputs and outputs to and from the process in question” [11, n.p.].

Comparing these uses of context with “any information that can be used to characterize the situation of an entity” as defined by Dey, it appears that information about “the situation of an entity” is different from information about the entity (for example, user).

In this paper, we adopt a more systematic modeling methodology based on the notion of flow and apply it to context-aware systems. The examples given later will clarify this approach.

Interestingly, works on context privacy such as Tatli [4] do not define privacy or private information, although Tatli mentions privacy-aware data. According to Drost [2], “[p]rivacy is hard to define because it differs from person to person. Therefore many definitions of privacy exist” [p. 5]. Nevertheless, he notes that informational privacy is the “most relevant when dealing with a context-aware system,” defined as “the right to know what is done with a person’s personal data and which personal data is being gathered” [p. 6].

In this paper we adopt a more objective definition of personal information and apply it to context-aware systems. Thus the next two sections are reviews of published works about privacy and flow-based modeling as a preface to using them to develop a systematic conceptualization of the notions of context and privacy.

Privacy
What is personal information (PI)? According to Cavoukian and Tapscott [12], PI can be defined in many ways, including

  • any information associated with or linked to an identifiable individual (for example, personal preferences, beliefs, opinions, habits, family and friends)
  • information about an individual provided by third parties (credit reports).

While this description of personal information encompasses most meanings of PI, it conflates, or at least does not clearly distinguish, personal identifiable information (PII) and personal information that does not embed a person’s identity. Furthermore, it does not clarify how information is personal if it is about more than one identifiable person. 

In the United States, the Personal Data Privacy and Security Act, S. 1332, 109th Cong., in Sec. 2, Findings, uses the terms personal identifiable information, identity, personally identifiable information and personal information. The term personally identifiable information is defined in Sec. 3, Definitions, to mean “any information, or compilation of information, in electronic or digital form serving as a means of identification, as defined by section 1028(d)(7) of title 18, United States Code” [13].

Different types of information pertinent to this paper are shown in Figure 3. So-called personal information is a type of information that includes PII and personal non-identifiable information (NII).

Figure 3. Different types of information pertinent to this paper

Personal non-identifiable information is called “personal” because its owner (a person) has an interest in keeping it private, even though it does not embed his or her identity. This information is owned by the person, as in the expression “personal belongings,” for example, a personal collection of research papers or songs.

From a security point of view, PII is more sensitive than an equal amount of NII (“equal amount” will be discussed later). With regard to policy, PII has a more policy-oriented significance than NII (see, for example, EU Directive 95/46/EC [14]). With regard to technology, there are unique PII-related technologies such as the W3C Platform for Privacy Preferences Project (P3P) (www.w3.org/P3P/) and techniques such as k-anonymity (http://spdp.dti.unimi.it/papers/k-Anonymity.pdf) that revolve around PII. Additionally, PII possesses an objective definition (to be introduced later) that provides a means (the identities of its proprietors) for separating it from other types of information, thus facilitating organization in a manner not available to other types of information.

PII involves special relationships with proprietors (persons about whom the information communicates something) that it does not have with non-proprietors (persons who have other persons’ PII) and non-persons such as institutions, agencies and companies. For example, a person may possess PII of another person, or a company may have the PII of someone in its database; however, proprietorship of PII is reserved only for its proprietor regardless of who possesses it.

From the informational perspective, the proprietor’s PII is the “person.” According to Floridi [15], my “personal information is a constitutive part of me-hood,” while the proprietor’s NII and “others’ (for example, friends’) PII” form his or her context.

Reference as a Basis for Defining PII. To put personal identifiable information on firmer ground, we establish some principles related to such information. For us, personal identifiable information is any information that has referent(s) to uniquely identifiable persons [16, 17]. In logic (correspondence theory), reference is the relation of a word (logical name) to a thing. Every PII refers to its proprietor(s) in the sense that it leads to him/her/them as distinguishable entities in the world. This reference is based on his/her/their unique identifier(s). The relationship between persons and their own PII is called proprietorship [16, 18].

A piece of information is PII if at least one of the objects to which it refers is a singly identifiable person. Any singly identifiable person in the PII is called a proprietor of that information. The proprietor is the person about whom the PII communicates information. If exactly one object exists of this type, the PII is atomic PII; if more than one singly identifiable person exists, it is compound PII. An atomic PII is a piece of information about a singly identifiable person. A compound PII is a piece of information about several singly identifiable persons.

Any compound PII is privacy reducible to a set of atomic PII. For example, “John and Mary are in love” is privacy reducible to “John and someone are in love” and “someone and Mary are in love.”
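The reduction of compound PII to atomic PII can be sketched programmatically. The following Python fragment is only an illustration (the statement template and the placeholder “someone” are modeling assumptions): it produces one atomic piece per proprietor, keeping that proprietor identifiable and anonymizing the others.

```python
from typing import List, Tuple

# A PII statement is modeled as a template plus the proprietors it refers to.
# Example: ("{0} and {1} are in love", ["John", "Mary"]) is a compound PII
# because it refers to two singly identifiable persons.

def privacy_reduce(template: str, proprietors: List[str]) -> List[Tuple[str, str]]:
    """Reduce a compound PII statement to atomic PII, one per proprietor.

    Each atomic piece keeps one proprietor identifiable and replaces the
    others with the anonymous placeholder "someone".
    """
    atomic = []
    for keep in range(len(proprietors)):
        names = [p if i == keep else "someone" for i, p in enumerate(proprietors)]
        atomic.append((proprietors[keep], template.format(*names)))
    return atomic


if __name__ == "__main__":
    for proprietor, statement in privacy_reduce("{0} and {1} are in love",
                                                ["John", "Mary"]):
        print(f"{proprietor}: {statement}")
    # John: John and someone are in love
    # Mary: someone and Mary are in love
```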

Identifiers and PII. Consider the set of unique identifiers of persons. Ontologically, the Aristotelian entity/object is a single, specific existence (a particularity) in the world. For us, the identity of an entity is its natural descriptors such as tall, brown eyes, male or blood type A. These descriptors exist in the entity/object. Height and eye color, for example, exist as aspects of the existence of an entity. We recognize the human entity from its natural descriptors. Some descriptors form identifiers. A natural identifier is a set of natural descriptors that facilitates recognizing a person uniquely. Examples of identifiers include fingerprints, faces and DNA. No two persons have identical natural identifiers. An artificial descriptor is a descriptor mapped to a natural identifier. Attaching the number 123456 to a particular person is an example of an artificial descriptor in the sense that the number is not inherent to the (natural) person. An artificial identifier is a set of descriptors mapped to a natural identifier of a person. By implication, no two persons have identical artificial identifiers. If two persons somehow have the same Social Security number, then this Social Security number is not an artificial identifier because it is not mapped (does not refer) uniquely to a natural identifier.
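A small sketch can capture the requirement that an artificial identifier must map uniquely to a natural identifier. The registry and function names below are hypothetical, introduced only to illustrate the definition.

```python
from typing import Dict

# Hypothetical registry mapping artificial descriptors (e.g., issued numbers)
# to natural identifiers (e.g., a fingerprint or DNA profile code).
registry: Dict[str, str] = {}


def attach(descriptor: str, natural_identifier: str) -> bool:
    """Attach an artificial descriptor (such as '123456') to a person.

    The descriptor qualifies as an artificial identifier only if it maps to
    exactly one natural identifier; if two persons end up sharing the same
    Social Security number, that number no longer identifies anyone uniquely,
    so the second attachment is rejected.
    """
    if descriptor in registry and registry[descriptor] != natural_identifier:
        return False  # would break uniqueness: not a valid artificial identifier
    registry[descriptor] = natural_identifier
    return True


print(attach("123456", "fingerprint-A"))  # True: unique mapping established
print(attach("123456", "fingerprint-B"))  # False: mapping no longer unique
```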

A basic principle in this definition of PII is as follows: Identifiers of proprietors are PII. Such a definition is reasonable since the mere act of identifying a proprietor is a reference to a unique entity in the information sphere. Every unique identifier of a person is a basic PII in the sense that this identifier cannot be decomposed into more basic PII. 

The second principle defines PII in general: Any personal identifier or piece of information that embeds identifiers is personal identifiable information.

Thus, identifiers are the basic PII that cannot be decomposed into more basic PII. Furthermore, every complex PII includes in its structure at least one basic identifier. Note that the concern here is not issues of flexibility or narrowness of PII definitions, which is a matter that can be settled after developing a precise definition that encompasses all types of PII. For example, the U.S. Personal Data Privacy and Security Act (S. 1332, 2005) [13] limits PII by introducing the notion of sensitive PII.

PII and Non-PII. Consider the PII “Alice visited clinic Y.” It is PII because it represents a relationship, that of the proprietor Alice with an object, the clinic. Information about the clinic is contextual information about Alice. It may or may not be privacy-related information. For example, year of opening, number of beds and other information about the clinic are not privacy related. Thus, such information about the clinic is not related to Alice’s PII; however, when the information is that the clinic is an abortion clinic, then Alice’s PII is related to this non-identifiable information about the clinic. That is, the statements {“Alice visited clinic Y,” “Clinic Y is an abortion clinic”} give “Clinic Y is an abortion clinic” privacy-related significance. Thus contextual information may have privacy significance.
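As a toy illustration of how context acquires privacy significance, the following Python sketch (the statements, facts and sensitivity flags are assumptions made up for this example) marks a contextual fact as privacy relevant only when a PII statement links a proprietor to the object that the fact describes.

```python
from typing import Dict, List, Set

# Each PII statement links a proprietor to an object in his or her context.
pii_statements = [("Alice", "visited", "clinic Y")]

# Contextual (non-identifiable) facts about objects, with a sensitivity flag.
object_facts: Dict[str, List[Dict[str, object]]] = {
    "clinic Y": [
        {"fact": "opened in 1990", "sensitive": False},
        {"fact": "has 120 beds", "sensitive": False},
        {"fact": "is an abortion clinic", "sensitive": True},
    ]
}


def privacy_significant_context(proprietor: str) -> Set[str]:
    """Return contextual facts that acquire privacy significance because a
    PII statement links the proprietor to the object they describe."""
    significant = set()
    for subject, _, obj in pii_statements:
        if subject == proprietor:
            for entry in object_facts.get(obj, []):
                if entry["sensitive"]:
                    significant.add(f"{obj} {entry['fact']}")
    return significant


print(privacy_significant_context("Alice"))
# {'clinic Y is an abortion clinic'}
```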

Flow Systems
In this section we turn to the types of actions that can be performed on information including PII. While such operations as collecting, accessing, transmitting, using, storing or processing have been mentioned in many studies about information, a systematic framework for relating such operations in an organized manner has never been developed. 

According to Al-Fedaghi [19], information is a flowthing. A flowthing is a thing that flows; hence, it can be processed, created, released and transferred, and it can arrive and be accepted. Figure 4 is a state transition diagram of information flow showing six states of information.

Figure 4. Flowsystem, assuming that no released flowthing is returned

A flowthing model (FM) comprises

  • a flow system (flowsystem) that represents stages, and
  • things that “flow.”

The flowsystem is a state diagram that includes

  - six (and only six) states (also called stages) of information: processed, created, released, transferred, arrived and accepted, with possible substates
  - flows among these states, represented by solid arrows
  - triggers that activate a different flow, represented by dashed arrows, as will be described later.

A process in FM is any operation that does not produce new flowthings. For the sake of simplicity, when appropriate, we will merge the arrival and acceptance stages into one stage called receive.
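A minimal Python sketch of a flowsystem follows. The set of allowed flows is an assumption reconstructed from the description above and Figure 4 (not reproduced here), so the transition table should be read as illustrative rather than definitive; it also shows how arrival and acceptance could be merged into a single receive stage.

```python
from enum import Enum, auto


class Stage(Enum):
    """The six states (stages) of information in the flowthing model (FM)."""
    CREATED = auto()
    PROCESSED = auto()
    RELEASED = auto()
    TRANSFERRED = auto()
    ARRIVED = auto()
    ACCEPTED = auto()


# Assumed flows (solid arrows), with no released flowthing returned.
FLOWS = {
    Stage.CREATED: {Stage.PROCESSED, Stage.RELEASED},
    Stage.PROCESSED: {Stage.RELEASED},
    Stage.RELEASED: {Stage.TRANSFERRED},
    Stage.TRANSFERRED: {Stage.ARRIVED},
    Stage.ARRIVED: {Stage.ACCEPTED},
    Stage.ACCEPTED: {Stage.PROCESSED},
}

# Arrival and acceptance merged into a single "receive" stage, as in the text.
RECEIVE = {Stage.ARRIVED, Stage.ACCEPTED}


class Flowsystem:
    """Tracks the current stage of one flowthing inside a named flowsystem."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = None

    def move(self, stage: Stage) -> None:
        """Advance the flowthing along a solid-arrow flow; reject other moves."""
        if self.stage is not None and stage not in FLOWS[self.stage]:
            raise ValueError(f"{self.name}: no flow {self.stage.name} -> {stage.name}")
        self.stage = stage
```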

Proprietor and Context 
Since privacy, in our approach, is the informational privacy of an individual person (the proprietor, say, Bob), the (informational) context of this proprietor is as shown in Figure 5. Thus, context information (CI) is all information related to the proprietor, excluding his or her own PII. Notice that CI may include the PII of others in the context of the proprietor. In terms of flowsystems, the context in Figure 5 can be viewed as a set of information flowsystems. Suppose that the context of proprietor Bob includes one person (say, Alice) and a computer. This can be conceptualized as shown in Figure 6.

Figure 5. The context of a person (shaded area) and its types of information

Figure 6. An example of a proprietor and his information context

Flowsystems of Context
Suppose that awareness of Alice’s actions in Bob’s context is somehow an important factor in the privacy-aware context of Bob. Figure 7 expands Figure 6 to include this fact. Thus, in general, we can model the relation between a proprietor and his or her context in terms of flows and triggering between the proprietor’s flowsystems and the flowsystems of his or her context. Triggering, denoted by a dashed arrow in Figure 7, signifies activation.

This conceptualization gives designers a map of context and different streams of interactions between the context and users (proprietors). Different types of context and subcontext can be categorized in terms of different types of flowsystems and subflowsystems.

Figure 7. Alice’s actions affect Bob’s information flowsystem
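A sketch of how triggering might be wired between flowsystems is shown below; the class and the Alice/Bob wiring are illustrative assumptions that mirror Figure 7, where an action in Alice’s flowsystem activates a flow in Bob’s information flowsystem.

```python
from typing import Callable, List


class TriggeredFlowsystem:
    """A flowsystem whose creation stage can trigger (dashed arrows) activity
    in other flowsystems, in addition to its ordinary flows (solid arrows)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.triggers: List[Callable[[str], None]] = []
        self.log: List[str] = []

    def on_create(self, callback: Callable[[str], None]) -> None:
        """Register a trigger fired whenever this flowsystem creates a flowthing."""
        self.triggers.append(callback)

    def create(self, flowthing: str) -> None:
        self.log.append(f"created: {flowthing}")
        for trigger in self.triggers:  # dashed arrows: activate another flowsystem
            trigger(flowthing)


# Alice's actions flowsystem triggers Bob's information flowsystem (cf. Figure 7).
alice_actions = TriggeredFlowsystem("Alice/actions")
bob_information = TriggeredFlowsystem("Bob/information")

alice_actions.on_create(
    lambda action: bob_information.create(f"record of Alice's action: {action}"))

alice_actions.create("uses the computer in Bob's context")
print(bob_information.log)
```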

Examples
Blount et al. [20] analyzed the task of satisfying preferences for exposing context data to authorized applications and individuals. They developed a role-based, context-dependent privacy model for enterprise domains. An example of a privacy policy is as follows:

President (subject) grants (releases) information to White House staff (requester) about his location when both the president and the requester are in the White House (context).

In this case, a context-dependent policy database is constructed, as shown in Table 1.


Table 1. Sample context-dependent policy database [20]

The flow model representation for this example is shown in Figure 8. The requester creates (circle 1) a request that flows to the president. It is received (circle 2), but its processing is blocked until triggered by the White House. The White House records the arrival (receiving) of persons, and it initiates triggering only if the requester and the president are received (in the White House). The context is represented explicitly: the requester and the White House. Conceptually, the White House is an entity like the requester and the president.

Figure 8. Context of President includes requester and White House

In comparison, Table 1 is a shorthand notation that embeds ambiguity. For example, there is nothing saying that the request is directed to the president (implicitly it is assumed that the receiver is the subject). The textual description complements the table. Suppose that only the table is available to the designer. He or she can interpret the table to mean that the request goes to an employee (for example, the president’s secretary). The semantics of “context” in Table 1 are also troublesome. Who is keeping track of whether the requester and the president are in the White House? “White House” in Figure 8 explicitly expresses that an entity (for example, the security unit) is responsible for that. 

Flow conceptualization represents a complete description characterized by contiguity of the privacy policy.
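A minimal executable sketch of this context-dependent policy follows. The role names, the presence registry standing in for the White House’s record of arrivals, and the function signature are all assumptions introduced for illustration, not part of Blount et al.’s privacy engine.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Policy:
    """One row of a context-dependent policy database (cf. Table 1)."""
    subject: str       # whose information is released
    requester: str     # role allowed to request it
    information: str   # what is released
    context: str       # location both parties must share


# The presence registry stands in for the entity (e.g., the security unit)
# that records arrivals at the White House in the flow model of Figure 8.
presence: Dict[str, str] = {"president": "white_house", "staffer_1": "white_house"}

ROLES = {"staffer_1": "white_house_staff"}

policy = Policy(subject="president", requester="white_house_staff",
                information="location", context="white_house")


def process_request(requester: str, subject: str) -> str:
    """Release the subject's location only when the policy's context holds."""
    if ROLES.get(requester) != policy.requester or subject != policy.subject:
        return "denied: policy does not apply"
    if presence.get(subject) == policy.context and presence.get(requester) == policy.context:
        return f"released: {subject} is at {presence[subject]}"
    return "blocked: requester and subject are not both in the context"


print(process_request("staffer_1", "president"))
```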

As another example, consider Kjærgaard et al.’s [21] modeling of complex and interwoven sets of context information “by extending ambient calculus with new constructs and capabilities” [21, p. 1]. They give the following scenario, in which a query for information is issued and the response is adapted to the current context. The scenario is summarized in the following description in terms of the Aware-Phone application:

A … nurse needs to contact a more experienced doctor to consult him on some issue. So the Aware-Phone is queried for where the nearest doctor which is not occupied by some other work task is located. The application then returns the best suited doctor in the current context of the location and activities of the doctors on duty. [21, p.4] 

The example scenario is described using textual syntax as follows (illustrated in Figure 9):
Entity: [Awarephones[#AP1 | #AP2] ]
Person: [Doctor[#AP1] | Nurse[#AP2] ]
Status: [Busy[#AP2] | NotBusy[#AP1] ]
Location: [Hospital[ Ward1[#AP1 | #AP2] | Ward2[] ] ]

Figure 9. Illustration of the example scenario (partially from [21])

Figure 9 exhibits fragmented representation when compared with the flow-based conceptualization shown in Figure 10, where there are spheres for nurse, doctor, Ward 1, Ward 2 and AwarePhone, which is the querying system that receives information about doctors and responds to nurses’ queries.

Figure 10. Flow-based representation

The nurse has three flowsystems: location information, status information and queries. First she creates a query (circle 1) and sends it to the query flowsystem in AwarePhone. The query is processed and triggers a flow of response (circles 9 and 10) in the location and status information flowsystems, and the response is transferred to the nurse (circles 11 and 12).

The doctor’s sphere includes tasks and physical (body) flowsystems. As indicated by the arrow at circle 3, the doctor receives tasks from somewhere. Here there is ambiguity about the flow of these tasks that make the doctor busy or not busy. For example, is there a system that records a doctor’s medical sessions (e.g., start and finish times)? In Figure 10, tasks trigger (circle 4) the AwarePhone to create information about the doctor’s status. Similarly, the doctor’s physical flowsystem releases (circle 5) the (physical) doctor to Ward 1 (circle 6) and Ward 2 (circle 7). Notice that it would be logical to add movement of the doctor between the two wards and outside. Note the convenience of the Release stage in representing a situation in which the doctor has left Ward 1 for Ward 2 but is still waiting for transportation between the wards.

The presence of a doctor on a ward triggers creation of location information in AwarePhone (circle 8). Status and location information are stored and retrieved after being triggered by the query flow system (circles 9 and 10).
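To show how the AwarePhone sphere might answer the nurse’s query from the status and location information it has accumulated, here is a small Python sketch; the doctor names, the distance table and the selection rule are assumptions for illustration, since the scenario does not specify how “nearest” is computed.

```python
from typing import Dict, List, Optional

# Information created in AwarePhone's flowsystems by triggers from the
# doctors' task and physical flowsystems (circles 4-8 in Figure 10).
status: Dict[str, str] = {"dr_smith": "not_busy", "dr_jones": "busy"}
location: Dict[str, str] = {"dr_smith": "ward1", "dr_jones": "ward1"}

# Illustrative adjacency used to rank "nearest"; the real metric is not
# specified in the scenario.
DISTANCE: Dict[str, Dict[str, int]] = {
    "ward1": {"ward1": 0, "ward2": 1},
    "ward2": {"ward1": 1, "ward2": 0},
}


def nearest_available_doctor(nurse_location: str) -> Optional[str]:
    """Answer the nurse's query: the closest doctor who is not busy."""
    candidates: List[str] = [d for d, s in status.items() if s == "not_busy"]
    if not candidates:
        return None
    return min(candidates,
               key=lambda d: DISTANCE[nurse_location][location[d]])


print(nearest_available_doctor("ward2"))  # -> 'dr_smith'
```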

This flow-based representation is analogous to a story or plot in a comic book: the flow of events is continuous and without gaps, covering, for example, accomplishing tasks, going outside or moving between wards.

Conclusion 
Thanks to a number of high-profile platforms and use cases, context- and location-aware services are now familiar concepts to consumers and developers. However, understanding of the potential risks and data protection implications has lagged behind. In this article, we have worked through a series of practical examples and demonstrated that a high-level, abstracted approach to conceptualizing context-aware and privacy-aware systems is a useful component in reasoning about the sensitivity of personal information. Given the importance of data privacy in our everyday lives, we suggest that this form of analysis can serve as a useful engineering technique, supporting the development of an initial software specification, and that it would be particularly applicable at the requirements phase of building context-aware and privacy-aware systems. 

Resources Mentioned in the Article
[1] Dey, A. K., & Abowd, G. D. (1999). Towards a better understanding of context and context-awareness. Proceedings of the 1st International Symposium on Handheld and Ubiquitous Computing, Karlsruhe, Germany, Lecture Notes in Computer Science, 1707, 304-307. Retrieved November 5, 2011, from www.it.usyd.edu.au/~bob/IE/99-22.pdf

[2] Drost, C. (2004). Privacy in context aware systems. University of Twente, Enschede, Netherlands, Department of Informatics, Federal University of Espirito Santo, Vitoria, Brazil. Retrieved November 5, 2011, from http://asna.ewi.utwente.nl/assignments/completed/drost.pdf

[3] Ryan, N., Pascoe, J., & Morse, D. (1997). Enhanced reality fieldwork: The context-aware archaeological assistant. In V. Gaffney, M. van Leusen, & S. Exxon (Eds.), Computer applications in archaeology (pp. 34-45). Oxford: British Archaeological Reports, Tempus Reparatum. Retrieved November 6, 2011, from www.cs.kent.ac.uk/projects/mobicomp/Fieldwork/Papers/CAA97/ERFldwk.html

[4] Tatli, E. I. (2006). Context data model for privacy. Position paper presented at the PRIME Standardisation Workshop, IBM Zurich, July 6-7. Retrieved November 5, 2011, from www.prime-project.eu/events/standardisation-ws/positionpapers/cdmfp.pdf 

[5] Smailagic, A., Siewiorek, D. P., Anhalt, J., Kogan, D., & Wang, Y. (2002). Location sensing and privacy in a context aware computing environment. Carnegie Mellon University. Retrieved November 5, 2011, from citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.791.

[6] Al-Muhtadi, J., Campbell, R., Kapadia, A., Mickunas, D., & Yi, S. (2002). Routing through the mist: Privacy preserving communication in ubiquitous computing environments. International Conference of Distributed Computing Systems (ICDCS 2002), 65-74. Vienna, Austria, July 3. Retrieved November 5, 2011, from www.cyberdudez.com/mist.pdf 

[7] Schmidt, A., Beigl, M., & Gellersen, H. (1999). There is more to context than location. Computers and Graphics, 23(6), 893-901.

[8] Power, D. (2010). What is a situation analysis? PlanningSkills.com. Retrieved November 5, 2011, from http://planningskills.com/askdan/20.php

[9] Harwood, K., Barnett, B., & Wickens, C. (1988). Situational awareness: A conceptual and methodological framework. Psychology in the Department of Defense Eleventh Symposium (Tech. Report No. USAFA-TR-88-1), 316-320. Colorado Springs, CO: U.S. Air Force Academy (AD-A198723).

[10] Burke, K. (1945). A grammar of motives. Berkeley: University of California Press. 

[11] Context diagram. (n.d.). In Improvement Encyclopedia. Retrieved November 5, 2011, from www.syque.com/improvement/Context%20diagram.htm.

[12] Cavoukian, A., & Tapscott, D. (October 17, 2006). Privacy and the Enterprise 2.0. New Paradigm Learning Corporation. Whitepaper. Retrieved November 5, 2011, from www.ipc.on.ca/images/Resources/priv-opennetw.pdf

[13] Personal Data Privacy and Security Act of 2005, S. 1332, 109th Cong. (2005). Retrieved November 5, 2011, from www.govtrack.us/congress/billtext.xpd?bill=s109-1332.

[14] Protection of individuals with regard to the processing of personal data and on the free movement of such data. EU Directive 95/46/EC of the European Parliament and of the Council, October 24, 1995. Retrieved November 5, 2011, from http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:en:HTML.

[15] Floridi, L. (1998). Information ethics: On the philosophical foundation of computer ethics. Ethics and Information Technology, 1, 37-56. Retrieved November 6, 2011, from www.philosophyofinformation.net/publications/pdf/ieotfce.pdf 

[16] Al-Fedaghi, S. (2011). Engineering privacy revisited. Journal of Computer Science, 8(1), 107-120.

[17] Al-Fedaghi, S. (2011). Toward a unifying view of personal identifiable information. Presented at the 4th International Conference on Computers, Privacy and Data Protection, January 25-27, Brussels, Belgium.

[18] Al-Fedaghi, S., Alwaraa, N., & Hussein, M. (2011). Experimentation with the design of databases of personal identifiable information. Journal of Convergence Information Technology, 6(4), 222-239.

[19] Al-Fedaghi, S. (2008). Systems of things that flow. Proceedings of the 52nd Annual Meeting of the International Society for the Systems Sciences (ISSS 2008), University of Wisconsin, Madison, July 13-18. Retrieved November 5, 2011, from http://journals.isss.org/index.php/proceedings52nd/article/viewFile/1001/354

[20] Blount, M., Davis, J., Ebling, M., Jerome, W., Leiba, B., Liu, X., & Misra, A. (2008). Privacy engine for context-aware enterprise application services. Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, 2, 94-100. Retrieved November 5, 2011, from http://internetmessagingtechnology.org/pubs/ContextPrivacyEngine.pdf

[21] Kjærgaard, M. B., & Bunde-Pedersen, J. (2006). Towards a formal model of context awareness. Proceedings of the First International Workshop on Combining Theory and Systems Building in Pervasive Computing (Pervasive 2006), 667-674. Retrieved November 5, 2011, from www.smartlab.cis.strath.ac.uk/CTSB/Pedersen.pdf 


Sabah Al-Fedaghi is an associate professor in the computer engineering department, Kuwait University. He can be reached at sabah<at>alfedaghi.com.