Genetics, biometrics and the “informatization of the body”

Introduction

In a work devoted to biometrics, it might not seem obvious to devote much attention to genetics. Genetics is a very special case within the category of biometric technologies; some would argue that due to its unique character, it doesn’t even belong in the same category. The term “genetics” covers a broad spectrum of theories, practices, and technologies, only some of which overlap with the practices and technologies of biometrics.
On the other hand, genetics can be seen as the epitome of a broader development to which all biometric technologies contribute. This development can be characterized as the informatization of the body: a relatively new phenomenon in which the human body appears to be redefined as an entity constructed from information. It is to this phenomenon, the emergence of the body as information, that the observations about genetics in this chapter refer.

Related current developments
DNA banks: increasingly comprehensive databases

The main use of genetics as a biometric identification technology is the identification of suspects in criminal investigations. DNA-based evidence has been accepted as legal evidence in many countries, and police forces have been systematically collecting and storing DNA information from suspects and convicts.

Over the past decade, the trend has been for these databases to become increasingly comprehensive. In the Netherlands, for example, the criterion for inclusion in the national police DNA database was originally that someone be suspected of a crime carrying an eight-year prison sentence. In 2001, this threshold was lowered to four years, and proposals are currently being discussed to store cellular material and DNA from every convicted individual, and to allow suspect profiles and visible appearance characteristics to be derived from DNA traces. In the United Kingdom, the recognized “global leader” in criminal DNA databases, the inclusion criterion has shifted from those convicted of crimes, to those charged, and then, in 2004, simply to those arrested.

Internationally, discussions on increasing cooperation in the areas of security, law enforcement, and policing include the idea of exchanging DNA data. According to a research announcement by the Wellcome Trust, a number of organizations are currently involved in developing and promoting DNA databases across the entire EU. For example, the European DNA Profiling Group (EDNAP), established in 1988, aims to establish systematic procedures for sharing data throughout the European Community; the Standardization of DNA Profiling in the European Union (STADNAP) promotes cooperation across the EU by using DNA profiles to track “mobile serial offenders”; and the European Network of Forensic Science Institutes (ENFSI) has similar ambitions to standardize forensic practices in support of law enforcement across the EU. The EU itself provides funding (for STADNAP, for instance) to establish best practices that can facilitate increased data exchange among all relevant authorities.

The point is that, for the purposes of criminal investigation, such databases are held to become more effective the larger and more comprehensive they are. This is why, once collection has begun, taking material from ever-larger segments of the population becomes more attractive; the most enthusiastic supporters argue that it would make sense to sample the entire population, for example through routine DNA collection from every newborn. However, including the entire population in practice reduces the effectiveness of databases for investigative purposes: since 99% of those registered will not be involved in criminal activities anyway, the likelihood of producing a large number of false positives increases, and so does the difficulty of separating relevant matches from irrelevant ones.

In the context of increasing (international) exchange and the growing interoperability of distributed systems and databases, it is worth noting the existence of large biological sample banks in the medical field. In the course of various population screening programs, large amounts of blood, urine samples, cellular material, etc. are stored in hospitals and medical research centers around the world. Potentially, these can provide the material for DNA analysis, should this ever be deemed necessary and lawful.

From fingerprint to profile creation (profiling)

The term “DNA fingerprint,” sometimes used instead of “DNA identification,” reflects the assumption that the genetic information used in DNA identification technology does not reveal anything about the individual involved beyond an identity match. For this reason, many privacy and data protection issues were not considered relevant to DNA recognition technology: the polymorphisms used by this technology were thought not to encode specific traits or predispositions. With current technologies, however, this is no longer true. The recent “UK Police Science & Technology Strategy 2003-2008,” for example, confirms the commitment to developing the ability to “determine offender characteristics from DNA.” The British Forensic Science Service (FSS) has been investigating the possibility of predicting individuals’ physical characteristics for some time. It has created a “redhead database” that it claims identifies “84% of redheads,” and now offers police a “race prediction service” that claims to be able to infer national origin from DNA profiles, with an unknown degree of certainty. The FSS is currently investigating the recognition of a series of other phenotypic characteristics, such as facial features, height, and eye color. These ambitions are also connected to another important and rapidly expanding method of deriving characteristics from DNA, “haplotype mapping”; one example is the Y-STR database, which attempts to make an “assessment of the stratification of male populations among European and global populations.” All these technological developments involve the examination of “coding regions” of the human genome, and they raise new policy and ethical issues for those who use genetic information in criminal investigations.

From monogenetic causality to multifactorial probabilities and predispositions

There is a development in the broader field of medical genetic research that partly offsets the risks of unwanted disclosure of genetic information to third parties, risks that are often associated with the developments described above. The earlier optimism in medical genetic research about finding the genetic causes of diseases in specific monogenic deviations and defects is gradually being replaced by the more realistic expectation that such discoveries will remain exceptions. Most diseases turn out to be caused not by a single genetic defect but by a complex set of causal factors, only some of which are genetic in nature. Moreover, even where the cause is definitely genetic, multiple genes are usually involved, which may or may not be located close to each other. Apart from reducing hopes for genetic therapies, this also diminishes the value of genetic profiles as predictors of medical conditions and personal characteristics. Fears about the disclosure of genetic information to third parties, such as employers or insurance companies, were largely based on the likelihood of someone being identified as belonging to a high-risk group, with reduced chances of insurance or employment. As the newer findings suggest, however, most of us will turn out not to stand out in this way, which limits both the likelihood of and the rationale for individual discrimination.

Even though the trend is to extract increasingly more predictive information from DNA samples, the characteristics now being studied in research databases are very general and common to large segments of the population. Although this can help criminal investigations to exclude specific suspects, the chances of producing significant predictive medical information most likely remain limited.

The context of broader technological developments

The increased collection and storage of genetic information in electronic form in searchable databases is part of a broader development. Today we see a proliferation of technologies oriented towards the production, collection, processing and analysis of digital body data. In various sectors of society, technologies are used for a range of purposes that involve connecting human bodies to machines by converting certain physical characteristics into digital data. The set of technologies studied in the BITE project, collectively referred to as biometric identification technologies, clearly falls into this category. Similarly, with machine readable travel documents (MRTDs), the “digital travel documents” with which biometric data are closely associated, we see biometrics being involved in the co-construction of digital, machine readable bodies.

In general, biometric technology involves using a sensor to capture digital representations of physiological characteristics that are unique to an individual, such as fingerprints, iris patterns, retinal patterns, vein patterns (e.g. in the hand), facial characteristics, hand shape, or voice patterns; it may also include behavioral patterns such as typing or signing. This digital representation of the biometric data is usually converted by an algorithm to produce what is called a “template”. This algorithmic transformation is said to be non-reversible, which means that the biometric data itself cannot be derived from the template. The templates are stored in a central database, which is accessed when, on subsequent occasions, the finger, hand, face, eye or voice is presented to the system. After this second biometric image has undergone the same algorithmic transformation, the comparison can be performed. If the templates match, the individual presenting is “recognized” and approved by the system. It is also possible for the templates to be stored not centrally but on a chip, for example in a passport. The user must then present both the chip and the body part to “prove” that he or she is the legitimate user of the passport.

In addition, the field of healthcare is obviously heavily involved in what we refer to as “the informatization of the body”.
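
To make the template procedure described above more concrete, the following is a minimal sketch of enrollment and verification. It does not reproduce any real biometric system: the function names, the random-projection transform, the 64-bit template size and the distance threshold are all assumptions chosen for illustration, and the transform is only one simple way of producing a template from which the original feature values cannot be read off directly.

```python
import random


def make_template(features, n_bits=64, seed=7):
    """Turn a feature vector into a bit template (hypothetical transform).

    Each bit records only the sign of a projection onto a fixed random
    direction, so the original feature values are not stored.
    """
    rng = random.Random(seed)  # fixed seed: same directions at enrollment and verification
    bits = []
    for _ in range(n_bits):
        direction = [rng.gauss(0.0, 1.0) for _ in features]
        dot = sum(d * f for d, f in zip(direction, features))
        bits.append(1 if dot >= 0 else 0)
    return bits


def hamming_distance(a, b):
    """Count the positions at which two templates differ."""
    return sum(x != y for x, y in zip(a, b))


def verify(stored_template, presented_features, max_distance=10):
    """Transform the newly presented sample and compare it with the stored template.

    max_distance is an assumed tolerance; real systems tune such thresholds.
    """
    candidate = make_template(presented_features, n_bits=len(stored_template))
    return hamming_distance(stored_template, candidate) <= max_distance


# Enrollment: an assumed, made-up feature vector stands in for a sensor reading.
enrolled_sample = [0.8, -0.3, 0.5, 1.2, -0.9, 0.1, 0.4, -0.6]
stored = make_template(enrolled_sample)

# Verification: the same transform is applied to whatever is presented later.
same_person = verify(stored, enrolled_sample)
different_person = verify(stored, [-x for x in enrolled_sample])
```

The comparison is a tolerance check rather than an exact equality test because two captures of the same body part are never identical; how large that tolerance should be is a design decision that trades false acceptances against false rejections.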

Many of the developments in the late 20th century, both in medical science and in the organization of healthcare delivery, are largely due to applications of information technology. From intake, through diagnosis, to the delivery of treatment and medications, and on to payment and reimbursement, the patient’s journey through the healthcare system today is mediated from start to finish by information technology. The complexities of advanced forms of cancer treatment, with their intricate protocols for administering precise doses of drugs and radiation, for example, would be impossible without at least partly redefining the process in terms of strategic information gathering and data management. And even at the level of lower-technology medical care, patient data are recorded in electronic files, collected in national registries, and transmitted among physicians, pharmacists, insurers, government institutions, etc.—or at least that is the intention. The result is an incredible volume of detailed and specific electronic data concerning the (psycho-)physical and embedded social existences of individuals across numerous databases.

In discussing genetics and its potential for identification practices, it is also useful to remember the existence of banks of human tissue, blood, cellular material, skin, gametes, and embryos. These may not count as electronic body data banks as such, but potentially, in the absence of clear regulation or legislation, a court decision may suffice to convert any of these biological samples into DNA profiles.
At first glance, these highly diverse technological practices may seem to have little in common. However, the reason we bring them together is to highlight a similarity that we consider to be of high cultural and ethical significance. Each of these, in one way or another, involves the transformation of specific expressions of physical existence into electronic data and digitally processable information; in short, each of them is involved in what we call the “informatization of the body.” To explain what we mean by this and why we believe it is important, a brief philosophical detour is necessary.

A classic dichotomy

What follows is a list of (in)famous modern dichotomies or dualisms: they originate from a philosophical worldview in which everything exists in pairs, and these two are also opposites of each other; more precisely, each pair signifies two fundamentally distinct “worlds”:

  • reality ↔ language
  • signifier ↔ signified
  • material ↔ immaterial
  • biological ↔ social
  • “the body itself” ↔ “personal data”
  • anatomy ↔ data entry / data search
  • inside / outside ↔ public / private
  • integrity ↔ privacy

The first four pairs are quite general and abstract; the next four derive from the first four and are more specific to our case. The basic duality under discussion here is that of the human body on one hand, and the personal data/information relating to that body on the other. This fundamental dualism corresponds to the distinction between reality and language, “the thing itself” and its representation. While the body itself is considered a material thing, the information about it is not; and while the body is considered to be a natural entity that pre-exists the social or the cultural, it is only the way we speak, write or otherwise express this body that is considered a socio-cultural issue.
The last three are not classical dichotomies; they relate to the body/data distinction in specific ways. The anatomy/inscription pair (anatomy versus data entry and data search in the list above) refers to the elementary or defining technological practices relating to the “body” and to “personal data” respectively; the inside/outside versus public/private pair refers to the crucial boundary that defines each of the two objects; and finally the integrity/privacy pair refers to the fundamental value involved in maintaining these boundaries.
It is this habit of dividing everything into two, and declaring that the two halves belong to opposite “worlds,” that is at issue here. We propose that this distinction may prevent us from adequately recognizing how substantial the changes in the relationship between bodies and contemporary technological practices are. The translation of so many expressions of the body into digital data, codes and information undermines the neat distinction between the body, as something belonging exclusively to material reality, and the digital data derived from that body, as merely “representations” belonging to a completely separate realm. We propose that the developments discussed here actually affect what we consider a body to be.

Genetics: the notable example

Thus, through the cumulative results of a broad spectrum of technologies, sciences and practices of everyday life, a change in how we understand our own bodies is gradually taking place. From developments in the basic medical sciences of the twentieth century (endocrinology, immunology, reproductive and genetic science), to the data processing practices of today’s medical diagnosis, visualization, therapy, and recording techniques, to biometric identification and verification procedures, we increasingly find our bodies being defined in terms of “information.” Moreover, this “information” is of a type that can be processed as digital data. An analysis of the core of our physical existence yields an electronically generated genetic profile, while in interacting with our environment, moving past and touching sensors, we leave traces that serve as computer input. We suggest that this should be considered something deeper than just another example of “personal information” collection. Rather, the human body is engaged in a process of co-evolution with technology, and with information technologies in particular. Within this co-evolution, the ensemble of technologies, sciences and practices of genetics constitutes the most striking example, the epitome of the new “body as information.” At the level of both scientific concepts and practice there is a strong convergence between genetics and informatics. With its emphasis on basic concepts such as “information,” “(de-)coding,” and ultimately “(re-)programming” and “(re-)combination,” one could argue that genetics has become a form of information science. Combined with the popular understanding of genes and DNA as the core, the fundamental essence of our existence and identity (“we are our genes”), this shows how the genetic body constitutes the most intense example of the informatization of the body.
The question now is how to maintain the distinction between “the body itself” and “information” about this body, if the body itself, in some way, now consists of information. Consider the sequence that runs from biological samples through isolated DNA, DNA databases, STR profiles and full genetic profiles, and from polymorphisms (considered today to be) medically non-coding to (what are known today as) “health-related regions”: where exactly is the transition from bodily matter to bodily data? Does it really make sense to presuppose a clear distinction?

Ethical implications
Privacy or integrity?

Issues such as this are not merely academic philosophical questions, but have practical significance. They are partly comparable to legal discussions regarding the status of prostheses, organ (donations), gametes or blood, which arise from previous forms of technological innovations relating to body management; and here too, questions arise about the way the boundaries of the body are determined.

In the case of the “body as information,” the problem is that the concepts, practices, techniques, and institutions we have for protecting bodies are very different from those that protect information, however “personal,” from unwarranted access and intrusion. While in the first case the integrity of the body itself is at stake, in the second the concept of “informational privacy” applies, which carries less moral weight. But this “division of labor” presupposes that it is self-evident what belongs “to the body itself” and where information about that body begins. It becomes problematic here precisely because that crucial distinction has lost its self-evidence. How can we ensure the integrity of bodies when bodies are considered “information”?
This problem becomes increasingly acute due to the capabilities and specificities of digital data processing. The digital representation of bodies allows forms of processing, searching, and examining the human digital body in a way that resembles “body scanning.” Beyond privacy issues, the integrity of the individual, of the body itself, is at stake here. Legal and ethical measures and protections must therefore be adjusted accordingly to address “body scanning” and issues of physical integrity.

This issue is particularly related to a strange aspect of this new body: that it can be accessed remotely. The digitized body can be transferred to places that are very distant, in both time and space, from the individual to whom the body belongs. Databases can provide remote access via the network, and they are designed to store information and allow its retrieval after extended periods of time. Until now, a physical examination or check-up required the presence of the person involved, a condition so self-evident that questioning it would have seemed quite ridiculous. Moreover, this condition provided the basis for consent to any physical examination, at least as a practical possibility. Today, however, these matters are no longer so obvious.

Take again the example of forensic DNA analysis. Lawyers and legal scholars have pointed out the seriousness of the violation of bodily integrity involved in taking DNA samples from suspects, and very strict legal rules have been created to protect the rights of suspects and convicts. But of course, taking saliva from the mouth or a few strands of hair can hardly pose a risk to bodily integrity. What is concerning is not the taking of the sample per se, but the information about the body that is thus collected, and all the analyses, processing, and knowledge about the individual that this information makes possible. Furthermore, storing this information allows suspects’ bodies to be investigated indefinitely. With new analytical techniques constantly emerging, it will be very tempting to reopen old, closed cases and reanalyze the data. Under current legislation, such a search simply counts as a search of sensitive data, whereas perhaps we need to recognize that it actually constitutes a (new type of) “bodily inspection.”

In the medical context, it is also easy to imagine how a physical examination of someone can be conducted by a “third party” located elsewhere, with remote access to digital diagnostic images and data, and without the patient’s knowledge. Again, under existing legislation this is considered the sharing of (confidential) data among professionals, whereas it might better be regarded as a virtual “physical examination” of the patient’s body.

Identity and social categorization

Stored, retrievable and searchable from many different locations, simultaneously or over extended periods of time, this “body data” can become part of information processing practices in ways that were not possible in the past, or give rise to entirely new practices. The extensive capabilities for new forms of knowledge production, policy making and implementation, targeting, and the development of “prevention strategies” are welcome, but they will also create new forms of surveillance that may not be so benign.

Biometrically identified bodies at the airport are automatically evaluated as known or unknown, legal or illegal, desired or undesired, low or high risk, with very specific consequences for the future of the people concerned. Similarly, the body defined in terms of its genetic profile, nicotine or drug intake, medical history, etc., becomes a body that is evaluated as normal or abnormal, as healthy or pathological, as low or high risk. Specific profiles can be compiled from large amounts of data, and social identities can be assigned to individuals behind their backs, depending on whether or not they fit into a category. With increasing connectivity, the cross-referencing of databases, and the sharing of information between organizations and institutions, whether in the public or private sector, these assigned identities can become like a person’s shadow: difficult to fight, impossible to get rid of.

Thus, the body’s data shapes identity and transforms its performance. Our “machine-readable” bodies reveal “who we are” in ways beyond our control, and possibly contrary to our interests and desires. In criminology, as well as in border and migration control, identity is determined by bodies, in ways that bypass what people themselves might express. You may claim you are the daughter of that woman from Sierra Leone, but your genetic profile says otherwise; you may say you are 14 years old, but X-ray machines tell us you are lying; you may want to convince us that you are healthy, low-risk, but our data shows otherwise. In all these examples, the outcomes can be reversed: you may ultimately prove your innocence, your right to enter the country or to work.

The “machine-readable” bodies are considered to be more honest than the speakers themselves, who, in the process of interrogation, are defined as “suspects.” These uses of bodily data may reintroduce forms of determinism, with the possibility that opportunities and rights in life depend on them.

Genetics, biometrics and the informatization of the body
Irma van der Ploeg, Ann Ist Super Sanita, 2007
translation W.