This article is part of the Global Perspectives, Media and Communication special issue on “Media, Migration, and Nationalism,” guest-edited by Koen Leurs and Tomohisa Hirata. In line with the focus of this issue, we are interested in the ways in which open data is progressively used to construct indicators of the state’s performance in the form of race-ethnic categories. These data initiatives are typical for the ongoing quantification and datafication of society. Through APIs (application programming interfaces), both governmental bodies and third parties are given direct access to data, as well as the ways in which these data are structured. This infrastructure affords the appropriation of statistics concerning the national origins of Dutch citizens for new purposes. Through this data, race-ethnic categories are repurposed to measure the living conditions in the Netherlands, effectively keeping people with non-Dutch roots in the migrant category for up to three generations. To show how this process unfolds in the Netherlands, we investigate two web applications, the Allochtonenmeter and the Leefbaarometer, that make use of race-ethnically constructed data. We will argue that for a more complete understanding of the processes at play in the Dutch “data assemblage,” we need to enrich critical data studies with a postcolonial perspective. In this article, we consider race to be a verb rather than a noun, signifying a process or an action, as this takes away the necessity to communicate a nonessentialist perspective on what is raced, since the object of racing can be different in each new location, situation, and technical context. Our focus is therefore on how human characteristics such as nationality, ethnicity, or class are raced through data-driven processes and in relation to a particular history and culture in the Dutch context. 
In this light, we find that datafied systems do not merely report on particular groups in society but rather actively produce hierarchical distinctions between these groups.
On January 25, 2007, the Dutch right-wing website Geenstijl published the Allochtoon-o-meter (see figure 1), a web application that provided “absolute figures on the number of ethnically Dutch and ethnic minority residents within a -future- postcode” (Prof. Hoxha 2007, authors’ translation). The title of the app refers to the stigmatizing term allochtoon, which is governmentally defined and has been part of both the official and the popular immigration and integration discourse since 1971 (see Verweij-Jonker 1971). The term signifies Dutch people with non-Dutch ancestry (we will discuss this term in more detail in section 2). Despite a long-standing cultural denial, not only has this category been used to convey categories of ethnicity and nationality, but it also carries particular racialized connotations (Essed and Nimako 2006; Wekker 2016, 7; Yanow and van der Haar 2013; Yanow, van der Haar, and Völke 2016). The Allochtoon-o-meter was created in response to public outrage about the practices of a realtor in the city of Almere. After a realtor published the number of ethnic minority individuals in certain neighborhoods of the city of Almere on his website, the Dutch National Society of Realtors (Nederlandse Vereniging van Makelaars) took it upon themselves to name and shame him (NRC 2009). Opposing the “political correctness” of such an action, Geenstijl responded by making an easy-to-use application to provide the same types of figures at the postcode level across the Netherlands, arguing that they were providing a public service (Prof. Hoxha 2007). After all, they argued, they were merely making existing governmental open data more accessible to the public. Although this app is no longer featured on Geenstijl, a similar commercial and nongovernmental website featuring the slightly changed name Allochtonenmeter continues the practice with nearly identical functionalities.
One year after the launch of the Allochtoon-o-meter, Ella Vogelaar, the Dutch minister of Public Housing, Spatial Planning and Environmental Management, launched a similar, state-run web app called the Livability Barometer (in Dutch, “Leefbaarometer,” hereafter LB; see figure 2), which was envisioned as the main instrument to map the livability of Dutch districts and neighborhoods (Rijksoverheid 2008). The application pursues four main goals. First, it aims to facilitate early detection of possible livability problems. Second, it aims to accurately monitor how livability develops in areas that are perceived to be problematic. Third, it aims to provide a first diagnosis of developments in problem areas and areas that run the risk of “getting out of hand.” Finally, it aims to provide policy evaluations, effect measurements, and “in-depth” research for policymakers in local governments. For these purposes, the official, governmentally contracted LB makes use of the contested category of “allochtoon” and related categories in its calculations of livability. Both the Allochtonenmeter and the LB use the open data platform of the official statistics bureau of the Dutch government, Statistics Netherlands (Centraal Bureau voor de Statistiek, hereafter CBS); both make use of the same racialized classificatory system and the same open data infrastructure.
In this article, we address the workings of this infrastructure through a postcolonial data studies lens. This as-yet small field of study combines a critical data studies approach with a postcolonial perspective. From our critical data studies approach, we understand the governmental data infrastructure combined with the connected systems as a “data assemblage” (Kitchin and Lauriault 2014). We enrich this perspective with Harding’s (1991) standpoint theory, as we elaborate on the obfuscated perspective embedded in these systems. A postcolonial studies perspective allows us to expose and challenge these normative and hegemonic assumptions. Postcolonialism denotes “something that one does: it can describe a way of thinking, a mode of perception, a line of inquiry, an aesthetic practice, a method of investigation” (McLeod 2010, 6). Coloniality in this sense is not merely tied to a particular time period or spatial territory but should be understood as a psychological, political, economic, social, epistemic, or ontological condition. Postcoloniality, then, is both an affirmation of the ongoing power of historical hierarchies and power relations and an attempt to move beyond these forces, independent of the modes and spheres in which they work. In particular, we focus on the indicators used to depict population characteristics. The constructed indicators reflect a transformation of citizens into data indicators to facilitate the state’s performance. Through datafication and categorization, citizens become objects of politics. These politics produce a population that, while having been part of Dutch society for over fifty years, continues to be seen, analyzed, and discussed in terms related to migration and “integration,” since their identity is continuously constructed in relation to the country of birth of parents and sometimes even grandparents (Boersma and Schinkel 2015, 1049). 
As we will argue in this paper, it is partly the performative work of the data systems under scrutiny that aids the construction and production of the identities of groups of Dutch citizens as migrants.
These data initiatives are typical for what has been theorized as the “datafied society” (Es and Schäfer 2017). Through APIs (application programming interfaces), both governmental bodies and third parties are given direct access to data. Open data indicators can then be freely used, modified, and shared by anyone and for any purpose, commercial or otherwise (Open Definition 2019). Open government data (OGD), then, is a system or set of technologies that makes it possible for third parties to inspect, use, and reuse data collected and processed with public money (Kitchin 2014a). This is done for reasons of governmental transparency and openness, both traditionally considered essential elements of “good governance” (Graham et al. 2013), and to stimulate the creation of “new services based on the Open Data” (Huijboom and den Broek 2011, 5). CBS’s open data platform can be seen as a development in which governmental actors embrace new technologies that change the traditional ways of reporting on their doings (Lathrop and Ruma 2010; Wohlers and Bernier 2015). Annual reports that communicate official statistics and demographics data are progressively being transformed into digital form. As Ruppert (2015, p. 129) notes, “each seeks to ‘liberate data’ from the confines of administrative offices and make them publicly available along with applications and tools for imagined publics to do their own analyses.” Providing citizens access to OGD is intended as an opportunity for citizens to critically engage with published information and to provide opportunities for holding the government accountable (Huijboom and den Broek 2011, 4). Simultaneously, data-driven instruments embed the dominant state ideology and hegemonic norms concerning the production of knowledge about the discursive and institutional “Other.” As such, these systems do not only report on particular “ethnic” groups in society but also actively produce them. 
While an argument could be made for the emancipatory potential of open data in the form of aiding the allocation of affirmative action or other humanitarian initiatives, the situation in the Netherlands happens to be particularly complex in this regard. Due to the overall belief that the Netherlands is a particularly equal society, a topic we will touch upon further in section 3, the affirmative potential of race-ethnic categorization is not realized enough (or at all), especially since the Dutch government ended “ethnicity-specific policy” (RMO 2012, 10).
In the governmental data system, the allochtoon functions as an aggregate, and since aggregates are produced in an institutionalized setting, they become “real”: “It is their institutionalization that makes them real” (Rottenburg and Merry 2015, 15). This “making up people” (Hacking 1986) and contemporary “social sorting” (Lyon 2005) are understood here not merely as an epistemological issue but also as an ontological practice, producing what sociologists Michael Omi and Howard Winant call “racial formations.” This concept contests both the essentialist view of race as something objective, biological, and concrete and the social constructivist view of race as an “illusion” or “a very harmful fiction” born in social relations and discourse (Hall et al. 1978; M’charek 2000; Omi and Winant 2014, 68). Through the idea of racial formation, race can be understood as “an unstable and ‘decentered’ complex of social meanings constantly being transformed by political struggle” (Omi and Winant 2014, 68). Although Omi and Winant developed their theory of racial formation in order to analyze race relations in the United States from a historical point of view, it has been shown to be relevant in the analysis of race relations in the German, French, and Dutch contexts as well (El-Tayeb 2011; Stoler 2008, 2016). We consider the top-down approach used to determine people’s classifications and the subsequent ascription of characteristics to these classifications to be a form of nonvoluntary racial formation. This process of “the extension of racial meaning to a previously racially unclassified relationship, social practice or group” is what Omi and Winant call “racialization” (2014, p. 142).
Racialization has become a widely used term to designate the process through which race is made. However, we would like to point at the construction of this word briefly, as it is a compound of the base “race” with the two suffixes -ize and -ation. According to Merriam-Webster.com, the suffix -ation is commonly used to designate an action or process; the suffix -ize can have a variety of quite similar meanings that can be summed up as (1) “to become,” (2) “to become like,” or (3) “to be subject to an action.” Whichever of these meanings is picked in the case of “racialization,” we feel that the “-ize” and the “-ation” are both superfluous, since “race” can function as verb, as a process or action, on its own (see Powell 1997). To use race as a verb also takes away the necessity to communicate a nonessentialist perspective on what is raced, since the object of racing can be different in each new location, situation, and technical context. The object of investigation in this paper is therefore not the essence of race but, rather, how human characteristics such as nationality, ethnicity, or class are raced through data-driven processes and in relation to a particular history and culture. In short, this is not an investigation of race and/as/after technology (Chun 2009; Coleman 2009; Benjamin 2016), but an investigation of racing and the way this process moves and is performed in/through/with technologies of datafication; it is not about technologically mediated racial being but about a datafied process of racing. In this light, the “racing through” in the title of this article should not be read as a sign of how this article will go through the Dutch data infrastructure quickly. Rather, it signifies the idea of following not an object but a process: how racing is constituted through this sociotechnical infrastructure.
In the following, we will first scrutinize the population indicators, their definitions, and their origins. Then we will trace their movement from CBS to the LB and Allochtonenmeter. Finally, we will analyze how these indicators are combined with other statistics and how they are raced through this process. This will enable us to perceive the Allochtonenmeter, LB, and their connection with OGD in a dual manner, scrutinizing the technical apparatus while simultaneously contextualizing it within a broader frame of governmentality.
1. Doing Postcolonial Data Studies
With the rise of e-governmental practices, open data has progressively been used to construct indicators of the state’s performance (Porter 1996; Ruppert 2015). Within these practices, indicators are used to index societal phenomena so that they can be assessed, compared, and ranked. As a consequence, open data has contributed to a new modus operandi in which governmental actors provide data to, and simultaneously interact with, citizens (Lathrop and Ruma 2010; Wohlers and Bernier 2015). We understand open government data as a sociotechnological ensemble of networks that creates a new form of social capital consisting of knowledge, in this case concerning the livability of districts within the Netherlands, based on specified indicators. Bruno Latour’s (2005) actor-network theory (ANT) partly derives from this notion of networks as sociotechnological ensembles. ANT focuses on the interrelated material and semiotic elements of objects/subjects and actors/actants (Latour 2005). Within the vocabulary of Latour, data systems cannot be interpreted as neutral intermediaries that collect, calculate, and visualize statistical data without steering and influencing the transference of knowledge embedded in these processes. We interpret OGD as a mediator, translating certain norms and values of the state and presenting these through quantified indicators and data categories.
Quantitative reasoning is commonly regarded as a nostrum for societal issues. Critics of quantification in the humanities and the social and natural sciences have often argued that blindly trusting numbers neglects the deeper, contextual understanding of those numbers (Porter 1996). The appeal of these numbers lies in the objectification of society, as they are regarded as “factual” representations. Government officials use these numbers to provide clarity and transparency within policy, as a decision backed by numbers has an appearance of “fairness” and “neutrality.” Historian of science Theodore Porter notes that “quantification is a way of making decisions without seeming to decide. Objectivity lends authority to officials who have very little of their own” (Porter 1996, 8). It is because of this nonneutral position of the data system, and of quantification at large, that feminist critiques of objectivity become a necessity in the analysis of computer-mediated inequalities (see Draude, Klumbyte, and Treusch 2018). Harding’s (2015) “standpoint theory” suggests that actors and agents involved in knowledge production practices should be attentive to the power relations in which knowledge is always implicated: what perspective is used, who benefits from this perspective, and, particularly, who does not. From this “standpoint,” a certain normative perception is made visible, and what is left “unseen” or “excluded” is acknowledged (Harding 2015). Additionally, Harding’s concept of “strong/weak objectivity” contributes to this notion by depicting what is regarded as objective and by arguing that we should not try to construct a “neutral” objectivity but rather a kind of partial objectivity that acknowledges its perspective and is open about the benefits it produces for particular groups (while possibly excluding others). We will show the situatedness and the partial perspective embedded in open government data in the Netherlands and how this affects the process of racing.
Critical data studies offers means to scrutinize underlying sociotechnical, cultural, and political implications by perceiving data systems not as individual machines but as data assemblages embedded within a particular context (Dalton and Thatcher 2014; Dalton, Taylor, and Thatcher 2016; Iliadis and Russo 2016). Kitchin and Lauriault (2014, p. 2) propose that critical data studies should study these “data assemblages,” that is, “the technological, political, social and economic apparatuses and elements” which constitute and frame “the generation, circulation and deployment of data.” They argue that capturing and storing (personal) data within vast repositories and databases cannot be perceived as a neutral means of processing and assembling data (Kitchin and Lauriault 2014, 3–4). These systems are regarded as complex omnipresent sociotechnical regimes, which are entangled within a heterogeneous institutional spectrum of corporations, institutions, and researchers (Ruppert, Isin, and Bigo 2017). We understand data assemblages to be an important constituting element in contemporary discursive formations concerning immigration and integration. Data assemblages, then, are subjected to vast sociotechnical systems, which are “grounded in engineering and industrial practices, technological artefacts, political programs, and institutional ideologies which act together to govern technological development” (Kitchin and Lauriault 2014, 4). Kitchin and Lauriault elaborate on the apparatus of these data assemblages as follows: “the apparatus and elements of a data assemblage may include systems of thought, forms of knowledge, finance, political economy, governmentalities and legalities, materialities and infrastructures, practices, organizations and institutions, subjectivities and communities, places, and the marketplace where data are constituted” (3–4). 
It is this notion of the data assemblage as system of thought, form of knowledge, and infrastructure concerning migration where a postcolonial perspective can provide us with a more historically and discursively aware understanding of how power and hegemony work through governmental data systems.
In contemporary datafied governance, it is in the data system that the datafied, and therefore discursive, “Other” is produced. This could be understood as a technologically mediated “Orientalism.” Edward Said (1979) explained this concept as the discursive fabrication of the colonized and the colony. In his influential work of the same title, Said (1979) set out to explain how Orientalism manifests itself in science, cultural representations, and politics. Produced by cultural representations and legitimated by science, this process of Othering informs internal colonial politics as global processes that produce inequality (Said 1993). One of the main reasons for this process of Othering is the fact that the production of culture and knowledge is predominantly performed through a white, Western perspective. In this process, the Other does not have a voice and is therefore not able to be heard. This subject position has been theorized as subaltern by the literary scholar and feminist critic Gayatri Chakravorty Spivak (1988). In her famous essay titled “Can the Subaltern Speak?,” Spivak uses the term “epistemic violence” to explain how people, through their position as minorities, women, and/or racially charged subjects, are denied access to cultural production and are subsequently denied a voice. It should be said that for Spivak, subaltern does not simply mean “minority,” since being part of a minority does not necessarily mean that one does not have a voice. Rather, the subaltern should be seen as a theoretical position that can be left as soon as people are able to communicate and speak for themselves. Part of these processes of Othering and epistemic violence are the ways in which a diverse set of stereotypes manifest themselves.
From a psychoanalytical perspective, literary theorist Homi Bhabha (2010, p. 110) explains that “[l]ike the mirror phase ‘the fullness’ of the stereotype—its image as identity—is always threatened by ‘lack,’” meaning that both the colonizer and the colonized envision the other in terms of difference. The discursive construction of the “other” in the form of a diverse set of stereotypes can be seen not only in popular media and literary representations. Recent postcolonial studies and critical race studies work on technologies shows how these stereotypes and Orientalist epistemologies persist in the digital domain (see Nakamura 2008, 2013; Sharma 2013; Daniels 2013; Browne 2015; Noble 2018; Eubanks 2018; Risam 2018). In addition, the colonial attitude of companies extracting data and selling their services is addressed in business and media literature (Zuboff 2015; Couldry and Mejias 2019a, 2019b). Postcolonial data studies work remains rather scarce at the time of writing; it mainly criticizes the Western bias of critical data studies and focuses instead on the Global South (Arora 2016, 2019; Milan and Treré 2019; Segura and Waisbord 2019; Ricaurte 2019). Other critical data studies with a postcolonial sensitivity can be found in the field of digital migration studies, mainly focused on the digital interactions between (forced) migrants, and between these migrants and the countries that monitor them (often with the intent of keeping them out), mostly in Europe and the United States (Dijstelbloem and Meijer 2011; Ploeg and Sprenkels 2011; Fassin 2011; Leurs and Shepherd 2017; Leurs and Smets 2018; Leurs and Ponzanesi 2018; Sánchez-Querubín and Rogers 2018). Little work has been done on how these systems scrutinize people who are already legal citizens of a particular country. Therefore, our intervention lies in the focus on the persisting colonial situation in the production of Dutch and non-Dutch identities and how these are mediated, constituted, and raced by the network of automated, datafied policy systems of the Dutch government.
2. Race-Ethnicity in the Dutch Governmental Data Ontology
We start our investigation of racing in the Dutch governmental data assemblage with the building blocks of epistemological systems: the categories and definitions used and the technical infrastructure they are part of. The system of definitions and categories, in programming often referred to as an “ontology,” is governmentally defined. In the context of datafication, ontology is understood as “a social construction of reality, defined in the context of a specific epistemic culture as sets of norms, symbols, human interactions, and processes that collectively facilitate the transformation of data into knowledge” (Kuiler 2014, 312).
The specific epistemic culture we are dealing with is the immigration discourse of the Netherlands. One of the reasons for the difficulty in discussing the process of racing in this context is the fact that instead of “race,” the “neutral” characteristic “nationality” is used to determine the “ethnic origins” of Dutch citizens. Though not unlike the nativist sentiment currently spreading across Europe and the United States, due to some particularities of the Dutch language and their conception of foreignness, the Allochtonenmeter and LB as presented here could exist only in Flanders or the Netherlands. The Van Dale dictionary translates the term allochtoon into English as “immigrant, foreigner” and the term autochtoon as “autochthonous, indigenous, native” (Van Dale 2019). This simple native/foreigner dichotomy is opposed by the same publisher’s Dutch definition, which defines it as “someone coming from somewhere else,” adding that according to CBS, an allochtoon is “someone who has at least one parent born in a foreign country” (CBS 2019b, authors’ translation). Its etymology can be traced back to the Greek word chtõn, meaning “earth” or “soil.” Combining this root word with the prefixes auto and allo, meaning “same” and “different,” respectively, creates the words autochthon and allochthon, referring to people originating from the same and different ground or soil (Yanow and van der Haar 2013, 237). Cultural anthropologists Gloria Wekker and Helma Lutz (2001, p. 4) set out a more practical translation of these terms: autochtoon as “Dutch” and allochtoon as “ethnic minority.” Furthermore, the intentionally broad allochtoon is a means by which many Dutch scholars and policymakers can avoid discussing race, seen either as irrelevant or taboo after the atrocities of World War II, or racism, which is seen as nonexistent (Essed and Nimako 2006; Özdil 2014; Essed and Hoving 2014; Weiner 2015). 
Through everyday use and in the popular discourse, the allochtoon/autochtoon dichotomy has become “a racial discourse carried on implicitly in a setting in which the use of the term ‘race’ may be verboten, but where ‘everyone’ knows, and understands, tacitly, the unspoken text” (Yanow and van der Haar 2013, 250).
Gloria Wekker theorized this Dutch condition of racial denial as “white innocence” (Wekker 2016). Although the term allochtoon was officially abandoned by the government after a critical report by the Scientific Council for Government Policy pointed at the negative connotations of the word (see Bovens et al. 2016), the replacement term “people with a migration background” (migratieachtergrond) is still defined in the exact same way (CBS 2019b), based on what CBS calls “ethnicity” but what technically is the nationality of the parents of Dutch citizens. In our understanding of this raced discourse on ethnicity and nationality, we follow Dvora Yanow’s suggestion to consider the resulting categories race-ethnic (Yanow 2003). One of the ways in which the flaws of this epistemology become clear is in the classification of people from areas in the world where borders are contested. Dutch race-ethnic categorization classifies Kurdish Turks as Turkish and people from Palestine as Israeli (see CBS 2017b). People are therefore possibly classified as belonging to the very nation from whose violence they might have fled. Another pitfall is the fact that over time, “neutral” nationalities become racialized through the assignment of different values to different nationalities.
The classification system is closely linked with the technological infrastructure that affords it. Since 1980 the Dutch government has been performing so-called “virtual” censuses, which, rather than counting people themselves, count people “administratively” (van Schie 2018, 77). This means that from that moment onward, data and information were prioritized over the bodies of the people to which the data referred, which had at least two consequences relevant for this investigation. First, by no longer counting all people present but only the people registered, the Dutch government made policy-making addressing marginalized people much harder, since particular groups no longer showed up in the numbers. The possibility for people’s presence to be “illegal” was created. Second, CBS was forced to abolish the passport-based migration background data ontology. Instead, they adopted a new data ontology based on the place of birth of the parents of Dutch citizens. These changes coincided with a recent wave of naturalized immigrants, which meant that monitoring citizens based on their passport alone no longer yielded enough information. Large portions of people with a migration background, previously recognizable by their passport, became officially Dutch and therefore invisible. In short, many people who had been considered migrants in discourse were now technically Dutch, making them difficult to differentiate in statistics and policy. To counter this, the government installed a new system of classification, based on the place of birth of the parents of a Dutch citizen rather than their own place of birth (van Schie 2018, 79–80). In this system, the birthplace of a person’s mother takes precedence, but if the mother was born in the Netherlands, the birthplace of the father counts (see figure 3). This means that a person with even one non-Dutch-born parent is not considered Dutch. 
To put this more clearly, in the Netherlands so-called “autochthonous” parents can have “allochthonous” children. However, allochthonous parents can only have autochthonous children if they themselves were both born in the Netherlands. Although the grandchildren of people who migrated to the Netherlands are now officially autochthons, recently CBS has started to monitor the “third generation” as well, since, according to CBS, migration background is “a relevant factor in their socio-economic development” (CBS 2016). By slowly shifting the definition, CBS is also shaping the public discourse in such a way that it seems natural that allochthons never really become Dutch and continue to be seen as migrants (Schinkel 2010; Boersma and Schinkel 2015).
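The parental-birthplace rule described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not CBS's implementation: the names `Person`, `origin_country`, and `classify` are hypothetical, and `"Country X"` is a placeholder for any country other than the Netherlands. The sketch makes visible that a person's own place of birth never enters the decision.

```python
from dataclasses import dataclass

NETHERLANDS = "Netherlands"
ABROAD = "Country X"  # placeholder for any non-Dutch country (illustrative only)


@dataclass(frozen=True)
class Person:
    """Hypothetical record holding only the fields the rule refers to."""
    birth_country: str
    mother_birth_country: str
    father_birth_country: str


def origin_country(person: Person) -> str:
    """Return the country that decides the classification.

    The mother's birthplace takes precedence; the father's birthplace
    counts only when the mother was born in the Netherlands.
    The person's own birthplace is never consulted.
    """
    if person.mother_birth_country != NETHERLANDS:
        return person.mother_birth_country
    return person.father_birth_country


def classify(person: Person) -> str:
    """Label a person 'autochtoon' only if both parents were born in the Netherlands."""
    if origin_country(person) == NETHERLANDS:
        return "autochtoon"
    return "allochtoon"


# A person born abroad to two Dutch-born parents is labeled autochtoon ...
parent = Person(ABROAD, NETHERLANDS, NETHERLANDS)
# ... yet that person's Dutch-born child is labeled allochtoon,
# because the child has a foreign-born mother.
child = Person(NETHERLANDS, parent.birth_country, NETHERLANDS)

print(classify(parent), classify(child))
```

In this sketch, the case of "autochthonous" parents with an "allochthonous" child follows directly from the fact that the label is computed from the previous generation's birthplaces rather than from anything about the person classified.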
Within the category of allochthons, there is a subdivision into Western allochthons and non-Western allochthons. People from Europe (excluding Turkey), North America, Oceania, Japan, and Indonesia are considered Western allochthons. People from Turkey, Africa, Latin America, Asia (excluding Japan), Suriname, the Dutch Antilles, and Aruba are considered non-Western allochthons (see figure 4). Officially, the division is due to “differences in socio-economic and cultural position” (Keij 2000, 24, authors’ translation). However, as noted by Kees Groenendijk (2007, p. 105), other factors also play an important role: “The typically Dutch distinction between Western and Non-Western allochthones is evidently based on political criteria, namely welfare level, geographical or cultural proximity of the country of origin and assumptions about the problematic character of the group.”
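Written out as a lookup, the subdivision shows that it cannot be derived from geography alone. The following is a rough sketch that encodes only the regions and countries named above; the function name `subdivision` and the idea of passing a country's region as an argument are assumptions made for illustration, and `"Country X"` in the usage note is a placeholder.

```python
# Region-level defaults named in the text.
WESTERN_REGIONS = {"Europe", "North America", "Oceania"}
NON_WESTERN_REGIONS = {"Africa", "Latin America", "Asia"}

# Country-level entries named in the text; these override the region default.
WESTERN_COUNTRIES = {"Japan", "Indonesia"}
NON_WESTERN_COUNTRIES = {"Turkey", "Suriname", "Dutch Antilles", "Aruba"}


def subdivision(country: str, region: str) -> str:
    """Return 'Western' or 'non-Western' for a country of origin.

    Country-level entries are checked first, because they are the
    exceptions that cut across the region-level defaults.
    """
    if country in WESTERN_COUNTRIES:
        return "Western"
    if country in NON_WESTERN_COUNTRIES:
        return "non-Western"
    if region in WESTERN_REGIONS:
        return "Western"
    if region in NON_WESTERN_REGIONS:
        return "non-Western"
    raise ValueError(f"no rule for region {region!r}")


print(subdivision("Indonesia", "Asia"))  # exception to the Asia default
print(subdivision("Turkey", "Europe"))   # exception to the Europe default
```

Under these assumptions, `subdivision("Country X", "Asia")` yields "non-Western" while `subdivision("Indonesia", "Asia")` yields "Western": the country-level exceptions, not the regions, carry the political criteria that Groenendijk describes.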
With “the problematic character of the group,” Groenendijk is referring to the difference between the Netherlands’ former colonies. Whereas people from Suriname, Aruba, and the Dutch Antilles (all considered non-Western) have been viewed as problem groups, people from Indonesia are seen as assimilated into Dutch society without any major problems. The reason for this can be found in the different social and economic circumstances in which these groups settled in the Netherlands. While people from Indonesia immigrated in the 1950s during an economic boom and with the plan to stay indefinitely, people from Suriname immigrated at the beginning of the 1980s during a recession (Bovenkerk 1978, 13–14). In addition, the status of Indonesia as the “jewel in the crown” of the Dutch Empire in colonial times has been suggested as another reason for the different categorization of people with an Indonesian migration background (Wekker 2016). This difference in situation has severely shaped the image Dutch society constructed about different groups and has resulted in the construction of a race-ethnic hierarchy. The result can be seen in the way CBS produces statistics about a diverse range of socioeconomic phenomena on its interactive website StatLine.
Many statistics tables feature a default structure in which “ethnic” differences are correlated with often negative phenomena (see, for example, CBS 2019a, also detailed in figure 5). Tables with statistics seem as if the information provided is a neutral representation of reality; however, they obfuscate many choices that have been made in their construction (Kennedy and Hill 2018). In the case of figure 5, the choice of categories and their relation with the issue under scrutiny give the impression that the relationship is causal, meaning that, depending on the political stance of the reader, this might be read as either migration background having an impact on the willingness to pay health insurance, or Dutch society not being able to provide for people with a migration background well enough, causing them to be unable to pay health insurance. Either way, it is assumed that the frame of “migration background” is relevant in this context, as it is assumed it represents reality. However, within our framework, the “migration background frame” of these statistics is made relevant through its very existence; it performs reality. What is missing in this presentation of information is the process through which choices were made about which factors to include and which to exclude in addition to the construction of these factors themselves. Moreover, the identity factors chosen invite the reader to think of them as independent of other similarly constructed identity characteristics, such as age and gender. The result of all these choices is the reduction of the complex social problem of people not being able to pay for their health insurance into a one-dimensional, racially charged issue.
Figure 5 is a table made available through the StatLine application on the CBS website. However, apart from reading data and tables, it is also possible to directly tap into CBS data by using an API. This makes it possible for third parties to create their own applications based on CBS data, affording further recombinations with other data and further decontextualization. Two of these applications, the Allochtonenmeter and the Leefbaarometer, will be discussed in detail in the next two sections. Our aim here is to show, through our postcolonial lens, that the racist practices of the Allochtonenmeter should not be seen just as the racist politics of individual actors or companies but as practices that are built on racism that is deeply embedded in the institutional and infrastructural information practices of the Dutch state. As such, the affordance of the CBS API to appropriate race-ethnic data is not a politically neutral option but rather the very material infrastructure through which race-ethnicity is made. With the case of the Leefbaarometer, then, we show how the Dutch state itself practices a similar politics, which is effectively obfuscated through the practice of “objectivity” through data and visualizations, normalizing racist ideologies in the process. Through this case, it becomes clear that the combination of technical infrastructure and race-ethnically categorized data is inherently a racing system, regardless of its intended purposes. Additionally, by discussing the performativity of race in these two systems, we are providing an argument against the commonly expressed idea that intentionality is an important factor in determining whether or not practices can be considered racist. As we will show, institutional racism operates regardless of intention.
3. The Allochtonenmeter
The Allochtoon-o-meter was an application built on the CBS API, which, with a very simple design (see figure 6), returned the percentage of non-Western immigrants for an entered postal code (Geenstijl 2017). Although it did not explicitly state whether or not a higher percentage would create a positive or a negative situation, it did play into certain stereotypes associated with neighborhoods with a large immigrant population by showing a picture with an apartment building with a lot of satellite dishes. In the Netherlands, this type of housing is mainly associated with social housing areas (government-subsidized apartments in lower-income neighborhoods). The satellite dishes shown in the image suggest a population that is not watching television channels commonly available in the Netherlands and that identifies itself more with media from non-European countries. This picture suggests that integration policies are not effective and that immigrants live in their own parallel societies in the Netherlands.
Although the first app like this, which was featured on the Dutch New Right blog Geenstijl, has been removed now, an almost identical copy was launched with a very similar name: the Allochtonenmeter (Allochtonenmeter 2019, see figure 7). This app works in a similar way but has a different look. Instead of the stereotypical housing Geenstijl used, the Allochtonenmeter featured a picture of a group of angry-looking cartoon figures wearing what look like niqabs. As of 2019, the picture has been changed to a more diverse group of cartoon figures, still featuring one with a niqab but now flanked by lab coat–clad male characters, signifying the scientific nature of this app. After entering a four-digit postcode used in the Netherlands, the user is first presented with the total number of inhabitants in the area covered by this postal code, followed by the absolute numbers and percentages of allochthons (total), Western allochthons, and non-Western allochthons. The word for “autochthons” and the number of “autochthons” are not mentioned. On the bottom of the page, a pink button reads “get to know your neighbors,” which is followed by an explanation of the terms used. What is striking in this explanation is that the tool does not use the updated categories the government itself is using but still talks about “allochtoon” and “Western” and “non-Western.” This is possible because an API affords taking data out of a particular context and appropriating it at will, including changing the names of the categories—categories that were only relabeled but not differently defined in the first place. The graphic design of the website makes this process complete as the presence of the niqab makes clear that the Allochtonenmeter is not so much against, for example, German people living in the Netherlands. The white lab coat–wearing men and the absence of the autochthon in the numbers communicate the unmarked category in this particular app. 
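The mechanics described above can be made concrete with a minimal sketch. It is not the code of the Allochtonenmeter and does not call the CBS API; the class `PostcodeCounts`, the function `report`, the postal code "1234", and all figures are hypothetical and chosen only for illustration. What the sketch shows is that the labels and the selection of output fields are decisions made in the application, not properties of the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostcodeCounts:
    """Counts for one four-digit postal code area (hypothetical structure)."""
    total: int
    western: int
    non_western: int


# Made-up figures for a made-up postal code; not CBS data.
SAMPLE = {"1234": PostcodeCounts(total=8000, western=800, non_western=1200)}


def report(postcode, table=SAMPLE):
    """Return the fields the text describes: totals, counts, and percentages."""
    c = table[postcode]
    marked = c.western + c.non_western

    def pct(n):
        return round(100 * n / c.total, 1)

    # The labels below are chosen here, in the application. The same numbers
    # could be published under the government's updated terminology.
    # The remainder (c.total - marked) is trivially computable, yet it is
    # never reported: the unmarked category stays out of view.
    return {
        "inhabitants": c.total,
        "allochthons (total)": (marked, pct(marked)),
        "Western allochthons": (c.western, pct(c.western)),
        "non-Western allochthons": (c.non_western, pct(c.non_western)),
    }


print(report("1234"))
```

In this sketch the omission of the unmarked group is a single design decision in the return value, which is the kind of choice that an API leaves entirely to the party reusing the data.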
It is this unmarked category that produces the perspective in the form of “the gaze that mythically inscribes all the marked bodies, that makes the unmarked category claim the power to see and not be seen, to represent while escaping representation” (Haraway 1988, 581). From this absence, we can trace back the public that was envisioned by the makers and that this application produces: the white autochthonous population of the Netherlands. It is this process that simultaneously produces the cultural Other of this population in the form of the allochthon. As such, allochthons are envisioned to be not only non-Dutch but also socioeconomically less advantaged, as well as disconnected from Dutch society at large.
It could be argued that this app is a New Right political project that abuses objective and neutrally collected data. However, the way in which it builds on existing state infrastructure that explicitly affords this use shows how datafied racism in this context is not incidental or accidental but institutional. From a perspective that takes these data systems not as representational instruments but as actors that perform race-ethnic identities, the purposes of the Allochtonenmeter cannot be seen as “unintended” by-products of otherwise neutral epistemic methods. Rather, racing is an inherent process of an infrastructural system that provides race-ethnically categorized data for reuse. To further articulate this argument, we will now discuss how the Dutch government itself appropriates race-ethnic data in a system that intends, but nevertheless fails, to be nonpolitical, neutral, and objective.
4. The Leefbaarometer
In 2007 the Dutch Ministry of Internal Affairs commissioned RIGO and Atlas voor gemeenten, institutions that map geographical and demographical data within areas in the Netherlands, to develop the Leefbaarometer (LB) (Leefbaarometer 2019). The LB is an interface that calculates a livability score, based on a hundred indicators divided into five dimensions: security, population, facilities, physical surroundings, and buildings (see figure 8 [4]). Ten data sources are used to construct the indicators that form the five dimensions of the LB. These data sources are Bisnode [5], Centraal Bureau voor de Statistiek (CBS) [6], RIGO [7], Atlas voor gemeenten [8], Vastgoedmonitor (VGM) [9], Kadaster [10], Gemeentelijke Basis Administratie (GBA) [11], Centrum voor Werk en Inkomen (CWI) [12], Korps Landelijke Politiediensten (KLPD) [13], and Politiemonitor [14] (Leefbaarometer 2019). The primary data source is socioeconomic data from CBS. For the “Population” dimension, these data are supplemented with data from the GBA. With the help of these data sources, the LB presents a statistical estimate of livability within a certain area in the Netherlands. In this estimation, the statistical model focuses on measuring various conditions of a habitat. The calculation depends on the presence and availability of data from the various data sources within an area. It is assumed that the attributed data is valid and can be generalized to the entirety of the Netherlands. On the basis of these estimates, a livability score is calculated at the scale of a 6ppc (six-number postal code) area with a minimum of forty inhabitants within that geographically bounded area (Leefbaarometer 2019).
A large literature study, conducted years before the construction of the Leefbaarometer, forms the basis of the livability configuration (Leidelmeijer and Kamp 2003). Based upon this study, the following definition of livability is used: “livability is the measure in which the environment connects to the wishes and demands people have of it” (Leefbaarometer 2019, authors’ translation). The measure of livability is based on the factors that people value in relation to a livable environment. These factors are translated into the constructed indicators and divided into five dimensions. With these indicators, a measure of livability is depicted in an interactive system, consisting of maps, graphs, and texts. In what follows, we will show how this measure of livability is visualized on a map. We will specifically focus on the way in which race-ethnic and economic indicators regarding Dutch citizens are used and presented in this system.
The measure of livability, based on the aforementioned indicators, is presented within a geographical map of the Netherlands (see figure 9 [15]). An area is left blank when no score is calculated due to insufficient data or because fewer than the minimum number of forty people live in the area. This minimum is set so that the livability scores cannot be used to identify individuals. Nine possible colors represent the livability score, ranging from dark red (very insufficient) to dark green (excellent). In total, there are three shades of red (very insufficient to insufficient), one shade of yellow (weak), and five shades of green (sufficient to excellent). On the lowest scale, the default spatial dimension is set to a grid map. The grid map shows blocks of 100 × 100 meters that are colored according to the calculated livability score (see figure 10 [16]). If fewer than forty people live within one grid cell or if not enough data is available or retrievable from the various data sources, no livability score is presented, and the area remains gray. The size of the squares (100 × 100 m) illustrates that the system is more applicable to urban areas, since in rural areas the number of people within one square might result in statistical invalidity because of a lower population density. When the map is viewed on a scale larger than the grid map, a complete district is visualized in a color that represents the average livability score of that area (see figures 9 and 10). This can produce a skewed presentation of livability within the specified area, as the scores with colors on a lower scale are homogenized. For example, an area including two districts with a livability score of excellent (dark green) that are surrounded by several districts with a livability score of very insufficient (dark red) is represented by a single color.
When the map is viewed on a larger scale, this visualization would imply that the livability score for that entire area is below average, as the color for that area is presented as red even though those two districts are scored above average. The scale on which the map is visualized is therefore of utmost importance, as different meanings could be attributed to the visualizations. Clicking on a square shows an information window (see figure 11 [17]), which elaborates on the calculated scores based on the five dimensions with their indicators in comparison with the national average. The national average is a livability score that ranges from amply sufficient to good for all districts within the Netherlands (Leefbaarometer 2019).
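The homogenizing effect of scale described above can be illustrated with a small sketch. It rests on assumptions that are ours and not the LB's documented method: scores are represented as the integers 1 (very insufficient) to 9 (excellent), a larger area takes the rounded unweighted mean of its scored units, and the names `cell_score`, `aggregate`, and `color_family` are hypothetical. Only the forty-inhabitant threshold and the grouping into three reds, one yellow, and five greens are taken from the description above.

```python
from statistics import mean

MIN_INHABITANTS = 40  # threshold named in the text


def color_family(score):
    """Map a class 1..9 to its color family; None means no score (gray)."""
    if score is None:
        return "gray"
    if score <= 3:
        return "red"      # three shades of red
    if score == 4:
        return "yellow"   # one shade of yellow
    return "green"        # five shades of green


def cell_score(score, inhabitants):
    """Suppress the score of a unit with too few inhabitants."""
    return score if inhabitants >= MIN_INHABITANTS else None


def aggregate(units):
    """Assumed aggregation: rounded unweighted mean of the scored units."""
    scored = [s for s in (cell_score(sc, n) for sc, n in units) if s is not None]
    return round(mean(scored)) if scored else None


# Two "excellent" units surrounded by six "very insufficient" ones,
# plus one unit that is suppressed because only 30 people live there.
units = [(9, 120), (9, 95)] + [(1, 200)] * 6 + [(9, 30)]
area = aggregate(units)
print(area, color_family(area))
```

Under these assumptions the whole area is drawn in a single red shade, and neither the two excellent units nor the suppressed one leaves a trace at the larger scale.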
In addition, through the use of these graphic designs with accompanying color schemes of shades ranging from green to dark red, a Dutch norm concerning livability is signified. As all scores are consistently compared to the Dutch average level of livability, the Dutch norm—or, in the words of Harding (1991), a clear “standpoint”—is used to score areas. Racing within the LB is most clearly embedded within the population indicators, since, as figure 11 shows, it is possible for an area to receive a negative score for its “inhabitants.” This score is not a proxy for crime or income and does not include factors other than the presence of particular groups in this area. The way in which these groups of people are scrutinized is further elaborated on in the next section.
For the calculation of livability, the LB determines the postal code of a certain area and incorporates aggregated data regarding that postal code area in the calculation based on one hundred indicators. As stated in the pie chart in figure 8, 15 percent of the livability score is determined by the dimension “population.” This dimension consists of 16 indicators (see figure 12). Indicators 1 to 7 refer to the number of allochthons with various origins and ethnicities living in a specified postal code area. Indicators 8 to 10 refer to the composition of the households, incorporating the average number of people in a household, whether or not there are children in the house, and if there are one or two parents in the household. Indicators 11 and 12 refer to the number of people who receive social benefits and/or welfare from the state, due to disabilities and unemployment. Indicator 13 is the only indicator that focuses solely on one specific age group, which consists of the elderly (65+). Indicators 14 to 16 refer to the development and fluctuation of people within the age group of 15 to 24 years, composition of households, and people migrating from one postal area to another. For the purpose of this article, we will focus on the indicators that relate to race-ethnicity, age, and source of income.
Indicators 1 and 2 refer to the Western and non-Western allochthon category as a percentage of the whole in a postal code area. There is no category for autochthons. The following indicators are subcategories of this rather broad division. Indicator 3 refers to Middle and Eastern Europeans, which includes the countries of Estonia, Hungary, Latvia, Lithuania, Poland, Slovenia, Slovakia, Czech Republic, Romania, and Bulgaria. Indicators 4, 5, and 6 refer to the three largest groups of people with a non-Dutch migration background: Moroccans, Turkish, and Surinamese. Indicator 7 is a residual category of people with other non-Western origins. For all mentioned populations, larger than average percentages have a negative impact on the population score. This serves as an example of how the LB embeds the dominant state ideology and hegemonic norms concerning the discursive and institutional “Other.” Through the construction of these indicators, the LB actively produces these so-called “Others.” As described earlier, these datafied instances of inequalities, “making up people” (Hacking 1986), and “social sorting” (Lyon 2005) are not understood merely as epistemological issues but also as ontological practices producing racial formations. Whereas in the Allochtonenmeter, datafied racing happens only through indication of the allochthon in the form of a percentage and the accompanying imagery that negatively depicts Muslims, in the LB, racing happens through indication of location and country of origin, combined with characteristics such as class, welfare status, and household composition, producing a datafied form of racing that is hard to challenge as it hides under a cloak of objectivity.
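To make tangible what it means that larger than average percentages lower the population score, consider the following sketch. It does not reproduce the LB's statistical model: the national averages, the weights, the rule that only above-average shares count, and the function `population_score` are all hypothetical. The point is structural. A score built only from marked categories has no term for the unmarked group, so an area can lose points solely through the presence of the groups that are counted.

```python
# Hypothetical values for illustration only; not taken from the LB or CBS.
NATIONAL_AVG = {"western": 10.0, "non_western": 12.0}  # made-up percentages
WEIGHTS = {"western": 1.0, "non_western": 1.0}         # made-up equal weights


def population_score(area_pct):
    """Subtract a penalty for every marked group above its national average.

    Assumption: shares at or below the average contribute nothing.
    There is no entry for the unmarked category, so it cannot affect the score.
    """
    score = 0.0
    for name, weight in WEIGHTS.items():
        excess = area_pct[name] - NATIONAL_AVG[name]
        if excess > 0:
            score -= weight * excess
    return score


# Two hypothetical areas that differ only in who lives there.
print(population_score({"western": 10.0, "non_western": 12.0}))  # at the average
print(population_score({"western": 14.0, "non_western": 20.0}))  # above the average
```

Nothing in this sketch refers to crime, income, or housing. The second area scores lower only because of its composition, which is the property of the population indicators that the argument turns on.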
The absence of the category of the autochthon is the most striking aspect in the context of how racing is performed within the system. This absence is in line with what other scholars investigating migration and integration discourses have found: “Autochthony” functions as the implicit “reference category” (Emirbayer and Desmond 2012; Boersma 2019). The autochthon/allochthon dichotomy as an epistemology has shaped knowledge production by setting autochthony as the norm against which all “others” are measured:
Like whiteness, “autochthony” has implicitly (and sometimes explicitly) functioned as the unreflexive norm, a neutral category, a natural fact without a history or relational context. Thus it functions, like whiteness, as a “reference category” (Emirbayer and Desmond 2012; Hartigan 1997) against which deviant cultures can be measured, or a cultural “whole” into which minoritized and racialized others can be reasonably expected to “integrate.” (Mepschen 2016, 29)
This absent “reference category” also communicates what the intended publics and/or users are, not only of the more right-wing Allochtonenmeter but also of the governmentally commissioned LB. This embedded norm of autochthony casts a different light on the LB’s own definition of livability in terms of “the wishes and demands people have about it” (Leefbaarometer 2019, authors’ translation). The “people” in this sentence are not so much the total and diverse population of the Netherlands but rather the position of the unmarked category that is taken to be “objective” in the system. It is the claim of a “gaze from nowhere” that Haraway (1988) explains as signifying “the unmarked positions of Man and White, one of the many nasty tones of the word ‘objectivity’ to feminist ears in the scientific and technological, late-industrial, militarized, racist, and male-dominant societies” (Haraway 1988, 581).
What counts as an unmarked category in this system is, quite literally, any category that is missing from the list. In the case of race-ethnicity, we can quite clearly see that autochthons are not accounted for, but this is not the only identity characteristic that is missing in this system. When we look at the income sources, we find that indicators 11 and 12 are derived from socioeconomic data concerning recipients of disability benefits and social welfare relating to unemployment as a percentage of the whole in a postal code area. These indicators generally have a negative value for the livability calculation. The assumption is that the income levels of these people are lower than the national average, which results in a devaluation of the livability measure. There is no indicator for people who do not receive benefits, that is, people who have a job or people who are not searching for a job. Moreover, indicator 13 consists of elderly people above the age of 65 as a percentage of the whole in a postal code area. This group is already retired from work or is close to retiring. The indicator is derived from data concerning retirement measures and people receiving a pension. The data is corrected and enriched with data on income levels and generally has a negative value for the livability calculation, as retired people receiving a pension are categorized in a lower income level than people with a job. Indicators 14 and 15 are derived from data concerning the development of people residing in the age group of 15 to 24 years, as these young adults move from their teens to adolescence. There is no explicit indicator for people between 25 and 65 years of age.
From this we can infer that the intended public that is produced consists of white, working people between the ages of 25 and 65, whose normative frameworks are envisioned by the application as not wishing to have a range of minority groups and welfare recipients as part of their environment.
Regarding these indicators, the LB can be understood as an e-governmental instrument producing data publics with a sense of the state’s performance, an experience that is somewhat less analytical, as a result of its oversimplification of a complex issue, and more experiential, through its graphical interface and use of colors. It is through this experience that it produces a discursive separation of those who are considered migrants and those who are not. The LB was primarily intended for data publics that have some experience with open governmental data and can play and logically interact with rankings, dashboards, scoreboards, and other visualizations (Ruppert 2015). In practice, it is not known whether municipalities use it or which actors specifically use it. The problem lies in the assumption that these data publics have a certain literacy with such instruments and comprehend the complex configurations built, and choices made, within the system. The framing of the LB “as if” it is objective and constructed upon “factual” open government data that represents the state further exacerbates the skewed interpretation and use of the LB and undermines its initial purpose of informing the public. Within the perspective of critical data studies, data is “those units of data that have been selected and harvested from the sum of all potential data” (Kitchin 2014b, 2). This means that the data is easily appropriated for whatever ends (political or otherwise) while still retaining an inaccurate veneer of neutrality.
We argue for the use of critical frameworks through which the embedded normative assumptions of datafied systems and epistemologies can be made visible. With the proposed postcolonial data studies perspective, it is possible not only to critically engage with the technical apparatus that reproduces inequality but also to place these systems and their ontologies and epistemologies in relation to a (post)colonial history and present. Through an infrastructural inversion, we have shown how already racialized social and cultural understandings of the (former) migrant travel through a diverse set of technical systems, taking up new racial meanings on their path. With the notion of the “data assemblage,” we have argued that the construction of racial hierarchies does not happen in one particular location. Instead, datafied racing emerges through an interplay of a diverse set of actors, which do not require the notion of “race” itself to be already present in the system. Even, or maybe especially, in a culture of racial denial, the process of racing can happen to a virtually unlimited set of human characteristics. With this understanding, “white innocence” (Wekker 2016) is not an exclusively human condition but can be carried out through technical systems as well. We have shown how the race-ethnic conceptual pair of autochthon and allochthon is therefore a social as well as a sociotechnical construct, as its meanings rely heavily on the data ontologies through which they are institutionalized. With our performative notion of racing, this technical institutionalization and infrastructuralization of race-ethnicity can hardly be seen as a neutral and objective epistemic method. In the words of Joshua Scannell, we will have to push past “the insufficient critique that such systems run the risk of reproducing racial inequalities. Rather, producing racialized oppression is all that they can do” (Scannell 2019, 113).
We propose to rethink the foundation and construction of indicators representing minorities, which are commonly embedded within e-governmental systems and distributed across the whole information system. Scrutinizing the multidimensional problematic implications of indicators is an essential element when shifting their terms from certainty, objectivity, and neutrality to ambiguity and constructedness (Ruppert 2015). The concept of “standpoint theory” helps to increase awareness regarding these concerns, whereas a postcolonial perspective sheds a different light on the mixture of socioeconomic and cultural norms embedded in e-governmental systems. With these perspectives, it becomes ever more possible to place contemporary issues concerning race, ethnicity, and nationality in conversation with their past in order to imagine better futures, futures in which the government in particular starts to think about more accountability, rather than transparency, in its datafied systems (Ananny and Crawford 2016; Lepri et al. 2017).
This accountability would start by making explicit the purposes of a system and the way in which particular information helps in reaching that goal. In this process, engineers should take into account how their systems produce and reproduce marginalized identities. Ideally, the systems should account for their situatedness (Suchman 2006), creating the possibility for critique and making it possible for affected communities to challenge outcomes. This way, this situatedness can be leveraged not only with existing laws and regulations but also with ethics concerning equality, as only this would create the possibility for a more inclusive and antiracist datafied society.
Gerwin van Schie and Alex Smit received funding for the research leading to authorship and publication of this article from the Dutch Scientific Council (NWO) through Gerwin van Schie’s PhD project, “Datafication of Race and Ethnicity in the Netherlands: Investigating Practices, Politics and Appropriation of Governmental Open Data.” Nicolás López Coombs is an independent researcher affiliated with the Datafied Society Research Hub of Utrecht University.
Gerwin van Schie received his MA in new media and digital culture from Utrecht University in 2015. In 2017 he began the NWO-funded PhD project “Datafication of Race and Ethnicity in the Netherlands: Investigating Practices, Politics, and Appropriation of Governmental Open Data.” In this project, he focuses on the way Dutch citizens are quantified and racialized through data systems in use by various governmental institutions.
Alex Smit holds an MA in new media and digital culture from Utrecht University. His research focuses on the epistemological implications of data-driven systems within society, with a strong interest in data literacy, data justice, open (government) data, and critical data studies.
Nicolás López Coombs is an independent researcher in Brisbane, Australia, and formerly affiliated with Utrecht Data School of Utrecht University. He received his MA in new media and digital culture from Utrecht University in 2016, where his thesis took a fan studies approach to citizen journalism related to the podcast Serial. His research focuses on the use of data visualization within online communities.
Image acquired in December 2019 from https://leefbaarometer.nl/kaart/#kaart.
Image acquired in December 2019 from https://www.leefbaarometer.nl/page/help.
Bisnode is an international data science/consulting company originating in Sweden, specializing in helping governmental and corporate actors optimize data-driven decision-making and analytics.
Centraal Bureau voor de Statistiek, or Statistics Netherlands, is an independent governmental organization that provides socioeconomic data concerning Dutch society; i.e., the data collections consist of data on people’s age, income, and the amount of social benefits in a specified area.
RIGO (Research Instituut Gebouwde Omgeving) is a research and consulting organization with a strong focus on advising semigovernmental actors how to improve livability levels within municipalities, based on data analytics and research.
Atlas voor gemeenten is a semigovernmental research bureau exploring socioeconomic and cultural phenomena within municipalities in the Netherlands, based on government data.
Vastgoedmonitor, or Real-estate Monitor (authors’ translation), is an information system presenting information on real estate issues within the Netherlands. The system connects multiple geo-economic databases to construct graphs and figures concerning real estate phenomena within municipalities in the Netherlands.
Kadaster is a Dutch national real estate registration agency, formulating advice concerning real estate themes such as the value of real estate, borders of property, etc.
Gemeentelijke Basis Administratie, or Municipal Administrations Office (authors’ translation), is a Dutch governmental administration office registering socioeconomic and demographic data concerning Dutch civilians.
Centrum voor Werk en Inkomen, or Centre for Work and Income (authors’ translation), is a Dutch governmental administration office registering all types of socioeconomic data concerning work and income levels within the Netherlands.
Korps Landelijke Politiediensten, or Dutch National Police (authors’ translation), is the Dutch law enforcement agency on a national level.
Politiemonitor, or Police Monitor (authors’ translation) is a Dutch governmental research agency focusing on investigations concerning Dutch criminality and safety.
Image acquired in December 2019 from https://leefbaarometer.nl/kaart/#kaart.
Image acquired in December 2019 from https://leefbaarometer.nl/kaart/#kaart.
Image acquired in December 2019 from https://leefbaarometer.nl/kaart/#kaart.