India's Data Protection Framework Will Need to Treat Privacy as a Social and Not Just an Individual Good
Published in Economic & Political Weekly, Volume 53, Issue No 18, 5 May 2018.
In July 2017, the Ministry of Electronics and Information Technology (MeitY) in India set up a committee headed by a retired Supreme Court judge, B N Srikrishna, to address the growing clamour for privacy protections at a time when both private collection of data and public projects like Aadhaar are reported to pose major privacy risks (Maheshwari 2017). The Srikrishna Committee is in the process of providing its input, which will go on to inform India’s data protection law.
While the committee released a white paper with provisional views and sought feedback a few months ago, it may be discussing a data protection framework without due consideration of how data practices have evolved.
In early 2018, a series of stories based on investigative journalism by the Guardian and the Observer revealed that the data of 87 million Facebook users was used for the Trump campaign by a political consulting firm, Cambridge Analytica, without their permission. Aleksandr Kogan, a psychology researcher at the University of Cambridge, created an application called “thisisyourdigitallife” and collected data from 270,000 participants through a personality test using Facebook’s application programming interface (API), which allows developers to integrate with various parts of the Facebook platform (Fruchter et al 2018). This data was purportedly collected for academic research purposes only. Kogan’s application also collected profile data from each of the participants’ friends, roughly 87 million people.
The kinds of practices concerning the sharing and processing of data exhibited in this case are not unique; they are, in fact, common to the data economy in India as well. It can be argued that the Facebook–Cambridge Analytica incident is representative of data practices in the data-driven digital economy. These new practices pose important questions for data protection laws globally, and for how such laws may need to evolve, particularly for India, which is in the process of drafting its own data protection law.
Privacy as Control
Most modern data protection laws focus on individual control. In this context, the definition by the late Alan Westin (2015) characterises privacy as:
The claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.
The idea of “privacy as control” is what finds articulation in data protection policies across jurisdictions, beginning with the Fair Information Practice Principles (FIPPs) from the United States (US) (Dixon 2006). These FIPPs are the building blocks of modern information privacy law (Schwartz 1999) and not only play a significant role in the development of privacy laws in the US, but also inform data protection laws in most privacy regimes internationally (Rotenberg 2001), including the nine “National Privacy Principles” articulated by the Justice A P Shah Committee in India. Much of this approach is also reflected in the white paper released by the committee led by Justice Srikrishna towards the creation of data protection laws in India (Srikrishna 2017).
This approach essentially involves the following steps (Cate 2006):
(i) Data controllers are required to tell individuals what data they wish to collect and use and give them a choice to share the data.
(ii) Upon sharing, the individuals have rights such as being granted access, and data controllers have obligations such as securing the data with appropriate technologies and procedures, and only using it for the purposes identified.
The objective of this approach is to empower individuals and allow them to weigh their own interests in exercising consent. The allure of this paradigm is that, in one elegant stroke, it seeks to “ensure that consent is informed and free and thereby also (seeks) to implement an acceptable tradeoff between privacy and competing concerns” (Sloan and Warner 2014). This approach is also easy to enforce for both regulators and businesses. Data collectors and processors only need to ensure that they comply with their own privacy policies, and can thus reduce their liability, while, theoretically, consumers have the information required to exercise choice. In recent years, however, the emergence of big data, the “Internet of Things,” and algorithmic decision-making has significantly compromised the notice-and-consent model (Solove 2013).
Limitations of Consent
Some cognitive problems, such as long and difficult-to-understand privacy notices, have always existed with regard to informed consent, but lately these problems have been aggravated. Privacy notices often come in the form of long legal documents, much to the detriment of readers’ ability to understand them. These policies are “long, complicated, full of jargon and change frequently” (Cranor 2012).
Kent Walker (2001) lists five problems that privacy notices typically suffer from:
(i) Overkill: Long and repetitive text in small print.
(ii) Irrelevance: Describing situations of little concern to most consumers.
(iii) Opacity: Broad terms that reveal little and do not help users track and control the information collected and stored.
(iv) Non-comparability: The simplification required to make notices comparable compromises their accuracy.
(v) Inflexibility: Failure to keep pace with new business models.
Today, data is collected continuously with every use of online services, making it humanly impossible to exercise meaningful consent.
The quantity of data being generated is expanding at an exponential rate. With connected devices, smartphones, and appliances transmitting data about our usage, and even smart cities themselves doing the same, data now streams constantly from almost every sector and function of daily life, “creating countless new digital puddles, lakes, tributaries and oceans of information” (Bollier 2010).
The infinitely complex nature of the data ecosystem renders consent of little value even in cases where individuals are able to read and comprehend privacy notices. As the uses of data are so diverse, and often not limited to a purpose identified at the outset, individuals cannot conceptualise how their data will be aggregated and possibly used or reused.
Seemingly innocuous bits of data revealed at different stages can be combined to reveal sensitive information about an individual. While the regulatory framework is designed on the expectation that individuals will engage in a cost–benefit analysis when trading their data to avail of services, this ecosystem makes such individual analysis impossible.
Conflicts Between Big Data and Individual Control
The thrust of big data technologies is that the value of data resides not in its primary purposes, but in its numerous secondary purposes, where data is reused many times over (Mayer-Schönberger and Cukier 2013).
On the other hand, the idea of privacy as control draws from the “data minimisation” principle, which requires organisations to limit the collection of personal data to the minimum extent necessary to achieve their legitimate purpose, and to delete data that is no longer required. Control is exercised, and privacy is enhanced, by ensuring data minimisation. These two concepts are in direct conflict. Modern data-driven businesses want to retain as much data as possible for secondary uses. Since these secondary uses are, by their nature, unanticipated, such practices run counter to the very principle of purpose limitation (Tene and Polonetsky 2012).
It is evident from such data-sharing practices, as demonstrated by the Cambridge Analytica–Facebook story, that platform architectures are designed with a clear view to collecting as much data as possible. This is amply demonstrated by the “friends permission” feature that Facebook provided on its platform to allow individuals to share information not just about themselves, but also about their friends. For the principle of informed consent to be meaningfully implemented, users must have access to information about intended data practices, purposes and usage, so that they consciously share data about themselves.
In reality, however, privacy policies are more likely to serve as liability disclaimers for companies than as any kind of guarantee of privacy for consumers. A case in point is Mark Zuckerberg’s facile claim that there was no “data breach” in the Cambridge Analytica–Facebook incident. Instead of asking each of the 87 million users whether they wanted their data to be collected and shared further, Facebook designed a platform that required consent in any form from only 270,000 users. Not only were the remaining users denied the opportunity to give consent, their consent was assumed through a feature that was on by default. This is representative of how privacy trade-offs are conceived by current data-driven business models. Participation in a digital ecosystem is by itself deemed to be users’ consent to relinquish control over how their data is collected, who may have access to it, and what purposes it may be used for.
Yet, Zuckerberg would have us believe that the primary privacy concern is not how his platform enabled the collection of users’ data without their explicit consent, but the subsequent unauthorised sharing of that data by Kogan. Zuckerberg’s insistence that the collection of people’s data without their consent is not a data breach is reminiscent of the UIDAI’s recent claims in India that the publication of Aadhaar numbers and related information by several government websites is not a data breach, so long as its central biometric database is secure (Sharma 2018). In those cases too, the intended architecture ensured the seeding of other databases with Aadhaar numbers, thus creating multiple potential points of failure through disclosure. Similarly, design flaws in direct benefit transfers enabled Airtel to create payments bank accounts without the customers’ knowledge (Hindu Business Line 2017). Such claims clearly suggest the very limited responsibility data controllers (both public and private) are willing to take for the personal data that they collect, while wilfully facilitating and encouraging data practices which may lead to greater risks to that data.
On this note, it is also relevant to point out that the Srikrishna Committee’s white paper begins by identifying informational privacy and data innovation as its two key objectives. It states that “a firm legal framework for data protection is the foundation on which data-driven innovation and entrepreneurship can flourish in India.”
Conversations around privacy and data have become inextricably linked to the idea of technological innovation as a competing interest. Before engaging in such conversations, it is important to acknowledge that the value of innovation as a competing interest is itself questionable. It is not a competing right, nor a legitimate public interest endeavour, nor a proven social good.
The idea that, in policymaking, technological innovations may compete with the privacy of individuals assumes that there is social and/or economic good in allowing unrestricted access to data. The social argument is premised on the promise that mathematical models and computational capacity can identify key insights from data. In turn, these insights may be useful in public and private decision-making. However, it must be remembered that data is potentially a toxic asset if it is not collected, processed, secured and shared in an appropriate way. Sufficient research suggests that indiscriminate data collection greatly increases the ratio of noise to signal and can lead to erroneous insights. Further, the more data that is collected, the greater the attack surface exposed to cybersecurity risks. Incidents such as Facebook–Cambridge Analytica demonstrate the toxicity of data in various ways and underscore the need for data regulation at every stage of the data life cycle (Schneier 2016). These are important tempering factors that need to be kept in mind while evaluating data innovation as a key mover of policy or regulation.
Privacy as Social Good
As long as privacy is framed as arising primarily from individual control, data controllers will continue to engage in practices that compromise the ability to exercise choice. There is a need to view privacy as a social good, and policymaking should ensure its preservation and enhancement. Contractual protections and legal sanctions can by themselves do little if platform architectures are designed to do the exact opposite.
More importantly, policymaking needs to recognise privacy not merely as an individual right, available for individuals to forego when engaging with data-driven business models, but also as a social good. Recognising something as a social good deems it desirable by definition and makes it a legitimate goal of law and policy, rather than leaving its achievement entirely to market forces.
The Puttaswamy judgment (K Puttaswamy v Union of India 2017) lends sufficient weight to privacy’s social value by identifying it as fundamental to individual development, through its dependence on solitude, anonymity, and temporary releases from social duties.
Sociological scholarship demonstrates that different types of social relationships, be they Gesellschaft (interest groups and acquaintances) or Gemeinschaft (friendship, love, and marriage), depend in their very nature on the ability to conceal certain things (Simmel 1906). Demonstrating this in the context of friendships, it has been observed that such relationships “present a very peculiar synthesis in regard to the question of discretion, of reciprocal revelation and concealment.” Friendships, much like most other social relationships, are very much dependent on our ability to selectively present ourselves to others. Contrast this with Zuckerberg’s stated aim of making the world more “open,” where information about people flows freely and effectively without any individual control. Contrast this also with government projects such as Aadhaar, which is intended to act as one universal identity that can provide a 360-degree view of citizens.
Other scholars, such as Julie Cohen (2012) and Anita Allen (2011), have demonstrated that the data a person produces or has control over concerns both herself and others. Individuals can be exposed not only because of their own actions and choices, but also merely because others have been careless with their data. This point is amply demonstrated by the Facebook–Cambridge Analytica incident. What this means is that the protection of privacy requires not just individual action but, in a sense, group coordination. It is my argument that this group interest in privacy as a social good must be the basis of policymaking and regulation of data in the future, in addition to the idea of privacy as an individual right. In the absence of attention to the social good aspect of privacy, individual consumers are left to their own devices to negotiate their privacy trade-offs with large companies and governments, and are significantly compromised.
What this translates into is that regulatory and data protection frameworks should not be value-neutral in their conception of privacy as a facet of individual control. The complete reliance of data regulation on the data subject to make an informed choice is, in my opinion, an idea that has run its course. If privacy is viewed as a social good, then the data protection framework, including both the laws and the architecture, must be designed with a view to protecting it, rather than leaving it entirely to market forces.
The Way Forward
Data protection laws need to be re-evaluated, and policymakers must recognise Lawrence Lessig’s dictum that “code is law.” Like laws, architecture and norms can play a fundamental role in regulation. Regulatory intervention for technology need not mean only the regulation of technology, but also consideration of how technology itself may be leveraged for regulation (Lessig 2006; Reidenberg 1998). It is key that the latter is not left only in the hands of private players.
Zuckerberg, in his testimony (Washington Post 2018) before the United States Senate's Commerce and Judiciary committees, asserted that "AI tools" are central to any strategy for addressing hate speech, fake news, and manipulations that use data ecosystems for targeting.
What is most concerning in his testimony is the complete lack of mention of standards, public scrutiny and peer-review processes, which “AI tools” and regulatory technologies need to be subject to. Further, it cannot be expected that data-driven businesses will view privacy as a social good or be publicly accountable.
As policymakers in India gear up for writing the country’s data protection law, they must acknowledge that their responsibility extends to creating norms and principles that will inform future data-driven platforms and regulatory technologies.
Since issues of privacy and data protection will have to be increasingly addressed at the level of how architectures enable data collection, and more importantly how data is used after collection, policymakers must recognise that being neutral about these practices is no longer enough. They must take normative positions on data collection, processing and sharing practices. These positions cannot be implemented through laws only, but need to be translated into technological solutions and norms. Unless a multipronged approach comprising laws, architecture and norms is adopted, India’s new data protection regime may end up with limited efficacy.