Big Data in India: Benefits, Harms, and Human Rights - Workshop Report

Posted by Vidushi Marda, Akash Deep Singh and Geethanjali Jujjavarapu at Nov 14, 2016 05:45 AM | Permalink

Filed under: Human Rights, UID, Big Data, Privacy, Artificial Intelligence, Internet Governance, Machine Learning, Featured, Digital India, Aadhaar, Information Technology, E-Governance

The Centre for Internet and Society held a one-day workshop on “Big Data in India: Benefits, Harms and Human Rights” at India Habitat Centre, New Delhi on the 1st of October, 2016. This report is a compilation of the the issues discussed, ideas exchanged and challenges recognized during the workshop. The objective of the workshop was to discuss aspects of big data technologies in terms of harms, opportunities and human rights. The discussion was designed around an extensive study of current and potential future uses of big data for governance in India, that CIS has undertaken over the last year with support from the MacArthur Foundation.

Contents

Big Data: Definitions and Global South Perspectives

Aadhaar as Big Data

Seeding

Aadhaar and Data Security

Aadhaar’s Relational Arrangement with Big Data Scheme

The Myths surrounding Aadhaar

IndiaStack and FinTech Apps

Problems with UID

Big Data: Definitions and Global South Perspectives

“Big Data” has been defined by multiple scholars till date. The first consideration at the workshop was to discuss various definitions of big data, and also to understand what could be considered Big Data in terms of governance, especially in the absence of academic consensus. One of the most basic ways to define it, as given by the National Institute of Standards and Technology, USA, is to take it to be the data that is beyond the computational capacity of current systems. This definition has been accepted by the UIDAI of India. Another participant pointed out that Big Data is not only indicative of size, but rather the nature of data which is unstructured, and continuously flowing. The Gartner definition of Big Data relies on the three Vs i.e. Volume (size), Velocity (infinite number of ways in which data is being continuously collected) and Variety (the number of ways in which data can be collected in rows and columns).

The presentation also looked at ways in which Big Data is different from traditional data. It was pointed out that it can accommodate diverse unstructured datasets, and it is ‘relational’ i.e. it needs the presence of common field(s) across datasets which allows these fields to be conjoined. For e.g., the UID in India is being linked to many different datasets, and they don’t constitute Big Data separately, but do so together. An increasingly popular definition is to define data as “Big Data” based on what can be achieved through it. It has been described by authors as the ability to harness new kinds of insight which can inform decision making. It was pointed out that CIS does not subscribe to any particular definition, and is still in the process of coming up with a comprehensive definition of Big Data.

Further, discussion touched upon the approach to Big Data in the Global South. It was pointed out that most discussions about Big Data in the Global South are about the kind of value that it can have, the ways in which it can change our society. The Global North, on the other hand, has moved on to discussing the ethics and privacy issues associated with Big Data.

After this, the presentation focussed on case studies surrounding key Central Government initiatives and projects like Aadhaar, Predictive Policing, and Financial Technology (FinTech).

Aadhaar as Big Data

In presenting CIS’ case study on Aadhaar, it was pointed out that initially, Aadhaar, with its enrollment dataset was by itself being seen as Big Data. However, upon careful consideration in light of definitions discussed above, it can be seen as something that enables Big Data. The different e-governance projects within Digital India, along with Aadhaar, constitute Big Data. The case study discussed the Big Data implications of Aadhaar, and in particular looked at a ‘cradle to grave’ identity mapping through various e-government projects and the datafication of various transaction generated data.

Seeding

Any digital identity like Aadhaar typically has three features: 1. Identification i.e. a number or card used to identify yourself; 2. Authentication, which is based on your number or card and any other digital attributes that you might have; 3. Authorisation: As bearers of the digital identity, we can authorise the service providers to take some steps on our behalf. The case study discussed ‘seeding’ which enables the Big Data aspects of Digital India. In the process of seeding, different government databases can be seeded with the UID number using a platform called Ginger. Due to this, other databases can be connected to UIDAI, and through it, data from other databases can be queried by using your Aadhaar identity itself. This is an example of relationality, where fractured data is being brought together. At the moment, it is not clear whether this access by UIDAI means that an actual physical copy of such data from various sources will be transferred to UIDAI’s servers or if they will just access it through internet, but the data remains on the host government agency’s server. An example of even private parties becoming a part of this infrastructure was raised by a participant when it was pointed out that Reliance Jio is now asking for fingerprints. This can then be connected to the relational infrastructure being created by UIDAI. The discussion then focused on how such a structure will function, where it was mentioned that as of now, it cannot be said with certainty that UIDAI will be the agency managing this relational infrastructure in the long run, even though it is the one building it.

Aadhaar and Data Security

This case study also dealt with the sheer lack of data protection legislation in India except for S.43A of the IT Act. The section does not provide adequate protection as the constitutionality of the rules and regulations under S.43A is ambivalent. More importantly, it only refers to private bodies. Hence, any seeding which is being done by the government is outside the scope of data protection legislation. Thus, at the moment, no legal framework covers the processes and the structures being used for datasets. Due to the inapplicability of S.43A to public bodies, questions were raised as to the existence of a comprehensive data protection policy for government institutions. Participants answered the question in the negative. They pointed out that if any government department starts collecting data, they develop their own privacy policy. There are no set guidelines for such policies and they do not address concerns related to consent, data minimisation and purpose limitation at all. Questions were also raised about the access and control over Big Data with government institutions. A tentative answer from a participant was that such data will remain under the control of the domain specific government ministry or department, for e.g. MNREGA data with the Ministry of Rural Development, because the focus is not on data centralisation but rather on data linking. As long as such fractured data is linked and there is an agency that is responsible to link them, this data can be brought together. Such data is primarily for government agencies. But the government is opening up certain aspects of the data present with it for public consumption for research and entrepreneurial purposes.The UIDAI provides you access to your own data after paying a minimal fee. The procedure for such access is still developing.

Aadhaar’s Relational Arrangement with Big Data Scheme

The various Digital India schemes brought in by the government were elucidated during the workshop. It was pointed out that these schemes extend to myriad aspects of a citizen’s daily life and cover all the essential public services like health, education etc. This makes Aadhaar imperative even though the Supreme Court has observed that it is not mandatory for every citizen to have a unique identity number. The benefits of such identity mapping and the ecosystem being generated by it was also enumerated during the discourse. But the complete absence of any data ethics or data confidentiality principles make us unaware of the costs at which these benefits are being conferred on us. Apart from surveillance concerns, the knowledge gap being created between the citizens and the government was also flagged. Three main benefits touted to be provided by Aadhaar were then analysed. The first is the efficient delivery of services. This appears to be an overblown claim as the Aadhaar specific digitisation and automation does not affect the way in which employment will be provided to citizens through MNREGA or how wage payment delays will be overcome. These are administrative problems that Aadhaar and associated technologies cannot solve. The second is convenience to the citizens. The fallacies in this assertion were also brought out and identified. Before the Aadhaar scheme was rolled in, ration cards were issued based on certain exclusion and inclusion criteria.. The exclusion and inclusion criteria remain the same while another hurdle in the form of Aadhaar has been created. As India is still lacking in supporting infrastructure such as electricity, server connectivity among other things, Aadhaar is acting as a barrier rather than making it convenient for citizens to enroll in such schemes.The third benefit is fraud management. Here, a participant pointed out that this benefit was due to digitisation in the form of GPS chips in food delivery trucks and electronic payment and not the relational nature of Aadhaar. Aadhaar is only concerned with the linking up or relational part. About deduplication, it was pointed out how various government agencies have tackled it quite successfully by using technology different from biometrics which is unreliable at the best of times.

The Myths surrounding Aadhaar

The discussion also reflected on the fact that Aadhaar is often considered to be a panacea that subsumes all kinds of technologies to tackle leakages. However, this does not take into account the fact that leakages happen in many ways. A system should have been built to tackle those specific kinds of leakages, but the focus is solely on Aadhaar as the cure for all. Notably, participants who have been a part of the government pointed out how this myth is misleading and should instead be seen as the first step towards a more digitally enhanced country which is combining different technologies through one medium.

IndiaStack and FinTech Apps

What is India Stack?

The focus then shifted to another extremely important Big Data project, India Stack, being conceptualised and developed by a team of private developers called iStack, for the NPCI. It builds on the UID project, Jan Dhan Yojana and mobile services trinity to propagate and develop a cashless, presence-less, paperless and granular consent layer based on UID infrastructure to digitise India.

A participant pointed out that the idea of India Stack is to use UID as a platform and keep stacking things on it, such that more and more applications are developed. This in turn will help us to move from being a ‘data poor’ country to a ‘data rich’ one. The economic benefits of this data though as evidenced from the TAGUP report - a report about the creation of National Information Utilities to manage the data that is present with the government - is for the corporations and not the common man. The TAGUP report openly talks about privatisation of data.

Problems with India Stack

The granular consent layer of India Stack hasn’t been developed yet but they have proposed to base it on MIT Media Lab’s OpenPDS system. The idea being that, on the basis of the choices made by the concerned person, access to a person’s personal information may be granted to an agency like a bank. What is more revolutionary is that India Stack might even revoke this access if the concerned person expresses a wish to do so or the surrounding circumstances signal to India Stack that it will be prudent to do so. It should be pointed out that the the technology required for OpenPDS is extremely complex and is not available in India. Moreover, it’s not clear how this system would work. Apart from this, even the paperless layer has its faults and has been criticised by many since its inception, because an actual government signed and stamped paper has been the basis of a claim.. In the paperless system, you are provided a Digilocker in which all your papers are stored electronically, on the basis of your UID number. However, it was brought to light that this doesn’t take into account those who either do not want a Digilocker or UID number or cases where they do not have access to their digital records. How in such cases will people make claims?

A Digital Post-Dated Cheque: It’s Ramifications

A key change that FinTech apps and the surrounding ecosystem want to make is to create a digital post-dated cheque so as to allow individuals to get loans from their mobiles especially in remote areas. This will potentially cut out the need to construct new banks, thus reducing the capital expenditure , while at the same time allowing the credit services to grow. The direct transfer of money between UID numbers without the involvement of banks is a step to further help this ecosystem grow. Once an individual consents to such a system, however, automatic transfer of money from one’s bank accounts will be affected, regardless of the reason for payment. This is different from auto debt deductions done by banks presently, as in the present system banks have other forms of collateral as well. The automatic deduction now is only affected if these other forms are defaulted upon. There is no knowledge as to whether this consent will be reversible or irreversible. As Jan Dhan Yojana accounts are zero balance accounts, the account holder will be bled dry. The implication of schemes such as “Loan in under 8 minutes” were also discussed. The advantage of such schemes is that transaction costs are reduced.The financial institution can thus grant loans for the minimum amount without any additional enquiries. It was pointed out that this new system is based on living on future income much like the US housing bubble crash. Interestingly, in Public Distribution Systems, biometrics are insisted upon even though it disrupts the system. This can be seen as a part of the larger infrastructure to ensure that digital post-dated cheques become a success.

The Role of FinTech Apps

FinTech ‘apps’ are being presented with the aim of propagating financial inclusion. The Technology Advisory Group for Unique Projects report stated that as managing such information sources is a big task, just like electricity utilities, a National Information Utilities (NIU) should be set up for data sources. These NIUs as per the report will follow a fee based model where they will be charging for their services for government schemes. The report identified two key NIUs namely the National Payments Corporation of India (NPCI) and the Goods and Services Tax Network (GSTN). The key usage that FinTech applications will serve is credit scoring. The traditional credit scoring data sources only comprised a thin file of records for an individual, but the data that FinTech apps collect - a person’s UID number, mobile number. and bank account number all linked up, allow for a far more comprehensive credit rating. Government departments are willing to share this data with FinTech apps as they are getting analysis in return. Thus, by using UID and the varied data sources that have been linked together by UID, a ‘thick file’ is now being created by FinTech apps. Banking apps have not yet gone down the route of FinTech apps to utilise Big Data for credit scoring purposes.

The two main problems with such apps is that there is no uniform way of credit scoring. This distorts the rate at which a person has to pay interest. The consent layer adds another layer of complication as refusal to share mobile data with a FinTech app may lead to the app declaring one to be a risky investment thus, subjecting that individual to a higher rate of interest .

Regulation of FinTech Apps and the UID Infrastructure

India Stack and the applications that are being built on it, generate a lot of transaction metadata that is very intimate in nature. The privacy aspects of the UID legislation doesn't cover such data. The granular consent layer which has been touted to cover this still has to come into existence. Also, Big Data is based on sharing and linking of data. Here, privacy concerns and Big Data objectives clash. Big Data by its very nature challenges privacy principles like data minimisation and purpose limitation.The need for regulation to cover the various new apps and infrastructure which are being developed was pointed out.

Problems with UID

It has been observed that any problem present with Aadhaar is usually labelled as a teething problem, it’s claimed that it will be solved in the next 10 years. But, this begs the question - why is the system online right now?

Aadhaar is essentially a new data condition and a new exclusion or inclusion criteria. Data exclusion modalities as observed in Rajasthan after the introduction of biometric Point of Service (POS) machines at ration shops was found to be 45% of the population availing PDS services. This number also includes those who were excluded from the database by being included in the wrong dataset. There is no information present to tell us how many actual duplicates and how many genuine ration card holders were weeded out/excluded by POS.

It was also mentioned that any attempt to question Aadhaar is considered to be an attempt to go back to the manual system and this binary thinking needs to change. Big Data has the potential to benefit people, as has been evidenced by the scholarship and pension portals. However, Big Data’s problems arise in systems like PDS, where there is centralised exclusion at the level of the cloud. Moreover, the quantity problem present in the PDS and MNREGA systems persists. There is still the possibility of getting lesser grains and salary even with analysis of biometrics, hence proving that there are better technologies to tackle these problems. Presently, the accountability mechanisms are being weakened as the poor don’t know where to go to for redressal. Moreover, the mechanisms to check whether the people excluded are duplicates or not is not there. At the time of UID enrollment, out of 90 crores, 9 crore were rejected. There was no feedback or follow-up mechanism to figure out why are people being rejected. It was just assumed that they might have been duplicates.

Another problem is the rolling out of software without checking for inefficiencies or problems at a beta testing phase. The control of developers over this software, is so massive that it can be changed so easily without any accountability.. The decision making components of the software are all proprietary like in the the de-duplication algorithm being used by the UIDAI. Thus, this leads to a loss of accountability because the system itself is in flux, none of it is present in public domain and there are no means to analyse it in a transparent fashion..

These schemes are also being pushed through due to database politics. On a field study of NPR of citizens, another Big Data scheme, it was found that you are assumed to be an alien if you did not have the documents to prove that you are a citizen. Hence, unless you fulfill certain conditions of a database, you are excluded and are not eligible for the benefits that being on the database afford you.

Why is the private sector pushing for UIDAI and the surrounding ecosystem?

Financial institutions stand to gain from encouraging the UID as it encourages the credit culture and reduces transaction costs.. Another advantage for the private sector is perhaps the more obvious one, that is allows for efficient marketing of products and services..

The above mentioned fears and challenges were actually observed on the ground and the same was shown through the medium of a case study in West Bengal on the smart meters being installed there by the state electricity utility. While the data coming in from these smart meters is being used to ensure that a more efficient system is developed,it is also being used as a surrogate for income mapping on the basis of electricity bills being paid. This helps companies profile neighbourhoods. The technical officer who first receives that data has complete control over it and he can easily misuse the data. This case study again shows that instruments like Aadhaar and India Stack are limited in their application and aren’t the panacea that they are portrayed to be.

A participant pointed out that in the light of the above discussions, the aim appears to be to get all kinds of data, through any source, and once you have gotten the UID, you link all of this data to the UID number, and then use it in all the corporate schemes that are being started. Most of the problems associated with Big Data are being described as teething problems. The India Stack and FinTech scheme is coming in when we already know about the problems being faced by UID. The same problems will be faced by India Stack as well.

Can you opt out of the Aadhaar system and the surrounding ecosystem?

The discussion then turned towards whether there can be voluntary opting out from Aadhaar. It was pointed out that the government has stated that you cannot opt out of Aadhaar. Further, the privacy principles in the UIDAI bill are ambiguously worded where individuals only have recourse for basic things like correction of your personal information. The enforcement mechanism present in the UIDAI Act is also severely deficient. There is no notification procedure if a data breach occurs. . The appellate body ‘Cyber Appellate Tribunal’ has not been set up in three years.

CCTNS: Big Data and its Predictive Uses

What is Predictive Policing?

The next big Big Data case study was on the Crime and Criminal Tracking Network & Systems (CCTNS). Originally it was supposed to be a digitisation and interconnection scheme where police records would be digitised and police stations across the length and breadth of the country would be interconnected. But, in the last few years some police departments of states like Chandigarh, Delhi and Jharkhand have mooted the idea of moving on to predictive policing techniques. It envisages the use of existing statistical and actuarial techniques along with many other tropes of data to do so. It works in four ways: 1. By predicting the place and time where crimes might occur; 2. To predict potential future offenders; 3. To create profiles of past crimes in order to predict future crimes; 4. Predicting groups of individuals who are likely to be victims of future crimes.

How is Predictive Policing done?

To achieve this, the following process is followed: 1. Data collection from various sources which includes structured data like FIRs and unstructured data like call detail records, neighbourhood data, crime seasonal patterns etc. 2. Analysis by using theories like the near repeat theory, regression models on the basis of risk factors etc. 3. Intervention

Flaws in Predictive Policing and questions of bias

An obvious weak point in the system is that if the initial data going into the system is wrong or biased, the analysis will also be wrong. Efforts are being made to detect such biases. An important way to do so will be by building data collection practices into the system that protect its accuracy. The historical data being entered into the system is carrying on the prejudices inherited from the British Raj and biases based on religion, caste, socio-economic background etc.

One participant brought about the issue of data digitization in police stations, and the impact of this haphazard, unreliable data on a Big Data system. This coupled with paucity of data is bound to lead to arbitrary results. An effective example was that of black neighbourhoods in the USA. These are considered problematic and thus they are policed more, leading to a higher crime rate as they are arrested for doing things that white people in an affluent neighbourhood get away with. This in turn further perpetuates the crime rate and it becomes a self-fulfilling prophecy. In India, such a phenomenon might easily develop in the case of migrants, de-notified tribes, Muslims etc. A counter-view on bias and discrimination was offered here. One participant pointed out that problems with haphazard or poor quality of data is not a colossal issue as private companies are willing to fill this void and are actually doing so in exchange for access to this raw data. It was also pointed out how bias by itself is being used as an all encompassing term. There are multiplicities of biases and while analysing the data, care should be taken to keep it in mind that one person’s bias and analysis might and usually does differ from another. Even after a computer has analysed the data, the data still falls into human hands for implementation.

The issue of such databases being used to target particular communities on the basis of religion, race, caste, ethnicity among other parameters was raised. Questions about control and analysis of data were also discussed, i.e. whether it will be top-down with data analysis being done in state capitals or will this analysis be done at village and thana levels as well too. It was discussed as topointed out how this could play a major role in the success and possible persecutory treatment of citizens, as the policemen at both these levels will have different perceptions of what the data is saying. . It was further pointed out, that at the moment, there’s no clarity on the mode of implementation of Big Data policing systems. Police in the USA have been seen to rely on Big Data so much that they have been seen to become ‘data myopic’. For those who are on the bad side of Big Data, in the Indian context, laws like preventive detention can be heavily misused.There’s a very high chance that predictive policing due to the inherent biases in the system and the prejudices and inefficiency of the legal system will further suppress the already targeted sections of the society. A counterpoint was raised and it was suggested that contrary to our fears, CCTNS might lead to changes in our understanding and help us to overcome longstanding biases.

Open Knowledge Architecture as a solution to Big Data biases?

The conference then mulled over the use of ‘Open Knowledge’ architecture to see whether it can provide the solution to rid Big Data of its biases and inaccuracies if enough eyes are there. It was pointed out that Open Knowledge itself can’t provide foolproof protection against these biases as the people who make up the eyes themselves are predominantly male belonging to the affluent sections of the society and they themselves suffer from these biases.

Who exactly is Big Data supposed to serve?

The discussion also looked at questions such as who is this data for? Janata Information System (JIS), is a concept developed by MKSS where the data collected and generated by the government is taken to be for the common citizens. For e.g. MNREGA data should be used to serve the purposes of the labourers. The raw data as is available at the moment, usually cannot be used by the common man as it is so vast and full of information that is not useful for them at all. It was pointed out that while using Big Data for policy planning purposes, the actual string of information that turned out to be needed was very little but the task of unravelling this data for civil society purposes is humongous. By presenting the data in the right manner, the individual can be empowered. The importance of data presentation was also flagged. It was agreed upon that the content of the data should be for the labourer and not a MNC, as the MNC has the capability to utilise the raw data on it’s own regardless.

Concerns about Big Data usage

Participants pointed out that privacy concerns are usually brushed under the table due to a belief that the law is sufficient or that the privacy battle has already been lost.
In the absence of knowledge of domain and context, Big Data analysis is quite limited. Big Data’s accuracy and potential to solve problems needs to be factually backed.
The narrative of Big Data often rests on the assumption that descriptive statistics take over inferential statistics, thus eliminating the need for domain specific knowledge. It is claimed that the data is so big that it will describe everything that we need to know.
Big Data is creating a shift from a deductive model of scientific rigour to an inductive one. In response to this, a participant offered the idea that troves of good data allow us to make informed questions on the basis of which the deductive model will be formed. A hybrid approach combining both deductive and inductive might serve us best.
The need to collect the right data in the correct format, in the right place was also expressed.

Potential Research Questions & Participants’ Areas of Research

Following this discussion, participants brainstormed to come up with potential areas of research and research questions. They have been captured below:

Big Data, Aadhaar and India Stack:

Has Aadhaar been able to tackle illegal ways of claiming services or are local negotiations and other methods still prevalent?
Is the consent layer of India Stack being developed in a way that provides an opportunity to the UID user to give informed consent? The OpenPDS and its counterpart in the EU i.e. the My Data Structure were designed for countries with strong privacy laws. Importantly, they were meant for information shared on social media and not for an individual’s health or credit history. India is using it in a completely different sphere without strong data protection laws. What were the granular consent layer structures present in the West designed for and what were they supposed to protect?
The question of ownership of data needs to be studied especially in context of a globalised world where MNCs are collecting copious amounts of data of Indian citizens. What is the interaction of private parties in this regard?

Big Data and Predictive Policing:

How are inequalities being created through the Big Data systems? Lessons should be taken from the Western experience with the advent of predictive policing and other big data techniques - they tend to lead to perpetuation of the current biases which are already ingrained in the system.
It was also pointed out how while studying these topics and anything related to technology generally, we become aware of a divide that is present between the computational sciences and social sciences. This divide needs to be erased if Big Data or any kind of data is to be used efficiently. There should be a cross-pollination between different groups of academics. An example of this can be seen to be the ‘computational social sciences departments’ that have been coming up in the last 3-4 years.
Why are so many interim promises made by Big Data failing? A study of this phenomenon needs to be done from a social science perspective. This will allow one to look at it from a different angle.

Studying Big Data:

What is the historical context of the terms of reference being used for Big Data? The current Big Data debate in India is based on parameters set by the West. For better understanding of Big Data, it was suggested that P.C. Mahalanobis’ experience while conducting the Indian census, (which was the Big Data of that time) can be looked at to get a historical perspective on Big Data. This comparison might allow us to discover questions that are important in the Indian context. It was also suggested that rather than using ‘Big Data’ as a catchphrase to describe these new technological innovations, we need to be more discerning.
What are the ideological aspects that must be considered while studying Big Data? What does the dialectical promise of technology mean? It was contended that every time there is a shift in technology, the zeitgeist of that period is extremely excited and there are claims that it will solve everything. There’s a need to study this dialectical promise and the social promise surrounding it.
Apart from the legitimate fears that Big Data might lead to exclusion, what are the possibilities in which it improve inclusion too?
The diminishing barrier between the public and private self, which is a tangent to the larger public-private debate was mentioned.
How does one distinguish between technology failure and process failure while studying Big Data?

Big Data: A Friend?

In the concluding session, the fact that the Big Data moment cannot be wished away was acknowledged. The use of analytics and predictive modelling by the private sector is now commonplace and India has made a move towards a database state through UID and Digital India. The need for a nuanced debate, that does away with the false equivalence of being either a Big Data enthusiast or a luddite is crucial.

A participant offered two approaches to solving a Big Data problem. The first was the Big Data due process framework which states that if a decision has been taken that impacts the rights of a citizen, it needs to be cross examined. The efficacy and practicality of such an approach is still not clear. The second, slightly paternalistic in nature, was the approach where Big Data problems would be solved at the data science level itself. This is much like the affirmative algorithmic approach which says that if in a particular dataset, the data for the minority community is not available then it should be artificially introduced in the dataset. It was also suggested that carefully calibrated free market competition can be used to regulate Big Data. For e.g. a private personal wallet company that charges higher, but does not share your data at all can be an example of such competition.

Another important observation was the need to understand Big Data in a Global South context and account for unique challenges that arise. While the convenience of Big Data is promising, its actual manifestation depends on externalities like connectivity, accurate and adequate data etc that must be studied in the Global South.

While the promises of Big Data are encouraging, it is also important to examine its impacts and its interaction with people's rights. Regulatory solutions to mitigate the harms of big data while also reaping its benefits need to evolve.