You are here: Home / Internet Governance / Comments on the Statistical Disclosure Control Report

Comments on the Statistical Disclosure Control Report

This submission presents comments by the Centre for Internet and Society, India (“CIS”) on the ​Statistical Disclosure Control Report published on March 30th by Ministry of Statistics and Programme Implementation.


1. PRELIMINARY

This submission presents comments by the Centre for Internet and Society, India (“CIS”) on the ​Statistical Disclosure Control Report published on March 30th by Ministry of Statistics and Programme Implementation.

CIS is thankful for the opportunity to put forth its views.
This submission is divided into three main parts. The first part, ‘Preliminary’, introduces the document; the second part, ‘About CIS’, is an overview of the organization; and, the third part contains the ‘Comments’.

2. ABOUT CIS

CIS is a non-​profit organisation that undertakes interdisciplinary research on internet and digital technologies from policy and academic perspectives. The areas of focus include digital accessibility for persons with diverse abilities, access to knowledge, intellectual property rights, openness (including open data, free and open source software, open standards, open access, open educational resources, and open video), internet governance, telecommunication reform, freedom of speech and expression, intermediary liability, digital privacy, and cybersecurity.​

CIS values the fundamental principles of justice, equality, freedom and economic development. This submission is consistent with CIS' commitment to these values, the safeguarding of general public interest and the protection of India's national interest at the international level. Accordingly, the comments in this submission aim to further these principles.

3. Comments

3.1 General Comments

As a non-profit organisation we recognize the importance of the efforts by the Ministry of Statistics and Programme Implementation (MoSPI) to make the  data you collect available to the public in open formats with relevant information about reliability of statistical estimates.

We at CIS have recently released a report titled “Information Security Practices of Aadhaar (or lack thereof): A documentation of public availability of Aadhaar Numbers with sensitive personal financial information”. We encountered several central and state government departments collecting socioeconomic data from citizens, linking it with Aadhaar and even publishing them in exportable data formats like EXCEL and MS ACCESS Databases.  While we understand this issue primarily concerns to Unique Identification Authority of India (UIDAI), the lack of standards around information/statistical disclosure are a general threat to transparency in a democracy and privacy of individuals. Going through the report we understand the committee is unable to prescribe a standard for other ministries and departments until they try and pilot these standards within Ministry of Statistics and Programme Implementation. This delay in prescribing the standards can be really dangerous in the current circumstances of massive data collection by government departments and linking all the databases with a unique identifier, Aadhaar Number.  At the same time we understand the importance of data dissemination to be carried out and we recommend the following for improving the standards around data disclosure control.

3.2 Integrity of Information and Data

We agree with the committee that the error rates need to be kept in mind while designing practices to convert raw data. But we request the process of changes being made be actively measured and documented. In case of errors being computed, guidelines can be made to decrease the possibilities of misinterpretation of errors causing loss of integrity of information. Statistics are important for decision making in governance, errors in computations can be biased towards millions of people. Statistical biases are important to be looked into while converting data from its raw format to make sure there are no damage caused by information.

3.3 Data Security

One of the important issues around storage and publication of Aadhaar information is the lack of masking standards. With the availability of data from multiple departments, it is possible to reconstruct identification details by linking data from multiple databases. It is recommended to bring masking standards while personally identifiable micro data is being published. There is an urgent need for departments to also look at auditing access to information and tracking sharing of information. It is recommended the department digitally signs all the information and documents being published or shared by them to keep track of who had accessed the information and verifying the authenticity of information.

We request the department to define what exactly is “usage for statistical purposes only” and recommend standards to control and restrict usage of information for this purpose. It is important they design frameworks or mechanisms to allow others to report violations around this. This process should be transparent and documented heavily.

3.4 Anonymization of microdata

We recommend the data being collected be anonymized at source to evade the possibility of the accidental disclosure of personally identifiable information. While the current anonymization efforts have been helpful, with steady increase in data mining and classification algorithms and practices it is recommended to evolve the standards around this area.

3.5 Data Dissemination

Data dissemination is an important aspect for district statistics officers, we recommend they actively communicate their work through monthly newsletters, quarterly workshops to help improve the conversations around statistics and at the same time engage with the users who would benefit from the data.

We also recommend that data when being published includes metadata of collection, modification, storage and other important information. Also the information needs to be published in open formats which does not require proprietary software to be used to open them. At the same time data should be published in multiple formats like CSV, XLS, PDF,

The committee also recognizes the need for having data users part of discussions around important decisions and be part of committees. We would like the department to recognize our efforts and consider us for future committee representations.

 

Thank you for this opportunity and we look forward to work with you in future.

Document Actions