Workshop on Open Data for Human Development - Sessions Report

CIS facilitated a workshop on open data policy and tools for government officials from Sikkim, Meghalaya, and Tripura, and those from Bhutan and Maldives, in June 2015. The workshop was co-facilitated with Akvo, DataMeet, and Mapbox, and was supported by the International Centre for Human Development of UNDP India. Here we share the workshop report and other related documents. The report was written by Sumandro, along with Amitangshu Acharya of Akvo.


Day 01, June 03, 2015

The first day of the workshop began with Mr. Prem Das Rai, Honourable MP, Lok Sabha, Sikkim, briefly addressing the participants. He contextualised the workshop against the background of technological changes and emerging opportunities of governance through effective use of data. Dr. A.K. Shiva Kumar, Director of the International Centre for Human Development (IC4HD), UNDP India, welcomed the participants and initiated a panel discussion on data, ICTs and governance. The panel had three speakers: Mr. Srivatsa Krishna, IAS and Secretary, Department of Information Technology, Biotechnology, and Science and Technology, Government of Karnataka; Dr. B. Gangaiah, Additional Director General, Centre for Good Governance, Hyderabad; and Sunil Abraham, Executive Director, the Centre for Internet and Society, Bengaluru and Delhi.

Mr. Krishna spoke about the strategies adopted in setting up IT and ITES clusters in Cyberabad, Andhra Pradesh and in Bengaluru, Karnataka. He noted that tax cuts and accelerated land allocation are key to incentivising the private sector to set up IT and ITES units. Another major concern is ensuring a supply of good-quality IT workers. He also emphasised the need for governments to build effective public-facing electronic services - either in the form of Nemmadi Kendras, where people can physically go to access various government services, or in the form of mobile applications that bring different civic services into one digital interface, like Bangalore One and Karnataka Mobile One.

Dr. Gangaiah gave an extensive overview of the idea and applications of open data in the contexts of governance and development. He noted that government data (in India) often draws criticism for its quality, as well as for its lack of availability in the public domain. The key problems he identified in opening up government data in India are that the data is most often collected by a government agency for a very specific purpose, and the steps required to ensure its wider circulation and use (such as documentation and ensuring interoperability of data) are not taken; and that government agencies most often consider the collected data a source of power, and hence something to be retained and not disclosed in full detail. The slides from Dr. Gangaiah’s presentation can be accessed here.

Mr. Abraham’s presentation highlighted several areas of concern when deploying data-driven techniques and solutions for human development challenges. He described how the current phase of open data discussions by central and state governments in India represents the third phase of ‘openness’ in governance in India. While the first phase focused on the use of Free/Libre Open Source Software in building electronic governance applications and information systems, the second phase involved embracing open software standards and formats across government information systems and IT solutions. With the third phase of openness focusing on opening up data and information, both of these earlier foci, free and open source software and open standards and interoperability, are returning as complementary components to ensure seamless publication of open government data. However, he argued, when deploying data-driven techniques and solutions for human development challenges, it is imperative to remember three things: 1) collection of data is a time- and effort-consuming task, and hence must be optimised so as not to take away time and effort from actual developmental interventions, 2) bad quality of development data is a structural problem, often emanating from the data not being useful to the person actually collecting it, and 3) availability of data does not automatically change or open up the process of decision-making.

The second session of the day started with a detailed presentation by Mr. T. Samdup, Joint Director, Department of Information Technology, Government of Sikkim, on the context, the making, and the salient features of the Sikkim Open Data Acquisition and Accessibility Policy (SODAAP), 2014. He explained that the Policy mandates setting up an online state data portal that will host all data sets generated by various agencies of the Government of Sikkim, and making such data available, subject to concerns of privacy and security, to all state government agencies and to citizens in general. The key needs driving this Policy have been the availability of accurate and timely data on various aspects of human development in the state, and the reduction of expenses and confusion caused by duplicated data collection efforts. The slides from Mr. Samdup’s presentation can be accessed here.

The presentation by Mr. Samdup was followed by one by Mr. Sumandro Chattapadhyay of the Centre for Internet and Society on an initial set of questions and concerns that should be addressed by the implementation plan of the SODAAP. He took a detailed look at the four objectives mentioned in the Policy document, and discussed what tasks, decisions, and deliberations are needed to achieve each of those. In conclusion, he listed a set of core components of the implementation process that must also be discussed in the implementation plan document, namely: 1) governance and oversight structure for implementation, 2) incentivising government personnel for opening up data across departments, including financial support for the same, 3) metadata, documentation of data collection process, and implementing unique identifiers, and 4) developing processes of sharing of data between the Union and the state government, especially in reference to national Management Information Systems. The slides from Mr. Chattapadhyay’s presentation can be accessed here.

These presentations were followed by a general discussion on various aspects of the SODAAP and the challenges to be overcome during its implementation. This session provided a general introduction to the SODAAP, especially for workshop participants who are not from Sikkim, and also set up the key questions to be discussed and answered while preparing the first draft of the SODAAP implementation plan.

After the second session ended, the participants were asked to individually write down the key challenges they identified for the implementation process of the SODAAP. These responses were compiled by Sumandro and made available as a reference document for the implementation plan. The chart below summarises these responses.


In the third session of the day, Joy Ghosh and Amitangshu Acharya of Akvo talked about the challenges of collecting structured born-digital data from the grassroots level, and how using mobile-based applications, like Akvo FLOW, can address such challenges. Akvo FLOW runs on all Android-based smartphones, and allows ground-level development workers to enter data directly into the phone, as well as collect related materials like GPS location and photographs, based upon a form that is centrally designed and downloaded onto their phones by the development workers. The data is then kept on the phone till it is sent back to the main server, where data coming from all the different surveyors using the same form is shown on a map-based interface for easy navigation of the data across space and time. In this session, Mr. Acharya first introduced the participants to the issues around digital data collection, touching upon ethics, capacity, and the prioritisation of the data collection process alongside tools. Mr. Ghosh then took over to describe the functioning of the tool, and then distributed several smartphones, pre-loaded with Akvo FLOW, among the participants for an applied data collection exercise where the participants walked around the NIAS campus and collected data using the FLOW interface. They returned to see their data mapped and analysed on the online dashboard. Their presentation can be accessed here.


Day 02, June 04, 2015

The second day started with two consecutive presentations by Mr. Thejesh GN of DataMeet, and Mr. Sivaram Ramachandran of Mapbox on the tools and techniques for working with statistical data and with geospatial data, respectively. The former presentation took the participants through the stages of working with statistical data: from collecting and finding data, to cleaning and validating, and finally analysing the data. Various free and open source tools for each of these stages were also discussed in brief, such as PDF Tables and Tabula for converting PDF tables to spreadsheets, OpenRefine for cleaning data, and RAW and Datawrapper for generating web-based dynamic charts. The latter presentation explored the various ways in which geospatial data can be used to inform and support decision-making, and the tools that can be used to render and present geospatial data in forms that are accessible for decision-makers within government and also for individual users. Mr. Ramachandran presented the various free and open source tools available for working with geospatial data, such as Mapbox Studio, Quantum GIS, and Leaflet JS. He also gave a brief introduction to OpenStreetMap, the wiki-like user-contributed global map data platform. The two presentations can be accessed here and here, respectively. After this session, the participants were divided into two groups. One group engaged further with tools and techniques of working with statistical and geospatial data. The second group took part in a series of exercises to identify and document the current data flows, and the bottlenecks in them, across several key departments of the Government of Sikkim.

The group engaging in applications of various software tools for working with statistical and geospatial data was facilitated by Mr. Thejesh and Mr. Ramachandran. This group worked with a sample statistical data set, taking it across the stages of finding, cleaning, analysing, and visualising as discussed earlier. The participants used the online version of Tableau to create dynamic charts. Afterwards, they were introduced to various methods of contributing data to and downloading data from OpenStreetMap, including directly adding data points through the online editor named iD. The participants went out onto the NIAS campus to collect geospatial data about various natural and human-made features of the campus, such as trees, pathways, etc.

The second group, working on documenting data flows and identifying bottlenecks, was facilitated by Mr. Chattapadhyay, Mr. Acharya, and Ms. Rajashi Mukherjee from Akvo. The group was further divided into department-wise teams, one each for the Department of Health, the Department of Economic Statistics, Monitoring, and Evaluation (DESME), the Human Resource Development Department (HRDD), and the representatives from Gram Panchayat Units. The exercise began with each of the teams discussing and drawing the flow of data for one of the major data sets maintained by the agency concerned. The data flows were drawn by identifying the key moments of processing (such as primary collection, verification, digitisation, analysis, storage, reporting, etc.), the actors involved in each moment, the tools and data formats relevant for each moment, and which agency finally stores and uses the data. Once these processes were described on paper, the next part of the exercise focused on identifying which challenges exist at which part of these data flows. This was followed by a ranking of all these challenges, in terms of how critically they affect the ability of the agency concerned to use and share the final data. All the teams worked separately, and conversed with the facilitators as needed, to develop the data flow diagrams and identify the key challenges.

The major common challenges noted by these teams were: 1) delays in collection, verification, and digitisation of data, 2) inability of state government agencies to access data collected as part of centrally-funded welfare schemes, and 3) parallel systems of data collection employed by different departments leading to duplication of efforts and data.

Several interesting insights came through in this exercise. For example, data related to education is collected both by the HRDD and by the Sarva Shiksha Abhiyan (SSA). However, SSA data is not shared with the HRDD. Also, the HRDD publishes all its data, including the names of students, on its website, making it publicly available. One of the data challenges identified by the HRDD was the difficulty of tracking whether scholarship money is reaching the eligible students. When a student moves from one school to another, the records do not get updated easily. This leads to different schools continuing to receive funds for the same scholarship. Aligning school records is important to prevent such leakages.
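The scholarship leakage described above can be illustrated with a minimal Python sketch. It assumes that every disbursement record carries a student identifier shared across schools; the field names (`student_id`, `school`, `year`, `amount`) and all sample values are hypothetical and do not come from the HRDD or any workshop data set. The sketch simply flags students for whom more than one school received funds in the same year.

```python
from collections import defaultdict

# Hypothetical disbursement records; field names and values are illustrative
# assumptions, not data or a schema from the HRDD.
disbursements = [
    {"student_id": "S001", "school": "School A", "year": 2015, "amount": 1200},
    {"student_id": "S001", "school": "School B", "year": 2015, "amount": 1200},
    {"student_id": "S002", "school": "School A", "year": 2015, "amount": 1200},
    {"student_id": "S003", "school": "School C", "year": 2014, "amount": 1000},
    {"student_id": "S003", "school": "School D", "year": 2015, "amount": 1000},
]


def find_duplicate_disbursements(records):
    """Return {(student_id, year): sorted list of schools} for every student
    whose scholarship was paid to more than one school in the same year."""
    schools_by_key = defaultdict(set)
    for record in records:
        key = (record["student_id"], record["year"])
        schools_by_key[key].add(record["school"])
    return {
        key: sorted(schools)
        for key, schools in schools_by_key.items()
        if len(schools) > 1
    }


duplicates = find_duplicate_disbursements(disbursements)
for (student_id, year), schools in sorted(duplicates.items()):
    print(f"{student_id} ({year}): funds sent to {', '.join(schools)}")
```

A check of this kind only works if the same identifier follows the student from one school to the next, which is one reason a shared, unique identifier matters when school records are aligned.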

After these two group exercises, all the participants gathered again so that the data flow diagrams and key challenges documented by the departmental teams could be presented to the entire group. Each team presented its data flow diagram, and discussed challenges and opportunities. This created a context for different departments to discuss what kind of data they often needed from each other, and how there was neither a platform for inter-departmental discussion on such issues, nor systems that facilitate the same. There was an agreement that an open data platform could address this issue to a great extent. The discussion also highlighted that the most significant data collecting government agency in Sikkim is DESME; however, it does not publish any data in machine-readable formats, and does not even have a website.

This data flow and bottleneck exercise made it very clear that there are several data production and collection processes in place in Sikkim, and also systems that are digesting, processing, and reporting data. Hence, implementing the open data policy will need to negotiate with such complexity.

In the final session of the day, Dr. Shiban Ganju made a presentation on applications of open data in healthcare. His talk focused on how converting medical information about a patient, stored at various locations, into a combined and shareable Electronic Health Record can spare patients and medical practitioners the duplication of medical tests, ease the movement of patients from one medical institution to another, and give a clearer macro-level understanding of key public health indicators. Dr. Ganju discussed the open health data initiatives in the United States, in the United Kingdom, and in Sweden, before discussing the challenges faced in implementing interoperable standards for open health data in India. The slides from Dr. Ganju’s presentation can be accessed here.


Day 03, June 05, 2015

The final day started with a set of presentations from Mr. Garab Dorji, Deputy Chief IT Officer, Office of the Prime Minister, Government of Bhutan, Thimphu; Mr. Birendra Tiwari, Senior Informatic Officer, Department of Information Technology, Government of Meghalaya; and Mr. Milan Chhetri of Melli Dara Paiyong Gram Panchayat Unit, Sikkim, on various technological solutions being explored, implemented, and practised by the respective governments and administrative units.

Mr. Milan Chhetri’s presentation was on the operationalisation of Cyber Villages in Sikkim, which had been initiated in 2013 with support from the Honourable Chief Minister of Sikkim, Pawan Kumar Chamling. Cyber Villages aim to address the digital divide by empowering local village units with handheld data devices to collect data from every household and connect the same to a real-time dashboard. All village-related data is expected to be available in one place. At the same time, as part of an e-governance initiative, SMS-based updates on Government programmes and services will be sent to all villagers. Mr. Chhetri ended his presentation with a short promotional video of the concept, which is embedded below.


The second session of the day started with a presentation from Mr. D. P. Misra of the National Data Sharing and Accessibility Policy - Programme Management Unit (NDSAP-PMU), National Informatics Centre, Government of India. The presentation focused on the process of implementation of the National Data Sharing and Accessibility Policy approved by the Government of India in 2012. Mr. Misra has played a key role in the NDSAP-PMU, which was entrusted with the development of the national open government data platform of India and with setting up the procedures and standards for publication of government data by various central and state government agencies through that Platform. His talk described the technical solutions designed by the NDSAP-PMU to make data accessible to end-users in various file formats, to make visualisation of available data easy, and to make it possible for users to comment upon existing data and to request data that is unavailable at the moment. Further, he emphasised the need for outreach initiatives by the government so as to build awareness and activities around the available open government data. The slides from Mr. Misra’s presentation can be accessed here.

The presentation by Mr. Misra was followed by a group exercise where various teams, self-selected by the participants, worked on different sections of the SODAAP implementation plan to put together ideas and plans for the first draft of the document. Five groups were formed and each of them worked on a separate section of the implementation plan: 1) Governance Framework and Budgetary Support, 2) Data Inventory and Negative List, 3) Data Acquisition and Open Standards, 4) Data Publication Process, Licenses, and Timeframes, and 5) Awareness, Capacity, and Demand of Data. The initial section titled ‘Introduction to the Policy and its Principles’ was put together by Vashistha Iyer on the basis of the SODAAP document. The technical section on the ‘Sikkim Open Data Portal’ was left out of this drafting exercise, as it was decided that the representatives of the Department of Information Technology will prepare this section on the basis of their interactions with the NDSAP-PMU later in June.

The drafting session was followed by presentations by each team working on a separate section, and quick feedback from all the participants. These drafts, along with the feedback, have been compiled by Mr. Chattapadhyay and shared with the officials from the Government of Sikkim for their further discussion and eventual finalisation of the SODAAP implementation plan document.

The workshop ended with a round of final words and sharing of learning by the participants, and a vote of thanks on behalf of the organisers.