Analysis of Konkani Wikipedia: Facts & Challenges

Posted by Nitika Tandon at Apr 30, 2013 05:00 AM
Nitika Tandon, in this blog post, provides an analysis of the Konkani Wikipedia. She reflects on the challenges faced by Konkani Wikipedia and identifies four possible solutions.

Konkani is a language spoken primarily by people living in Goa and in the neighbouring states on the western coast of India (also known as the Konkan belt), with some pockets in Maharashtra, Karnataka and Kerala. These neighbouring states are places to which speakers from Goa may have migrated over the past five centuries (Garry, 2001).[1] Each region has a different dialect, pronunciation style, vocabulary and tone, and sometimes significant differences in grammar.

The total number of Konkani speakers seems to have remained remarkably stable for over a century, as the census reports over the years bear out. The 2001 Census of India put the number of Konkani speakers in India at 2,489,015. Of these, around 6 lakh were in Goa, 7 lakh in Karnataka, 3 lakh in Maharashtra and 6 lakh in Kerala; the rest live outside India, either as expatriates or as citizens of other countries (NRIs).[2]

Goa has a high literacy rate of 87.4 per cent.[3] Goa is also one of the top three states/union territories with the highest computer density. In Goa, about 31.1 per cent of households use computers, while 12.7 per cent of households use computers with internet.[4]

Konkani Wikipedia is currently in its incubation stage with very few editors. About 86 articles have been added, but much work needs to be done 1) to push it as a live Wikipedia project and 2) to improve the quality and quantity of articles. Konkani Wikipedia promotes editing in all scripts (Roman, Devanagari, Kannada) equally.

Challenges faced by Konkani Wikipedia

  1. Konkani Wikipedia has grown very little in the 7 years since the inception of the project (in 2006). It is still an extremely small project with barely 86 articles. The project received its maximum number of edits in 2007-2008, and since then the level of activity has gradually decreased, with negligible edits in 2012-2013.
  2. Only a handful of volunteers have ever edited Konkani Wikipedia, and they too seem to have lost the interest or inclination to sustain their edits. Only 16 editors have a total of 10 edits or more on Konkani Wikipedia. In addition, no virtual or physical meeting of any kind has ever been organised by or for the volunteers. Nor had any outreach event been organised for Konkani Wikipedia until Dec 2012.[5]
  3. The use of multiple scripts aggravates the massive challenges the project currently faces. Konkani does not have a unique script of its own, and hence the scripts of other languages native to each region are used. Konkani speakers are spread across different states and use multiple scripts: Kannada, Devanagari, Roman and Perso-Arabic. There is no single common script; Konkani thus has the unique distinction of being written in four scripts.
  4. The multiple-script issue is tied to religious aspects as well. The Goan Hindus use the Devanagari script in their writings, while the Goan Christians use the Roman script. The Saraswats of Karnataka use the Devanagari script in North Kanara district and the Kannada script in Udupi and South Kanara. The Malayalam script is used in Kerala, but there is now a move to use the Devanagari script.

For these reasons and more, Konkani Wikipedia has struggled to grow since its inception. It has been in the incubation stage for the past 7 years with an extremely small number of volunteers.

Fact Sheet

Parameter                                                   Number
No. of scripts used in current incubation project           3 (Roman, Devanagari, Kannada)
No. of articles in Devanagari script                        52
No. of articles in Roman script                             32
No. of articles in Kannada script                           2
No. of system messages translated in Devanagari script      126 (including 30 core messages)
No. of system messages translated in Roman script           500
No. of system messages translated in Kannada script         0
No. of editors with more than 10 edits                      16
No. of editors with less than 10 edits                      32
No. of articles                                             86
No. of redirects                                            4
No. of revisions                                            1113 (including 254 minor edits)
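
As a quick check on the fact sheet, the per-script article counts can be added up and turned into shares. The sketch below only restates the numbers from the fact sheet; the variable names are illustrative.

```python
# Article counts per script, copied from the fact sheet.
articles_by_script = {"Devanagari": 52, "Roman": 32, "Kannada": 2}
reported_total = 86  # "No. of articles" in the fact sheet

# The per-script counts should add up to the reported total.
total = sum(articles_by_script.values())

# Share of each script as a percentage of all articles, rounded to one decimal.
shares = {
    script: round(100 * count / total, 1)
    for script, count in articles_by_script.items()
}

for script, share in shares.items():
    print(f"{script}: {share} per cent of {total} articles")
```

The counts do add up to 86, with roughly three in five articles in Devanagari and Kannada barely represented.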

Multiple script issue faced by Konkani Wikipedia

The development of Konkani Wikipedia is held back by the use of multiple scripts. The main problem faced by this wiki is enabling users to access and edit all articles in the different scripts.

Similar challenge faced by other wikis: A similar problem was faced by Kashmiri Wikipedia, which used Pasho, Sharada and Devanagari scripts. Punjabi has the Gurmukhi and Shahmukhi scripts, of which the former is used in India and the latter in Pakistan. The Chinese language has two major writing systems: simplified and traditional Chinese. Other Wikipedias that have faced a similar challenge include the Uyghur, Azerbaijani and Korean Wikipedias.

Possible solutions: Let’s look at the possible solutions to this problem of multiple scripts, how some of these language Wikipedias have tackled it in the past, and what might or might not work for Konkani Wikipedia in particular.

Possible Solution 1: Automatic conversion system.

  • A plug-in could be built into the server end of the language Wikipedia to automatically transliterate content from one script into another.
  • Other wikis using this solution are the Chinese, Serbian, Kazakh and Kurdish Wikipedias. For example, an automatic conversion system has been running successfully on Chinese Wikipedia since 2004 and has been well received by the community. In addition to Chinese Wikipedia, Chinese Wiktionary, Wikiquote and Wikibooks also have conversion systems.
  • Automatic transliteration from one script to another might not work for Konkani Wikipedia, as there are dialectal differences and there is no ready tool available to convert one script to another (for example, to transliterate from Roman to Devanagari or from Roman to Kannada).

Possible Solution 2: Partial automatic conversion system.

  • A plug-in could be used that transliterates between at least two of the writing systems in use, rather than all of them.
  • Other wikis using this solution are the Tajik, Uzbek and Gan Wikipedias. For example, Tajik Wikipedia currently has an automatic conversion system for two of its writing systems (Cyrillic and Latin) but not for Perso-Arabic.
  • This could be a possible solution for Konkani Wikipedia if the community decides that they’d like to have transliteration tools installed at least for the Indian scripts.
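
To make the idea of a tool for the Indian scripts concrete, here is a minimal sketch of a Devanagari-to-Kannada converter. It relies on the fact that the Unicode blocks for Devanagari (U+0900 to U+097F) and Kannada (U+0C80 to U+0CFF) are laid out in parallel, so most characters map by a fixed offset. The function name is hypothetical, and this is not the converter used by any existing Wikipedia. It only swaps characters: it does not adjust spelling conventions or dialect differences, and it does nothing for the Roman script.

```python
import unicodedata

# Bounds of the Devanagari block and the fixed distance to the Kannada block.
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F
KANNADA_OFFSET = 0x0C80 - 0x0900


def devanagari_to_kannada(text: str) -> str:
    """Map each Devanagari character to the Kannada character at the same
    position in its block. Characters with no assigned Kannada counterpart,
    and all non-Devanagari characters, are left unchanged."""
    converted = []
    for ch in text:
        code_point = ord(ch)
        if DEVANAGARI_START <= code_point <= DEVANAGARI_END:
            candidate = chr(code_point + KANNADA_OFFSET)
            # unicodedata.name returns the default ("") for unassigned code points.
            if unicodedata.name(candidate, ""):
                ch = candidate
        converted.append(ch)
    return "".join(converted)


# "Konkani" written in Devanagari, converted character by character.
print(devanagari_to_kannada("\u0915\u094b\u0902\u0915\u0923\u0940"))
```

A mechanical mapping like this keeps long and short vowels exactly as written in the source script, so the output may differ from how a Kannada-script writer would spell the same word. A real tool would need exception rules agreed on by the community.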

Possible Solution 3: Multiple writing system

  • Have multiple articles in different scripts about the same topic. For example, have multiple articles about India in Konkani Wikipedia - one in Devanagari script, another in Roman script and yet another in Kannada script.
  • Other wikis considering adopting multiple writing systems in the near future include the Korean and Javanese Wikipedias.
  • This could be the short-term solution for Konkani Wikipedia. It is already used in Konkani Wikipedia in incubation and might also prove to be one of the best solutions for the live Konkani Wikipedia project.

Possible Solution 4: Create separate wikis for each script

  • Create separate wikis for each script, at least those which prove to be active.
  • Separate wikis were created for Punjabi-Gurmukhi and Punjabi-Shahmukhi.
  • This could potentially be the long-term solution for Konkani Wikipedia, i.e. to have a different wiki for each active writing system: Konkani-Roman, Konkani-Devanagari and Konkani-Kannada. If there is an interested, active community creating content in a particular script, we could move that script to a new project in due course. As things stand, the Roman script has been the most active in the recent past, followed by Devanagari and Kannada, in that order.

Mitigation or consensus building: The community needs to reach consensus on how to deal with this issue. The best short-term solution seems to be having multiple writing systems in the same Konkani Wikipedia project. In the long run, however, we could evaluate the option of creating separate wikis for each script, at least for those which prove to be active.


[1]. Garry, Jane, & Rubino, Carl. (Ed.). (2001). Facts about the world's languages: an encyclopedia of the world's major languages, past and present. New York & Dublin: A New England Publishing Associates Book.

[2]. http://bit.ly/RNVq53

[3]. http://bit.ly/13gBsoo

[4]. http://bit.ly/JUUjJ6

[5]. http://bit.ly/12eY4Dn