Indic Language Wikipedias — Statistical Report — 2012

Posted by Shiju Alex at Jan 21, 2013 05:30 AM
I have compiled the statistical update of the Indic language Wikipedias for the year 2012. As usual, my aim in this report is to give my perspective on the health of the various Indic language communities and the state of the various Indic language Wikipedias.

(The period of analysis is editor contributions from December 1, 2011 to December 31, 2012. December-to-December data is used to account for seasonal variations.) Read the 2011 report here and the 2010 report here. The data for this report and analysis comes from the statistics published at http://stats.wikimedia.org. Special thanks to Erik Zachte for compiling all this information.

Here is my executive summary after analyzing the data for 2012, drawing also on my experience with building some wiki communities:

  • Communities that focus on community building achieve steady and sustainable growth.
  • Small languages that receive guidance and support are making more progress than many big languages.
  • Lack of support from the proper channels when it was most needed has affected the growth of some communities.
  • Even though many outreach programs took place across the country, this is not showing up in the number of active editors.
  • Many language communities (especially those of big languages) are still not open to the idea of reaching out to the speakers of their language.
  • Page views of Indic projects continue to increase.

This report is presented in the following sequence because I believe that community is central to the Wikimedia movement. Community gives us content, which in turn drives readership.

  • Community
  • Content
  • Readership

Community

As mentioned above, I consider community the backbone of the Wikimedia movement. Still, many communities do not understand its importance. All language wiki communities need to give adequate importance to community building in order to build the free knowledge repository in their language. The following table gives information on two important parameters about the community. The first parameter is the number of highly active editors (more than 100 edits per month) in a wiki. The second parameter is the number of active editors (more than 5 edits per month).

Indic Language Statistical Report
  • Like last year, Malayalam continues to grow in the number of active users. It has close to 120 active editors now. The graphical summary shows that the mean number of editors is around 100. Malayalam is the biggest wiki community among Indic languages even though it is only the 11th most spoken language in India. The sincere efforts of Malayalam Wikipedians to build their community are the only reason for this. Programs like the Malayalam Wiki conference, the Education program, the CD project, wiki workshops, photo events, Wikimeetups, and many other outreach events have started showing results. If the community continues with these types of efforts, I am sure that the community strength in Malayalam Wikipedia will cross 150 in 2013. Apart from Wikipedia, the importance given to Malayalam Wikisource, Wiktionary, and more recently Wikivoyage (in incubator) will attract more Malayalam speakers to the Malayalam wiki projects.
  • Tamil comes second with close to 80 active editors. However, the number of active editors has gone down from last year. The graphical summary shows that the number of active users was around 70-75, especially during the last two quarters.
  • Bengali comes third with around 60 active editors. This is a slight increase from last year’s number. The involvement of editors from India in Bengali Wikipedia is low, and that needs to change. Bangladeshi Wikipedians are running many outreach programs to build the Bengali wiki community. It would be nice if they extended their support to Indian Bengali speakers as well, since the number of Indian Bengali Wikipedians is not growing.
  • Telugu, Urdu, Gujarati, and Punjabi are the Wikipedia languages that show a notable increase in the number of active editors. But it would be a mistake on my part not to mention that these numbers are not encouraging, and that the current number of active users does not do justice to the number of speakers these languages have. This is all the more significant when we consider that some smaller languages are showing better progress.
  • Last year (2011), the success stories were the Odia and Assamese Wikipedias. In 2012, the shining star is Punjabi. The community has grown from one active editor last year to almost 15 active editors now. As mentioned in my blog posts (post 1, post 2, post 3, and post 4) about building the Punjabi Wikipedia community, the task was very challenging. Initiated in 2002 along with Assamese, Punjabi is one of the first Indic language Wikipedias. But not much happened in that wiki until deliberate efforts to build a community were initiated. The good news is that we now have an active community in Punjabi Wikipedia. Last year there was just one person (Guglani, who took great pains to travel to multiple locations to introduce Punjabi Wikipedia). Unlike with Odia and Assamese, I faced many issues during Punjabi Wikipedia community building (mostly conflicts between editors). But I am happy to see that the community is slowly coming out of all that. The technical team has fixed some of the bugs in the typing tool, which was very important for Punjabi Wikipedia. Punjabi Wikipedians require a lot of support from other Wikipedians to sustain the current momentum and grow the community further. My best wishes to Punjabi Wikipedians.
  • Gujarati and Urdu are the two other communities that made considerable progress in community growth. The efforts of Gujarati Wikipedians to reach out to Gujarati speakers have started showing results. I am sure that, with the significant attention also given to Gujarati Wikisource (which was created last year), more Gujarati speakers will be attracted to Gujarati wiki projects. The involvement of Indians in Urdu Wikipedia is very low. But it is good to see that the Urdu wiki community has slowly started growing. Maybe Wikipedia is one place where Indians and Pakistanis can work together.
  • The Wikipedia languages that have not shown a significant change in the number of active editors are Marathi, Odia, Assamese, and Nepali. The respective communities need to start making efforts to build community, taking lessons from other Indic language wiki communities.
  • The languages with a considerable reduction in the number of editors are Hindi, Kannada, and Sanskrit. Of these, all except Sanskrit are spoken by at least five crore people. It is not good to see that speakers of these languages are not giving any attention to the wiki projects in their own language. The case of Hindi is very strange considering that it has the support of the central government and many state governments of India.
  • The dormant language communities are Sindhi, Bhojpuri, Kashmiri, and some other small languages. Considering that Odia, Assamese, and Punjabi were also dormant two years ago, I am sure that if someone makes the effort to build communities for these languages, they will grow just as Odia, Assamese, and Punjabi did. There are now multiple entities supporting the Wikimedia movement in India, and I hope that someone will take care of this in addition to concentrating on the bigger languages.
  • In short, the point I want to emphasize is that conscious efforts are required from different stakeholders to grow communities, and to sustain that growth, for all Indic language Wikipedias.
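For readers who want to reproduce the two community parameters used in this section, here is a minimal sketch in Python. The thresholds (more than 5 and more than 100 edits in a month) are the ones stated at the start of this section; the function names, the editor names, and the edit counts are made up for illustration and are not data from stats.wikimedia.org.

```python
# Thresholds as described in this report (assumed here to be strict "more than").
ACTIVE_EDITS = 5
HIGHLY_ACTIVE_EDITS = 100


def classify_editor(edits_in_month):
    """Label one editor by the number of edits made in a single month."""
    if edits_in_month > HIGHLY_ACTIVE_EDITS:
        return "highly active"
    if edits_in_month > ACTIVE_EDITS:
        return "active"
    return "occasional"


def count_editors(edits_by_editor):
    """Count editors above each threshold.

    Highly active editors are also counted as active, since anyone with
    more than 100 edits also has more than 5.
    """
    counts = list(edits_by_editor.values())
    return {
        "active": sum(1 for n in counts if n > ACTIVE_EDITS),
        "highly_active": sum(1 for n in counts if n > HIGHLY_ACTIVE_EDITS),
    }


# Hypothetical sample month for a small wiki (illustrative values only).
sample = {"editor_a": 3, "editor_b": 12, "editor_c": 250, "editor_d": 6}
print(count_editors(sample))
```

With these sample values, three editors count as active and one of them also counts as highly active.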

Content

The number of articles is an important parameter, but it has misguided some wiki communities in the past. Fortunately, that trend is declining.

Language, Speakers & Articles
  • With more than 1,04,000 articles, Hindi continues to be the biggest Indic language Wikipedia in terms of the number of articles. Almost 3,500 articles were added to Hindi Wikipedia in the year 2012.
  • Tamil and Malayalam each added around 7,000 articles, which is the "biggest growth" in terms of the number of articles. Urdu and Nepali added close to 5,000 articles.
  • If we consider the percentage increase, Assamese has shown more than 100 per cent growth in the number of articles.
  • Some of the important milestones are Tamil and Telugu crossing 50,000 articles, Malayalam crossing 25,000 articles, and Assamese crossing 1,000 articles.
  • The languages that have shown very slow growth in the number of articles are Gujarati, Telugu, and Kannada. I assume that, at least for a few of these languages, the focus went into enhancing the existing articles and building the community rather than creating thousands of stub articles.
  • As mentioned in past reports, communities don’t need to worry about the number of articles. The examples of the Bishnupriya Manipuri and Newari Wikipedias show the aftereffects of increasing the article count without focusing on building the community.
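The list above mixes two ways of measuring growth: the absolute number of articles added and the percentage increase. A minimal Python sketch of the difference follows. The function names and the article counts are hypothetical placeholders chosen for easy arithmetic; they are not the actual figures from stats.wikimedia.org.

```python
def articles_added(start_count, end_count):
    """Absolute growth: how many articles were added over the period."""
    return end_count - start_count


def percent_growth(start_count, end_count):
    """Relative growth: increase as a percentage of the starting count."""
    if start_count <= 0:
        raise ValueError("start_count must be positive")
    return (end_count - start_count) * 100 / start_count


# Hypothetical counts at the start and end of a year (not real data).
big_wiki = (40000, 47000)    # adds many articles, modest percentage
small_wiki = (500, 1100)     # adds few articles, large percentage

for name, (start, end) in [("big", big_wiki), ("small", small_wiki)]:
    print(name, articles_added(start, end), percent_growth(start, end))
```

A large wiki can lead on articles added while a small wiki leads on percentage growth, which is why the two measures are reported separately here.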

Readership (page views)

Unlike the number of editors, the number of page views is showing an upward trend irrespective of the language. (Please note that the information in the table below is the total number of visits (page views) for a language Wikipedia in a month, from all platforms combined. It includes visits by both readers and editors. This is NOT the number of unique visitors to the website.)

Speakers & Readers
  • This is the one parameter where the figures roughly do justice to the number of speakers.
  • Hindi, with 78 lakh page views, is in the top position.
  • The page views for Tamil increased by more than 50 per cent.
  • Assamese has more than 100 per cent growth in page views.
  • Since support for Indic languages on smartphone operating systems is increasing, I am sure the page views are going to increase further.

Conclusion

I am concluding this report with the following thoughts:

  • Being the biggest language (by number of speakers) does not automatically build a community for an Indic language Wikipedia. Efforts from the speakers of each language are necessary to build its community.
  • Most Indians who have access to the internet and a computer still don’t know how to type in their own language. This is the biggest roadblock to building Indic language wiki communities.
  • Do not get obsessed by article counts or readership. These are natural outcomes of community building.
  • Focus on community building through community interaction (through meetups, talk pages, village pumps, and mailing lists).
  • Focus on community building through community collaboration (WikiProjects or planning outreach efforts or advocacy).
  • Focus on community building through doing more outreach, better outreach, and being supportive of newbies.
  • Stay away from bots and translation tools for article creation, as they do more harm than good. If you use bots, use them in a way that does not affect the growth of the community.

Wishing all of you a wonderful wiki year 2013.