Open Data is a powerful worldwide movement these days. Comparing open data projects in developing countries with those in highly industrialised countries (Europe, the US, Australia and others), where do you see the main differences in organisational, cultural and technical terms?
We conducted feasibility studies in Ghana and Chile several months ago, are supporting the Ghanaian government in developing its national initiative, and have visited and engaged in Open Data discussions with many other countries in Africa, Latin America and Asia.
The situations are quite diverse and can vary significantly from country to country. It is always difficult to generalize, but I think there are a few important differences that can be highlighted (in no particular order):
The amount of information available in digital form is generally much lower
The IT infrastructure is not yet fully developed or is still under development
The capacities on the government and civil society side have to be improved
The mobile phone is the main device for accessing information, but data connectivity is still scarce: it is available only in the big cities and not at all in rural areas
Digital literacy related issues have to be seriously considered and addressed
Multilingualism is an important factor, as there are dozens of dialects being spoken in many countries
Having said all of the above, I would say that there are also quite a number of commonalities, such as privacy and security concerns and resistance to change, but also the existence of champions within government and the interest and willingness of civil society, which is already producing a number of interesting applications.
You are also very familiar with the concept of Linked Open Data (LOD). Where do you see the main benefits of using LOD, and where do you see the main challenges and obstacles?
Having managed a few projects achieving 5-star open data, I’ve learned a thing or two about the pros and cons. I’ve been saying consistently that there are a few important issues:
There is still little knowledge about LOD out there and it is perceived as too complex
The demand for LOD is, hence, very low
The tooling is not powerful enough yet, especially when compared to XML tooling and others
The modeling part is very tough
People are used to working with XML and Web Services and believe that anything along this line, such as REST+JSON, fulfils most expectations and needs. But this is not fully true. In my opinion, the power of LOD resides in the linking part more than anything else. Combining data from disparate sources using RESTful techniques is much more difficult, while it’s a natural fit for LOD.
My experience tells me that for dealing with a few simple datasets, investing in LOD is not really needed, but if you want to scale up and, especially, if you want to link and integrate, then you should consider LOD. It is generally a bigger investment, but it pays off when interlinking big volumes of information, facilitates re-use in multiple formats, and can get very powerful when using SPARQL appropriately, as it allows access to the whole underlying knowledge base.
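To make the linking argument concrete, here is a minimal sketch in plain Python, not a real LOD stack. It assumes two hypothetical publishers that describe the same city with the same URI; every URI and value below is invented for illustration. Because both sources use a shared identifier, combining them is just a set union of subject-predicate-object triples, with no per-source join logic.

```python
# All URIs and values are hypothetical, chosen only for illustration.
CITY = "http://example.org/id/city/1"

# Source A: triples a statistics office might publish.
source_a = {
    (CITY, "http://example.org/def/name", "Example City"),
    (CITY, "http://example.org/def/population", 100000),
}

# Source B: triples a health ministry might publish about the same URI.
source_b = {
    (CITY, "http://example.org/def/clinics", 12),
    (CITY, "http://example.org/def/region", "http://example.org/id/region/7"),
}


def merge(*graphs):
    """Merge triple sets by set union; shared URIs line up automatically."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every predicate and its values known about one subject."""
    result = {}
    for s, p, o in graph:
        if s == subject:
            result.setdefault(p, []).append(o)
    return result


merged = merge(source_a, source_b)
profile = describe(merged, CITY)

for predicate in sorted(profile):
    print(predicate, profile[predicate])
```

In a real deployment the triples would live in an RDF store and the lookup in describe would be a SPARQL query; the point of the sketch is only that shared URIs make the merge step trivial.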
Where do you see the main differences regarding the effort of publishing and the benefit of re-use (or the re-use itself) between Open Data and Linked Open Data?
I would say that the main difference here is between using the Web as an archive for files and using the full potential of the Web. If one publishes hundreds of spreadsheets on the Web using an open format and license, one is already doing Open Data, but rather than using the Web, one is going back to the FTP days. And that is not too different from giving away a USB stick with the files. We can do much better nowadays.
Tim Berners-Lee’s often-cited 5-star scale is a good reference here. The higher you can get on that scale, the more of the Web's power you are using and the more you are facilitating reuse.
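As a rough illustration of how the 5-star scale accumulates, the sketch below rates a dataset description in Python. The Dataset class and its field names are hypothetical, introduced only for this example; the five criteria paraphrase the published scale (on the Web under an open licence, machine-readable, non-proprietary format, URIs to identify things, links to other data).

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    on_web_open_licence: bool      # 1 star
    machine_readable: bool         # 2 stars
    non_proprietary_format: bool   # 3 stars
    uses_uris: bool                # 4 stars
    links_to_other_data: bool      # 5 stars


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: each level requires all previous ones."""
    criteria = [
        ds.on_web_open_licence,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Spreadsheets dumped on the Web in an open format and under an open licence.
file_dump = Dataset(True, True, True, False, False)
# The same data identified by URIs and linked to other datasets.
linked = Dataset(True, True, True, True, True)

print(star_rating(file_dump), star_rating(linked))
```

Under these assumptions, the "hundreds of spreadsheets" case stops at three stars, which is the gap between file archiving and using the Web's linking capabilities.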
Are there differences regarding the use of LOD principles and technologies between developing countries and industrialised countries in your opinion? For example: does it make sense to start an Open Data Initiative in a developing country using Linked Open Data from scratch?
All the issues with LOD I mentioned above apply even more strongly in the developing world. I think we should take a step-by-step approach: start going from no data to some-star data in the very near term, lower the barriers one by one, and start building capacities in government and civil society, but always with Web architecture principles in mind.
We will have to address the specificities of the developing world. For example, given that the LOD community is relying more and more on cloud-based options and on centralized data stores that require stable high-speed internet, how would one deploy a LOD solution in a country where clients (computers and mobile phones) have limited resources (disk, CPU) and where connectivity is unstable and low-bandwidth? We’re participating in a workshop to explore these issues.
This does not mean that LOD is completely ruled out from the beginning. As I pointed out before, there are cases in which it can be extremely useful and powerful, and in those we intend to accelerate adoption, likely piloting and building capacities as a first step.
Could you please tell us a few words about the Web Foundation?
The Web Foundation was launched by the inventor of the Web, Sir Tim Berners-Lee, in 2009 to address global challenges by connecting humanity and empowering individuals through an increasingly inclusive and powerful Web. More on the vision of the Web Foundation at: http://www.webfoundation.org/vision/
Jose, many thanks for this interview. It seems that open data is progressing quickly in developing countries, and that different requirements have to be taken into account there compared to open data projects in Australia, the US or Europe! The potential of Linked Open Data also seems an interesting point for these countries! We are looking forward to staying in touch with you on this and wish you all the best for your future work in this area!