20110214/Canada open data


Revision as of 15:48, 17 September 2011

On February 14, 2011, I had the opportunity to speak before a Canadian parliamentary committee on the topic of open data. The content was developed at the Visible Government wiki, including the speaking notes and a parl.gc.ca video of the event. I've copied the notes below.

It was an interesting experience. I think some of the MPs are genuinely engaged on the topic. Unfortunately, the House collapsed shortly after this event, and there was no real follow-up.

I'm going to take a chance here and post some thoughts.

Open data is a funny movement. Despite being about openness and connections, there are many complexities and disagreements, and most people don't really cooperate easily.

Chief among the complexities is the question of whether open data should have a "share-alike" restriction. This means that anyone who uses (in this case) government-provided data must make any derived content available under the same terms (sometimes called a "viral license"). Those arguing against "share-alike" generally think it makes using data too complicated, and that it makes data re-use unfriendly to business; they believe that businesses will provide more value if they don't have to meet this restriction.

People who want to see share-alike, like myself, are probably generally more idealistic. As I tried to say in the session, I would like to see linking and participation become more normal parts of everyday culture; for example, a classroom session would link directly into the same databases governments use. We're already seeing at least the mechanism for this with near-ubiquitous internet, smart phones and social networking.

I don't see why business can't benefit from being more open, and I would like to see people expect it. Those arguing against this view say that most people don't care, and that's true, especially when they're blocked or not expected to care.

Anyway, not everyone has to care; opening this up to even a small percentage of people will have a profound effect. Most people know a few people who are 'nerdier,' and those friends can be edges into more supportive knowledge networks. Even idiots (which I can be at times) can be contributors if they expect real answers to their questions. Social media provides an ideal setting for these networks to develop. I think we need to look forward about ten years to what we want to see. A linked, open world is much more likely with a share-alike type license, and I think it's most appropriate for publicly funded and concerned data.

I am not at all anti-business, but the current scheme will always involve creating more compromise. It's not just business either; my experience shows again and again that hospitals and other public institutions have incredible problems that are too readily accepted. I don't think there are any dangers involved in a more open world (barring witch hunts against the problems we all know exist under the covers), and I think we should work very intentionally toward a world where we don't need Wikileaks because everything is connected and open.

The interesting thing about opening and organizing systems is that the "missing pieces" become very evident, so it would be possible, with enough people and perspectives involved, to create great systems. There are too many good people who are disadvantaged, and I don't believe "full employment" serving business will ever be a solution. I believe the effects of massive openness and participation will lead to transformational efficiency and integrity that will allow more people to be positive and involved in shared systems.

Idealistic, yes, but with share-alike, we can work on creating that practical and better world. Without it, it's ultimately the same old story: a new layer will form over the old layer, and everyone in those layers with true access and ability will be squatters vying for more power as part of the pyramid scheme, fighting as much for the proprietary secrets as they do for real improvements. But I guess they gotta eat, right?




Introduction

In our crowd-sourced briefing document, we covered an array of topics around open data, open government, and more involved citizens. We covered topics such as the usefulness of open data inside government to enable connections and better relationships with vendors. In science communities, open data helps create wider standards for more data sharing, and enables a culture of scientist-citizens. In education, notable institutions enable free access to the world's best information.

We talked about poisonous data and systems that assume individuals would never get access to their own health care record, as well as inspiring signs from GCPedia and our geomatics community. Others have spoken about how open data can make Access to Information a more efficient and useful service.

Business is exploring more open and social modes; consumer-serving openness is a competitive advantage.

We talked about creating a *culture* of innovation and problem solving, built on the fact that so many Canadians are online. How what we're building can create a consistent, re-usable knowledge system for everyone. Where a fourteen year old or eighty year old can access the same data and networks as a researcher, organize it according to their perspective, and connect with others. Where people stop using their computers as typewriters and instead create re-usable data. How many more people can be deeply involved in democratic processes, and how this can be used to build up trust in government.

Hospital finder

I want to talk about a specific open data project. Today, if I go to a health clinic, I may be told I can't be seen that day. If I search many completely different sources of health clinic information, I might get a better idea of the best clinic to visit at that moment.

Modern Internet-based software can provide easy solutions to this kind of problem. In an afternoon, I scraped the locations of hospital emergency departments across Montréal, put them on a map showing the user's current position and the closest hospital, and added scraped information about capacity and resource usage.
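To make the "closest hospital" step concrete, here is a minimal sketch in Python. It is not the code from that afternoon project: the hospital names, coordinates, occupancy figures, and function names are all hypothetical placeholders. It only shows how little is needed once locations are available as data, using the standard haversine great-circle distance.

```python
import math

# Hypothetical sample data: names, coordinates and occupancy values are
# illustrative placeholders, not the scraped Montréal dataset.
HOSPITALS = [
    {"name": "Hospital A", "lat": 45.50, "lon": -73.57, "occupancy": 1.20},
    {"name": "Hospital B", "lat": 45.53, "lon": -73.62, "occupancy": 0.85},
    {"name": "Hospital C", "lat": 45.47, "lon": -73.60, "occupancy": 1.05},
]

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_hospital(user_lat, user_lon, hospitals=HOSPITALS):
    """Return the hospital record nearest to the user's position."""
    return min(
        hospitals,
        key=lambda h: haversine_km(user_lat, user_lon, h["lat"], h["lon"]),
    )


if __name__ == "__main__":
    nearest = closest_hospital(45.52, -73.62)
    print(nearest["name"], "occupancy:", nearest["occupancy"])
```

A real version would take the user's position from the browser or phone and the hospital list from published data rather than a hard-coded table.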

Even this effort would be useful for someone trying to make an informed decision and take more responsibility for their own health. It could help many people waste less time sitting in a waiting room, and help balance the health system, by letting them choose the clinic that is closest to them and likely to be least crowded.

But if hospitals and clinics intentionally published information as quality open data, much more could be built. We could learn what clinics are best for different conditions, and develop real-time and predictive views of when to go to particular locations. Past the technical design, people could contribute their experiences to help measure problems and successes. This would result in a low-cost, harmonious feedback loop for individuals and the health system. With open data, lightweight Internet tools, and crowd-sourcing, the budget impact would be minimal, and the effects profound.

Because hospitals and regions are fragmented, we may never have an official comprehensive system. But with a minimal level of open data support, we can have useful, constantly developing views that institutions could never build in the foreseeable future.

Many people like me are able to create this kind of system in an afternoon, because it's what we do during the day. We work with free, world-scale systems that let us put interactive data into the best and most recognized Web-based interfaces in the world. The proprietary and custom interfaces often used by institutions usually can't compete with this. They make the user relearn a system that's usually not nearly as good as the best on the Web, and cut and paste an address to get transit directions or see what's nearby. They don't let users easily add information that can be helpful to others.


In the last few days I read two news items where government didn't take advantage of the best the Internet has to offer. In one case, the UK government paid a consulting firm 200 thousand pounds to create a system, which collapsed under load when put online. An individual wrote a system in eight spare hours that was much more robust.

In another case, the BBC announced it had to shut down 172 content web sites for budget reasons. An individual scraped and archived them using a $4 a month plan.


Using the best low-cost tools online today, people use free digital maps to find restaurants and bus routes that suit them perfectly. Craigslist demolished the newspaper classifieds business with a free, easy to use, volunteer-based service. People count on looking up information on the collaboratively created Wikipedia. Fine-grained news travels quickly in social networks, with personalized comments. Sites like OpenParliament publish and allow finer examination of proceedings. These are examples of the benefits of digital networks; a basis of open data enables people to effectively re-use information to participate in democratic processes, and enables lifelong learning.

Precedents and benefits

In a generation the Internet will be deeply embedded in everything we do. We'll continue to see problem-solving waves of innovation from the best and most motivated minds around the world. Most people may not profoundly interact, but some will, and it will affect everyone.

All this potential is based on existing features and design of computer data and the breakthrough Web, created by Tim Berners-Lee, who leads open data development in the United Kingdom government.

Berners-Lee's mandate is to make data open and accessible, including individual direct involvement.

Openly learning from, using and advancing efforts and standards around the world must be a key part of the Canadian approach. We know open data varies in quality, ranging from the opaqueness of a PDF to richly organized and connected data using open standards and licenses. Accessible means data needs to be consistently organized according to many perspectives, in a culture that embraces this as the right thing to do. And though most people are online, and computers can be equalizers for people with vision and mobility disabilities, one third of Canadians are not online, and may never be. So we look to social networks to connect people. Many two-way knowledge translators will be required, inside and outside government.

This is an enormous undertaking. But it's an investment that will yield smarter, more capable people and genuine quality of life improvements in a knowledge economy. There will be short term rewards, but we need to create long term goals, visions and concrete milestones, with the open involvement of many people.

Steps forward

Thinking about real steps forward, as more information becomes available, it needs to be carefully organized using systems like CKAN. Otherwise it will never be found, or will be redundant and opportunities will be lost. Data directories that don't use these structured standards are a step backwards.

Licenses need to be determined. For many reasons, Creative Commons by attribution can be considered best. It's well recognized, and creates links with the origins of data.

Government needs to negotiate openly with firms like Google, to make sure data available in cloud-based services doesn't become dependent on any provider, and that the formats used instead become standards like those developed for transit services.

My experience in hospital systems informs me there are clear sets of data that can be shared, and others that can't. Government departments need to enable their existing experts, and appoint people to determine how to draw clear lines in data reuse, as well as instituting an open data culture.

Getting people to widely understand how data is re-used is a harder problem. But government could serve many purposes by producing an awareness and participation campaign, supporting privacy and anti-fraud interests to instill an entertaining and realistic culture of inquiry in social networks. That attitude is the best starting point to create a trustworthy, participatory culture.

Finally, if government is going to conduct an e-consultation on this topic, that sounds like a great opportunity to work openly in a real first step to organize the issues and truly involve individuals in these discussions as first class participants.


Tim Berners-Lee's five levels of re-usable open data:

  • make your data available on the web with an open license, even if it is about equivalent to a fax and other nearly non-reusable information;
  • make it available as structured data, where data can be re-used with the right software;
  • release it in non-proprietary formats;
  • map it to [persistent public] web locations so it can be reliably re-used; and
  • provide rich linking between data sources.
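As a rough illustration of how the five levels can be used to rate a published dataset, here is a small Python sketch. The function name and its parameters are hypothetical, and it assumes the levels are cumulative (a dataset only reaches a level if it also meets every level before it), which is my reading of the scheme rather than something stated in the list above.

```python
def open_data_level(on_web_open_license, structured, non_proprietary,
                    persistent_urls, linked):
    """Return how many of the five levels a dataset reaches (0 to 5).

    Assumes the levels are cumulative: counting stops at the first
    level that is not met.
    """
    checks = [on_web_open_license, structured, non_proprietary,
              persistent_urls, linked]
    level = 0
    for met in checks:
        if not met:
            break
        level += 1
    return level


if __name__ == "__main__":
    # A scanned PDF posted under an open license: first level only.
    print(open_data_level(True, False, False, False, False))
    # A CSV file at a stable address, not yet linked to other sources.
    print(open_data_level(True, True, True, True, False))
```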




Blikied on Sep 17, 2011