Monday 27 April 2009

Why can't we search scientific data like we search Google?

I've had training in reading scientific data, yet there's so much out there that I just can't read it all. Wouldn't it be great if there were standards in place that made scientific data fit the digital age and made the data easy to share?

All I want is to be able to find a bunch of data, export it into a standardised format, XML or CSV based (CSV works in Excel), then feed it into tools that can make the data come alive and allow me to search for patterns. Basically, data mining.

Take this piece of work on vigorous activity and its benefits for eye health. The results sound amazing, but why can't I just click a link and get the actual results in a format I can use? That's what science is about for me: openness. Why can't I just download the data, run a few simple statistical tests myself and understand in greater detail exactly what the scientists have found?
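To make "a few simple statistical tests" concrete, here is a minimal sketch of what I have in mind, assuming a study published its results as a CSV file. The column names (group, score) and every number in it are made up purely for illustration; they are not from the eye health study or any real data set. It reads the CSV and computes Welch's t statistic for the difference between two groups, using only Python's standard library.

```python
import csv
import io
import math
import statistics

# Hypothetical CSV export. Column names and values are invented for
# illustration only and do not come from any real study.
SAMPLE_CSV = """group,score
active,4.0
active,5.0
active,6.0
sedentary,6.0
sedentary,7.0
sedentary,8.0
"""


def load_groups(text, group_col="group", value_col="score"):
    """Read CSV text and collect the numeric values for each group."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return groups


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


groups = load_groups(SAMPLE_CSV)
t = welch_t(groups["active"], groups["sedentary"])
print(f"Welch's t statistic: {t:.3f}")
```

A real analysis would need a p-value, a look at how the groups were matched and far more than three values per group, but the point is how little stands in the way once the data is downloadable in a plain format.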

I would love to be able to upload the data into my own online application that also stores or links to all the other studies that either I've found or are part of its database. I can then combine this new knowledge with all that data that's already in there and do calculations with this vast wealth of information. 

The first thing anyone would say is that it's not that simple: the data and groups need to be carefully matched. I'd say they're right, but I do this every day for all types of data. Most data only really takes on meaning when combined with lots of other data. I, and many like me, have come up with many ways of standardising and comparing separate sets of data, so why can't scientific data from different researchers be treated the same way?
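One common way of putting separate data sets on a comparable footing is to convert each one to z-scores (how many standard deviations each value sits from its own set's mean). This is only one of many possible approaches, and I'm assuming it here for illustration; the two "studies" and their numbers below are hypothetical.

```python
import statistics


def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1.

    Assumes the values are not all identical; otherwise the standard
    deviation is zero and the division fails.
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Two hypothetical studies measuring the same thing on different scales.
study_a = [10.0, 20.0, 30.0]
study_b = [1.0, 2.0, 3.0]

# Once standardised, the two sets can be pooled and compared directly.
combined = z_scores(study_a) + z_scores(study_b)
print(combined)
```

Standardising like this removes the difference in units, but it does not solve the matching problem: the groups still have to be measuring comparable things in comparable people.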

For me it's just about creating, implementing and enforcing standards. I also expect that these standards are already in place, because this is not a new issue; it's as old as science. What I don't like, though, is that all this information feels closed to me. We have the tools and technology to get the data in, perform the analyses and present the results. I just don't know of a way to access all this via the web, or whether anyone has even put it all together.

If something like this were available to me, I'd be able to analyse all the information together, understand it more as a whole and see how it all works. I would also be able to see which of my own and other people's theories it supports and which it doesn't. Basically, I think we'd all be able to learn about our bodies faster, and science would progress so much faster.

It would also help us all share this important information so that everyone, not just those with the big resources, can feel a part of these big debates. If this kind of technology were open to everyone then we'd all begin to understand more completely the answers that have been found.

Basically it's like open source to create open science. 

Here's hoping someone puts it all together. 

update 2009 04 29 17:49
Ooh, no sooner do I ask than I find someone has already started. In truth, Gapminder is a site trying to put some of this in place. Just thought I'd add my two cents. Here's Google's take on adding search power to public data.
