Peter Wittenburg
Abstract:
Despite a number of exceptions, data practices in general do not allow easy integration and re-purposing of data to extract new knowledge. A European overview based on about 120 intensive interactions with experts from different disciplines and different types of institutions indicates that much legacy data is still being created that can only be made re-usable by investing large sums of capital in curation. It also indicates that senior researchers are aware of the inefficiency of current data-intensive research, but hesitate to invest, since in general we lack guidance through a huge solution space and we lack data professionals who could change practices.
On the other hand, we see in the realm of Open Science and Open Data that the pressure to repurpose data (and also tools), and thus to enable innovation, is increasing. The question is then how we can accelerate agreement finding on best practices that make data work much more efficient and thus reduce costs. Science can play a pioneering role again, since the large companies simply have their own business in mind, as they did when the basics of the Internet were discussed some decades ago.
Science is increasingly global, since the challenges are global and many scientific questions can only be answered from a global perspective. Despite traditional structures that are still being maintained, science is also increasingly interdisciplinary. At the same time, trends clearly show that many data issues (including analytics) are not discipline-specific. Successful agreement forming therefore needs to cross boundaries (countries, disciplines), which may make it a difficult and time-consuming process.
The Research Data Alliance (RDA) was therefore established to speed up agreement forming across disciplines and national boundaries. Its key elements are “bottom-up process driven by practitioners”, “results to overcome specific barriers within 18 months” and “rough consensus”. These principles have led to the first five concrete outputs within the first 20 months of its existence, along with a few other activities such as “Data Fabric”, which tries to place the various activities on a common landscape to show their relationships and to harmonize them. Still, RDA is a very young initiative, and adopting the main principles of action of the Internet community does not guarantee success. But it is an opportunity that we need to make use of.