
Tag Archives: Open Data

The future of scientific information & communication presented at the SUNY Potsdam Academic Festival

This is a LONG presentation. I talk about the “It’s All About Me” attitude that can positively feed science: we want to share OUR science, we want people to know about our opinions, our activities and our collaborators, and we want to get funding, recognition and attribution. And why not? It can all be to the benefit of science.

This presentation was given at the SUNY Potsdam Academic Festival

The future of scientific information & communication

Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand, and the pace of change does not appear to be slowing. While scientists now have access to the enormous capacity and capabilities of the internet, the vast majority of scientific communication continues to be through peer-reviewed scientific journals. The measure of a scientist’s contribution is primarily represented by their publication profile and the citations to their published works, which offers an incomplete view of their activities. However, we are at the beginning of a new revolution where the ability to communicate offers the opportunity to embrace new forms of publishing and where scientific participation and influence will be measured in new ways. This presentation will provide an overview of our new generation of “openness” in which open source, open standards, open access and open data are proliferating. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community and facilitated collaboration, and will ultimately accelerate scientific progress.

 


Why Open Drug Discovery Needs Four Simple Rules for Licensing Data and Models

There are a number of people in my domain whom I greatly appreciate and enjoy working with. So an opportunity to co-author rules for licensing data with Sean Ekins and John Wilbanks was too good to miss. There are a lot of opinions, rants and views on data licensing floating around the internet, discussed at conferences and over beverages. We have opinions too, and we have shared them in this perspective in PLoS Computational Biology: “Why Open Drug Discovery Needs Four Simple Rules for Licensing Data and Models”

 


Navigating an Internet of Chemistry via ChemSpider

The internet is a rich source of chemistry-related data and, nowadays, if a chemist knows how to initiate a search, data can be sourced for millions of chemicals online. The nature of online data varies from simple molecule diagrams to experimental and predicted properties, encyclopedic articles, synthetic routes, analytical data, patents and publications. The array of information now accessible is distributed across thousands of sites, giving rise to the information overload commonly associated with Google-type searches on the internet. In addition, the purest language of chemistry, that of chemical structures, is not yet fully supported on the web. This presentation will provide an overview of how the internet is being meshed together using data aggregation and standardization approaches to enable a structure-searchable internet for chemistry. The speaker will present an overview of the ChemSpider platform (http://www.chemspider.com), describe the challenges of linking together over 400 internet resources and 26 million unique chemicals, and discuss how members of the chemistry community can directly contribute to enhancing the availability of quality data online.

This is a movie of the talk I gave to students and faculty at the University of Arkansas, Little Rock, using the BigBlueButton platform.

 

Posted on October 19, 2011 in Publications and Presentations

 


ChemSpider – Does Community Engagement work to Build a Quality Online Resource for Chemists?

This is my presentation at the Skolnik Symposium at ACS Denver to honor the contributions of Alexander “Sandy” Lawson to our domain of Cheminformatics.

ChemSpider – Does Community Engagement work to Build a Quality Online Resource for Chemists?

With the intention of providing a high-quality, free internet resource of chemistry-related data for the community, ChemSpider has aggregated almost 25 million compounds linked out to over 400 data sources and provided a platform for the community to both deposit and curate data. This experiment in crowdsourcing for chemistry has now been running for over three years. This presentation will review a number of aspects of the project including (a) the level of community participation in depositing and curating data; (b) the nature of data and content supplied by the community; (c) how ChemSpider is used by the community; (d) using game-based systems to assist in data curation; (e) algorithmic approaches to data validation and filtering; and (f) sharing data curation efforts with other online databases.

 

 

Posted on August 30, 2011 in Publications and Presentations

 


Presentation at FACSS 2010 in Raleigh

Today I gave a presentation at FACSS 2010 here in Raleigh, NC. The abstract and embedded SlideShare presentation are listed below.

Building a Community Resource of Open Spectral Data

ChemSpider is an online database of almost 25 million chemical compounds drawn from over 300 different sources including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 spectra, including infrared and Raman data, and continues to expand as the community deposits additional data. The majority of spectral data is licensed as Open Data, allowing it to be downloaded and reused in presentations, lesson plans and for teaching purposes. This presentation will provide an overview of our efforts to build a structure-indexed online database of spectral data, issue a call to action for the community to participate in improving this resource, and discuss how such a resource could be used as the basis of a spectral game to teach students spectral interpretation.

 


Copy of Beautiful Data Chapter Now Available Online

I’ve previously blogged about the chapter I co-authored for the book Beautiful Data. The chapter is now available online at Scribd after being uploaded by Jean-Claude Bradley. Feel free to go take a gander.

 

Posted on July 28, 2009 in Book Reviews

 


A Conversation with Peter Suber – Navigating the Complexities of Open Access Definitions

Yesterday I had the opportunity to talk with Peter Suber. If you have not heard the name, his website speaks volumes about his interests and involvement in Open Access. The opening line on his website kind of says it all: “I am an independent policy strategist for open access to scientific and scholarly research literature. Most of my work consists of research, writing, consulting, and advocacy.”

I am NOT an Open Access expert. In fact, I have found it difficult to navigate the issues. My experience is that when trying to build a community the best path forward is a phone conversation when face-to-face is not available. I approached Peter with the following questions and discussion points.

1) Some clarity around Free versus Open access
2) Open Data (http://www.opendefinition.org/1.0/) versus Creative Commons
3) How ChemSpider is trying to be “Open”
4) Are we allowed to say we are “Open” under our activities?

After a one-hour phone conversation with Peter I admit to being much more educated and at ease with how ChemSpider is operating and how we fit into the Open Access and Open Data world. I am clear now that we are doing fine in our position and our intent, despite the fact that we may not yet have posted all of OUR understandings of the definitions. There is one major outcome from this for me to execute on: I will be defining our POLICY around openness in the near future, when there is a little more bandwidth to get it done. It is very clear that language and definitions are of hypercritical importance in this domain.

One major lesson: reach out to talk with the experts. A voice-to-voice conversation is far more interactive, entertaining and informative than a web search for definitions. Thank you Peter!

 

Posted on October 17, 2007 in Community Building

 


Who Gets to Choose Whether Data is Open or Not?

Those of you who have been watching the blog of late will be aware of the recent discussions about Open Data (1,2). We have offered submitters of spectral data the possibility to declare their data either Open or Closed. Noel posted a comment on the blog asking the question “Why is the default Closed? Why even offer the option of Closed?”

So, my response: “Why not offer the option of Closed?” My opinion is that this is the submitter’s decision. It’s not our role to force “Openness” of data onto users. We are working to create an environment that provides value to ChemSpider users rather than one that forces them into a policy regarding openness. Personally, I would prefer to have access to data to help answer a question, even if they are NOT Open Data, than to not have access to those data. I have asked all of the people who have submitted data, or had me submit data, to ChemSpider whether they would like to have their data moved to Open. Three said yes and two said no. I do NOT intend to force people to make their data Open. That is their choice, not mine. We are creating a community for collaboration. There is value in having access to data whether it is Open or not. If you look at the recent conversations about RSC and their Free Access versus Open Access, we must agree that there IS value to Free Access to their articles despite the fact that they are not Open Access.

My friend Gary Martin has allowed us to deposit some of his data onto ChemSpider. He has commented twice (1,2) and I refer you to those blog postings for his opinions. They are interesting to read.

The reality is that our policies, even as they are, appear to be acceptable enough for people to deposit their data. We already have over 100 spectra deposited on ChemSpider, with more to come based on recent conversations. Some of these ARE Open Data and the depositors are acknowledged for this. They are sharing their data with you through us. That’s the benefit of building a community for chemists.

 
