I am interviewed quite regularly about ChemSpider, my views on Open Data and data quality on the internet, and the explosion of chemistry data online. So, when I was interviewed recently for the online article “Chemistry’s web of data expands”, I was more than happy to give my thoughts on patent data coming online, data quality and the need for standards for handling chemistry data.
One part of the conversation concerned the work put in to clean up chemistry data on Wikipedia. What seems like an eternity ago I decided to “Dedicate Christmas Time to the Cause of Curating Chemistry on Wikipedia” and initiated a project to check every chemical compound on Wikipedia, bond by bond, atom by atom. However, I very quickly connected with Walkerma, who then introduced me to a number of other Wikipedia chemistry people. I started participating in IRC chats with this group and we exchanged comments about how we could move the project along. It was a pleasure to work with the team and, while I did continue to participate, it was nowhere near the level at which I had contributed in the early days of the project. The project was a collaborative effort for sure, one of the best I have been involved with over the past few years.
When the original article on Nature.com was published it stated “In fact, notes Williams, Wikipedia proved the most reliable source of structure information in that experiment – largely because he had led an effort to clean up the site’s 13,000 structures.” I definitely didn’t want that statement in the article and had specifically requested that I be represented as being part of a collaborative effort. I did not lead the project…I was only a part of it. So, after a couple of email exchanges with the author of the article, Richard van Noorden, the language was changed to “In fact, notes Williams, Wikipedia proved the most reliable online source of structure information in that experiment – largely because of an effort to clean up the site’s 13,000 pages about drugs and chemicals”. It’s a subtle edit, but I definitely did not want to take the credit for leading a project that was an ideal representation of crowdsourcing, collaboration and caring for chemistry on Wikipedia. And, to clarify…I know for a fact that not all pages are fully curated and validated yet…it’s a long process!