
Is ChemSpider Dangerous for Students?

12 May

Are students at risk using ChemSpider? It seems so, based on recent commentary by Peter Murray-Rust. Peter has done us the service of test-driving ChemSpider from the point of view of someone interested in inorganic and organometallic complexes. The majority of users perform text-based or structure/substructure searches on organic molecules, and their feedback is mostly congratulatory. It is excellent to receive feedback on the area of chemistry we suspected would be very challenging: inorganics and organometallics. I believe that we all struggle with these types of compounds, and I have therefore compared ChemSpider with the two other databases of note with over 5 million compounds, PubChem and eMolecules.

The power of curation is clear on this blog where Peter has again identified some issues with ChemSpider’s treatment of certain compounds. I have addressed two other situations previously (1,2).

Peter identified an issue with the display of sodium hydride. We have NOT manually examined 10.6 million records, so we were not aware of this bug. The sodium hydride record is now curated with Peter’s comments and the display bug is now fixed. THIS is the power of community feedback. It will take some time to repopulate the images across more than 10 million records though. By comparison, a search of eMolecules produces no hits. A search of PubChem produces a number of hits, one containing a sodium ion and a hydride ion joined by a dative bond.

Peter also identified issues with Prussian Blue, as excerpted below: “… the chemical formula has been represented as separated iron ions and cyanide ions.” The Prussian Blue record is now also curated with Peter’s comments. These complexes are challenging for all of us…so warn your students! The record in question for ChemSpider is here, the records for PubChem are here and the record for eMolecules is here. Look at the display for PubChem 182606 as an example of the challenge.

Prussian Blue on PubChem

Also check the eMolecules display. If you search eMolecules for Prussian Blue you will find 3 results. Check each of them. Here’s an example. Notice any issues?

Prussian Blue on eMolecules

The conversion of search structures via SDF files as well as the display of such compounds is challenging for all of us! The work has already been done this evening to deal with the dative bonds and coordination bonds in such complexes and these structures will be updated in the near future.
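To make the problem concrete, here is a minimal Python sketch of the kind of check that could flag records in which a metal has been stored as a separated ion with no bonds at all, as Peter observed. It is not ChemSpider's code. The molfile text, the `METALS` set and the function names `parse_molfile` and `unbonded_metal_ions` are all assumptions made up for illustration, and the parser splits on whitespace as a simplification (real V2000 molfiles are fixed-width, so this only suits small, well-behaved records).

```python
# Hypothetical V2000 molfile: sodium hydride stored as two disconnected ions.
NAH_MOLFILE = """\
sodium hydride (illustrative)
  sketch

  2  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 Na  0  0  0  0  0  0  0  0  0  0  0  0
    1.5000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
M  CHG  2   1   1   2  -1
M  END
"""

# Illustrative subset only; a real check would need a full list of metals.
METALS = {"Na", "K", "Ca", "Fe", "Cu", "Zn"}


def parse_molfile(text):
    """Return one dict per atom with its symbol, charge and bond count."""
    lines = text.splitlines()
    counts = lines[3].split()  # the counts line follows three header lines
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        symbol = line.split()[3]  # x, y, z, then the element symbol
        atoms.append({"symbol": symbol, "charge": 0, "bonds": 0})

    # Bond block: first two fields are the 1-based atom indices.
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        first, second = (int(f) for f in line.split()[:2])
        atoms[first - 1]["bonds"] += 1
        atoms[second - 1]["bonds"] += 1

    # Property block: "M  CHG  n  atom charge  atom charge ..."
    for line in lines[4 + n_atoms + n_bonds:]:
        parts = line.split()
        if parts[:2] == ["M", "CHG"]:
            pairs = parts[3:3 + 2 * int(parts[2])]
            for i in range(0, len(pairs), 2):
                atoms[int(pairs[i]) - 1]["charge"] = int(pairs[i + 1])
    return atoms


def unbonded_metal_ions(atoms):
    """List metal atoms that carry a charge but take part in no bonds."""
    return [a["symbol"] for a in atoms
            if a["symbol"] in METALS and a["charge"] != 0 and a["bonds"] == 0]


atoms = parse_molfile(NAH_MOLFILE)
flagged = unbonded_metal_ions(atoms)
print(flagged)
```

A flag like this says nothing about whether the record is wrong. For a simple salt, separated ions can be a perfectly reasonable encoding, while for a coordination complex such as Prussian Blue they lose the bonding that a chemist expects to see. All it does is point a curator at records worth a second look.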

Searching millions of organic molecules is not easy, but the truth is that organometallics are more challenging still, and we were conscious there would be issues here. I judge there to be two organizations with the ability to handle these complex molecules appropriately. One is CAS and the other is the Cambridge Crystallographic Database. Certainly it remains a challenge for us, as well as for others. In theory this will be addressed well in CrystalEye, and when these data are made available we will work with the group to determine a path to migrate such complex structures via SDF if possible. This will likely be done if they are to be deposited in PubChem. InChIs are not the solution since, as noted in the InChI FAQ, InChI does not support complex organometallics.

Are students at risk using ChemSpider? There have been recent reports about errors on Wikipedia and about whether or not Wikipedia should be trusted. I know people working hard on populating Wikipedia, and they are passionate individuals attempting to give back to the community. ChemSpider has already challenged the statement about calcium carbonate solubility on this blog; Wikipedia states that it is insoluble, yet the same page discusses the solubility of calcium carbonate (this might be because there is a Wikipedia-accepted definition of insoluble). The ChemSpider team is also working hard and is passionate about what we are doing. What we need is continuing feedback. The best warning we can give at present is that ChemSpider is in beta. But it is here to stay, and we are working on all reported bugs in an appropriate order. As with all other large database resources, students should exercise caution. We are all imperfect.

We are very grateful to Peter for his ongoing feedback regarding ChemSpider. So much so that we have voted Peter our “Tester of the Month”. The feedback is welcome. We’ve already fixed all the bugs…publishing the update to more than 10 million structures will take time though.

About Tony

Antony (Tony) J. Williams received his BSc in 1985 from the University of Liverpool (UK) and PhD in 1988 from the University of London (UK). His PhD research interests were in studying the effects of high pressure on molecular motions within lubricant related systems using Nuclear Magnetic Resonance. He moved to Ottawa, Canada to work for the National Research Council performing fundamental research on the electron paramagnetic resonance of radicals trapped in single crystals. Following his postdoctoral position he became the NMR Facility Manager for Ottawa University. Tony joined the Eastman Kodak Company in Rochester, New York as their NMR Technology Leader. He led the laboratory to develop quality control across multiple spectroscopy labs and helped establish walk-up laboratories providing NMR, LC-MS and other forms of spectroscopy to hundreds of chemists across multiple sites. This included the delivery of spectroscopic data to the desktop, automated processing and his initial interests in computer-assisted structure elucidation (CASE) systems. He also worked with a team to develop the world’s first web-based LIMS system, WIMS, capable of allowing chemical structure searching and spectral display. With his developing cheminformatic skills and passion for data management he left corporate America to join a small start-up company working out of Toronto, Canada. He joined ACD/Labs as their NMR Product Manager and held various roles, including Chief Science Officer, during his 10 years with the company. His responsibilities included managing over 50 products at one time prior to developing a product management team, managing sales, marketing, technical support and technical services. ACD/Labs was one of Canada’s Fast 50 Tech Companies, and Forbes Fast 500 companies in 2001. His primary passions during his tenure with ACD/Labs were the continued adoption of web-based technologies and developing automated structure verification and elucidation platforms.
While at ACD/Labs he suggested the possibility of developing a public resource for chemists attempting to integrate internet available chemical data. He finally pursued this vision with some close friends as a hobby project in the evenings and the result was the ChemSpider database (www.chemspider.com). Even while running out of a basement on hand built servers the website developed a large community following that eventually culminated in the acquisition of the website by the Royal Society of Chemistry (RSC) based in Cambridge, United Kingdom. Tony joined the organization, together with some of the other ChemSpider team, and became their Vice President of Strategic Development. At RSC he continued to develop cheminformatics tools, specifically ChemSpider, and was the technical lead for the chemistry aspects of the Open PHACTS project (http://www.openphacts.org), a project focused on the delivery of open data, open source and open systems to support the pharmaceutical sciences. He was also the technical lead for the UK National Chemical Database Service (http://cds.rsc.org/) and the RSC lead for the PharmaSea project (http://www.pharma-sea.eu/) attempting to identify novel natural products from the ocean. He left RSC in 2015 to become a Computational Chemist in the National Center of Computational Toxicology at the Environmental Protection Agency where he is bringing his skills to bear working with a team on the delivery of a new software architecture for the management and delivery of data, algorithms and visualization tools. The “Chemistry Dashboard” was released on April 1st, no fooling, at https://comptox.epa.gov, and provides access to over 700,000 chemicals, experimental and predicted properties and a developing link network to support the environmental sciences. Tony remains passionate about computer-assisted structure elucidation and verification approaches and continues to publish in this area. 
He is also passionate about teaching scientists to benefit from the developing array of social networking tools for scientists and is known as the ChemConnector on the networks. Over the years he has had adjunct roles at a number of institutions and presently enjoys working with scientists at both UNC Chapel Hill and NC State University. He is widely published with over 200 papers and book chapters and was the recipient of the Jim Gray Award for eScience in 2012. In 2016 he was awarded the North Carolina ACS Distinguished Speaker Award.

Posted on May 12, 2007 in ChemSpider Chemistry, Uncategorized

 
