Call For Papers: Applications of Cheminformatics and Computational Chemistry in Environmental Health


Applications of Cheminformatics and Computational Chemistry in Environmental Health

253rd American Chemical Society National Meeting & Exposition

“Advanced Materials, Technologies, Systems & Processes”

San Francisco, California, April 2-6, 2017

Abstract Deadline: October 2016


Cheminformatics and computational chemistry have had an enormous impact in providing environmental chemists and toxicologists with access to data, information and knowledge. With an overwhelming array of online resources and an increasingly rich collection of software tools, the ability to source information continues to expand. Scientists typically seek chemical data in the form of chemical properties, function and use, as well as information regarding exposure potential, persistence in the environment and transformation in environmental and biological systems. Commonly, the most pressing concern regarding chemicals is their potential as environmental toxicants. The increasing rate of production and release of new chemicals into commerce requires improved access to historical data and information to assist in hazard and risk assessment. High-throughput in vitro and in silico analyses are increasingly being brought to bear to rapidly screen chemicals for their potential impacts. Interweaving this information with more traditional in vivo toxicity data and exposure estimation, to provide integrated insight into chemical risk, is a burgeoning frontier at the interface of cheminformatics and environmental sciences.

This symposium will bring together a series of talks to provide an overview of the present state of data, tools, databases and approaches available to environmental chemists. The session will cover various modeling approaches and platforms, examine the issues of data quality and curation, and provide attendees with details regarding the availability, utility and applications of these systems. We will focus especially on the availability of Open systems, data and code to ensure no limitations to access and reuse.

Topics covered in this session include, but are not limited to:

  • Environmental chemistry databases
  • Data: Quality, Modeling and Delivery
  • Computational hazard and risk assessment
  • Prioritizing environmental chemicals using screening and predictive computational tools
  • Standards for data exchange and integration in environmental chemistry
  • Implementations of Read-across prediction
  • Adverse Outcome Pathway data and delivery


Please submit your abstracts using the ACS Meeting Abstracts Programming System (MAPS) at  General information about the conference can be found at  Any other inquiries should be directed to the symposium organizers:

Antony J. Williams and Chris Grulke, National Center for Computational Toxicology, Environmental Protection Agency, Research Triangle Park, Durham, NC

Emails: and

BIA-10-2474, confusions in chemical structure and the need for EARLY clarity in chemical structures

My blog has been fairly inactive for the past few months, driven primarily by my move from working on cheminformatics at the Royal Society of Chemistry to working at the National Center for Computational Toxicology at the Environmental Protection Agency. While I stopped working on ChemSpider about 18 months before I left RSC (to focus on the developing RSC Data Repository), my focus on data quality and my long-standing interest in “accuracy in chemical structure representations” have never dwindled. At the EPA-NCCT we are very focused on producing high-quality chemical structure databases, following on from the work of my colleague Ann Richard, who initiated work on DSSTox over a decade ago.

It was therefore with great interest that I became aware of the confusion regarding the chemical structure of BIA-10-2474, a drug that has attracted a lot of attention because of a clinical trial with negative outcomes. I am entering the story late compared to my frequent collaborators and friends Sean Ekins, Chris Southan and Alex Clark, but more about their work later. The news to date is best summarized at Derek’s In the Pipeline blog and in David Kroll’s post on Forbes.

Based on my previous work helping to curate chemical structures on Wikipedia (starting one Christmas in 2008), my experience is that Wikipedia is a GOOD PLACE to source high-quality structures, especially after the work invested in curating chemical data over the years. The first structure for BIA-10-2474 that was reported on Wikipedia is shown below.

ORIGINAL BIA structure

On January 16th Chris performed his usual thorough examination of structure integrity and links to public sources (he is a master in this domain!) but commented specifically “The molecular identity of BIA-10-2474 can only be formally verified directly by BIAL or indirectly from regulatory documentation they may have submitted”, as the chemical structure itself was inferred from the name.

Nevertheless, my friends Sean Ekins and Alex Clark were already investigating what OPEN MODELS may be able to predict about the chemical: see here, here and here. You should be impressed by what is possible when running a molecular structure through several Bayesian models in Alex’s mobile app called PolyPharma!

By January 21st Chris was commenting that the structure had changed, highlighting the extract exposed by Figaro that listed the chemical name: 3-(1-(cyclohexyl(methyl)carbamoyl)-1H-imidazol-4-yl)pyridine 1-oxide. Want to know what that name means as a structure? Take the name “3-(1-(cyclohexyl(methyl)carbamoyl)-1H-imidazol-4-yl)pyridine 1-oxide” and paste it into the free online service OPSIN. The results are shown below.

OPSIN BIA Structure
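
For anyone who would rather script the name-to-structure lookup than paste the name into a web page, here is a minimal Python sketch. It assumes that OPSIN exposes a web service at the base URL used below and that appending ".smi" to the URL-encoded name returns a SMILES string. Both are assumptions to check against the OPSIN site, and the function names are purely illustrative.

```python
from urllib.parse import quote
from urllib.request import urlopen

# ASSUMPTION: base URL of the OPSIN web service; verify on the OPSIN site.
OPSIN_BASE = "https://opsin.ch.cam.ac.uk/opsin/"

# The chemical name quoted in the post.
BIA_NAME = "3-(1-(cyclohexyl(methyl)carbamoyl)-1H-imidazol-4-yl)pyridine 1-oxide"


def opsin_url(name: str, fmt: str = "smi") -> str:
    """Build the request URL for a chemical name.

    The name is percent-encoded so that spaces and parentheses are safe
    in a URL path. The ".smi" suffix convention is an assumption.
    """
    return f"{OPSIN_BASE}{quote(name, safe='')}.{fmt}"


def name_to_smiles(name: str, timeout: float = 10.0) -> str:
    """Fetch a SMILES string for the name (needs network access)."""
    with urlopen(opsin_url(name), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Print the URL only; call name_to_smiles(BIA_NAME) to query the service.
    print(opsin_url(BIA_NAME))
```

Keeping the URL construction separate from the network call means the encoding can be checked offline, and the same URL can simply be pasted into a browser.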

That structure has now found its way to Wikipedia (updated on the 21st January – check out the edits between the two forms of the article here).

FINAL BIA structure

Sean Ekins has maintained a running series of blog posts here. Using a stack of openly accessible algorithms and websites Sean has now produced a whole series of predictions for the “final molecule”. Chris Southan has also continued to expand his work and I direct you to his latest blogpost for more information. Nice stuff Chris.

It took days after news of the drug trial results began to appear before the chemical structure was actually identified (i.e. the structure was blinded). How much work and how much confusion were created by keeping the drug structure blinded? We have to imagine that the authorities had faster access to the details!

It is understandable that companies keep their chemical structures hidden. Patents are intentionally obfuscating (with a compound going into a trial commonly hidden among hundreds, if not tens of thousands, of chemicals that could be enumerated from a Markush structure). Until then Chris Southan will continue to educate the world about how competitive intelligence investigations are done.


Personal experiences in participating in the expanding social networks for science

This is the third presentation I gave at the ACS Meeting in Indianapolis:

The number of social networking sites available to scientists continues to grow. We are being indexed and exposed on the internet via our publications, presentations and data. We have many ways to contribute, annotate and curate, many of them as part of a growing crowdsourcing network. As one of the founders of the online ChemSpider database I was drawn into the world of social networking to participate in the discussions that were underway regarding our developing resource. As a result of my experiences in blogging, and as a result of developing collaborations and engagement with a large community of scientists, I have become very immersed in the expanding social networks for science. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways that I expose my scientific activities online. Many of these activities will ultimately contribute to the developing measures of me as a scientist as identified in the new world of alternative metrics.

Accessing chemical health and safety data online using Royal Society of Chemistry resources

This is the second presentation I gave at the ACS Meeting in Indianapolis:

The internet has opened up access to large amounts of chemistry-related data that can be harvested and assembled into rich resources of value to chemists. The Royal Society of Chemistry’s ChemSpider database has assembled an electronic collection of over 28 million chemicals from over 400 data sources, and some of the assembled data is certainly of value to those searching for chemical health and safety information. Since ChemSpider is a text- and structure-searchable database, chemists are able to find relevant information using either of these general search approaches. This presentation will provide an overview of the types of chemical health and safety data and information made available via ChemSpider and discuss how the data are sourced, aggregated and validated. We will examine how the data can be made available via mobile devices and examine the issue of data quality and its potential impacts on such a database.


Drugs, Racemic Mixtures, Data Curation and Tautomers

Yes, that’s quite a title for a blog post. But it covers the nature of the exchanges that Egon Willighagen and I have been having recently (among others, as we are also co-authoring a book chapter on Computational Toxicology). Egon asked a question on the Blue Obelisk Discussion group about tautomers and I answered it on this blog. Egon has posted a follow-up blog post here. His most recent post makes a series of valid comments…all good and well worth discussing.

Egon then asked about one of the compounds on ChemSpider here, saying:

“Anyway. Tautomerism was a curation issue in the first(!!!) entry I was curating. The sixth had the more well-known problem, I think. I may be blind, but I would say this drug has a stereocenter:

But none of the databases I checked so far (including ChemSpider) defines the stereochemistry! I thought we settled that some decades ago? Stereochemistry of drugs matter. What is going on here?”.

The drug shown is Aminoglutethimide. It’s on Wikipedia here without specific stereochemistry. But as we know, Wikipedia does have errors (see slide 47/126 here). So what gives? It’s on KEGG in the same way. Also on ChEMBL here. But it IS on ChemSpider as both stereoforms…R and S. I would suggest that the drug is likely a racemic mixture of the stereoforms and, as represented on all of the databases, it’s probably okay to not draw the stereobonds as there is only one stereocenter to worry about. Checking Dailymed supports this in this record. A search on “Aminoglutethimide” gives 36,000 hits on Google…I did not wade through them! I think the drug is therefore supplied as a racemate, can be separated (see top Google hits) but is okay on ChemSpider as is.


Finding the Structure of Vitamin K1 Online

You would think that finding the correct structure of Vitamin K1 online in public domain resources would be an easy exercise. But not so fast. Starting from the assumption that the chemical structure in the Merck Index is correct, and then wandering through CAS’s Common Chemistry to validate this assumption, this short movie takes us through Wikipedia, Wolfram Alpha, KEGG, DrugBank, PubChem and other online resources to show how complex and impure the public domain databases are when it comes to sourcing good quality name-structure associations for chemicals. Vitamin K1 is actually a rather simple chemical structure. Finding the correct chemical structure online…not so simple.

Books I am reading – The Autoimmune Epidemic

I seem to be surrounded by people who have developed “autoimmune diseases” (ID) over the past few years. These are commonly people around the age of 40 and are therefore my peer group. It is hard to watch my friends and, over the past few years, members of my immediate family be severely debilitated by some form of ID, whether it’s gastrointestinal in nature, related to thyroid function or some form of multiple chemical sensitivity.

A close personal friend of mine recently gifted me a copy of a book called “The Autoimmune Epidemic: Bodies Gone Haywire in a World Out of Balance–and the Cutting-Edge Science that Promises Hope” and I am close to finishing it. I think the title speaks for itself. With an increasing number of “westerners” being diagnosed with autoimmune diseases, and numbers far exceeding those with cancer, the book makes for interesting and, for me personally, quite shocking reading. As a father of young children I am now concerned about the challenges their bodies will encounter in the future. A recommended read for everyone…not just scientists.


