The EPA Online Prediction Physicochemical Prediction Platform to Support Environmental Scientists

This poster was presented at the American Chemical Society meeting in Philadelphia in August 2016, both at the Sci-Mix gathering and in the ENVR session on Wednesday.

SESSION: Sci-Mix
SESSION TIME:
August 22, 2016 from 8:00 PM to 10:00 PM

and

SESSION TIME: Wednesday, August 24, 2016, 6:00 PM – 8:00 PM
ROOM & LOCATION:
Hall D – Pennsylvania Convention Center

Poster Title: The EPA Online Prediction Physicochemical Prediction Platform to Support Environmental Scientists

As part of our efforts to develop a public platform to provide access to predictive models, we have attempted to disentangle the influence of the quality versus quantity of data available to develop and validate QSAR models. Using a thorough manual review of the data underlying the well-known EPI Suite software, we developed automated processes for the validation of the data using a KNIME workflow. This includes approaches to validate different chemical structure representations (e.g. molfile and SMILES) and identifiers (chemical names and registry numbers), and methods to standardize the data into QSAR-consumable formats for modeling. Our efforts to quantify and segregate data into various quality categories have allowed us to thoroughly investigate the resulting models developed from these data slices, and to examine whether or not efforts to develop large high-quality datasets have the expected pay-off in terms of prediction performance. Machine-learning approaches have been applied to create a series of models that have been used to generate predicted physicochemical and environmental parameters for over 700,000 chemicals. These data are available online via the EPA’s iCSS Chemistry Dashboard. This abstract does not reflect U.S. EPA policy.
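To make the idea of automated identifier validation concrete, here is a minimal sketch of one such check: verifying the check digit of a CAS Registry Number. This is an illustration only and is not taken from the KNIME workflow described in the abstract; the function name `is_valid_cas_rn` is hypothetical. It relies on the published CAS rule that the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10.

```python
import re

# CAS Registry Numbers have the form NNNNNNN-NN-N (2 to 7 digits, 2 digits, 1 check digit).
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Return True if the string is well formed and its check digit is consistent.

    Hypothetical helper for illustration; it only checks internal consistency,
    not whether the number is actually assigned to a substance.
    """
    match = CAS_PATTERN.match(cas_rn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)   # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the body.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


if __name__ == "__main__":
    for candidate in ["7732-18-5", "58-08-2", "7732-18-4", "not-a-cas"]:
        print(candidate, is_valid_cas_rn(candidate))
```

A check like this only catches transcription errors in the registry number itself. Confirming that a name, a registry number and a structure all refer to the same chemical still requires a lookup against a curated source.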

Investigating Impact Metrics for Performance for the US-EPA National Center for Computational Toxicology

This presentation was given at the American Chemical Society meeting in Philadelphia in August 2016.

DAY & TIME OF PRESENTATION: Sunday, August 21, 2016 from 4:10 PM – 4:30 PM
ROOM & LOCATION: Room 112B – Pennsylvania Convention Center

Title: Investigating Impact Metrics for Performance for the US-EPA National Center for Computational Toxicology

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. We have delivered public access to terabytes of open data, as well as to a large number of publicly accessible databases and applications, to support the research efforts of a large community of scientists. Many of our contributions to science are summarily described in research papers, but to date we have not optimized our contributions to inform the altmetrics statistics associated with our work. Critically missing from altmetrics are the usage of our numerous software applications and web services, as well as the growing importance of our experimental data and models (e.g. ToxCast, ExpoCast, DSSTox and others) to the scientific and regulatory communities. This presentation will provide an overview of our efforts to more fully understand, and quantify, our impact on the environmental sciences using a combination of our measurement approaches and available altmetrics tools. This abstract does not reflect U.S. EPA policy.

Structure Identification Using High Resolution Mass Spectrometry Data and the EPA’s Chemistry Dashboard

This presentation was given at the American Chemical Society meeting in Philadelphia in August 2016.

DAY & TIME OF PRESENTATION: Sunday, August 21, 2016 from 1:10 PM – 1:35 PM
ROOM & LOCATION: Room 105A – Pennsylvania Convention Center

Title: Structure Identification Using High Resolution Mass Spectrometry Data and the EPA’s Chemistry Dashboard

The iCSS Chemistry Dashboard is a publicly accessible dashboard provided by the National Center for Computational Toxicology at the US-EPA. It serves a number of purposes, including providing a chemistry database underpinning many of our public-facing projects (e.g. ToxCast and ExpoCast). The available data and searches provide a valuable path to structure identification using mass spectrometry as the source data. With an underlying database of over 720,000 chemicals, the dashboard has already been used to assist in identifying chemicals present in house dust. However, it can also be applied to many other purposes, e.g., the identification of agrochemicals in waste streams. This presentation will provide a review of the EPA’s platform and underlying algorithms used for the purpose of compound identification using high-resolution mass spectrometry data. We will also discuss progress towards a high-throughput non-targeted analysis platform for use by the mass spectrometry community. This abstract does not reflect U.S. EPA policy.

 

Presentations and Posters at #ACSPhiladelphia August 2016

I will be delivering five presentations and a poster (twice) at the ACS Meeting in Philadelphia this week. These presentations will introduce the latest version of our CompTox Dashboard, renamed from the iCSS Chemistry Dashboard because now we are offering way more than just a large set of chemical structures! I look forward to introducing attendees to the latest and greatest.

DAY & TIME OF PRESENTATION: Sunday, August 21, 2016 from 1:10 PM – 1:35 PM
ROOM & LOCATION: Room 105A – Pennsylvania Convention Center

Title: Structure Identification Using High Resolution Mass Spectrometry Data and the EPA’s Chemistry Dashboard

The iCSS Chemistry Dashboard is a publicly accessible dashboard provided by the National Center for Computational Toxicology at the US-EPA. It serves a number of purposes, including providing a chemistry database underpinning many of our public-facing projects (e.g. ToxCast and ExpoCast). The available data and searches provide a valuable path to structure identification using mass spectrometry as the source data. With an underlying database of over 720,000 chemicals, the dashboard has already been used to assist in identifying chemicals present in house dust. However, it can also be applied to many other purposes, e.g., the identification of agrochemicals in waste streams. This presentation will provide a review of the EPA’s platform and underlying algorithms used for the purpose of compound identification using high-resolution mass spectrometry data. We will also discuss progress towards a high-throughput non-targeted analysis platform for use by the mass spectrometry community. This abstract does not reflect U.S. EPA policy.

 

DAY & TIME OF PRESENTATION: Sunday, August 21, 2016 from 4:10 PM – 4:30 PM
ROOM & LOCATION: Room 112B – Pennsylvania Convention Center

Title: Investigating Impact Metrics for Performance for the US-EPA National Center for Computational Toxicology

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. We have delivered public access to terabytes of open data, as well as to a large number of publicly accessible databases and applications, to support the research efforts of a large community of scientists. Many of our contributions to science are summarily described in research papers, but to date we have not optimized our contributions to inform the altmetrics statistics associated with our work. Critically missing from altmetrics are the usage of our numerous software applications and web services, as well as the growing importance of our experimental data and models (e.g. ToxCast, ExpoCast, DSSTox and others) to the scientific and regulatory communities. This presentation will provide an overview of our efforts to more fully understand, and quantify, our impact on the environmental sciences using a combination of our measurement approaches and available altmetrics tools. This abstract does not reflect U.S. EPA policy.

DAY & TIME OF PRESENTATION: Wednesday, August 24, 2016 from 9:40 AM – 10:00 AM
ROOM & LOCATION:
Juniper’s Ballroom – Philadelphia Downtown Courtyard by Marriott

Title: Delivering The Benefits of Chemical-Biological Integration in Computational Toxicology at the EPA

Abstract: Researchers at the EPA’s National Center for Computational Toxicology integrate advances in biology, chemistry, and computer science to examine the toxicity of chemicals and help prioritize chemicals for further research based on potential human health risks. The intention of this research program is to quickly evaluate thousands of chemicals for potential risk, at much reduced cost relative to historical approaches. This work involves computational and data-driven approaches including high-throughput screening, modeling, text-mining and the integration of chemistry, exposure and biological data. We have developed a number of databases and applications that are delivering on the vision of a deeper understanding of chemicals and their effects on exposure and biological processes, and that are supporting a large community of scientists in their research efforts. This presentation will provide an overview of our work to bring together diverse large-scale data from the chemical and biological domains, our approaches to integrate and disseminate these data, and the delivery of models supporting computational toxicology. This abstract does not reflect U.S. EPA policy.

 

DAY & TIME OF PRESENTATION: Wednesday, August 24, 2016 from 11:10 AM – 11:40 AM
ROOM & LOCATION: Ormandy East – DoubleTree by Hilton Hotel Philadelphia Center City

Title: Data Aggregation, Curation and Modeling Approaches to Deliver Prediction Models to Support Computational Toxicology at the EPA

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program develops and utilizes QSAR modeling approaches across a broad range of applications. In terms of physical chemistry we have a particular interest in the prediction of basic physicochemical parameters such as logP, aqueous solubility, vapor pressure and other parameters for use in our exposure models or for the purpose of modeling environmental toxicity. We are also interested in the development of models related to environmental fate. As a result of our efforts we have assembled and curated data sets for various physicochemical properties and, utilizing modern machine-learning modeling approaches, have developed a number of high-performing models that we are now delivering to the public. Our website, the iCSS Chemistry Dashboard, provides access to data predicted for over 700,000 chemical compounds. The original training data are available for review and the details of prediction for each endpoint include the domain of applicability as well as a measure of performance accuracy. This presentation will provide an overview of the existing aggregated data, our approaches to data curation and our progress towards an interactive environment for prediction of physicochemical and environmental fate parameters. The utilization of these parameters to support read-across approaches will also be discussed. This abstract does not reflect U.S. EPA policy.

 

DAY & TIME OF PRESENTATION: Thursday, August 25, 2016 from 3:00 PM – 3:20 PM
ROOM & LOCATION: Room 104A – Pennsylvania Convention Center

Title: The EPA iCSS Chemistry Dashboard to Support Compound Identification Using High Resolution Mass Spectrometry Data

There is a growing need for rapid chemical screening and prioritization to inform regulatory decision-making on thousands of chemicals in the environment. We have previously used high-resolution mass spectrometry to examine household vacuum dust samples using liquid chromatography time-of-flight mass spectrometry (LC-TOF/MS). Using a combination of exact mass, isotope distribution, and isotope spacing, molecular features were matched with a list of chemical formulas from the EPA’s Distributed Structure-Searchable Toxicity (DSSTox) database. This has further developed our understanding of how openly available chemical databases, together with the appropriate searches, could be used for the purpose of compound identification. We report here on the utility of the EPA’s iCSS Chemistry Dashboard for the purpose of compound identification using searches against a database of over 720,000 chemicals. We also examine the benefits of QSAR prediction for the purpose of retention time prediction to allow for alignment of both chromatographic and mass spectral properties. This abstract does not reflect U.S. EPA policy.
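As a rough illustration of the exact-mass part of that matching, the sketch below computes monoisotopic masses for a list of candidate formulas and keeps those within a ppm tolerance of an observed neutral mass. It is a simplified example and not the dashboard's algorithm: the function names, the candidate list and the 5 ppm default are assumptions, adduct and charge handling are ignored, and isotope distribution and isotope spacing are not modeled at all.

```python
import re

# Monoisotopic masses (in Da) of the most abundant isotopes; only a few
# elements are included to keep the sketch short.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C8H10N4O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means one atom; unknown elements raise KeyError.
        total += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    return total


def match_by_exact_mass(observed_mass, candidate_formulas, tolerance_ppm=5.0):
    """Return (formula, ppm error) pairs within the tolerance, best match first."""
    hits = []
    for formula in candidate_formulas:
        theoretical = monoisotopic_mass(formula)
        error_ppm = (observed_mass - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tolerance_ppm:
            hits.append((formula, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # Hypothetical candidate list: caffeine, nicotine, aspirin.
    candidates = ["C8H10N4O2", "C10H14N2", "C9H8O4"]
    for formula, error in match_by_exact_mass(194.0805, candidates):
        print(f"{formula}: {error:+.2f} ppm")
```

In practice many formulas, and many structures per formula, fall within a few ppm of the same mass, which is why the additional isotope information and database ranking described in the abstract matter.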

 

SESSION: Sci-Mix
SESSION TIME:
August 22, 2016 from 8:00 PM to 10:00 PM

and

SESSION TIME: Wednesday, August 24, 2016, 6:00 PM – 8:00 PM
ROOM & LOCATION:
Hall D – Pennsylvania Convention Center

Poster Title: The EPA Online Prediction Physicochemical Prediction Platform to Support Environmental Scientists

As part of our efforts to develop a public platform to provide access to predictive models, we have attempted to disentangle the influence of the quality versus quantity of data available to develop and validate QSAR models. Using a thorough manual review of the data underlying the well-known EPI Suite software, we developed automated processes for the validation of the data using a KNIME workflow. This includes approaches to validate different chemical structure representations (e.g. molfile and SMILES) and identifiers (chemical names and registry numbers), and methods to standardize the data into QSAR-consumable formats for modeling. Our efforts to quantify and segregate data into various quality categories have allowed us to thoroughly investigate the resulting models developed from these data slices, and to examine whether or not efforts to develop large high-quality datasets have the expected pay-off in terms of prediction performance. Machine-learning approaches have been applied to create a series of models that have been used to generate predicted physicochemical and environmental parameters for over 700,000 chemicals. These data are available online via the EPA’s iCSS Chemistry Dashboard. This abstract does not reflect U.S. EPA policy.

 

Zika Virus and a hypothesis regarding the impact of Pyriproxyfen

I have been interested in the Zika Virus ever since I heard about it while visiting Brazil last year to give a talk at the Brazilian Natural Products conference. What I did not expect was the incredible surge in worldwide attention that Zika would attract. I am grateful to have been included in the work led by Sean Ekins (@collabchem) in the perspective “Open Drug Discovery for the Zika Virus” recently published on F1000Research. Up until last week the hypothesis was that the surge in microcephaly was caused by the mosquito-borne Zika virus, but now the suggestion is that it may be related to a larvicide.

The chemical being named as the offending agent is Pyriproxyfen. I had never even heard of this chemical until a couple of days ago. At that time there was nothing on Wikipedia but, of course, it has since been updated with this:

“In 2014, pyriproxifen was put into Brazilian water supplies to fight the proliferation of mosquito larvae.[2] Some Brazilian doctors have hypothesized that pyriproxyfen, not the Zika virus, is the cause of the 2015-2016 microcephaly epidemic in Brazil. [3]

Consequently, in 2016, the Brazilian state of Rio Grande do Sul suspended pyriproxyfen’s use. The Health Minister of Brazil, Marcelo Castro, criticized this step, noting that the claim is “a rumor lacking logic and sense. It has no basis.” They also noted that the insecticide is approved by the National Sanitary Monitoring Agency and “all regulatory agencies in the whole world”. The manufacturer of the insecticide, Sumitomo Chemical, stated “there is no scientific basis for such a claim” and also referred to the approval of pyriproxyfen by the World Health Organization since 2004 and the United States Environmental Protection Agency since 2001.[4]

Noted skeptic David Gorski discussed the claim and pointed out that anti-vaccine proponents had also claimed that the Tdap vaccine was the cause of the microcephaly epidemic, due to its introduction in 2014, along with adding “One can’t help but wonder what else the Brazilian Ministry of Health did in 2014 that cranks can blame microcephaly on.” Gorski also pointed out the extensive physiochemical understanding of pyriproxyfen that the WHO has, which concluded in a past evaluation that the insecticide is not genotoxic, and that the doctor organization making the claim has been advocating against all pesticides since 2010, complicating their reliability.[2][5]”

Because we live in a time of Open Data, and at a time when there is soooooo much information available on open databases, I thought I would go after any evidence-based identification of the chemical as a potential contributor to the explosion in Microcephaly.

PubChem exposes a LOT of useful data under the Safety and Hazards tab. The long-term exposure data point to issues with blood and liver. FIFRA requirements are listed on PubChem and toxicity data are also available here. Reproductive toxicity data are limited to reports in animals, which state:

/LABORATORY ANIMALS: Developmental or Reproductive Toxicity/ In /a/ developmental study in rats, a maternal NOAEL/LOAEL were determined to be 100 mg/kg/day and 300 mg/kg/day, respectively. These findings were based on increased incidences in mortality and clinical signs at 1,000 mg/kg/day with decreased in food consumption, body weight, and body weight gain together with increases in water consumption at 300 and 1,000 mg/kg/day. The developmental NOAEL /and/ /LOAEL were 100 mg/kg/day and 300 mg/kg/day /respectively/ based on the incr of skeletal variations at 300 mg/kg/day and above.

64 FR 56681 (10/21/99). Available from, as of April 28, 2003: http://www.epa.gov/EPA-PEST/1999/October/Day-21/p27398.htm
/LABORATORY ANIMALS: Developmental or Reproductive Toxicity/ In /a/ developmental study in rabbits, the maternal NOAEL/LOAEL for maternal toxicity were 100 and 300 mg/kg/day based on premature delivery/abortions, soft stools, emaciation, decreased activity and bradypnea. The developmental NOAEL was determined to be 300 mg/kg/day and developmental LOAEL was /not/ … determined; no dose related anomalies occurred in the four remaining litters studied at 1,000 mg/kg/day.

64 FR 56681 (10/21/99). Available from, as of April 28, 2003: http://www.epa.gov/EPA-PEST/1999/October/Day-21/p27398.htm
/LABORATORY ANIMALS: Developmental or Reproductive Toxicity/ In a 2-generation reproduction study in rats, the systemic NOAEL was 1,000 ppm (87 mg/kg/day). The LOAEL for systemic toxicity was 5,000 ppm (453 mg/kg/day). Effects were based on decreased body weight, weight gain and food consumption in both sexes and both generations, and increased liver weights in both sexes associated with liver and kidney histopathology in males. The reproductive NOAEL was 5,000 ppm. A reproductive LOAEL was not established.

64 FR 56681 (10/21/99). Available from, as of April 28, 2003: http://www.epa.gov/EPA-PEST/1999/October/Day-21/p27398.htm
Just to point out that this information, which is valuable, is sourced from HSDB.
A search for pyriproxyfen reports on PubMed doesn’t seem to turn up anything about birth defects that I can find.
There is no evidence, yet, for the potential impact of this chemical on the incidence of microcephaly, but the hypothesis is now out there and it will be interesting to see what happens as investigations are pursued. As yet I have no opinion... but will be watching with interest to see what comes out.