
Category Archives: ACS Meetings

Presentations at the Spring ACS Meeting in Orlando, April 2019

I am giving a number of presentations at the ACS meeting in Orlando in April 2019. If you are interested in coming to listen, and maybe chatting afterward, please see the list below.

1) PAPER ID: 3080890 
PAPER TITLE: Consensus ranking and fragmentation prediction for identification of unknowns in high resolution mass spectrometry (final paper number: AGFD 10)


DIVISION: Division of Agricultural and Food Chemistry
SESSION: Recent Advances in Food Fraud & Authenticity Analysis
SESSION TIME: 8:30 AM – 10:55 AM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Sunday, March 31, 2019 from 9:25 AM – 9:50 AM
ROOM & LOCATION: Florida Ballroom B  – Hyatt Regency Orlando 

Title: Consensus ranking and fragmentation prediction for identification of unknowns in high resolution mass spectrometry

Antony J. Williams1, Andrew McEachran2, Tommy Cathey3, Tom Transue3, Jon Sobus4

High resolution mass spectrometry (HRMS) and non-targeted analysis (NTA) are advancing the identification of emerging contaminants in environmental and agricultural matrices.  However, confidence in structure identification of unknowns in NTA presents challenges to analytical chemists.  Structure identification requires integration of complementary data types such as reference databases, fragmentation prediction tools, and retention time prediction models.  The goal of this research is to optimize and implement structure identification functionality within the US EPA’s CompTox Chemicals Dashboard, an open chemistry resource and web application containing data for ~760,000 substances.  Rank-ordering the number of sources associated with chemical records within the Dashboard (Data Source Ranking) improves the identification of unknowns by bringing the most likely candidate structures to the top of a search results list.  Incorporating additional data streams contained within the database underlying the Dashboard further enhances identifications.  Integrating tandem mass spectrometry data into NTA workflows enables spectral match scores and increases confidence in structural assignments.  We have generated and stored predicted MS/MS fragmentation spectra for the entirety of the Chemistry Dashboard using the in silico prediction tool CFM-ID.  Predicted fragments incorporated into the identification workflow were used as both a scoring term and as a candidate threshold cutoff.  Combining these steps within an open chemistry resource provides a freely available software tool for structure identification and NTA. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
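As a rough illustration of the ranking idea described above, the following Python sketch drops candidates whose predicted-fragment match score falls below a cutoff and then orders the rest by the number of associated data sources. The candidate records, field names, and threshold value are hypothetical and are not taken from the Dashboard; this is a sketch of the general approach, not the Dashboard's implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # hypothetical candidate label
    source_count: int      # number of data sources associated with the record
    fragment_score: float  # assumed 0..1 match against predicted MS/MS fragments


def rank_candidates(candidates, min_fragment_score=0.2):
    """Apply the fragment-score cutoff, then rank by source count.

    Ties on source count are broken by the fragment score.
    The default cutoff of 0.2 is an arbitrary illustrative value.
    """
    kept = [c for c in candidates if c.fragment_score >= min_fragment_score]
    return sorted(
        kept,
        key=lambda c: (c.source_count, c.fragment_score),
        reverse=True,
    )


# Made-up candidates sharing the same formula.
candidates = [
    Candidate("candidate A", source_count=12, fragment_score=0.65),
    Candidate("candidate B", source_count=87, fragment_score=0.40),
    Candidate("candidate C", source_count=150, fragment_score=0.05),
]

ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

In this toy example candidate C has the most sources but is removed by the fragment cutoff, so candidate B rises to the top of the list.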

2) PAPER ID: 3081133 
PAPER TITLE: Applications of the US EPA’s CompTox chemicals dashboard to support structure identification and chemical forensics using mass spectrometry (final paper number: ANYL 320)


DIVISION: Division of Analytical Chemistry
SESSION: Frontiers in Forensic Mass Spectrometry
SESSION TIME: 8:00 AM – 12:10 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Tuesday, April 2, 2019 from 11:40 AM – 12:10 PM
ROOM & LOCATION: Plaza International Ballroom K  – Hyatt Regency Orlando

Title: Applications of the US EPA’s CompTox Chemicals Dashboard to support structure identification and chemical forensics using mass spectrometry

Antony J. Williams, Andrew D. McEachran, Jon R. Sobus and Emma Schymanski

High resolution mass spectrometry (HRMS) and non-targeted analysis (NTA) are of increasing interest in chemical forensics for the identification of emerging contaminants and chemical signatures of interest. At the US Environmental Protection Agency, our research using HRMS for non-targeted and suspect screening analyses utilizes databases and cheminformatics approaches that are applicable to chemical forensics. The CompTox Chemicals Dashboard is an open chemistry resource and web-based application containing data for ~760,000 substances. Basic functionality for searching through the data is provided through identifier searches, such as systematic name, trade names and CAS Registry Numbers. Advanced Search capabilities supporting mass spectrometry include mass and formula-based searches, combined substructure-mass searches and searching experimental mass spectral data against predicted fragmentation spectra. A specific type of data mapping in the underpinning database, using “MS-Ready” structures, has proven to be a valuable approach for structure identification that links structures that can be identified via HRMS with related substances in the form of salts, and other multi-component mixtures that are available in commerce. This presentation will provide an overview of the CompTox Chemicals Dashboard and demonstrate its utility for supporting structure identification and NTA in chemical forensics. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
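To make the idea of a mass-based search concrete, here is a minimal Python sketch that looks up an observed neutral monoisotopic mass in a small in-memory table using a parts-per-million tolerance. The three-entry table, the function names, and the 5 ppm default are assumptions chosen for illustration; the Dashboard's own search runs against its underlying database and is not reproduced here.

```python
def ppm_error(observed_mass, reference_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - reference_mass) / reference_mass * 1e6


def search_by_mass(observed_mass, records, tolerance_ppm=5.0):
    """Return (name, reference mass, ppm error) for records within tolerance.

    `records` is an iterable of (name, monoisotopic_mass) pairs.
    Hits are sorted so the smallest absolute error comes first.
    """
    hits = []
    for name, mono_mass in records:
        err = ppm_error(observed_mass, mono_mass)
        if abs(err) <= tolerance_ppm:
            hits.append((name, mono_mass, err))
    return sorted(hits, key=lambda hit: abs(hit[2]))


# Tiny illustrative table of neutral monoisotopic masses (Da).
records = [
    ("caffeine", 194.080376),     # C8H10N4O2
    ("atrazine", 215.093773),     # C8H14ClN5
    ("bisphenol A", 228.115030),  # C15H16O2
]

hits = search_by_mass(194.0810, records)
for name, mass, err in hits:
    print(f"{name}: {mass:.4f} Da ({err:+.2f} ppm)")
```

A real workflow would first convert an observed ion m/z to a neutral mass by accounting for the adduct, which this sketch leaves out.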

3) PAPER ID: 3084559 
PAPER TITLE: Antony Williams, the ChemConnector: A career path through a diverse series of roles and responsibilities (final paper number: CINF 25)

DIVISION: Division of Chemical Information
SESSION: Careers in Chemical Information
SESSION TIME: 1:30 PM – 4:25 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Sunday, March 31, 2019 from 3:05 PM – 3:25 PM
ROOM & LOCATION: West Hall B4 – Theater 11  – Orange County Convention Center

Antony Williams, the ChemConnector – a career path through a diverse series of roles and responsibilities

Authors: Antony Williams

Antony Williams is a Computational Chemist at the US Environmental Protection Agency in the National Center for Computational Toxicology. He has been involved in cheminformatics and the dissemination of chemical information for over twenty-five years. He has worked for a Fortune 500 company (Eastman Kodak), in two successful start-ups (ACD/Labs and ChemSpider), for the Royal Society of Chemistry (in publishing) and, now, at the EPA. Throughout his career path he has experienced multiple diverse work cultures and focused his efforts on understanding the needs of his employers and the often unrecognized needs of a larger community. Antony will provide a short overview of his career path and discuss the various decisions that helped motivate his change in career from professional spectroscopist to website host and innovator, to working for one of the world’s foremost scientific societies and now for one of the most impactful government organizations in the world. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

4) PAPER ID: 3084590 
PAPER TITLE: US-EPA CompTox chemicals dashboard: A web-based data integration hub for environmental chemistry data (final paper number: CINF 43)


DIVISION: Division of Chemical Information
SESSION: Web-Based Chemoinformatics Platforms
SESSION TIME: 8:00 AM – 11:50 AM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Monday, April 1, 2019 from 11:20 AM – 11:50 AM
ROOM & LOCATION: West Hall B4 – Theater 10  – Orange County Convention Center

The EPA CompTox Chemicals Dashboard as a Data Integration Hub for Environmental Chemistry Data

Authors: Antony Williams, Andrew McEachran, Imran Shah, Richard Judson, John Wambaugh, Nancy Baker, George Helman, Chris Grulke, Kamel Mansouri, Grace Patlewicz, Ann Richard and Jeff Edwards.

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This involves computational and data-driven approaches that integrate chemistry, exposure and biological data. The National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences, including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. The CompTox Chemicals Dashboard is a web-based application providing access to data associated with ~760,000 chemical substances. New data are added to the database on an ongoing basis, along with registration of new and emerging chemicals. This includes data extracted from the literature, identified by our analytical labs, or otherwise of interest to the agency in support of specific research projects. By adding these data, with their associated chemical identifiers (names and CAS Registry Numbers), the dashboard uses linking approaches to allow for automated searching of PubMed, Google Scholar and an array of public databases. This presentation will provide an overview of the CompTox Chemicals Dashboard, how it has developed into an integrated data hub for environmental data, and how it can be used for the analysis of emerging chemicals in terms of sourcing related chemicals of interest, and deriving read-across as well as QSAR predictions in real time. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

5) PAPER ID: 3084575 
PAPER TITLE: EPA CompTox chemicals dashboard: An online resource for environmental chemists (final paper number: CINF 94)


DIVISION: Division of Chemical Information
SESSION: Applications of Cheminformatics to Environmental Science
SESSION TIME: 8:00 AM – 12:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Wednesday, April 3, 2019 from 8:25 AM – 8:45 AM

ROOM & LOCATION: West Hall B4 – Theater 10  – Orange County Convention Center 

EPA CompTox Chemicals Dashboard – an online resource for environmental chemists

Authors: Antony Williams, Chris Grulke, Jennifer Smith, Kamel Mansouri, Andrew McEachran, Kathie Dionisio, Katherine Phillips, Grace Patlewicz, Jeremy Fitzpatrick, Nancy Baker, Todd Martin, Ann Richard and Jeff Edwards

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data driven approaches that integrate chemistry, exposure and biological data. As an outcome of these efforts the National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. A series of software applications and databases have been produced over the past decade to deliver these data. Recent work has focused on the development of a new architecture that assembles the resources into a single platform. With a focus on delivering access to Open Data streams, web service integration accessibility and a user-friendly web application the CompTox Chemicals Dashboard provides access to data associated with ~720,000 chemical substances. These data include research data in the form of bioassay screening data associated with the ToxCast program, experimental and predicted physicochemical properties, product and functional use information and related data of value to environmental scientists. This presentation will provide an overview of the CompTox Chemicals Dashboard and its value to the community as an informational hub. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

6) PAPER ID: 3095464 
PAPER TITLE: Cheminformatics approaches to support chemical identification delivered via the EPA CompTox Chemicals Dashboard (final paper number: ENVR 173)


DIVISION: Division of Environmental Chemistry
SESSION: Accurate Mass/High Resolution Mass Spectrometry for Environmental Monitoring & Remediation
SESSION TIME: 1:00 PM – 4:10 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Monday, April 1, 2019 from 1:25 PM – 1:45 PM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 8  – Orange County Convention Center

Cheminformatics approaches to support chemical identification delivered via the EPA CompTox Chemicals Dashboard

Antony J. Williams, Andrew McEachran, Chris M. Grulke, Elin M. Ulrich and Jon R. Sobus

The identification of chemicals in environmental media depends on the application of analytical methods, the primary approach being one of the multiple mass spectrometry techniques. Cheminformatics solutions are critical to supporting the chemical identification process. This includes the assembly of large chemical substance databases, prioritization ranking of potential candidate search hits, and search approaches that support both targeted and non-targeted screening approaches. The US Environmental Protection Agency CompTox Chemicals Dashboard is a web-based application providing access to data for over 760,000 chemical substances. This includes access to physicochemical property, environmental fate and transport data, both human and ecological toxicity data, information regarding chemicals contained in products in commerce, and in vitro bioactivity data. Searches are allowed based on chemical identifiers, product and use, genes and assays associated with the EPA ToxCast assays and, specific to supporting mass spectrometry, searches based on masses and formulae. These searches make use of a novel “MS-Ready structures” approach collapsing chemicals related as mixtures, salts, stereoforms and isotopomers. The dashboard supports both singleton and batch searching by accurate mass/chemical formula, supported by MS-ready structures, and utilizes rich metadata to facilitate candidate ranking and the prioritization of chemicals of concern based on toxicity and exposure data. The dashboard also hosts tens of chemical lists that have been assembled from public databases, many supporting non-targeted analysis and mass spectrometry databases.

This presentation will provide an overview of the dashboard and will review our latest research into structure identification by searching experimental mass spectrometry data against predicted fragmentation spectra for LC-MS (positive and negative ion mode) and GC-MS (EI), a total of 3 million predicted spectra. We will also provide an overview of our progress supporting structure and substructure searching, using mass and formula-based filtering, and report on the latest applications of the dashboard to support structure identification projects of interest to the EPA. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
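One simple way to picture searching experimental data against predicted fragmentation spectra is to count how many predicted fragment m/z values appear in the experimental peak list within a mass tolerance. The Python sketch below does exactly that; the peak lists, candidate names, and 0.01 Da tolerance are invented for illustration, and this matched-fraction score is a stand-in rather than the scoring actually used with the predicted spectra.

```python
def matched_fraction(experimental_mz, predicted_mz, tol_da=0.01):
    """Fraction of predicted fragment m/z values found in the experimental peaks.

    A predicted fragment counts as matched when any experimental peak lies
    within `tol_da` of it. Intensities are ignored in this simplified score.
    """
    if not predicted_mz:
        return 0.0
    matched = sum(
        1
        for p in predicted_mz
        if any(abs(p - e) <= tol_da for e in experimental_mz)
    )
    return matched / len(predicted_mz)


# Hypothetical experimental MS/MS peaks (m/z) for one unknown.
experimental = [110.071, 138.066, 195.088]

# Hypothetical predicted fragments for two candidate structures.
predicted = {
    "candidate X": [110.0713, 138.0662, 195.0877],
    "candidate Y": [91.054, 119.049, 195.0877],
}

scores = {
    name: matched_fraction(experimental, fragments)
    for name, fragments in predicted.items()
}
best = max(scores, key=scores.get)
print(scores, best)
```

Here candidate X matches all three predicted fragments and candidate Y only one, so X would be ranked first on this score alone.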

7) PAPER ID: 3084594 
PAPER TITLE: US-EPA comptox chemicals dashboard: an information hub for over five thousand per- & polyfluoroalkyl chemical substances (final paper number: ENVR 217)


DIVISION: Division of Environmental Chemistry
SESSION: Per- & Polyfluoroalkyl Substances in the Environment: From Legacy To Emerging Contaminants
SESSION TIME: 8:30 AM – 12:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Tuesday, April 2, 2019 from 10:10 AM – 10:30 AM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 10  – Orange County Convention Center

Title: The US-EPA CompTox Chemicals Dashboard – an information hub for over five thousand per- & polyfluoroalkyl chemical substances

Authors: Antony Williams, Chris Grulke, Grace Patlewicz and Ann Richard

The EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard) is a publicly accessible website providing access to data for ~770,000 chemical substances, the majority of these represented as chemical structures. The web application delivers a wide array of computed and measured physicochemical properties, in vitro high-throughput screening data and in vivo toxicity data, product use information extracted from safety data sheets, and integrated chemical linkages to a growing list of literature, toxicology, and analytical chemistry websites. The application provides access to segregated lists of chemicals that are of specific interest to relevant stakeholders, including Per- & Polyfluoroalkyl Substances (PFAS) containing thousands of chemicals. A procured testing library of hundreds of PFAS chemicals annotated into chemical categories has been integrated into the dashboard with a number of resulting benefits: a searchable database of chemical properties, with hazard and exposure predictions, and links to the open literature. Several specific search types have been developed to directly support the mass spectrometry non-targeted screening community, enabling cohesive workflows to support data generation for the detection and assessment of environmental exposures to chemicals contained within DSSTox. This presentation will provide an overview of the dashboard, the ongoing expansion of the PFAS chemical library, with associated categorization, and new physicochemical property and environmental fate and transport QSAR prediction models developed for these chemicals. The application of the dashboard to support mass spectrometry non-targeted analysis studies for the identification of PFAS chemicals will also be reviewed. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

8) PAPER ID: 3084611 
PAPER TITLE: CompTox chemicals dashboard: Data and tools to support chemical and environmental risk assessment and the ENTACT project (final paper number: ENVR 648)


DIVISION: Division of Environmental Chemistry
SESSION: True Positives in EPA’S Non-Targeted Analysis Collaborative Trial (ENTACT)
SESSION TIME: 1:30 PM – 5:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Wednesday, April 3, 2019 from 2:15 PM – 2:35 PM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 13  – Orange County Convention Center

Title: The CompTox Chemicals Dashboard: Data and Tools to Support Chemical and Environmental Risk Assessment and the ENTACT project

Authors and affiliations: Antony J. Williams1, Christopher M. Grulke1, Andrew D. McEachran2, Emma L. Schymanski3,4, Jon Sobus5, Elin Ulrich5, Ann M. Richard1, Jeremy Dunne1 and Jeff Edwards1

1 EPA, National Center for Computational Toxicology, RTP, NC, USA

2 ORISE Fellow, Oak Ridge Institute for Science and Education, Oak Ridge, TN, USA

3 Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, 6, avenue du Swing, L-4367 Belvaux, Luxembourg

4 EPA, National Exposure Research Laboratory, RTP, NC, USA

Information and data on chemicals are used by scientists to evaluate potential health and ecological risks due to environmental exposures. EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov) helps evaluate the safety of chemicals by providing public access to a variety of information on over 760,000 chemicals. Within the Dashboard, users can access chemical structures, chemistry information, toxicity data, hazard data, exposure information, and additional links to relevant websites and applications. These data are compiled from sources including EPA’s computational toxicology research databases, from public domain databases and from collaborators across the world. Chemical lists have been added that provide access to various classes of chemicals and project-based datasets are under constant development. Specific functionality has been delivered within the Dashboard to support mass spectrometry including “MS-ready forms” of chemical substances that would be detectable by mass spectrometry. Workflows have been developed to assist in candidate identification and have now been proven with multiple published studies. An integration path between the dashboard and MetFrag has also been established to provide users the significant benefits resulting from the marriage between the two applications. The datasets underpinning the dashboard are freely available (https://comptox.epa.gov/dashboard/downloads) for integration into third party databases. This presentation will provide an overview of the available data types and functionality of the dashboard prior to examining how it is developing to support mass spectrometry based analyses within the agency and for the community in general. This will include a review of our research efforts to enhance the dashboard using in silico MS/MS fragmentation prediction for spectral matching. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

 

PRESENTATION ACS SPRING 2018: Structure identification by Mass Spectrometry Non-Targeted Analysis using the US EPA’s CompTox Chemistry Dashboard

Structure identification by Mass Spectrometry Non-Targeted Analysis using the US EPA’s CompTox Chemistry Dashboard

Identification of unknowns in mass spectrometry based non-targeted analyses (NTA) requires the integration of complementary pieces of data to arrive at a confident, consensus structure. Researchers use chemical reference databases, spectral matching, fragment prediction tools, retention time prediction tools, and a variety of other data to arrive at tentative, probable, and confirmed, if possible, identifications. With the diverse, robust data contained within the US EPA’s CompTox Chemistry Dashboard (https://comptox.epa.gov), the goal of this research is to identify and implement a harmonized identification tool and workflow using previously generated chemistry data. Data has been compiled from product use, functional use prediction models, environmental media occurrence prediction models, and PubMed references, among other sources. We will report on our development of a visualization tool whereby users can visualize the relative contribution of identification-based metrics on a list of candidate structures and observe the greatest likelihood of occurrence. These data and visualization tools support NTA identification via the Dashboard and demonstrate an open, accessible tool for all users of HRMS data. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

https://doi.org/10.6084/m9.figshare.6030893.v1

 

Posted by on March 26, 2018 in ACS Meetings

 

PRESENTATION ACS SPRING 2018: US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for chemical sources of risk

US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for chemical sources of risk

Chemical risk assessment is both time-consuming and difficult because it requires the assembly of data for chemicals generally distributed across multiple sources. The US EPA CompTox Chemistry Dashboard is a publicly accessible web-based application providing access to various data streams on ~760,000 chemical substances. These data include experimental and predicted physicochemical property data, bioassay screening data associated with the ToxCast program, consumer product and functional use information and a myriad of related data of value to environmental scientists and toxicologists. At this stage of development, the public dashboard provides access to almost 20 predicted physicochemical and environmental fate and transport endpoints with full transparency in terms of model performance. Experimental and predicted human and ecological toxicity data are also available, as are in vitro to in vivo extrapolation dosimetry predictions and predicted exposure and functional use. In parallel to the CompTox Chemistry Dashboard we are developing RapidTox, a web-based application that enables a rapid, flexible and transparent prioritization process for sets of chemicals using several previously used workflows focused on scoring of traditional risk metrics and the inclusion of alternative hazard and exposure estimates. This presentation will give an overview of the CompTox Chemistry Dashboard, RapidTox, our approaches to building transparent and open prediction models, and our efforts to provide access to real time predictions. This abstract does not necessarily represent U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6027377.v1

 
 

PRESENTATION ACS SPRING 2018: Accessing information for chemicals in hydraulic fracturing fluids using the US EPA CompTox Chemistry Dashboard

Accessing information for chemicals in hydraulic fracturing fluids using the US EPA CompTox Chemistry Dashboard

EPA’s National Center for Computational Toxicology is developing automated workflows for curating large databases and providing accurate linkages of data to chemical structures, exposure and hazard information. The data are being made available via the EPA’s CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard), a publicly accessible website providing access to data for almost 760,000 chemical substances, the majority of these represented as chemical structures. The web application delivers a wide array of computed and measured physicochemical properties, in vitro high-throughput screening data and in vivo toxicity data as well as integrated chemical linkages to a growing list of literature, toxicology, and analytical chemistry websites. In addition, several specific search types are in development to directly support the mass spectrometry non-targeted screening community, who are generating important data for detecting and assessing environmental exposures to chemicals contained within DSSTox. The application provides access to segregated lists of chemicals that are of specific interest to relevant stakeholders including, for example, scientists interested in algal toxins and hydraulic fracturing chemicals. This presentation will provide an overview of the challenges associated with the curation of data from EPA’s December 2016 Hydraulic Fracturing Drinking Water Assessment Report that represented chemicals reported to be used in hydraulic fracturing fluids and those found in produced water. The data have been integrated into the dashboard with a number of resulting benefits: a searchable database of chemical properties, with hazard and exposure predictions, and links to the open literature. The application of the dashboard to support mass spectrometry non-targeted analysis studies will also be reviewed. This abstract does not reflect U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6027326.v1

 
 

PRESENTATION ACS SPRING 2018: Development of a Tool for Systematic Integration of Traditional and New Approach Methods for Prioritizing Chemical Lists

Development of a Tool for Systematic Integration of Traditional and New Approach Methods for Prioritizing Chemical Lists

Multiple regulatory bodies (EPA, ECHA, Health Canada) are currently tasked with prioritizing chemicals for data collection and risk assessments. These prioritization efforts are in response to regulatory mandates to identify chemicals for further assessment. We have developed a web-based application that enables a rapid, flexible and transparent prioritization process. The tool includes multiple data streams related to human and ecological hazard, exposure, and physicochemical properties (persistence and bioaccumulation). For human hazard, the data streams include quantitative points of departure (PODs) that are compiled from multiple sources such as EPA ToxRefDB, ECHA, COSMOS; estimated PODs from high-throughput in vitro screening assays and computational models; and qualitative measurements and predictions of specific endpoints (e.g., genotoxicity, endocrine activity). For ecological hazard, quantitative PODs are taken from the EPA ECOTOX database. Exposure information includes production volume, quantitative predictions using the EPA ExpoCast and SHEDS models, biomonitoring data, and qualitative information such as media occurrence, use profiles and likelihood of consumer and childhood exposures. The use of the tool is illustrated by prioritizing chemicals related to TSCA and the Safer Choice Ingredient List. The underpinning data streams for this application are already available in the EPA CompTox Chemistry Dashboard and have been repurposed to deliver this application. This is in keeping with our overarching software development methodology of providing multiple “building blocks” in the form of databases, web services and visualization components to deliver fit-for-purpose applications to the relevant audiences. This abstract does not necessarily represent U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6027068.v1

 
 

PRESENTATION ACS SPRING 2018: New developments in delivering public access to data from the National Center for Computational Toxicology at the EPA

New developments in delivering public access to data from the National Center for Computational Toxicology at the EPA

Researchers at EPA’s National Center for Computational Toxicology integrate advances in biology, chemistry, and computer science to examine the toxicity of chemicals and help prioritize chemicals for further research based on potential human health risks. The goal of this research program is to quickly evaluate thousands of chemicals, but at a much reduced cost and shorter time frame relative to traditional approaches. The data generated by the Center include characterization of thousands of chemicals across hundreds of high-throughput screening assays, consumer use and production information, pharmacokinetic properties, literature data, physical-chemical properties as well as the predictive computational modeling of toxicity and exposure. We have developed a number of databases and applications to deliver the data to the public, academic community, industry stakeholders, and regulators. This presentation will provide an overview of our work to develop an architecture that integrates diverse large-scale data from the chemical and biological domains, our approaches to disseminate these data, and the delivery of models supporting predictive computational toxicology. In particular, this presentation will review our new CompTox Chemistry Dashboard and the developing architecture to support real-time property and toxicity endpoint prediction. This abstract does not reflect U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6026957.v1

 
 

PRESENTATION ACS SPRING 2018: Overview of open resources to support automated structure verification and elucidation

Overview of open resources to support automated structure verification and elucidation

Cheminformatics methods form an essential basis for providing analytical scientists with access to data, algorithms and workflows. There are an increasing number of free online databases (compound databases, spectral libraries, data repositories) and a rich collection of software approaches that can be used to support automated structure verification and elucidation, specifically for Nuclear Magnetic Resonance (NMR) and Mass Spectrometry (MS). This presentation will provide an overview of freely available data, tools, databases and approaches available to support chemical structure verification and elucidation and highlight some of the known issues regarding data quality and suggest approaches for resolving some of the issues. The importance of structure and spectral standards for data exchange will be discussed, especially with regard to how spectral data can be made openly available to the community via online tools and through scientific publishing. This work does not necessarily reflect U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6026930.v1

 
 

PRESENTATION ACS SPRING 2018: Sharing chemical structures with peer-reviewed publications. Are we there yet?

Sharing chemical structures with peer-reviewed publications. Are we there yet?

In the domain of chemistry one of the greatest benefits to publishing research is that data can be shared. Unfortunately, the vast majority of chemical structure data associated with scientific publications remain locked up in document form, primarily in PDF files or trapped on webpages. Despite the explosive growth of online chemical databases and the overall maturity of cheminformatics platforms, many barriers stifle the exchange of chemical structures via publications. These challenges include incomplete support by accepted standards (especially InChI) for advanced stereochemistry, organometallic compounds and generic “Markush” representations, the difference between human-readable and computer-readable forms of data, and challenges with the computer representation of chemical structures. To address these obstacles to chemical structure sharing, US EPA National Center for Computational Toxicology scientists are using a combination of cheminformatics applications and online repositories to distribute chemical structure data associated with their publications. This presentation will describe how EPA-NCCT chemical structure data that are amenable to indexing and distribution are shared and highlight the benefit of open data sharing for modeling, data integration, and increasing research impact. This abstract does not reflect U.S. EPA policy.

https://doi.org/10.6084/m9.figshare.6026906.v1

 
 

PRESENTATION ACS SPRING 2018: Using the US EPA’s CompTox Chemistry Dashboard for structure identification and non-targeted analyses

Using the US EPA’s CompTox Chemistry Dashboard for structure identification and non-targeted analyses

Antony J. Williams, Andrew D. McEachran, Seth Newton, Kristin Isaacs, Katherine Phillips, Nancy Baker, Christopher Grulke and Jon R. Sobus

High resolution mass spectrometry (HRMS) and non-targeted analysis (NTA) are advancing the identification of emerging contaminants in environmental matrices, improving the means by which exposure analyses can be conducted. However, confidence in structure identification of unknowns in NTA presents challenges to analytical chemists. Structure identification requires integration of complementary data types such as reference databases, fragmentation prediction tools, and retention time prediction models. The goal of this research is to optimize and implement structure identification functionality within the US EPA’s CompTox Chemistry Dashboard, an open chemistry resource and web application containing data for ~760,000 substances. Rank-ordering the number of sources associated with chemical records within the Dashboard (Data Source Ranking) improves the identification of unknowns by bringing the most likely candidate structures to the top of a search results list. Database searching has been further optimized with the generation of MS-Ready Structures. MS-Ready structures are de-salted, stripped of stereochemistry, and mixture separated to replicate the form of a chemical observed via HRMS. Functionality to conduct batch searching of molecular formulae and monoisotopic masses was designed and released to improve searching efforts. Finally, a scoring-based identification scheme was developed, optimized, and surfaced via the Dashboard using multiple data streams contained within the database underlying the Dashboard. The scoring-based identification scheme improved the identification of unknowns over previous efforts using data source ranking alone. Combining these steps within an open chemistry resource provides a freely available software tool for structure identification and NTA. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
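The mass-based candidate search and Data Source Ranking described above can be illustrated with a minimal sketch. It is not the Dashboard's implementation: the candidate records, their source counts, the 5 ppm tolerance and the function names are all hypothetical, and only the monoisotopic element masses are standard values.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
}


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C8H10N4O2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def rank_candidates(query_mass, candidates, ppm=5.0):
    """Keep candidates within a ppm window, then sort by source count.

    Sorting by the number of associated data sources (most first) is the
    idea behind data source ranking; the counts here are made up.
    """
    tolerance = query_mass * ppm / 1e6
    hits = [
        c for c in candidates
        if abs(monoisotopic_mass(c["formula"]) - query_mass) <= tolerance
    ]
    return sorted(hits, key=lambda c: c["source_count"], reverse=True)


# Hypothetical candidate records (names and source counts are invented).
CANDIDATES = [
    {"name": "candidate A", "formula": "C8H10N4O2", "source_count": 12},
    {"name": "candidate B", "formula": "C8H10N4O2", "source_count": 87},
    {"name": "candidate C", "formula": "C10H14N2O2", "source_count": 140},
]

if __name__ == "__main__":
    for hit in rank_candidates(194.0804, CANDIDATES):
        print(hit["name"], hit["source_count"])
```

In this toy example candidate C is excluded by the mass window despite having the most sources, so ranking only reorders candidates that already match the measured mass.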

https://doi.org/10.6084/m9.figshare.6026081.v1

 

Posted on March 25, 2018 in ACS Meetings

 

PRESENTATION ACS SPRING 2018: Adding Complex Expert Knowledge into Chemical Databases: Transforming Surfactants in Wastewater

Adding Complex Expert Knowledge into Chemical Databases: Transforming Surfactants in Wastewater

PRESENTED by Emma Schymanski

The increasing popularity of high mass accuracy non-target mass spectrometry methods has yielded extensive identification efforts based on chemical compound databases. Candidate structures are often retrieved with either exact mass or molecular formula from large resources such as PubChem, ChemSpider or the EPA CompTox Chemistry Dashboard. Additional data (e.g. fragmentation, physicochemical properties, reference and data source information) is then used to select potential candidates, depending on the experimental context. However, these strategies require the presence of substances of interest in these compound databases, which is often not the case as no database can be fully inclusive. Surfactants are a prominent example with clear data gaps: they are used in many products in our daily lives, yet are often absent as discrete structures in compound databases. Linear alkylbenzene sulfonates (LAS) are a common, high-use and high-priority surfactant class that has highly complex transformation behaviour in wastewater. Despite extensive reports in the environmental literature, few of the LAS and none of the related transformation products were reported in any compound databases during an investigation into Swiss wastewater effluents, even though these formed the most intense signals. The LAS surfactant class will be used to demonstrate how the coupling of environmental observations with high resolution mass spectrometry and detailed literature data (expert knowledge) on the transformation of these species can be used to progressively “fill the gaps” in compound databases. The LAS and their transformation products have been added to the CompTox Chemistry Dashboard (https://comptox.epa.gov/) using a combination of “representative structures” and “related structures” starting from the structural information contained in the literature. By adding this information into a centralized open resource, future environmental investigations can now profit from the expert knowledge previously scattered throughout the literature. Note: This abstract does not reflect US EPA policy.
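To make the idea of a surfactant homologue series concrete, the sketch below enumerates molecular formulae and neutral monoisotopic masses for LAS in their acid form. It is an illustration only and not the procedure used to register these substances in the Dashboard: the C10 to C13 chain-length range, the acid-form general formula C(n+6)H(2n+6)O3S and the function names are assumptions made for this example.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "O": 15.99491461956,
    "S": 31.97207100,
}


def las_composition(chain_length):
    """Element counts for an alkylbenzene sulfonic acid.

    Assumed general formula: CnH(2n+1)-C6H4-SO3H, i.e. C(n+6)H(2n+6)O3S.
    """
    return {"C": chain_length + 6, "H": 2 * chain_length + 6, "O": 3, "S": 1}


def formula_string(composition):
    """Render element counts as a formula, omitting counts of one."""
    return "".join(
        f"{element}{count if count > 1 else ''}"
        for element, count in composition.items()
    )


def monoisotopic_mass(composition):
    """Neutral monoisotopic mass from element counts."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


def las_series(shortest=10, longest=13):
    """Return (chain length, formula, mass) for each homologue."""
    series = []
    for n in range(shortest, longest + 1):
        comp = las_composition(n)
        series.append((n, formula_string(comp), monoisotopic_mass(comp)))
    return series


if __name__ == "__main__":
    for n, formula, mass in las_series():
        print(f"C{n}-LAS  {formula}  {mass:.4f}")
```

Successive homologues differ by one CH2 unit, so a list like this gives a set of evenly spaced masses to look for in a non-target data set when the discrete structures are missing from a database.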

https://doi.org/10.6084/m9.figshare.6025826.v1

 

 
 
 