Category Archives: MS Structure Identification

GUEST POST by Emma Schymanski: Suspect Screening with MetFrag and the CompTox Chemistry Dashboard

Identifying “known unknowns” via suspect and non-target screening of environmental samples with the in silico fragmenter MetFrag typically relies on the large compound databases ChemSpider and PubChem (see e.g. Ruttkies et al 2016). The size of these databases (over 50 and 90 million structures, respectively) yields many false positive hits: structures that were never produced in sufficient amounts to be realistically found in the environment (e.g. McEachran et al 2016). One motivation behind the US EPA’s CompTox Chemistry Dashboard is to provide access to compounds of environmental relevance – currently approx. 760,000 chemicals. While the web services needed to incorporate the Dashboard into MetFrag as a database like ChemSpider and PubChem are not yet available, a number of features in MetFragBeta enable users to use the CompTox Chemistry Dashboard to perform “known unknown” identification with MetFrag. This post highlights the Suspect Screening functionality.

First we have our (charged) mass. Take m/z = 256.0153. This was measured in positive mode and we assume (correctly) that it’s [M+H]+. Make sure you set this correctly in MetFrag.
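If you want to check what neutral mass this corresponds to, the arithmetic is short. The sketch below is my own illustration, not MetFrag code: the function name is made up, and it assumes a singly charged [M+H]+ ion, so the neutral monoisotopic mass is the measured m/z minus the mass of a proton.

```python
# Mass of a proton in Da (hydrogen atom mass minus electron mass).
PROTON_MASS = 1.007276467


def neutral_mass_from_mh(mz):
    """Return the neutral monoisotopic mass for a singly charged [M+H]+ ion.

    Hypothetical helper for illustration only; it handles no other adducts
    or charge states.
    """
    return mz - PROTON_MASS


neutral_mass = neutral_mass_from_mh(256.0153)
print(round(neutral_mass, 4))
```

Other adducts (e.g. [M+Na]+ or [M-H]-) need a different mass shift, which is why setting the adduct correctly in MetFrag matters.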


Then retrieve your candidates, e.g. using ChemSpider or PubChem and a 5 ppm error margin:
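To get a feel for how narrow a 5 ppm margin is, here is a minimal sketch of the mass window it implies. The function name is hypothetical, and I am assuming the tolerance is applied symmetrically around the neutral mass; MetFrag's internal handling may differ in detail.

```python
def ppm_window(mass, ppm):
    """Return the (lower, upper) mass bounds for a relative tolerance in ppm.

    Hypothetical helper for illustration; assumes a symmetric window.
    """
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


# Neutral mass assumed for m/z 256.0153 measured as [M+H]+.
lower, upper = ppm_window(255.008024, 5.0)
print(f"{lower:.4f} to {upper:.4f}")
```

Every database entry with a monoisotopic mass inside this window becomes a candidate, which is why large databases return so many hits for one mass.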

Take the peak list from MassBank and copy it into the Fragmentation settings:

You could now process the candidates … but we have not done anything with the Dashboard! This is hidden in the middle in the “Candidate Filter & Score Settings” tab:

You can use the Candidate Filter to process ONLY candidates that are in the CompTox Chemistry Dashboard, excluding all other candidates, by clicking on “Suspect Inclusion Lists” and selecting the “DSSTox” box (see screenshot), which retains (currently) 11 of the 156 ChemSpider candidates:
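Conceptually, an inclusion list is just a membership test on identifiers. The sketch below is a simplified illustration and not MetFrag's implementation: the candidate records, the function name and the InChIKeys are all made-up placeholders, and I am assuming candidates are matched on their full InChIKey.

```python
def filter_by_suspect_list(candidates, suspect_inchikeys):
    """Keep only candidates whose InChIKey appears in the suspect list.

    Hypothetical helper; matching on the full InChIKey is an assumption.
    """
    suspects = {key.strip().upper() for key in suspect_inchikeys if key.strip()}
    return [c for c in candidates if c["inchikey"].upper() in suspects]


# Placeholder candidates and keys, not real compounds.
candidates = [
    {"name": "candidate A", "inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N"},
    {"name": "candidate B", "inchikey": "BBBBBBBBBBBBBB-UHFFFAOYSA-N"},
    {"name": "candidate C", "inchikey": "CCCCCCCCCCCCCC-UHFFFAOYSA-N"},
]
dsstox_keys = ["AAAAAAAAAAAAAA-UHFFFAOYSA-N", "CCCCCCCCCCCCCC-UHFFFAOYSA-N"]

retained = filter_by_suspect_list(candidates, dsstox_keys)
print([c["name"] for c in retained])
```

Anything not on the list is dropped before fragmentation, so the filter trades completeness for a much shorter and more environmentally relevant candidate set.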

Once the processing has finished, the plot in the “Statistics” tab should look something like this – depending on what additional scores you selected:

It is also possible to use one (or more!) suspect lists to SCORE the different candidates without excluding any matches from ChemSpider or PubChem, by selecting the same box under the “MetFrag Scoring Terms” part instead (see screenshot). Additional lists like the Swiss Pharma list shown below can be downloaded from the NORMAN Suspect Exchange and also viewed under the lists tab in the CompTox Chemistry Dashboard. MetFrag only needs a text file containing the InChIKeys of the substances for the upload, which can be obtained from the Dashboard or Suspect Exchange downloads.
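To illustrate the difference between filtering and scoring, here is a rough sketch that reads InChIKey text files and counts how many suspect lists each candidate appears in, without discarding anyone. The function names, list names and InChIKeys are placeholders of my own, and the one-InChIKey-per-line layout is an assumption; MetFrag's actual scoring and weighting are not reproduced here.

```python
import io


def read_inchikeys(handle):
    """Read InChIKeys from a text handle, assuming one key per line."""
    return {line.strip().upper() for line in handle if line.strip()}


def suspect_list_hits(candidate_keys, suspect_lists):
    """Count, for each candidate, how many suspect lists contain it."""
    return {
        key: sum(key.upper() in keys for keys in suspect_lists.values())
        for key in candidate_keys
    }


# In-memory stand-ins for downloaded list files (placeholder keys).
dsstox_file = io.StringIO("AAAAAAAAAAAAAA-UHFFFAOYSA-N\nCCCCCCCCCCCCCC-UHFFFAOYSA-N\n")
pharma_file = io.StringIO("AAAAAAAAAAAAAA-UHFFFAOYSA-N\n")

suspect_lists = {
    "dsstox": read_inchikeys(dsstox_file),
    "swiss_pharma": read_inchikeys(pharma_file),
}
candidate_keys = [
    "AAAAAAAAAAAAAA-UHFFFAOYSA-N",
    "BBBBBBBBBBBBBB-UHFFFAOYSA-N",
    "CCCCCCCCCCCCCC-UHFFFAOYSA-N",
]

hits = suspect_list_hits(candidate_keys, suspect_lists)
print(hits)
```

A candidate with zero hits stays in the results; it simply gets no boost from this term, which is the point of scoring rather than filtering.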

Using the Suspect Lists as a “Scoring term”, along with some other criteria and restrictions, will give you a results plot looking more like this:

Curious to find out more? MetFrag comes with a built-in example, and you can try this exact example yourself by visiting the MetFrag web interface and using the peak list copied from the bottom of the MassBank spectrum.

There are many more features to discover: try the website, read the paper (Ruttkies et al 2016) and if you have any questions, please comment below!

Author: Emma Schymanski, 21/11/2017


Posted on December 8, 2017 in MS Structure Identification



Open Science for Identifying “Known Unknown” Chemicals

I am happy to announce the publication of the article “Open Science for Identifying “Known Unknown” Chemicals”. I have also been involved with two other articles about the identification of “known unknowns”.

The first one was a ChemSpider article: “Identification of “known unknowns” utilizing accurate mass data and ChemSpider”. Journal of The American Society for Mass Spectrometry. 23: 179–185. doi:10.1007/s13361-011-0265-y.

The second one was a recent article from the EPA: “Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard”. Analytical and Bioanalytical Chemistry. 409: 1729–1735. doi:10.1007/s00216-016-0139-z.

The most recent publication was a collaboration with Emma Schymanski from Eawag and it was a real pleasure to write this together. If you are interested in how Open Science can contribute to the challenges associated with the identification of known unknowns check out our latest publication!


Comparing the EPA CompTox Dashboard with ChemSpider for MS-based Structure Identification

It’s almost ten years, this April, since ChemSpider was released to the public at the 233rd ACS meeting in Chicago. For two years, before being acquired by RSC in May 2009, we worked very closely with a number of mass spectrometry vendors including Waters (Micromass), Thermo and Agilent. I always believed that the work we did with ChemSpider could be highly valued by the mass spectrometry community. This was especially true after we published the work on the identification of known unknowns with James Little. ChemSpider has certainly become highly recognized, and used, by an increasing number of mass spectrometry vendors (through the ChemSpider Web Services).

A few months ago Andrew McEachran joined our team as a postdoc. By combining my experience in applying ChemSpider to structure identification, his mass spectrometry skills and experience, and the work of our tremendous development team on the CompTox Chemistry Dashboard, we were able to make some further advances in the identification of “known unknowns”. Our efforts were recently reported in the publication “Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard”. Readers are pointed to the summary tables in the results section of the article, which demonstrate the improved performance of the CompTox Chemistry Dashboard based on high quality data sources and new approaches to rank ordering results based on formula and mass searching.

We recently rolled out new functionality and “MS-Ready structure batch-based searching” to offer even greater support for MS-based structure identification. We will report on further extensions to this work at the Spring ACS Meeting.

The AltMetrics for the Article are shown below
