Identifying “known unknowns” via suspect and non-target screening of environmental samples with the in silico fragmenter MetFrag (http://msbi.ipb-halle.de/MetFragBeta/) typically relies on the large compound databases ChemSpider and PubChem (see e.g. Ruttkies et al 2016). The size of these databases (over 50 and 90 million structures, respectively) yields many false positive hits: structures that were never produced in sufficient amounts to be realistically found in the environment (e.g. McEachran et al 2016). One motivation behind the US EPA’s CompTox Chemistry Dashboard is to provide access to compounds of environmental relevance (currently approx. 760,000 chemicals). While the web services needed to incorporate the Dashboard in MetFrag as a database like ChemSpider and PubChem are not yet available, a number of features in MetFragBeta enable users to use the CompTox Chemistry Dashboard to perform “known unknown” identification with MetFrag. This post highlights the suspect screening functionality.
First we have our (charged) mass. Take m/z = 256.0153. This was measured in positive mode and we assume (correctly) that it’s [M+H]+. Make sure you set this correctly in MetFrag.
Then retrieve your candidates, e.g. using ChemSpider or PubChem and a 5 ppm error margin:
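If you want to sanity-check the numbers behind these two steps, here is a minimal Python sketch of the arithmetic. It assumes the [M+H]+ adduct from above, approximates the neutral monoisotopic mass by subtracting the proton mass, and applies the 5 ppm window to that neutral mass. The function names are made up for illustration and this is not MetFrag's own code; the web interface handles all of this for you.

```python
# Mass of a proton in Da (a hydrogen atom minus one electron).
PROTON_MASS = 1.007276


def neutral_mass_from_mh(mz):
    """Neutral monoisotopic mass for an [M+H]+ ion (assumed adduct)."""
    return mz - PROTON_MASS


def ppm_window(mass, ppm):
    """Return (low, high) mass bounds for a relative error given in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


neutral = neutral_mass_from_mh(256.0153)
low, high = ppm_window(neutral, 5.0)
print(f"neutral mass: {neutral:.4f}")
print(f"5 ppm search window: {low:.4f} to {high:.4f}")
```

At this mass, 5 ppm corresponds to roughly ±0.0013 Da, which is why even a tight window still returns well over a hundred candidates from the large databases.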
Take the peak list from MassBank here: https://massbank.eu/MassBank/jsp/RecordDisplay.jsp?id=EA267612&dsn=Eawag and copy into the Fragmentation settings:
You could now process the candidates … but we have not done anything with the Dashboard! This is hidden in the middle in the “Candidate Filter & Score Settings” tab:
You can use the Candidate Filter to process ONLY candidates that are in the CompTox Chemistry Dashboard, excluding all other candidates, by clicking on “Suspect Inclusion Lists” and selecting the “DSSTox” box (see screenshot), which retains (currently) 11 of the 156 ChemSpider candidates:
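Conceptually, an inclusion list is just a membership test on each candidate. The Python sketch below illustrates the idea under a few loudly stated assumptions: the suspect list is a set of InChIKeys (one per line), candidates are plain dictionaries, and matching is done on the first 14-character block of the InChIKey so that stereoisomers of the same skeleton still match. The InChIKeys and candidate IDs are placeholders rather than real compounds, and the function names are hypothetical, not taken from MetFrag.

```python
def inchikey_first_block(inchikey):
    """Return the first (connectivity) block of an InChIKey."""
    return inchikey.strip().upper().split("-")[0]


def load_suspect_blocks(lines):
    """Collect first blocks from an iterable of InChIKey lines, skipping blanks."""
    blocks = set()
    for line in lines:
        line = line.strip()
        if line:
            blocks.add(inchikey_first_block(line))
    return blocks


def filter_candidates(candidates, suspect_blocks):
    """Keep only the candidates whose first InChIKey block is on the suspect list."""
    return [
        c for c in candidates
        if inchikey_first_block(c["inchikey"]) in suspect_blocks
    ]


# Placeholder data for illustration only (not real compounds).
suspect_lines = [
    "AAAAAAAAAAAAAA-UHFFFAOYSA-N",
    "",
    "CCCCCCCCCCCCCC-UHFFFAOYSA-N",
]
candidates = [
    {"id": "cand1", "inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N"},
    {"id": "cand2", "inchikey": "BBBBBBBBBBBBBB-UHFFFAOYSA-N"},
    {"id": "cand3", "inchikey": "CCCCCCCCCCCCCC-ZZZZZZZZSA-N"},
]

kept = filter_candidates(candidates, load_suspect_blocks(suspect_lines))
print([c["id"] for c in kept])
```

Here cand3 is retained even though its second block differs from the suspect entry, which is the point of matching on the first block only. Whether that is the behaviour you want depends on how much stereochemistry matters for your question.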
Once processing has finished, the plot in the “Statistics” tab should look something like this, depending on what additional scores you selected:
It is also possible to use one (or more!) suspect lists to SCORE the different candidates without excluding any matches from ChemSpider or PubChem, by selecting the same box under the “MetFrag Scoring Terms” part instead (see screenshot). Additional lists like the Swiss Pharma list shown below can be downloaded from the NORMAN Suspect Exchange (http://www.norman-network.com/?q=node/236) and also viewed under the lists tab in the CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard/chemical_lists). MetFrag only needs a text file containing InChIKeys of the substances for the upload – which can be obtained from the Dashboard or Suspect Exchange downloads.
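The difference between filtering and scoring is easiest to see in a toy example. The sketch below is a simplification under stated assumptions, not MetFrag's actual scoring code: each candidate has a hypothetical fragmenter score, the suspect term is 1 if the first InChIKey block is on the list and 0 otherwise, and the final score is a weighted sum of the normalised fragmenter score and the suspect term. All names, weights, InChIKeys and values are placeholders.

```python
def first_block(inchikey):
    """Return the first (connectivity) block of an InChIKey."""
    return inchikey.strip().upper().split("-")[0]


def rank_candidates(candidates, suspect_blocks, w_frag=1.0, w_suspect=1.0):
    """Rank candidates by a weighted sum of two terms; nothing is excluded."""
    # Normalise the fragmenter score to the best candidate (guard against 0).
    max_frag = max(c["frag"] for c in candidates) or 1.0
    scored = []
    for c in candidates:
        in_list = first_block(c["inchikey"]) in suspect_blocks
        suspect_term = 1.0 if in_list else 0.0
        total = w_frag * c["frag"] / max_frag + w_suspect * suspect_term
        scored.append((total, c["id"]))
    # Highest total score first.
    return sorted(scored, reverse=True)


# Placeholder data for illustration only (not real compounds or scores).
suspects = {"BBBBBBBBBBBBBB"}
candidates = [
    {"id": "A", "inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N", "frag": 0.9},
    {"id": "B", "inchikey": "BBBBBBBBBBBBBB-UHFFFAOYSA-N", "frag": 0.6},
    {"id": "C", "inchikey": "CCCCCCCCCCCCCC-UHFFFAOYSA-N", "frag": 0.3},
]

ranking = rank_candidates(candidates, suspects)
for total, cand_id in ranking:
    print(f"{cand_id}: {total:.3f}")
```

Candidate B moves to the top because it is on the suspect list, but A and C remain in the results with their fragmenter scores intact. Changing the weights shifts how much a suspect list hit counts relative to the spectral evidence.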
Using the Suspect Lists as a “Scoring term”, along with some other criteria and restrictions, will give you a results plot looking more like this:
Curious to find out more? MetFrag comes with a built-in example and you can try this exact example yourself by visiting http://msbi.ipb-halle.de/MetFragBeta/ and using the peak list copied from the bottom of the spectrum available at https://massbank.eu/MassBank/jsp/RecordDisplay.jsp?id=EA267612&dsn=Eawag
There are many more features to discover: try the website, read the paper (Ruttkies et al 2016) and if you have any questions, please comment below!
Author: Emma Schymanski, 21/11/2017