Online networking, data sharing and research activity distribution tools for scientists

This is just a short post, and I need to write more when I have time, about the result of a writing collaboration with Lou Peck and Sean Ekins on an article entitled “The new alchemy: Online networking, data sharing and research activity distribution tools for scientists” (http://dx.doi.org/10.12688/f1000research.12185.1). This took a LONG time to get published, and morphed from the original concept, but there appears to be a lot of interest judging by the views and downloads stats in the first few days (775 views and 20% of this number as downloads). That’s a good conversion rate. It’s open for PUBLIC COMMENTS and we welcome your feedback.


Predicting organ toxicity using in vitro bioactivity data and chemical structure

I get to work with some great scientists in my job. I am getting to work on projects that a couple of years ago were way out of my depth. Let’s be honest, I have no formal training as a toxicologist and my training is formally as an analytical scientist, then cheminformatician, then into publishing and informatics and now in the National Center for Computational Toxicology. I didn’t realize that the trial by fire would be so stimulating and fun but working at EPA is great. So many people make flippant comments about working for the government, leaving early, etc. We work HARD and are productive and, for me at least, I feel we are doing important work and making real contributions. The latest paper I am involved with is “Predicting organ toxicity using in vitro bioactivity data and chemical structure” (http://dx.doi.org/10.1021/acs.chemrestox.7b00084). The abstract is listed below…

“Animal testing alone cannot practically evaluate the health hazard posed by tens of thousands of environmental chemicals. Computational approaches making use of high-throughput experimental data may provide more efficient means to predict chemical toxicity. Here, we use a supervised machine learning strategy to systematically investigate the relative importance of study type, machine learning algorithm, and type of descriptor on predicting in vivo repeat-dose toxicity at the organ-level. A total of 985 compounds were represented using chemical structural descriptors, ToxPrint chemotype descriptors, and bioactivity descriptors from ToxCast in vitro high-throughput screening assays. Using ToxRefDB, a total of 35 target organ outcomes were identified that contained at least 100 chemicals (50 positive and 50 negative). Supervised machine learning was performed using Naïve Bayes, k-nearest neighbor, random forest, classification and regression trees, and support vector classification approaches. Model performance was assessed based on F1 scores using five-fold cross-validation with balanced bootstrap replicates. Fixed effects modeling showed the variance in F1 scores was explained mostly by target organ outcome, followed by descriptor type, machine learning algorithm, and interactions between these three factors. A combination of bioactivity and chemical structure or chemotype descriptors were the most predictive. Model performance improved with more chemicals (up to a maximum of 24%) and these gains were correlated (ρ= 0.92) with the number of chemicals. Overall, the results demonstrate that a combination of bioactivity and chemical descriptors can accurately predict a range of target organ toxicity outcomes in repeat-dose studies, but specific experimental and methodologic improvements may increase predictivity.”
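For readers less familiar with the evaluation described in the abstract, here is a minimal sketch of how an F1 score is computed for a single binary outcome, and how a data set can be split into five folds. This is not the code used in the paper: the function names are hypothetical, the split is a simple unshuffled one, and the balanced bootstrap replicates mentioned in the abstract are not reproduced.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_items), unshuffled."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits
```

In a real evaluation a model would be trained on each `train` split, scored with `f1_score` on the matching `test` split, and the five scores averaged.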


Open Science for Identifying “Known Unknown” Chemicals http://dx.doi.org/10.1021/acs.est.7b01908

I am happy to announce the publishing of an article regarding “Open Science for Identifying “Known Unknown” Chemicals” at http://dx.doi.org/10.1021/acs.est.7b01908. I have been involved with two other articles about the identification of “Known Unknowns”.

The first one was a ChemSpider article: “Identification of “known unknowns” utilizing accurate mass data and ChemSpider”. Journal of The American Society for Mass Spectrometry. 23: 179–185. doi:10.1007/s13361-011-0265-y.

The second one was a recent article from the EPA: “Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard”. Analytical and Bioanalytical Chemistry. 409: 1729–1735. doi:10.1007/s00216-016-0139-z.

The most recent publication was a collaboration with Emma Schymanski from Eawag and it was a real pleasure to write this together. If you are interested in how Open Science can contribute to the challenges associated with the identification of known unknowns check out our latest publication!


In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning

Recently we published on the curation of physicochemical data sets that were then made available as Open Data. The work was reported in:

“An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modeling”, K. Mansouri, C. Grulke, R. Judson and A.J. Williams, SAR and QSAR in Environmental Research, Volume 27, 2016, Issue 11, Pages 911-937, http://dx.doi.org/10.1080/1062936X.2016.1253611

The data have since been modeled using an alternative approach to the one we used, and that work is now reported in http://dx.doi.org/10.1021/acs.jcim.6b00625.

“In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning”, Q. Zang, K. Mansouri, A.J. Williams, R.S. Judson, D.G. Allen, W.M. Casey, and N.C. Kleinstreuer, J. Chem. Inf. Model., 2017, 57 (1), pp 36–49

The abstract for the article is below.

ABSTRACT

There are little available toxicity data on the vast majority of chemicals in commerce. High-throughput screening (HTS) studies, such as those being carried out by the U.S. Environmental Protection Agency (EPA) ToxCast program in partnership with the federal Tox21 research program, can generate biological data to inform models for predicting potential toxicity. However, physicochemical properties are also needed to model environmental fate and transport, as well as exposure potential. The purpose of the present study was to generate an open-source quantitative structure–property relationship (QSPR) workflow to predict a variety of physicochemical properties that would have cross-platform compatibility to integrate into existing cheminformatics workflows. In this effort, decades-old experimental property data sets available within the EPA EPI Suite were reanalyzed using modern cheminformatics workflows to develop updated QSPR models capable of supplying computationally efficient, open, and transparent HTS property predictions in support of environmental modeling efforts. Models were built using updated EPI Suite data sets for the prediction of six physicochemical properties: octanol–water partition coefficient (logP), water solubility (logS), boiling point (BP), melting point (MP), vapor pressure (logVP), and bioconcentration factor (logBCF). The coefficient of determination (R2) between the estimated values and experimental data for the six predicted properties ranged from 0.826 (MP) to 0.965 (BP), with model performance for five of the six properties exceeding those from the original EPI Suite models. The newly derived models can be employed for rapid estimation of physicochemical properties within an open-source HTS workflow to inform fate and toxicity prediction models of environmental chemicals.
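As a reminder of what the reported R2 values measure, here is a minimal sketch of a coefficient of determination calculation. It assumes the common definition, one minus the ratio of the residual sum of squares to the total sum of squares; the abstract does not say which definition was used (the squared correlation coefficient is another possibility), and the function name is hypothetical.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far the predictions are from the measurements.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the measurements around their own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant, so R2 is undefined")
    return 1 - ss_res / ss_tot
```

Under this definition a value close to 1, such as the 0.965 reported for boiling point, means the predictions account for nearly all of the variance in the experimental values.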


How Poor Altmetrics are for my old articles…

In preparation for a talk later this week I have been investigating adding Altmetric and Plum Analytics scores, as well as Kudos Resources, to my online CV. I would expect Altmetric scores to be VERY low for old articles, as they were published well before social networking tools existed. However, the Plum Widget should be useful for showing citations, views, downloads and so on. The Kudos resources will be meaningful since I have been working SLOWLY through my articles, latest first.

I think the Altmetric scores shown below bear out my opinion, since MOST don’t have any score whatsoever. However, this blog post should lift a number of them over the next few days.


ARTICLES

1989
1. F.L. Lee, K.F. Preston, A.J. Williams, L.H. Sutcliffe, A.J. Banister, S.T. Wait, A single-crystal electron paramagnetic resonance study of the 4-phenyl-1,2,3,5-dithiadiazolyl radical   Magn. Reson. Chem. 27, 1161-1165 (1989). Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

1990
2. D.G. Gillies, S.J. Matthews, L.H. Sutcliffe and A.J. Williams, The Evaluation of Two Correlation Times for Methyl Groups from Carbon-13 Spin-lattice Relaxation Times and nOe Data  J. Magn. Reson., 86, 371 (1990) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

3. P.J. Bratt, D.G. Gillies, L.H. Sutcliffe and A.J. Williams, NMR Relaxation Studies of Internal Motions – A Comparison between Micelles and Related Systems, J. Phys. Chem., 94(7), 2727 (1990) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

4. R.C. Hynes, J.R. Morton, J.A. Hriljac, Y. LePage, K.F. Preston, A.J. Williams, F. Evans, M.C. Grossel and L.H. Sutcliffe,  Isolated Free Radical Pairs in Rb+TCNQ- 18-crown-6 Single Crystals, J.Chem. Soc.,Chem. Commun., 5, 439 (1990) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

5. P.J. Krusic, J.R. Morton, K.F. Preston, A.J. Williams and F. Lee, EPR Spectrum of the Fe2(CO)8- Radical Trapped in Single Crystals of PPN+HFe2(CO)8- , Organometallics 9, 697 (1990). Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

6. R. Hynes, K.F. Preston, J.J. Springs, and A.J. Williams, Single-crystal EPR Study of Radical Pairs in [Fe(mesitylene)22+] {C3[C(CN)2]3-}2, J. Chem. Phys. 93(4), 2222, 1990 Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

7. R. Hynes, K.F. Preston, J.J. Springs, and A.J. Williams, EPR Studies of Radical Pairs [M(CO)5]2 (M = Cr, Mo, W) Trapped in Single Crystals of PPN+ HM(CO)5-, Organometallics, 9, 2298 (1990) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

8. R. Hynes, K.F. Preston, J.J. Springs, and A.J. Williams, Electron paramagnetic resonance study of the tetracarbonyl(trimethylphosphite)tungstate(1-) radical anion trapped in a single crystal of [N(PPh3)2][W(CO)4H{P(OMe)3}], Journal of the Chemical Society, Dalton Transactions:  Inorganic Chemistry (1972-1999)  12, 3655-61(1990) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

1991
9. R. Hynes, K.F. Preston, J.J. Springs, J. Tse and A.J. Williams, EPR Studies of M(CO)5-  Radicals (M = Cr, Mo, W) Trapped in Single Crystals of PPh4+ HM(CO)5- , J. Chem. Soc. Faraday Trans., 87(19), 3121 (1991) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

10. R.C. Hynes, J.R. Morton, K.F. Preston, A.J. Williams, F. Evans, M.C. Grossel, L.H. Sutcliffe, and S.C. Weston, An EPR Study of Isolated Free Radical Pairs in M+ 18-Crown-6 TCNQ-  salts (TCNQ:7,7,8,8-tetracyanoquinodimethane; M=K, Rb), J. Chem. Soc. Faraday Trans., 87(14), 2229 (1991) Link
AltMetrics Analytics

PLUMX Analytics

Kudos Resources

__________________________________________

To show what it looked like when I posted this blog entry the attached image shows a small number of the articles with zero scores.

altmetric scores


Add Altmetric and PlumX scores and Kudos Resources to your online CV

Over the weekend I spent a little time integrating Altmetric and PlumX scores into my online CV here on my blog. I also integrated the Kudos resources associated with each article directly into the CV. It’s a breeze and requires only that you have DOIs for your articles. See below for how ONE article in my CV is represented.

154. Programmatic Conversion of Crystal Structures into 3D Printable Files, V.F. Scalfani, <strong>A.J. Williams</strong>, V. Tkachenko, K. Karapetyan, A. Pshenichnov, R.M. Hanson, J.M. Liddie and J.E. Bara, Journal of Cheminformatics, 2016, 8:66 Article Type: Methodology <a href="http://jcheminf.springeropen.com/articles/10.1186/s13321-016-0181-z"><strong>Link</strong> </a>
<strong>AltMetrics Analytics</strong>
<div class="altmetric-embed" data-badge-type="medium-donut" data-badge-details="right" data-doi="10.1186/s13321-016-0181-z"></div>
<strong>PLUMX Analytics</strong>
<a href='https://plu.mx/plum/a?doi=10.1186/s13321-016-0181-z' class='plumx-plum-print-popup'></a>
<strong>Kudos Resources</strong>
<script src="//api.growkudos.com/widgets/resources/10.1186/s13321-016-0181-z"></script>

Literally all you have to do is copy these few lines and swap out the DOI and the scores and Kudos resources will show up in your CV. Simple.
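If you have a long publication list, swapping the DOI by hand gets tedious. Here is a minimal sketch that generates the three widget lines for any DOI. The function name is hypothetical, the markup simply mirrors the snippet in this post, and it assumes the page already loads whatever loader scripts the Altmetric and PlumX widgets need, which this post does not show.

```python
from html import escape


def cv_widget_html(doi: str) -> str:
    """Build the Altmetric, PlumX and Kudos lines for one DOI.

    The DOI is HTML-escaped before it is placed in attribute values.
    It is not URL-encoded, which is an assumption that holds for
    typical DOIs such as 10.1186/s13321-016-0181-z.
    """
    safe = escape(doi.strip(), quote=True)
    lines = [
        "<strong>AltMetrics Analytics</strong>",
        '<div class="altmetric-embed" data-badge-type="medium-donut" '
        f'data-badge-details="right" data-doi="{safe}"></div>',
        "<strong>PLUMX Analytics</strong>",
        f'<a href="https://plu.mx/plum/a?doi={safe}" class="plumx-plum-print-popup"></a>',
        "<strong>Kudos Resources</strong>",
        f'<script src="//api.growkudos.com/widgets/resources/{safe}"></script>',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the block for the article used as the example in this post.
    print(cv_widget_html("10.1186/s13321-016-0181-z"))
```

Looping over a list of DOIs and pasting the output under each CV entry would reproduce the layout shown above.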

Altmetric, PlumX and Kudos Embedded widgets


Comparing the EPA CompTox Dashboard with ChemSpider for MS-based Structure Identification

It’s almost ten years, this April, since ChemSpider was released to the public at the 233rd ACS meeting in Chicago. For two years, prior to being acquired by RSC in May 2009, we worked very closely with a number of mass spectrometry vendors including Waters (Micromass), Thermo and Agilent. I always considered that the work we did with ChemSpider could be highly valued by the mass spectrometry community. This was especially true after we published the work on the identification of known unknowns with James Little (http://link.springer.com/article/10.1007/s13361-011-0265-y). Certainly ChemSpider has become highly recognized, and used, by an increasing number of mass spectrometry vendors (through the ChemSpider Web Services).

A few months ago Andrew McEachran joined our team as a postdoc. By combining my experience of applying ChemSpider to structure identification, his mass spectrometry skills and experience, and the work of our tremendous development team on the CompTox Chemistry Dashboard, we were able to make some further advances in the “identification of known unknowns”. Our efforts were recently reported in the publication “Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard” (http://link.springer.com/article/10.1007%2Fs00216-016-0139-z). Readers are pointed to the summary tables in the results section of the article, which demonstrate the improved performance of the CompTox Chemistry Dashboard based on high-quality data sources and new approaches to rank-ordering the results of formula and mass searching.

We recently rolled out new functionality, including “MS-Ready structure batch-based searching”, to offer even greater support for MS-based structure identification. We will report on further extensions to this work at the Spring ACS Meeting.

 
The AltMetrics for the Article are shown below


Spring ACS Meeting San Francisco, April 2017

The Spring ACS Meeting is coming, and it’s coming quickly. Every time the New Year starts I think I have a long time before I have to assemble posters and write talks for the ACS Meeting. When I worked at the RSC it was easier in some ways as NO ONE reviewed them, no one gave comments on them and there was no clearance process involved. Mostly I was writing the talks on the flight out to the ACS or, more commonly, was writing them the evening before or morning of the presentations. There have been days when I got up in the morning at 4am to write two talks on the day I presented. Quite exhausting but at least I got to show the latest and greatest capabilities.

As an employee at the EPA there are different expectations, especially with regard to the clearance process, where the presentations are reviewed and signed off, pushed through our internal repository and, post-presentation, released to the community via Science Inventory. Some, but not all, of the presentations and papers I have been involved with since joining EPA are here.

I will be going to the ACS meeting with a number of colleagues and chairing a session on Thursday, all day, with Chris Grulke for the Division of Environmental Chemistry. I will be presenting a number of posters and presentations as listed below. A number of my colleagues will also be presenting. Andrew McEachran, a recent postdoc with the center, will be presenting a lot of the work that has been done on using the Chemistry Dashboard to facilitate structure identification. The recent publication “Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard” (http://link.springer.com/article/10.1007%2Fs00216-016-0139-z) reported on a comparison of the dashboard versus ChemSpider. Since then we have rolled out a lot of new functionality to support structure identification and Andrew will report on that.

PAPER ID: 2624963
PAPER TITLE: Twenty five years in cheminformatics: A career path through a diverse series of roles and responsibilities

DIVISION: Division of Chemical Information
SESSION: Careers in Chemical Information
PRESENTATION FORMAT: Oral
DAY & HALF DAY OF PRESENTATION: Sunday, April, 02, 2017 – AM

PAPER ID: 2616719
PAPER TITLE: Evaluating suspect screening and non-targeted analysis approaches using a collaborative research trial at the US EPA

DIVISION: Division of Analytical Chemistry
SESSION: Analytical Division Poster Session
PRESENTATION FORMAT: Poster
DAY & HALF DAY OF PRESENTATION: Sunday, April, 02, 2017 – EVE

PAPER ID: 2624980
PAPER TITLE: EPA CompTox chemistry dashboard: An online resource for environmental chemists

DIVISION: Division of Chemical Health and Safety
SESSION: Information Flow in Environmental Health & Safety
PRESENTATION FORMAT: Oral
DAY & HALF DAY OF PRESENTATION: Tuesday, April, 04, 2017 – PM

PAPER ID: 2624984
PAPER TITLE: Delivering an informational hub for data at the National Center for Computational Toxicology

DIVISION: Division of Environmental Chemistry
SESSION: Applications of Cheminformatics & Computational Chemistry in Environmental Health
PRESENTATION FORMAT: Poster
DAY & HALF DAY OF PRESENTATION: Wednesday, April, 05, 2017 – EVE

Looking forward to seeing you at ACS!

 


Where did all of these Articles Associated With Me Come From on Mendeley

Recently I posted that Google must have changed their algorithm and, as a result, automagically introduced a lot of new articles to my profile that had nothing to do with me. It took work to prune them off and hopefully they do not reappear. Tonight I went through the process of adding the past few months of publications to get my Mendeley profile up to date and, lo and behold, there was a whole series of new publications that were NOT there the last time I checked Mendeley. Interestingly they were all articles about superconducting materials, as were many of those that had appeared on my Google profile. Is it possible that Elsevier is somehow sourcing the information from Scholar? Or is Elsevier sourcing these articles from within its own library? Of course the articles all have an author “A. Williams” associated with them. I have already started the process of pruning them out. Not happy…

Articles associated with A. Williams on Mendeley



Mendeley Expanding my Worldwide Followers in a Big Way

I adopted Mendeley very early and was a defender of their decision to join Elsevier. I didn’t beat them up in the mediasphere for moving from the Open start-up to the publisher’s corporate mode. I did that myself when ChemSpider was acquired by the Royal Society of Chemistry (RSC is a charity but is also a publisher).

Over the past few weeks I have noticed new followers showing up on my profile. In the first couple of years most of my Mendeley followers were actually names I recognized from my domains of experience of cheminformatics and Nuclear Magnetic Resonance. Most of the followers were scientists whose papers I had read and whose work I was aware of. But things are now different.

I have pasted a picture below of the past month or so of new followers. I don’t recognize any of them at all and, as far as I can see from drilling down into their profiles, they are not from my domain. I cannot figure out whether these are just random followers or not, but I guess I should appreciate Mendeley and Elsevier for exposing my work, and publications, to a worldwide community of new followers. I am surprised by the new international exposure! THANKS

The past few days of new Mendeley followers


 

 
