
Category Archives: General Communications

Presentations at the Spring ACS Meeting in Orlando, April 2019

I am giving a number of presentations at the ACS meeting in Orlando in April 2019. If you are interested in coming to listen and maybe chat afterwards, please see the list below.

1) PAPER ID: 3080890 
PAPER TITLE: Consensus ranking and fragmentation prediction for identification of unknowns in high resolution mass spectrometry (final paper number: AGFD 10)


DIVISION: Division of Agricultural and Food Chemistry
SESSION: Recent Advances in Food Fraud & Authenticity Analysis
SESSION TIME: 8:30 AM – 10:55 AM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Sunday, March 31, 2019 from 9:25 AM – 9:50 AM
ROOM & LOCATION: Florida Ballroom B  – Hyatt Regency Orlando 

Title: Consensus ranking and fragmentation prediction for identification of unknowns in high resolution mass spectrometry

Antony J. Williams1, Andrew McEachran2, Tommy Cathey3, Tom Transue3, Jon Sobus4

High resolution mass spectrometry (HRMS) and non-targeted analysis (NTA) are advancing the identification of emerging contaminants in environmental and agricultural matrices.  However, confidence in structure identification of unknowns in NTA presents challenges to analytical chemists.  Structure identification requires integration of complementary data types such as reference databases, fragmentation prediction tools, and retention time prediction models.  The goal of this research is to optimize and implement structure identification functionality within the US EPA’s CompTox Chemicals Dashboard, an open chemistry resource and web application containing data for ~760,000 substances.  Rank-ordering the number of sources associated with chemical records within the Dashboard (Data Source Ranking) improves the identification of unknowns by bringing the most likely candidate structures to the top of a search results list.  Incorporating additional data streams contained within the database underlying the Dashboard further enhances identifications.  Integrating tandem mass spectrometry data into NTA workflows enables spectral match scores and increases confidence in structural assignments.  We have generated and stored predicted MS/MS fragmentation spectra for the entirety of the Chemistry Dashboard using the in silico prediction tool CFM-ID.  Predicted fragments incorporated into the identification workflow were used as both a scoring term and as a candidate threshold cutoff.  Combining these steps within an open chemistry resource provides a freely available software tool for structure identification and NTA. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
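As a rough illustration of the ranking idea in this abstract, here is a minimal Python sketch. It assumes each candidate carries a count of associated data sources and a fragment match score between 0 and 1; the `Candidate` class, the candidate names, the numbers and the cutoff value are all hypothetical, and this is not the scoring actually used in the Dashboard. It only shows how a fragment score can act as both a threshold cutoff and a tie-breaking scoring term alongside the source count.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # hypothetical candidate label
    source_count: int      # number of data sources associated with the record
    fragment_score: float  # assumed 0..1 match against predicted fragments


def rank_candidates(candidates, min_fragment_score=0.0):
    """Drop candidates below the fragment cutoff, then sort by source
    count (descending), using the fragment score to break ties."""
    kept = [c for c in candidates if c.fragment_score >= min_fragment_score]
    return sorted(
        kept,
        key=lambda c: (c.source_count, c.fragment_score),
        reverse=True,
    )


# Made-up candidates sharing one formula, purely for illustration.
candidates = [
    Candidate("candidate A", 3, 0.40),
    Candidate("candidate B", 57, 0.82),
    Candidate("candidate C", 12, 0.05),
    Candidate("candidate D", 57, 0.91),
]

ranked = rank_candidates(candidates, min_fragment_score=0.10)
for rank, c in enumerate(ranked, start=1):
    print(rank, c.name, c.source_count, c.fragment_score)
```

In this toy data set candidate C is removed by the cutoff, and the two candidates with equal source counts are separated by their fragment scores.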

2) PAPER ID: 3081133 
PAPER TITLE: Applications of the US EPA’s CompTox chemicals dashboard to support structure identification and chemical forensics using mass spectrometry (final paper number: ANYL 320)


DIVISION: Division of Analytical Chemistry
SESSION: Frontiers in Forensic Mass Spectrometry
SESSION TIME: 8:00 AM – 12:10 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Tuesday, April 2, 2019 from 11:40 AM – 12:10 PM
ROOM & LOCATION: Plaza International Ballroom K  – Hyatt Regency Orlando

Title: Applications of the US EPA’s CompTox Chemicals Dashboard to support structure identification and chemical forensics using mass spectrometry

Antony J. Williams, Andrew D. McEachran, Jon R. Sobus and Emma Schymanski

High resolution mass spectrometry (HRMS) and non-targeted analysis (NTA) are of increasing interest in chemical forensics for the identification of emerging contaminants and chemical signatures of interest. At the US Environmental Protection Agency, our research using HRMS for non-targeted and suspect screening analyses utilizes databases and cheminformatics approaches that are applicable to chemical forensics. The CompTox Chemicals Dashboard is an open chemistry resource and web-based application containing data for ~760,000 substances. Basic functionality for searching through the data is provided through identifier searches, such as systematic name, trade names and CAS Registry Numbers. Advanced Search capabilities supporting mass spectrometry include mass and formula-based searches, combined substructure-mass searches and searching experimental mass spectral data against predicted fragmentation spectra. A specific type of data mapping in the underpinning database, using “MS-Ready” structures, has proven to be a valuable approach for structure identification that links structures that can be identified via HRMS with related substances in the form of salts, and other multi-component mixtures that are available in commerce. This presentation will provide an overview of the CompTox Chemicals Dashboard and demonstrate its utility for supporting structure identification and NTA in chemical forensics. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

3) PAPER ID: 3084559 
PAPER TITLE: Antony Williams, the ChemConnector: A career path through a diverse series of roles and responsibilities (final paper number: CINF 25)

DIVISION: Division of Chemical Information
SESSION: Careers in Chemical Information
SESSION TIME: 1:30 PM – 4:25 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Sunday, March 31, 2019 from 3:05 PM – 3:25 PM
ROOM & LOCATION: West Hall B4 – Theater 11  – Orange County Convention Center

Antony Williams, the ChemConnector – a career path through a diverse series of roles and responsibilities

Authors: Antony Williams

Antony Williams is a Computational Chemist at the US Environmental Protection Agency in the National Center for Computational Toxicology. He has been involved in cheminformatics and the dissemination of chemical information for over twenty-five years. He has worked for a Fortune 500 company (Eastman Kodak), in two successful start-ups (ACD/Labs and ChemSpider), for the Royal Society of Chemistry (in publishing) and, now, at the EPA. Throughout his career path he has experienced multiple diverse work cultures and focused his efforts on understanding the needs of his employers and the often unrecognized needs of a larger community. Antony will provide a short overview of his career path and discuss the various decisions that helped motivate his change in career from professional spectroscopist to website host and innovator, to working for one of the world’s foremost scientific societies and now for one of the most impactful government organizations in the world. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

4) PAPER ID: 3084590 
PAPER TITLE: US-EPA CompTox chemicals dashboard: A web-based data integration hub for environmental chemistry data (final paper number: CINF 43)


DIVISION: Division of Chemical Information
SESSION: Web-Based Chemoinformatics Platforms
SESSION TIME: 8:00 AM – 11:50 AM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Monday, April 1, 2019 from 11:20 AM – 11:50 AM
ROOM & LOCATION: West Hall B4 – Theater 10  – Orange County Convention Center

The EPA Comptox Chemicals Dashboard as a Data Integration Hub for Environmental Chemistry Data

Authors: Antony Williams, Andrew McEachran, Imran Shah, Richard Judson, John Wambaugh, Nancy Baker, George Helman, Chris Grulke, Kamel Mansouri, Grace Patlewicz, Ann Richard and Jeff Edwards.

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This involves computational and data-driven approaches that integrate chemistry, exposure and biological data. The National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences, including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. The CompTox Chemicals Dashboard is a web-based application providing access to data associated with ~760,000 chemical substances. New data are continuously added to the database on an ongoing basis, along with registration of new and emerging chemicals. This includes data extracted from the literature, identified by our analytical labs, and otherwise of interest to support specific research projects to the agency. By adding these data, with their associated chemical identifiers (names and CAS Registry Numbers), the dashboard uses linking approaches to allow for automated searching of PubMed, Google Scholar and an array of public databases. This presentation will provide an overview of the CompTox Chemicals Dashboard, how it has developed into an integrated data hub for environmental data, and how it can be used for the analysis of emerging chemicals in terms of sourcing related chemicals of interest, and deriving read-across as well as QSAR predictions in real time. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

5) PAPER ID: 3084575 
PAPER TITLE: EPA CompTox chemicals dashboard: An online resource for environmental chemists (final paper number: CINF 94)


DIVISION: Division of Chemical Information
SESSION: Applications of Cheminformatics to Environmental Science
SESSION TIME: 8:00 AM – 12:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Wednesday, April 3, 2019 from 8:25 AM – 8:45 AM

ROOM & LOCATION: West Hall B4 – Theater 10  – Orange County Convention Center 

EPA CompTox Chemicals Dashboard – an online resource for environmental chemists

Authors: Antony Williams, Chris Grulke, Jennifer Smith, Kamel Mansouri, Andrew McEachran, Kathie Dionisio, Katherine Phillips, Grace Patlewicz, Jeremy Fitzpatrick, Nancy Baker, Todd Martin, Ann Richard and Jeff Edwards

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data driven approaches that integrate chemistry, exposure and biological data. As an outcome of these efforts the National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. A series of software applications and databases have been produced over the past decade to deliver these data. Recent work has focused on the development of a new architecture that assembles the resources into a single platform. With a focus on delivering access to Open Data streams, web service integration accessibility and a user-friendly web application the CompTox Chemicals Dashboard provides access to data associated with ~720,000 chemical substances. These data include research data in the form of bioassay screening data associated with the ToxCast program, experimental and predicted physicochemical properties, product and functional use information and related data of value to environmental scientists. This presentation will provide an overview of the CompTox Chemicals Dashboard and its value to the community as an informational hub. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

6) PAPER ID: 3095464 
PAPER TITLE: Cheminformatics approaches to support chemical identification delivered via the EPA CompTox Chemicals Dashboard (final paper number: ENVR 173)


DIVISION: Division of Environmental Chemistry
SESSION: Accurate Mass/High Resolution Mass Spectrometry for Environmental Monitoring & Remediation
SESSION TIME: 1:00 PM – 4:10 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Monday, April 1, 2019 from 1:25 PM – 1:45 PM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 8  – Orange County Convention Center

Cheminformatics approaches to support chemical identification delivered via the EPA CompTox Chemicals Dashboard

Antony J. Williams, Andrew McEachran, Chris M. Grulke, Elin M. Ulrich and Jon R. Sobus

The identification of chemicals in environmental media depends on the application of analytical methods, the primary approach being one of the multiple mass spectrometry techniques. Cheminformatics solutions are critical to supporting the chemical identification process. This includes the assembly of large chemical substance databases, prioritization ranking of potential candidate search hits, and search approaches that support both targeted and non-targeted screening approaches. The US Environmental Protection Agency CompTox Chemicals Dashboard is a web-based application providing access to data for over 760,000 chemical substances. This includes access to physicochemical property, environmental fate and transport data, both human and ecological toxicity data, information regarding chemicals contained in products in commerce, and in vitro bioactivity data. Searches are allowed based on chemical identifiers, product and use, genes and assays associated with the EPA ToxCast assays and, specific to supporting mass spectrometry, searches based on masses and formulae. These searches make use of a novel “MS-Ready structures” approach collapsing chemicals related as mixtures, salts, stereoforms and isotopomers. The dashboard supports both singleton and batch searching by accurate mass/chemical formula, supported by MS-ready structures, and utilizes rich metadata to facilitate candidate ranking and the prioritization of chemicals of concern based on toxicity and exposure data. The dashboard also hosts tens of chemical lists that have been assembled from public databases, many supporting non-targeted analysis and mass spectrometry databases.

This presentation will provide an overview of the dashboard and will review our latest research into structure identification by searching experimental mass spectrometry data against predicted fragmentation spectra for LC-MS (positive and negative ion mode) and GC-MS (EI), a total of 3 million predicted spectra. We will also provide an overview of our progress supporting structure and substructure searching, using mass and formula-based filtering, and report on the latest applications of the dashboard to support structure identification projects of interest to the EPA. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

7) PAPER ID: 3084594 
PAPER TITLE: US-EPA comptox chemicals dashboard: an information hub for over five thousand per- & polyfluoroalkyl chemical substances (final paper number: ENVR 217)


DIVISION: Division of Environmental Chemistry
SESSION: Per- & Polyfluoroalkyl Substances in the Environment: From Legacy To Emerging Contaminants
SESSION TIME: 8:30 AM – 12:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Tuesday, April 2, 2019 from 10:10 AM – 10:30 AM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 10  – Orange County Convention Center

Title: The US-EPA CompTox Chemicals Dashboard – an information hub for over five thousand per- & polyfluoroalkyl chemical substances

Authors: Antony Williams, Chris Grulke, Grace Patlewicz and Ann Richard

The EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard) is a publicly accessible website providing access to data for ~770,000 chemical substances, the majority of these represented as chemical structures. The web application delivers a wide array of computed and measured physicochemical properties, in vitro high-throughput screening data and in vivo toxicity data, product use information extracted from safety data sheets, and integrated chemical linkages to a growing list of literature, toxicology, and analytical chemistry websites. The application provides access to segregated lists of chemicals that are of specific interest to relevant stakeholders, including Per- & Polyfluoroalkyl Substances (PFAS) containing thousands of chemicals. A procured testing library of hundreds of PFAS chemicals annotated into chemical categories has been integrated into the dashboard with a number of resulting benefits: a searchable database of chemical properties, with hazard and exposure predictions, and links to the open literature. Several specific search types have been developed to directly support the mass spectrometry non-targeted screening community, enabling cohesive workflows to support data generation for the detection and assessment of environmental exposures to chemicals contained within DSSTox. This presentation will provide an overview of the dashboard, the ongoing expansion of the PFAS chemical library, with associated categorization, and new physicochemical property and environmental fate and transport QSAR prediction models developed for these chemicals. The application of the dashboard to support mass spectrometry non-targeted analysis studies for the identification of PFAS chemicals will also be reviewed. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

8) PAPER ID: 3084611 
PAPER TITLE: CompTox chemicals dashboard: Data and tools to support chemical and environmental risk assessment and the ENTACT project (final paper number: ENVR 648)


DIVISION: Division of Environmental Chemistry
SESSION: True Positives in EPA’S Non-Targeted Analysis Collaborative Trial (ENTACT)
SESSION TIME: 1:30 PM – 5:00 PM

PRESENTATION FORMAT: Oral
DAY & TIME OF PRESENTATION: Wednesday, April 3, 2019 from 2:15 PM – 2:35 PM
ROOM & LOCATION: Valencia Ballroom B-D – Theater 13  – Orange County Convention Center

Title: The CompTox Chemicals Dashboard: Data and Tools to Support Chemical and Environmental Risk Assessment and the ENTACT project

Authors and affiliations: Antony J. Williams1, Christopher M. Grulke1, Andrew D. McEachran2, Emma L. Schymanski3,4, Jon Sobus5, Elin Ulrich5, Ann M. Richard1, Jeremy Dunne1 and Jeff Edwards1

1 EPA, National Center for Computational Toxicology, RTP, NC, USA

2 ORISE Fellow, Oak Ridge Institute for Science and Education, Oak Ridge, TN, USA

3 Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, 6, avenue du Swing, L-4367 Belvaux, Luxembourg

4 EPA, National Exposure Research Laboratory, RTP, NC, USA

Information and data on chemicals is used by scientists to evaluate potential health and ecological risks due to environmental exposures. EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov) helps evaluate the safety of chemicals by providing public access to a variety of information on over 760,000 chemicals. Within the Dashboard, users can access chemical structures, chemistry information, toxicity data, hazard data, exposure information, and additional links to relevant websites and applications. These data are compiled from sources including EPA’s computational toxicology research databases, from public domain databases and with collaborators across the world. Chemical lists have been added that provide access to various classes of chemicals and project-based datasets are under constant development. Specific functionality has been delivered within the Dashboard to support mass spectrometry including “MS-ready forms” of chemical substances that would be detectable by mass spectrometry. Workflows have been developed to assist in candidate identification and have now been proven with multiple published studies. An integration path between the dashboard and MetFrag has also been established to provide users the significant benefits resulting from the marriage between the two applications. The datasets underpinning the dashboard are freely available (https://comptox.epa.gov/dashboard/downloads) for integration into third party databases. This presentation will provide an overview of the available data types and functionality of the dashboard prior to examining how it is developing to support mass spectrometry based analyses within the agency and for the community in general. This will include a review of our research efforts to enhance the dashboard using in silico MS/MS fragmentation prediction for spectral matching. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

 

A TERRIBLE implementation of Name Searching on ACS Journals

Yes, I am a Williams. And THAT is an incredibly common surname. But I am an Antony Williams, notice no H in the name, i.e. NOT Anthony. In the field of chemistry there are not many of us around…a couple I know of, but not many overall. Google Scholar does an extremely good job of automatically associating my newly published articles with my Citations profile here: https://scholar.google.com/citations?user=O2L8nh4AAAAJ

The last five articles automatically associated with my profile. I do NOT make any associations manually at this point.


I am assuming that this is done by understanding the type of work I publish on, some of the co-author network maps that have been established as my profile has developed, etc. I assume that their approach is very intelligent relative to some of the more commonplace searches that have been implemented…certainly the results are GOOD.

I noticed one disastrous example today when our article “ChemTrove: Enabling a Generic ELN to Support Chemistry Through the Use of Transferable Plug-ins and Online Data Sources” was published on the Journal of Chemical Information and Modeling here. Right there to the left of the abstract is an offer to look at other content by the authors.

Look for related content by the authors on JCIM


I was interested to see what else ACS knew about my content so I clicked on my name…which performed this search: http://pubs.acs.org/action/doSearch?ContribStored=Williams%2C+A  and provided me with 96 articles by Andrew Williams (mostly), by Aaron Williams, by Anthony Williams (not me) and Allan Williams (to name a few). Eventually I managed to find 3 that were associated with me by searching the list for Antony Williams but none of those I published as Antony J. Williams were recovered.
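To make the problem concrete, here is a small Python sketch of what I assume is happening: the search key in the URL decodes to "Williams, A", so every author who shares a surname and first initial matches. The helper functions are my own illustration, not anything ACS has published about its search, and the author list is just the set of names mentioned in this post.

```python
from urllib.parse import urlsplit, parse_qs

# The search URL produced by clicking on my name.
url = "http://pubs.acs.org/action/doSearch?ContribStored=Williams%2C+A"
query = parse_qs(urlsplit(url).query)["ContribStored"][0]
print(query)  # the decoded search key


def surname_initial_key(full_name):
    """Reduce a full name to 'Surname, F' (surname plus first initial)."""
    parts = full_name.replace(".", "").split()
    return f"{parts[-1]}, {parts[0][0]}"


def given_and_surname(full_name):
    """Keep the full given name and surname, ignoring middle initials."""
    parts = full_name.replace(".", "").split()
    return (parts[0].lower(), parts[-1].lower())


# Names mentioned in this post.
authors = [
    "Antony Williams",
    "Antony J. Williams",
    "Andrew Williams",
    "Aaron Williams",
    "Anthony Williams",
    "Allan Williams",
]

# Loose matching: every A. Williams collapses onto the same key.
loose_hits = [a for a in authors if surname_initial_key(a) == query]

# Stricter matching: full given name plus surname, middle initial optional.
strict_hits = [a for a in authors if given_and_surname(a) == ("antony", "williams")]

print(loose_hits)
print(strict_hits)
```

Matching on the full given name while treating the middle initial as optional recovers both "Antony Williams" and "Antony J. Williams" without pulling in the Andrews and Anthonys, although it still cannot separate two different people who share the same full name.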

Also, my colleague Valery Tkachenko is listed as an author with a misspelling as Valery Tkachenkov. What is simply inappropriate in my opinion is how the process involved taking the list of our submitted names, copied below directly from the submitted manuscript, and changing them to their own interpretation of how we would want to see our names listed.

From this:

Aileen E. Day*†, Simon J. Coles, Colin L. Bird, Jeremy G. Frey, Richard J. Whitby, Valery E. Tkachenko§, Antony J. Williams§

To This:

Names changed from the original manuscript to those produced at submission


Notice that for Aileen and Jeremy the middle initials were expanded, Colin had his middle initial changed from L. to I., Richard, Valery and I had our middle initials dropped and Valery had a v added to his surname. Why not simply copy and paste the names from the manuscript?

I will point out that this is a “Just Accepted” manuscript and the changes in names will likely be caught and edited, especially now that I have pointed them out. “Just Accepted” does have some disclaimers:

The disclaimers regarding Just Accepted manuscripts


While they can edit the names to match what we originally provided, I don’t think it will fix the issue regarding finding all of my articles on ACS journals, as when I navigated to one of my other articles here, http://pubs.acs.org/doi/abs/10.1021/es0713072, and did the search from my listed name it found exactly the same 96 hits.

Maybe a thought to use my ORCID profile http://orcid.org/0000-0002-2668-4821 to look for ACS journal articles associated with my name?
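A minimal sketch of what I have in mind, in Python. The check-digit routine follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers; the article records are hypothetical placeholders, and I am assuming (it is only an assumption) that a publisher stores an ORCID iD alongside each author entry.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the length, digits and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


MY_ORCID = "0000-0002-2668-4821"  # the iD from my ORCID profile

# Hypothetical records: titles and name variants are placeholders.
records = [
    {"title": "hypothetical article 1", "author": "Antony Williams", "orcid": MY_ORCID},
    {"title": "hypothetical article 2", "author": "Antony J. Williams", "orcid": MY_ORCID},
    {"title": "hypothetical article 3", "author": "Anthony Williams", "orcid": None},
]

# Match on the identifier rather than on the name string.
mine = [r["title"] for r in records if r["orcid"] == MY_ORCID]

print(is_valid_orcid(MY_ORCID))
print(mine)
```

Keying on the iD means the "Antony Williams" and "Antony J. Williams" variants land in the same bucket, and the Anthony with an H does not.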

Unfortunately the data is already out in the wild as when I claimed the article on Kudos all of the name spelling issues had clearly spilled over via the DOI: https://www.growkudos.com/articles/10.1021%252Fci5005948

Names transferred via DOI to the Grow Kudos Platform


Ah…the things that surprise me….or not.

 

Name disambiguation, ORCIDs and author IDs for Science Books

Those of you who follow my blog will know that I am a fan of ORCIDs and it is great to hear that there are now over a MILLION ORCIDs issued! As far as I am concerned, the sooner I can start claiming all of my books and book chapters against MY ORCID, and then moving that information to other platforms, the better. My Amazon Author Page is here: http://www.amazon.com/Antony-J.-Williams/e/B004YRPRV2 and I am glad to say that, although there is a book called “I Hate Sex” by an author named Antony J. Williams, exactly the spelling of my name, it is NOT associated with me. Phew…

If we could start to make sure, somehow, that ORCIDs, or at least some form of AUTHOR IDs were utilized by all publishers and associated with books that are published (and listed on Amazon and Google Books) then maybe we wouldn’t have this problem listed below….

My GREAT FRIEND Gary Martin (and oftentimes mentor in NMR) and I are editing a two-volume series with David Rovnyak. Volume 1 is listed on Amazon here and Volume 2 is here. Now then…Gary is rather well known in the world of NMR…his Wikipedia page is here. On Amazon his skill set is listed under “About the Author” as:

“Gary E. Martin graduated with a B.S. in Pharmacy in 1972 from the University of Pittsburgh and a Ph.D. in Pharmaceutical Sciences from the University of Kentucky in 1975, specializing in NMR spectroscopy. He was a Professor at the University of Houston from 1975 to 1989, assuming the position of Section Head responsible for US NMR spectroscopy at Burroughs Wellcome, Co. in Research Triangle Park, NC, eventually being promoted to the level of Principal Scientist. In 1996 he assumed a position at what was initially the Upjohn Company in Kalamazoo, MI and held several positions there through 2006 by which time he was a Senior Fellow at what was then Pfizer, Inc. In 2006 he assumed a position as a Distinguished Fellow at Schering-Plough responsible for the creation of the Rapid Structure Characterization Laboratory. He is presently a Distinguished Fellow at Merck Research Laboratories.”

So HOW interesting to see who Google Books thinks he is! See the link here… it reads as

“Gary Martin’s career as a freelance comic book artist spans over twenty years. He’s worked for all the major companies, including Marvel, DC, Dark Horse, Image, and Disney, and on such titles as, Spider-man, X-men, Batman, Star Wars, and Mickey Mouse. Gary is best known for his popular how-to books entitled, ‘The Art of Comic Book Inking’. Recently, Gary wrote a comic book series called ‘The Moth’, which he co-created with artist Steve Rude.”

I am not listed as an editor and for sure the information is out of date since David Rovnyak joined as an editor this year.


This is Gary Martin, the inker.

So…I am very interested in any hypotheses regarding how Google Books picked up a comic inker as an author when Amazon lists Gary as a scientist, clearly. By the way, Gary Martin, NMR spectroscopist extraordinaire is a brilliant photographer, especially of lighthouses…but manipulates light…not ink.

Imagine, if you would, the potential power of ORCIDs in keeping this clear, platform to platform, if the publisher used them, if Amazon adopted them and if Google Books used the data. With time…

 

 

 

Posted on November 18, 2014 in Book Reviews, General Communications, ORCID

 

Converting Crystal Structures into 3D Printable Files

We have been working with Vincent Scalfani from the University of Alabama towards supporting a community of 3D printing crystal structure enthusiasts. There is a listserv, [3DP-XTAL], hosted by the University of Alabama and if you would like to be added to the listserv, simply email Vincent at vfscalfaniATuaDOTedu. They are also in the process of creating a 3D printing crystal structure wiki/blog for the community.

With Vincent as the driver we are creating a public on-line repository for 3D printable structure files (.stl and .wrl). He used Jmol to prepare ~30,000 molecules and solids in .wrl and .stl format and we will be hosting them on part of our data repository.  We are very excited about this project and there will be more information at the upcoming 248th American Chemical Society Meeting in San Francisco, CA. See CINF Abstract # 125.

The flier that will be distributed at the IUCr meeting in Montreal in August is available on Slideshare here:

 

Posted on July 22, 2014 in General Communications

 

What LinkedIn Contacts Think I Know…

There has been a new capability on LinkedIn for a while…the ability to add your judgments about people you are linked to in your network. What this looks like is shown below. It’s been interesting to see what I have been “endorsed” for on my profile.

 


I would agree…I am a Chemist first, then an NMR Spectroscopist but I would put cheminformatics and analytical chemistry above Drug Discovery.   I DO like this approach for “tagging” skillsets though and I can see it has a natural role in other ways…a fun project for the New Year. Watch this space.

 

Posted on December 26, 2012 in General Communications

 

Google Scholar Citations continues to impress

I continue to be impressed with Google Scholar Citations. I receive regular emails, similar to the one below, telling me when papers are referencing articles I have authored/co-authored. In this case this article referred to a paper that I co-authored in 1996 while I was at Kodak…regarding silver-catalyzed cyclizations. I would not have expected a paper about photographic based organic chemistry to show up in a Toxicology journal. But thanks to Google Scholar Citations now I know…

 

Posted on October 17, 2012 in General Communications

 


Social networking tools as public representations of a scientist

This is one of my presentations at the ACS meeting today in San Diego regarding how to use social networking tools to expose yourself as a scientist

Social networking tools as public representations of a scientist

The web has revolutionized the manner by which we can represent ourselves online by providing us the ability to expose our data, experiences and skills online via blogs, wikis and other crowdsourcing venues. As a result it is possible to contribute to the community while developing a social profile as a scientist. At present many scientists are still measured by their contributions using the classical method of citation statistics and a number of freely available online tools are now available for scientists to manage their profile. This presentation will provide an overview of tools including Google Scholar Citations and Microsoft Academic Search and will discuss how these and other tools, when integrated with the ORCID identifier, may more fully recognize the collective contributions to science. I will also discuss how an increasingly public view of us as scientists online will likely contribute to our reputation above and beyond citations.

 

Adding SORD Database (Selected Organic Reactions Database) to ChemSpider

As discussed over on the ChemSpider blog we will soon be depositing data from the SORD databases (Selected Organic Reactions Database) onto ChemSpider. This will be done as two separate but related datasets under the SORD data source: Reactants and Products. If you don’t know what SORD is then who better to explain than Dick Wife, the “host” of the SORD database? Dick wrote the article below to provide an overview of what SORD is…ENJOY!

The Selected Organic Reactions (SOR) Database: capturing “Lost Chemistry”

Dick Wife, SORD B.V. The Netherlands (www.sord.nl; dick.wife@sord.nl)

A new database is capturing the 80% of Lost Chemistry from theses and dissertations that doesn’t make it into publications, and chemists who contribute their data get access to the entire database for free.

SORD, an independent Dutch company, is carefully selecting the synthetic chemistry focused on Life Science research and making this chemistry available in their Selected Organic Reactions (SOR) Database. For the theses/dissertations which they select, SORD excerpts all of the reactions in the Experimental section. This means there will still be a small overlap of data with full publications. There will also be a larger overlap with publications such as Notes, Letters or Communications, but these do not contain the experimental details. The SOR Database brings all this chemistry to the desktop, every last detail written by the author.

Some time back, SORD looked at around 300k interesting drug-like compounds in the literature and which countries they had come from, and the native language. The English-speaking countries accounted for only 37% of the total. German/Swiss dissertations are often written in English but this is new. The theses and dissertations in the other languages represent more than half of the total. SORD routinely translates German and French experimental texts into English. They are about to start on Chinese and Japanese translations and, if anyone can give them access to Russian theses, they will translate these as well!

A thesis or dissertation is the result of several years of hard work by a research student under the constant supervision of the research leader, whose reputation is at stake if the work described is wrong or inaccurate. It is also examined by a committee who decide on awarding the degree, or not. They closely scrutinize the Results & Discussion as well as the Experimental sections. The chemistry is reliable.

Advanced Chemistry Development, Inc (ACD/Labs) is partnering with SORD in developing this Database. The SOR Database is available for in-house use with ChemFolder Enterprise or on the Internet with ACD/Web Librarian™. This is a screenshot of a typical SOR Database record in Web Librarian.

 

The Reaction Scheme shows every atom (there are no abbreviations). The Experimental text is edited to ASCII format and the key parameters (Reagent(s), Solvent(s), yield(s), MP(s) and Optical Rotation(s)) are displayed in separate Fields, as are the full bibliographic data, making data-mining possible. There is also a link which enables the user to bring up the PDF of each reaction, containing all of the spectral and other physical data which SORD does not excerpt. The PDF link is a powerful and unique feature of the SOR Database.

Now some explanation about SORD’s excerption rules. What they call the Reaction Scheme (A + B → C, etc.) contains only the reacting and product compound structures. A Reagent is an essential reaction component of which no part ends up in the product – if it does, it becomes a Reactant! When several reactions are performed before the product is isolated (and characterized), the Reagents and Solvents are listed in Steps. Failed reactions are not excerpted but reactions with poor yields are.

The SOR Database currently contains 170k reactions; the target is one million at the end of 2013. Even this number is a lot smaller than what you find today in the major commercial reaction databases. Back in the nineties, SORD researchers looked at one such large commercial database which then contained 9 million compounds. Sifting through the content for drug-like compounds resulted in just 450k or 5% of the records[1]. Size is one database metric; quality is much more important! In the SOR Database, you will only find characterized products – and no polymers, or compounds with no molecular structure.

Users of the SOR Database also have access to the separate databases which contain the Reagents (ca. 3,000) and Solvents (ca. 450) which have been encountered so far. Often a Reagent is a catalyst (organic/organometallic) but they can also be simple entities like bases, acids, ammonium salts, etc. or complex chiral ligands. Authors give Reagents many different names and so each Reagent (and Solvent) in the SOR Database has been assigned a unique name. This enables rapid searches using the assigned names, again a novel feature of the database. Such searches can bring you to really nice chemistry.

As an example, the second-generation Grubbs olefin metathesis catalyst has been given the name Grubbs 2 catalyst. In the current SOR Database, there are more than 500 reactions where it has been used. Some of these are straightforward; some are not and generate novel ring systems like this one from the Martin group at North Carolina at Chapel Hill:

Searches in the Reaction Scheme, or using Reagent/Solvent names and hit refinement, bring you to new chemistry which until now was only found on a dusty shelf in a library. The “Lost Chemistry” is now getting smaller as SORD carefully selects and excerpts the reactions which deserve a new life. The SOR Database is essential for novelty searches and it is a powerful supplement for the other commercial reaction databases.

Finally, some more good news for academic research chemists: your data will be readily accessible to the whole chemical world, who will cite your work in their publications. The chemistry which you never published may be just what others are looking for. SORD routinely excerpts the complete collection of theses and dissertations from research supervisors; they will be more than happy to see your work appear in the next SOR Database!


[1] de Laet, A.; Hehenkamp, J. J.; Wife, R. L. Finding Drug Candidates in Lost/Emerging Chemistry. J. Heterocycl. Chem. 2000, 37, 669–674.

 

The long term cost of inferior database quality

Our recent Drug Discovery Today article

One of the highlights of the past year has been my continued collaborations with Sean Ekins on the issues of data quality, modeling of data and the applications of mobile technologies. Recently our commentary on the long term cost of inferior database quality was published in Drug Discovery Today and is available online here.

 

Posted on December 30, 2011 in Data Quality, General Communications

 

Why am I suddenly so popular as a potential Open Access journal editor?

I have become SOOOOOOOOO popular as a journal editor for Open Access journals. In the past week I have been invited to be a journal editor for three separate Open Access journals. These are simply emails along the lines of “sign up here”, “we are a popular publisher of Open Access journals” and an “editors are encouraged to submit articles” message. My favorite invitation this week is below. Don’t forget I’m a PhD CHEMIST, NMR spectroscopist and cheminformatician…

I chose NOT to respond…

Subject: Invitation to Join the Editorial/Review Board of Journal of Communication Technology and Human Behaviors

Dear Dr,

I am writing to introduce Journal of Communication Technology and Human Behaviors to you. Journal of Communication Technology and Human Behaviors is a new journal launched recently by the Columbia International Publishing (CIP) team. CIP is committed to rapidly delivering high-quality research findings and results to the world. We aim to make all CIP journals top publications in their fields.

Based on your outstanding scientific contribution to your field, the CIP team would like to invite you to join the Editorial/Review Board of Journal of Communication Technology and Human Behaviors

Print ISSN:           2163-128X

Online ISSN:       2163-1298

Journal link:        http://uscip.org/JournalsDetail.aspx?journalID=19

 

Acceptance of submissions to Journal of Communication Technology and Human Behaviors is based solely on decisions of the Editorial Board Committee and peer reviewers. If you are interested in serving on the Editorial Board committee, please send your CV to jcthb@uscip.org and indicate the position (Editor-in-Chief, Associate Editor, Regular Editorial Board Committee member, or Reviewer) you are interested in. CIP will make a selection based on the competition. To accept an Editorial/Review board position, you are required to agree to the terms and conditions given at the end of this invitation letter. The names of Editorial Board Members will be listed online and in print copies.

 

Only with contributions from Editorial Board Committee members and peer reviewers can we make Journal of Communication Technology and Human Behaviors a top journal. If you have any questions or suggestions, please do not hesitate to contact us. We are keenly looking forward to hearing from you.

Sincerely,

Editorial Team

 

Email: jcthb@uscip.org

Phone: 1-573-886-8964

Fax: 1-573-886-8901

 

Columbia International Publishing LLC

3610 Buttonwood Drive Suite 200

Columbia, MO 65201, USA

 

 

 

______________________________________________________________

 

Terms and Conditions of Editorial/Review Board Committee positions

1.   All Editorial/Review Board members should try to promote the journal as a top publication in the field.

2.   This is a voluntary and honorary position. No payment from Columbia International Publishing is associated with this position.

3.   The term is typically two years. It is renewable with approval by the Administration Department of Columbia International Publishing.

4.   The Editor-in-Chief should send manuscripts to at least two experts in the field for review. Editorial/Review Board members should provide timely, fair, objective, and professional comments on the manuscripts assigned by the Editor-in-Chief.

5.   Editorial/Review Board members should never disclose information pertaining to any manuscript under their review.

6.   Columbia International Publishing reserves the exclusive right to change any rules and terms and conditions without prior notice.

 

 

 

Posted on December 14, 2011 in General Communications

 
 
Stop SOPA