Archive for category Open Science..all its forms
This week/weekend I will attend the ScienceOnline2013 conference here in Raleigh, North Carolina. This is my favorite conference of the year, bar none. I feel privileged every time I attend to be surrounded by people who are challenging the status quo and are passionate about making science more available and consumable to their peers and the community. I have met some great people at this conference and every year I walk away tired yet invigorated. I walk away feeling that my own contributions to science, especially my work to enable access to chemistry data, are coherent with the efforts of many of the crowd attending this meeting. The meeting has a commitment to scientific truth, collaboration, communication and openness. YES!!!
While I am a chemist by training, what I enjoy so much about the meeting is meeting NON-chemists and learning about their world, their interests, their adventures and challenges. By keeping my head in my own box at many other conferences, primarily chemistry of course, I limit what I can learn from the experiences outside of my domain. ScienceOnline frees me up from these boundaries by throwing me into a mix of wildly different engagements. It is, quite simply, a joy! And coming at the beginning of the year it is the first conference I attend…always good!
The conference is well organized, offers wall-to-wall entertainment in various forms (including science comedians!), is socially engaging (lots of opportunities for after-hours play!) and is full of “my kind of people”. I am lucky to be so close and, this year, to be able to share space with one of my closest friends. Sean Ekins (@collabchem) and I will host a discussion on “Leading Chemists Into Openness”. Sean and I hung out at the conference last year and it had a good impact on him as he describes here.
If you are attending ScienceOnline2013, are interested in Open Science and the advantages, challenges and “unknowns” of how to get there, then please come and join the conversation. We are the hosts…you define where we go! The slides below are for you to review/consider/digest in advance of the session. See you there???!!!
There are a number of people in my domain that I have great appreciation for and that I enjoy working with. So, an opportunity to co-author on rules for licensing data with Sean Ekins and John Wilbanks was an opportunity too good to miss. There are a lot of opinions, rants and views on data licensing floating around the internet, discussed at conferences and over beverages. We have opinions too and have shared them in this perspective in PLoS Computational Biology: “Why Open Drug Discovery Needs Four Simple Rules for Licensing Data and Models”.
When writing a publication, how many of us conduct complete literature searches? For those of us who do not have access to Scifinder, how are we finding our literature? Probably through Google Scholar? When I write a paper I admit that some of my searches may be less than complete but I do try to stay informed with regard to what is going on in my domain. VERY occasionally I get feedback from reviewers pointing me to references that they feel I either ignored or was unaware of. Many times they are co-authored by the reviewer themselves…and it is pretty easy to figure out who the reviewers are.
Today I received an email in my inbox about the latest article in the Journal of Cheminformatics. It is OMG: Open Molecule Generator. The article is here. The abstract opens with “Computer Assisted Structure Elucidation has been used for decades to discover the chemical structure of unknown compounds. In this work we introduce the first open source structure generator, Open Molecule Generator (OMG), which for a given elemental composition produces all non-isomorphic chemical structures that match that elemental composition.”
Having been involved with Computer-Assisted Structure Elucidation for many years, having co-authored a book about it (here) and probably the definitive review article from the past 5 years (here), I would have assumed that our work would have been referenced. I was surprised to see that our work was not referenced while other CASE systems were. Articles we’ve issued over the past few years are below. I’ve gathered them here to point the authors to in case they want to reference any of them and missed them in the literature search.
I am taking advantage of the fact that I can leave comments on the provisional manuscript here (what a great capability!!!) and will let them know about this list. It would be good to compare the performance of the OMG with the structure generator under ACD/Structure Elucidator sometime….
1) M.E. Elyashberg, K.A. Blinov and A.J. Williams, Computer-aided Molecular Structure Elucidation on the Basis of 1D and 2D NMR Spectra, Applied Magnetic Resonance, (May 2000)
2) K.A. Blinov, M.E. Elyashberg, S.G. Molodtsov, A.J. Williams and E.R. Martirosian, An Expert System for Automated Structure Elucidation Utilizing 1H-1H, 13C-1H, and 15N-1H 2D NMR correlations, Fresenius J. Anal. Chem., 369, 709 (2001)
3) G.E. Martin, C.E. Hadden, D.J. Russell, B.D. Kaluzny, J.E. Guido, W.K. Duholke, B.A. Stiemsma, T.J. Thamann, R.C. Crouch, K.A. Blinov, M.E. Elyashberg, E.R. Martirosian, S.G. Molodtsov, A.J. Williams, P.L. Schiff, Jr., Identification of Degradants of a Complex Alkaloid Using NMR Cryoprobe Technology and ACD/Structure Elucidator, J. Heterocyclic Chem. 39, 1241 (2002)
4) M.E. Elyashberg, K.A. Blinov, A.J. Williams, E.R. Martirosian, S.G. Molodtsov, Application of a New Expert System for the Structure Elucidation of Natural Products from the 1D and 2D NMR Data, J. Nat. Prod., 65, 693 (2002)
5) G. E. Martin, C. E. Hadden, D. J. Russell, B. D. Kaluzny, J. E. Guido, W. K. Duholke, B. A. Stiemsma, T. J. Thamann, R. C. Crouch, K. A. Blinov, M. E. Elyashberg, E. R. Martirosian, S. G. Molodtsov, A. J. Williams, and P. L. Schiff, Jr., Identification of Degradants of a Complex Alkaloid Using NMR Cryoprobe Technology and ACD/Structure Elucidator, J. Heterocyclic Chem., 39, 1241-1250 (2002).
6) K. A. Blinov, D. Carlson, M. E. Elyashberg, G. E. Martin, E. R. Martirosian, S. Molodtsov, and A. J. Williams, Computer-Assisted Structure Elucidation of Natural Products with Limited 2D NMR Data: Applications of the StrucEluc System, Magn. Reson. Chem., 41, 359-372 (2003).
7) G. E. Martin, D. J. Russell, K. A. Blinov, M. E. Elyashberg and A. J. Williams, Applications and Advances in Cryogenic NMR Probes & Computer-Assisted Structure Elucidation. Ann. Magn. Reson., 2, 1-31 (2003)
8) K. Blinov, M. Elyashberg, E. R. Martirosian, S. G. Molodtsov, A. J. Williams, M. H. M. Sharaf, P. L. Schiff, Jr., R. C. Crouch, G. E. Martin, C. E. Hadden, and J. E. Guido, “Quindolinocryptotackieine: The Elucidation of a Novel Indoloquinoline Alkaloid Structure through the Use of Computer-Assisted Structure Elucidation and 2D-NMR,” Magn. Reson. Chem., 41, 577-584 (2003).
9) M. E. Elyashberg, K. A. Blinov, E. R. Martirosian, S. G. Molodtsov, A. J. Williams, and G. E. Martin, Automated Structure Elucidation – The Benefits of a Symbiotic Relationship between the Spectroscopist and the Expert System, J. Heterocyclic Chem., 40, 1017-1029 (2003).
10) M. E. Elyashberg, K. A. Blinov, A. J. Williams, S. G. Molodtsov, G. E. Martin, and E. R. Martirosian, Structure Elucidator: A Versatile Expert System for Molecular Structure Elucidation from 1D and 2D NMR Data and Molecular Fragments, J. Chem. Inf. Comput. Sci. 44, 771-792 (2004).
11) S. G. Molodtsov, M. E. Elyashberg, K. A. Blinov, A. J. Williams, E. R. Martirosian, G. E. Martin, and B. Lefebvre. Structure Elucidation from 2D NMR Spectra Using the StrucEluc Expert System: Detection and Removal of Contradictions in the Data. J. Chem. Inf. Comput. Sci., 44, 1737-1751 (2004)
12) G. J. Sharman, I. C. Jones, M. P. Parnell, M. C. Willis, M. F. Mahon, D. V. Carlson, A. J. Williams, M. E. Elyashberg, K. A. Blinov, S. G. Molodtsov. Automated structure elucidation of two products in a reaction of an α,β-unsaturated pyruvate. Magn. Reson. Chem. 42, 567 (2004)
13) Y. D. Smurnyy, M. E. Elyashberg, K. A. Blinov, B. A. Lefebvre, G. E. Martin, and A. J. Williams, Computer-Aided Determination of Relative Stereochemistry and 3D Models of Complex Organic Molecules from 2D NMR Spectra, Tetrahedron, 61, 9980-9989 (2005).
14) M. E. Elyashberg, K. A. Blinov, A. J. Williams, S. G. Molodtsov, and G. E. Martin, Are Deterministic Expert Systems for Computer-Assisted Structure Elucidation Obsolete? J. Chem. Inf. Model. 46, 1643-1656 (2006).
15) M. E. Elyashberg, K. A. Blinov, S. G. Molodtsov, A. J. Williams, and G. E. Martin, Fuzzy Structure Generation: An Efficient New Tool for Computer-Aided Structure Elucidation (CASE), J. Chem. Inf. Model., 47, 1053-1066 (2007). 10.1021/ci600528g
16) M. E. Elyashberg, A. J. Williams, and G. E. Martin. Computer-Assisted Structure Verification and Elucidation Tools In NMR-Based Structure Elucidation. Review article. Progress in NMR Spectroscopy (2007) 10.1016/j.pnmrs.2007.04.003
17) Y. D. Smurnyy, K. A. Blinov, T. S. Churanova, M. E. Elyashberg, and A. J. Williams. Toward More Reliable 13C and 1H Chemical Shift Prediction: A Systematic Comparison of Neural-Network and Least-Squares Regression Based Approaches, J. Chem. Inf. Model. 48, 128-134, (2008)
18) M. E. Elyashberg, A. J. Williams, D. C. Lankin, G. E. Martin, J. Porco, W. F. Reynolds, and C. Singleton, Applying Computer-Assisted Structure Elucidation Algorithms for the Purpose of Structure Validation – Revising the NMR Assignments of Hexacyclinol, J. Nat. Prod., 71, 581-588 (2008).
19) M.E. Elyashberg, K.A. Blinov and A.J. Williams, A Systematic Approach for the Generation and Verification of Structural Hypotheses. Magn. Reson. Chem. 47, 371-389, (2009)
20) M. E. Elyashberg, A. J. Williams, and K.A. Blinov, The Application of Empirical Methods of 13C NMR Chemical Shift Prediction as a Filter for Determining Possible Relative Stereochemistry. Magn. Reson. Chem. 47, 333-341 (2009)
21) Y. D. Smurnyy, K. A. Blinov, T. S. Churanova, M. E. Elyashberg, and A. J. Williams. Development of a fast and accurate method of 13C NMR chemical shift prediction. Chemometrics and Intelligent Laboratory Systems, 97(1), 91-97, (2009)
22) M. E. Elyashberg, A. J. Williams and K. A. Blinov, Structural revisions of natural products by Computer Assisted Structure Elucidation (CASE) Systems, Nat. Prod. Rep., 2010, DOI: 10.1039/c002332a
23) Blind trials of computer-assisted structure elucidation software, Journal of cheminformatics 4 (1), 5, A Moser, ME Elyashberg, AJ Williams, KA Blinov, JC DiMartino
24) Elucidating ‘undecipherable’ chemical structures using computer‐assisted structure elucidation approaches, Mikhail Elyashberg, Kirill Blinov, Sergey Molodtsov, Antony Williams, Magnetic Resonance in Chemistry, 50(1), 22–27, 2012 DOI: 10.1002/mrc.2849
BOOK: Contemporary Computer Assisted Approaches to Molecular Structure Elucidation by Kirill Blinov, Mikhail Elyashberg and Antony J. Williams, Royal Society of Chemistry
Second talk delivered today at ACS Philadelphia…
Mining public domain data as a basis for drug repurposing
Online databases containing high throughput screening and other property data continue to proliferate in number. Many pharmaceutical chemists will have used databases such as PubChem, ChemSpider, DrugBank, BindingDB and many others. This work will report on the potential value of these databases for providing data to be used to repurpose drugs using cheminformatics-based approaches (e.g. docking, ligand-based machine learning methods). This work will also discuss the potentially related applications of the Open PHACTS project, a European Union Innovative Medicines Initiative project, that is utilizing semantic web based approaches to integrate large scale chemical and biological data in new ways. We will report on how compound and data quality should be taken into account when utilizing data from online databases and how their careful curation can provide high quality data that can be used to underpin the delivery of molecular models that can in turn identify new uses for old drugs.
This is one of my presentations at the ACS meeting today in San Diego regarding how to use social networking tools to expose yourself as a scientist
Social networking tools as public representations of a scientist
The web has revolutionized the manner by which we can represent ourselves online by providing us the ability to expose our data, experiences and skills online via blogs, wikis and other crowdsourcing venues. As a result it is possible to contribute to the community while developing a social profile as a scientist. At present many scientists are still measured by their contributions using the classical method of citation statistics and a number of freely available online tools are now available for scientists to manage their profile. This presentation will provide an overview of tools including Google Scholar Citations and Microsoft Academic Search and will discuss how these and other tools, when integrated with the ORCID identifier, may more fully recognize the collective contributions to science. I will also discuss how an increasingly public view of us as scientists online will likely contribute to our reputation above and beyond citations.
This is my presentation at the InChI Symposium today:
The great promise of navigating the internet using InChIs
The InChI, the International Chemical Identifier, has been the basis of both indexing and deduplication of the ChemSpider database since the inception of the platform. When the InChI was adopted we envisaged a future whereby the identifier would proliferate across journals, databases and the internet in general providing us a basis for “structure searching the internet”. This presentation will provide an overview of how the InChI has facilitated the integration of ChemSpider to chemistry on the internet, some of the surprising findings that have resulted from this work and extrapolate the influence of InChIs into the future for a chemically enabled web.
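To make the indexing and deduplication idea concrete, here is a minimal sketch in Python. It is not how ChemSpider is implemented; the record list and the function name `index_by_inchi` are made up for illustration. It only shows the principle: if every deposited record carries a standard InChI, then records that share the same InChI string describe the same structure and can be collapsed into one entry.

```python
from collections import defaultdict

# Hypothetical deposited records: (name, standard InChI) pairs.
# The names and the list itself are illustrative assumptions.
records = [
    ("ethanol", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("ethyl alcohol", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("methane", "InChI=1S/CH4/h1H4"),
    ("water", "InChI=1S/H2O/h1H2"),
]


def index_by_inchi(records):
    """Group record names under their InChI string.

    Records with an identical InChI end up under one key, which is
    the deduplication step; the resulting dict is the index.
    """
    index = defaultdict(list)
    for name, inchi in records:
        index[inchi].append(name)
    return dict(index)


index = index_by_inchi(records)

for inchi, names in index.items():
    print(inchi, "->", ", ".join(names))
```

Here "ethanol" and "ethyl alcohol" land under a single key, so four deposited records become three unique structures. In practice the InChI would be generated from the structure by cheminformatics software rather than typed in by hand.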
This presentation was just given at the ACS meeting in San Diego…
The Royal Society of Chemistry hosts an online resource, ChemSpider, as a structure centric database for chemists linking over 25 million chemicals to 400 internet sites. As a crowdsourced environment members of the chemistry community can deposit spectral data to the database. Almost 2000 NMR spectra have been submitted to the database and these are the basis of both a gaming environment for learning NMR spectroscopy, the SpectralGame, as well as a new teaching environment known as SpectraSchool. This presentation will provide an overview of these two online resources and how they may be utilized for the purpose of teaching NMR spectroscopy in an Undergraduate Curriculum.
There are many social networking tools for scientists that can be used to share information, engage the social network and move information about activities across the web. This presentation provides an overview of some of the tools available and how they can be used by scientists to expose their activities, manage their profile publicly and participate in the network.
A few weeks ago I was invited to give a presentation to the Board of Directors at Burroughs Wellcome. I was very interested in taking this opportunity to discuss my views on Open Science, Open Notebook Science, Open Data etc. with this group of very esteemed scientists. However, it turned out it clashed with a planned vacation. Since my friend and frequent co-author Sean Ekins is an evangelist for open science for drug discovery, improving data quality, and Mobile Apps, and since we think alike on so many levels, I asked Sean whether he’d want to give the presentation. And, always welcoming adventure, Sean jumped at the chance to present.
As it turned out Hurricane Rina resulted in us cancelling our vacation so I ended up attending the presentation with Sean. While we had bounced the slides between each other prior to the presentation Sean did a terrific job as the presenter and we had some very interesting questions regarding what is standing in the way of open science, especially around chemistry databases (of compounds), what are good examples of bioinformatics projects that are successful, and whether there are “risks” inherent to Open Science, especially in regards to what is shared online in public compound databases. I thoroughly enjoyed the meeting, short as it was and am glad that we were given the opportunity.
Sean has eloquently outlined the nature of the presentation at his site (he is Collabchem) and the presentation is below for your comments and review. I recommend that you check out Sean’s other presentations too!
I had the pleasure of co-presenting with my friend Jean-Claude Bradley today at the “3rd Annual Drug Discovery Partnership: Filling the Pipeline”. Jean-Claude gave a great talk, available on Slideshare here, and discussed the issue of data quality, how improved data gives improved models, the cross-validation of data and proliferation of errors. My talk is on Slideshare here and embedded below. In many ways I discussed similar issues, though not focused on melting point data but rather on structures, structure-identifier relationships, the cross-linking of multiple resources on the internet and how online resources can support Open Drug Discovery Systems. In this presentation I discussed some of the work we are doing on Open PHACTS.