The Application of Text and Data Mining to Enhance the Royal Society of Chemistry Publication Archive
I just found the video of my presentation given at the 2014 Emerging Trends in Scholarly Publishing™ Seminar
The Application of Text and Data Mining to Enhance the Royal Society of Chemistry Publication Archive
The Royal Society of Chemistry (RSC) is one of the world’s most prominent scientific societies and STM publishers. Our contributions to the scientific community include a myriad of resources that help the chemistry community access chemistry-related data, information and knowledge. These include ChemSpider, a compound-centric platform linking together over 30 million chemical compounds with internet-based resources. Using this compound database and its associated chemical identifiers as a basis, the RSC is using text and data mining approaches to data-enable our published archive of scientific publications. This presentation will provide an overview of our technical approaches to text- and data-enabling our archive of scientific articles, how we are developing an integrated database of chemical compounds, reactions, physical and analytical data, and how it will be used to facilitate scientific discovery.
Both the SlideShare presentation and the video of my presentation are posted below:
A presentation that I am giving around UK universities in September/October 2014
A chemistry data repository to serve them all
Over the past five years the Royal Society of Chemistry has become world renowned for its public domain compound database that integrates chemical structures with online resources and available data. ChemSpider regularly serves over 50,000 users per day who are seeking chemistry related data. In parallel we have used ChemSpider and available software services to underpin a number of grant-based projects that we have been involved with: Open PHACTS – a semantic web project integrating chemistry and biology data, PharmaSea – seeking out new natural products from the ocean and the National Chemical Database Service for the United Kingdom. We are presently developing a new architecture that will offer broader scope in terms of the types of chemistry data that can be hosted. This presentation will provide an overview of our Cheminformatics activities at RSC, the development of a new architecture for a data repository that will underpin a global chemistry network, and the challenges ahead, as well as our activities in releasing software and data to the chemistry community.
Beyond the paper CV and developing a scientific profile through social media, AltMetrics and micropublications
This is a presentation that I gave during a UK tour in Sept/Oct 2014 at a number of UK universities
Beyond the paper CV and developing a scientific profile through social media, AltMetrics and micropublications
Many of us nowadays invest significant amounts of time in sharing our activities and opinions with friends and family via social networking tools. However, although many platforms are available for scientists to connect and share with their peers in the scientific community, the majority do not make use of these tools, despite their promise and potential impact and influence on our future careers. We are being indexed and exposed on the internet via our publications, presentations and data. We also have many more ways to contribute to science, to annotate and curate data, to “publish” in new ways, and many of these activities are part of a growing crowdsourcing network. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose your scientific activities online. Many of these can ultimately contribute to the developing measures of you as a scientist as identified in the new world of alternative metrics. Participating offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.
On September 8th 2014 a memorial gathering was held at Drexel University to honor the work and life of Jean-Claude Bradley. I could not attend in person but put together a short presentation and video to be played at the gathering. The slides are on SlideShare here and the movie on YouTube here
An invitation to contribute a paper
Jean-Claude Bradley Memorial Issue
Antony J. Williams, Cheminformatics, Royal Society of Chemistry
Andrew Lang, Oral Roberts University
In May of 2014 we lost one of our colleagues, Jean-Claude Bradley (JC), way too soon.
JC was, in many ways, a man ahead of his time. He foresaw the future of science likely a decade ahead of the new shift that is occurring in academia, that of Open Notebook Science. The last decade has seen a dramatic shift toward openness in science that has encompassed Open Access Publishing, Open Source in software development, Open Data in the majority of branches of science and Open Standards primarily as a result of people like JC. As a result of these shifts the amount of data now available online for scientists to consume and interrogate is enormous and grows daily. Much of this data is however already “aged” having been extracted from published articles or assembled into databases from historical data that often lacks provenance.
Jean-Claude Bradley’s drive was towards something more immediate with his concept of Open Notebook Science, the practice of making the entire primary record of research activities publicly available online as it is recorded (http://en.wikipedia.org/wiki/Open_notebook_science). Through his leadership in this area he motivated, cajoled and guided a number of scientists who operated in a more generally closed manner of science into the domain of Open Science. He mentored young students into the new world and encouraged us all to consider the benefits that could result in being more open.
Jean-Claude was also a master collaborator and networker, bringing together scientists from various domains to work together. But in his own work he also stimulated participation and contributions from instrument manufacturers, chemical vendors, journal publishers and software developers. Most of you reading this will almost certainly have heard of, worked with or benefited from some of his activities.
We, Andrew Lang and Antony Williams, intend to celebrate the work and vision of JC and are presently editing a memorial issue that crosses both the domains of chemistry and cheminformatics that he operated in. Since he was a member of the editorial advisory board for Journal of Cheminformatics and Co-Editor-in-Chief of Chemistry Central Journal our intention is to encourage participation and submission of papers from areas of chemistry and cheminformatics that will be assembled into a single memorial issue. If you are receiving this communication then please accept it as an invitation to submit an article to the most appropriate of the two journals that you choose.
Timelines and how to submit your paper
In recognition of the contributions that Jean-Claude Bradley has made to science and Open Science in particular, we hope that you will consider our invitation and contribute a paper to help us in celebrating and evangelizing his work. Please do not hesitate to contact either of us with questions, to confirm participation and for instructions on how to submit your paper at Andrew Lang (firstname.lastname@example.org) or Antony Williams (email@example.com).
If you wish to contribute to this thematic issue please use the online submission system for the appropriate journal, found here:
Chemistry Central Journal: http://journal.chemistrycentral.com/manuscript
Journal of Cheminformatics: http://www.jcheminf.com/manuscript
Please ensure that you state in your cover letter that your paper is an invited paper for ‘Insert relevant Journal name’ as part of the cross journal thematic issue entitled ‘Jean-Claude Bradley Memorial Issue’. The deadline for submitting your paper is 1 December 2014, to publish the thematic issue in early 2015.
About Chemistry Central
Chemistry Central Journal and Journal of Cheminformatics are open access journals published by Chemistry Central. The benefits of open access are particularly attractive in these fields, ensuring that scientists working throughout the community on different aspects all have shared access to the latest research.
Chemistry Central (www.chemistrycentral.com) is part of the Springer chemistry publishing unit, having been set up in 2006 as a service dedicated to the open access publishing of chemistry research.
Today I received notification that an app to accompany a forthcoming RSC book, “The Handbook of Medicinal Chemistry: Principles and Practice”, went live on iTunes.
“The Medicinal Chemistry Toolkit app is a suite of resources to support the day to day work of a medicinal chemist. Based on the experiences of medicinal chemistry experts, we developed otherwise difficult-to-access tools in a portable format for use in meetings, on the move and in the lab. The app is optimised for iPad and contains calculator functions designed to ease the process of calculating values of: Cheng-Prusoff; Dose to man; Gibbs free energy to binding constant; Maximum absorbable dose calculator; Potency shift due to plasma protein binding.”
If you have an iPad then you can download the app from here.
The book itself will be published in November 2014 and will provide a comprehensive, everyday resource for a practicing medicinal chemist throughout the drug development process.
The app will be updated on an ongoing basis with new algorithms and calculators so make sure you check back or update when it tells you.
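For readers without an iPad, two of the calculators named above can be sketched in a few lines of Python. This is only an illustration using the standard textbook forms of the Cheng-Prusoff equation (competitive inhibition) and the relation between Gibbs free energy and the dissociation constant; the function names are my own and nothing here is taken from the app itself, whose implementation may differ.

```python
import math

# Gas constant in kcal/(mol*K); the choice of kcal/mol is an assumption.
R_KCAL = 1.987204e-3


def cheng_prusoff_ki(ic50, substrate_conc, km):
    """Ki for a competitive inhibitor: Ki = IC50 / (1 + [S]/Km).

    All three arguments must be in the same concentration unit;
    the result is returned in that unit.
    """
    if km <= 0:
        raise ValueError("Km must be positive")
    return ic50 / (1.0 + substrate_conc / km)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Delta G = RT ln(Kd) in kcal/mol, with Kd in mol/L (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_free_energy(delta_g_kcal, temperature_k=298.15):
    """Inverse of binding_free_energy: Kd = exp(Delta G / RT), in mol/L."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))
```

With the substrate concentration equal to Km the Ki is half the IC50, and a 1 nM binder at 25 °C corresponds to roughly -12.3 kcal/mol.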
I recently returned from the ACS meeting in San Francisco; it was a busy and very successful conference. We presented to a number of different divisions on a lot of our activities and many of our collaborators presented also. The list of talks is below and as more links become available I will update this page. What I learned is that we need to present in MANY divisions other than CINF…the attendees of the CHED and ANLY divisions for sure were interested in what we have to say. We will do more of this…
Applying Royal Society of Chemistry cheminformatics skills to support the PharmaSEA project, A.J. Williams, A. Pshenichnov, V. Tkachenko, K. Karapetyan and D. Sharpe, ACS Fall Meeting, San Francisco, August 2014 Link
How the InChI identifier is used to underpin our online chemistry databases at the Royal Society of Chemistry, A.J. Williams, V. Tkachenko and K. Karapetyan, ACS Fall Meeting, San Francisco, August 2014 (Invited Talk) Link
Dealing with the complex challenge of managing diverse chemistry data online, A.J. Williams, A. Pshenichnov, V. Tkachenko and K. Karapetyan, ACS Fall Meeting, San Francisco, August 2014 Link
Encouraging undergraduate students to participate as authors of scientific publications, A.J. Williams, ACS Fall Meeting, San Francisco, August 2014 Link
Who knew I would get here from there: How I became the ChemConnector, A.J. Williams, ACS Fall Meeting, San Francisco, August 2014 (Invited Talk) Link
Open innovation and chemistry data management contributions from the Royal Society of Chemistry resulting from the Open PHACTS project, A.J. Williams, A. Pshenichnov, J. Steele, C. Batchelor, V. Tkachenko and K. Karapetyan, ACS Fall Meeting, San Francisco, August 2014 Link
Using an online database of chemical compounds for the purpose of structure identification, A.J. Williams, A. Pshenichnov and V. Tkachenko, ACS Fall Meeting, San Francisco, August 2014
The Royal Society of Chemistry and its adoption of semantic web technologies for chemistry at the epoch of a federated world, A.J. Williams and V. Tkachenko, ACS Fall Meeting, San Francisco, August 2014 Link
Accessing 3D printable chemical structures online. V. F. Scalfani, A. J. Williams, R. M. Hanson, J. E. Bara, A. Day, V. Tkachenko, ACS Fall Meeting, San Francisco, August 2014 Link
Using the BRAIN, biorelations and intelligence network, for knowledge discovery. A. Mons, B. Mons, A. Krol, A. Baak, A.J. Williams, V. Tkachenko, ACS Fall Meeting, San Francisco, August 2014
Navigating chemistry requirements for data management and electronic notebooks: A case study. L. R. McEwen, A. J. Williams, V. Tkachenko, J. G. Frey, S. J. Coles, A. E. Day, C. Willoughby, W. R. Dichtel, ACS Fall Meeting, San Francisco, August 2014
The Chemical Analysis Metadata Platform (ChAMP): Thoughts and Ideas on the Semantic Identification of Analytical Metrics, S. Chalk, A.J. Williams, V. Tkachenko, ACS Fall Meeting, San Francisco, August 2014 Link
Integrating Jmol/JSpecView into the Eureka Research Workbench. S. Chalk, M. Morse, I. Hurst, A.J. Williams, V. Tkachenko, A. Pshenichnov, R. Hanson, ACS Fall Meeting, San Francisco, August 2014
Clustering the Royal Society of Chemistry chemical repository to enable enhanced navigation across millions of chemicals. K. Karapetyan, V. Tkachenko, A. J. Williams, O. Kohlbacher, P. Thiel, ACS Fall Meeting, San Francisco, August 2014 Link
Experiences and adventures with noSQL and its applications to cheminformatics data. V. Tkachenko, A.J. Williams, K. Karapetyan, A. Pshenichnov, M. Rybalkin, ACS Fall Meeting, San Francisco, August 2014 Link
Faculty profiling and searching in the Eureka Research Workbench using VIVO and ScientistsDB. S. Chalk, M. Morse, I. Hurst, A.J. Williams, V. Tkachenko, A. Pshenichnov, ACS Fall Meeting, San Francisco, August 2014
Supporting the exploding dimensions of the chemical sciences via global networking. V. Tkachenko, A.J. Williams, S. Vatsadze, ACS Fall Meeting, San Francisco, August 2014 Link
Toward extracting analytical science metrics from the RSC archives. S. Chalk, A.J. Williams, V. Tkachenko, C. Batchelor, ACS Fall Meeting, San Francisco, August 2014
Dereplication applications for computer-assisted structure elucidation (CASE) and the ChemSpider database. P. Wheeler, A. Moser, J. DiMartio, M. Elyashberg, K. Blinov, S. Molodstov, A.J. Williams, ACS Fall Meeting, San Francisco, August 2014 (Invited talk)
Real structures for real natural products − really getting them right and getting them faster. P. Wheeler, A.J. Williams, M. Elyashberg, R. Pol, A. Moser, ACS Fall Meeting, San Francisco, August 2014
The increasing importance of chemical information literacy in the life of graduate students: Contributions from the ACS Division of Chemical Information (CINF), G. Baysinger, J. Currano, J. Garritano, L. R. McEwen, A. J. Williams
Using an online database of chemical compounds for the purpose of structure identification #ACSsanfran
Using an online database of chemical compounds for the purpose of structure identification
Online databases can be used for the purposes of structure identification. The Royal Society of Chemistry provides access to an online database containing tens of millions of compounds and this has been shown to be a very effective platform for the development of tools for structure identification. Since in many cases a compound that is unknown to an investigator is already known in the chemical literature or a reference database, these “known unknowns” are now commonly available on aggregated internet resources. These types of compounds in commercial, environmental, forensic, and natural product samples can be identified by searching against these large aggregated databases, querying by either elemental composition or monoisotopic mass. Searching by elemental composition is the preferred approach, but it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results can be refined by appropriate filtering to identify the compounds. We will report on integrated filtering and search approaches on our aggregated compound database for the purpose of structure identification and review our progress in using the platform for natural product dereplication purposes.
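To make the two query modes in the abstract concrete, here is a minimal Python sketch of searching a compound list by elemental composition or by monoisotopic mass within a ppm tolerance. It is only an illustration: the four-compound `TOY_DATABASE`, the function names and the 5 ppm default are my own assumptions, the formula parser handles only simple formulae without brackets or charges, and none of this reflects how the RSC database is actually queried.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of a few elements,
# rounded to six decimal places.
MONOISOTOPIC = {
    "C": 12.000000, "H": 1.007825, "N": 14.003074,
    "O": 15.994915, "S": 31.972071, "P": 30.973762,
}

# Hypothetical stand-in for an aggregated compound database.
TOY_DATABASE = {
    "caffeine": "C8H10N4O2",
    "theophylline": "C7H8N4O2",
    "theobromine": "C7H8N4O2",
    "glucose": "C6H12O6",
}


def parse_formula(formula):
    """Parse a simple formula such as 'C8H10N4O2' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def search_by_formula(database, formula):
    """Names of compounds whose elemental composition matches exactly."""
    target = parse_formula(formula)
    return sorted(n for n, f in database.items() if parse_formula(f) == target)


def search_by_mass(database, observed_mass, ppm=5.0):
    """Names of compounds whose monoisotopic mass lies within the ppm window."""
    hits = []
    for name, formula in database.items():
        error_ppm = abs(monoisotopic_mass(formula) - observed_mass) / observed_mass * 1e6
        if error_ppm <= ppm:
            hits.append(name)
    return sorted(hits)
```

Querying this toy list with a neutral mass of 180.0647 at 5 ppm excludes glucose but still returns both theophylline and theobromine, which share a formula. That is the situation where the additional filtering mentioned above is needed to pick out the right compound.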
Open innovation and chemistry data management contributions from the Royal Society of Chemistry resulting from the Open PHACTS project at #ACSsanfran
This is my presentation on Thursday 14th August at the ACS Meeting in San Francisco
Open innovation and chemistry data management contributions from the Royal Society of Chemistry resulting from the Open PHACTS project
The Royal Society of Chemistry was pleased to contribute to the Open PHACTS project, a 3 year project funded by the Innovative Medicines Initiative fund from the European Union. For three years we developed our existing platforms, created new and innovative widgets and data platforms to handle chemistry data, extended existing chemistry ontologies and embraced the semantic web open standards. As a result RSC served as the centralized chemistry data hub for the project. With the conclusion of the Open PHACTS project we will report on our experiences resulting from our participation in the project and provide an overview of what tools, capabilities and data have been released into the community as a result of our participation and how this may influence future projects. This will include the Open PHACTS open chemistry data dump including the chemistry related data in chemistry and semantic web consumable formats as well as some of the resulting chemistry software released to the community. The Open PHACTS project resulted in significant contributions to the chemistry community as well as the supporting pharmaceutical companies and biomedical community.
This presentation was given by Vincent Scalfani and covers the work we have done to provide access to 3D printable chemical structures online…
Accessing 3D Printable Chemical Structures Online
We have been exploring routes to create 3D printable chemical structure files (.WRL and .STL). These digital 3D files can be generated directly from crystallographic information files (.CIF) using a variety of software packages such as Jmol. After proper conversion to the .STL (or .WRL) file format, the chemical structures can be fabricated into tangible plastic models using 3D printers. This technique can theoretically be used for any molecular or solid structure. Researchers and educators are no longer limited to building models via traditional piecewise plastic model kits. As such, 3D printed molecular models have tremendous value for teaching and research. As the number of available 3D printable structures continues to grow, there is a need for a robust chemical database to store these files. This presentation will discuss our efforts to incorporate 3D printable chemical structures within the Royal Society of Chemistry’s online compound database.