My second talk of three on August 11th 2014 at the ACS Meeting in San Francisco.
Encouraging students to start publishing early in their career
Many students spend enormous amounts of their time engaged with their computers, accepting of course that mobile devices are simply computers of a different form factor. Engaged with social networks and utilizing computer platforms to source and share content of various forms, they make enormous contributions of “data” into what is the cloud and, in many cases, a void. What community and career benefit might result from those students spending some of their time contributing chemistry-related data to the world? What challenges lie in the way of their participation, and how might participating have a positive or negative impact on their future career? The Royal Society of Chemistry hosts a number of chemistry data platforms to which students can actively contribute and for which their participation can be measured. Moreover, the RSC’s micropublishing platform allows chemists to learn how to write up their scientific work, obtain review from their peers and chemistry professors in a non-threatening environment and produce an online published work in less than a day that is both citable and available as a shared resource for the community. This presentation will demonstrate how to participate and encourage engagement from students early in their education. There are no longer any technology barriers to the sharing of the majority of chemistry-related data.
How the InChI identifier is used to underpin our online chemistry databases at the Royal Society of Chemistry #ACSsanfran
This is my presentation at the ACS San Francisco Fall Meeting on August 10th 2014
How the InChI identifier is used to underpin our online chemistry databases at the Royal Society of Chemistry
The Royal Society of Chemistry hosts a growing collection of online chemistry content. For much of our work the InChI identifier is an important component underpinning our projects. This enables the integration of chemical compounds with our archive of scientific publications, the delivery of a reaction database containing millions of reactions as well as a chemical validation and standardization platform developed to help improve the quality of structural representations on the internet. The InChI has been a fundamental part of each of our projects and has been pivotal in our support of international projects such as the Open PHACTS semantic web project integrating chemistry and biology data and the PharmaSea project focused on identifying novel chemical components from the ocean with the intention of identifying new antibiotics. This presentation will provide an overview of the importance of InChI in the development of many of our eScience platforms and how we have used it to provide integration across hundreds of websites and chemistry databases across the web. We will discuss how we are now expanding our efforts to develop a platform encompassing efforts in Open Source Drug Discovery and the support of data management for neglected diseases.
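As a rough illustration of the integration idea in this abstract, the sketch below uses the InChIKey as a shared join key across separate collections. It is not the RSC implementation: the three sources, the field names and the local IDs are hypothetical, and only the two InChIKeys (aspirin and caffeine) are real.

```python
from collections import defaultdict

# Hypothetical records from three made-up sources. Only the InChIKeys are real
# (aspirin and caffeine); source names, field names and local IDs are assumptions.
compound_db = [
    {"source": "compound_db", "local_id": "C-001", "name": "aspirin",
     "inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
    {"source": "compound_db", "local_id": "C-002", "name": "caffeine",
     "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N"},
]
article_index = [
    {"source": "article_index", "local_id": "A-17", "name": "acetylsalicylic acid",
     "inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
]
reaction_db = [
    {"source": "reaction_db", "local_id": "R-9", "name": "2-acetoxybenzoic acid",
     "inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
    {"source": "reaction_db", "local_id": "R-12", "name": "1,3,7-trimethylxanthine",
     "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N"},
]


def link_by_inchikey(*collections):
    """Group records from any number of collections by their InChIKey."""
    linked = defaultdict(list)
    for collection in collections:
        for record in collection:
            linked[record["inchikey"]].append((record["source"], record["local_id"]))
    return dict(linked)


links = link_by_inchikey(compound_db, article_index, reaction_db)
for key, hits in links.items():
    print(key, hits)
```

The three sources name the same compound differently (aspirin, acetylsalicylic acid, 2-acetoxybenzoic acid), so matching on names would miss the links, while matching on the structure-derived key finds them.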
This is the first presentation I gave at the ACS meeting in San Francisco on Sunday morning (August 10th) in the CINF Natural Products session.
Applying Royal Society of Chemistry cheminformatics skills to support the PharmaSea project
The collaborative project PharmaSea brings European researchers to some of the deepest, coldest and hottest places on the planet. Scientists from the UK, Belgium, Norway, Spain, Ireland, Germany, Italy, Switzerland and Denmark are working together to collect and screen samples of mud and sediment from huge, previously untapped, oceanic trenches. The large-scale, four-year project is backed by almost 10 million euros of funding and brings together 24 partners from 13 countries from industry, academia and non-profit organisations. The PharmaSea project focuses on biodiscovery research and the development and commercialisation of new bioactive compounds from marine organisms, including deep-sea sponges and bacteria, to evaluate their potential as novel drug leads or ingredients for nutrition or cosmetic applications. The Royal Society of Chemistry is responsible for developing a number of capabilities to support the PharmaSea project, including a chemical registration system for new compounds, dereplication technologies to assist in the identification of new compounds and search techniques for mass spectrometrists within the project. This presentation will provide an overview of the project and our progress in contributing chemical information technologies to support the effort.
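For readers unfamiliar with dereplication, the sketch below shows one common form of it: checking whether a measured accurate mass matches a compound that is already known. It is only an illustration under stated assumptions, not the PharmaSea tooling. The function names, the two-entry reference list and the 5 ppm tolerance are hypothetical, and the observed value is treated as a neutral monoisotopic mass with no adduct correction.

```python
# Monoisotopic masses (in u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Hypothetical reference list of already-known compounds (elemental compositions).
KNOWN = {
    "caffeine": {"C": 8, "H": 10, "N": 4, "O": 2},
    "aspirin": {"C": 9, "H": 8, "O": 4},
}


def monoisotopic_mass(composition):
    """Sum element masses weighted by atom counts."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def dereplicate(observed_mass, known, tol_ppm=5.0):
    """Return names of known compounds within tol_ppm of the observed neutral mass."""
    hits = []
    for name, composition in known.items():
        mass = monoisotopic_mass(composition)
        error_ppm = abs(observed_mass - mass) / mass * 1e6
        if error_ppm <= tol_ppm:
            hits.append(name)
    return hits


print(dereplicate(194.0804, KNOWN))  # matches a known compound
print(dereplicate(250.0, KNOWN))     # no match: a candidate for further study
```

A mass match alone does not identify a compound, since isomers share the same mass, so in practice a hit like this would only narrow the list of candidates.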
This is a presentation I gave at the National Institute of Standards and Technology on July 30th 2014
Experiences in Hosting Big Chemistry Data Collections for the Community
Access to scientific information has changed dramatically as a result of the web and its underpinning technologies. The quantities of data, the array of tools available to search and analyze, the devices and the shift in community participation continue to expand while the pace of change does not appear to be slowing. RSC hosts a number of chemistry data resources for the community including ChemSpider, one of the community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day. The platform supports crowdsourcing, enabling the community to deposit and curate data. This presentation will provide an overview of the expanding reach of this cheminformatics platform and the nature of the solutions that it helps to enable, including structure validation, text mining and semantic markup. ChemSpider is limited in scope as a chemical compound database and we are presently architecting the RSC Data Repository, a platform that will enable us to extend our reach to include chemical reactions, analytical data, and diverse data depositions from chemists across various domains. We will also discuss the possibilities it offers in terms of supporting data modeling and sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community.
Data Mining Dissertations and Adventures and Experiences in the World of Chemistry
This presentation was given at the CLIR/DLF Postdoctoral Fellowship Summer Seminar at Bryn Mawr College in Pennsylvania on July 29th 2014. The intention was to communicate what we are doing in the fields of text and data mining in the domain of chemistry, specifically around mining the RSC publication archive and chemistry dissertations and theses. How would these experiences map over to the humanities?
I presented at the Food and Drug Administration today regarding some of our efforts to develop a research data repository for the community. The abstract and the presentation from Slideshare are below.
Current Initiatives in Developing Research Data Repositories at the Royal Society of Chemistry
Access to scientific information has changed in a manner that was likely never even imagined by the early pioneers of the internet. The quantities of data, the array of tools available to search and analyze, the devices and the shift in community participation continue to expand while the pace of change does not appear to be slowing. RSC hosts a number of chemistry data resources for the community including ChemSpider, one of the community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day, and it serves as the foundation for many important international projects to integrate chemistry and biology data, facilitate drug discovery efforts and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of this cheminformatics platform and the nature of the solutions that it helps to enable, including structure validation, text mining and semantic markup. ChemSpider is limited in scope as a chemical compound database and we are presently architecting the RSC Data Repository, a platform that will enable us to extend our reach to include chemical reactions, analytical data, and diverse data depositions from chemists across various domains. We will also discuss the possibilities it offers in terms of supporting data modeling and sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community.
We have been working with Vincent Scalfani from the University of Alabama towards supporting a community of 3D printing crystal structure enthusiasts. There is a listserv, [3DP-XTAL], hosted by the University of Alabama, and if you would like to be added to it, simply email Vincent at vfscalfaniATuaDOTedu. They are also in the process of creating a 3D printing crystal structure wiki/blog for the community.
With Vincent as the driver we are creating a public on-line repository for 3D printable structure files (.stl and .wrl). He used Jmol to prepare ~30,000 molecules and solids in .wrl and .stl format and we will be hosting them on part of our data repository. We are very excited about this project and there will be more information at the upcoming 248th American Chemical Society Meeting in San Francisco, CA. See CINF Abstract # 125.
The flier that will be distributed at the IUCr meeting in Montreal in August is available on Slideshare here:
I give a lot of presentations. A lot. Maybe too many. At the impending ACS meeting in San Francisco I am giving nine presentations. When I give a presentation I like to share it afterwards. I need the distribution method to be quick and easy to use, and ideally it should let users of the platform find the talk if they are interested in it. I have used various platforms to disseminate my talks. There are really no usability issues with any of them…the various groups have done a good job building their platforms. I am a user of both Slideshare and Figshare and my accounts are here: Slideshare and Figshare. This week I received my weekly stats email and the numbers are below…>3000 views in one week and a total of 400,000 views of my talks, preprints etc.
Compare this with my Figshare stats of >6600 views ever.
The majority of talks I upload to Slideshare have about 3000 views in 2 months as shown below…some have over 25000 now.
If I compare this with Figshare the most views I have is around 500 but that was over 18 months.
Clearly my presentations on Slideshare get way higher exposure. However, the usual question of quality vs quantity comes to bear. The audience on Figshare, primarily scientists, is likely closer to my intended audience than the one on Slideshare. What I should do, although it takes a few additional minutes per presentation, is post each presentation to Slideshare, to Figshare, to my Academia.edu account, to my ResearchGate account, to Vimeo, to YouTube etc. But I only have so much time and right now my easiest deposition route is Slideshare. In terms of my actual prioritization of places to deposit, based on the number of views and downloads the order is
I have been working with the Kudos platform for a few weeks now…see here. Two weeks ago I chose to run an experiment. Here it is… (you may want to watch the video on the previous post first to understand what enriching an article is and why I feel the platform is of value)
1) I enriched an article that I had authored in 2013. GENERALLY after I enrich an article I tweet it out and then look for the response… you can see some of the results below for the articles I have done…I am starting from most recent and going back to the 80s but with 150 articles to do it’s a long journey…
The important stats to take a look at are Kudos views, clickthroughs and Share referrals. ULTIMATELY we want clickthroughs and views on the publisher platform. Kudos views are good but Share Referrals are very useful I believe. In the list below notice that for the fifth article the referrals are ZERO and the Kudos Views are low relative to the others….but this is the only one I haven’t “shared”…i.e. no tweets and no Facebook posts. My hypothesis was “Ok, so it’s not Kudos itself that is helping to drive the views/shares/clickthroughs but MY work to share…Can I prove this?”
2) In order to prove the hypothesis…and I think it’s done…I did the following.
- Choose one article that had been on Kudos for a while and had low views/shares (as do all articles that have not been enriched)
- Enrich the article in increments and see if it makes a difference…see the A’s shown on the chart below as those are enriching activities
- Monitor the views and see if any enriching activities made a difference.
- Wait two weeks and share the article and see what happens
3) The chart below proves the point.
- Enrichment, while useful for me as it helps aggregate information of value to the article, does NOTHING to drive attention to the article…i.e. the community doesn’t know what I’ve done without me telling them
- Once I share then BOOM…views/accesses/share referrals go through the roof. I went from 7 to 42 Kudos views in <2 hours
So, an article languished on Kudos for two weeks with no real traffic. I enriched it…no real impact. Not until I released it to my networks, and it got retweeted and passed on to others, did traffic increase. I have fairly good followings on the different social network tools, built up over a number of years. But what will Kudos do for those people who don’t use Facebook or Twitter? Yes, they can enrich the article, but the only way to let people know then is via email. Pushing the Kudos articles out to networks on an author’s behalf would be very useful of course. Things will get exciting if and when Kudos uses intelligent algorithms to deliver updates to people interested in specific article topics. Google Scholar Citations does this for me now…it uses my published articles to provide me with notifications and pointers to related articles, not just articles that cite me. If Kudos could send me an email with “You might be interested in these new articles claimed on Kudos…” then that may be of value also. I think a Follow button would make sense, whereby I can follow an article and, if it is enriched further by the author, Kudos informs me of what new enrichment has been added.
This presentation was given at the JC Bradley Memorial Symposium on 14th July 2014
Jean-Claude Bradley had an incredible passion for providing open science tools and data to the community. He had boundless energy, no shortage of ideas and ran so many projects in parallel that it was often difficult to keep up. But at RSC we tried. We provided access to our data, our application programming interfaces and lots of our out-of-hours time to help turn his vision into reality. As a result we helped in the delivery of the SpectralGame to help people learn about NMR and we supported the integration of our services into GoogleDocs, underpinning the management and curation of physicochemical property data. We tweaked a number of our services based on JC’s input and as a result we have ended up with a suite of capabilities that serve many of our existing efforts to integrate with electronic lab notebooks and support the ongoing shift towards Open Chemistry. JC was very much ahead of his time….and we were glad to have supported his work. This presentation will give a snapshot of some of the work we did to support his vision.