Applying RSC cheminformatics skills to support the PharmaSea project at #ACSsanfran

This is the first presentation I gave at the ACS meeting in San Francisco on Sunday morning (August 8th) in the CINF Natural Products session.

Applying Royal Society of Chemistry cheminformatics skills to support the PharmaSea project

The collaborative project PharmaSea brings European researchers to some of the deepest, coldest and hottest places on the planet. Scientists from the UK, Belgium, Norway, Spain, Ireland, Germany, Italy, Switzerland and Denmark are working together to collect and screen samples of mud and sediment from huge, previously untapped, oceanic trenches. The large-scale, four-year project is backed by almost 10 million euros of funding and brings together 24 partners from 13 countries, drawn from industry, academia and non-profit organisations. The PharmaSea project focuses on biodiscovery research and the development and commercialisation of new bioactive compounds from marine organisms, including deep-sea sponges and bacteria, to evaluate their potential as novel drug leads or ingredients for nutrition or cosmetic applications. The Royal Society of Chemistry is responsible for developing a number of capabilities to support the PharmaSea project, including a chemical registration system for new compounds, dereplication technologies to assist in the identification of new compounds, and search techniques for mass spectrometrists within the project. This presentation will provide an overview of the project and our progress in contributing chemical information technologies to support the effort.

Experiences in Hosting Big Chemistry Data Collections for the Community

This is a presentation I gave at the National Institute of Standards and Technology on July 30th 2014.

Experiences in Hosting Big Chemistry Data Collections for the Community

Access to scientific information has changed dramatically as a result of the web and its underpinning technologies. The quantities of data, the array of tools available to search and analyze, the devices and the shift in community participation continue to expand while the pace of change does not appear to be slowing. RSC hosts a number of chemistry data resources for the community including ChemSpider, one of the community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day. The platform supports crowdsourcing, enabling the community to deposit and curate data. This presentation will provide an overview of the expanding reach of this cheminformatics platform and the nature of the solutions that it helps to enable, including structure validation, text mining and semantic markup. ChemSpider is limited in scope as a chemical compound database and we are presently architecting the RSC Data Repository, a platform that will enable us to extend our reach to include chemical reactions, analytical data, and diverse data depositions from chemists across various domains. We will also discuss the possibilities it offers in terms of supporting data modeling and sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community.

Data Mining Dissertations and Adventures and Experiences in the World of Chemistry

This presentation was given at the CLIR/DLF Postdoctoral Fellowship Summer Seminar at Bryn Mawr College in Pennsylvania on July 29th 2014. The intention was to communicate what we are doing in the fields of text and data mining in the domain of chemistry, specifically around mining the RSC publication archive and chemistry dissertations and theses. How would these experiences map over to the humanities?

Current Initiatives in Developing Research Data Repositories at the Royal Society of Chemistry

I presented at the Food and Drug Administration today regarding some of our efforts to develop a research data repository for the community. The abstract and presentation from Slideshare are below.

Current Initiatives in Developing Research Data Repositories at the Royal Society of Chemistry

Access to scientific information has changed in a manner that was likely never even imagined by the early pioneers of the internet. The quantities of data, the array of tools available to search and analyze, the devices and the shift in community participation continue to expand while the pace of change does not appear to be slowing. RSC hosts a number of chemistry data resources for the community including ChemSpider, one of the community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day, and it serves as the foundation for many important international projects to integrate chemistry and biology data, facilitate drug discovery efforts and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of this cheminformatics platform and the nature of the solutions that it helps to enable, including structure validation, text mining and semantic markup. ChemSpider is limited in scope as a chemical compound database and we are presently architecting the RSC Data Repository, a platform that will enable us to extend our reach to include chemical reactions, analytical data, and diverse data depositions from chemists across various domains. We will also discuss the possibilities it offers in terms of supporting data modeling and sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community.

Converting Crystal Structures into 3D Printable Files

We have been working with Vincent Scalfani from the University of Alabama towards supporting a community of 3D printing crystal structure enthusiasts. There is a listserv, [3DP-XTAL], hosted by the University of Alabama, and if you would like to be added to it, simply email Vincent at vfscalfaniATuaDOTedu. They are also in the process of creating a 3D printing crystal structure wiki/blog for the community.

With Vincent as the driver, we are creating a public on-line repository for 3D printable structure files (.stl and .wrl). He used Jmol to prepare ~30,000 molecules and solids in .wrl and .stl formats and we will be hosting them on part of our data repository. We are very excited about this project and there will be more information at the upcoming 248th American Chemical Society Meeting in San Francisco, CA. See CINF Abstract # 125.
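
For readers who have not met the .stl format before, here is a minimal sketch of what such a file contains. To be clear, this is not how Jmol generates the files and it is not part of the repository; it simply writes the plain-text (ASCII) flavour of STL for a made-up tetrahedron, and the function names `facet_normal` and `ascii_stl` are my own illustrative choices.

```python
import math


def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c), following the right-hand rule.

    Assumes the triangle is not degenerate (non-zero area).
    """
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    # Cross product u x v
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)


def ascii_stl(name, triangles):
    """Return the text of an ASCII STL solid built from a list of triangles."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        normal = facet_normal(a, b, c)
        lines.append("  facet normal {:e} {:e} {:e}".format(*normal))
        lines.append("    outer loop")
        for vertex in (a, b, c):
            lines.append("      vertex {:e} {:e} {:e}".format(*vertex))
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# Hypothetical shape for illustration only: a tetrahedron with its
# vertices listed counter-clockwise when each face is seen from outside.
p0, p1, p2, p3 = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tetrahedron = [(p0, p2, p1), (p0, p1, p3), (p0, p3, p2), (p1, p2, p3)]

stl_text = ascii_stl("tetrahedron", tetrahedron)
print(stl_text)
```

A real molecule or crystal model is the same idea at scale: the surface of the spheres and sticks is broken into many thousands of triangles, which is why the files are prepared with a tool such as Jmol rather than written by hand.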

The flier that will be distributed at the IUCr meeting in Montreal in August is available on Slideshare here:

Choosing Between Slideshare or Figshare to Share my Presentations

I give a lot of presentations. A lot. Maybe too many. At the impending ACS meeting in San Francisco I am giving nine presentations. When I give a presentation I like to share it afterwards. I need the distribution method to be quick and easy to use, and ideally to let users of the platform find the talk if they are interested in it. I have used various platforms to disseminate my talks. There are really no usability issues with any of them…the various groups have done a good job building their platforms. I am a user of both Slideshare and Figshare and my accounts are here: Slideshare and Figshare. This week I received my weekly stats email and the numbers are below…>3000 views in one week and a total of 400,000 views of my talks, preprints etc.

My Slideshare Stats Delivered by Email

Compare this with my Figshare stats of >6600 views ever.

My total Figshare Stats

The majority of talks I upload to Slideshare have about 3000 views in 2 months as shown below…some have over 25000 now.

>3000 downloads in 2 months on Slideshare

If I compare this with Figshare the most views I have is around 500 but that was over 18 months.

Top viewed presentations on Figshare
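
Since the two figures cover very different time spans, it helps to put them on the same footing. The sketch below does nothing more than turn the numbers quoted above into views per month; it assumes roughly 3000 views in 2 months for a typical Slideshare talk and roughly 500 views in 18 months for my best Figshare item, so treat the result as a ballpark rather than a measurement.

```python
# Figures quoted in the post (approximate, taken from the stats pages).
slideshare_views, slideshare_months = 3000, 2   # a typical Slideshare talk
figshare_views, figshare_months = 500, 18       # my most-viewed Figshare item


def views_per_month(views, months):
    """Average monthly views over the stated period."""
    return views / months


slideshare_rate = views_per_month(slideshare_views, slideshare_months)
figshare_rate = views_per_month(figshare_views, figshare_months)
ratio = slideshare_rate / figshare_rate

print(f"Slideshare: {slideshare_rate:.0f} views/month")
print(f"Figshare:   {figshare_rate:.0f} views/month")
print(f"Ratio:      about {ratio:.0f}x")
```

On these numbers a typical Slideshare talk draws about 1500 views a month against roughly 28 for my best Figshare item, a difference of around 54-fold.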

Clearly my presentations on Slideshare get way higher exposure. However, the usual question of quality vs quantity comes to bear. The audience on Figshare, primarily scientists, is likely closer to my target audience than the one on Slideshare. What I should do, although it takes a few additional minutes per presentation, is post each presentation to Slideshare, Figshare, my Academia.edu account, my ResearchGate account, Vimeo, YouTube etc. But I only have so much time and right now my easiest deposition route is Slideshare. In terms of my actual prioritization of places to deposit, based on the number of views and downloads, the order is

Slideshare>ResearchGate>Academia.edu>Figshare

I specifically like the fact that Slideshare is picked up by ImpactStory.

How important is my participation in driving traffic to my Kudos Articles?

I have been working with the Kudos platform for a few weeks now…see here. Two weeks ago I chose to run an experiment. Here it is… (you may want to watch the video on the previous post first to understand what enriching an article is and why I feel the platform is of value)

1)      I enriched an article that I had authored in 2013. GENERALLY after I enrich an article I tweet it out and then look for the response… you can see some of the results below for the articles I have done…I am starting from most recent and going back to the 80s but with 150 articles to do it’s a long journey…

The important stats to take a look at are Kudos views, clickthroughs and Share referrals. ULTIMATELY we want clickthroughs and views on the publisher platform. Kudos views are good but Share Referrals are very useful I believe. In the list below notice that for the fifth article the referrals are ZERO and the Kudos Views are low relative to the others…but this is the only one I haven’t “shared”…i.e. no tweets and no Facebook posts. My hypothesis was “Ok, so it’s not Kudos itself that is helping to drive the views/shares/clickthroughs but MY work to share…Can I prove this?”

2)      In order to prove the hypothesis…and I think it’s done…I did the following.

  1. Choose one article that had been on Kudos for a while and had low views/shares (as do all that have not been enriched)
  2. Enrich the article in increments and see if it makes a difference…see the A’s shown on the chart below as those are enriching activities
  3. Monitor the views and see if any enriching activities made a difference.
  4. Wait two weeks and share the article and see what happens

3)      The chart below proves the point.

  1. Enrichment, while useful for me as it helps aggregate information of value to the article, does NOTHING to drive attention to the article…i.e. the community doesn’t know what I’ve done without me telling them
  2. Once I share then BOOM…views/accesses/share referrals go through the roof. I went from 7 to 42 Kudos views in <2 hours

So, an article languished on Kudos for two weeks with no real traffic. I enriched it…no real impact. Not until I released it to my networks, and it got retweeted and passed on to others, did traffic increase. I have fairly good followings on the different social network tools, built up over a number of years. But what will Kudos do for those people who don’t use Facebook or Twitter? Yes, they can enrich the article but the only way to let people know then is via email. Pushing the Kudos articles out to networks on an author’s behalf would be very useful of course. Things will get exciting if and when Kudos uses intelligent algorithms to deliver updates to people interested in specific article topics. Google Scholar Citations does this for me now…it uses my published articles to provide me with notifications and pointers to related articles, not just articles that cite me. If Kudos could send me an email with “You might be interested in these new articles claimed on Kudos…” then that may be of value also. I think a Follow button would make sense, whereby I can follow an article and, if it is enriched further by the author, I am informed by Kudos regarding what new enrichment is added.

 

Providing support for JC Bradley’s vision of open science using RSC cheminformatics platforms

This presentation was given at the JC Bradley Memorial Symposium on 14th July 2014.

Jean-Claude Bradley had an incredible passion for providing open science tools and data to the community. He had boundless energy, no shortage of ideas and ran so many projects in parallel that it was often difficult to keep up. But at RSC we tried. We provided access to our data, our application programming interfaces and lots of our out-of-hours time to help turn his vision into reality. As a result we helped in the delivery of the SpectralGame to help people learn about NMR and we supported the integration of our services into Google Docs, underpinning the management and curation of physicochemical property data. We tweaked a number of our services based on JC’s input and as a result we have ended up with a suite of capabilities that serve many of our existing efforts to integrate with electronic lab notebooks and support the ongoing shift towards Open Chemistry. JC was very much ahead of his time…and we were glad to have supported his work. This presentation will give a snapshot of some of the work we did to support his vision.

Dedications to the Legacy of Jean-Claude Bradley

On July 14th 2014 the Jean-Claude Bradley Memorial Symposium was held to celebrate the life and work of Professor Jean-Claude Bradley of Drexel University. This slide deck highlighting dedications made to JC on various blogs and the memorial symposium wiki helps to capture JC’s contributions to science and how we felt about him.

A Photo Loop of Jean-Claude Bradley for the Bradley Symposium

On 14th July 2014 a memorial symposium to celebrate the life and work of Professor Jean-Claude Bradley, the father of Open Notebook Science, used this photo loop to connect us to some of his activities and give us a glimpse into his personal life.

 
