Category Archives: AltMetrics

Presentations at the ACS Meeting in Denver

Having just returned from Pittcon late last night, I am now turning my attention to the next set of presentations to be given at the ACS Denver meeting. These are listed below. If any of the blog readers will be at the ACS meeting it would be great to catch up. See you there.

PAPER TITLE: Importance of data standards for large scale data integration in chemistry (final paper number: CINF 39)
DAY & TIME OF PRESENTATION: Wednesday, March 25, 2015 from 11:20 AM – 11:50 AM
ROOM & LOCATION: Room 110 – Colorado Convention Center

Increasingly, online databases are being used for the purpose of structure identification. In many cases a compound that is unknown to an investigator is already known in the chemical literature or in an online database, and these “known unknowns” are commonly available in aggregated internet resources. These types of compounds in commercial, environmental, forensic, and natural product samples can be identified by searching these large aggregated databases by either elemental composition or monoisotopic mass. We will report on the search approaches that we offer on aggregated compound databases hosted by the Royal Society of Chemistry and how these resources can be used for the purpose of structure identification. We will also report on our progress in the area of hosting interactive spectral data, including assignments, on our data repository and how we are using our analytical data platform for the purpose of natural product dereplication.
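To make the idea of searching by monoisotopic mass concrete, here is a minimal sketch in Python. It is not the RSC or ChemSpider search implementation: the function names, the four-compound toy database and the 5 ppm default tolerance are all assumptions chosen for illustration, and only a handful of elements are supported.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of a few elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}

# Hypothetical toy database of (name, molecular formula) pairs.
TOY_DATABASE = [
    ("caffeine", "C8H10N4O2"),
    ("theophylline", "C7H8N4O2"),
    ("paraxanthine", "C7H8N4O2"),
    ("glucose", "C6H12O6"),
]


def parse_formula(formula):
    """Turn a simple formula such as 'C8H10N4O2' into {'C': 8, 'H': 10, ...}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def search_by_mass(query_mass, database, tolerance_ppm=5.0):
    """Return the names of entries whose mass lies within tolerance_ppm of the query."""
    hits = []
    for name, formula in database:
        error_ppm = (monoisotopic_mass(formula) - query_mass) / query_mass * 1e6
        if abs(error_ppm) <= tolerance_ppm:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Two isomers share this mass, so the mass alone cannot tell them apart.
    print(search_by_mass(180.0647, TOY_DATABASE))
```

A mass query on its own often returns several candidates (isomers share a formula and therefore a mass), which is why ranking or additional data such as spectra is usually needed to narrow the list.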


PAPER TITLE: Give me kudos for taking responsibility for self-marketing my scientific publications and increase impact (final paper number: CINF 8)
DAY & TIME OF PRESENTATION: Sunday, March 22, 2015 from 2:15 PM – 2:40 PM
ROOM & LOCATION: Room 110 – Colorado Convention Center

The authoring of a scientific publication can represent the culmination of many tens if not hundreds of hours of data collection and analysis. The authoring and peer-review process itself often represents a major undertaking in terms of assembling the publication and passing through review. Considering the amount of work invested in the production of a scientific article it is therefore quite surprising that authors, post-publication, invest very little effort in communicating the value and potential impact of their article to the community. Social networking has clearly demonstrated the ability to self-market and drive attention. At the same time, the increasing volume of literature (over a million new articles are published every year) requires authors to take on a more direct role in ensuring their work gets read and cited. This requirement may grow with the emergence of a range of metrics at the article level, shifting attention away from where a researcher publishes to the performance of their individual articles. Therefore, a separate platform to facilitate social networking and other discovery tools to communicate the value of published science to the community would be of value. In parallel, the possibility to enhance an article by linking to additional information (presentations, videos, blog posts etc.) allows for enrichment of the article post-publication, a capability not available via the publisher’s platform. This presentation will provide a personal overview of the experiences of using the Kudos Platform and how it ultimately benefits my ability to communicate an integrated view of my research to the community.



PAPER TITLE: Providing access to a million NMR spectra via the web (final paper number: CHED 91)
SESSION: NMR Spectroscopy in the Undergraduate Curriculum
DAY & TIME OF PRESENTATION: Sunday, March 22, 2015 from 4:15 PM – 4:35 PM
ROOM & LOCATION: Gold – Sheraton Denver Downtown Hotel

Access to large scale collections of NMR spectral data can serve a number of purposes in teaching spectroscopy to students. The data can be used for teaching purposes in lectures, as training data sets for spectral interpretation and structure elucidation, and to underpin educational resources such as the Royal Society of Chemistry’s SpectralGame. These resources have been available for a number of years but have been limited to rather small collections of spectral data, specifically only about 3000 spectra. In order to expand the data collection and provide richer resources for the community we have been gathering data from various laboratories and, as part of a research project, we have used text-mining approaches to extract spectral data from articles and patents in the form of textual strings and utilized algorithms to convert the data into spectral representations. While these spectra are reconstructions of text representations of the original spectral data, we are investigating their value for the purpose of structure identification. This presentation will report on the processes of extracting structure-spectral pairs from text, approaches to performing automated spectral verification and our intention to assemble a spectral collection of a million NMR spectra and make them available online.
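As a rough illustration of what converting a textual string into a spectral representation can look like, here is a minimal sketch in Python. It is not the text-mining pipeline described in the abstract: the example peak list, the function names, the assumption that shifts follow a δ symbol, the equal peak heights and the Lorentzian line shape with a 0.2 ppm half-width are all assumptions made for illustration.

```python
import re


def parse_shifts(text):
    """Extract chemical shifts (ppm) from a reported peak-list string.

    Only the text after the delta symbol is scanned, so numbers such as the
    '13' in '13C' are ignored. Returns an empty list if no delta is present.
    """
    _, _, peak_list = text.partition("δ")
    return [float(value) for value in re.findall(r"\d+\.\d+", peak_list)]


def lorentzian(x, centre, half_width):
    """Lorentzian line with unit height at its centre."""
    return half_width ** 2 / ((x - centre) ** 2 + half_width ** 2)


def reconstruct_spectrum(shifts, start=0.0, stop=220.0, step=0.1, half_width=0.2):
    """Build (axis, intensity) lists by summing one Lorentzian per shift."""
    n_points = int(round((stop - start) / step)) + 1
    axis = [start + i * step for i in range(n_points)]
    intensity = [sum(lorentzian(x, s, half_width) for s in shifts) for x in axis]
    return axis, intensity


if __name__ == "__main__":
    # Hypothetical peak list in the style found in experimental sections.
    reported = "13C NMR (CDCl3) δ 142.0, 133.2, 128.7, 75.1, 21.9"
    shifts = parse_shifts(reported)
    axis, intensity = reconstruct_spectrum(shifts)
    print(shifts, len(axis))
```

Because a text peak list carries no intensities or line widths, any spectrum rebuilt this way is a stylized reconstruction, which is the limitation the abstract acknowledges.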


PAPER TITLE: Using online chemistry databases to facilitate structure identification in mass spectral data (final paper number: ANYL 45)
SESSION: Advances in Mass Spectrometry
DAY & TIME OF PRESENTATION: Tuesday, March 24, 2015 from 8:45 AM – 9:05 AM
ROOM & LOCATION: Aspen Room A – Embassy Suites Denver – Downtown Convention Center

The Royal Society of Chemistry hosts large scale data collections and provides access to the data to the chemistry community. The largest RSC data set of wide scale interest to the community offers access to tens of millions of compounds. The host platform, ChemSpider, is limited in that it is only a structure-centric hub. A new architecture, the RSC data repository, has been developed that extends support to reactions, spectral data, crystallography data and related property data. It is also the architecture underlying a series of exemplar projects for managing data for a number of diverse laboratories. The adoption of data standards for the integration and distribution of data has been essential. Specific standards include molecular structure formats such as molfiles and InChIs, and spectral data formats such as JCAMP. This presentation will report on our development of the data repository, the importance of utilizing standards for data integration, the flexible nature of the architecture to deliver solutions for various laboratories and our efforts to develop new large data collections. This includes text-mining efforts to extract large spectrum-structure collections from large corpora.
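To show why a standard format helps with integration, here is a minimal sketch in Python that reads the labeled header records of a JCAMP-DX style file into a dictionary. It is an illustration only, not the RSC data repository code: the sample file content and the function name are hypothetical, and the sketch handles only single-line "##LABEL= value" records and "$$" comments, not the data tables themselves.

```python
# Hypothetical, heavily shortened JCAMP-DX style content for illustration.
SAMPLE_JCAMP = """\
##TITLE= toy 13C spectrum
##JCAMP-DX= 4.24
##DATA TYPE= NMR SPECTRUM
##XUNITS= HZ  $$ comment after the value
##YUNITS= ARBITRARY UNITS
##NPOINTS= 4
##XYDATA= (X++(Y..Y))
0 10 12 11 9
##END=
"""


def parse_jcamp_header(text):
    """Collect single-line '##LABEL= value' records into a dict keyed by label."""
    records = {}
    for line in text.splitlines():
        line = line.split("$$", 1)[0].strip()  # drop trailing comments
        if not line.startswith("##") or "=" not in line:
            continue  # skip data rows and anything that is not a labeled record
        label, value = line[2:].split("=", 1)
        records[label.strip().upper()] = value.strip()
    return records


if __name__ == "__main__":
    header = parse_jcamp_header(SAMPLE_JCAMP)
    print(header["TITLE"], header["XUNITS"], header["NPOINTS"])
```

Because every record is self-describing, spectra from different laboratories and instruments can be indexed by the same metadata keys without vendor-specific readers.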


Posted on March 13, 2015 in ACS Meetings, AltMetrics, Kudos


Can I use Social Networking Tools to Awaken Old Articles?

I am running a number of experiments right now on behalf of my friend and colleague Will Russell to see what ammunition I can give him for a presentation later this week. I am already running the experiment detailed here “Running an Experiment Regarding Growing AltMetrics Using Kudos” and the data is clear…it’s working. But what I think is working is that I am simply claiming articles on Kudos, enriching them as appropriate and doing the work to push the info out to the social networks…sharing it via email, Facebook and Twitter. I ran some bland tweets and Facebook posts about some articles and got the expected result…low Altmetric scores. I got a little creative about our article on Fuzzy Structure Generation, with some quips about pulling my hair out over the science etc., and boom, the Altmetric score went up dramatically. I am about to spike it again, I hope, with another tweet. This is simply pushing up the Altmetric score with NO INDICATION that anyone read the article, cared about the science, or even looked at it. So this does raise the question of whether or not an increase in the Altmetric score means anything, but this is a different conversation and one that has happened many times. This experiment is simply showing how important my own involvement is in shifting things along…well that’s my interpretation at least.

Now what I want to do is to NOT use Kudos to push out the social networking posts etc. but simply do the work away from the platform and see whether the Altmetric score grows, and how fast I can move it. I have a whole set of articles regarding Electron Paramagnetic Resonance that are hard to make exciting. But the one on eight carbon alkyl chains and molecular motions is a good one so I have chosen that one to shift. Notice LOW Kudos views and no Altmetric score…last column.



It’s from 1990 and, from my point of view, this was MY breakthrough work in my thesis…I was able to learn a lot about what it means to be a scientist, to develop a hypothesis and analyze data. I am very proud of this work….

May the experiment begin….



Posted on November 18, 2014 in AltMetrics, Kudos


A presentation at Research Square: The Benefits of Participation in the Social Web of Science

Yesterday I had the privilege of giving a presentation at Research Square in Durham. In terms of an audience, and an environment to present in, it was certainly ideal, with a very receptive audience….but how could it not be with their mission being to provide “research communication without roadblocks”. As the MC for the day commented about when she joined Research Square, “I thought ‘I’d found my peeps’”. So many of the conversations over lunch were about commonality of views..and it appears…our networks are so similar….yup, definitely my type of peeps. 🙂

If you don’t think you know Research Square then maybe you know some of their brands? Rubriq, Journal Guide and American Journal Experts.

The Benefits of Participation in the Social Web of Science

With the flourishing environment of platforms for sharing data, establishing an online profile and engaging in scientific discourse through alternative modes of publishing and participation, there are numerous potential benefits. However, while many scientists invest significant amounts of time in sharing their activities and opinions with friends and family, the majority do not make use of the new opportunities to participate in the developing social web of science, despite the potential impact and influence on future careers. We now have many new ways to contribute to science outside of the classical publishing model. These include the ability to annotate and curate data and to “publish” in new ways on blogs and micropublishing sites, and many of these activities can be part of a growing crowdsourcing network. Our efforts in this area are already being indexed and exposed on the internet via our publications, presentations and data, and increasingly we are being quantified. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose their scientific activities online. Many of these can ultimately contribute to the developing metrics of a scientist as identified in the new world of alternative metrics. Participation offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.



Dealing with the Complex Challenge of Managing Diverse Chemistry Data Online to Enable Chemistry Across the World #ACSsanfran

This is my third presentation today at the ACS meeting in San Francisco on 11th August 2014.

Dealing with the Complex Challenge of Managing Diverse Chemistry Data Online to Enable Chemistry Across the World

The Royal Society of Chemistry has provided access to data associated with millions of chemical compounds via our ChemSpider database for over 5 years. During this period the richness and complexity of the data has continued to expand dramatically and the original vision for providing an integrated hub for structure-centric data has been delivered across the world to hundreds of thousands of users. With an intention of expanding the reach to cover more diverse aspects of chemistry-related data including compounds, reactions and analytical data, to name just a few data-types, we are in the process of implementing a new architecture to build a Chemistry Data Repository. The data repository will manage the challenges of associated metadata, the various levels of required security (private, shared and public) and exposing the data as appropriate using semantic web technologies. Ultimately this platform will become the host for all chemicals, reactions and analytical data contained within RSC publications and specifically supplementary information. This presentation will report on how our efforts to manage chemistry related data have impacted chemists and projects across the world and will review specifically our contributions to projects involving natural products for collaborators in Brazil and China, for the Open Source Drug Discovery project in India, and our collaborations with scientists in Russia.



Encouraging students to start publishing early in their career #ACSsanfran

My second talk of three on August 11th 2014 at the ACS Meeting in San Francisco.

Encouraging students to start publishing early in their career

Many students spend enormous amounts of their time engaged with their computers, accepting of course that mobile devices are simply computers of a different form factor. Engaged with the social networks, utilizing computer platforms to source and share content of various forms, their contributions of “data” into what is the cloud, and in many cases a void, are enormous. What community and career benefit might result from those students spending some of their time contributing chemistry related data to the world? What challenges lie in the way of their participation and how might participating have a positive, or negative, impact on their future career? The Royal Society of Chemistry hosts a number of chemistry data platforms to which students can actively contribute and for which their participation can be measured. Moreover, the RSC’s micropublishing platform allows chemists to learn how to write up their scientific work, obtain review from their peers and chemistry professors in a non-threatening environment and produce an online published work in less than a day that is both citable and available as a shared resource for the community. This presentation will demonstrate how to participate and encourage engagement from students early in their education. There are no longer any technology barriers to the sharing of the majority of chemistry related data.



Data Mining Dissertations and Adventures and Experiences in the World of Chemistry


This presentation was given at the CLIR/DLF Postdoctoral Fellowship Summer Seminar at Bryn Mawr College in Pennsylvania on July 29th 2014. The intention was to communicate what we are doing in the fields of text and data mining in the domain of chemistry, specifically around mining the RSC publication archive and chemistry dissertations and theses. How would these experiences map over to the humanities?



Choosing Between Slideshare or Figshare to Share my Presentations

I give a lot of presentations. A lot. Maybe too many. At the impending ACS meeting in San Francisco I am giving nine presentations. When I give a presentation I like to share it afterwards. I need the distribution method to be quick and easy to use and, hopefully, to let users of the platform find the talk if they are interested in it. I have used various platforms to disseminate my talks. There are really no usability issues with any of them….the various groups have done a good job building their platforms. I am a user of both Slideshare and Figshare and my accounts are here: Slideshare and Figshare. This week I received my weekly stats email and the numbers are below…>3000 views in one week and 400,000 views in total for my talks, preprints etc.

My Slideshare Stats Delivered by Email

Compare this with my Figshare stats of >6600 views ever.

My total Figshare Stats

The majority of talks I upload to Slideshare have about 3000 views in 2 months as shown below…some have over 25000 now.

>3000 downloads in 2 months on Slideshare

If I compare this with Figshare the most views I have is around 500 but that was over 18 months.

Top viewed presentations on Figshare

Clearly my presentations on Slideshare get way higher exposure. However, the usual question of quality vs quantity comes to bear. The audience on Figshare, primarily scientists, is likely more my audience than the one on Slideshare. What I should do, though it is time-consuming (only a few additional minutes per presentation), is upload the presentation to Slideshare, to Figshare, to my account, to my ResearchGate account, to Vimeo, to YouTube etc. But I only have so much time and right now my easiest deposition route is Slideshare. In terms of my actual prioritization of places to deposit, based on the number of views and downloads the order is


I specifically like the fact that Slideshare is picked up by ImpactStory.


Being ignored during the review process and how I would address issues in a paper today

MOST people who are reading this blog post have likely performed peer review over the years. I have reviewed a lot of manuscripts myself. The process has changed a lot over the past decade in many ways. A couple of examples of how things have changed for me:

1) More requests to review papers – and I increasingly turn down requests because they are from journals I have never heard of (some may call them “predatory publishers”), some are in areas for which I have no expertise (e.g. electrical engineering), and sometimes because I simply don’t have time.

2) I have seen papers I have reviewed show up essentially untouched in other journals (no edits and simply reformatted) and commonly these “refused papers” are accepted into what I deem to be “lower quality” publications.

Of course over the past ten years I’ve also had a lot of papers go through peer review for myself and my co-authors. This experience has also been very interesting, if not entertaining. Some examples:

1) I have experienced the “third reviewer”, where an editor has held up a manuscript or demanded changes to match some of their own expectations while the other reviewers recommended “publish as is”.

2) I have had the request to shorten excellent manuscripts to help with “page limits”….in the electronic age???

3) I have been on the receiving end of non-scientific reviews that have blocked a paper. My personal favorite “Mobile apps are a fad of the youth.”

My best story of peer review, and an example where modern technologies would have been so enabling at the time, is as follows.

I was asked to review a paper regarding the performance of Carbon-13 NMR prediction for this paper. A slice of the abstract says

“Further we compare the neural network predictions to those of a wide variety of other 13C chemical shift prediction tools including incremental methods (CHEMDRAW, SPECTOOL), quantum chemical calculation (GAUSSIAN, COSMOS), and HOSE code fragment-based prediction (SPECINFO, ACD/CNMR, PREDICTIT NMR) for the 47 13C-NMR shifts of Taxol, a natural product including many structural features of organic substances. The smallest standard deviations were achieved here with the neural network (1.3 ppm) and SPECINFO (1.0 ppm).”

This was an important time for me as this paper was comparing various NMR predictors and comparing the performance based on ONE chemical structure. And while any one point of comparison is up for discussion there were 47 shifts so you could argue it is a bigger data set. One of the programs under review was a PRODUCT that I managed at ACD/Labs, CNMR Predictor. Therefore I clearly had a concern as, essentially, the success of this product was partly responsible for my income. Any comparison that made the software look poor in performance was an issue. Was this a conflict of interest…maybe…but I judge myself to still be objective.

Table 3 listed the experimental shifts as well as the predicted shifts from the different algorithms and the size of the accompanying circle/ellipse was a visual indicator of a large difference between experimental and predicted. We will assume that all experimental assignments are correct and that there are no transcription errors between the predicted values from each algorithm and input into the table. A piece of Table 3 is shown below.

A portion of Table 3



I kind of pride myself on being a little bit of a stickler for detail when it comes to reviewing data quality. Those of you who read this blog will know that. As I reviewed the data I was a little puzzled by the magnitude of the errors for certain Carbon nuclei, specifically for Carbons 23 and 27.


The ACD/CNMR 6.0 predicted values are in the right hand column. The size of the circles indicates size of errors – I suspected that 132.8 and 142.7 ppm had been switched. That led to a deeper analysis.

What was interesting to me was that the experimental shifts for 23 and 27 were 142.0, 133.2 ppm respectively yet the predicted shifts were 132.8, 142.7 ppm respectively. It struck me that they looked like they were switched. This was what drew my attention to reviewing the data in more detail. I will cut a long story short but I redrew the molecule of Taxol as input into the same version of software that was used for the publication and got a DIFFERENT answer than that reported. I was able to distinguish WHY it was different…it was down to the orientation of a bond in the input molecule that was input by the reporting authors and this made the CNMR prediction worse.
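The check that first raised the flag can be written down as a small sketch in Python. This is only an illustration of the arithmetic, using the two experimental and predicted values quoted above; the function name and the 5 ppm threshold are assumptions, and a flagged pair is a prompt to look closer rather than proof of a transcription error (here the real cause turned out to be the input structure).

```python
from itertools import combinations


def find_possible_swaps(experimental, predicted, min_gain=5.0):
    """Flag atom pairs whose predicted shifts fit far better when exchanged.

    Both arguments map atom number -> chemical shift in ppm. A pair is flagged
    when swapping the two predictions lowers the summed absolute error by at
    least min_gain ppm. Returns (atom_a, atom_b, error_as_reported, error_if_swapped).
    """
    suspects = []
    for a, b in combinations(sorted(experimental), 2):
        as_reported = abs(experimental[a] - predicted[a]) + abs(experimental[b] - predicted[b])
        if_swapped = abs(experimental[a] - predicted[b]) + abs(experimental[b] - predicted[a])
        if as_reported - if_swapped >= min_gain:
            suspects.append((a, b, round(as_reported, 1), round(if_swapped, 1)))
    return suspects


if __name__ == "__main__":
    # Values for carbons 23 and 27 as quoted in the post.
    experimental = {23: 142.0, 27: 133.2}
    predicted = {23: 132.8, 27: 142.7}
    print(find_possible_swaps(experimental, predicted))
```

As reported, the two carbons are off by 9.2 and 9.5 ppm (18.7 ppm combined); exchanged, they would be off by only 0.7 and 0.4 ppm (1.1 ppm combined), which is why the pair looked switched at first glance.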

I reported this detail to the editors in a detailed letter and recommended the manuscript for publication with the caveat that the numbers for the column representing CNMR 6.0 be edited to accurately reflect the performance of the algorithm and provide the details. I was shocked to see the manuscript published later WITHOUT any of the edits made for the numbers and inaccurately representing the performance of the algorithm. I contacted the editors and after a couple of exchanges received quite a dressing down that the editor overseeing the manuscript refused to get between a commercial concern and reported science.

What does this mean? That software companies don’t do science and only academics do? I have similar experience of my colleagues in industry being treated with bias relative to my colleagues in academia. I believe my friends in industry, commercial concerns and academia can all be objective scientists….and after all, doesn’t academia teach the chemists that come out to industry and the commercial software world? These are my experiences…I welcome any comments you may have about the bias. BUT, back to the story…

The manuscript was published in June 2002 and as product manager I had to deal with questions around algorithmic performance for many months because “the peer-review literature said…”. This was NOT the only instance of a situation like this: a couple of years later it was reported that ACD/CNMR could not handle stereochemistry, only for us to determine, together with the scientist who wrote the paper, that he had thrown a software switch that affected his results. Software can be tricky and unfortunately the best performance can often come through the hands of those that write the software. Sad but true in many cases.

In August 2004 we published an addendum with one of the original authors describing the entire situation in detail. It was over two years from the original publication to the final addendum. I do not believe there was any malicious intent on behalf of the authors of the original manuscript, but that was in the days when the only place to issue a rebuttal was in the journal and we could not get editorial support to do it. How would it happen today if a paper came out that was suspicious? There are a myriad of tools available now….


A Comparison of Errors – Left Column is Original Paper and Right Hand Side is Rebuttal. Notice the SMALL circles for the final paper – SMALL errors

Yes, I would blog the story here, as I am doing now. Yes, I would express concern at the situation on Twitter with the hope of gaining redress. I would likely tell the story in a Slideshare presentation, make a narrated movie and make it available via an embed in the Slideshare presentation on my account. I would hope that the publisher nowadays would at least allow me to add a comment to the article, but I do understand that this comment would likely be monitored and mediated and they may choose not to expose it to the readers. I like the implementation on PLoS and have used it on one of our articles previously.

Could I maybe make use of a technology like Kudos that I have started using? I have reported it on this blog already here. I certainly could not claim the ORIGINAL article and start associating information with it regarding the performance of the algorithms…and that is a shame. But MAYBE in the future Kudos would consider letting OTHER people make comments and associate information/data with an article on Kudos. Risky? Maybe. However, I can claim the rebuttal that I was a co-author on and start associating information with that….certainly the original paper and ultimately a link to this blog. In fact, in the future is a rebuttal going to be a manuscript that I publish out on something like Figshare, grab a DOI there and maybe ask Kudos to treat that as a published rebuttal? Peer review of that rebuttal could then happen as comments on Figshare and Kudos directly, and maybe in the future Kudos Views and Altmetric measures of that rebuttal become a measure of its importance. We live in very interesting times as these technologies expand, mesh and integrate.


Give me KUDOS for my articles

Over the past few years I have learned how to use a lot of the social networking tools and platforms to host and share my publications (when I am allowed to), my presentations, videos etc. I have started using a new website, Kudos, to help me enrich, expose and measure my publications. This is VERY EARLY in my exposure to and usage of the platform but I am already excited by the possibilities. I applied KUDOS to one of the articles I co-authored with Sean Ekins and Joe Olechno regarding “Dispensing Processes Impact Apparent Biological Activity as Determined by Computational and Statistical Analyses”. With almost 10,000 views it has become a very interesting article and has been discussed many times, so there was a lot of online information to enrich the article with. The resulting KUDOS page is here:

Slideshare Article

Youtube Video


Beyond the paper CV and developing a scientific profile through social media, altmetrics and micropublication

This is a presentation that I will have delivered twice here in the UK this week…

Beyond the paper CV and developing a scientific profile through social media, altmetrics and micropublication

Many of us nowadays invest significant amounts of time in sharing our activities and opinions with friends and family via social networking tools. However, despite the availability of many platforms for scientists to connect and share with their peers in the scientific community the majority do not make use of these tools, despite their promise and potential impact and influence on our future careers. We are being indexed and exposed on the internet via our publications, presentations and data. We also have many more ways to contribute to science, to annotate and curate data, to “publish” in new ways, and many of these activities are as part of a growing crowdsourcing network. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose your scientific activities online. Many of these can ultimately contribute to the developing measures of you as a scientist as identified in the new world of alternative metrics. Participating offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.