Practical semantics in the pharmaceutical industry – the Open PHACTS project

This is my fourth talk at the ACS Indianapolis Conference:

Practical semantics in the pharmaceutical industry – the Open PHACTS project

The information revolution has transformed many business sectors over the last decade and the pharmaceutical industry is no exception. Developments in scientific and information technologies have unleashed an avalanche of content on research scientists who are struggling to access and filter this in an efficient manner. Furthermore, this domain has traditionally suffered from a lack of standards in how entities, processes and experimental results are described, leading to difficulties in determining whether results from two different sources can be reliably compared. The need to transform the way the life-science industry uses information has led to new thinking about how companies should work beyond their firewalls. In this talk we will provide an overview of the traditional approaches major pharmaceutical companies have taken to knowledge management and describe the business reasons why pre-competitive, cross-industry and public-private partnerships have gained much traction in recent years. We will consider the scientific challenges concerning the integration of biomedical knowledge, highlighting the complexities in representing everyday scientific objects in computerised form. This leads us to discuss how the semantic web might offer a long-overdue solution. The talk will be illustrated by focusing on the EU-Open PHACTS initiative (openphacts.org), established to provide a unique public-private infrastructure for pharmaceutical discovery. We will describe the aims of this work and how technologies such as just-in-time identity resolution, nanopublication and interactive visualisations are helping to build a powerful software platform designed to appeal directly to scientific users across the public and private sectors.

Personal experiences in participating in the expanding social networks for science

This is the third presentation I gave at the ACS Meeting in Indianapolis:

Personal experiences in participating in the expanding social networks for science

The number of social networking sites available to scientists continues to grow. We are being indexed and exposed on the internet via our publications, presentations and data. We have many ways to contribute, annotate and curate, many of them as part of a growing crowdsourcing network. As one of the founders of the online ChemSpider database I was drawn into the world of social networking to participate in the discussions that were underway regarding our developing resource. As a result of my experiences in blogging, and as a result of developing collaborations and engagement with a large community of scientists, I have become very immersed in the expanding social networks for science. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways that I expose my scientific activities online. Many of these activities will ultimately contribute to the developing measures of me as a scientist as identified in the new world of alternative metrics.

Accessing chemical health and safety data online using Royal Society of Chemistry resources

This is the second presentation I gave at the ACS Meeting in Indianapolis:

Accessing chemical health and safety data online using Royal Society of Chemistry resources

The internet has opened up access to large amounts of chemistry-related data that can be harvested and assembled into rich resources of value to chemists. The Royal Society of Chemistry’s ChemSpider database has assembled an electronic collection of over 28 million chemicals from over 400 data sources, and some of the assembled data is certainly of value to those searching for chemical health and safety information. Since ChemSpider is a text- and structure-searchable database, chemists are able to find relevant information using both of their general search approaches. This presentation will provide an overview of the types of chemical health and safety data and information made available via ChemSpider and discuss how the data are sourced, aggregated and validated. We will examine how the data can be made available via mobile devices and examine the issue of data quality and its potential impacts on such a database.

Apps and approaches to mobilizing chemistry from the Royal Society of Chemistry

This is the first presentation I gave at the ACS Meeting in Indianapolis:

Apps and approaches to mobilizing chemistry from the Royal Society of Chemistry

Mobilizing chemistry by delivering data and content from Royal Society of Chemistry resources has become an important component of our activities to increase accessibility. Content includes access to our publications, our magazine content and our chemistry databases. Mobile devices also allow us to deliver access to tools to support teaching, game-based learning, annotation and curation of data. This presentation will provide an overview of our varied activities in enhancing access to chemistry-related data and materials. This will include providing data feeds associated with RSC graphical databases, our experiences in optical structure recognition using smartphone apps and our future vision for supporting chemistry on mobile devices.

Calling on Co-facilitators for the Social Networking Symposium at ACS/Indianapolis

The ACS Fall meeting in Indianapolis is just around the corner really…and it is going to be a busy week for many of us. The RSC eScience team will be there in force and we have a LOT of presentations that we are involved with. One of the sessions that I am co-chairing (with Jennifer Maclachlan) will be on the “Role and Value of Social Networking in Advancing the Chemical Sciences”. There are going to be some great presentations so we hope to see you. Carmen Drahl will also be hosting a panel discussion at the end of the afternoon session.

After the symposium we will be hosting more of a hands-on workshop for an hour and a half to discuss “Live Social Networking for the Chemical Sciences.” After a short introduction we will hold short discussions about each of the following topics:

Public Profile tools (sites/platforms where you can expose your work/activities/your profile/CV)

AltMetrics (How to feed them with data, use them, abuse them)

Reference Managers (Managing your references, citations, collaborative sharing)

Collaborative Platforms (Working together on data, projects, publications, presentations)

Online forums (Where do you go for advice, Collective wisdom, Online discussion groups)

While we could lead a discussion about each of these topics, we wish to engage alternative discussion leaders to lead a conversation with the attendees and discuss the perks/benefits, dos-and-don’ts and war stories in each of these areas. So that we know we have coverage, please contact me directly if you are willing to lead a discussion for one of the topics. I am at TONY27587_AT_GMAIL_DOT_COM. While you may choose to represent your own profile, our request is that you are neutral and lead a general discussion about what’s available. Looking forward to your participation!

A List of Presentations that the Royal Society of Chemistry team is involved with at ACS Indianapolis

ACS Indianapolis is going to be a very busy week for the RSC, as evidenced by the long list of presentations we will be delivering at the conference…come along and see what we are up to…

SUNDAY

1. PRESENTER: Antony Williams

PAPER ID: 11394
PAPER TITLE: Apps and approaches to mobilizing chemistry from the Royal Society of Chemistry
SESSION: Chemistry on Tablet Computers
DAY & TIME OF PRESENTATION: September 08, 2013 from 8:10 am to 8:40 am
LOCATION: Indiana Convention Center, Room: 141

2. PRESENTER: Simon Coles (University of Southampton)

PAPER ID: 13
PAPER TITLE: Tablets in the lab: Enabling the flow of chemical synthesis data into a chemistry repository.
SESSION: Chemistry on Tablet Computers
DAY & TIME OF PRESENTATION: September 08, 2013 from 10:55 am to 11:25 am
LOCATION: Indiana Convention Center, Room: 141

3. PRESENTER: Antony Williams

PAPER ID: 12750
PAPER TITLE: Accessing chemical health and safety data online using Royal Society of Chemistry resources
SESSION: New Horizons in Chemical Health and Safety
DAY & TIME OF PRESENTATION: September 08, 2013 from 5:35 pm to 5:55 pm
LOCATION: Indiana Convention Center, Room: 115

MONDAY

4. PRESENTER: Antony Williams

PAPER ID: 12406
PAPER TITLE: @ChemConnector and my personal experiences in participating in the expanding social networks for science
SESSION: Role and Value of Social Networking in Advancing the Chemical Sciences
DAY & TIME OF PRESENTATION: September 09, 2013 from 8:20 am to 8:45 am
LOCATION: Indiana Convention Center, Room: 141

5. PRESENTER: Bibi Campos-Seijo

PAPER ID: 52
PAPER TITLE: Exploiting the digital landscape to advance the chemical sciences
SESSION: Role and Value of Social Networking in Advancing the Chemical Sciences
DAY & TIME OF PRESENTATION: September 09, 2013 from 1:00 pm to 1:25 pm
LOCATION: Indiana Convention Center, Room: 141

6. PRESENTER: Valery Tkachenko

PAPER ID: 57
PAPER TITLE: Building support for the semantic web for chemistry at the Royal Society of Chemistry
SESSION: Joint CINF-CSA Trust Symposium: Semantic Technologies in Translational Medicine and Drug Discovery
DAY & TIME OF PRESENTATION: September 09, 2013 from 1:35 pm to 2:05 pm
LOCATION: Indiana Convention Center, Room: 142

7. PRESENTER: Antony Williams

PAPER ID: 14637
PAPER TITLE: Practical semantics in the pharmaceutical industry: The Open PHACTS project
SESSION: Joint CINF-CSA Trust Symposium: Semantic Technologies in Translational Medicine and Drug Discovery
DAY & TIME OF PRESENTATION: September 09, 2013 from 3:20 pm to 3:50 pm
LOCATION: Indiana Convention Center, Room: 142

WEDNESDAY

8. PRESENTER: Antony Williams

PAPER ID: 10738
PAPER TITLE: Social profile of a chemist online: The potential profits of participation
SESSION: Before and After Lab: Instructing Students in ‘Non-Chemical’ Research Skills
DAY & TIME OF PRESENTATION: September 11, 2013 from 1:35 pm to 2:05 pm
LOCATION: Indiana Convention Center, Room: 141

9. PRESENTER: Antony Williams

PAPER ID: 11513
PAPER TITLE: Digitizing documents to provide a public spectroscopy database
SESSION: Back to the Future: Print Resources in a Digital World
DAY & TIME OF PRESENTATION: September 11, 2013 from 3:15 pm to 3:45 pm
LOCATION: Indiana Convention Center, Room: 141

THURSDAY

10. PRESENTER: Antony Williams

PAPER ID: 11519
PAPER TITLE: Importance of standards for data exchange and interchange on the Royal Society of Chemistry eScience platforms
SESSION: Exchangeable Molecular and Analytical Data Formats and their Importance in Facilitating Data Exchange
DAY & TIME OF PRESENTATION: September 12, 2013 from 9:10 am to 9:40 am
LOCATION: Indiana Convention Center, Room: 140

11. PRESENTER: Jean-Claude Bradley

PAPER ID: 114
PAPER TITLE: Practical open data exchange formats for open organic chemistry projects
SESSION: Exchangeable Molecular and Analytical Data Formats and their Importance in Facilitating Data Exchange
DAY & TIME OF PRESENTATION: September 12, 2013 from 10:55 am to 11:25 am
LOCATION: Indiana Convention Center, Room: 140

What data do we trust now in the world of high-throughput screening and public compound databases

Let’s face it, the world of experimentation is fun, rewarding, challenging and depressing. OK, that has been MY experience of the world of lab-based experimentation. I have made many discoveries and celebrated the true joy of being a lab-rat. Love it…always did. I remain polarized to this day by the number of hours I spent around large NMR magnets. No bias, but still polarized. But lab work is also challenging…sometimes not in a good way. Hours of “experiences”…read that as wasted time because of bad preparation on my part, or on a collaborator’s part, or bad chemicals, poorly calibrated equipment, the “person who came before me” scenario etc. Then there are the truly depressing moments I experienced in the lab: repeating work that someone else in my lab had already done, because the lack of a LIMS meant I had no way of knowing; colleagues not checking materials shipped to them at a crucial stage of a synthesis and finding out that what was ordered was not in the bottle (still their fault for not checking!); NMR solvents being really wet and causing nasty side effects on the compound; and, in my life…two magnet quenches in one day…a 500 MHz and a 300 MHz. I shrugged and went home…

Some of my lab experiences were depressing but then I moved into cheminformatics. And in the past few years I have been depressed by the sad state of our public compound databases and the quality of data online. I have given dozens of presentations on the matter of data quality and these two blog posts are representative. We’ve also published on the issues of chemical compounds in the public databases and their correctness.

A Quality Alert and Call for Improved Curation of Public Chemistry Databases, A.J. Williams and S. Ekins, Drug Discovery Today

Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation, A.J. Williams, S. Ekins, V. Tkachenko, Drug Discovery Today, 5, 2012

This work was always focused on chemical compound structure representations and their matches with synonyms, names etc. The common question was whether the structures were what their names said they should be. After a couple of years of working on this, and publishing with Sean Ekins, we wondered about the quality of the measured experimental data, especially in the public domain assay screening databases, PubChem of course being the granddaddy of them all. While work could be done to confirm name-structure relationships in PubChem, the experimental data is what it is, as submitted. How do you check the quality of measured experimental data – reproducibility, comparison between labs etc.? Not easy.

When the opportunity came to investigate the possibilities of errors in experimental data we didn’t quite expect the results we obtained. Rather than explain the work in detail I encourage you to read the paper, Open Access on PLOS One and available here. The article, entitled “Dispensing Processes Impact Apparent Biological Activity as Determined by Computational and Statistical Analyses” can be summarized as follows:

* Serial dilution and dispensing using pipette tips versus acoustic dispensing with direct dilution can yield apparent activities that differ by orders of magnitude, with no correlation between the two

* The resulting computational 3D pharmacophores generated from data from both acoustic and tip-based transfer differ significantly

* Traditional dispensing processes are another important source of error in high-throughput screening that impacts computational and statistical analyses

Derek Lowe on the “In the Pipeline” blog made some strong comments in his post about the paper. He called it a “truly disturbing paper” and said “…people who’ve actually done a lot of biological assays may well feel a chill at the thought, because this is just the sort of you’re-kidding variable that can make a big difference.” And he’s right. There is cause for concern. First of all, we don’t know enough yet from this very small study to understand what classes of compounds are going to exhibit this effect of pipette vs. acoustic discrepancy. Secondly, there is no metadata associated with the assay data itself (that we are aware of) that captures the distinction in the dispensing process, and this paper SHOULD encourage screeners to include this info in their data.
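To make the metadata point concrete, here is a minimal sketch in Python of what carrying the dispensing process alongside each measurement could look like. Everything in it is an assumption for illustration only: the names DispensingMethod, AssayResult, orders_of_magnitude_apart and comparable are hypothetical, as is the choice of an IC50 in micromolar as the stored potency value. It is not the schema of PubChem or any other database.

```python
from dataclasses import dataclass
from enum import Enum
from math import log10


class DispensingMethod(Enum):
    """Hypothetical vocabulary for how a compound reached the assay plate."""
    TIP_BASED = "tip-based serial dilution"
    ACOUSTIC = "acoustic direct dilution"
    UNKNOWN = "not recorded"


@dataclass(frozen=True)
class AssayResult:
    """One measured potency plus the dispensing metadata (illustrative only)."""
    compound_id: str
    ic50_micromolar: float
    dispensing: DispensingMethod = DispensingMethod.UNKNOWN


def orders_of_magnitude_apart(a: AssayResult, b: AssayResult) -> float:
    """Absolute log10 ratio of two IC50 values for the same compound."""
    if a.compound_id != b.compound_id:
        raise ValueError("results are for different compounds")
    return abs(log10(a.ic50_micromolar / b.ic50_micromolar))


def comparable(a: AssayResult, b: AssayResult) -> bool:
    """True only when both results record the same, known dispensing method."""
    return (
        a.dispensing is b.dispensing
        and a.dispensing is not DispensingMethod.UNKNOWN
    )
```

With a field like this in place, anyone modeling the data could at least filter on comparable() before pooling measurements, and data submitted without the field would be flagged as UNKNOWN rather than silently mixed in.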

The differences between tip and acoustic dispensing are of course only one of many issues that can accompany data measurements for compounds. Another obvious issue is the purity of what’s being screened: is it one component or many, and is an impurity showing the response? And, in terms of modeling, does the compound being screened match the suggested compound that was purchased/synthesized? Classify this as analytical data required prior to screening. Then there are reproducibility and replicates, assay performance, decomposition in storage, etc. Check out the comments responding to Derek’s post; clearly the screening community understands many of the challenges and has to deal with them.

Once upon a time someone from pharma made a couple of comments that I found very interesting: 1) it likely costs more to store the screening data long term and support the informatics systems than it does to regenerate the data with new and improved assays on an ongoing basis; 2) as assay performance is understood, and assuming that materials are available, it is likely appropriate to flush any data older than three years and remeasure. Certainly, with this observation of pipette vs. acoustic bias, data measured with tips may need to get flushed and remeasured with acoustic dispensing methods.

This work describes the observed differences between tips and acoustic methods and improved pharmacophore correlations. It highlights issues that likely exist in the data sitting in the assay screening databases (compounded with chemistry issues) and brings into focus the question of what can be trusted in the data. For sure not all the data is bad, but how do we separate good from bad, and what of the models that can be derived? As Derek summarized in his blog post “How many other datasets are hosed up because of this effect? Now there’s an important question, and one that we’re not going to have an answer for any time soon.” And it’s depressing to think about how many data sets might be hosed…

There is an entire back story to this publication also…that is the challenges that we had getting the work published and the multiple rejections we had in the process. But Sean has told that story in detail here. There’s also the story about the press release …and how editorial control extended from the paper itself to the press release (described here), a situation that I found inappropriate, over-reaching and simply not right. But it happened anyways…..

So…data quality is an issue. It is confusing, hard to tease out and identify for all its complexities. But it’s science, it’s incremental learning and it’s trial by fire. And we have to wonder how many projects might have been burned simply by the dispensing processes.

The future of scientific information & communication presented at the SUNY Potsdam Academic Festival

This is a LONG presentation….I talk about the “It’s All About Me” attitude that can positively feed science….we want to share OUR science, we want people to know about our opinions, our activities, our collaborators, we want to get funding, recognition and attribution. And why not…it can all be to the benefit of science.

This presentation was given at the SUNY Potsdam Academic Festival

The future of scientific information & communication

Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand while the pace of change does not appear to be slowing. While scientists now have access to the enormous capacities and capability of the internet, the vast majority of scientific communication continues to be through peer-reviewed scientific journals. The measure of a scientist’s contribution is primarily represented by their publication profile and the citations to their published works, which offers an incomplete view of their activities. However, we are at the beginning of a new revolution where the ability to communicate offers the opportunity to embrace new forms of publishing and where scientific participation and influence will be measured in new ways. This presentation will provide an overview of our new generation of “openness” in which open source, open standards, open access and open data are proliferating. The future of scientific information and communication will be underpinned by these efforts and influenced by increasing participation from the scientific community and facilitated collaboration, and will ultimately accelerate scientific progress.

Navigating scientific resources using wiki based resources

Presentation given at ACS New Orleans Spring Meeting

There is an overwhelming number of new resources for chemistry that would likely benefit both librarians and students in terms of improving access to data and information. While commercial solutions provided by an institution may be the primary resources there is now an enormous range of online tools, databases, resources, apps for mobile devices and, increasingly, wikis. This presentation will provide an overview of how wiki-based resources for scientists are developing and will introduce a number of developing wikis. These include wikis that are being used to teach chemistry to students as well as to source information about scientists, scientific databases and mobile apps.

Engaging students in publishing on the internet early in their careers

Presentation given at ACS New Orleans Spring Meeting

As a result of the advent of internet technologies supporting participation on the internet via blogs, wikis and other social networking approaches, chemists now have an opportunity to contribute to the growing chemistry content on the web. An important skill for scientists to develop is the ability to report succinctly, in a published format, the details of scientific experimentation. The Royal Society of Chemistry provides a number of online systems to share chemistry data, the best known of these being the ChemSpider database. In parallel, the ChemSpider SyntheticPages (CSSP) platform is an online publishing platform for scientists, and especially students, to publish the details of chemical syntheses that they have performed. Using the rich capabilities of internet platforms, including the ability to display interactive spectral data and movies, CSSP is an ideal environment for students to publish their work, especially syntheses that might not support mainstream publication.
