Presentation at the BAGIM Meeting in Boston

Tonight I gave a presentation at the BAGIM meeting in Boston. The abstract is below, together with the embedded presentation from Slideshare.

ChemSpider – Is This The Future of Linked Chemistry on the Internet?
ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are now hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc., and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of almost 25 million chemical substances, grows daily, and is integrated with over 400 sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for a linked web for chemistry and to provide access to a set of online tools and services to support access to these data.

Chemicalize From ChemAxon

If you are a chemist and looking for some useful internet tools to assist you in your work I recommend Chemicalize from ChemAxon. It’s a great addition to the suite of tools that chemists can bring to bear on their problems. The fastest way to learn about Chemicalize is to watch the YouTube video here, and embedded below.

This is a great way to mark up compounds in web pages and then move over to the data pages for predicted properties. The predicted property capabilities are a great offering to the community. The predictions are licensed under Creative Commons “Attribution-NonCommercial-ShareAlike 3.0 Unported”. The site has a few teething troubles, especially in terms of layout on IE8, but this should not detract from the value of the predictions. I am not aware of any other site that provides free access to pKa predictions as shown below. This will really commoditize the market at this point and shake up the other vendors in this domain. ChemSpider has recently integrated Chemicalize as discussed on the ChemSpider blog.

Continuing Conflicts in the Messy World of Internet Chemistry

I have been looking at the state of curated data on the internet and blogged last night about the messy world of curated data. I should emphasize…none of these commentaries are meant to be harsh. Believe me, I’ve gone through the process of validating data and it’s difficult. There will be mistakes but what we need are processes and systems to clean these data up efficiently. If I see an error I want to annotate it and let people know there is an error. With today’s technologies it is not difficult.

Let’s take another example from DrugBank

That listed chemical name above the structure doesn’t look very consistent…I don’t see any stereochemistry, certainly no “dihydroxy” and overall…yes, it’s definitely wrong. The actual structure for that name is shown below. Looks like an entire half of the molecule is missing. The InChI and InChIKey are for the molecule shown in DrugBank but the link to KEGG is to the molecule shown below…here.

The links on DrugBank to PubChem and ChEBI are to the molecule on the left. All of the outlinks in the DrugBank record are for the structure on the left, EXCEPT that the actual structure on the record, and its associated SMILES and InChIs, are for the “2-amino-3,5-dihydro-4H-pyrrolo[2,3-d]pyrimidin-4-one” moiety. Oops.

Recently I pointed out to David Wishart, host of DrugBank, some of the issues I had been seeing and it appears there will be a major update to DrugBank in the next few weeks that, in theory, will address some, and hopefully all of these observations.

The Messy World of Even Curated Chemistry on the Internet

Recently I have been spending my night hours looking into the nature of curated chemistry on the internet. Three years ago I made some assumptions around the quality of certain online datasets when they were deposited onto ChemSpider. It was clear that a lot of internet chemistry datasets were “impure”…I think messy, untrustworthy and confused would be a fairer statement! However, there were a number of datasets that were manually curated and, at initial viewing, were higher quality. With time, however, I have become increasingly concerned with some of the datasets that I had originally cited as high quality. Over the next few days/weeks I will examine some of these in detail and highlight some of the issues I am seeing. I want to clarify that all chemical compounds, in terms of their connection tables, their stereochemistry and the association between the compound and the name(s), are assertions. However, there are “norms” for these structures…we would expect a particular structure for aspirin (acetylsalicylic acid), a single structure for cholesterol and a single structure for Taxol. By the way, the links to Wikipedia are not assertions that the structures that are presently on Wikipedia are correct representations…but I can confirm that PREVIOUSLY I did work to confirm that every one of these was consistent with my investigations to assert the association between the chemical name and the structure. Since then it is possible that someone edited the structure…such is the world of Wikipedia!

Two of the linked data sources I have been investigating of late are DrugBank and the Protein Databank. Both of these are manually curated and are expected to be of high quality. In my discussions with various members of the Life Science industry I have heard many positive comments about these data sources being trustworthy and high quality. I recently downloaded the DrugBank small molecule set and started looking at it. Let’s take one example…

The Drugbank record DB02309 has the chemical name “5-Monophosphate-9-Beta-D-Ribofuranosyl Xanthine“. The structure on Drugbank is shown below.

The chemical name above is inconsistent with the structure…there is no stereochemistry in the molecule displayed despite the “-D-” in the name. The IUPAC name listed in the Drugbank record is “[(2R,3S,4R,5R)-5-(2,6-dioxo-3,7-dihydropurin-9-ium-9-yl)-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate” and this clearly does not agree with the displayed structure.

The InChI listed on the record does not include a stereo layer (InChI=1/C10H13N4O9P/c15-5-3(1-22-24(19,20)21)23-9(6(5)16)14-2-11-4-7(14)12-10(18)13-8(4)17/h2-3,5-6,9,15-16H,1H2,(H2,19,20,21)(H2,12,13,17,18)/p+1/fC10H14N4O9P/h11-13,19-20H/q+1). The InChIKey is listed as:

DCTLYFZHFGENCW-NIVOTTSGCB
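The "no stereo layer" check on the InChI above can be done mechanically rather than by eye. The following is a minimal sketch, assuming it is enough to split the string on "/" and look for layers starting with t, m, s or b; the function name `stereo_layers` is mine and not part of any InChI toolkit. The second string is the InChI listed on the PDB ligand page for XMP, quoted further down in this post.

```python
# Shared front and back parts of the two InChI strings quoted in this post.
MAIN = ("InChI=1/C10H13N4O9P/c15-5-3(1-22-24(19,20)21)23-9(6(5)16)"
        "14-2-11-4-7(14)12-10(18)13-8(4)17/h2-3,5-6,9,15-16H,1H2,"
        "(H2,19,20,21)(H2,12,13,17,18)/p+1")
FIXED_H = "/fC10H14N4O9P/h11-13,19-20H/q+1"

DRUGBANK_INCHI = MAIN + FIXED_H                        # as listed on DrugBank
PDB_INCHI = MAIN + "/t3-,5-,6-,9-/m1/s1" + FIXED_H     # as listed on the PDB ligand page

def stereo_layers(inchi):
    """Return the stereo-related layers (prefix t, m, s or b) of an InChI string.

    Simplified: skips the version and formula parts and inspects only the
    first character of each remaining layer.
    """
    layers = inchi.split("/")[2:]
    return [layer for layer in layers if layer[:1] in ("t", "m", "s", "b")]

print("DrugBank:", stereo_layers(DRUGBANK_INCHI))
print("PDB:     ", stereo_layers(PDB_INCHI))
```

An empty list for the DrugBank string and a populated one for the PDB string is the same observation made in the text, just reproducible by anyone who wants to check it.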

The Drugbank record links to a structure with full explicit stereochemistry on PubChem here and to the ligand on the PDB ligand database hosted by ChEBI here.

The molfile downloaded from DrugBank has no stereochemistry but lists both Canonical and Isomeric SMILES

Isomeric SMILES O[C@H]1[C@H](COP(O)(O)=O)O[C@H]([C@@H]1O)N1C=[NH+]C2=C1NC(=O)NC2=O
Canonical SMILES OC1C(COP(O)(O)=O)OC(C1O)N1C=[NH+]C2=C1NC(=O)NC2=O

It is clear what has happened, I believe…the DrugBank record has used the canonical SMILES to generate the structure image and has neglected the stereochemistry. However, the names carry the original stereochemistry information while the InChI comes from the structure with no stereo. I think that’s what happened. Let’s confirm.
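One quick way to test that hypothesis at the text level is to strip the stereo markers from the isomeric SMILES and see whether the canonical SMILES falls out. This is only a sketch under a loud assumption: the regular expression handles simple bracket atoms like `[C@H]` and `[C@@H]` and nothing more (no charges, isotopes or multiple hydrogens on chiral atoms), so it is not a general SMILES parser. A cheminformatics toolkit would be the robust route. The function name `strip_stereo` is hypothetical.

```python
import re

# SMILES strings copied from the DrugBank record.
ISOMERIC = "O[C@H]1[C@H](COP(O)(O)=O)O[C@H]([C@@H]1O)N1C=[NH+]C2=C1NC(=O)NC2=O"
CANONICAL = "OC1C(COP(O)(O)=O)OC(C1O)N1C=[NH+]C2=C1NC(=O)NC2=O"

# Matches a bracket atom that carries only a chirality tag and an optional
# single hydrogen, e.g. [C@H] or [C@@H]. Anything more complex is left alone.
_CHIRAL_ATOM = re.compile(r"\[([A-Z][a-z]?)@{1,2}H?\]")

def strip_stereo(smiles):
    """Drop tetrahedral tags and double-bond direction marks from a SMILES string."""
    flat = _CHIRAL_ATOM.sub(r"\1", smiles)
    return flat.replace("/", "").replace("\\", "")

print(strip_stereo(ISOMERIC) == CANONICAL)
```

If the comparison holds, the two SMILES on the record describe the same connectivity and differ only in the stereo information, which is what the explanation above requires.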

ASSUMING that the isomeric SMILES string carries the appropriate stereochemistry, I can convert it and get the following InChIKey (generated using ACD/ChemSketch) and, using ACD/Name, the name below. I trust ChemSketch and ACD/Name to generate both appropriately as I managed these products while at ACD/Labs for over a decade.

DCTLYFZHFGENCW-NSVMUQOTBF

9-{(2R,3R,4R,5S)-3,4-dihydroxy-5-[(phosphonooxy)methyl]tetrahydrofuran-2-yl}-2,6-dioxo-2,3,6,9-tetrahydro-1H-purin-7-ium

On Drugbank the chemical name listed is:

[(2R,3S,4R,5R)-5-(2,6-dioxo-3,7-dihydropurin-9-ium-9-yl)-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate

Okay…the names are subtly different. Each name has three R centers and one S center, but the assignments differ, assuming that the nomenclature programs are using consistent numbering schemes. See below.

Name generated from Isomeric SMILES on DrugBank: 2R,3R,4R,5S

Chemical Name on DrugBank: 2R,3S,4R,5R
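The caveat about numbering schemes matters here, so it is worth making the comparison explicit. The sketch below pulls the R/S descriptors out of each name and compares them locant by locant, first as written and then under an assumed renumbering. The assumption, which is mine, is that the two names number the five-membered ring in opposite directions: in the ACD/Name output the purine sits on position 2 and the phosphate-bearing CH2 on position 5, while in the DrugBank name it is the other way round. All function names are hypothetical.

```python
import re

# Names copied from this post.
ACD_NAME = ("9-{(2R,3R,4R,5S)-3,4-dihydroxy-5-[(phosphonooxy)methyl]"
            "tetrahydrofuran-2-yl}-2,6-dioxo-2,3,6,9-tetrahydro-1H-purin-7-ium")
DRUGBANK_NAME = ("[(2R,3S,4R,5R)-5-(2,6-dioxo-3,7-dihydropurin-9-ium-9-yl)"
                 "-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate")

# ASSUMPTION: ring positions map 2<->5 and 3<->4 between the two names.
REVERSED_RING = {2: 5, 3: 4, 4: 3, 5: 2}

def stereo_descriptors(name):
    """Return {locant: 'R' or 'S'} from the first '(2R,3S,...)' group in a name."""
    match = re.search(r"\((\d+[RS](?:,\d+[RS])*)\)", name)
    if match is None:
        return {}
    return {int(tok[:-1]): tok[-1] for tok in match.group(1).split(",")}

def renumber(descriptors, mapping):
    """Re-key the descriptors using a locant-to-locant mapping."""
    return {mapping[loc]: rs for loc, rs in descriptors.items()}

def mismatches(a, b):
    """Locants (in a's numbering) where the two descriptor sets disagree."""
    return sorted(loc for loc in a if a[loc] != b.get(loc))

acd = stereo_descriptors(ACD_NAME)
drugbank = stereo_descriptors(DRUGBANK_NAME)
print("as written:", mismatches(acd, drugbank))
print("renumbered:", mismatches(renumber(acd, REVERSED_RING), drugbank))
```

Either way the two names disagree at two centers; the renumbering only changes which two. That should be checked against the drawn structures rather than trusted from string handling alone.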

More on this later. Looking at the linked PubChem record gives the following name: [(2R,3S,4R,5R)-5-(2,6-dioxo-3,7-dihydropurin-9-ium-9-yl)-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate, exactly the same one as listed on Drugbank….so one assumes that the chemical names on DrugBank come from PubChem. Downloading the molfile from PubChem into the same software used to generate InChIs and chemical names gives:

XHDARDSMKMUDDI-XWTUZWARBP

9-{(2R,3R,4S,5R)-3,4-dihydroxy-5-[(phosphonooxy)methyl]tetrahydrofuran-2-yl}-2,6-dioxo-2,3,6,7-tetrahydro-1H-purin-9-ium

DrugBank is linked out to the PDB ligands hosted by ChEBI and looking at the XMP ligand here we see:

[(2R,3S,4R,5R)-5-(2,6-dioxo-3H-purin-7-ium-9-yl)-3,4-dihydroxy-oxolan-2-yl]methyl dihydrogen phosphate

This is the SAME stereochemistry in the chemical name as on DrugBank, but actually a different chemical name. It is definitely possible, and common, for different systematic names to exist for the same chemical but it does indicate the challenges of linking based on different identifiers.

DrugBank: [(2R,3S,4R,5R)-5-(2,6-dioxo-3,7-dihydropurin-9-ium-9-yl)-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate

PDBeChem: [(2R,3S,4R,5R)-5-(2,6-dioxo-3H-purin-7-ium-9-yl)-3,4-dihydroxy-oxolan-2-yl]methyl dihydrogen phosphate

The InChIKeys between the different databases/tools are:

PDBeChem: DCTLYFZHFGENCW-KWDNBKPHDV

DrugBank: DCTLYFZHFGENCW-NIVOTTSGCB

PubChem: XHDARDSMKMUDDI-UUOKFMHZSA-O (this is a StdInChIKey)

ChemSketch: DCTLYFZHFGENCW-NSVMUQOTBF

ALL four are inconsistent.

If I convert the SMILES string listed on the PDBeChem ligand database using ACD/ChemSketch then

O[C@H]1[C@@H](O)[C@@H](O[C@@H]1CO[P](O)(O)=O)n2c[nH+]c3C(=O)NC(=O)Nc23

produces a structure with stereochemistry of 2R,3R,4S,5R and the InChIKey: DCTLYFZHFGENCW-XWTUZWARBW.

The stereochemistry on PDBeChem agrees with that on PubChem (based on the name). The connectivity part of this InChIKey is consistent with all other systems (except PubChem), but the full key is different from all other InChIKeys. It is also possible to download “ideal” and “representative” molfiles from the PDBeChem database.

The InChIKeys between the different databases/tools are now:

PDBeChem: DCTLYFZHFGENCW-KWDNBKPHDV

DrugBank: DCTLYFZHFGENCW-NIVOTTSGCB

PubChem: XHDARDSMKMUDDI-UUOKFMHZSA-O (this is a StdInChIKey)

ChemSketch: DCTLYFZHFGENCW-NSVMUQOTBF

PDBeChem: DCTLYFZHFGENCW-XWTUZWARBW (from Isomeric SMILES converted via ChemSketch)

PDBeChem: DCTLYFZHFGENCW-RDKRKOOMBN (from molfile from PDBeChem, 2R,3R,4S,5R)

DrugBank also links to the Protein Databank here. XMP is listed as a ligand as shown.

The XMP ligand links here to the detailed page containing the information linked below.

Name XANTHOSINE-5′-MONOPHOSPHATE
5′-xanthylic acid
[(2R,3S,4R,5R)-5-(2,6-dioxo-3H-purin-7-ium-9-yl)-3,4-dihydroxy-oxolan-2-yl]methyl dihydrogen phosphate
Synonyms 5-MONOPHOSPHATE-9-BETA-D-RIBOFURANOSYL XANTHINE
Formula C10 H14 N4 O9 P
Molecular Weight 365.21 g/mol
Type NON-POLYMER
Isomeric SMILES (OpenEye) c1[nH+]c2c(n1[C@H]3[C@@H]([C@@H]([C@H](O3)COP(=O)(O)O)O)O)NC(=O)NC2=O
InChI InChI=1/C10H13N4O9P/c15-5-3(1-22-24(19,20)21)23-9(6(5)16)14-2-11-4-7(14)12-10(18)13-8(4)17/h2-3,5-6,9,15-16H,1H2,(H2,19,20,21)(H2,12,13,17,18)/p+1/t3-,5-,6-,9-/m1/s1/fC10H14N4O9P/h11-13,19-20H/q+1
InChI key DCTLYFZHFGENCW-KWDNBKPHDV

The InChIKeys between the different databases/tools are now:

PDBeChem: DCTLYFZHFGENCW-KWDNBKPHDV

DrugBank: DCTLYFZHFGENCW-NIVOTTSGCB

PubChem: XHDARDSMKMUDDI-UUOKFMHZSA-O (this is a StdInChIKey)

ChemSketch: DCTLYFZHFGENCW-NSVMUQOTBF

PDBeChem: DCTLYFZHFGENCW-RDKRKOOMBN (from molfile from PDBeChem)

PDB_ligand: DCTLYFZHFGENCW-KWDNBKPHDV

Aagghhhhh…InChIKeys get very convoluted! What we see is that the chemical structure on PDB and on PDBeChem are the same. This is good news at least! There is a difference in the InChIKeys when I download the molfile but this can be explained easily…and in a later blog post.
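With this many keys it is easier to let a few lines of code do the bookkeeping. This is a small sketch, assuming (as I do in the text above) that the first hyphen-separated block of an InChIKey reflects connectivity and that the second block is where the stereo differences show up. The dictionary simply restates the list above; the helper name `group_by_block` is mine.

```python
from collections import defaultdict

# InChIKeys exactly as listed above, keyed by where they came from.
KEYS = {
    "PDBeChem": "DCTLYFZHFGENCW-KWDNBKPHDV",
    "DrugBank": "DCTLYFZHFGENCW-NIVOTTSGCB",
    "PubChem": "XHDARDSMKMUDDI-UUOKFMHZSA-O",   # StdInChIKey
    "ChemSketch": "DCTLYFZHFGENCW-NSVMUQOTBF",
    "PDBeChem (molfile)": "DCTLYFZHFGENCW-RDKRKOOMBN",
    "PDB_ligand": "DCTLYFZHFGENCW-KWDNBKPHDV",
}

def group_by_block(keys, block):
    """Group source names by one hyphen-separated block of their InChIKey."""
    groups = defaultdict(list)
    for source, key in keys.items():
        groups[key.split("-")[block]].append(source)
    return dict(groups)

for label, block in (("first block", 0), ("second block", 1)):
    print(label)
    for value, sources in group_by_block(KEYS, block).items():
        print("  ", value, "<-", ", ".join(sources))
```

Grouping on the first block separates PubChem from everything else, and grouping on the second block shows which sources actually agree with each other beyond connectivity.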

We believe that the structure on PDB should be expected to be correct. We will assert this.

We expect that DrugBank is sourcing the chemical from PDB to add to their database. The chemical structure on DrugBank should coincide with that from PDB. Unfortunately the SMILES on PDB and DrugBank differ in two stereocenters. We don’t know why. Why the inconsistency? If the DrugBank data aren’t from PDB for the XMP ligand where did they come from?

Did PubChem pick up the structure of XMP from the PDB Database or from DrugBank? Let’s see. If I download the 2D molfile from PubChem and generate the chemical name and InChIs I get consistency…PubChem IS consistent with PDB. It is NOT consistent with DrugBank despite the fact that DrugBank is linked into this PubChem record.

This is a very convoluted, and maybe confusing analysis of ONE compound on DrugBank. I have looked at dozens and see similar issues. Assuming that PDB is the source database for data on DrugBank why are the structures differing so much? There are worse examples to come…the linking together of data on the web between even curated databases is an incredible mess.

Caveat: This is detailed and challenging work. I encourage anyone to check my work, see if I missed anything, and confirm or challenge the observations, as some of the issues I am seeing can be tool-based…the software tools I use may have issues with SMILES conversion, molfile or SDF reading etc. It is exacting to check chemical structures…

Statins, Fast Food and Happy Healthy Meals at McDonalds

You MUST be kidding me. What type of culture do we have when we let people eat the crap they want then hand them pills and tell them not to worry? Ok, what’s driving my emotional response to this? This post on LabSpaces: Free statins with fast food could neutralize heart risk, scientists say. You can see it now…“double burger, super size fries, two of those McFlurries and a coffee.” Thank you sir…and here is your batmobile with an ejector providing a 3 pack “statin-surge”.

As LabSpaces reports: “Dr Francis, from the National Heart and Lung Institute at Imperial College London, who is the senior author of the study, said: ‘Statins don’t cut out all of the unhealthy effects of burgers and fries. It’s better to avoid fatty food altogether. But we’ve worked out that in terms of your likelihood of having a heart attack, taking a statin can reduce your risk to more or less the same degree as a fast food meal increases it.’”

Yes…AVOID the fatty food…or at least exercise afterwards! I enjoy a McFlurry as much as the next kid but then I’ll go swim a mile, or run 5km, or cycle 20 miles. I think I can burn it off instead of “statin-it-out”.

The interview follows with “When people engage in risky behaviours like driving or smoking, they’re encouraged to take measures that minimise their risk, like wearing a seat belt or choosing cigarettes with filters. Taking a statin is a rational way of lowering some of the risks of eating a fatty meal.” and then thankfully adds “The researchers note that studies should be conducted to assess the potential risks of allowing people to take statins freely, without medical supervision. They suggest that a warning on the packet should emphasise that no tablet can substitute for a healthy diet, and advise people to consult their doctor for more advice.”

But what are they encouraging…it’s really an acknowledgment that because fast food is goddamn tasty we should have toys for kids and pills for adults. What will it be: “Happy Meals” for kids and “Healthy Meals” for adults…because Mickey-D’s will throw in a six pack of statins? What’s next…Viagra at the bar…feel free to drink as much as you want gents…these pep-pills will support your inebriated performance. Don’t worry, be happy…eat crap, drink like a fish…the pharma industry has you covered. And any profits are purely accidental…

Fail Fast Despite the Hype – A Model from Google Wave

I’ve been to SciFoo twice. Both times were great. I didn’t get to go this year…and I am sad, not angry, that I wasn’t invited. It is terrific that other people, new and old attendees, got to share in the wealth of experience that makes up SciFoo. I hope that it continues and I hope I get to go again.

The first time I went the Google Datasets project was announced. It seemed like a great offer to make to the scientific community. There clearly wasn’t enough participation for the effort as the project was promptly killed.

The next time I went back to SciFoo, Google Wave received a lot of attention. Cameron Neylon helped integrate ChemSpider into Wave with ChemSpidey and the potential of Google Wave exploded across the internet as Google’s next big win. I thought the technology was “cool” and interesting, but also a technology looking for a problem, and “noisy”…it was very distracting and difficult for me personally to adopt into my daily work. I did play with it, worked on a couple of projects with some colleagues and conceived of how we would use some of the functions.

And now Google Wave is winding down…and I take this comment to heart: “…despite these wins, and numerous loyal fans, Wave has not seen the user adoption we would have liked. We don’t plan to continue developing Wave as a standalone product.” Basically they have learned some lessons, probably got some very nice capabilities to plug in elsewhere later, and have decided to stop investing. I’d love to know what their process was to come up with this decision. Wave was a massive story in the media…and well executed in terms of marketing the story. How many companies are this clean with an announcement about killing a project of this size…making a tight blog post on the company blog? It’s surprising to see it happen this way, but I have to respect them for the style of pulling the plug and failing fast. There are lots of other companies who would continue to invest, fearful of the fallout of pulling the plug on a high profile project. Good for you Google…it’s a shame it didn’t work…I DID like pieces of the technology but overall I wasn’t an adopter. But thanks for this: “The central parts of the code, as well as the protocols that have driven many of Wave’s innovations, like drag-and-drop and character-by-character live typing, are already available as open source, so customers and partners can continue the innovation we began”. The community will probably take them!

How long does it take to update WordPress?

It can take a long time to update software, especially when there are processes, procedures and testing involved. I know…I was involved with ACD/Labs when they rolled out their Updater allowing people to update their ACD/Labs software…it’s great for individuals but corporations would find it dangerous. WordPress is the blogging platform for this blog and updating it takes time…how much time? Well, a few seconds to log on, click on “WordPress 3.0.1 Update Now” and let it happen. The results were seamless, the blog didn’t break, I didn’t lose anything and I was back in production after I walked to the kitchen, grabbed a coffee and walked back. I also write on Blogger some cathartic poetry (FourQuadrantsPoetry) and adventures of an aging sportsman (http://1000milesin1year.blogspot.com/). They simply update in the background and I don’t even know about it. I install Windows updates all the time and over the past few years, though it hasn’t been totally painless, today it is mostly seamless. I must say though that the latest iPhone OS upgrade sucked and my phone has become slow as a dog to move between apps. Just horrible. But my congrats to WordPress for an update well done…seamless and fast.

How are NMR Prediction Algorithms and AFM Related?

There’s a really nice News piece over on Nature News regarding “Feeling the Shapes of Molecules”. The work reports on how Atomic Force Microscopy is being used to deduce chemical structure directly, one molecule at a time. It is, quite simply, stunning. This work is an extension of the original work reported on pentacene that many scientists thought was spectacular. This work is even one step closer to the dream of single molecule structure identification. The work is entitled “Organic structure determination using atomic-resolution scanning probe microscopy” and, as well as the IBM group responsible for the AFM work, involves Marcel Jaspars, someone whose work I have watched for many years as I am trained as an NMR spectroscopist and have spent a lot of time working on computer-assisted structure elucidation (CASE) approaches to examine natural product structures (see references in here…).

The molecule that they studied was cephalandole A, which had previously been mis-assigned. Interestingly my old colleagues from ACD/Labs, where I worked for over a decade, and I had published an article in RSC’s Natural Product Reports where we studied “Structural revisions of natural products by Computer-Assisted Structure Elucidation (CASE) systems”. The basic premise of the article is that there are incorrect structures making it into the literature because of the misinterpretation of the analytical data and that computer algorithms, specifically NMR prediction and CASE algorithms, can be used to rule out structures elucidated by the scientists. It is hard to do justice to the entire review article, as we detail the approaches to CASE and NMR prediction and doing it in a blog post is tough. So, I do recommend reading the NPR article. However, I am extracting the part that applies to the elucidation of the structure of cephalandole A and how algorithms would be of value in negating the incorrect structure.

“In 2006 Wu et al isolated a new series of alkaloids, particularly cephalandole A, 16. Using 2D NMR data (not tabulated in the article) they performed a full 13C NMR chemical shift assignment as shown on structure 16.

Mason et al synthesized compound 16 and after inspection of the associated 1H and 13C NMR data concluded that the original structure assigned to cephalandole A was incorrect. The synthetic compound displayed significantly different data from those given by Wu et al. The 13C chemical shifts of the synthetic compound are shown on structure 16A.

Cephalandole A was clearly a closely related structure with the same elemental composition as 16, and structure 17 was hypothesized as the most likely candidate. Compound 17 was described in the mid 1960s and this structure was synthesized by Mason et al. The spectral data of the reaction product fully coincided with those reported by Wu et al. The true chemical shift assignment is shown in structure 17. For clarity the differences between the original and revised structures are shown in Figure 17.


We expect that 13C chemical shift prediction, if originally performed for structure 16, would encourage caution by the researchers (we found dA=3.02 ppm). Figure 18 presents the correlation plots of the 13C chemical shift values predicted for structure 16 by both the HOSE and NN methods versus experimental shift values obtained by Wu et al. The large point scattering, the regression equation, the low R2 =0.932 value (an acceptable value is usually R2 ≥ 0.995) and the significant magnitude of the g-angle between the correlation plot and the 45-grade line (a visual indication for disagreement between the experiment and model) could indicate inconsistencies with the proposed structure and should encourage close consideration of the structure. Our experience has demonstrated that a combination of warning attributes can serve to detect questionable structures even in those cases when the StrucEluc system is not used for structure elucidation.

Figure 18. Correlation plots of the 13C chemical shift values predicted for structure 16 by HOSE and NN methods versus experimental shift values obtained by Wu et al. Extracted statistical parameters: R2(HOSE)=0.932, dHOSE=1.20dexp-25.6.
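For anyone wanting to run this kind of sanity check on their own assignments, the statistics in the excerpt are straightforward to compute. The sketch below is not the ACD/Labs implementation; it is a plain least-squares fit of predicted against experimental shifts, returning the slope, intercept and R2, plus the angle between the fitted line and the 45-degree line. Reading that angle as arctan(slope) minus 45 degrees is my assumption about what the excerpt means. The demo numbers are synthetic points on an exact line, not shifts from the paper (the experimental data were not tabulated).

```python
import math

def linear_fit(experimental, predicted):
    """Least-squares fit predicted = slope * experimental + intercept, with R2."""
    n = len(experimental)
    mean_x = sum(experimental) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(experimental, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in predicted)
    return slope, intercept, 1.0 - ss_res / ss_tot

def angle_from_diagonal(slope):
    """Angle in degrees between a line of the given slope and the 45-degree line."""
    return math.degrees(math.atan(slope)) - 45.0

def looks_suspicious(r_squared, threshold=0.995):
    """Flag a proposed structure using the R2 threshold quoted in the excerpt."""
    return r_squared < threshold

# Synthetic demo: points lying exactly on y = 2x + 1 (NOT data from the paper).
slope, intercept, r2 = linear_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept, r2, looks_suspicious(r2))

# Slope reported in the excerpt for the HOSE prediction on structure 16.
print(round(angle_from_diagonal(1.20), 2))
```

With the reported R2 of 0.932 the threshold check would flag structure 16, which is the point the excerpt is making.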

So, for those NMR jocks who don’t have access to the genius of IBM scientists performing AFM, and yet want to have tools to help in the elucidation process, you’d be doing well to use NMR prediction algorithms and CASE systems to help…it’s rather embarrassing to have to issue a retraction on a paper with your name on it.

Meanwhile I am in awe of the work reported by Marcel and his colleagues at IBM. Clearly there’s a long way to go before such approaches are mainstream but the flag is in the sand…this is where things will speed up and we are surely destined, I hope (!) to see many more reports of this type of work and how it is progressing. Let’s hope. Feedback on the NPR article welcomed!!!

Organic structure determination using atomic-resolution scanning probe microscopy

Blogging – How To and Why….

I’ve been blogging for over 3 years at this point and next week I am participating in a session at the BCCE conference in Texas and will talk about how to set up a blog on Google Blogger, rather than go through the challenge of setting up your own WordPress blog (which we do). I will talk about my experiences of blogging and the “highs and lows” of the experience. Overall I enjoy blogging, but I don’t have enough time to blog as much as I’d like to any more, and I’m glad I started participating in the blogosphere. The presentation is on Slideshare and embedded below. As usual, Creative Commons licensing…use as you wish.

American Chemical Society Loses the Appeal Against the Leadscope Case

The American Chemical Society is going to take a pretty significant hit in its most recent iteration of the ACS vs Leadscope case…to the sum total of $40 million PLUS costs. Ow.

ACS, through CAS generally, have had a number of very high profile collisions over the past few years but this has to be the most costly.

1) In 2004 ACS went up against Google for infringement against “Scholar” as a trademark. “The ACS complaint contends that Google’s use of the word scholar infringes on ACS’s SciFinder Scholar and Scholar trademarks and constitutes unfair competition.” No one “lost” and it was settled out of court with the statement from the ACS that “The settlement includes a confidentiality clause and as such the ACS will have no further comment.” Not sure how much it cost but I don’t personally know any cheap lawyers. And if you’re up against Google lawyers they are not going to be cheap lawyers!

2) In 2005 the ACS opposed the creation of PubChem stating “The ACS believes strongly that the Federal Government should not seek to become a taxpayer supported publisher. By collecting, organizing, and disseminating small molecule information whose creation it has not funded and which duplicates CAS services, NIH has started, ominously, down the path to unfettered scientific publishing…”. This one was a very public battle with a very significant public outcry. There were discussions on multiple blogs, letters to C&E News and a number of people I know personally gave up their ACS membership in disgust. Wikipedia has some interesting reports about some of the costs involved. “The ACS has a strong financial interest in the issue since the Chemical Abstracts Service generates a large percentage of the society’s revenue. To advocate their position against the PubChem database, ACS has actively lobbied the US Congress. They are reported to have paid the lobbying firm Hicks Partners LLC at least $100,000 in 2005 to try to persuade congressional members, the NIH, and the Office of Management and Budget (OMB) against establishing a publicly funded database. They also were reported to have spent $180,000 to hire Wexler & Walker Public Policy Associates to promote the ‘use of [a] commercial database’.” In the same article Wikipedia reports on the ACS stance against Open Access: “The journal Nature reported that ACS had hired a public relations firm, Dezenhall Resources, to try to halt the open access movement. Scientific American later reported that ACS had spent over $200,000 to hire Wexler & Walker Public Policy Associates to lobby against open access”.

3) In 2002 ACS sued Leadscope, and for the past eight years Leadscope and the founding scientists Paul Blower, Glenn Myatt and Wayne Johnson have been battling the charge of trade secret misappropriation. The ACS claimed that the three scientists had stolen trade secrets by patenting a software program for pharma companies that shortens the process to develop new drugs. The case was finally tried in 2008 and, following an eight-week trial, the jury found no evidence of misappropriation. They determined that the ACS had brought its claim in bad faith and awarded Leadscope damages of $27 million on its countersuit for defamation, unfair competition and tortious interference.

In closing arguments, Leadscope’s attorney argued that ACS “destroyed the reputations of three dedicated scientists…They have ruined the financial position of LeadScope…These scientists did their own work. They didn’t take anything from [ACS]”. Much of the case focused on expert analysis of Leadscope’s source code. Leadscope presented expert testimony that the source code of their own product was NOT copied.

The C&E News report of the result is here.

ACS appealed the result of the case and has been fighting it for the past couple of years. They lost the appeal and the damages are now up to $40 million PLUS costs. Ow.

I’ve been an ACS member for well over a decade. I’ve been an RSC (Royal Society of Chemistry) employee for just over a year. All my comments are made as an ACS member and not an RSC employee…it’s why I am making the comments here and not on the ChemSpider blog.

1) Summing up the amount of money that has gone into litigation, lawyers’ fees and settlements, how much money has been drained from the coffers of the ACS in the past decade? With the impending $40 million damages and the other legal wranglings, could it be over $50 million? Surely that money would be put to better use subsidizing a conference, keeping membership fees down or even supplying materials or support to schools and colleges with needs around chemistry? How many other legal wranglings are waiting in the wings to further draw down the coffers?

2) How many not-for-profits engage themselves in such regular legal wranglings? In 1990 a lawsuit was brought against the ACS threatening the not-for-profit status. The discussions regarding ACS/CAS having not-for-profit status continue to be a talking point in a number of circles and dinner conversations I have sat in on. As Jeffrey Rich commented “CAS is in no way related to the Boy Scouts of America or the United Way. What they do is no different from what a big computer business or publishing company does. That’s not a sign of a charitable organization, but of an intellectual business organization in business to make a bundle.” Another ACS, the American Cancer Society, has similar questions hanging over it.

3) What is the reputation cost of these legal cases for the ACS? I know a number of people who have left the ACS because of the PubChem challenges made by ACS. The blogosphere lit up when these challenges were happening and yet, as far as I can tell, no efforts are being undertaken to defuse or participate in the discussions. The statements are legal only and are too succinct to explain the mindset behind the challenges. A town meeting allowing a dialog would be very beneficial. I look forward to sitting in on such a discussion regarding Leadscope at the next ACS meeting. Will it happen? Was there a town meeting at ACS/CAS regarding the latest legal conclusion and how it will impact the organization?

I enjoy ACS meetings. I read C&E News every week but admit that I find RSC’s Chemistry World a more entertaining read. I know a lot of people at CAS and ACS and they are great people.

I hope that more consideration is given before the next legal case is brought against an individual or organization. It costs reputation and money and will add to the growing concern that the ACS is focused on business rather than acting as a nonprofit.