Archive Info

You are currently browsing the ChemConnector Blog weblog archives for the 'Uncategorized' category.

The Curation of Almost 5000 Structures on Wikipedia

I recently commented on the statement made by Eric Shively of CAS about the CAS Validation Project going on at Wikipedia. The basic premise of the work is the need to validate CAS numbers to ensure that the CAS number listed in a chemical box is associated with the structure shown in that chemical box. So, if the structure has stereochemistry, make sure that the CAS number is for the form of the structure with stereochemistry. If the CAS Number is for a neutral compound then the structure displayed should not be the salt. And so on, and so on. There are many sources of CAS Numbers online, and many places to search for them to confirm. Type in “CAS Number search” online and you’ll find a lot of hits, though admittedly not all of them are related to Chemical Abstracts Service.

Some examples of “online CAS number searches” are excellent. In the order that I see them in my search:

The NIST Webbook - much loved by many scientists and very useful.

ChemIndustry - An excellent resource for chemists and, I believe, gaining a good following in the market.

ChemFinder - Cambridgesoft’s online search system

A Buyers Guide - A German Chemical Buyer’s Guide.

Pennsylvania Department of Environmental Protection

California Department of Pesticide Regulation

And on and on. There are likely legal reasons for a number of these databases to have CAS Numbers. As I continued to peruse the list I was more than impressed by the number of databases serving up CAS numbers online. I believe a number of them contain over 10,000 numbers which, as I have commented before, is rather a magic number. Should Wikipedia be concerned about the 10,000 CAS number issue, along with some of the other issues being discussed now?

PMR recently commented on my blogpost here. He said “PMR: Wikipedia has between 1000 and 2000 chemical substances (ca 0.01% of the total number of substances in CAS).”

The number of chemical substances in Wikipedia is actually MUCH higher than that…I know since I’ve been looking at them in detail, as described here. To clarify, I am building an SDF file from the chemicals on Wikipedia so that it can be deposited on ChemSpider and linked back to Wikipedia. This was done earlier by linking up chemical names but it was far from perfect, so we are doing it in this more “curated” manner. The outcome from the work, thanks to multiple other sets of eyes from WP:CHEM, will be a curated SDF file. I will return the SDF file with the following fields generated: SMILES string, Systematic Name, InChI string, InChIKey. These can then be used to homogenize the available fields in the Chemical Boxes etc.
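To make the shape of such a file concrete, here is a minimal Python sketch of one SDF record carrying the four fields listed above. It is an illustration only, not the workflow actually used for the curation: the function name, the field tags and the methane example are my own assumptions, and a real pipeline would generate the identifiers with a cheminformatics toolkit rather than typing them in.

```python
from typing import Mapping


def sdf_record(molblock: str, fields: Mapping[str, str]) -> str:
    """Return one SDF record: molblock, then data items, then '$$$$'.

    Hypothetical helper for illustration; field tags are assumptions.
    """
    lines = [molblock.rstrip("\n")]
    for name, value in fields.items():
        lines.append(f">  <{name}>")  # data header line for this field
        lines.append(str(value))      # the field value
        lines.append("")              # a blank line ends each data item
    lines.append("$$$$")              # record delimiter
    return "\n".join(lines) + "\n"


# A single-atom V2000 molblock for methane (illustrative example only).
METHANE_MOLBLOCK = "\n".join([
    "methane",
    "",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])

fields = {
    "SMILES": "C",
    "Systematic Name": "methane",
    "InChI": "InChI=1S/CH4/h1H4",
    "InChIKey": "VNWKTOKETHGBQD-UHFFFAOYSA-N",
}

record = sdf_record(METHANE_MOLBLOCK, fields)
print(record)
```

Concatenating one such record per compound gives a file in which every structure carries the same set of fields, which is what makes it usable for homogenizing the Chemical Boxes.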

In doing the work (I have already worked through the whole alphabet) I have over 4900 compounds already curated at a first level. I have disregarded the majority of inorganics and organometallics for this pass. Manually curating ca. 5000 organics is ENOUGH of a challenge. I estimate the number of chemical compounds to be about 6500-7000, and it’s growing. So, it’s about 3-4 times bigger than PMR’s estimate. The vast majority do have CAS numbers. While it hasn’t hit 10,000 yet…it’s coming.

An Excellent Review of Protein Docking

As a result of work we are doing over on ChemSpider regarding LASSO I have become increasingly interested in the world of protein docking. A great review article was just released. I highly recommend it if this is an area of interest for you.

Protein-ligand Docking: A Review of Recent Advances and Future Perspectives

Current Pharmaceutical Analysis, Volume 4, Number 1, February 2008, ISSN: 1573-4129

Montserrat Vaqué, Anna Ardévol, Cinta Bladé, M. Josepa Salvadó, Mayte Blay, Juan Fernández-Larrea, Lluís Arola and Gerard Pujadas

Understanding the interactions between proteins and ligands is crucial for the pharmaceutical and functional food industries. The experimental structures of these protein/ligand complexes are usually obtained, under highly expert control, by time-consuming techniques such as X-ray crystallography or NMR. These techniques are therefore not suitable for routinely screening the possible interaction between one receptor and thousands of ligands. To overcome this limitation, computational algorithms (i.e. docking algorithms) have been developed that use the individual structures of the receptor and ligand to predict the structure of their complex. The present review, then, summarizes: (a) the fundamentals of the algorithms of the most commonly used docking programmes (with particular emphasis on their strengths and limitations); (b) how the results from different docking algorithms compare (i.e. which software gives the best predictions); and (c) the future perspectives and challenges for docking techniques.

The Full NMR Assignment of Hexacyclinol using CASE now published

The hexacyclinol debacle has been highlighted on a number of blogs and has caused a furor within the organic chemistry community. I have discussed this previously on the ChemSpider blog. Well, I am happy to say that the article describing our work is finally available as an ASAP article online. This was certainly an interesting piece of work to be involved with. It was a detective story from start to finish and brought together a very skilled team to work on this issue. In my opinion this study truly shows the capabilities of Computer-Assisted Structure Elucidation.

hexacyclinol.png

My friend “An American Citizen”

My friend “Halbstein” has started his own blog - American Citizen. Recently he and I sat for lunch and talked about the politics of health care in the United States and we reviewed a very interesting article together. He has commented about this on his blog and I recommend that anyone interested in the costs of health care in the USA browse it. Very revealing…go to his blog for info.

In response to his post I waxed lyrical about the movie Field of Dreams, Burt Lancaster and my doctor when I was growing up. Halbstein took it one step further…and does it in a way that might stimulate you all to remember what medicine used to be like. While technology and medicine have advanced, I have to ask whether patient care and doctor-patient relationships have balanced that by going the other way. Read about Dr Lipmann.

Does the Power of Marketing Equate to the Stupidity of the Public?

I am an iPod user. I couldn’t wait to see DVDs in Blu-Ray format. I believed (twice…mistakenly) that German cars would be better than Japanese (I was wrong!). The majority of us are powerfully influenced by marketing. Specifically, it is impossible to miss the latest “bandwagon jumpers” from the food companies whenever yet another way to manipulate the public is made available to them. How many unhealthy foods do you see labeled as “cholesterol-free, sugar-free, fat-free, blah-blah-blah”? Ok, so a food can be cholesterol free but does that mean it’s good for you? Fat-free…whoo-hoo…balance that with “stocked with a gazillion sugar calories” and who really cares. It shouldn’t be that difficult to have a gut-level instinct around what’s good and what’s bad to put in your mouth and down to your stomach. I DO put bad stuff in there…chocolate, french fries etc. but I am not under any illusion that they might be good for me…I know they are bad…but moderation and life balance takes care of that.

Why the rant? Trans-fat. TRANS-FAT!!! Ok, so there’s science to the outcries to remove it from food. Personally I prefer butter over margarine now despite the “butter is bad for you…eat this pot of chemistry called margarine” pitches over the years. And yes..I listened and ate chemistry for a long time. It’s not the science behind trans fat that worries me …it’s the vampire marketers using it to their advantage. Look at the image below. Why the hell are they labeling 100% sugar as Trans-fat free? Don’t the public know that sugar isn’t fat? Does labeling it trans-fat free make a bag of pure sugar good for you? Whoo-hoo..grab a spoon. Come on people…wake up. Manipulation is the art of the marketer. What’s next …a bottle of water labeled as trans-fat free, sugar free, cholesterol free? Maybe it IS appropriate to label a GLASS BOTTLE of water as “plasticizer free”…take a whiff of a PLASTIC bottle of water when it’s sat in the car for a week. Let’s not be sheeple to the marketers…

sugar_and_trans_fat.png

Sign up to Receive ChemConnector Via Feedburner

I am finally getting back into blogging after a Christmas Season spent doing Wikipedia Curation and meeting grant application deadlines. So, both this blog and the ChemSpider blog are going to become a little more active. Since this is a new blogsite, if you are interested in receiving the posts by email simply fill in your email address in the box on the right that looks like this:

feedburner1.png