
How would you represent a racemate?

23 Jan

Assume that you were hosting a public domain database (say, ChemSpider). Assume that you had to represent racemates in the database. Assume that, with an abundant community willing to provide input, you get a lot of feedback about how that should be done. Assume that you have the benefit of hosting a blog and can get more input... thus this question.

Which of the three representations below would you use to represent a racemate?

 

Posted by on January 23, 2011 in Data Quality

 

7 Responses to How would you represent a racemate?

  1. Mat Todd

    January 23, 2011 at 10:39 pm

    None! For #1 you need to specify ratios to be meaningful (i.e. 1:1 for a racemate). #2 indicates stereocentres, but there is nothing to say what they are. #3 ditto with the added complication that that particular representation can mean that the stereochemistry is unknown.

    Any other possibilities? My human brain responds well to the word “rac” appearing somewhere.

     
  2. Chris

    January 24, 2011 at 4:07 am

    2 and 3 do not represent the relative stereochemistry.
    1 would be preferred but is not ideal; I guess you would also need to indicate the ratio?

     
  3. Rob Hooft

    January 28, 2011 at 6:23 pm

    I have been wondering about this, and about the associated naming for compounds, this week! Looking at the ChemSpider entry for tartaric acid, the mix-up of different stereochemistries is so large that, if I tried to curate it, I would not even know where to start cleaning it up.

    There are possibly images and/or names that represent:

    1) unknown stereochemistry at all centers
    2) Either one enantiomer or its mirror image, unknown which
    3) A mixture of mirror images, either 50/50 (racemate) or another ratio
    4) A mixture of any configuration of all centers
    5) A pure enantiomer
    6) A pure diastereomer

    And, even more complicated

    7) A database identifier referring to a database that is not clear about 1-6…..
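For anyone curating records against the cases listed above, here is a minimal sketch of how they might be tagged explicitly. The names `StereoStatus` and `StereoAnnotation` are hypothetical and are not part of ChemSpider or any other database mentioned here; the point is only that a racemate is case 3 with a stated 1:1 ratio.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class StereoStatus(Enum):
    """Hypothetical labels mirroring cases 1-7 in the list above."""
    UNKNOWN_ALL_CENTERS = 1
    ONE_ENANTIOMER_UNKNOWN_WHICH = 2
    ENANTIOMER_MIXTURE = 3
    MIXTURE_ANY_CONFIGURATION = 4
    PURE_ENANTIOMER = 5
    PURE_DIASTEREOMER = 6
    UNCLEAR_EXTERNAL_IDENTIFIER = 7


@dataclass(frozen=True)
class StereoAnnotation:
    status: StereoStatus
    # Ratio of the two mirror images; only meaningful for ENANTIOMER_MIXTURE.
    ratio: Optional[Tuple[float, float]] = None

    def is_racemate(self) -> bool:
        """True only for an enantiomer mixture with an explicit equal ratio."""
        if self.status is not StereoStatus.ENANTIOMER_MIXTURE or self.ratio is None:
            return False
        first, second = self.ratio
        return first > 0 and abs(first - second) < 1e-9
```

With this, `StereoAnnotation(StereoStatus.ENANTIOMER_MIXTURE, (1, 1))` counts as a racemate, while the same status with no ratio does not, which matches the point made in the earlier comments that the ratio has to be stated.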

     
  4. tony

    January 30, 2011 at 5:18 pm

    Rob, a search by the name tartaric acid on ChemSpider gives me one record, CSID852, that matches the structure given at Wikipedia http://en.wikipedia.org/wiki/Tartaric_acid that has no stereochemistry. All identifiers with stereochemistry such as L, D, +, – have been removed. The different stereoforms of tartaric acid will still be in the database. One of the reasons that we need to see the list of names associated with Wikipathways is so that we can create the structure file directly and reach agreement on what the individual identifiers are meant to mean.

     
  5. Markus Sitzmann

    February 9, 2011 at 5:50 pm

    There are eight ways to specify a stereocenter (well, Tony is listed as co-author):

    http://cactus.nci.nih.gov/blog/?p=679

    Besides that, I am not sure what the best representation is, but I find number 1 particularly bad (or dangerous) as it does not allow the correct calculation of many properties (starting with simple things like molecular weight, InChIKey…). Probably 3 is the best if additional information is given.

     
  6. Markus Sitzmann

    February 10, 2011 at 3:21 pm

    Thinking about this a little bit further: probably the best way is to store each diastereomer (full structure including full stereochemistry) separately in your database structure index and then link both through another database table holding all additional information (ratio, etc.). This way all properties can be calculated correctly, the structures are searchable, and if one of the diastereomers is found in a search it can be clearly linked to all racemic mixtures in which it occurs.
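The layout suggested in this comment could be sketched roughly as follows, using Python's built-in `sqlite3` module. This is only an illustration under assumptions: the table and column names (`structure`, `mixture`, `mixture_component`, `fraction`) are invented for the example and do not describe ChemSpider's actual schema, and alanine is used simply because its two enantiomers have short SMILES strings.

```python
import sqlite3

# Hypothetical schema: one row per fully specified stereoisomer, one row per
# mixture, and a link table that carries the ratio information.
SCHEMA = """
CREATE TABLE structure (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    smiles TEXT NOT NULL UNIQUE
);
CREATE TABLE mixture (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE mixture_component (
    mixture_id   INTEGER NOT NULL REFERENCES mixture(id),
    structure_id INTEGER NOT NULL REFERENCES structure(id),
    fraction     REAL NOT NULL,
    PRIMARY KEY (mixture_id, structure_id)
);
"""


def build_demo_db():
    """Create an in-memory database holding one racemate (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO structure (id, name, smiles) VALUES (?, ?, ?)",
        [
            (1, "L-alanine", "C[C@H](N)C(=O)O"),
            (2, "D-alanine", "C[C@@H](N)C(=O)O"),
        ],
    )
    conn.execute("INSERT INTO mixture (id, name) VALUES (1, 'rac-alanine')")
    conn.executemany(
        "INSERT INTO mixture_component (mixture_id, structure_id, fraction) "
        "VALUES (?, ?, ?)",
        [(1, 1, 0.5), (1, 2, 0.5)],
    )
    return conn


def mixtures_containing(conn, smiles):
    """Return (mixture name, fraction) for every mixture that lists this structure."""
    return conn.execute(
        """
        SELECT m.name, mc.fraction
        FROM structure s
        JOIN mixture_component mc ON mc.structure_id = s.id
        JOIN mixture m ON m.id = mc.mixture_id
        WHERE s.smiles = ?
        ORDER BY m.name
        """,
        (smiles,),
    ).fetchall()


def is_one_to_one(conn, mixture_id):
    """Check only that the mixture has exactly two components at 0.5 each.

    It does not verify that the two structures are mirror images.
    """
    fractions = [
        row[0]
        for row in conn.execute(
            "SELECT fraction FROM mixture_component WHERE mixture_id = ?",
            (mixture_id,),
        )
    ]
    return len(fractions) == 2 and all(abs(f - 0.5) < 1e-9 for f in fractions)
```

Because each stereoisomer is its own fully specified row, properties can be computed per structure, and a hit on either structure leads back to the mixture through `mixtures_containing`. Confirming that the two components really are enantiomers would need a cheminformatics toolkit and is left out of this sketch.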

     
  7. tony

    February 11, 2011 at 12:13 am

    Markus, thanks for the comments. I am sure you would agree it is rather a challenging problem. Wait until I put up my commentary about Symbicort…

     
