
Your Opinion WANTED on how should the structure of Tegaserod be drawn

18 Feb

Those of you who watch this blog know that many of the discussions are about chemical structures, accurate representations in databases and how to “correctly” communicate chemical structures/compounds to users. So, this is an OPINION question…it’s not an “I have an answer” blog post.

According to DailyMed here, Tegaserod has the structure below:

It can be envisaged as having a trans orientation, but the name on DailyMed doesn’t indicate trans: “3-(5-methoxy-1H-indol-3-ylmethylene)-N-pentylcarbazimidamide”.

On Wikipedia here we see the structure below and a systematic name supporting a trans-orientation.

Now there are actually a number of ways to represent Tegaserod and, since there’s no stereochemistry to complicate the molecule, and we are interested in the skeleton per se, we can search on the first part of the InChIKey on a database like ChemSpider. A search on IKBKZGMPCYNSLU, the first part of the InChIKey for the structure, gives 3 hits. Take a look. I don’t see any real reason to show the crossbonds for the NH, but so be it.
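To make the skeleton search concrete, here is a minimal sketch in Python of filtering records on the first block of an InChIKey. It assumes the standard InChIKey layout, where the first hyphen-separated block is 14 uppercase letters hashing the connectivity. The function names and the record list are hypothetical, and the second blocks of the keys are invented placeholders, not the real keys for Tegaserod.

```python
def skeleton_block(inchikey: str) -> str:
    """Return the first block of an InChIKey (the connectivity hash)."""
    first, _, _ = inchikey.partition("-")
    # A standard InChIKey starts with a block of 14 uppercase letters.
    if len(first) != 14 or not (first.isalpha() and first.isupper()):
        raise ValueError(f"not a valid InChIKey first block: {first!r}")
    return first


def search_by_skeleton(records, skeleton):
    """Return the names of all records whose InChIKey shares the skeleton block."""
    return [name for name, key in records if skeleton_block(key) == skeleton]


# Hypothetical records: only the first block IKBKZGMPCYNSLU comes from the post;
# the second blocks are made-up placeholders for illustration.
records = [
    ("Tegaserod drawn as E", "IKBKZGMPCYNSLU-AAAAAAAASA-N"),
    ("Tegaserod drawn as Z", "IKBKZGMPCYNSLU-BBBBBBBBSA-N"),
    ("Tegaserod drawn crossed", "IKBKZGMPCYNSLU-CCCCCCCCSA-N"),
    ("Some unrelated compound", "ZZZZZZZZZZZZZZ-DDDDDDDDSA-N"),
]

hits = search_by_skeleton(records, "IKBKZGMPCYNSLU")
print(hits)
```

Because the double-bond geometry is encoded outside the first block, all three drawings land in the same result set, while the unrelated record is filtered out.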

Given that the three hits are the E-, Z- and crossbond orientations, with the InChIKeys shown below, the result set is as expected. My question: based on the structures that you see for Tegaserod, how would you prefer to see the compound drawn, and how would you expect it to be held in the database? Think about what you would expect to happen in a search. If you drew a cis form, should it retrieve cis and crossed? If you drew crossed, should it retrieve cis and trans? And so on. Remember, it’s an opinion, so no answer is wrong…

 

Posted by on February 18, 2011 in Computing, Data Quality, InChI

 

4 Responses to Your Opinion WANTED on how should the structure of Tegaserod be drawn

  1. Markus Sitzmann

    February 18, 2011 at 5:10 pm

    In my view, the ideal behavior would be: if the exact match is available, present it most prominently but give clear hints that (stereo)isomers have been found, too. If only similar structures are found (not the exact match), present those as (clearly marked) related structures. You could do this for tautomers, too (only those that are in the database – not any possible). The whole thing should be a little bit like Google’s “Did you mean …”

     
  2. Dmitry Pavlov

    February 19, 2011 at 2:31 am

    Antony,

    Indigo’s approach is that the “crossed” double bond means unspecified cis-trans. When you do a substructure match with “crossed” bond in the query, Indigo will suppose that you are not interested in the cis-trans configuration, and it will give a positive match on all the three structures shown. When you do a substructure match with cis double bond in the query, Indigo will give a positive match only on the target with the cis double bond.

    On the other hand, if you perform the exact match, then Indigo will match cis only with cis, trans only with trans, and crossed only with crossed.

    The Bingo search engine, being based on Indigo, gives results that comply with Indigo’s rules.

     
  3. Orion Jankowski

    February 20, 2011 at 8:44 pm

    In practice, chemists typically have no control over hydrazone stereochemistry. Sometimes isomers can be separated, sometimes they can’t; sometimes a single geometric isomer is stable, sometimes it isn’t. Most practicing chemists performing this search are probably NOT going to be interested in the geometric isomer that happens to be present in the database, even if it is factually consistent with the actual structure. Given this use case, I would allow any search (E/Z/undefined) to bring up the structure, but present it as the E form (or whatever form is registered). Occasionally, someone might search for Z and actually mean “only give me Z”. Some hard/soft tolerance in the query might be appropriate here (e.g. an advanced search, where it is assumed the user knows what they’re doing, would give the exact isomer specified rather than the database entry).

     
  4. Dmitry Pavlov

    February 24, 2011 at 4:26 am

    (to append to my previous comment)

    In query Molfiles, Indigo ignores cis-trans notation unless there is a “stereo care box” set on the double cis-trans bond (this rule is essentially for telling an “intentionally specified” cis-trans from “unintentionally specified”). In SMILES/SMARTS queries, the ordinary “/ \” notation is taken into account.
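
For readers who want the matching rules from Dmitry’s first comment in one place, here is a minimal sketch that models them as a small Python function. This is only a toy model of the behaviour as described in the comment, not Indigo’s or Bingo’s API. The names `Config` and `double_bond_matches` are hypothetical, the trans query case is assumed by symmetry with the cis case, and the “stereo care” rule for Molfile queries is not modelled.

```python
from enum import Enum


class Config(Enum):
    """How a double bond is drawn in a query or in a database structure."""
    CIS = "cis"
    TRANS = "trans"
    CROSSED = "crossed"  # unspecified cis-trans


def double_bond_matches(query: Config, target: Config, exact: bool) -> bool:
    """Model of the described rules for one double bond.

    exact=True  : cis matches only cis, trans only trans, crossed only crossed.
    exact=False : substructure mode; a crossed query matches every target,
                  a cis (or, by assumed symmetry, trans) query matches only
                  a target drawn the same way.
    """
    if exact:
        return query is target
    if query is Config.CROSSED:
        return True
    return query is target


# Print the full table for both modes.
for exact in (False, True):
    mode = "exact" if exact else "substructure"
    for q in Config:
        matched = [t.value for t in Config if double_bond_matches(q, t, exact)]
        print(f"{mode:12s} query={q.value:8s} matches {matched}")
```

Under this model, drawing a crossed bond in a substructure query retrieves all three Tegaserod records, while a cis query retrieves only the cis record; an exact-match search returns only the record drawn the same way as the query.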

     
