Rabbits, Potatoes and other Vegetables in the NCGC Database


No wonder I’m so healthy…I like rabbits, eat lots of potatoes, drink “green sludge in a bottle” to get my vegetables and also have lots of other “ingredients” in my diet. According to the NPC Browser on my desktop, these are all part of the NCGC data collection. While I was chatting with a number of pharmaceutical scientists last week about data quality in public domain databases, the NPC Browser came up as an example of data content “to review”. If you’d like to review the contents yourself, you will find many issues regarding stereochemistry, valency, charge balance and incorrect associations between structures and identifiers. You can also review the data in table format, look at the records that have no structures associated, and scratch your head at some of the content. To see the errors, go to tabular mode as shown below and scroll down past record 8000.

How to display the tabular format for the NCGC Data in the NPC Browser

Notice that the drugs “water Lily”, “Water Cress” and “Water Hemlock” are listed. I wonder what water cress is used for? It all becomes much more fun when you see some of the others listed below. Rabbit…now that’s a good one…take two in the morning, without food, and repeat dosage for 7 days. Cures “big sharp pointy teeth”. Vegetables…ah yes, nothing specific. Just “vegetables”. Kind of a cure-all really. Take 5 portions per day, with food, obviously. And potatoes…good for a stiff neck (starch collar syndrome). If you browse through you’ll also find “ingredients” listed as a compound. Glad about that really…most drugs have ingredients. I am sure there is a reason that these are listed, though I cannot imagine what it would be. If there is no good reason, it is time to decommission this dataset until it has had a major cleanup. Clearly the contents are suspect at best.
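For anyone who would rather not scroll past record 8000 by hand, here is a minimal sketch of the same check in Python. It assumes the table has been exported to CSV, and the column names `name` and `smiles` are hypothetical; I have not confirmed what export format or headers the NPC Browser actually offers. The sample rows are made up for illustration and are not real NCGC records.

```python
import csv
import io

def records_without_structure(rows, name_col="name", structure_col="smiles"):
    """Return the names of records whose structure field is empty or missing.

    Column names are assumptions for illustration, not the NPC Browser's own.
    """
    return [
        row[name_col].strip()
        for row in rows
        if not (row.get(structure_col) or "").strip()
    ]

# Tiny made-up sample standing in for an exported table (not real NCGC data).
sample = io.StringIO(
    "name,smiles\n"
    "Aspirin,CC(=O)OC1=CC=CC=C1C(=O)O\n"
    "Rabbit,\n"
    "Vegetables,\n"
    "Potatoes,  \n"
)

flagged = records_without_structure(csv.DictReader(sample))
print(flagged)
```

A list like this only says which records lack a structure. Deciding whether “Rabbit” belongs in a drug collection still takes a human.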

A Rabbit in the NCGC Collection - I pity the rabbit during high-throughput screening

Potatoes - the drug of choice for many McDonald's visitors. Fry-style

 

Momma was right - eat your vegetables and you'll be healthy. They are drugs?




  1. #1 by Markus Sitzmann on July 19, 2011 - 3:37 pm

    Did they maybe use FDA’s Unique Ingredient Identifier (UNII) list (http://www.cancer.gov/cancertopics/cancerlibrary/terminologyresources/fda) for the generation of the database? Sounds like it.

    Markus

  2. #2 by tony on July 19, 2011 - 9:30 pm

    Markus…that would actually make good sense based on what I am seeing in the list. For example,

    WATERCRESS
    WATERMELON
    WHEAT
    WHEAT BRAN
    WHEAT ENDOSPREM
    WHEAT GERM
    WHEAT GLUTEN
    WHEAT GLUTEN
    WHEAT MIDDLINGS
    WHEAT MIDDLINGS
    WHEAT MIDDLINGS
    WHEY
    WHITE FISH
    WHITE MUSTARD
    WHITE OAK BARK
    WHITE PEPPER
    WHITE WILLOW EXTRACT
    WILD ROSE EXTRACT
    WINE

    However, it cannot be all-encompassing, as it doesn’t list Water Lily or the “Ingredients” entry that I have seen in the collection. So I think this is only one of the sets feeding into the data.

  3. #3 by Sean Ekins on July 20, 2011 - 12:24 pm

    ...and what about “Cockroach, American”? When was that an ingredient in drugs? Folk medicine, or a delicacy in countries outside the US?
