No wonder I’m so healthy…I like rabbits, eat lots of potatoes, drink “green sludge in a bottle” to get my vegetables and also have lots of other “ingredients” in my diet. According to the NCGC data collection, which I have been browsing through the NPC Browser on my desktop, these are all part of the collection. While I was chatting with a number of pharmaceutical scientists last week about data quality in public domain databases, the NPC Browser was used as an example of data content “to review”. If you’d like to review the contents yourself, you will find many issues with stereochemistry, valency, charge balance and incorrect associations between structures and identifiers. You can also review the data in table format, look at the entries that don’t have structures associated and scratch your head at some of the content. To see the errors, go to tabular mode as shown below and scroll down past record 8000.
Notice that the drugs “water Lily”, “Water Cress” and “Water Hemlock” are listed. I wonder what water cress is used for? It all becomes much more fun when you see some of the others listed below. Rabbit…now that’s a good one…take two in the morning, without food, and repeat dosage for 7 days. Cures “big sharp pointy teeth”. Vegetables…ah yes, nothing specific. Just “vegetables”. Kind of a cure-all really. Take 5 portions per day, with food, obviously. And potatoes…good for a stiff neck (starch collar syndrome). If you browse through you’ll also find “ingredients” listed as a compound. Glad about that really…most drugs have ingredients. I am sure there is a reason that these are listed, though I cannot imagine what it would be. If there is no good reason, it is time to decommission this dataset until it has had a major cleanup. Clearly the contents are suspect at best.