For the past couple of weeks I have been looking at the NPC browser and the dataset contained within it. I am using it as an example of the type of data that is finding its way into the public domain for use by life scientists. I had the “opportunity” to take a couple of LONG flights to and from Europe last week, plus some late nights in hotels, and during the trip I finished my review of the data. This does NOT mean that I have a fully curated dataset... no chance. That would take a few weeks to assemble! However, it is enough data to insert some of the conclusions into a paper that has just returned from review, as well as to provide data for a paper presently being assembled. With that said, I’m unlikely to report much more on the data until that paper is through review.
What I can say is that the dataset does not seem to align with many of the claims made in the original paper listed below.
R. Huang, N. Southall, Y. Wang, A. Yasgar, P. Shinn, A. Jadhav, D.-T. Nguyen, C. P. Austin. The NCGC Pharmaceutical Collection: A Comprehensive Resource of Clinically Approved Drugs Enabling Repurposing and Chemical Genomics. Science Translational Medicine, 2011; 3 (80): 80ps16 DOI: 10.1126/scitranslmed.3001862
The data has hardly been curated, and many of the heuristics that were supposed to be applied to the assembly of the dataset appear to have failed, judging by what came through in the set that was issued. One of my favorite “drugs” in the screening set is shown below. I doubt Mn2+ is easily marketed as a drug, and having Mn2+ labeled as Selenium oxide, cadmium salt (1:1) seems a little strange. Having it labeled as Strontium tetraborate or barium tetraborate seems just as weird. This is one of many; many others will be discussed in a publication presently in development. Watch this space.
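For what it's worth, even a very crude consistency check would flag a record like this one. The sketch below is only an illustration of the idea, not anything taken from the NPC browser or the paper: the function names, the tiny element-name table, and the plain-text formula input are all my own assumptions. It simply asks whether a name mentions elements that do not appear in the structure's formula.

```python
import re

# Hypothetical, deliberately tiny lookup of element names to symbols.
# A real check would need the full periodic table and better name parsing.
ELEMENT_NAMES = {
    "manganese": "Mn",
    "selenium": "Se",
    "cadmium": "Cd",
    "strontium": "Sr",
    "barium": "Ba",
}


def elements_in_formula(formula):
    """Return the element symbols found in a simple formula string such as 'Mn' or 'CdSeO3'."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def elements_in_name(name):
    """Return the symbols of any listed elements mentioned in a chemical name."""
    lowered = name.lower()
    return {symbol for word, symbol in ELEMENT_NAMES.items() if word in lowered}


def name_conflicts(formula, name):
    """Return elements that the name mentions but the structure does not contain."""
    return elements_in_name(name) - elements_in_formula(formula)


if __name__ == "__main__":
    # Assumed input: the structure is a bare manganese ion, the names are those attached to it.
    for label in ["Selenium oxide, cadmium salt (1:1)",
                  "Strontium tetraborate",
                  "barium tetraborate"]:
        print(label, "->", sorted(name_conflicts("Mn", label)))
```

Any name that returns a non-empty set here is claiming atoms the structure does not have, which is exactly the situation with the Mn2+ record. A check this naive would miss plenty (trivial names, brand names, anything not in the lookup), so treat it as a first-pass filter under the stated assumptions rather than a curation method.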