Archive for category Wikipedia Chemistry
Tonight I gave a presentation at the BAGIM meeting in Boston. The abstract is below, together with the embedded presentation from Slideshare.
ChemSpider – Is This The Future of Linked Chemistry on the Internet?
ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are now hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data, etc., and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of almost 25 million chemical substances, grows daily, and is integrated with over 400 sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for a linked web for chemistry and to provide access to a set of online tools and services to support access to these data.
Those who frequent the ChemConnector or ChemSpider blogs, or have them plugged into their readers, will have noticed a sudden silence from me in the New Year. It was one of those “phone calls you never want” calls. My mother was rushed into hospital with a subarachnoid hemorrhage. That is NOT a bleeding spider deep in her waters (get it? A sub arachnid hemorrhage…) but a bleed in the brain. It is a form of stroke and, as Wikipedia so nicely states (scary), “Up to half of all cases of SAH are fatal and 10–15% die before reaching a hospital”. My mother made it to hospital thanks to the valiant efforts of my sister, who had experience of exactly this medical emergency since her friend had SAH just over a year ago. By the time I got home three days later my mother was safely at the Walton Center, supposedly one of the best neurology hospitals in the UK. When I walked onto the ward for the first time, after a red-eye flight from the US and no sleep for about 30 hours, my mother was awake, but tired. She was bruised from all the lines running into her and had a drain running from her head to a bag, draining fluid from the brain to prevent hydrocephalus. They must have been collecting over half a litre of fluid every day or so. Whenever I was there the bag was full of bloody fluid and seemed to get drained regularly. It’s very concerning and emotional for any child to see their parent in such a state…
In the first couple of days she was talkative but sleepy. With SAH it’s the few days following the event that are particularly telling and dangerous. No different in this case. All hell broke loose one day as she was moved down from the normal ward to the High Dependency Unit (one nurse per two patients). We received a call and when we arrived she was unconscious and non-responsive. Within a few days she had declined and had moved to the critical care ward (1 nurse, 1 patient it seemed) as a result of the drain from her head blocking and a build-up of fluid, heart arrhythmia, low blood pressure and an infection. They made a 6 inch cut across the scalp, drilled a hole into the skull and ran a fresh drain into a ventricle of the brain. The next three days were emotionally and physically draining (3 hours a day of driving and not knowing whether she would be able to talk that day or not, or even know who we were). By the time she got back to High Dependency (who would have thought that would seem like a happy day…but after critical care it is!) she was on seven drugs and had mainlines running into her femoral artery and later the carotid artery. She was bruised and bandaged, cabled, wired and clearly in distress. At one point her eyes communicated “Enough…I can’t do this anymore” and it was one of the hardest moments of my life…but a single defining moment in the nature of my relationship with my sister and my mother…and how closely connected we are.
During that period the doctors performed endovascular surgery to insert a coil as described in detail here. My mother now has platinum in her brain and without it would likely not survive. The stress on her system would not have been conducive to her surviving a more invasive surgery. When I left the UK, after almost 3 weeks and multiple changes to my flights (and lots of charges from United Airlines!), my mother was off all drugs, sitting up, and had just drunk her first glass of water in 7 days (she was on a nose feed for food for a long time and was receiving intravenous fluids the entire time). I’ve been home almost a week and she is now eating soups, drinking hot drinks, can get out of bed and is learning to walk again…after three weeks in bed there is a lot of muscle atrophy.
And so to the National Health Service of the UK. I have heard MANY nightmare stories and experienced some myself when I lived in the UK. However, I’ve lived in Canada and now live in the US. I have nightmare stories and experiences in both countries. Those stories are for another time… What I can say is that the treatment my mother received was outstanding. Her nurses and doctors were phenomenal. They were not only skilled at their jobs but sympathetic to us at a human level, listened to us when we were concerned and educated us when we asked. The coiling procedure is not available in every hospital and is state-of-the-art surgery. Bottom line is my mother nearly died the moment the hemorrhage happened (50% of people do!) and, in my opinion, she went to the edge and back a number of times in 3 weeks. The medical staff clearly saved her life and my family and I are indebted to them for the treatment and the experience. One concern we didn’t have to deal with is “cost”. Even for the most mundane procedures in the US there is a cost concern. Having visited friends and family members in hospital I am conscious of the “how much per pill” mentality that persists here. Based on what I saw happen to my mother, and the 4 weeks of hospital stay to come and months of rehabilitation to follow, my mother’s treatment and recovery in this country would cost well over a hundred thousand dollars…probably more (maybe someone can give me an informed guess?). In the UK the National Health Service assumes those costs. There is no bill to come that we need to worry about. The focus can be on the patient, their rehabilitation and care. In this country I have sensed, and discussed with some close friends, the mentality of “what is a life worth?”. What child wants to be put in that situation?!
And so my plea to President Obama. “Please stay on task with your intentions to provide affordable health care for all families. Rich or poor, none of us wants to be faced with the challenging questions associated with the mentality of ‘What is a life worth?’ that will prevail unless health care costs are brought under control in this country. We have research investments in this country which have delivered incredible technologies to preserve life when we are threatened. We have drugs to support and enhance life when burdened by sickness and slowed by age. Yet, for many, basic healthcare remains out of reach. It is past the time for change. The majority of the populace, whether they voted for you or not, will lend their support to you to make the necessary changes. The world is watching and you can lead the change in healthcare. You have my support.”
My best friend is right in the middle of the challenges of “commercialized health care” in the United States. Jeff is a wonderful man and one of my life mentors. He is at once incredibly intelligent, thoughtful, caring, challenging and motivating. He is presently struggling with a health issue of his own and is about to enter into the challenges of dealing with the costs of excellent care, some of the (in)adequacies of the system, and going under the knife for a very scary yet incredible surgery. He has the blog American Citizen and is about to start posting videos about the challenges he is going through. Knowing Jeff they will be witty, amusing and straight to the point. Check out his blog and watch out for the movies.
I blogged previously about curating Wikipedia chemistry pages…specifically a focus on chemical structures and the quality of systematic names, trade names, structure images and outlinks to other sites. This project has moved along quite well…with a LOT of eyeballing into the early hours. I am taking a break to catch up with some other work for the next couple of weeks (at least). As it is, I have made my first pass up to the letter P (having already done X, Y and Z). There are 1100 links left for me to review – links to pages that I need to click on, open up, see if it is a structure page and then curate and validate.
I think what’s been done to date has been of value to the WP:CHEM team and to the overall quality of what’s on there. I had questioned in my own head how important and valuable the effort was. Thanks to Walkerma, who pointed out this facility today, it is clear that the chemistry pages are getting a lot of visits…over 100 per day in many cases. A report on my progress is posted online here.
It’s been a work of passion to this point. Now, the reality is it is just work. I am tired of looking at Wikipedia pages (no insult to WP but I have tired eyes). It will get finished, and I hope by the end of the month…I won’t be rushing it, since that would impact the quality, but I will be glad when it’s over.
I’ll confess that despite the lure of Christmas candy, repeats of oldie-but-goodie movies and the urge to go hack down a Xmas tree I found it difficult to stay away from my computer over the holidays. While I stayed silent in the blogosphere I probably spent more nighttime hours with my laptop than I have in the past few months. I had a conversation with Walkerma from the Wikipedia Chemistry group in December and confessed my interest in curating Wikipedia chemical structures. For those of you who read the ChemSpider blog you’ll know I have rather a passion for curation. And I’d done a significant amount of it on ChemSpider and also, of late, on Wikipedia….see the taxol and diazonamide stories.
We have recently announced our intention to roll out WiChempedia over on the ChemSpider blog. Now, before we go grab the chemistry content I wanted to make sure that we could grab “clean” data. In keeping with the structure-centric nature of the system we want to build, my first charge was to check/validate/curate the structure-name pairs on Wikipedia. Using some CSV files provided to me by Martin Walker I went to work. First of all, those CSV files were dirty…the word Ethanol shows up in some obscure places. With the assistance of a good action movie, a glass of wine, some basic text queries and removals, and some delete-delete-delete keystrokes, I had removed the majority of “no way it could be a chemical” text strings. Then, I imported the list of chemical names into a desktop chemical structure databasing tool (more on the tools in a separate post) and I went to work. There were a few little tricks to make the whole process easier but those will be detailed elsewhere. In general I could manage to check a structure in about 2 minutes. In some cases I had to redraw structures (some took a LONG time). I wandered between PubChem and ChemSpider, Chemrefer and Google looking for confirmation of structures and registry numbers.
I’ve made many edits to the Wikipedia entries already…you can see my contributions since Dec 15th online. I recently started to keep a more detailed report of the mistakes/suggestions/comments I have made on Wikipedia structures (as a result of a conversation with Walkerma). The latest report is here. Walkerma is posting a version of this online for members of WP:CHEM to comment on.
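The "basic text queries and removals" step could be sketched in Python roughly as follows. This is only an illustration of the idea, not the process I actually used: the column name `name`, the stop phrases and the length limits are all hypothetical choices, and a real list would still need eyeballing afterwards.

```python
import csv
import io
import re
from typing import List

# Hypothetical heuristics for illustration only; none of these rules
# come from the original CSV files or from the Wikipedia data.
MAX_NAME_LENGTH = 200
MAX_WORDS = 6
STOP_PHRASES = ("list of", "category:", "user:", "talk:", "template:")


def looks_like_chemical_name(text: str) -> bool:
    """Return False for strings that almost certainly are not chemical names."""
    name = text.strip()
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    lowered = name.lower()
    if any(phrase in lowered for phrase in STOP_PHRASES):
        return False
    # A name made only of digits and punctuation is not a chemical name.
    if not re.search(r"[A-Za-z]", name):
        return False
    # Long sentence-like strings are more likely prose than a name.
    if len(name.split()) > MAX_WORDS:
        return False
    return True


def filter_names(csv_text: str, column: str = "name") -> List[str]:
    """Keep plausible names from one CSV column, dropping case-insensitive duplicates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    kept = []
    for row in reader:
        raw = (row.get(column) or "").strip()
        key = raw.lower()
        if looks_like_chemical_name(raw) and key not in seen:
            seen.add(key)
            kept.append(raw)
    return kept


if __name__ == "__main__":
    sample = "name\nEthanol\nList of alcohols\n12345\nethanol\nAcetic acid\n"
    print(filter_names(sample))
```

A filter like this only removes the obvious junk. Every surviving name still has to be checked against its structure by hand.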
My overall conclusions so far…my estimate is that about 2-3% of the structure records online have errors. What’s an error?
1) The structure does not match what it “should be” based on review of many other sources.
2) Systematic nomenclature can be poor…if the name displayed on Wikipedia is converted to a structure then sometimes it is inconsistent with the actual structure displayed.
3) Sometimes the formula or mass displayed in the ChemBox is inconsistent with the actual mass or formula of the structure displayed.
4) The SMILES or InChI String associated with the structure can produce a different structure when converted.
5) The registry number matches either a different structure or a different “form” of the structure. For example, the structure shown is a neutral form of the compound but the registry number is for the salt.
There are other issues but the ones above are the most common.
It turns out that Peter Murray-Rust and his group have been doing similar work, according to his post here. I appreciate his comment “We are very grateful for this work. We are also doing similar things and we’d be delighted to coordinate”.
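A check like number 3 in the list lends itself to automation. The following is a minimal Python sketch, assuming the formula of the drawn structure has already been obtained from a cheminformatics tool and is passed in as a plain string. The function names, the small atomic weight table and the 0.05 tolerance are all my own illustrative assumptions, and formulas with parentheses, charges or hydrates are deliberately rejected rather than guessed at.

```python
import re
from collections import Counter
from typing import List

# Standard atomic weights for a few common elements (illustrative subset).
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "Na": 22.990, "P": 30.974, "S": 32.06,
    "Cl": 35.45, "K": 39.098, "Br": 79.904, "I": 126.904,
}

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Parse a simple formula such as 'C2H6O' into element counts."""
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(formula):
        if match.start() != pos:
            raise ValueError(f"Unsupported formula syntax: {formula!r}")
        symbol, digits = match.groups()
        if symbol not in ATOMIC_WEIGHTS:
            raise ValueError(f"Element not in table: {symbol!r}")
        counts[symbol] += int(digits) if digits else 1
        pos = match.end()
    if pos != len(formula) or not counts:
        raise ValueError(f"Unsupported formula syntax: {formula!r}")
    return counts


def molecular_mass(counts: Counter) -> float:
    """Average molecular mass from element counts."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in counts.items())


def check_record(displayed_formula: str, displayed_mass: float,
                 structure_formula: str, tolerance: float = 0.05) -> List[str]:
    """Compare the displayed formula and mass with the formula of the drawn structure."""
    issues = []
    shown = parse_formula(displayed_formula)
    actual = parse_formula(structure_formula)
    if shown != actual:
        issues.append("formula mismatch")
    if abs(molecular_mass(actual) - displayed_mass) > tolerance:
        issues.append("mass mismatch")
    return issues


if __name__ == "__main__":
    # Displayed data are for ethanol but the structure drawn is acetic acid.
    print(check_record("C2H6O", 46.07, "C2H4O2"))
```

The same comparison would also flag many cases of number 5, since a salt and its neutral form have different formulas and masses. Checks 1, 2 and 4 need a real structure toolkit and human judgement, so they are not attempted here.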
While this is not exactly Open Notebook Science – as I do the work of curating Wikipedia records I am keeping records, putting them up online for others to check and comment on so this is Collaborative Science through curation. This IS actually having an impact on the Wikipedia records every 24 hours at present. Not only am “I” making edits of records as I find errors but when I open the conversation with others for their comments then they make decisions and appropriate edits. You can see WP users making edits according to my comments – see here for example. I’m interested to see the similar contributions from Peter’s team.
There is expected to be an IRC chat with the WP:CHEM team in the near future and hopefully a chance to compare notes, processes and the path forward for curation. I’m looking forward to the opportunity to hear about Peter’s team’s approach to curating the data and to identify how we differ and how we can mesh our efforts. It would be good if PMR’s group could adopt an Open Notebook Science approach to Wikipedia analysis as he did recently with the NMR analysis. In that way we’ll be able to jointly track our efforts as we work together to help the Wikipedia team. (Peter, if you are reading, can you share your experiences of curating Wikipedia and what your team is observing? Can we do Collaborative Science on this project together?)