I am not sure how many of you encounter the situation of opening a file sent to you in CSV or TSV format into Excel and then seeing that the CASRN column contains some dates instead.
As an example see the set of records below in a file I am processing regarding endogenous metabolites
It is incredibly annoying, especially if you don’t check a large file for the presence of date format CASRNs, so I always check now simply by sorting. But what to do to fix it? There may be other ways but I have a simple solution that works and allows me to check and keep the file intact.
If you have a column of CAS Numbers then insert a new column adjacent to the CASRNs. Then insert a simple entry to convert the date to text format as follows: =TEXT(C2,”yyyy-mm-d”). It is obvious in the screenshot below.
in this example C2 is the cell containing the first CASRN. Then simply drag the corner of the first repaired cell to populate the column with repaired CASRNs.
In general I then do a gap analysis of what we are missing in the CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard) by copying the list of CASRNs into the batch search page at https://comptox.epa.gov/dashboard/dsstoxdb/batch_search and running a search with the CASRN identifier selected.
By downloading the Excel spreadsheet, and taking advantage of our batch search checking the CAS Checksum the output can be very informative.
It’s a simple solution…but hopefully useful.