My favorite tool is OpenRefine.
OpenRefine all the way, baby! I’m the curator of eagle-i, an open-access RDF repository of stem cells, viruses, mice, core facilities, lab equipment and people. I use OpenRefine to facet by URI, date and label, to clean up the data, to search for anomalies, and to export in CSV, Excel or whatever format I need.
I use OpenRefine’s General Refine Expression Language (GREL) to perform complex search and replace, to transform the data, and to break up and combine strings.
OpenRefine is the gateway to my deeper understanding of eagle-i, and it has led the way to new uses of the data for assessing facility use and resource discoverability.
– Juliane Schneider, Lead Data Curator, Harvard Catalyst, based in Chicago.
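For readers who have not tried OpenRefine, the operations described above (faceting, search and replace, breaking up and combining strings) can be sketched in a few lines of Python. This is only an illustration under assumptions: it is not GREL, and it is not how OpenRefine or eagle-i work internally. The records, field names and helper functions (`facet`, `clean_label`, `split_and_join`) are all hypothetical. In OpenRefine itself, this kind of work is roughly what a text facet and GREL functions such as `replace`, `split` and `join` are used for.

```python
import re
from collections import Counter

# Hypothetical records, invented for illustration; not real eagle-i data.
records = [
    {"uri": "http://example.org/r/1", "label": "  Confocal   microscope ", "type": "Instrument"},
    {"uri": "http://example.org/r/2", "label": "Flow cytometer", "type": "Instrument"},
    {"uri": "http://example.org/r/3", "label": "C57BL/6  mouse", "type": "Mouse"},
]


def facet(rows, key):
    """Count how often each value of `key` occurs, like a simple text facet."""
    return Counter(row[key] for row in rows)


def clean_label(label):
    """Collapse runs of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", label).strip()


def split_and_join(value, sep, joiner):
    """Break a string on `sep`, trim each piece, and recombine with `joiner`."""
    return joiner.join(part.strip() for part in value.split(sep))


if __name__ == "__main__":
    print(facet(records, "type").most_common())
    print([clean_label(r["label"]) for r in records])
    print(split_and_join("mouse; stem cell ;virus", ";", " | "))
```

A facet count like this is often the quickest way to spot anomalies: a value that appears only once, or a label that differs from its neighbors only by stray whitespace, tends to stand out immediately.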
Have a favorite tool of your own? Please tell us about it!