prototype

Reputation: 7970

right way to modify d3.csv to lower case column names

Have an app that processes csv data files provided by clients. JavaScript is case-sensitive, but most stats packages are case-insensitive, so the column names come in various capitalizations. Further, the capitalization can change if they, for instance, run the file through SAS, SPSS, or other programs and save it as a new file, converting everything to upper or lower case.

So I modified D3 so that it automatically converts column names to lower case. This is easily done by editing the d3.js source code to add .toLocaleLowerCase(), as below:

    ...
    dsv.parse = function(text, f) {
      var o;
      return dsv.parseRows(text, function(row, i) {
        if (o) return o(row, i - 1);
        var a = new Function("d", "return {" + row.map(function(name, i) {
          return JSON.stringify(name.toLocaleLowerCase()) + ": d[" + i + "]";
        }).join(",") + "}");
        o = f ? function(row, i) {
          return f(a(row), i);
        } : a;
      });
    };
    ...

The only small challenge is that I can't just add this as a small monkey patch or plugin in my code. This code block gets called when d3.dsv is imported, and that depends on d3_xhr being defined, which requires d3.xhr, etc. So I downloaded the entire d3.js, modified it as above, and saved it as d3.mod.js.

Can imagine this biting me or another dev in a year or two when we need to update d3. What's the right™ way to handle a mod like this?

Upvotes: 1

Views: 2193

Answers (2)

NicoWheat

Reputation: 2441

This might be a lot faster in some circumstances. It is what I normally do to prevent iterating and changing every key one by one.


let csvText = "year,make,model,length\n1997,Ford,E350,2.34\n2000,Mercury,Cougar,2.38\n"

//or - to convert your csv object into a string
// let csvText = d3.csvFormat(csvObject)

let csvRows = csvText.split("\n") //convert csv string into array of rows
let arrayOfColumnHeaders = csvRows[0].split(",") //get column names as array of strings
let newColumns = arrayOfColumnHeaders.map(column => column.toLowerCase()).join(",") //change them
csvRows[0]=newColumns //replace the headers with new ones
let newtext= csvRows.join("\n") // convert array of rows back into single string
let csvObject = d3.csvParse(newtext) //now parse it
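
If splitting the whole file into rows is more work than needed, a variant is to lower-case only the first line and leave the rest of the string untouched. This is a minimal sketch: `lowerCaseHeaderLine` is a hypothetical helper, not part of d3, and it assumes the header names contain no quoted line breaks.

```javascript
// Hypothetical helper (not part of d3): lower-cases only the header line.
// Assumption: no header name contains a quoted line break.
function lowerCaseHeaderLine(csvText) {
  var end = csvText.search(/\r?\n/); // index of the first line break, or -1
  if (end === -1) return csvText.toLowerCase(); // input is a header line only
  return csvText.slice(0, end).toLowerCase() + csvText.slice(end);
}

var sample = "Year,MAKE,Model,Length\n1997,Ford,E350,2.34\n";
var fixed = lowerCaseHeaderLine(sample);
// fixed is now "year,make,model,length\n1997,Ford,E350,2.34\n"
```

The result can then be handed to the parser as before, for example `d3.csvParse(fixed)`.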

Upvotes: 0

go-oleg

Reputation: 19480

You can use the accessor that d3.csv accepts to specify how you want the row data transformed.

The function below takes an object and converts all of its property names to lowercase. The example uses d3.csv.parse() instead of d3.csv() because it's more straightforward to demo here, but you can do the same thing with d3.csv().

Unfortunately, this function gets called for every row as opposed to once when it reads in the row headers. Perhaps there is a better way...

var string = "Year,Make,Model,Length\n" +
  "1997,Ford,E350,2.34\n" +
  "2000,Mercury,Cougar,2.38\n";

function convertPropsToLowerCase(d) {
  Object.keys(d).forEach(function(origProp) {
    var lowerCaseProp = origProp.toLocaleLowerCase();
    // if the lowercase and the original property name differ
    // save the value associated with the original prop
    // into the lowercase prop and delete the original one
    if (lowerCaseProp !== origProp) {
      d[lowerCaseProp] = d[origProp];
      delete d[origProp];
    }
  });
  return d;
}

var obj = d3.csv.parse(string, convertPropsToLowerCase);

console.log(JSON.stringify(obj,null, '\t'));
/*
[
	{
		"year": "1997",
		"make": "Ford",
		"model": "E350",
		"length": "2.34"
	},
	{
		"year": "2000",
		"make": "Mercury",
		"model": "Cougar",
		"length": "2.38"
	}
] 
*/
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.11/d3.min.js"></script>
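
One possible way to reduce the per-row cost mentioned above is to compute the lower-cased names once, on the first row, and reuse them for every later row. The sketch below is an assumption-laden illustration, not a d3 feature: `makeLowerCaseAccessor` is a hypothetical helper, and it assumes every row object has the same keys as the first one. It also builds a new object per row instead of mutating the one passed in.

```javascript
// Hypothetical helper: returns a row accessor that works out the
// lower-cased key names once (on the first row) and reuses them.
// Assumption: all rows share the same keys as the first row.
function makeLowerCaseAccessor() {
  var keys = null;  // original column names
  var lower = null; // matching lower-cased names
  return function (d) {
    if (!keys) {
      keys = Object.keys(d);
      lower = keys.map(function (k) { return k.toLocaleLowerCase(); });
    }
    var out = {};
    for (var i = 0; i < keys.length; i++) out[lower[i]] = d[keys[i]];
    return out;
  };
}

// Stand-in rows shaped like the objects an accessor receives.
var rows = [
  { Year: "1997", Make: "Ford" },
  { Year: "2000", Make: "Mercury" }
].map(makeLowerCaseAccessor());
```

With the d3 version loaded above, this could be passed as `d3.csv.parse(string, makeLowerCaseAccessor())`. Create a fresh accessor for each file, since the cached names belong to one set of headers.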

Upvotes: 3
