chemook78

Reputation: 1198

What does the row function do in this D3 multi-series line graph?

I am trying to make a multi-series line chart with D3 based on this example by Mike Bostock:

https://bl.ocks.org/mbostock/3884955

I am having a hard time understanding what happens after the data.tsv file is loaded, specifically in the "type" function. It seems to parse the date and convert the temperature strings to numbers. But what else does it do, and why is it necessary?

function type(d, _, columns) {
  d.date = parseTime(d.date);
  for (var i = 1, n = columns.length, c; i < n; ++i) d[c = columns[i]] = +d[c];
  return d;
}

Here is the full code:

var svg = d3.select("svg"),
    margin = {top: 20, right: 80, bottom: 30, left: 50},
    width = svg.attr("width") - margin.left - margin.right,
    height = svg.attr("height") - margin.top - margin.bottom,
    g = svg.append("g").attr("transform", "translate(" + margin.left + "," + margin.top + ")");

var parseTime = d3.timeParse("%Y%m%d");

var x = d3.scaleTime().range([0, width]),
    y = d3.scaleLinear().range([height, 0]),
    z = d3.scaleOrdinal(d3.schemeCategory10);

var line = d3.line()
    .curve(d3.curveBasis)
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.temperature); });

d3.tsv("data.tsv", type, function(error, data) {
  if (error) throw error;

  var cities = data.columns.slice(1).map(function(id) {
    return {
      id: id,
      values: data.map(function(d) {
        return {date: d.date, temperature: d[id]};
      })
    };
  });

  x.domain(d3.extent(data, function(d) { return d.date; }));

  y.domain([
    d3.min(cities, function(c) { return d3.min(c.values, function(d) { return d.temperature; }); }),
    d3.max(cities, function(c) { return d3.max(c.values, function(d) { return d.temperature; }); })
  ]);

  z.domain(cities.map(function(c) { return c.id; }));

  g.append("g")
      .attr("class", "axis axis--x")
      .attr("transform", "translate(0," + height + ")")
      .call(d3.axisBottom(x));

  g.append("g")
      .attr("class", "axis axis--y")
      .call(d3.axisLeft(y))
    .append("text")
      .attr("transform", "rotate(-90)")
      .attr("y", 6)
      .attr("dy", "0.71em")
      .attr("fill", "#000")
      .text("Temperature, ºF");

  var city = g.selectAll(".city")
    .data(cities)
    .enter().append("g")
      .attr("class", "city");

  city.append("path")
      .attr("class", "line")
      .attr("d", function(d) { return line(d.values); })
      .style("stroke", function(d) { return z(d.id); });

  city.append("text")
      .datum(function(d) { return {id: d.id, value: d.values[d.values.length - 1]}; })
      .attr("transform", function(d) { return "translate(" + x(d.value.date) + "," + y(d.value.temperature) + ")"; })
      .attr("x", 3)
      .attr("dy", "0.35em")
      .style("font", "10px sans-serif")
      .text(function(d) { return d.id; });
});

function type(d, _, columns) {
  d.date = parseTime(d.date);
  for (var i = 1, n = columns.length, c; i < n; ++i) d[c = columns[i]] = +d[c];
  return d;
}

Upvotes: 1

Views: 1450

Answers (1)

Gerardo Furtado

Reputation: 102194

That row function (here named type) is actually doing just this:

function type(d){
    d.date = parseTime(d.date);
    d["New York"] = +d["New York"];
    d["San Francisco"] = +d["San Francisco"];
    d["Austin"] = +d["Austin"];
    return d;
}

So, it basically parses the date property and coerces all the other properties to numbers, as you correctly guessed.

What else does it do?

Nothing, just that.

Why is it necessary?

Even if the values in your CSV or TSV file look like numbers, d3.csv and d3.tsv populate the row objects with strings, not numbers. They have to be converted to numbers.
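To make that concrete, here is a minimal sketch in plain JavaScript (no D3 required) of what goes wrong if the strings are left as they are. The row object is hand-written to imitate what d3.tsv passes to the row function, and the variable names are just for illustration:

```javascript
// A row as the row function receives it: every value is still a string.
// (Values copied from the demo data further down in this answer.)
var row = {date: "20111001", "New York": "63.4", "Austin": "72.2"};

// With strings, + concatenates and > compares character by character.
var concatenated = row["New York"] + row["Austin"]; // "63.472.2"
var stringCompare = "9.5" > "63.4";                 // true, because "9" sorts after "6"

// After coercion with the unary plus, the values behave as numbers.
var sum = +row["New York"] + +row["Austin"];
var numberCompare = +"9.5" > +"63.4";               // false
```

The same string-versus-number difference affects anything that compares or adds the values, such as computing the minimum and maximum for the y domain.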

The date has to be parsed as well: parseTime turns a string such as "20111001" into a Date object, which is what the time scale (d3.scaleTime) works with.
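If it helps, this is roughly what parsing with the "%Y%m%d" specifier amounts to. Note that parseYmd below is a hypothetical helper written for this illustration, not a D3 function, and it only handles this one format:

```javascript
// parseYmd is a hypothetical stand-in, NOT part of D3. It only imitates
// what d3.timeParse("%Y%m%d") does for strings like "20111001".
function parseYmd(s) {
  var m = /^(\d{4})(\d{2})(\d{2})$/.exec(s);
  if (!m) return null;                      // input in another format -> null
  return new Date(+m[1], +m[2] - 1, +m[3]); // months are zero-based in JavaScript
}

var raw = "20111001";
var when = parseYmd(raw);

// Date objects support arithmetic, which plain strings do not:
// the number of days between the two dates in the demo data.
var days = Math.round((parseYmd("20111002") - when) / 864e5);
```

In the real code you would keep using d3.timeParse; the sketch is only meant to show that the result is a Date rather than a string.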

Explanation

d3.csv and d3.tsv accept a row conversion function. That function has three defined parameters:

If a row conversion function is specified, the specified function is invoked for each row, being passed an object representing the current row (d), the index (i) starting at zero for the first non-header row, and the array of column names.

That being said, in the type function:

  • d is the object representing the current row,
  • _ is the index of that row (it is not used here, hence the placeholder name), and
  • columns is the array of column names, taken from the header row.

In your case, this is columns:

["date", "New York", "San Francisco", "Austin"]
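To see how those three arguments reach the row function, here is a heavily simplified sketch of a parser. miniParse is a hypothetical stand-in that I made up for this answer: it ignores quoting and escaping, and it is not how D3 is actually implemented. It only shows the calling convention described in the quote above:

```javascript
// miniParse is hypothetical and simplified: no quoted fields, no escaping.
function miniParse(text, delimiter, row) {
  var lines = text.trim().split("\n");
  var columns = lines[0].split(delimiter);   // header row -> array of column names
  var result = lines.slice(1).map(function(line, i) {
    var values = line.split(delimiter);
    var d = {};
    columns.forEach(function(name, j) { d[name] = values[j]; }); // all strings
    return row(d, i, columns);               // (row object, index, column names)
  });
  result.columns = columns;                  // the question's code reads data.columns
  return result;
}

// Record what the row function receives on each call.
var calls = [];
var rows = miniParse(
  "date,New York,San Francisco,Austin\n20111001,63.4,62.7,72.2\n20111002,58.0,59.9,67.7",
  ",",
  function(d, i, columns) {
    calls.push({index: i, columns: columns});
    return d;
  }
);
```

The row function is called once per data row, with the index starting at 0 for the first non-header row and the same columns array every time.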

Now, let's analyse the for loop.

The variable i starts at 1 (skipping date, which is at index 0) and ends at 3. On each iteration, it gets a property...

d[c = columns[i]]

... and coerces its value (which is a string) to a number:

d[c = columns[i]] = +d[c];

Thus, for instance, in the first iteration, when i = 1, this is what happens:

d[c = "New York"] = +d[c];

The same happens to all the other properties. Then, for the next row (object), the for loop runs again, and so on until the end of the data array.
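If the compact loop is hard to read, the same coercion can be written more explicitly. This is just an equivalent sketch, not the code from the bl.ock: coerceColumns is a name I made up, and the date parsing is left out so that it runs without D3:

```javascript
// Equivalent to the for loop in type, minus the date parsing.
function coerceColumns(d, _, columns) {
  // slice(1) skips "date"; every remaining column is coerced to a number.
  columns.slice(1).forEach(function(name) {
    d[name] = +d[name];
  });
  return d;
}

var columns = ["date", "New York", "San Francisco", "Austin"];
var converted = coerceColumns(
  {date: "20111001", "New York": "63.4", "San Francisco": "62.7", "Austin": "72.2"},
  0,
  columns
);
```

After the call, the three city properties hold numbers, while date is untouched because the loop deliberately starts after it.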

Here is a demo with only two data rows:

var parseTime = d3.timeParse("%Y%m%d");
var data = d3.csvParse(d3.select("#csv").text(), type);

function type(d, _, columns) {
  d.date = parseTime(d.date);
  for (var i = 1, n = columns.length, c; i < n; ++i) {
    d[c = columns[i]] = +d[c];
    console.log("row is " + _ + ", columns[i] is " + columns[i] + ", +d[c] is " + (+d[c]));
  }
  return d;
}
pre {
  display: none;
}
<script src="//d3js.org/d3.v4.min.js"></script>
<pre id="csv">date,New York,San Francisco,Austin
20111001,63.4,62.7,72.2
20111002,58.0,59.9,67.7</pre>

PS: I suggest you edit your question's title. The title is the most important part of the question, and 96.38% of the users (source: FakeData Inc.) only read the title. Right now, your title doesn't match your actual question. It could be "What does the row function do in this code?" or something like that.

Upvotes: 4
