Remove duplicates from parsed CSV array

Question

I made progress with this based on a previous question, but I have one last hurdle. I have the following data in a CSV file.

this_year   |   minus_one_year  |   minus_two_year  |   minus_three_year
-------------------------------------------------------------------------
1           |   2               |   2               |   3
-------------------------------------------------------------------------
4           |   5               |   5               |   5
-------------------------------------------------------------------------
2           |   2               |   2               |   2
-------------------------------------------------------------------------
4           |   5               |   4               |   4
-------------------------------------------------------------------------
1           |   2               |   3               |   3
-------------------------------------------------------------------------

I read this CSV file, and now I need to produce nodes. Each node holds a node field (the column heading) along with a value. In the data above, this_year has 3 distinct values (1, 2 and 4), so the nodes for this column should look like this.

{
  "nodes": [
    {
      "name": "1",
      "node": "this_year"
    },
    {
      "name": "2",
      "node": "this_year"
    },
    {
      "name": "4",
      "node": "this_year"
    }
  ]
}

The nodes for the other columns should be produced in the same way. So far I have this:

d3.csv('my_csv.csv')
    .then(function(data) {

        let graph = {"nodes" : [], "links" : []};

        graph.nodes = data.reduce(function(acc, line){
            return acc.concat(Object.entries(line).map(function(column){
                return {name: column[0], node: column[1]}
            }))}, [])
            .sort(function (n1, n2) {
                return d3.ascending(n1.name, n2.name);
            });

        console.log('nodes:', JSON.stringify(graph.nodes));

    }).catch(function(error){
        // Log the failure instead of silently swallowing it.
        console.error(error);
    });

This produces the following

[
  {
    "name": "this_year",
    "node": "1"
  },
  {
    "name": "this_year",
    "node": "4"
  },
  {
    "name": "this_year",
    "node": "2"
  },
  {
    "name": "this_year",
    "node": "4"
  },
  {
    "name": "this_year",
    "node": "1"
  }
]

So it has the correct format, but it outputs duplicates. It should contain only one entry each for 1, 2 and 4. How can I remove these duplicates? I have looked at reduceRight; is this something I can use?

Thanks

Andrew · Accepted Answer

Assuming your data is well groomed, a quick and dirty way is to combine each object's values into a string and make a Set out of those strings. Sets, by nature, cannot contain duplicates. However, a JavaScript Set only detects duplicates among primitive values (numbers, strings, etc.); objects are compared by reference, so two separate objects with identical contents both stay in the Set.

Concatenate the values, turn the resulting strings into a Set, then turn each string back into the data structure it was before.
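To illustrate why the values are joined into strings first, here is a minimal sketch of how a Set treats objects versus strings. The variable names objectSet and stringSet are only for this illustration.

```javascript
// Two separate objects with identical contents are different Set members,
// because a Set compares objects by reference, not by content.
const objectSet = new Set([
    {name: 'this_year', node: '1'},
    {name: 'this_year', node: '1'}
]);

// Strings are compared by value, so equal strings collapse into one member.
const stringSet = new Set(['this_year 1', 'this_year 1']);

console.log(objectSet.size); // 2
console.log(stringSet.size); // 1
```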

Please tell me if you find anything confusing about my syntax. I make liberal use of modern JavaScript syntax.

Edit: You can use Array#filter to choose which entries are kept in the array, for example to drop entries whose value is 'NA'.

const values = [
    {
        "name": "this_year",
        "node": "NA"
    },
    {
        "name": "this_year",
        "node": "1"
    },
    {
        "name": "this_year",
        "node": "4"
    },
    {
        "name": "this_year",
        "node": "2"
    },
    {
        "name": "this_year",
        "node": "4"
    },
    {
        "name": "this_year",
        "node": "1"
    }
]
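// Deduplicates {name, node} objects. Assumes neither field contains a space,
// because a space is used as the separator when rebuilding the objects.
function constructUniques(array) {
    const concattedStringValues = array
        .filter(({node}) => node !== 'NA')          // drop placeholder values
        .map(({name, node}) => `${name} ${node}`)   // one string per object
    // A Set keeps only one copy of each string.
    const uniqueStrings = new Set(concattedStringValues)
    // Split each unique string back into its two fields.
    return [...uniqueStrings].map(concattedVal => {
        const [name, node] = concattedVal.split(' ')
        return {name, node}
    })
}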

console.log(constructUniques(values))
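
If a column name or value might contain a space, splitting on ' ' would break. A possible variation, sketched here under some assumptions, keys a Map on JSON.stringify of each pair and works directly on the parsed rows. The helper name uniqueNodes is hypothetical, and the rows are hard-coded in the shape d3.csv yields (one object per line, every value a string) so that the sketch runs without d3.

```javascript
// Rows shaped like the objects d3.csv yields for the CSV in the question.
const rows = [
    {this_year: '1', minus_one_year: '2', minus_two_year: '2', minus_three_year: '3'},
    {this_year: '4', minus_one_year: '5', minus_two_year: '5', minus_three_year: '5'},
    {this_year: '2', minus_one_year: '2', minus_two_year: '2', minus_three_year: '2'},
    {this_year: '4', minus_one_year: '5', minus_two_year: '4', minus_three_year: '4'},
    {this_year: '1', minus_one_year: '2', minus_two_year: '3', minus_three_year: '3'}
];

// uniqueNodes is a hypothetical helper, not part of d3.
function uniqueNodes(rows) {
    const seen = new Map();
    for (const row of rows) {
        for (const [name, node] of Object.entries(row)) {
            // JSON.stringify of the pair gives an unambiguous string key,
            // even if a column name or value contains spaces.
            const key = JSON.stringify([name, node]);
            if (!seen.has(key)) {
                seen.set(key, {name, node});
            }
        }
    }
    // A Map iterates in insertion order, so first occurrences come first.
    return [...seen.values()];
}

const nodes = uniqueNodes(rows);
const thisYear = nodes.filter(n => n.name === 'this_year').map(n => n.node);
console.log(thisYear); // [ '1', '4', '2' ]
```

The result keeps the first occurrence of each column/value pair; it can be sorted afterwards in the same way as in the question's code.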
