Reputation: 124
For the record, I am a relatively new programmer.
I have the code working, but it seems clunky, and it would be slow if it had to sort through many items.
Granted, this Node app does not need to be fast (the process could take 5 minutes and that would be fine), but I was curious whether there is a better way to do it.
I have this Node app, which compares two data sets. The goals of the program are as follows
Right now this is the code:
const fs = require("fs");
const csv = require("csv-parser");
const fetch = require("node-fetch");
const results = [];
fs.createReadStream("./customers.csv")
  .pipe(csv())
  .on("data", (data) => {
    results.push(data);
  })
  .on("end", () => {
    console.log("Getting Customer Data from Waze...");
    fetch("https://gql.waveapps.com/graphql/public", {
      method: "post",
      headers: {
        //prettier-ignore
        'Authorization': "Bearer MyAuth",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query: `
          query {
            business(id: "MyBusinessId") {
              customers {
                edges {
                  node {
                    id
                    name
                  }
                }
              }
            }
          }
        `,
      }),
    })
      .then((res) => res.json())
      .then(({ data }) => {
        console.log("Filtering Data...");
        // this maps through the csv file
        results.map((csv) => {
          let array = [];
          name = "";
          data.business.customers.edges.map((customer) => {
            // push the results of the expression (true or false) to an array
            array.push(
              customer.node.name.toLowerCase() === csv.name.toLowerCase()
            );
            // push nonexistent name (if there is one) variable so error handling is clear
            if (customer.node.name.toLowerCase() !== csv.name.toLowerCase()) {
              name = csv.name;
            }
          });
          // if all elements in array are false, that means there is no matching name in the data.business.customers.edges array and error will be true, if there is a true field in the name, return false
          const error = !array.some((el) => {
            if (el) {
              return true;
            }
          });
          if (error) {
            return console.log(
              `Name: ${name} not found in Waze customer list, please check your spelling`
            );
          }
          // send http request here
        });
        console.log("Finished Sending Invoices");
      });
  });
The customers.csv file:
"name","domain","expiration-date"
"bob","yahoo.com","7/2/2020"
"suzie","google.com","12/1/2020"
Right now the GraphQL API returns data that looks like this:
[
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTg0Mzc2',
      name: 'NOInvoice'
    }
  },
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTg0MzU3',
      name: 'Suzie'
    }
  },
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTgwODkx',
      name: 'Bob'
    }
  }
]
Any help would be greatly appreciated.
Upvotes: 0
Views: 212
Reputation: 2501
Nested maps give O(n*m) time complexity, which performs poorly as the inputs grow.
First create a hashmap of the names from the API, then scan the CSV array and check each name against the hashmap to see whether it exists.
Using a hashmap is a common way to improve the performance of nested loops. The result will be closer to O(n+m) time complexity, which is significantly faster for large inputs.
// create hash of valid names from API, keyed by lower-cased name so the
// lookup stays case-insensitive like the comparison in the question
const validNames = data.business.customers.edges.reduce(
  (names, customer) => {
    // each edge wraps the customer in a `node` object
    names[customer.node.name.toLowerCase()] = customer.node; /* or = true */
    return names;
  },
  {}
);
// count the names in the csv that are not valid
const err = results.reduce(
  (err, csv) => (validNames[csv.name.toLowerCase()] ? err : err + 1),
  0
);
if (err > 0) {
  // have invalid names in CSV
}
// OR alternatively, find the invalid entries
const invalid = results.reduce(
  (invalid, csv) => {
    if (!validNames[csv.name.toLowerCase()]) invalid.push(csv);
    return invalid;
  },
  []
);
EDIT
// OR shorter version of find the invalid entries
const invalid = results.filter(csv => !validNames[csv.name.toLowerCase()]);
if (invalid.length) {
  // have invalid names in CSV
}
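As a rough, self-contained sketch of the lookup idea above, the snippet below runs against sample data shaped like the question's CSV rows and GraphQL edges. The helper names buildCustomerIndex and matchRows, the shortened ids, and the extra "alice" row are assumptions made up for illustration. It also swaps the plain object for a Map and lower-cases names on both sides, to keep the question's case-insensitive comparison.

```javascript
// Sample data shaped like the question's inputs. The ids are shortened
// placeholders and the "alice" row is a hypothetical non-matching entry.
const csvRows = [
  { name: "bob", domain: "yahoo.com", "expiration-date": "7/2/2020" },
  { name: "suzie", domain: "google.com", "expiration-date": "12/1/2020" },
  { name: "alice", domain: "example.com", "expiration-date": "1/1/2021" },
];
const edges = [
  { node: { id: "id-1", name: "NOInvoice" } },
  { node: { id: "id-2", name: "Suzie" } },
  { node: { id: "id-3", name: "Bob" } },
];

// Hypothetical helper: index customers by lower-cased name.
// One pass over the m customers returned by the API.
function buildCustomerIndex(customerEdges) {
  const index = new Map();
  for (const { node } of customerEdges) {
    index.set(node.name.toLowerCase(), node);
  }
  return index;
}

// Hypothetical helper: split CSV rows into matched and missing.
// One pass over the n CSV rows, each lookup taking roughly constant time.
function matchRows(rows, index) {
  const matched = [];
  const missing = [];
  for (const row of rows) {
    const customer = index.get(row.name.toLowerCase());
    if (customer) {
      matched.push({ row, customerId: customer.id });
    } else {
      missing.push(row.name);
    }
  }
  return { matched, missing };
}

const { matched, missing } = matchRows(csvRows, buildCustomerIndex(edges));
console.log("Matched customer ids:", matched.map((m) => m.customerId));
console.log("Names not found:", missing);
```

Storing the whole customer node in the Map, rather than just true, keeps the id at hand for the point where the invoice request would be sent for a matched row.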
Upvotes: 1
Reputation: 579
I think you're using a lot of extra variables, such as array, name and error, that you don't actually need. So this is not a performance optimization but an attempt to address the clunkiness of the code.
I'm pointing out some changes that you might consider.
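// build the list of customer names once, outside the loop
const customerNames = data.business.customers.edges.map((edge) => edge.node.name.toLowerCase())
results.forEach((csv) => {
  // case-insensitive comparison, as in the original code
  if (!customerNames.some((name) => name === csv.name.toLowerCase())) {
    console.log(`Name: ${csv.name} not found in Waze customer list, please check your spelling`)
  }
})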
Instead of:
results.map((csv) => {
  let array = []; // <-- (1)
  name = ""; // <-- (2)
  data.business.customers.edges.map((customer) => {
    // push the results of the expression (true or false) to an array
    array.push(
      customer.node.name.toLowerCase() === csv.name.toLowerCase()
    );
    // push nonexistent name (if there is one) variable so error handling is clear
    if (customer.node.name.toLowerCase() !== csv.name.toLowerCase()) {
      name = csv.name; // <-- (3)
    }
  });
  // if all elements in array are false, that means there is no matching name in the data.business.customers.edges array and error will be true, if there is a true field in the name, return false
  const error = !array.some((el) => {
    if (el) {
      return true;
    }
  }); // <-- (4)
  if (error) { // <-- (5)
    return console.log(
      `Name: ${name} not found in Waze customer list, please check your spelling`
    );
  }
  // send http request here
});
(1) array keeps the boolean values that determine whether or not csv.name was found in the data (the GraphQL response). This array is then iterated in (4). You don't really need two steps iterating two different arrays when you can find that out directly with the some function by comparing the names.
In (2) you define a variable name, and in (3) you keep updating that variable over and over with the same value, csv.name (it doesn't change, because it doesn't depend on customer at all). So I would remove that variable completely.
In (5) you only care about csv.name in the logs, so I'm doing exactly that in the shorter version.
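To make points (1) to (5) concrete, here is a small runnable sketch of the shortened check on sample data shaped like the question's. The helper name findMissingNames, the shortened ids, and the "alice" row are assumptions for illustration only. The lower-casing is carried over from the question's original comparison.

```javascript
// Hypothetical helper: returns the CSV names with no matching API customer.
// No intermediate boolean array, no extra name or error variable.
function findMissingNames(rows, customerEdges) {
  // Extract and lower-case the customer names once, not once per CSV row.
  const customerNames = customerEdges.map((edge) => edge.node.name.toLowerCase());
  return rows
    .filter((row) => !customerNames.some((name) => name === row.name.toLowerCase()))
    .map((row) => row.name);
}

// Sample inputs; the ids are placeholders and "alice" is a made-up row.
const sampleEdges = [
  { node: { id: "id-1", name: "NOInvoice" } },
  { node: { id: "id-2", name: "Suzie" } },
  { node: { id: "id-3", name: "Bob" } },
];
const sampleRows = [{ name: "bob" }, { name: "suzie" }, { name: "alice" }];

const missingNames = findMissingNames(sampleRows, sampleEdges);
for (const name of missingNames) {
  console.log(`Name: ${name} not found in customer list, please check your spelling`);
}
```

This still compares every CSV name against every customer name, so it addresses readability only, not the nested-loop cost.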
Upvotes: 1