Steve D
Steve D

Reputation: 393

Fastest way to gather JS objects by date

I have my bank transaction data in the following format:

var transactions = {
    food: [
        {
            date: new Date('2016-01-09'),
            amount: 123.45
        },
        {
            date: new Date('2016-01-16'),
            amount: 87.88
        },
        {
            date: new Date('2016-01-23'),
            amount: 99.99
        },
        {
            date: new Date('2016-01-30'),
            amount: 99.99
        }
    ],
    doctor: [
        {
            date: new Date('2016-01-15'),
            amount: 1124.01
        },
        {
            date: new Date('2016-01-16'),
            amount: 656.00
        },
        {
            date: new Date('2016-01-23'),
            amount: 1000.00
        },
    ]
}

That is, an object that looks like {transaction_type: [array of transactions]}.

I would like to group these transactions by date, so in the end I get

var aligned_transactions = [
    {
        date: new Date('2016-01-09'),
        amounts: [123.45]
    },
    {
        date: new Date('2016-01-15'),
        amounts: [1124.01]
    },
    {
        date: new Date('2016-01-16'),
        amounts: [87.88, 656.00]
    },
    {
        date: new Date('2016-01-23'),
        amounts: [99.99, 1000.00]
    },
    {
        date: new Date('2016-01-30'),
        amounts: [99.99]
    }
]

So the amounts are now grouped by date. Of course, in the real setting, there are several hundred transaction types, each with an array of thousands of transactions (roughly 1 million transactions to process). What is the "fastest" way to transform the transactions in this manner? Here, fastest means total time spent doing the transformation. A jsperf result would be awesome.

Note that I have tried several approaches, and have found the "obvious" approach of nested for loops to be quite slow: about 12 seconds on my machine with 1 million total transactions (Chrome, Ubuntu). I'm guessing creating all these new objects is taking its toll.

One approach that was somewhat promising was to "vertically" slice these transaction lists, so that I get a bunch of small arrays, which I then create objects from, and recursively "merge" them together. This was pretty fast, about 6 seconds on my machine with the same 1 million transactions above. Still, I'm hoping there's a much faster way.

Edit:

Here's the nested for loop solution:

function align_data(transaction_types) {
    var i, j, transaction, transactions;
    var timestamps = {};
    for (i = 0; i < transaction_types.length; i++) {
        transactions = transaction_types[i];
        for (j = 0; j < transactions.length; j++) {
            transaction = transactions[j];
            if (timestamps[transaction.date]) {
                timestamps[transaction.date].amounts.push(transaction.amount);
            } else {
                timestamps[transaction.date] = {
                    date: transaction.date,
                    amounts: [transaction.amount]
                };
            }
        }
    }

    var aligned = [];
    for (date in timestamps) {
        if (timestamps.hasOwnProperty(date)) {
            aligned.push(timestamps[date]);
        }
    }

    return aligned;
}

Upvotes: 0

Views: 235

Answers (3)

Douglas
Douglas

Reputation: 5087

I just tested it, running your code takes about 5-6 seconds on a random set of 1 million records on my computer, while the following took maybe half a second:

function align_data(transaction_types) {
    var i, j, transaction, transactions;
    var timestamps = {};
    for (i = 0; i < transaction_types.length; i++) {
        transactions = transaction_types[i];
        for (j = 0; j < transactions.length; j++) {
            transaction = transactions[j];
            if (timestamps[transaction.date.getTime()]) {
                timestamps[transaction.date.getTime()].amounts.push(transaction.amount);
            } else {
                timestamps[transaction.date.getTime()] = {
                    date: transaction.date,
                    amounts: [transaction.amount]
                };
            }
        }
    }

    var aligned = [];
    for (date in timestamps) {
        if (timestamps.hasOwnProperty(date)) {
            aligned.push(timestamps[date]);
        }
    }

    return aligned;
}

All I changed was indexing timestamps by transaction.date.getTime() instead of transaction.date.

Upvotes: 1

jgfооt
jgfооt

Reputation: 954

How about using a single accumulator object across all your records, and then turning that into your preferred array format:

function align_data(transactions) {
    var acc = {};
    var d;
    for (var key in transactions) {
        transactions[key].forEach(function (record) { 
            d = record.date.toISOString();
            if (d in acc) {
                acc[d].push(record.amount);
            } else {
                acc[d] = [record.amount];
            }
        });
    }
    var aligned_transactions = [];
    for (var date in acc) {
        aligned_transactions.push({date: new Date(date), amounts: acc[date]});
    }
    return aligned_transactions;
}

Upvotes: 0

Denys Konoplin
Denys Konoplin

Reputation: 362

I think that it is a situation, when you really should try to use WebWorkers.

Upvotes: 0

Related Questions