Thiago Negri
Thiago Negri

Reputation: 5351

How to transform date-based CSV file into a spreadsheet?

I have a file like this:

100,2015-09-01
100,2015-09-02
105,2015-09-01
645,2017-01-01

It's just a number and a date separated by a comma. It has 20000 lines.

I want to transform this file into a spreadsheet (Google, Excel, LibreOffice, any format, even a CSV) that says for each number if it appears for a given date. For example:

ID  | 2015-09-01 | 2015-09-02 | 2017-01-01
----+------------+------------+-----------
100 | YES        | YES        | NO
105 | YES        | NO         | NO
645 | NO         | NO         | YES

How can I do that?

I'm on Linux, so command line utilities like sed, awk, etc are available and preferred as a way to automate this.

Upvotes: 2

Views: 64

Answers (2)

Luis Colorado
Luis Colorado

Reputation: 12668

The CSV equivalent to the data you post (to be read by excel) is:

ID,2015-09-01,2015-09-02,2017-01-01
100,YES,YES,NO
105,YES,NO,NO
645,NO,NO,YES

as you can easily test if you create that spreadsheet and export it to CSV format.

So, to be able to read it as CSV, you have first to convert your data from the format you posted to the format I post above, and then import it as CSV. This can be done with the AWK script that has been posted here also by @ThiagoNegri, so I'm not going to repeat it again.

Upvotes: 0

Thiago Negri
Thiago Negri

Reputation: 5351

Could make it work using AWK.

awk -F "," '{
    numbers[$1]
    dates[$2]
    number_date[$1,$2]
} END {
    for (number in numbers) printf ";%s", number;
    printf "\n"
    for (date in dates) {
       printf "%s", date
       for (number in numbers)
          if ((number SUBSEP date) in number_date) printf ";YES"
          else printf ";NO"
       printf "\n"
    }
}' input.csv | sort > output.csv

Upvotes: 1

Related Questions