dMb

Reputation: 9337

Add the filename to a CSV file as a column

I have a long list of CSV files that I use for data mining (new files come in daily). Each file name contains the date on which the file was created. I need to parse the date out of the file name and add it as a new column to each row in the file (updating the header line would also be nice).

So if I have a file named cx3-2016-04-01.csv with the following content:

country,os,os_ver,oem,model
CN,A,6.0,Xiaomi,MI NOTE    
US,A,6.0,LGE,LGLS7700
CN,A,6.0,Xiaomi,MI 4LTE
US,A,6.0,LGE,LGUS991
US,A,6.0,LGE,LGUS991

I want the output to look like:

date,country,os,os_ver,oem,model
2016-04-01,CN,A,6.0,Xiaomi,MI NOTE    
2016-04-01,US,A,6.0,LGE,LGLS7700
2016-04-01,CN,A,6.0,Xiaomi,MI 4LTE
2016-04-01,US,A,6.0,LGE,LGUS991
2016-04-01,US,A,6.0,LGE,LGUS991

Is this possible with standard Linux command-line tools, in a single command or command chain (but not a script), and if so, how?

Upvotes: 0

Views: 2261

Answers (1)

Srikanth Lankapalli

Reputation: 136

Try this awk command.

Run it from the directory where the file is stored, or give the file name together with its path. In the command below, only the file name (cx3-2016-04-01.csv) is given, at the end.

awk 'NR == 1 { print "date,country,os,os_ver,oem,model"; next } { d = FILENAME; sub(/^.*cx3-/, "", d); sub(/\.csv$/, "", d); print d "," $0 }' cx3-2016-04-01.csv

How it works

  1. The first line is replaced by a hard-coded header ( date,country,os,os_ver,oem,model ).

  2. For every other line, "cx3-" and ".csv" are removed from the file name, and what remains (the date) is added to the start of the line, followed by a comma.
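Since the question mentions many files arriving daily, the same steps could be wrapped so that the header is reused from each file instead of being hard-coded. The following is only a sketch under assumptions: the function name add_date_column is hypothetical, and it assumes every file is named like cx3-2016-04-01.csv, that is, a prefix ending at the first "-", then the date, then ".csv".

```shell
# Hypothetical helper (name is illustrative, not from the answer above):
# prepend the date embedded in a file name as a new first column.
add_date_column() {
  awk '
    FNR == 1 {
      # Work out the date once, on the first line of the file.
      d = FILENAME
      sub(/^.*\//, "", d)     # drop any leading directory
      sub(/^[^-]*-/, "", d)   # drop the prefix up to the first "-" (assumed "cx3-")
      sub(/\.csv$/, "", d)    # drop the extension
      print "date," $0        # reuse the existing header line
      next
    }
    { print d "," $0 }        # every data row gets the date in front
  ' "$1"
}
```

It could then be applied to each file in turn, for example: for f in cx3-*.csv; do add_date_column "$f" > "dated-$f"; done. Writing to a new file name rather than over the input avoids truncating a file while awk is still reading it.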

Here is the output the above command produces:

date,country,os,os_ver,oem,model
2016-04-01,CN,A,6.0,Xiaomi,MI NOTE
2016-04-01,US,A,6.0,LGE,LGLS7700
2016-04-01,CN,A,6.0,Xiaomi,MI 4LTE
2016-04-01,US,A,6.0,LGE,LGUS991
2016-04-01,US,A,6.0,LGE,LGUS991

Upvotes: 1
