Zack

Reputation: 5148

How to get a specific column from the first row of a .csv file in bash?

I am writing a bash script that connects to a server, exports data to a .csv file and then runs a jar that uses that newly created file. The problem is, the jar requires the file name to include the value of the Timestamp column of the first row in the .csv file.

Here is the first line of my .csv file. In this case, the timestamp is 2012-11-01, located near the end of the row.

"####<Nov 1, 2012 12:00:01 AM UTC> <Warning> <AesoRMQAdapter::RabbitMQAdapter> <> <myServer> <[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'> <> <> <> <1351728001726> <BEA-000000> <DEBUG SEND MESSAGE={"Volume":55.1,"OfferedVolume":54.8,"ArmedVolume":0.0,"Status":false,"BlockNr":0,"Timestamp":"2012-11-01T00:00:01+0000"}> "

My question is as follows.

How can I, after retrieving the .csv file...

  1. Grab the first timestamp from the first row in the .csv file
  2. Use that timestamp in a filename that I'll be saving the .csv file under

I appreciate all of your help!

Upvotes: 0

Views: 2960

Answers (3)

Lev Levitsky

Reputation: 65791

For instance, with GNU grep:

ts=$(grep -Pom1 '(?<="Timestamp":")[^"]*' csv)

or with sed:

ts=$(sed -n '1s/.*"Timestamp":"\([^"]*\).*/\1/p' csv)

Then you can do

mv csv "$ts.txt"

where csv is the old name, and 2012-11-01T00:00:01+0000.txt will be the new name.
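
If only the date part is wanted in the filename (2012-11-01, as in the question), the two steps can be combined. This is a minimal sketch, not a definitive implementation: it assumes the timestamp always has a "T" between date and time, and rename_by_timestamp is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: rename a file after the date in its first line's "Timestamp" field.
# rename_by_timestamp is a hypothetical name chosen for this example.
rename_by_timestamp() {
    local file=$1 ts day
    # Extract the Timestamp value from line 1 only (same sed as above)
    ts=$(sed -n '1s/.*"Timestamp":"\([^"]*\).*/\1/p' "$file")
    # Stop if the first line has no Timestamp field
    [ -n "$ts" ] || { echo "no timestamp in $file" >&2; return 1; }
    # Drop everything from the first "T" onward, leaving only the date
    day=${ts%%T*}
    # Keep the renamed file in the same directory as the original
    mv -- "$file" "$(dirname -- "$file")/$day.csv"
}
```

Calling rename_by_timestamp export.csv would then rename the file to 2012-11-01.csv for the sample line in the question.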

Upvotes: 0

Kent

Reputation: 195059

An awk one-liner to do it in one shot:

awk -F':"' 'NR==1{split($NF,t,"T");print "mv "FILENAME" "t[1]".csv"}' file.csv

This prints the mv command line. If you want to execute it, just pipe the output to sh, like:

awk ..... |sh

Test:

kent$  cat dummy.csv 
"####<Nov 1, 2012 12:00:01 AM UTC> <Warning> <AesoRMQAdapter::RabbitMQAdapter> <> <myServer> <[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'> <> <> <> <1351728001726> <BEA-000000> <DEBUG SEND MESSAGE={"Volume":55.1,"OfferedVolume":54.8,"ArmedVolume":0.0,"Status":false,"BlockNr":0,"Timestamp":"2012-11-01T00:00:01+0000"}> "
foo;bar;blah

kent$  awk -F':"' 'NR==1{split($NF,t,"T");print "mv "FILENAME" "t[1]".csv"}' dummy.csv
mv dummy.csv 2012-11-01.csv
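
As a variation, the date can be captured in a variable and mv called directly, so that filenames containing spaces are not re-parsed by sh. This is only a sketch under the same assumption as the one-liner (the last `:"` on the first line precedes the timestamp); rename_with_awk is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# rename_with_awk is a hypothetical name used only for this sketch.
rename_with_awk() {
    local file=$1 day
    # Same field splitting as the one-liner: last field of line 1, cut at "T"
    day=$(awk -F':"' 'NR==1{split($NF,t,"T"); print t[1]; exit}' "$file")
    # Stop if nothing was extracted (for example, an empty file)
    [ -n "$day" ] || return 1
    # Quoted arguments are passed to mv as-is; new name lands in the current directory
    mv -- "$file" "$day.csv"
}
```

With the dummy.csv shown above, rename_with_awk dummy.csv would perform the same rename as the printed mv command.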

Upvotes: 0

Julien Vivenot

Reputation: 2250

Use head -n 1 to read only the first line of your input file, then grep -o to print every date-shaped match (YYYY-MM-DD) in that line, then head -n 1 again to keep only the first match.

$ date=$(head -n 1 myfile.csv | grep -o -e "[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}" | head -n 1)
$ echo "$date"
2012-11-01
$ mv myfile.csv "myfile.$date.csv"
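
The same first-match idea can also be sketched without external commands, assuming the script runs under bash (the [[ =~ ]] operator is not available in plain sh). first_date is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# first_date is a hypothetical name used only for this sketch.
first_date() {
    local line
    # Read only the first line; accept a last line without trailing newline
    IFS= read -r line < "$1" || [ -n "$line" ] || return 1
    # The regex match is leftmost, so this yields the first YYYY-MM-DD in the line
    if [[ $line =~ [0-9]{4}-[0-9]{2}-[0-9]{2} ]]; then
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}
```

It could then be used as date=$(first_date myfile.csv) && mv myfile.csv "myfile.$date.csv".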

Upvotes: 2
