Reputation: 691
I have a file like this:
Date Time Level Gridsize Miss Temp Parameter ID
1988-05-16 12:00:00 -3 1 0 27.060 -1
1988-05-16 12:00:00 -3 1 0 9.0300 -2
1988-05-16 12:00:00 -3 1 0 1.2000 -3
1988-05-17 12:00:00 -3 1 0 27.100 -1
1988-05-17 12:00:00 -3 1 0 9.0200 -2
1988-05-17 12:00:00 -3 1 0 1.2300 -3
1988-05-18 12:00:00 -3 1 0 27.190 -1
1988-05-18 12:00:00 -3 1 0 9.0400 -2
1988-05-18 12:00:00 -3 1 0 1.2200 -3
These are temperature data from sensors in different locations. The field Parameter ID
determines if the data was taken from sensor -1, -2 or -3. I'm exporting this data to a CSV file to analyze it. The desired format is:
Date Time -1 -2 -3
1988-05-16 12:00:00 27.060 9.0300 1.2000
1988-05-17 12:00:00 27.100 9.0200 1.2300
1988-05-18 12:00:00 27.190 9.0400 1.2200
It groups data by date and time, and separates in columns regarding the value of Parameter ID
.
I'm not sure about how doable this is with AWK, but I'm once I mostly use SSH to prepare this data, a solution with AWK is very interesting for me. Also, if there are other tools to achieve this on bash, I'm interested :) (sed, whatever, but got to use native linux commands because I'm not allowed to install anything on the server)
What I'm doing nowadays
Nowadays I run a code for each sensor
$ awk '$NF == "-1" {print $1";"$2";"$6}' > netcdf_extract_1.csv
$ awk '$NF == "-2" {print $1";"$2";"$6}' > netcdf_extract_2.csv
$ awk '$NF == "-3" {print $1";"$2";"$6}' > netcdf_extract_3.csv
And import them on python. Then I group them by date and time and got my table.
Upvotes: 1
Views: 299
Reputation: 518
Here is my attempt:
awk 'BEGIN{print "Date\t","Time\t", "-1\t", "-2\t", "-3\t";
PROCINFO["sorted_in"] = "@ind_str_asc" }
NR > 1{ a[$1] = a[$1]" " $6 ; b[$1] = $2;next }
END{ for ( i in a ) print i, b[i], a[i] }' file1 | column -t
Output:
Date Time -1 -2 -3
1988-05-16 12:00:00 27.060 9.0300 1.2000
1988-05-17 12:00:00 27.100 9.0200 1.2300
1988-05-18 12:00:00 27.190 9.0400 1.2200
As Roman has everything in one array I splittet the task up to two arrays.
Upvotes: 1
Reputation: 92854
GNU awk
solution:
awk 'BEGIN{ PROCINFO["sorted_in"]="@ind_str_asc"; }
NR > 1{ a[$1"\t"$2][$7]=$6 }
END{
printf "Date\tTime\t-1\t-2\t-3\n";
for (i in a) print i,a[i][-1],a[i][-2],a[i][-3]
}' file | column -t
a[$1"\t"$2][$7]=$6
:
a
- multidimensional array$1"\t"$2
- array key/index constructed by concatenation of Date/Time fields[$7]
- inner array indexed with Parameter ID field value$6
- crucial value from Temp fieldThe output:
Date Time -1 -2 -3
1988-05-16 12:00:00 27.060 9.0300 1.2000
1988-05-17 12:00:00 27.100 9.0200 1.2300
1988-05-18 12:00:00 27.190 9.0400 1.2200
Upvotes: 2