Reputation: 841
I have a file with about 3,000 lines.
I have selected five lines of it. The content looks like:
user=bio-wangxf group=bio-jinwf etime=1556506215 start=1556506216 unique_node_count=1 end=1556524815 Exit_status=0
user=bio-wangxf group=bio-jinwf jobname=cellranger start=1556506216 end=1556555583 Exit_status=0 resources_used.cput=338425
user=maad-inspur01 group=maad-huangsd jobname=2d-1d9-4.3-1152-RK2 queue=cal-l start=1554626044 exec_host=cu017/0-23 end=1554626044
user=maad-inspur01 group=maad-huangsd jobname=testmatlab queue=cal-l ctime=1554632326 qtime=1554632326 etime=1554632326 start=1554632328 owner=maad-inspur01@ln01 exec_host=cu191/0-11 Resource_List.nodect=1 Resource_List.nodes=1:ppn=12 session=15549 unique_node_count=1 end=1554643410 Exit_status=0 resources_used.cput=7102 resources_used.mem=31315760kb resources_used.vmem=96803568kb resources_used.walltime=03:04:42
user=iese-liul group=iese-zhengchm jobname=ssh queue=fat ctime=1555483302 qtime=1555483302 etime=1555483302 start=1555489505 owner=iese-liul@ln04 exec_host=fat02/0-17,126-142 Resource_List.neednodes=1:ppn=35 Resource_List.nodect=1 Resource_List.nodes=1:ppn=35 Resource_List.walltime=72:00:00 session=31961 total_execution_slots=35 unique_node_count=1 end=1555498389 Exit_status=0 resources_used.cput=38523
Now I want to select the user, group, start, and end fields.
The correct result should be like:
user=bio-wangxf group=bio-jinwf start=1556506216 end=1556524815
user=bio-wangxf group=bio-jinwf start=1556506216 end=1556555583
user=maad-inspur01 group=maad-huangsd start=1554626044 end=1554626044
user=maad-inspur01 group=maad-huangsd start=1554632328 end=1554643410
user=iese-liul group=iese-zhengchm start=1555489505 end=1555498389
Because each row has a different number of columns, I cannot select the fields by fixed position with awk.
I have tried:
awk '{if($15~/end/) print $1" "$2" "$4" "$15; else if($18~/end/) print $1" "$2" "$8" "$18}' filename
I cannot get the correct result. Some lines are missed, because start and end are not in fixed columns.
Who can help me?
Upvotes: 2
Views: 94
Reputation: 67211
If you are OK with Perl, check the solution below. It keeps only the fields whose key is exactly user, group, start, or end, and joins them with single spaces:
perl -lane 'print join " ", grep { /^(?:user|group|start|end)=/ } @F' your_file
Upvotes: 0
Reputation: 37394
You can still use awk:
$ awk '{
for(i=1;i<=NF;i++) # loop fields
if($i~/^(user|group|start|end)=/) # look for keyword
b=b (b==""?"":OFS) $i # buffer matching field
print b # print buffer
b="" # reset and repeat
}' file
Output:
user=bio-wangxf group=bio-jinwf start=1556506216 end=1556524815
user=bio-wangxf group=bio-jinwf start=1556506216 end=1556555583
user=maad-inspur01 group=maad-huangsd start=1554626044 end=1554626044
user=maad-inspur01 group=maad-huangsd start=1554632328 end=1554643410
user=iese-liul group=iese-zhengchm start=1555489505 end=1555498389
Fields will be output in original order.
Upvotes: 4
Reputation: 26471
When you have a file with records/lines that consist of key-value pairs in the form key1=value1_FS_key2=value2_FS_key3=value3 ..., where _FS_ is a field separator (delimiter), I would generally store all key-value pairs in an array, so that the key can be used to look up the value or the object of interest. In this case that object is the complete key=value combination.
In awk this reads like:
awk '{for(i=1;i<=NF;++i) if(match($i,"=")) a[substr($i,1,RSTART-1)]=$i}
{ print a["user"],a["group"],a["start"],a["end"] }
{ delete a }' file
This method is very flexible and POSIX compliant. Modifications are easy to make; for example, a different field separator such as ";" only needs a BEGIN block:
awk 'BEGIN{FS=OFS=";"}{...}'
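As a rough illustration of that separator change, here is a minimal sketch run against a made-up semicolon-delimited record (the sample line is an assumption, not taken from the question's file). It uses split("", a) to empty the array, which is one portable alternative to delete a:

```shell
# Hypothetical sample record using ';' as the field separator.
line='user=bio-wangxf;group=bio-jinwf;jobname=cellranger;start=1556506216;end=1556555583;Exit_status=0'

out=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = ";" }
{
    # Index every key=value field by its key.
    for (i = 1; i <= NF; ++i)
        if (match($i, "="))
            a[substr($i, 1, RSTART - 1)] = $i
    # Print the wanted fields in a fixed order, joined by OFS.
    print a["user"], a["group"], a["start"], a["end"]
    # Empty the array before the next record.
    split("", a)
}')

printf '%s\n' "$out"
```

Because OFS is set together with FS, the selected fields are joined with ";" as well.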
Of course, a problem arises when you want to print a key that is not in the line. If "group" is missing from the line, the script currently prints an empty field in its place, something like:
user=bio-wangxf start=1556506216 end=1556555583
This might not be what you want; you may prefer something like:
user=bio-wangxf group=NA start=1556506216 end=1556555583
This can be done with a simple function:
awk 'function lookup(key) { return (key in a ? a[key] : key"=NA") }
{for(i=1;i<=NF;++i) if(match($i,"=")) a[substr($i,1,RSTART-1)]=$i}
{ print lookup("user"),lookup("group"),lookup("start"),lookup("end") }
{ delete a }' file
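To see the fallback in action, the sketch below feeds the same script a single made-up record that has no group field (the record is an assumption for illustration only):

```shell
# Hypothetical record with the group field missing.
line='user=bio-wangxf jobname=cellranger start=1556506216 end=1556555583 Exit_status=0'

out=$(printf '%s\n' "$line" | awk '
function lookup(key) { return (key in a ? a[key] : key "=NA") }
{
    # Index every key=value field by its key.
    for (i = 1; i <= NF; ++i)
        if (match($i, "="))
            a[substr($i, 1, RSTART - 1)] = $i
    # Missing keys are reported as key=NA instead of an empty field.
    print lookup("user"), lookup("group"), lookup("start"), lookup("end")
    delete a
}')

printf '%s\n' "$out"
```

The key in a test does not create an array entry, so a missing key stays missing for later lookups on the same record.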
Upvotes: 1
Reputation: 22012
Please try the following:
awk '
BEGIN {f["user"] = f["group"] = f["start"] = f["end"] = 1}
{for (i=1; i<=NF; i++) {
split($i, a, "=")
if (f[a[1]]) printf("%s ", $i)
}
print ""
}' filename
The one ugly point is that each output line ends with an extra trailing space.
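If the trailing space matters, one possible variation (a sketch, not the only way to do it) is to print a separator before every matching field except the first. The sep variable is introduced here for illustration, and the input is the first sample line from the question:

```shell
line='user=bio-wangxf group=bio-jinwf etime=1556506215 start=1556506216 unique_node_count=1 end=1556524815 Exit_status=0'

out=$(printf '%s\n' "$line" | awk '
BEGIN { f["user"] = f["group"] = f["start"] = f["end"] = 1 }
{
    sep = ""                          # nothing before the first match
    for (i = 1; i <= NF; i++) {
        split($i, a, "=")             # a[1] holds the key
        if (a[1] in f) {
            printf("%s%s", sep, $i)
            sep = " "                 # later matches get a leading space
        }
    }
    print ""                          # end the output line
}')

printf '%s\n' "$out"
```

Using a[1] in f instead of f[a[1]] also avoids creating an empty entry in f for every unknown key.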
Hope this helps.
Upvotes: 0