Reputation: 77
I am trying read a CSV text file and print all entries of one column (sorted), based on a condition.
The input sample is as below:
Computer ID,User ID,M
Computer1,User3,5
Computer2,User5,8
computer3,User4,9
computer4,User10,3
computer5,User9,0
computer6,User1,11
The user-ID (2nd column) needs to be printed if the hours (third column) is greater than zero. However, the printed data should be sorted based on the user-id.
I have written the following script:
while IFS=, read -r col1 col2 col3 col4 col5 col6 col7 || [[ -n $col1 ]]
do
if [ $col3 -gt 0 ]
then
echo "$col2" > login.txt
fi
done < <(tail -n+2 user-list.txt)
The output of this script is:
User3
User5
User4
User10
User1
I am expecting the following output:
User1
User3
User4
User5
User10
Any help would be appreciated. TIA
Upvotes: 1
Views: 1094
Reputation: 37464
Using awk for condition handling and sort
for ordering:
$ awk -F, ' # comma delimiter
FNR>1 && $3 { # skip header and accept only non-zero hours
a[$2]++ # count instances for duplicates
}
END {
for(i in a) # all stored usernames
for(j=1;j<=a[i];j++) # remove this if there are no duplicates
print i | "sort -V" # send output to sort -V
}' file
Output:
User1
User3
User4
User5
User10
If there are no duplicated usernames, you can replace a[$2]++
with just a[$2]
and remove the latter for
. Also, no real need for sort
to be inside awk program, you could just as well pipe data from awk to sort
, like:
$ awk -F, 'FNR>1&&$3{a[$2]++}END{for(i in a)print i}' file | sort -V
FNR>1 && $3
skips the header and processes records where hours column is not null. If your data has records with negative hours and you only want positive hours, change it to FNR>1 && $3>0
.
Or you could use grep
with PCRE andsort
:
$ grep -Po "(?<=,).*(?=,[1-9])" file | sort -V
Upvotes: 1
Reputation: 12425
Sort ASCIIbetically:
tail -n +2 user-list.txt | perl -F',' -lane 'print if $F[2] > 0;' | sort -t, -k2,2
computer6,User1,11
computer4,User10,3
Computer1,User3,5
computer3,User4,9
Computer2,User5,8
Or sort numerically by the user number:
tail -n +2 user-list.txt | perl -F',' -lane 'print if $F[2] > 0;' | sort -t, -k2,2V
computer6,User1,11
Computer1,User3,5
computer3,User4,9
Computer2,User5,8
computer4,User10,3
Upvotes: 1
Reputation: 12887
awk -F, 'NR == 1 { next } $3 > 0 { match($2,/[[:digit:]]+/);map[$2]=substr($2,RSTART) } END { PROCINFO["sorted_in"]="@val_num_asc";for (i in map) { print i } }' user-list.txt > login.txt
Set the field delimiter to commas with -F, Ignore the header with NR == 1 { next } Set the index of an array (map) to the user when the 3rd delimited field is greater than 0. The value is set the number part of the User field (found with the match function) In the end block, set the sort order to value, number, ascending and loop through the map array created.
Upvotes: 2
Reputation: 15239
The problem with your script (and I presume with the "sorting isn't working") is the place where you redirect (and may have tried to sort) - the following variant of your own script does the job:
#!/bin/bash
while IFS=, read -r col1 col2 col3 col4 col5 col6 col7 || [[ -n $col1 ]]
do
if [ $col3 -gt 0 ]
then
echo "$col2"
fi
done < <(tail -n+2 user-list.txt) | sort > login.txt
Edit 1: Match new requirement
Sure we can fix the sorting; sort -k1.5,1.7n > login.txt
Of course, that, too, will only work if your user IDs are all 4 alphas and n digits ...
Upvotes: 1