Reputation: 35
I am writing an AWK script that lets the user specify fields and then counts how many times each word appears in those fields. The code already prints every word from all fields along with its count, but I want only the user-specified fields to be counted. The input will be CSV files, so I am setting FS to a comma.
Knowing that AWK treats every command-line argument as a file name, I copy the arguments into an array and then delete them from the ARGV array so that AWK does not throw an error.
#!/usr/bin/awk -f
BEGIN{ FS = ",";
    for(i = 1; i < ARGC-1; i++){
        arg[i] = ARGV[i];
        delete ARGV[i];
    }
}
{
    for(i=1; i <=NF; i++)
        words[($i)]++
}
END{
    for( i in words)
        print i, words[i];
}
So if the user inputs a CSV file such as...
A,B,D,D
Z,C,F,G
Z,A,C,D
Z,Z,C,Q
and the user wants only field 3 counted, the output should be...
C 2
D 1
F 1
Or if the user chooses fields 1 and 2...
A 2
B 1
C 1
Z 4
Upvotes: 0
Views: 75
Reputation: 133760
Could you please try the following? (I wrote this on mobile, so I couldn't test it.)
awk -v fields="1,3" '
BEGIN{
    FS=OFS=","
    num=split(fields,array,",")
    for(j=1;j<=num;j++){
        a[array[j]]
    }
}
{
    for(i=1;i<=NF;i++){
        if(i in a){
            count[$i]++
        }
    }
}
END{
    for(h in count){
        print h,count[h]
    }
}
' Input_file
I believe this should work for multiple Input_files too; if needed, you could pass several files to it.
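As a quick check of the multiple-file case, here is a small sketch that splits the question's four sample rows across two temporary files and runs the same program over both. The temporary files and the trailing sort are my own additions for illustration (awk does not guarantee any particular order for `for (h in count)`).

```shell
# Assumption: the question's sample rows, split across two temporary files.
f1=$(mktemp); f2=$(mktemp)
printf 'A,B,D,D\nZ,C,F,G\n' > "$f1"
printf 'Z,A,C,D\nZ,Z,C,Q\n' > "$f2"

result=$(awk -v fields="1,3" '
BEGIN {
    FS = OFS = ","
    num = split(fields, array, ",")
    for (j = 1; j <= num; j++) a[array[j]]   # mark the wanted field numbers
}
{
    for (i = 1; i <= NF; i++)
        if (i in a) count[$i]++              # counts accumulate across files
}
END {
    for (h in count) print h, count[h]
}' "$f1" "$f2" | sort)

echo "$result"
rm -f "$f1" "$f2"
```

Because OFS is set to a comma, each line is printed as word,count (for example `Z,3`) rather than separated by a space.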
Explanation: The following is for explanation purposes only.
-v fields="1,3"
This creates a variable named fields whose value is defined by the user. It should be a comma-separated list of field numbers; I have used 1 and 3 as an example, but you can set it to whatever you need.
BEGIN{......}
The BEGIN section sets both the field separator and the output field separator to a comma for all lines of the Input_file(s). split then breaks the fields variable on commas into an array named array, and num holds the number of elements that split produced. A for loop runs from 1 to num and creates an array named a whose indexes are the values stored in array, that is, the requested field numbers.
MAIN section: a for loop traverses all the fields of each line and checks whether the current field number is an index of array a, which was created in the BEGIN section. If it is, the field's value is used as an index into an array named count and its count is incremented, which is what the OP requires.
Finally, the END section traverses the array count and prints each index with its count, separated by OFS (a comma here).
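If the field numbers should come from the command line, as in the question, the same lookup-array idea can be combined with the ARGV handling the OP already has. This is only a sketch under a few assumptions of mine: every argument except the last is a field number, the last argument is the CSV file, and the names want and count are made up for illustration. The sort at the end is there only to make the output order predictable.

```shell
# Assumption: the question's sample data, written to a temporary file.
csv=$(mktemp)
printf 'A,B,D,D\nZ,C,F,G\nZ,A,C,D\nZ,Z,C,Q\n' > "$csv"

result=$(awk '
BEGIN {
    FS = ","
    # Every argument except the last one is taken to be a field number.
    for (i = 1; i < ARGC - 1; i++) {
        want[ARGV[i]]        # remember the field number as an array index
        delete ARGV[i]       # so awk does not try to open it as a file
    }
}
{
    for (i = 1; i <= NF; i++)
        if (i in want) count[$i]++   # count only the requested fields
}
END {
    for (w in count) print w, count[w]
}' 1 3 "$csv" | sort)

echo "$result"
rm -f "$csv"
```

Here the field numbers 1 and 3 are passed as plain arguments before the file name, so no -v variable is needed.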
Upvotes: 2
Reputation: 37464
Another:
$ awk -F, -v p="1,2" '{ # field numbers in a comma-separated var
    split(p,f) # split the field numbers into array f (uses FS, a comma)
    for(i in f) # for the given fields
        c[$f[i]]++ # count the values in them
}
END { # in the end
    for(i in c)
        print i,c[i] # output values and counts
}' file
Output for fields 1 and 2:
A 2
B 1
C 1
Z 4
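One caveat: awk does not guarantee the order in which `for(i in c)` visits the array, so the lines may come out in a different order than shown. A minimal sketch of the same approach piped through sort, with the split moved into BEGIN so it runs only once (both changes are my own variation, and the temporary file stands in for file):

```shell
# Assumption: the question's sample data, written to a temporary file.
csv=$(mktemp)
printf 'A,B,D,D\nZ,C,F,G\nZ,A,C,D\nZ,Z,C,Q\n' > "$csv"

sorted=$(awk -F, -v p="1,2" '
BEGIN { n = split(p, f, ",") }      # split once, before any input is read
{
    for (i = 1; i <= n; i++)
        c[$f[i]]++                  # count the value in each requested field
}
END {
    for (w in c) print w, c[w]
}' "$csv" | sort)

echo "$sorted"
rm -f "$csv"
```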
Upvotes: 2