Reputation: 85
My question concerns the following: I have this file:
FileA:
Peter Programmer
Frank Chemist
Charles Physicist
John Programmer
Alex Programmer
Harold Chemist
George Chemist
I have already extracted the job titles from FileA and saved them as a unique list (FileB).
FileB:
Programmer
Chemist
Physcist
(Assume FileA goes on with many more people and redundant entries.)
What I want to do now is take every job class from FileA and create a new file per job class, so that in the end I have:
FileProgrammer
Peter Programmer
John Programmer
Alex Programmer
FileChemist
Frank Chemist
Harold Chemist
George Chemist
FilePhysicist
Charles Physicist
I want to grep the job-name patterns from the Jobs file and create a new file for every job name that exists in the original file.
In reality, I have 56 unique elements in my list, and the original file has several columns (tab-delimited).
What I have done so far is this:
cut -f2 FileA | sort | uniq > Jobs
grep -f <(tr '\t' '\n' < Jobs) FileA > FileA_Jobs
I assumed a new file would be created for each new pattern match, but I realized that this just copies the file, because there is no loop or per-pattern file creation.
Since my bash experience is still limited, I hope you can help me. Thanks in advance.
Update: The actual input file looks like this:
4 23454 22110 Direct + 3245 Corrected
3 21254 12110 Indirect + 2319 Paused-@2
11 45233 54103 Direct - 1134 Not-Corrected
Essentially, I want every line whose column 7 is Corrected to go into a file named Corrected, and likewise for every other unique value of column 7.
Upvotes: 2
Views: 118
Reputation: 2111
You can do it with grep inside a loop:
for i in $(cat FileB); do grep "$i\$" FileA >> "File$i"; done
Note that in FileA of your question you wrote "Physicist" while in FileB you wrote "Physcist", so those two will not match. If you spell both of them the same way, the above command will work.
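For the updated tab-delimited input, where the class sits in column 7 (the last field of the sample lines), the same loop idea could be adapted as follows. This is a minimal sketch, assuming the data lives in a file called input.txt (hypothetical name) and that the status values contain no whitespace; the leading tab in the pattern keeps e.g. "Corrected" from also matching lines ending in "Not-Corrected":
# Build the unique list of statuses from column 7, then grep each one,
# anchored between a tab and the end of the line, into its own file.
tab=$(printf '\t')
cut -f7 input.txt | sort -u | while IFS= read -r status; do
    grep "${tab}${status}\$" input.txt > "File${status}"
done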
Upvotes: 1
Reputation: 85895
This calls for Awk; here is how you can do it:
awk '{ unique[$2] = (unique[$2] FS $1) }
     END {
         for (i in unique) {
             len = split(unique[i], temp)
             for (j = 1; j <= len; j++)
                 print temp[j], i > ("File" i ".txt")
         }
     }' file
The idea is to build a hash map with unique[$2] = (unique[$2] FS $1), which literally means: treat $2 as the index into the array unique and append the values of $1 to it. Once every line of the input file has been processed, the array looks like this:
# <key> <value(s)>
Chemist Frank Harold George
Physicist Charles
Programmer Peter John Alex
The END block is executed after all the lines have been processed. For each key of the array we use the split() function, which by default splits on whitespace, to store the accumulated value in the array temp; len holds the number of elements resulting from the split.
The outer loop runs over every hash index, the inner loop over every split element, and each value is printed together with its key into the corresponding file.
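For the updated input in the question, where the class is in column 7 of a tab-delimited file, the same idea reduces to a one-liner, because Awk can redirect each line directly to a file named after that field. A minimal sketch, assuming the data is in a file called input.txt (hypothetical name) and that there are few enough distinct values that the open-file limit is not an issue:
# Write the whole line into a file named after the value of column 7,
# e.g. FileCorrected, FilePaused-@2, FileNot-Corrected.
awk -F'\t' '{ print > ("File" $7) }' input.txt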
Upvotes: 2