Reputation: 203
I have a large number of text files, each 300k+ lines long.
The files are in this general format:
Username <user> filename <file>
<some large amount of text on one line>
...
Each file follows this format strictly: one line of formatted header text, followed by one really long line, which is the meat and potatoes of the file.
What I want to do is go through the file and, for every set of lines (a set consisting of the header and the one long line), look for some matching string within the long line.
If the string is there, I want to print user and file. If not, we continue on and print nothing. And for those who will ask, the point of this exercise is just to print this out; I will do some manipulation at a later point.
I know how to do this, but it is sort of brute force: store the user and file when you detect them, and if the matching string is detected, print user and file. If not, just continue. However, this is extremely inefficient:
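#!/bin/bash
## not exact, just roughly what I am doing
# The regex lives in a variable so its spaces are not split into words inside [[ ]]
re='Username ([^ ]+) filename ([^ ]+)'
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        # store our variables
        user=${BASH_REMATCH[1]}
        file=${BASH_REMATCH[2]}
        continue
    fi
    if [[ $line =~ "string" ]]; then
        # print user and file
        printf '%s, %s\n' "$user" "$file"
    fi
done < inputfile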
Basically, is there some efficient way to detect the string I am looking for, then look back x lines (x being the number of header lines) and pull out the info I need? Thanks
PS: I'm not set on doing this in bash; perl works too.
EDIT: DESIRED OUTPUT
<user>, <file>
<user>, <file>
...
Upvotes: 2
Views: 458
Reputation: 321
Awk solution, relying on each record being two lines (and the first line of the file being the header for the first record):
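awk '
    NR % 2   { name = $2; file = $4; next }   # odd line: header, remember user and file
    /string/ { print name ", " file }         # even line: body, print if it matches
' inputfile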
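If the search string changes between runs, the same two-lines-per-record idea can be wrapped in a small shell function. This is only a sketch under a few assumptions: the function name print_matches and the variable needle are made up for illustration, the search is for a literal substring (via awk's index()) rather than a regex, and FNR is used instead of NR so that the odd/even count restarts with each input file.

```shell
# Hypothetical helper (not from the original answer): print "user, file"
# for every record whose body line contains the literal string in $1.
# Remaining arguments are the input files.
print_matches() {
    needle=$1
    shift
    # Note: awk -v interprets backslash escapes in the assigned value,
    # so a needle containing backslashes would need extra care.
    awk -v needle="$needle" '
        FNR % 2           { user = $2; file = $4; next }   # odd line: header
        index($0, needle) { print user ", " file }         # even line: body
    ' "$@"
}
```

It would be called as print_matches 'string' inputfile, or with several files at once.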
Upvotes: 1
Reputation: 246837
For really heavy text processing like this, perl is a good choice:
perl -nE '
    if ($. % 2 == 1) {
        ($user, $file) = (split " ")[1,3];
    }
    elsif (/search string/) {
        say "$user, $file";
    }
' file1 file2 ...
That can be "golfed" down to a more terse one-liner, if you like that kind of thing.
Upvotes: 1