David Halliday

Reputation: 47

Grep a list of files that contain a specified string and get output with file name & desired data

I have a list of files, for example:

user1.txt
user2805927.txt
admin.txt

and each file contains some data like:

unwanted data line1
unwanted data line2
unwanted data line n

Usage · 220
other lines that I don't need

I want to get just the number "220", which is different in each file.

One problem is that the line contains the symbol · (Alt code 250), which I can't type in PuTTY.

Is there any way to get output of the form filename + data, like:

user1.txt | 220
user2805927.txt | 85
admin.txt | 18

Upvotes: 1

Views: 626

Answers (1)

tripleee

Reputation: 189357

You can grep for an arbitrary character code (with a couple of exceptions -- 0 and 255 are used internally in GNU grep).

xargs grep -o $'\xfa.*' -m 1 <filenames.txt

The Bash "C-style" string $'...' lets you use a hex character code \xfa (equivalent to decimal 250), and grep -o says to print only the match, not the whole line. With -m 1 we limit the output to the first match in each file, in case there are several. xargs says to run grep with the file names listed in filenames.txt as command-line arguments; because grep then receives more than one file, it also prints the file name in front of each match.

user1.txt:· 220
user2805927.txt:· 85
admin.txt:· 18

Post-processing this output is left as an exercise. (If you have grep -P, you can put a \K after the hex code to exclude it from the match easily.)
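One possible way to do that post-processing is sketched below. The function name format_usage is hypothetical, and the sketch assumes that no file name contains a colon and that the wanted number is the first run of digits after grep's file name separator.

```shell
# Hypothetical helper: turn grep's "file:<separator> number" lines
# into "file | number".
format_usage() {
  # LC_ALL=C makes sed treat its input as plain bytes, so the
  # non-ASCII separator byte cannot trigger an encoding error.
  # The substitution replaces the first colon and every non-digit
  # byte after it with " | ".
  LC_ALL=C sed 's/:[^0-9]*/ | /'
}
```

It would be used as the last stage of the pipeline: xargs grep -o $'\xfa.*' -m 1 <filenames.txt | format_usage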

Here's a sed variation:

xargs -n 1 -i sed -n '/^Usage [^0-9]*/!d;s//{} | /p;q' {} <filenames.txt

If the current line doesn't match the regular expression, sed deletes it and starts over with the next line. Otherwise, it replaces the match with the current file name and a separator (the empty pattern in s// reuses the previous regular expression, and xargs -i replaces {} with the file name), prints the line, and then quits processing the current file. xargs -n 1 says to run a new invocation of the sed command for each file name (though -i already implies this, so it is redundant here).
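Because xargs pastes the file name into the sed script itself, a name containing / or & would break the substitution. A rough alternative that keeps the name out of the script is an awk loop like the one below. The function name usage_report is hypothetical, and the sketch assumes the number is the last whitespace-separated field on the Usage line and that no file name contains an = sign (awk would read such an argument as a variable assignment).

```shell
# Hypothetical helper: read file names from standard input, one per
# line, and print "name | value" for the first Usage line of each file.
usage_report() {
  while IFS= read -r file; do
    # $NF is the last whitespace-separated field, assumed to be the number.
    # exit stops reading the file after the first match, like q in sed.
    # LC_ALL=C makes awk treat the separator symbol as a plain byte.
    LC_ALL=C awk '/^Usage / { print FILENAME " | " $NF; exit }' "$file"
  done
}
```

It reads the same list of names as the commands above: usage_report <filenames.txt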

Upvotes: 1
