Reputation: 47
I have a list of files, for example:
user1.txt
user2805927.txt
admin.txt
Each file contains data like this:
unwanted data line1
unwanted data line2
unwanted data line n
Usage · 220
other lines that I don't need
I want to extract just the number (220 here), which is different in each file.
One problem is that the line contains the symbol · (Alt code 250), which I can't type in PuTTY.
Is there any way to get output in the form filename + data, like this:
user1.txt | 220
user2805927.txt | 85
admin.txt | 18
Upvotes: 1
Views: 626
Reputation: 189357
You can grep for an arbitrary character code (with a couple of exceptions: 0 and 255 are used internally in GNU grep).
xargs grep -o $'\xfa.*' -m 1 <filenames.txt
The Bash "C-style" string $'...' lets you use a hex character code \xfa (equivalent to decimal 250), and grep -o says to print only the match, not the whole line. With -m 1 we stop at the first match in each file, in case there are several. xargs says to run grep with the file names listed in filenames.txt as command-line arguments; because grep then receives more than one file, it also prints the file name in front of each match.
user1.txt:· 220
user2805927.txt:· 85
admin.txt:· 18
Post-processing this output is left as an exercise. (If you have grep -P you can put a \K after the hex code to exclude it from the match easily.)
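One way to do that post-processing is sketched below. It is a minimal sketch, not the only approach: it assumes GNU grep, files in which the symbol is stored as the single byte 0xFA, and file names that contain no colon. The function name usage_report is hypothetical. The byte is built with printf instead of $'\xfa' so that the sketch also runs in a plain POSIX shell, -a keeps grep from treating the files as binary, and -H forces the file name even when only one file is given.

```shell
# usage_report: hypothetical helper; prints "file | number" for each file
# named on the command line.
usage_report() {
    # Build the single byte 0xFA (octal 372, decimal 250) portably.
    dot=$(printf '\372')
    # -a: treat the files as text, -H: always print the file name,
    # -o: print only the match, -m 1: stop after the first matching line.
    LC_ALL=C grep -a -H -o -m 1 "${dot}.*" "$@" |
        # Turn "name:<0xFA> 220" into "name | 220".
        LC_ALL=C sed 's/:[^0-9]*/ | /'
}
```

It could be called as xargs sh -c, or more simply from the current shell as usage_report user1.txt user2805927.txt admin.txt.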
Here's a sed variation:
xargs -n 1 -i sed -n '/^Usage [^0-9]*/!d;s//{} | /p;q' {} <filenames.txt
If the current line doesn't match the regular expression, sed deletes it and starts over with the next line. Otherwise, it replaces the match with the current file name followed by a separator (xargs -i replaces {} with the file name), prints the line, and then quits processing the current file. xargs -n 1 says to run a new invocation of the sed command for each file name (though -i already requires this, so it is implied anyway).
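The same logic can also be written as a shell loop, which keeps the file name out of the sed script, so names containing / or & cannot break the substitution. This is a minimal sketch under the same assumptions as the command above (each file has a line starting with "Usage " followed by non-digits and then the number); the function name usage_by_sed is hypothetical, and LC_ALL=C is an assumption that the symbol is stored as a single non-UTF-8 byte.

```shell
# usage_by_sed: hypothetical helper; reads file names (one per line) on stdin
# and prints "file | number" for each file that has a Usage line.
usage_by_sed() {
    while IFS= read -r file; do
        # On the first line starting with "Usage ", strip everything up to
        # the number, print what is left, and quit reading the file.
        value=$(LC_ALL=C sed -n '/^Usage [^0-9]*/{s///p;q;}' "$file")
        # Skip files that have no Usage line.
        if [ -n "$value" ]; then
            printf '%s | %s\n' "$file" "$value"
        fi
    done
}
```

With the file list from above, it would be run as usage_by_sed <filenames.txt.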
Upvotes: 1