Reputation: 871
I have to remove everything but 1, 2, or 3 digits (0-9, or 10-99, or 100) preceding % (I don't want to see the %, though) from another command's output and pipe it forward to another command. I know that
sed -n '/%/p'
will show only the line(s) containing %, but that's not what I want. How can I get rid of the rest of the unwanted text and leave only these numbers to then pipe them to another command?
Upvotes: 16
Views: 48316
Reputation: 10039
sed -n "/[0-9]\{1,2\}%/ s/^[^0-9]*\([0-9]\{1,2\}\)%.*/\1/p
/100%/ s/.*/100/p
"
The separate 100% rule is needed because a plain one-to-three-digit pattern would also send numbers such as 987% (or 123%, if you only pinned the first digit to 1) to the output.
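To see the effect of the two rules, here is a small sketch that runs the same script on made-up input. The sample lines and the variable names are assumptions for illustration, not part of the original question.

```shell
# Hypothetical sample input: one candidate percentage per line.
sample='cpu 7% busy
disk 42% used
battery 100% charged
bogus 987% value'

# Same two-rule script as above, fed from a pipe.
result=$(printf '%s\n' "$sample" | sed -n '/[0-9]\{1,2\}%/ s/^[^0-9]*\([0-9]\{1,2\}\)%.*/\1/p
/100%/ s/.*/100/p')

# Prints 7, 42 and 100 on separate lines; the 987% line is dropped.
printf '%s\n' "$result"
```

Note that the first rule expects the percentage to be the first run of digits on the line, so a line with an earlier number would not be handled by this sketch.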
Upvotes: 0
Reputation: 342303
Use awk instead of sed.
$ cat file
one two 100% three
10% four 1% five
$ awk '{
for(i=1;i<=NF;i++)
if ($i ~/%$/) { print $i+0} }
' file
100
10
1
For each field, check whether it ends with a % sign. If it does, print the number ($i+0 forces a numeric conversion, which drops the trailing %). Only a minimal regular expression is needed.
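The test above accepts any field that ends in %, so a field like abc% would print 0 and 987% would print 987. A possible tightening, as a sketch only: require the field to be all digits before the % and cap the value at 100. The function name extract_percent is made up for this example.

```shell
# Hypothetical helper: reads text on stdin, prints one percentage per line.
extract_percent() {
  awk '{
    # Keep a field only if it is digits plus one trailing % and is at most 100.
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+%$/ && $i + 0 <= 100)
        print $i + 0
  }'
}

# Prints 100, 10 and 1 on separate lines; abc% and 987% are skipped.
printf 'one two 100%% three\n10%% four 1%% five abc%% 987%%\n' | extract_percent
```

Because it reads stdin, it can sit in the middle of a pipeline: some_command | extract_percent | next_command.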
Upvotes: 0
Reputation: 27050
EDIT: I have misunderstood the OP and posted an invalid answer. I changed it to an answer that, I believe, would solve the problem in the more general scenario.
For a file such as the one below:
$ cat input
abc
123%
123
abc%
this is 456% and nothing more
456
Use sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p' input
$ sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p' input
123
456
The -n flag makes sed suppress its automatic output of each line. The -E flag enables extended regular expressions. (GNU sed also accepts -r for this; older GNU versions documented only -r.)
Now comes the s/// command. The group (^|.*[^0-9]) matches either the beginning of the line (^) or a series of zero or more characters (.*) ending in a non-digit character ([^0-9]).
[0-9]{1,3} matches one to three digits and is captured as the second group (by the ( and ) group delimiters); it only matches where it is preceded by (^|.*[^0-9]) and followed by %. The trailing .* then matches everything after the %, so the whole line is part of the match. We replace all of it with the second group (([0-9]{1,3})) using the backreference \2. Since we passed -n to sed, nothing would be printed by default, but we added the p flag to the s/// command, so a line is printed whenever the replacement is made. Note that this p is a flag of s///, not the p command, because it comes right after the last /.
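Since the question is about passing the numbers on to another command, here is a minimal sketch of the whole pipeline. The report function is a hypothetical stand-in for the command that produces the text, and the while loop stands in for whatever consumes the numbers.

```shell
# Hypothetical stand-in for the command whose output contains a percentage.
report() {
  printf 'Filesystem usage\n/dev/sda1 is 73%% full\nno number here\n'
}

# Keep only the 1-3 digits directly before %, then hand them to the next command.
report | sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p' | while read -r pct; do
  echo "value=$pct"
done
```

Lines without a matching percentage produce no output, so the consumer only ever sees the bare numbers.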
Upvotes: 3
Reputation: 12583
Here's my shot:
sed -e '/^[0-9]\{1,3\}%$/ bnum' -e d -e ':num' -e 's/%//'
If the line is 1-3 digits followed by a %, it removes the %-sign. Otherwise, it removes the entire line. So, for input such as
adsf
50
52%
1
12%
test%
1234%
%%%
85%
bye
It yields
52
12
85
Upvotes: 0
Reputation: 246744
If you're not completely tied to sed, this is exactly what grep -o does:
grep -o '[0-9]\{1,3\}%'
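grep -o prints the matched text including the % sign, so if the next command needs bare numbers, one option is to strip the sign with tr. This is a sketch with made-up input; the function name percent_numbers is hypothetical.

```shell
# Hypothetical helper: print each percentage found on stdin, without the % sign.
percent_numbers() {
  grep -o '[0-9]\{1,3\}%' | tr -d '%'
}

# Prints 100, 10 and 1 on separate lines.
printf 'one two 100%% three\n10%% four 1%% five\n' | percent_numbers
```

One caveat: the pattern is not anchored on the left, so an input like 1234% would yield 234.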
Upvotes: 31
Reputation: 93700
sed -e 's/[^0-9]*\([0-9]*\)%.*/\1/'
captures the digits in a group, and because the pattern matches everything (the leading [^0-9]* and the trailing .*), all of it gets discarded.
(My pattern matches any number of digits. Basic sed regular expressions have no bare [0-9]{1,3} shortcut like the one in perlre and others; the interval has to be written [0-9]\{1,3\}. I elected to keep it simple to illustrate the principle you cared about.)
Edit: fixed the quoting and replaced the leading .* with [^0-9]* so that the greedy match does not consume the numbers. Once again, this is more straightforward with perlre, where you can use a non-greedy .*?
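The greedy-match problem is easy to reproduce. The following sketch compares the two leading patterns on a made-up line; the variable names are only for illustration.

```shell
line='load 45% done'

# A greedy leading .* swallows the digits, so the group captures nothing.
greedy=$(printf '%s\n' "$line" | sed 's/.*\([0-9]*\)%.*/\1/')

# [^0-9]* stops at the first digit, leaving 45 for the group.
fixed=$(printf '%s\n' "$line" | sed 's/[^0-9]*\([0-9]*\)%.*/\1/')

# Prints: greedy=[] fixed=[45]
printf 'greedy=[%s] fixed=[%s]\n' "$greedy" "$fixed"
```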
Upvotes: 0