Reputation: 139
I'm writing a small bash script that scans a list of text lines, each of which has the format:
num1 num2 num3 filename
For each line, I only want to parse out the first numerical token. This is my code:
printf "input line: %s\n" "${line}"
let number="${line//^[0-9]+/}"
printf "regexp parsed %s\n" "${number}"
Well, it does parse out the first number in the line, but also outputs an error message:
input line: 11531 1008 16 12555 310b /usr/bin/gresource
./statistics.sh: line 21: let: number=11531 1008 16 12555 310b /usr/bin/gresource: syntax error in expression (error token is "1008 16 12555 310b /usr/bin/gresource")
regexp parsed 11531
Why do I get this error message? How can I apply the regexp ^[0-9]+ on $line without getting the error?
Upvotes: 0
Views: 832
Reputation: 530920
Parameter expansions expect glob patterns, not regular expressions. Further, even if the pattern did match, your expansion would remove the number rather than capture it. What's really happening is that the expansion leaves the line unchanged, and let then evaluates the whole line as an arithmetic expression: it assigns the leading number, then reports a syntax error at the first token it cannot parse. (That is, it only "works" because the line actually starts with a number.)
Consider the following, which uses the extended pattern +([0-9]), the equivalent of the regular expression [0-9]+. Note that your regular expression, treated as a pattern, doesn't match anything.
$ echo "$line"
11531 1008 16 12555 310b /usr/bin/gresource
$ echo "${line//^[0-9]+/}"
11531 1008 16 12555 310b /usr/bin/gresource
$ shopt -s extglob
$ echo "${line/+([0-9])}"
1008 16 12555 310b /usr/bin/gresource
To capture the number instead of removing it, use a regular expression match:
[[ $line =~ [0-9]+ ]] && number=${BASH_REMATCH[0]}
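As a rough sketch of how that match might sit inside a script, the loop below wraps it in a function and anchors the expression with ^ so that only a number at the very start of the line counts. The function name first_number is hypothetical, and the input is the sample line from the question.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the leading integer of a line,
# or return non-zero if the line does not start with a digit.
first_number() {
    local line=$1
    # ^ anchors the match to the start of the line; [0-9]+ is one or more digits.
    if [[ $line =~ ^[0-9]+ ]]; then
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}

# Sample input taken from the question.
while IFS= read -r line; do
    if number=$(first_number "$line"); then
        printf 'regexp parsed %s\n' "$number"
    else
        printf 'no leading number in: %s\n' "$line" >&2
    fi
done <<'EOF'
11531 1008 16 12555 310b /usr/bin/gresource
EOF
```

Because the regular expression is left unquoted on the right-hand side of =~, bash treats it as a regular expression rather than a literal string.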
Upvotes: 1
Reputation: 8406
If the lines all have that format, use cut, since there's no need to parse for numbers:
cut -d ' ' -f 1 <<< 'num1 num2 num3 filename'
Output:
num1
For an input file do:
cut -d ' ' -f 1 inputfile.txt
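If the value is needed in a shell variable for each line rather than as a column of output, a similar field split can be done with the read builtin instead of cut. This is a different technique from the one above, shown only as a sketch: the function name first_fields is hypothetical, and the second input line is made-up sample data.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read lines from stdin and print the first
# whitespace-separated field of each one.
first_fields() {
    local number rest
    # read splits on whitespace: the first field goes to "number",
    # the remainder of the line is collected in "rest" and ignored.
    while read -r number rest; do
        printf '%s\n' "$number"
    done
}

# First line is from the question; the second is invented for illustration.
first_fields <<'EOF'
11531 1008 16 12555 310b /usr/bin/gresource
42 7 3 /etc/hosts
EOF
```

Unlike cut -d ' ', read treats runs of spaces or tabs as a single separator, so leading or repeated whitespace does not produce empty fields.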
Upvotes: 1