tofu

Reputation: 125

Extracting substring in linux using expr and regex

So I have just begun learning regular expressions. I have to extract a substring within a large string.

My string is basically one huge line containing a lot of stuff. I have identified the pattern around the part I need to extract. I need the number in a line like this:

A lot of stuff<li>65,435 views</li>a lot of stuff

The number here is just an example.

This entire string is in fact one big line and my file views.txt contains a lot of such lines.

So I tried this,

while read p
do
y=`expr "$p": ".*<li>\(.*\) views "`
echo $y
done < views.txt

I want to iterate over all such lines in the views.txt file and print out the numbers.

And I get a syntax error. I really have no idea what is going wrong here. I believe that I have correctly flanked the number by <li> and views including the spaces.

My (limited) interpretation of the above regex leads me to believe that it would output the number.

Any help is appreciated.

Upvotes: 2

Views: 805

Answers (1)

Thomas Dickey

Reputation: 54583

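The syntax error occurs because the ":" is not separated from "$p" by a space (or tab), so expr receives "$p:" as a single argument and never sees the ":" operator. With that fixed, the regex still has a trailing blank after "views", which prevents it from matching because the sample text continues with "</li>" rather than a space. With both problems fixed, your sample script works as intended.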
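For illustration, here is a minimal sketch of the loop with both fixes applied. The function name extract_views is hypothetical (it is not in the original script), $( ) is used in place of backticks, and the sample line is the one from the question. It assumes lines do not start with "-" or look like an expr keyword, which expr could misinterpret.

```shell
# Hypothetical helper: read lines on stdin and print the text between
# "<li>" and " views" for each one.
extract_views() {
    while IFS= read -r p
    do
        # Fix 1: a space separates "$p" from the ":" operator.
        # Fix 2: no trailing blank after "views" in the regex.
        # expr exits non-zero when nothing matches (or the result is 0),
        # so "|| :" keeps the loop going under "set -e".
        y=$(expr "$p" : '.*<li>\(.*\) views') || :
        echo "$y"
    done
}

# Sample line taken from the question.
printf '%s\n' 'A lot of stuff<li>65,435 views</li>a lot of stuff' | extract_views
```

To process the file from the question, the same function can be fed with extract_views < views.txt. Because both .* parts are greedy, a line with several <li> items yields the text before the last " views" on that line.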

Upvotes: 5
