John Ullman

Reputation: 3

Awk print results for regular expression issue

I am trying to use awk to print every word in a file that begins with a #. The code below works when the word is followed by a space, but some of the words have no space after them, and in that case it prints the rest of the line as well. Is there a way to print only the word when there is no space after it?

Code used:

gawk.exe "{ for(i=1; i<=NF; i++) if($i ~ /#[A-Z]*/) {print $i}}" "file.csv"

Sample Data:

#Realtree #HuntWithAnEDGE #RealtreeEDGE #DeerHunting",https://www.facebook.com/Realtree/photos/a.103244392286/10158628671852287/?

My results:

#Realtree
#HuntWithAnEDGE
#RealtreeEDGE
#DeerHunting",https://www.facebook.com/Realtree/photos/a.103244392286/10158628671852287/?

On the last result all I need is #DeerHunting

Upvotes: 0

Views: 33

Answers (2)

Gilles Quénot

Reputation: 185005

Keeping as close as possible to your awk:

gawk "{for(i=1; i<=NF; i++) if (match($i, /#[a-zA-Z]+/, a)) {print a[0]}}" file

But if you have gawk, you should have grep too, so:

grep -o "#[a-zA-Z]\+" file

or

grep -oP "#\w+" file  # please tell me whether grep on Windows has the -P switch

Output:

#Realtree
#HuntWithAnEDGE
#RealtreeEDGE
#DeerHunting
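
Note that the three-argument form of match() used above is a gawk extension. As a rough sketch for other awk implementations, the standard two-argument match() sets RSTART and RLENGTH, and looping over the line also picks up several hashtags that share one field. The function name extract_hashtags and the way the sample line is fed in are illustrative assumptions, and the shell quoting would need adjusting for Windows cmd:

```shell
# Print every hashtag on each input line, even when it is glued to other text.
# extract_hashtags is a hypothetical helper name; it assumes a POSIX awk on PATH.
extract_hashtags() {
    awk '{
        rest = $0
        # match() sets RSTART and RLENGTH for the leftmost-longest match
        while (match(rest, /#[A-Za-z]+/)) {
            print substr(rest, RSTART, RLENGTH)
            # continue searching after the hashtag just printed
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}

# Sample line taken from the question
sample='#Realtree #HuntWithAnEDGE #RealtreeEDGE #DeerHunting",https://www.facebook.com/Realtree/photos/a.103244392286/10158628671852287/?'
printf '%s\n' "$sample" | extract_hashtags
```

With the sample line this should print the four hashtags only, without the trailing URL. Passing a file name instead (extract_hashtags file.csv) reads from that file.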

Upvotes: 1

mevets

Reputation: 10435

^expr$ is needed to match the entire field, rather than just find a substring of the field that matches. For example, this matches any field that contains a hash (#) anywhere:

if($i ~ /#[A-Z]*/)

This matches any field that begins with a hash followed by any number of A-Z (including zero), and contains no other characters:

if($i ~ /^#[A-Z]*$/)

This matches any field that begins with a hash followed by one or more A-Z, and contains no other characters:

if($i ~ /^#[A-Z]+$/)
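
To see the difference in practice, here is a small sketch that runs an unanchored pattern, an anchored pattern with *, and an anchored pattern with + (one or more letters) against the same made-up line. The helper name show_matches, the sample line, and the use of LC_ALL=C (so that [A-Z] means ASCII uppercase only) are assumptions for illustration:

```shell
# Print the whitespace-separated fields that match a given regex.
# show_matches is a hypothetical helper name, not part of awk.
show_matches() {
    # $1 is the regex, passed as a string and used as a dynamic regex
    LC_ALL=C awk -v re="$1" '{ for (i = 1; i <= NF; i++) if ($i ~ re) print $i }'
}

# Made-up test line: five fields, the last one has a URL glued to the tag
line='#REALTREE #Realtree # plain #DEER",https://example.invalid/'

echo 'unanchored:';      printf '%s\n' "$line" | show_matches '#[A-Z]*'
echo 'anchored with *:'; printf '%s\n' "$line" | show_matches '^#[A-Z]*$'
echo 'anchored with +:'; printf '%s\n' "$line" | show_matches '^#[A-Z]+$'
```

The unanchored pattern accepts every field containing a #, the anchored * pattern accepts only #REALTREE and the bare #, and the anchored + pattern accepts only #REALTREE. Note that anchoring filters fields out rather than trimming them, so the field with the URL attached is dropped, not shortened, and [A-Z] alone rejects mixed-case tags such as #Realtree.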

Upvotes: 0
