Reputation: 317
I apologize if my question is stupid. I just started learning regex a few hours back.
I am trying to extract all hashtags.
Special characters are allowed in a hashtag until it reaches a space, another hashtag (#), or a newline.
Here is my current regex: \#{1}(\S|\N)
I tried changing it to \#{1}.+(\S|\N) because I assumed the .+ will allow it to keep matching until it reaches a new line or space.
======================TESTHASH========================
#3!x_j@`(/l3W#qfSnl#6R7x1b,jBb0p#Oq/:o#!tH3AITK^Yyp#B,
#qwe#%#T &#v#v#N###O###2#` `S}^&9 #M # Aa23%2##p#?#w#a
#123#9#Z a%h#&#C###;###? a#u#g#Q#r#8# #a#A#l#p#r#b#}#c
#R#M#(#p###K###l###1###b 2#D\'>.w/Y_2 sha2&2{] #4x$D~kR
#lbTb1k3# #Dlo ## #j# #W H#tjsR.Lzkc #B*xt&nFty?il#jp
#>p8BTU2###PW!aB###z###-VM (s82hdk#T 8sUJWfuy2#-#f~fh)
#d{jyi|^ofYD#q)!#special~!@$%^&*()#_+`-=[];\',./?><\":}{
======================TESTHASH========================
Upvotes: 3
Views: 518
Reputation: 16105
How about this: #[^#\s\n]+
(Is ## two hashtags of length zero, or zero hashtags? #[^#\s\n]* is equivalent to Sweeper's regex, but without the look-ahead. #[^#\s\n]+ additionally requires at least one character after the #.)
Highlighting what #[^#\s\n]+ matches in your test data seems to secretly spell out "NICE"; I wonder if this is an exercise and you're using StackOverflow to think for you? :-)
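To make the difference between the + and * variants concrete, here is a minimal Python sketch. The sample string and the variable names are made up for illustration; they are not taken from the question's test data.

```python
import re

# Hypothetical sample input, chosen to include a bare "##".
sample = "#foo#bar ## #baz!"

# '+' requires at least one character after the '#',
# so neither '#' in "##" produces a match.
non_empty = re.findall(r"#[^#\s\n]+", sample)

# '*' also accepts a bare '#', so each '#' in "##" matches on its own.
maybe_empty = re.findall(r"#[^#\s\n]*", sample)

print(non_empty)    # ['#foo', '#bar', '#baz!']
print(maybe_empty)  # ['#foo', '#bar', '#', '#', '#baz!']
```

Which of the two you want depends on whether a lone # should count as a hashtag.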
Upvotes: 3
Reputation: 21
\#[^\s\#]*(\s|\#)
Matches a # followed by any number of characters other than whitespace and #, which is then followed by a whitespace character or a #.
This should work.
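As a quick check of how this pattern behaves, here is a small Python sketch. The sample string is hypothetical, and the result may differ in other regex flavours, so treat it as an experiment and not as a definitive verdict.

```python
import re

# Hypothetical sample input for illustration.
sample = "#a#b c #d"

pattern = re.compile(r"\#[^\s\#]*(\s|\#)")

# finditer with group(0) yields the full matched text;
# findall would return only the captured (\s|\#) group.
matches = [m.group(0) for m in pattern.finditer(sample)]

print(matches)  # ['#a#']
```

On this input the trailing # is consumed as part of the first match, so #b is skipped, and #d is not matched because nothing follows it. It may be worth testing against your own data before relying on it.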
Upvotes: 0
Reputation: 271625
I made a few changes to your regex to get it to match the hashtags in your test data.
This is the regex:
\#.*?(?=\s|\n|\#|$)
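One way to try this pattern is the short Python sketch below. The sample string is made up for illustration, and the greedy variant is included only for comparison; it is not part of the suggested regex.

```python
import re

# Hypothetical sample input for illustration.
sample = "#a#b c #d"

# Lazy quantifier: stops as soon as the lookahead succeeds.
lazy = re.findall(r"\#.*?(?=\s|\n|\#|$)", sample)

# Greedy quantifier (comparison only): runs to the end of the line first.
greedy = re.findall(r"\#.*(?=\s|\n|\#|$)", sample)

print(lazy)    # ['#a', '#b', '#d']
print(greedy)  # ['#a#b c #d']
```

Because the lookahead does not consume anything, the # that ends one hashtag is still available to start the next one.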
Changes I've made:
used a lazy "zero or more" quantifier, *?. This means that it keeps matching only until (?=\s|\n|\#|$) becomes true, whereas a greedy quantifier matches all the way to the end of the line and then backtracks until (?=\s|\n|\#|$) is true.
removed {1}, which is unnecessary.
added \# and $ to the alternatives. Both mark places where, when encountered, the match should stop.
used a lookahead, so that the terminating character (for example the next #) is not consumed into the match.
Upvotes: 3