Pradeep

Reputation: 6603

Python regular expression for words not starting and not ending with

I am trying to construct a regular expression in Python for the following rules:

  1. Accept words containing only alphabetic characters
  2. Words may contain - (hyphen)
  3. A word cannot end with a special character, e.g. : or ) (please consider these two)
  4. A word cannot start with _ (underscore) but may end with _ (underscore)

For example:

Accept Words

Hello
Hello-World
Hello_
Hello1

Reject words

_hello_
hello:
hello:)

I have come up with the following regular expression:

'(?!_)[\w-]+(?!:)'

It still accepts all of the words; it just skips the _ at the start and the : at the end.

Can somebody point out what is wrong with my regular expression? Thanks.

Upvotes: 0

Views: 3001

Answers (2)

Peter Alfvin

Reputation: 29389

There's still quite a bit of ambiguity in what you're asking for, but here's another solution for the sample set you gave, per this fiddle:

^[A-Za-z-]+[_\d]?$
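As a quick check, here is a minimal sketch that runs this pattern over the sample words from the question. The names PATTERN, accept, reject and results are chosen for this illustration only, and the sketch assumes Python 3.

```python
import re

# Pattern from the answer above, compiled once.
# PATTERN is a name chosen for this sketch.
PATTERN = re.compile(r'^[A-Za-z-]+[_\d]?$')

# Sample words taken from the question.
accept = ["Hello", "Hello-World", "Hello_", "Hello1"]
reject = ["_hello_", "hello:", "hello:)"]

# Map each word to True/False depending on whether the whole word matches.
results = {word: bool(PATTERN.match(word)) for word in accept + reject}

for word, ok in results.items():
    print(word, ok)
```

Because the pattern is anchored with ^ and $, a word either matches as a whole or not at all. Note that it allows at most one trailing underscore or digit, which covers the samples but may be stricter than what the question intends.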

Upvotes: 0

Dogbert

Reputation: 222108

You can add a leading and trailing \b.

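import re

words = ["Hello", "Hello-World", "Hello_", "Hello1", "_hello_", "hello:",
         "hello:)"]

# \b on both sides forces the match to span a whole word,
# so the lookaheads cannot be satisfied by matching only part of it.
for word in words:
    print(re.findall(r'\b(?!_)[\w-]+(?!:)\b', word))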

Output:

['Hello']
['Hello-World']
['Hello_']
['Hello1']
[]
[]
[]

From http://docs.python.org/2/library/re.html

\b Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore character.
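To see why the boundaries matter, here is a small sketch comparing the pattern from the question with the \b version on the rejected words. The variable names are chosen for this illustration, and it assumes Python 3.

```python
import re

# Pattern from the question (no boundaries) and the \b version from this answer.
unanchored = r'(?!_)[\w-]+(?!:)'
bounded = r'\b(?!_)[\w-]+(?!:)\b'

samples = ["_hello_", "hello:", "hello:)"]

# Without \b the engine is free to start after the leading underscore,
# or to backtrack one character so that the next character is no longer ':'.
unanchored_hits = {w: re.findall(unanchored, w) for w in samples}

# With \b the match must begin and end on a word boundary,
# so those partial matches are no longer possible.
bounded_hits = {w: re.findall(bounded, w) for w in samples}

for w in samples:
    print(w, unanchored_hits[w], bounded_hits[w])
```

Without the boundaries, "_hello_" yields "hello_" and "hello:" yields "hell", which is the "skipping" behaviour described in the question. If the goal is to validate a single word rather than search inside text, anchoring the whole pattern (for example with re.fullmatch) may be an alternative worth considering.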

Upvotes: 1
