Hashik

Reputation: 201

re.split() doesn't split after some characters

I am trying to split a string, but no matter what I do, it stops splitting after a certain number of characters. This happens not only with whitespace but with other separator characters as well. I am learning the 're' module, so could you be precise in your explanation? Thank you.

import re
String = "Integrity Home Care has an opening on our Leadership Team for a Salaried Private Care Nursing Supervisor.TOoooooo"
print(re.split(r'\s*',String,re.I|re.M))

Gives the following result: After Execution

Upvotes: 1

Views: 1496

Answers (3)

kindall

Reputation: 184270

You're passing re.I|re.M (which has the integer value 10) as the maxsplit argument, so it stops after ten splits, just as you told it to.

If you don't want to pass in a value for maxsplit, use a named argument for the flags:

re.split(r'\s*', String, flags=re.I|re.M)
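To make the difference concrete, here is a small sketch comparing the two calls. The variable names are only illustrative, and it uses \s+ rather than \s* so the result does not depend on how a given Python version treats empty matches. It also passes maxsplit by keyword, which has the same effect as the positional argument in the question.

```python
import re

text = ("Integrity Home Care has an opening on our Leadership Team "
        "for a Salaried Private Care Nursing Supervisor.TOoooooo")

# re.I is 2 and re.M is 8, so the combined flag has the integer value 10.
combined = re.I | re.M
flag_value = int(combined)

# What the original call effectively did: the flags ended up in maxsplit,
# so at most 10 splits are made and the rest of the string stays in one piece.
limited = re.split(r'\s+', text, maxsplit=flag_value)

# With flags= the value is treated as flags and every whitespace run is split.
full = re.split(r'\s+', text, flags=re.I | re.M)

print(flag_value)      # 10
print(len(limited))    # 11
print(limited[-1])     # for a Salaried Private Care Nursing Supervisor.TOoooooo
print(len(full))       # 17
```

The last element of limited is the unsplit remainder, which matches the symptom described in the question.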

Another option is not to pass the flags in as an argument but rather include them in the regular expression itself.

re.split(r'(?im)\s*',String)

I have retained the case-insensitivity flag in these examples, but your regex doesn't match any characters that could have case anyway, so you could leave it out.

Now to the regex itself. The * matches zero or more occurrences of the preceding pattern, so \s* can match the empty string at nearly every position in the input. In theory the string could be split anywhere, which is why you're getting that warning about non-empty patterns. Before Python 3.7 these empty matches are ignored; from 3.7 on, re.split() does split on them. Either way, it would be better to use +, which means one or more occurrences, in its place.
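A quick way to see the difference between the two quantifiers is to test them against a string with no leading whitespace. This is just an illustrative sketch with made-up sample strings:

```python
import re

# \s* succeeds even where there is no whitespace, by matching nothing at all.
star_match = re.match(r'\s*', "abc")
star_text = star_match.group()

# \s+ needs at least one whitespace character, so it fails here.
plus_match = re.match(r'\s+', "abc")

# With +, a run of several spaces is consumed as a single separator.
parts = re.split(r'\s+', "a  b c")

print(repr(star_text))   # ''
print(plus_match)        # None
print(parts)             # ['a', 'b', 'c']
```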

Finally I would be remiss not to mention that you might be able to get away with just using String.split(), which splits on whitespace by default, so you could potentially do away with the regular expression.
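For reference, a minimal sketch of the plain str.split() approach. One difference worth knowing if your real input has leading or trailing whitespace: str.split() drops it silently, while re.split() with \s+ leaves empty strings at the ends. The padded string is a made-up example.

```python
import re

text = ("Integrity Home Care has an opening on our Leadership Team "
        "for a Salaried Private Care Nursing Supervisor.TOoooooo")

# No arguments: split on any run of whitespace.
words = text.split()

# For this input the regex version gives the same list.
regex_words = re.split(r'\s+', text)

# The two differ when the string starts or ends with whitespace.
padded = "  a b  "
plain_result = padded.split()
regex_result = re.split(r'\s+', padded)

print(len(words))            # 17
print(words == regex_words)  # True
print(plain_result)          # ['a', 'b']
print(regex_result)          # ['', 'a', 'b', '']
```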

Upvotes: 2

Ibrahim

Reputation: 6098

You could also simply remove re.I|re.M from your code. Try this:

print(re.split(r'\s',String))

Output:

['Integrity', 'Home', 'Care', 'has', 'an', 'opening', 'on', 'our', 'Leadership', 'Team', 'for', 'a', 'Salaried', 'Private', 'Care', 'Nursing', 'Supervisor.TOoooooo']

Upvotes: 0

DYZ

Reputation: 57085

re.split(r'\s*',String,re.I|re.M) should be re.split(r'\s*',String,flags=re.I|re.M). The third positional parameter to re.split is maxsplit, the maximum number of splits, and you set it to re.I|re.M, which is 10.

Upvotes: 0
