Ans

Reputation: 105

Trying to split strings into multiple parts with Python

I am trying to split a string in the following manner. Here is a sample string:

"Hello this is a string.-2.34 This is an example1 string."

Please note that "" is a U+F8FF unicode character and the type of the string is Unicode.

I want to break the string as:

"Hello this is a string.","-2.34"," This is an example1 string."

I have written a regex to split the string, but with it I cannot get the numeric part that I want (-2.34 in the sample string).

My code:

import re
import os
from django.utils.encoding import smart_str, smart_unicode

text = open(r"C:\data.txt").read()
text = text.decode('utf-8')
print(smart_str(text))

pat = re.compile(u"\uf8ff-*\d+\.*\d+")
newpart = pat.split(text)
firstpart = newpart[::1]

print ("first part of the string ----")
for f in firstpart:
    f = smart_str(f)
    print ("-----")
    print f

Upvotes: 1

Views: 261

Answers (1)

unutbu

Reputation: 880897

You need to put parentheses around -*\d+\.*\d+ if you want to keep it in the result of re.split:

import re
text = u"Hello this is a string.\uf8ff-2.34 This is an example1 string."
print(re.split(u'\uf8ff(-*\d+\.*\d+)', text))

yields

[u'Hello this is a string.', u'-2.34', u' This is an example1 string.']
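For readers on Python 3, the same capturing-group idea can be sketched as below. This is only an illustration under stated assumptions: the names MARKER, NUMBER_AFTER_MARKER and split_keeping_number are made up for this example, and the number pattern is tightened to -?\d+(?:\.\d+)? on the assumption that a number has at most one minus sign and an optional fractional part. The pattern from the question, \d+\.*\d+, needs at least two digits, so it would not match a single-digit number such as 5.

```python
import re

# U+F8FF is the private-use character that marks a number in the question's data.
MARKER = "\uf8ff"

# Assumption: a number is an optional minus sign, digits, and an optional
# fractional part. Only the number sits inside the capturing group, so
# re.split keeps it in the result and drops the marker.
NUMBER_AFTER_MARKER = re.compile(MARKER + r"(-?\d+(?:\.\d+)?)")


def split_keeping_number(text):
    """Split text at each marker+number, keeping the number as its own item."""
    return NUMBER_AFTER_MARKER.split(text)


text = "Hello this is a string.\uf8ff-2.34 This is an example1 string."
parts = split_keeping_number(text)
print(parts)
```

The non-capturing group (?:...) is used for the fractional part so that re.split returns one item per number rather than an extra item for the inner group.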

Upvotes: 5
