Reputation: 65
I have a string that I need to extract values from. The problem is that the string's format is inconsistent. Here's an example script that contains the string.
import re
RAW_Data = "Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"
First_Num = re.findall(r'\((.*?)\/*', RAW_Data)
Seg_Length = re.findall(r'\/(.*?)\)', RAW_Data)
#WithinParenthesis = re.findall(r'\((.*?)\)', RAW_Data) #This works correctly
print First_Num
print Seg_Length
del RAW_Data
What I need to get out of the string are all the values within the parentheses. However, I need some logic that will handle the absence of the "/" between the numbers. Basically, if the "/" doesn't exist, make both First_Num and Seg_Length equal to "0". I hope this makes sense.
Upvotes: 1
Views: 164
Reputation: 43169
Use a simple regex and add some programming logic:
import re
rx = r'\(([^)]+)\)'
string = """Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"""
for match in re.finditer(rx, string):
    parts = match.group(1).split('/')
    First_Num = parts[0]
    try:
        Seg_Length = parts[1]
    except IndexError:
        Seg_Length = None
    print "First_Num, Seg_Length: ", First_Num, Seg_Length
You might manage with a regex-only solution (e.g. with a conditional regex), but this approach is likely to still be understandable in three months. See a demo on ideone.com.
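For comparison, here is a rough sketch of what a regex-only variant could look like. Note that it uses an optional group rather than a true conditional regex, and the pattern and the results list are my own illustration, not something taken from the demo. As in the loop above, Seg_Length comes back as None when there is no "/".

```python
import re

string = """Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"""

# Group 1: the value before the "/" (or the only value).
# Group 2: the value after the "/"; the whole "/..." part is optional.
rx = re.compile(r'\(([^)/]+)(?:/([^)]+))?\)')

results = []  # illustrative container, not part of the original answer
for match in rx.finditer(string):
    First_Num = match.group(1)
    Seg_Length = match.group(2)  # None when the "/" part did not match
    results.append((First_Num, Seg_Length))

for First_Num, Seg_Length in results:
    print("First_Num, Seg_Length: %s %s" % (First_Num, Seg_Length))
```

It is shorter, but the pattern is noticeably harder to read than the split-based loop, which is the trade-off mentioned above.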
Upvotes: 1
Reputation: 2447
You are attempting to find values on each side of a '/' that you know may not exist. Pull back to the condition that always holds for your initial search: use a regular expression with findall to capture everything within the parentheses, then process each captured value based on whether it contains a '/'.
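A minimal sketch of that idea, assuming the question's requirement that both values become "0" when there is no '/'. The helper name split_values is made up for illustration; the findall pattern is the one from the question's commented-out line.

```python
import re

RAW_Data = "Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"


def split_values(raw):
    """Return a list of (First_Num, Seg_Length) string pairs (hypothetical helper)."""
    pairs = []
    # Always-known condition: grab everything between each pair of parentheses.
    for value in re.findall(r'\((.*?)\)', raw):
        if '/' in value:
            # Split on the first "/" only.
            first_num, seg_length = value.split('/', 1)
        else:
            # No "/": per the question, both values default to "0".
            first_num, seg_length = "0", "0"
        pairs.append((first_num, seg_length))
    return pairs


pairs = split_values(RAW_Data)
First_Num = [p[0] for p in pairs]
Seg_Length = [p[1] for p in pairs]
print(First_Num)
print(Seg_Length)
```

Keeping the two lists the same length this way means First_Num[i] and Seg_Length[i] always refer to the same parenthesized entry.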
Upvotes: 0