wanderors

Reputation: 2292

Extracting values from a string in Python

I'm working on a bot application and need to extract values from a message string and assign them to variables. The message string can take different forms, such as:

message = 'name="Raj",lastname="Paul",gender="male", age=23'
message = 'name="Raj",lastname="Paul",age=23'
message = 'name="Raj",lastname="Paul",gender="male"'

The data the user provides may contain all the values, or the age or gender field may be missing.

Where I am stuck is that I am not sure how to check whether age is present in the message text. If it is, extract the corresponding value; if it is not, ignore age.

It is possible to check each word in a loop and extract the string, but that becomes quite lengthy. Please let me know if there is an easier way.

Something like:

if age is present in message then get the value of age,
if lastname is present in message then get the value of lastname
if gender is present in message then get the value of gender
if name is present in message then get the value of name

Upvotes: 0

Views: 14890

Answers (5)

amrtw09

Reputation: 313

message = 'name="Raj",lastname="Paul",gender="male", age=23'

new_msg = message.replace('"', '').replace(' ', '').split(',')  # 2nd replace to delete the extra space before age

msg_dict = dict([x.split('=') for x in new_msg])

print(msg_dict)

This code returns the following dictionary. You can run it on each message, and every attribute that is present ends up under its own key.

{'name': 'Raj', 'lastname': 'Paul', 'gender': 'male', 'age': '23'}
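To address the "is age present?" part of the question, here is a minimal sketch that wraps the same cleanup in a function and runs it over the three sample messages. The name parse_message is just an illustrative helper, not something from the original answer, and the approach assumes values never contain spaces, commas, or '=' (removing spaces would alter a value like "Raj Kumar").

```python
def parse_message(message):
    """Split 'key="value",key=value' text into a dict of strings."""
    # Same cleanup as above: drop quotes and spaces, then split on commas.
    parts = message.replace('"', '').replace(' ', '').split(',')
    return dict(part.split('=') for part in parts)

messages = [
    'name="Raj",lastname="Paul",gender="male", age=23',
    'name="Raj",lastname="Paul",age=23',
    'name="Raj",lastname="Paul",gender="male"',
]

for message in messages:
    fields = parse_message(message)
    # A missing field is simply absent from the dict.
    if 'age' in fields:
        print('age is', fields['age'])
    else:
        print('no age given')
```

The membership test replaces the word-by-word loop: a field that the user left out never becomes a key, so there is nothing to ignore explicitly.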

Upvotes: 1

ju.arroyom

Reputation: 182

This is another possibility:

message1 = 'name="Raj",lastname="Paul",gender="male", age=23'

message2 = 'name="Raj",lastname="Paul",age=23'

message3 = 'name="Raj",lastname="Paul",gender="male"'

messages = [message1, message2, message3]

splits = [m.split(",") for m in messages]

def flatten(lst):
    # Turn ['name="Raj"', 'age=23'] into ['name', '"Raj"', 'age', '23']
    temp = []
    for item in lst:
        key, value = item.split("=")
        temp.append(key.strip())
        temp.append(value.strip())
    return temp

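clean = list(map(flatten, splits))

# Keep only the flattened messages that contain the string 'age'
final = [x for x in clean if 'age' in x]

print(final)

This keeps only the messages that contain 'age'. Each kept message is a flat list of alternating keys and values, such as ['name', '"Raj"', 'lastname', '"Paul"', 'age', '23'].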

Upvotes: 0

Chrispresso

Reputation: 4131

One thing you can do is use a regular expression and extract individual portions.

For instance, if your message is message = 'name="Raj",lastname="Paul",gender="male", age=23', you can use the regular expression (?P<var>.*?)=(?P<out>.*?),

Here is what I would do:

import re
message = 'name="Raj",lastname="Paul",gender="male", age=23'
message += ',' # Add a comma for the regex
findall = re.findall(r'(?P<var>.*?)=(?P<out>.*?),', message) # Note the additional comma
extracted = {k.strip(): v.strip() for k,v in findall}
if 'age' in extracted:
    print(extracted['age']) # prints 23

extracted would then be a dictionary that looks like this: {'name': '"Raj"', 'lastname': '"Paul"', 'gender': '"male"', 'age': '23'}. You can get rid of the double quotes if you really want and convert age to an int from there.
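A minimal sketch of that cleanup step, assuming the quotes only ever wrap a whole value and that age, when present, is always a plain integer. The name cleaned is just for illustration.

```python
import re

message = 'name="Raj",lastname="Paul",gender="male", age=23'
# Same regex as above; the trailing comma is appended inline.
pairs = re.findall(r'(?P<var>.*?)=(?P<out>.*?),', message + ',')

# Strip whitespace from both sides, then the surrounding double quotes from values.
cleaned = {k.strip(): v.strip().strip('"') for k, v in pairs}

# Convert age only when it is present.
if 'age' in cleaned:
    cleaned['age'] = int(cleaned['age'])

print(cleaned)
# {'name': 'Raj', 'lastname': 'Paul', 'gender': 'male', 'age': 23}
```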

To get all the fields present you could do:

for field in extracted:
    print(field, extracted[field])

# Prints
name "Raj"
lastname "Paul"
gender "male"
age 23

Upvotes: 1

Mark

Reputation: 92461

If you just want to test for age, you can search the string. If you want to use this for other things in addition to checking the age, you can split it up into a dictionary.

message = 'name="Raj",lastname="Paul",gender="male", age=23'
pairs = [pair.replace('"', '').strip() for pair in message.split(',')]
d = dict([p.split('=') for p in pairs])

'age' in d # True
d['name'] # 'Raj'
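Since the question asks for the values to end up in variables, here is a rough sketch of that last step using dict.get, run on the sample message that has no age. The 'unknown' default for gender is a made-up placeholder, and passing 1 as the split limit is a small hardening that is not part of the code above.

```python
message = 'name="Raj",lastname="Paul",gender="male"'  # no age in this one
pairs = [pair.replace('"', '').strip() for pair in message.split(',')]
# Splitting at most once keeps any '=' inside a value from breaking the pair.
d = dict(p.split('=', 1) for p in pairs)

name = d.get('name')                 # None if the field is missing
lastname = d.get('lastname')
gender = d.get('gender', 'unknown')  # hypothetical default value
age = int(d['age']) if 'age' in d else None

print(name, lastname, gender, age)
# Raj Paul male None
```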

Upvotes: 1

Austin

Reputation: 26037

Use regex:

(?:[, ])age=(\d+)

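which extracts the digits following 'age=' from the string. The leading [, ] requires a comma or space right before age, so a longer key that merely ends in "age" is not matched, but an age field at the very start of the string would be missed.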

Code:

import re

message = 'name="Raj",lastname="Paul",gender="male", age=23'
m = re.search(r'(?:[, ])age=(\d+)', message)
if m:
    print(m.group(1))

# 23
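If you want the same idea for any field, one possible generalisation is sketched below. get_field is a hypothetical helper, not a library function; it assumes values contain no commas or double quotes, and it also allows the key to sit at the start of the string.

```python
import re

def get_field(message, key):
    """Return the value for key as a string, or None if key is absent."""
    # The key must be at the start or follow a comma/whitespace, so 'name'
    # does not match inside 'lastname'. Quotes around the value are optional.
    pattern = rf'(?:^|[,\s]){re.escape(key)}="?([^",]*)"?'
    m = re.search(pattern, message)
    return m.group(1) if m else None

message = 'name="Raj",lastname="Paul",gender="male", age=23'
print(get_field(message, 'age'))       # 23
print(get_field(message, 'name'))      # Raj
print(get_field('name="Raj"', 'age'))  # None
```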

Upvotes: 1
