Murat G

Reputation: 61

Python: split string with delimiters from a list

I'd like to split a string with delimiters which are in a list.

The string has this pattern: Firstname, Lastname Email

The list of delimiters, taken from the pattern, is [', ', ' '].

I'd like to split the string to get a list like this ['Firstname', 'Lastname', 'Email']

For a better understanding of my problem, this is what I'm trying to achieve:

The user should be able to provide a source pattern, %Fn%, %Ln% %Mail%, describing the data to be imported, and a target pattern describing how the data should be displayed:

%Ln%%Fn%; %Ln%, %Fn%; %Mail%

This is my attempt:

import re

source_pattern_delimiter = [', ', ' ']
data = "Firstname, Lastname Email"

for delimiter in source_pattern_delimiter:
    prog = re.compile(delimiter)
    data_tuple = prog.split(data)  # overwritten on every pass

How do I 'merge' the data_tuple list(s)?

Upvotes: 4

Views: 2756

Answers (5)

Martin Evans

Reputation: 46759

You are asking for a template-based way to reconstruct the split data. The following script should give you an idea of how to proceed. It first splits the data into its three parts and assigns each to a dictionary entry. The dictionary can then be used to fill in a target pattern:

import re

data = "Firstname, Lastname Email"

# Find a list of entries and display them
entries = re.findall(r"(\w+)", data)
print(entries)

# Convert the entries into a dictionary
dEntries = {"Fn": entries[0], "Ln": entries[1], "Mail": entries[2]}

# Use dictionary-based string formatting to provide a template system
print("%(Ln)s%(Fn)s; %(Ln)s, %(Fn)s; %(Mail)s" % dEntries)

This displays the following:

['Firstname', 'Lastname', 'Email']
LastnameFirstname; Lastname, Firstname; Email

If you really need to use the exact template syntax you have provided, you can first convert your target pattern into one suitable for Python's dictionary-based string formatting:

def display_with_template(data, target_pattern):
    entries = re.findall(r"(\w+)", data)
    dEntries = {"Fn": entries[0], "Ln": entries[1], "Mail": entries[2]}

    # Rewrite each %Name% placeholder as %(Name)s
    for item in ["Fn", "Ln", "Mail"]:
        target_pattern = target_pattern.replace("%%%s%%" % item, "%%(%s)s" % item)

    return target_pattern % dEntries

print(display_with_template("Firstname, Lastname Email", r"%Ln%%Fn%; %Ln%, %Fn%; %Mail%"))

This displays the same result, but uses the custom target pattern:

LastnameFirstname; Lastname, Firstname; Email
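
If the source pattern itself is user-supplied, the delimiters do not have to be listed by hand. The following is only a sketch under assumptions: the helper names pattern_to_regex, parse and render are made up for illustration, and every placeholder is assumed to have the form %Name%. It turns the source pattern into a regex with one named group per placeholder, then fills the target pattern from the captured fields. Because each field is matched lazily up to the next literal delimiter, characters such as @ and . in a real e-mail address are kept, which "\w+" would not do.

```python
import re

def pattern_to_regex(source_pattern):
    """Compile a pattern such as '%Fn%, %Ln% %Mail%' into a regex.

    Each %Name% placeholder becomes a named group; the text between
    placeholders is treated as a literal delimiter.
    """
    # With a capturing group, re.split alternates literal text and names.
    parts = re.split(r"%(\w+)%", source_pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2:                      # odd positions: placeholder names
            regex += "(?P<%s>.+?)" % part
        else:                              # even positions: literal delimiters
            regex += re.escape(part)
    return re.compile(regex + r"\Z")

def parse(data, source_pattern):
    """Return a dict of field name -> value, or None if data does not fit."""
    match = pattern_to_regex(source_pattern).match(data)
    return match.groupdict() if match else None

def render(fields, target_pattern):
    """Replace every %Name% in target_pattern with the matching field."""
    return re.sub(r"%(\w+)%", lambda m: fields[m.group(1)], target_pattern)

fields = parse("Firstname, Lastname Email", "%Fn%, %Ln% %Mail%")
print(render(fields, "%Ln%%Fn%; %Ln%, %Fn%; %Mail%"))
```

A placeholder used in the target pattern but missing from the source pattern raises KeyError here; whether that is the right behaviour depends on your import tool.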

Upvotes: 0

jme

Reputation: 20695

What about splitting on spaces, then removing any trailing commas?

>>> data = "Firstname, Lastname Email"
>>> [s.rstrip(',') for s in data.split(' ')]
['Firstname', 'Lastname', 'Email']

Upvotes: 0

Adam Bartoš

Reputation: 717

Here is a solution without regexes, for when you want to apply a particular delimiter at a particular position:

def split(s, delimiters):
    for d in delimiters:
        item, s = s.split(d, 1)
        yield item
    else:
        yield s

>>> list(split("Firstname, Lastname Email", [", ", " "]))
['Firstname', 'Lastname', 'Email']
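
Note that the generator above raises ValueError when a delimiter is missing from the string, because s.split(d, 1) then returns a single item that cannot be unpacked into two names. If skipping a missing delimiter is acceptable for your data, a variant based on str.partition might look like this (the name split_in_order and the skip-on-missing behaviour are assumptions for illustration, not part of the original answer):

```python
def split_in_order(s, delimiters):
    """Split s once on each delimiter, in the given order.

    A delimiter that does not occur in the remaining text is skipped
    instead of raising ValueError.
    """
    items = []
    for d in delimiters:
        head, sep, tail = s.partition(d)
        if not sep:          # delimiter not found: leave s untouched
            continue
        items.append(head)
        s = tail
    items.append(s)          # whatever is left after the last delimiter
    return items

print(split_in_order("Firstname, Lastname Email", [", ", " "]))
print(split_in_order("Firstname Lastname", [", ", " "]))
```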

Upvotes: 1

Avinash Raj

Reputation: 174706

It seems you want something like this:

>>> import re
>>> s = "Firstname, Lastname Email"
>>> delim = [', ',' ']
>>> re.split(r'(?:' + '|'.join(delim) + r')', s)
['Firstname', 'Lastname', 'Email']
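
Joining the delimiters with | works for ', ' and ' ', but two things can go wrong with arbitrary user-supplied delimiters: a delimiter may contain regex metacharacters such as . or |, and a short delimiter that is a prefix of a longer one (',' and ', ') wins if it comes first in the alternation. A more defensive sketch, where the function name split_on_any is only illustrative, escapes each delimiter and tries longer ones first:

```python
import re

def split_on_any(s, delimiters):
    """Split s wherever any of the literal delimiters occurs."""
    # Longest first, so ', ' is tried before its prefix ','.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes every delimiter match literally.
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, s)

print(split_on_any("Firstname, Lastname Email", [" ", ", "]))
print(split_on_any("a, b c", [",", ", ", " "]))
```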

Upvotes: 1

rodic

Reputation: 445

import re

re.split(re.compile("|".join([", ", " "])), "Firstname, Lastname Email")

Hope it helps.

Upvotes: 4
