Ryan Saxe

Reputation: 17869

Find specific string sections in python

I want to be able to grab sections of strings with a function. Here is an example:

def get_sec(s1,s2,first='{',last='}'):
    start = s2.index(first)
    end = -(len(s2) - s2.index(last)) + 1
    a = "".join(s2.split(first + last))
    b = s1[:start] + s1[end:]
    print a
    print b
    if a == b:
        return s1[start:end] 
    else:
        print "The strings did not match up"
string = 'contentonemore'
finder = 'content{}more'
print get_sec(string,finder)
#'one'

That example works. My issue is that I want multiple sections, not just one, so the function needs to handle any number of sections. For example:

test_str = 'contwotentonemorethree'
test_find = 'con{}tent{}more{}'
print get_sec(test_str,test_find)
#['one','two','three']

Any ideas on how I can make that function work for an arbitrary number of replacements?

Upvotes: 2

Views: 914

Answers (4)

log0

Reputation: 10917

You probably want to use Python's standard regex library, re:

import re
a = re.search('con(.*)tent(.*)more(.*)','contwotentonemorethree')
print a.groups()
# ('two', 'one', 'three')

or

print re.findall('con(.*)tent(.*)more(.*)','contwotentonemorethree')
# [('two', 'one', 'three')]

Edit: you can escape special characters in a string using

re.escape(str)

example:

part1 = re.escape('con(')
part2 = re.escape(')tent')
print re.findall(part1 + '(.*)' + part2,'con(two)tent')
# ['two']
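
Building on re.escape, the '{}' template from the question could be translated into a pattern automatically, so it works for any number of placeholders. This is only a minimal sketch, written for Python 3 (re.fullmatch needs 3.4+); get_sections and its placeholder parameter are hypothetical names, not part of the re module:

```python
import re

def get_sections(text, template, placeholder='{}'):
    """Return the substrings of text that fill each placeholder in template."""
    # Escape the literal pieces so characters such as '(' are matched verbatim.
    literals = [re.escape(part) for part in template.split(placeholder)]
    # Put one non-greedy capture group where each placeholder was.
    pattern = '(.*?)'.join(literals)
    # fullmatch requires the whole string to fit the template.
    match = re.fullmatch(pattern, text)
    if match is None:
        return None  # the strings did not match up
    return list(match.groups())

print(get_sections('contwotentonemorethree', 'con{}tent{}more{}'))
# ['two', 'one', 'three']
```

The groups come back in the order they appear in the string, and None is returned when the text does not fit the template.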

Upvotes: 2

Elazar

Reputation: 21615

It is not just "use regex": you are actually trying to implement regex yourself. The easiest way to implement regex is, of course, to use the re library.

Upvotes: 1

xgord

Reputation: 4776

Looks like you want something with regular expressions.

Here's Python's page about regular expressions: http://docs.python.org/2/library/re.html

As an example, if you knew that the string would only be broken up by the segments "con", "tent" and "more", you could have:

import re
regex = re.compile(r"(con).*(tent).*(more).*")

s = 'conxxxxtentxxxxxmore'

match = regex.match(s)

Then find the indices of the matches with:

index1 = s.index(match.group(1))
index2 = s.index(match.group(2))
index3 = s.index(match.group(3))

Or if you wanted to find the locations of the other characters (.*):

regex = re.compile(r"con(.*)tent(.*)more(.*)")
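
Match objects also record where each group matched, so positions can be read off directly instead of being searched for with s.index (which only reports the first occurrence of a substring). A small sketch, written for Python 3 and reusing the sample string from above:

```python
import re

regex = re.compile(r"con(.*)tent(.*)more(.*)")
s = 'conxxxxtentxxxxxmore'

match = regex.match(s)

# span(i) gives the (start, end) offsets of group i within s.
spans = [match.span(i) for i in range(1, regex.groups + 1)]

print(match.groups())  # ('xxxx', 'xxxxx', '')
print(spans)           # [(3, 7), (11, 16), (20, 20)]
```

The last group is empty here because nothing follows "more" in the sample string.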

Upvotes: 0

Joran Beasley

Reputation: 113998

ummm use regex?

import re
re.findall("con(.*)tent(.*)more(.*)",my_string)

Upvotes: 0
