fox

Reputation: 16496

Looking to parse a string in Python

I have a series of Python strings that always take the form

initialword_content

I want to strip out the initialword portion, which is always the same number of characters, and then turn every _ in the rest into a space, since content may contain underscores of its own. What's the easiest way to do that?

Upvotes: 1

Views: 149

Answers (3)

TerryA

Reputation: 59974

I used slicing and the replace() function. replace() simply... replaces!

string = 'initialword_content'
content = string[12:] # You mentioned that initialword will always be the same length, so I used slicing (12 = the 11-character word plus the underscore).
content = content.replace('_', ' ')

For example:

>>> string = 'elephantone_con_ten_t' # elephantone was the first thing I thought of xD
>>> content = string[12:]
>>> content
'con_ten_t'
>>> content = content.replace('_', ' ')
>>> content
'con ten t'

However, if you also want to reference "elephantone" somewhere else, do this:

>>> string = 'elephantone_con_ten_t'
>>> l = string.split('_', 1) # This will only split the string ONCE, at the first underscore from the left.
>>> l[0]
'elephantone'
>>> l[1].replace('_', ' ')
'con ten t'
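
As a side note, str.partition is a close alternative to split('_', 1) when you want both halves. This is only a sketch of a swapped-in technique, not something the question asked for; the sample value is the same made-up string as in the example above.

```python
# Hypothetical sample value, reused from the example above.
string = 'elephantone_con_ten_t'

# partition() cuts at the first '_' and always returns a 3-tuple:
# (text before the separator, the separator itself, text after it).
prefix, sep, rest = string.partition('_')
content = rest.replace('_', ' ')

print(prefix)   # elephantone
print(content)  # con ten t
```

Unlike indexing into the list from split, unpacking the partition result never fails: if there is no underscore, sep and rest are simply empty strings.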

Upvotes: 0

Jun HU

Reputation: 3314

strs = "initialword_content"
strs = strs[12:].replace("_", " ")  # drop the 12-character prefix, then swap underscores for spaces
print(strs)

Since initialword always has the same number of characters, you can just slice off the rest of the string, then use str.replace to turn every "_" into a space.
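
If the magic number 12 bothers you, here is a minimal sketch that wraps the same idea in a function. The names PREFIX_LEN and strip_prefix are made up for illustration, and it assumes the prefix really is always "initialword_"-sized (11 characters plus the underscore).

```python
# Assumption: every input starts with a prefix exactly as long as "initialword_".
PREFIX_LEN = len("initialword_")  # 12: the fixed word plus its trailing underscore


def strip_prefix(s, prefix_len=PREFIX_LEN):
    """Drop the fixed-length prefix, then replace remaining underscores with spaces."""
    return s[prefix_len:].replace("_", " ")


print(strip_prefix("initialword_content_with_more_words"))  # content with more words
```

Note that slicing never checks what the prefix actually is, so a shorter or longer first word would silently give the wrong result.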

Upvotes: 3

eumiro

Reputation: 212825

First, split the string once (pass 1 as the second argument to split) to get two parts: the throw-away 'initialword' and the rest. Then replace all underscores in the rest with spaces.

s = 'initialword_content' 
a, b = s.split('_', 1)
b = b.replace('_', ' ')
# b == 'content'

s = 'initialword_content_with_more_words' 
a, b = s.split('_', 1)
b = b.replace('_', ' ')
# b == 'content with more words'

This can be done in a single expression:

s.split('_', 1)[1].replace('_', ' ')

Another way:

' '.join(s.split('_')[1:])

Or, if the length of "initialword" is always the same (so you don't have to calculate it each time), use @JunHu's solution.
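
The two one-liners above give the same result whenever the string contains an underscore, but they behave differently when it doesn't. A small sketch of that edge case follows; the helper names via_split and via_join and the sample inputs are made up for illustration.

```python
def via_split(s):
    # Split once at the first underscore, keep the second part.
    return s.split('_', 1)[1].replace('_', ' ')


def via_join(s):
    # Split at every underscore, drop the first piece, rejoin with spaces.
    return ' '.join(s.split('_')[1:])


sample = 'initialword_content_with_more_words'
print(via_split(sample))  # content with more words
print(via_join(sample))   # content with more words

# With no underscore at all, split returns a one-element list:
# via_split raises IndexError, while via_join quietly returns ''.
try:
    via_split('nounderscore')
except IndexError:
    print('via_split: IndexError')
print(repr(via_join('nounderscore')))  # ''
```

Which behaviour you prefer depends on whether a missing underscore should be treated as an error or as empty content.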

Upvotes: 2
