Rohan

Reputation: 93

Append sections of string to list in Python

I have a particularly long, nasty string that looks something like this:

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

and so on. The key defining feature is that each "nameOfString" is followed by a \n with two spaces after it. The first nameOfString has two spaces in front of it as well.

I'm trying to create a list that would look something like this:

niceList = [nameOfString1, Inc_(stuff), nameOfString2, Inc_(Stuff)] and so on.

I've tried to use newString = nastyString.split() as well as newString = nastyString.replace('\n ', ''), but ultimately, these solutions can't work because each nameOfString has a space after the comma and before the 'I' of Inc. Furthermore, not all the nameOfStrings have an 'Inc,' but most do have some sort of space in their name.

Would really appreciate some guidance or direction on how I could tackle this issue, thanks!

Upvotes: 0

Views: 73

Answers (4)

keyvan vafaee

Reputation: 466

If you would rather not replace '\n' explicitly, do this:

import re
nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
# '.' matches every character except a newline, so the newlines are dropped
word = re.findall(r'.', nastyString)
s = ''.join(word)
print(s)

output: '  nameOfString1, Inc_(stuff)  nameOfString2, Inc_(stuff)  '

Now you can use split():

print(s.split(','))
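
Note that once the newlines are dropped, the tail of one entry and the head of the next land in the same piece after split(','). As a minimal sketch, assuming one entry per line, the same property of '.' (it does not match a newline unless re.DOTALL is set) can be used to pull out whole lines with '.+' instead. The names `lines` and `entries` are only illustrative:

```python
import re

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# '.+' matches a run of non-newline characters, i.e. one line at a time
lines = re.findall(r'.+', nastyString)

# trim the padding and drop lines that contained only whitespace
entries = [line.strip() for line in lines if line.strip()]
print(entries)
```

For the sample string this should print ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)'].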

Upvotes: 1

Vikash Singh

Reputation: 14021

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
# replace '\n' with ','
nastyString = nastyString.replace('\n', ',')
# split at ',' and `strip()` all extra spaces
niceList = [v.strip() for v in nastyString.split(',') if v.strip()]

output:

niceList
['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']

Update: OP clarified the desired output:

That's awesome, never knew about the strip function. However, I actually am trying to include the "Inc" section, so I was hoping for output of: ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)'] and so on, any advice?

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
niceList = [v.strip() for v in nastyString.split('\n') if v.strip()]

new output:

niceList
['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)']
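
A closely related sketch, assuming each entry sits on its own line: str.splitlines() also recognises '\r\n' line endings, which may help if the string came from a Windows file. The helper name `split_entries` is hypothetical, not something from the question:

```python
def split_entries(text):
    """Return the non-empty lines of text with surrounding whitespace removed."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line]


nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
print(split_entries(nastyString))
```

Wrapping the comprehension in a function avoids calling strip() twice per line and makes it easy to reuse on other strings of the same shape.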

Upvotes: 1

Deepak Singh

Reputation: 411

Maybe you can try something like this:

[word for word in nastyString.replace("\n", "").replace(",", "").strip().split(' ') if word != '']

Output:

['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']

Upvotes: 2

Ajax1234

Reputation: 71471

You can use regular expressions:

import re

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# raw string so that \s reaches the regex engine unchanged; \s already covers \n
new_string = [i for i in re.split(r"[\s,]", nastyString) if i]

Output:

['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']
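
If the goal is the one from the question's follow-up comment, where each name stays together with its 'Inc' part, a regex can split on the line breaks alone and leave the commas and inner spaces untouched. A rough sketch, assuming entries are separated by newlines; `parts` and `whole_entries` are illustrative names:

```python
import re

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# strip the outer padding, then split on each newline plus the whitespace around it
parts = re.split(r'\s*\n\s*', nastyString.strip())

# guard against an empty piece, e.g. when the input is an empty string
whole_entries = [p for p in parts if p]
print(whole_entries)
```

For the sample string this should print ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)'].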

Upvotes: 1
