Reputation: 63
I have the following output:
123399383 (blahthing1)(blahthing2)(blahthing3)(blahthing4)
I tried using replace to swap the parentheses for commas, which worked, but the result is still a single string, so the entire line shows up in a single cell of my CSV. What I'd like is:
123399383,blahthing1,blahthing2,blahthing3,blahthing4
So that each value is a separate cell in my CSV. This example is one of hundreds of lines I'm going through. Thanks for your time and any help you can give me.
Upvotes: 0
Views: 50
Reputation: 92440
re.split() will let you split on the specific characters you have. This will allow non-word characters to exist in the strings:
import re
s = '123399383 (blah++thing1)(blaht-&^hing2)(blah thing3)(blahthing4)'
# split wherever a space or closing parenthesis
# is followed by an opening parenthesis;
# strip the final ')' first so it does not stick to the last item
re.split(r'[\s)]\(', s.rstrip(')'))
# ['123399383', 'blah++thing1', 'blaht-&^hing2', 'blah thing3', 'blahthing4']
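If the fields can contain arbitrary characters, another option is to capture whatever sits between each pair of parentheses instead of splitting on the delimiters. This is only a sketch under the assumption that every field is wrapped in `(...)` and that fields never contain a `)` themselves; `split_fields` is a hypothetical helper name, not something from the answer above.

```python
import re

def split_fields(line):
    # Hypothetical helper: the leading ID is everything before the first "(".
    head = line.split('(', 1)[0].strip()
    # Each field is the text between a "(" and the next ")".
    groups = re.findall(r'\(([^)]*)\)', line)
    return [head] + groups

s = '123399383 (blah++thing1)(blaht-&^hing2)(blah thing3)(blahthing4)'
print(split_fields(s))
# ['123399383', 'blah++thing1', 'blaht-&^hing2', 'blah thing3', 'blahthing4']
```

Because the parentheses are matched rather than split on, no stray `)` is left on the last item.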
Upvotes: 1
Reputation: 580
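An alternative way to solve it using regex is:
import re
original = "123399383(blahthing1)(blahthing2)(blahthing3)(blahthing4)"
# replace each run of non-word characters with a comma, then drop the trailing comma
new = re.sub(r"\W+", ",", original)[:-1]
print(new)  # 123399383,blahthing1,blahthing2,blahthing3,blahthing4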
Upvotes: 0
Reputation: 520958
For your exact type of string, we can use re.findall here for a regex-based approach:
import re
inp = "123399383 (blahthing1)(blahthing2)(blahthing3)(blahthing4)"
output = ','.join(re.findall(r'\w+', inp))
print(output) # 123399383,blahthing1,blahthing2,blahthing3,blahthing4
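To get each value into its own cell, one possible approach is to pass the list of values to `csv.writer` rather than writing a comma-joined string as a single field. The sketch below assumes the input lines look like the one in the question; `lines_to_csv` is a hypothetical helper, the second sample line is made up for illustration, and an in-memory buffer stands in for the real output file.

```python
import csv
import io
import re

def lines_to_csv(lines, out):
    # Hypothetical helper: one CSV row per input line.
    writer = csv.writer(out)
    for line in lines:
        # A list passed to writerow becomes one cell per element.
        writer.writerow(re.findall(r'\w+', line))

lines = [
    "123399383 (blahthing1)(blahthing2)(blahthing3)(blahthing4)",
    "555 (a)(b)",  # made-up second line, for illustration only
]
buf = io.StringIO()  # stand-in for a real file
lines_to_csv(lines, buf)
print(buf.getvalue())
```

With a real file, opening it with `open(path, 'w', newline='')` before handing it to `csv.writer` avoids blank rows on Windows.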
Upvotes: 1