Reputation: 4479
Consider the following string variable:
data = '23jodfjkle lj ioerz\nlkdsjflj sldjj\\difd ioiörjlezr'
What I want to create is a string containing only alphabetical characters, the character \n and the character ö. Therefore I wrote the following:
(" ".join(re.findall("[a-zA-Z]+|\n|ö", data)))
But what I get is:
'jodfjkle lj ioerz \n lkdsjflj sldjj difd ioi ö rjlezr'
Why are there spaces around the characters \n and ö? What should I change to get a result without those spaces:
'jodfjkle lj ioerz\nlkdsjflj sldjj difd ioiörjlezr'
Upvotes: 2
Views: 49
Reputation: 11961
By using the | operator in your regex, you make [a-zA-Z]+, \n and ö separate alternatives, so re.findall returns each run of letters, each \n and each ö as its own match. " ".join() then puts a space between every pair of neighbouring matches, which is why spaces appear around the \n and the ö.
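To see the separate matches, it can help to look at the list re.findall returns before joining it. This is a small sketch using the data string from the question; the variable name matches is only illustrative:

```python
import re

data = '23jodfjkle lj ioerz\nlkdsjflj sldjj\\difd ioiörjlezr'

# Each alternative of the pattern yields its own list element,
# so '\n' and 'ö' come back as standalone items.
matches = re.findall("[a-zA-Z]+|\n|ö", data)
print(matches)

# " ".join() puts one space between every pair of neighbouring items,
# including on both sides of '\n' and 'ö'.
print(repr(" ".join(matches)))
```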
To achieve your desired output, move the \n and ö inside the square brackets so that they are part of the same character class:
print(" ".join(re.findall("[a-zA-Z\nö]+", data)))
Output (shown as the string's repr; print displays the \n as an actual line break):
jodfjkle lj ioerz\nlkdsjflj sldjj difd ioiörjlezr
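For a self-contained check, the following sketch runs the character-class version on the question's data and prints the repr so the newline stays visible as \n. The name result is just a placeholder:

```python
import re

data = '23jodfjkle lj ioerz\nlkdsjflj sldjj\\difd ioiörjlezr'

# With '\n' and 'ö' inside the class, they extend the surrounding
# run of letters instead of forming matches of their own.
result = " ".join(re.findall("[a-zA-Z\nö]+", data))
print(repr(result))
```

Characters the class does not match, such as the space and the backslash in data, still split the text into separate matches, and each of those boundaries becomes a single space after joining.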
Upvotes: 4