Reputation: 63
For my program I have a function that changes a string into a list. However, when it hits a newline character, it joins the two words on either side of the newline. Example:
"newline\n problem"
It prints out like this in the main function:
print(seperate_words)
newlineproblem
Here is the code:
def stringtolist(lines):
    # string of acceptable characters
    acceptable = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'’- "
    new_string = ''
    for i in lines:
        # runs through the string and checks to see what characters are in the string
        if i in acceptable:
            i = i.lower()
            # if it is an acceptable character it is added to new string
            new_string += i
        elif i == '.""':
            # if it is a period or quotation marks it is replaced with a space in the new string
            new_string += ' '
        else:
            # for every other character it is removed and not added to new string
            new_string += ''
    # splits the string into a list
    seperate_words = new_string.split(' ')
    return seperate_words
Upvotes: 0
Views: 92
Reputation: 876
Because of the multiple transformations described in the comments of the original code, a more flexible approach could be to use the translate() method of strings (together with the maketrans() function):
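def stringtolist(lines):
    # Python 2 code: xrange and string.maketrans are not available in Python 3
    import string
    acceptable_chars = string.ascii_letters + string.digits + "'`- "
    space_chars = '."'
    # delete everything that is neither kept nor turned into a space
    delete_chars = ''.join(set(map(chr, xrange(256))) - set(acceptable_chars + space_chars))
    # lowercase the acceptable characters and map the space characters to ' '
    table = string.maketrans(acceptable_chars + space_chars, acceptable_chars.lower() + (' ' * len(space_chars)))
    return lines.translate(table, delete_chars).split()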
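The snippet above relies on xrange and string.maketrans, which exist only in Python 2. A rough Python 3 sketch of the same translate() idea could look like the following. It assumes, beyond what the answer states, that a newline should be turned into a space just like the period and the quotation mark, and it uses the asker's original acceptable characters (including ’).

```python
import string

def stringtolist(lines):
    # Characters that are kept (letters get lowercased by the table below).
    acceptable_chars = string.ascii_letters + string.digits + "'’- "
    # Assumption: '\n' is treated like '.' and '"' and becomes a space.
    space_chars = '."\n'
    table = str.maketrans(
        acceptable_chars + space_chars,
        acceptable_chars.lower() + ' ' * len(space_chars),
    )
    # Drop every character that is neither kept nor mapped to a space.
    keep = set(acceptable_chars + space_chars)
    cleaned = ''.join(ch for ch in lines if ch in keep)
    # split() with no argument also collapses runs of spaces.
    return cleaned.translate(table).split()

print(stringtolist("newline\n problem"))  # ['newline', 'problem']
```

Because the newline is mapped to a space instead of being deleted, the words on either side stay separate.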
Upvotes: 0
Reputation: 3682
You can just check for the newline character and skip it. Here's an example.
newstring = ''
for ch in string:
    if ch != '\n':
        newstring += ch
Or use .replace('\n', '') to remove newlines altogether (.strip() only removes whitespace at the start and end of the string).
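Note that skipping the newline outright would still glue "newline" and "problem" together, which is the symptom in the question. A minimal sketch, assuming the goal is to keep the two words apart, is to turn each newline into a space before splitting. The helper name newlines_to_spaces is made up for illustration.

```python
def newlines_to_spaces(text):
    # Hypothetical helper: every newline becomes a single space.
    return text.replace('\n', ' ')

# split() without an argument splits on any run of whitespace.
words = newlines_to_spaces("newline\nproblem").split()
print(words)  # ['newline', 'problem']
```

Since split() with no argument already treats a newline as whitespace, calling "newline\nproblem".split() directly gives the same list.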
Upvotes: 0
Reputation: 15423
You can split a string with multiple delimiters:
def stringtolist(the_string):
    import re
    # raw string so the backslashes reach the regex engine unchanged
    return re.split(r'[ .\n]', the_string)
You can add other delimiters to the character class if you want (quotes, for example): re.split(r'[ .\n\'"]', the_string)
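One thing to watch for: when two delimiters sit next to each other (as in "newline\n problem"), re.split returns empty strings between them. A possible variation, offered as a sketch and not as the only way to do it, matches runs of delimiters with + and filters out any empty pieces. The lowercasing mirrors the asker's original function.

```python
import re

def stringtolist(the_string):
    # One or more delimiters in a row count as a single separator.
    parts = re.split(r'[ .\n\'"]+', the_string)
    # Leading or trailing delimiters still yield '' entries; drop them.
    return [p.lower() for p in parts if p]

print(stringtolist("newline\n problem."))  # ['newline', 'problem']
```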
Upvotes: 1