Liam87

Reputation: 63

Making a string into a list in Python

For my program, I have a function that turns a string into a list of words. However, when it hits a newline character, it joins the two words on either side of the newline. Example:

"newline\n   problem"

It prints out like this in the main function:

print(seperate_words)
newlineproblem

Here is the code:

def stringtolist(lines):
    # string of acceptable characters
    acceptable = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'’- " 
    new_string = ''
    for i in lines:
        # runs through the string and checks to see what characters are in the string
        if i in acceptable:
            i = i.lower()
            # if it is an acceptable character it is added to new string
            new_string += i
        elif i == '.""':
            # if it is a period or quotation marks it is replaced with a space in the new string
            new_string += ' '
        else:
            # for every other character it is removed and not added to new string
            new_string += ''


    #splits the string into a list
    seperate_words = new_string.split(' ')
    return seperate_words 

Upvotes: 0

Views: 92

Answers (3)

davidedb

Reputation: 876

Since the comments in the original code describe several transformations, a more flexible approach is to use the translate() method of strings together with the string.maketrans() function. Note that this is Python 2 code:

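import string

def stringtolist(lines):
    # Python 2: string.maketrans() and str.translate(table, deletechars)
    acceptable_chars = string.ascii_letters + string.digits + "'`- "
    space_chars = '."'
    kept_chars = acceptable_chars + space_chars
    # every other byte is deleted; space_chars must survive so they can be mapped
    delete_chars = ''.join(set(map(chr, xrange(256))) - set(kept_chars))
    # lowercase the acceptable characters and turn space_chars into spaces
    table = string.maketrans(kept_chars, acceptable_chars.lower() + ' ' * len(space_chars))
    return lines.translate(table, delete_chars).split()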
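
In Python 3, string.maketrans() and the two-argument form of translate() no longer exist, so a rough equivalent might look like the sketch below. The name stringtolist_py3 is made up for illustration, the acceptable characters are copied from the question, and treating "\n" as a space character is an assumption added so that words on either side of a line break stay apart.

```python
import string
from collections import defaultdict

# Characters to keep, taken from the question's code
ACCEPTABLE = string.ascii_letters + string.digits + "'’- "
# Characters turned into a space; "\n" is an assumption added here
TO_SPACE = '."\n'


def stringtolist_py3(lines):
    # Map kept characters to lowercase and TO_SPACE characters to a space
    mapping = str.maketrans(ACCEPTABLE + TO_SPACE,
                            ACCEPTABLE.lower() + ' ' * len(TO_SPACE))
    # Any code point without an entry maps to None, which deletes it
    table = defaultdict(lambda: None, mapping)
    return lines.translate(table).split()


print(stringtolist_py3("newline\n   problem"))  # ['newline', 'problem']
```

Calling split() with no argument also collapses runs of spaces, so no empty strings end up in the result.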

Upvotes: 0

reticentroot

Reputation: 3682

You can just check for the newline character and skip it. Here's an example.

newstring = ''
for ch in string:
    if ch != '\n':
        newstring += ch

Or use .replace('\n', ' ') to swap every newline for a space; .strip() only removes leading and trailing whitespace.
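
A quick sketch of how strip() and replace() behave on a string with a newline in the middle (the sample text and variable names are made up for illustration):

```python
text = "newline\nproblem\n"

# strip() only trims whitespace at the two ends of the string,
# so the inner "\n" is still there afterwards
stripped = text.strip()

# dropping the newline joins the neighbouring words
removed = text.replace("\n", "")

# swapping it for a space keeps them apart, ready for split()
words = text.replace("\n", " ").split()

print(repr(stripped))  # 'newline\nproblem'
print(removed)         # newlineproblem
print(words)           # ['newline', 'problem']
```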

Upvotes: 0

Julien Spronck

Reputation: 15423

You can split a string with multiple delimiters:

def stringtolist(the_string):
    import re
    return re.split('[ .\n]', the_string)

You can add other delimiters to the character class if you want, such as quotes: re.split('[ .\n\'"]', the_string)
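
One thing to watch for: when several delimiters appear in a row, re.split() returns an empty string between each pair. A possible variation, sketched here under the hypothetical name stringtolist_nonempty, adds a + quantifier and filters out what is left:

```python
import re


def stringtolist_nonempty(the_string):
    # "+" treats a run of delimiters as a single separator
    parts = re.split('[ .\n\'"]+', the_string)
    # a delimiter at the very start or end still yields '' there, so drop those
    return [part for part in parts if part]


sample = "newline\n   problem"
print(re.split('[ .\n]', sample))     # ['newline', '', '', '', 'problem']
print(stringtolist_nonempty(sample))  # ['newline', 'problem']
```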

Upvotes: 1
