LeCoda

Reputation: 1016

Taking strings, splitting and then joining with Python

I'm trying to take a list of suburbs and reformat them. At the moment I'm having trouble joining the split pieces back together in the right format.

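import pandas as pd

dataframe = pd.read_csv('x.csv')
suburbs = list(dataframe.x)
#suburbs

# dont_want is a collection of unwanted words, defined elsewhere
suburb_clean = []
for urb in suburbs:
    y = urb.split()
    x = []
    for n in y:
        if (n not in dont_want):
            x.append(n)

    suburb_clean.append(x)

suburb_clean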

Running this next script ends up with an error.

suburb_clean
separator = ','
print(separator.join(suburb_clean))

This is the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-182-345db222de4e> in <module>
      1 suburb_clean
      2 separator = ','
----> 3 print(separator.join(suburb_clean))

TypeError: sequence item 0: expected str instance, list found

This is the sort of logic I was going to follow

# .join() with lists
numList = ['1', '2', '3', '4']
separator = ', '
print(separator.join(numList))

# .join() with tuples
numTuple = ('1', '2', '3', '4')
print(separator.join(numTuple))

s1 = 'abc'
s2 = '123'

# each element of s2 is separated by s1
# '1'+ 'abc'+ '2'+ 'abc'+ '3'
print('s1.join(s2):', s1.join(s2))

# each element of s1 is separated by s2
# 'a'+ '123'+ 'b'+ '123'+ 'c'
print('s2.join(s1):', s2.join(s1))

The output I want is a list of suburbs for each state, with %20 in place of the spaces in each name. I realise grouping the suburbs under each state may be really difficult, so my first goal is just to get properly formatted suburbs in general.

I'm stuck on reformatting these strings into the right format and would love some guidance.

Upvotes: 0

Views: 450

Answers (1)

Patrick Artner

Reputation: 51643

You collect the parts in the list x, append x to suburb_clean, and then try to join suburb_clean, which therefore contains lists rather than strings. str.join(iterable) expects an iterable of strings; that is your error.
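
To make the failure concrete, here is a minimal sketch with made-up sample data. The names `nested`, `flat`, `error_message` and `joined` are illustrative and do not come from the question.

```python
# A list of lists, shaped like suburb_clean in the question (sample data, assumed)
nested = [["want", "this"], ["this", "as", "well"]]

try:
    ",".join(nested)  # each item is a list, not a str
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Join each inner list into one string first, then join the resulting strings
flat = [" ".join(parts) for parts in nested]
joined = ",".join(flat)
print(joined)
```

Joining the inner lists first gives str.join the iterable of strings it requires.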

Fixed code:

suburbs = ["want this", "want this (meh)", "this as well (nope)"]

suburb_clean = []

dont_want = frozenset( ["(meh)", "(nope)"] )

for urb in suburbs:
    cleaned_name = []
    name_parts = urb.split()

    for part in name_parts:
        if part in dont_want:
            continue
        cleaned_name.append(part)

    suburb_clean.append(' '.join(cleaned_name))

print(suburbs)      #  ['want this', 'want this (meh)', 'this as well (nope)']
print(suburb_clean) #  ['want this', 'want this', 'this as well']
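
For the %20 part of the question, one possible approach (a sketch, assuming the names are meant to go into URLs) is the standard library's urllib.parse.quote, which percent-encodes a space as %20. The variable names `suburb_encoded` and `suburb_replaced` are illustrative.

```python
from urllib.parse import quote

suburb_clean = ["want this", "want this", "this as well"]

# quote() percent-encodes characters that are unsafe in URLs; a space becomes %20
suburb_encoded = [quote(name) for name in suburb_clean]
print(suburb_encoded)

# Simpler alternative if spaces are the only characters that need replacing
suburb_replaced = [name.replace(" ", "%20") for name in suburb_clean]
print(suburb_replaced)
```

quote() also encodes other special characters, which may or may not be wanted; the plain replace() touches only spaces.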

Upvotes: 1
