How to remove only the trailing empty columns in a dataframe?

Question

I have a dataframe like this:

import pandas as pd

data = [['a','b','','d','e','f'],
        ['g','','','h'],
        ['i','j','','k'],
        ['l','','m']]

df = pd.DataFrame(data)

I have tried:

df = df.fillna('')
sep = '*'
df.applymap(str).apply(
    axis=1, func=lambda s: sep.join(el for el in s if el) 
).to_csv(
    'output.csv', index=False, header=False
)

In output.csv, all empty columns are removed, but I only want to remove the trailing empty columns and keep the empty ones in between, as they are in the data frame.

Output.csv generated by the above code:

a*b*d*e*f
g*h
i*j*k
l*m

Expected output.csv:

a*b**d*e*f
g**h
i*j**k
l**m

Tim Biegeleisen · Accepted Answer

Empty strings are falsy in Python, so the generator expression inside your join filters out every data frame element that is an empty string, not just the trailing ones. To get the output you want, drop the filter and join the row directly so that all elements are kept, then strip the separators that the trailing empty cells leave at the end of each line. Use rstrip rather than strip so that an empty first column is not removed as well:

df = df.fillna('')
sep = '*'
df.applymap(str).apply(
    axis=1, func=lambda s: sep.join(s).rstrip(sep)
).to_csv(
    'output.csv', index=False, header=False
)
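
For reference, here is a minimal self-contained sketch of the same idea applied to the sample data from the question. The names join_row and lines are only illustrative, and the cells are converted with str() inside the join instead of going through applymap, which recent pandas versions deprecate in favor of DataFrame.map:

```python
import pandas as pd

data = [['a', 'b', '', 'd', 'e', 'f'],
        ['g', '', '', 'h'],
        ['i', 'j', '', 'k'],
        ['l', '', 'm']]

sep = '*'

# Short rows are padded with None by the DataFrame constructor;
# fillna('') turns that padding into empty strings.
df = pd.DataFrame(data).fillna('')


def join_row(row, sep=sep):
    """Join every cell of a row, then drop separators at the right end only."""
    return sep.join(str(el) for el in row).rstrip(sep)


# One joined string per row; interior empty cells are preserved.
lines = df.apply(join_row, axis=1).tolist()
print(lines)
```

The Series returned by df.apply(join_row, axis=1) can be written out with to_csv exactly as in the snippet above. Note that the second sample row contains two consecutive empty cells, so this approach keeps both and produces g***h for that row.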
