Reputation: 224
I have looked at several existing questions related to mine:
How to remove extraneous square brackets from a nested list inside a dictionary?
Removing square brackets from Dataframe
but none of them worked for me.
Below is my example:
df1
column1   column2   column3   ..... up to 'n' columns
[data1]   data1     data1
NAN       data2     data2
[data2]   data3     [data3, data3, testing how are you guys hope you guys are doing :)]
[data3]   data3     [data4, dummy text to test to test test test]
NAN       data4     [data5]
Below is the code I tried:
df1[column1] = df[column1].str[0]
# not working!
# I want to pass df1 as a whole instead of df1[column], because there are
# a lot of columns
I want to remove only the brackets and nothing else, and I want to apply this to the whole DataFrame rather than column by column, because there are a lot of columns.
Expected output:
column1   column2   column3   ..... up to 'n' columns
data1     data1     data1
NAN       data2     data2
data2     data3     data3, data3, testing how are you guys hope you guys are doing :)
data3     data3     data4, dummy text to test to test test test
NAN       data4     data5
Upvotes: 0
Views: 1104
Reputation: 23146
Try with apply, explode and groupby:
>>> df.apply(lambda x: x.explode().astype(str).groupby(level=0).agg(", ".join))
  column1 column2 column3
0 data1   data1   data1
1 nan     data2   data2
2 data2   data3   data3, data3, testing how are you guys hope yo...
3 data3   data3   data4, dummy text to test to test test test
4 nan     data4   data5
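Note that the output above shows the missing values as the string "nan", because astype(str) stringifies them. If real missing values need to survive, a different technique can be used: the sketch below maps a small helper over every cell instead of using explode and groupby. The sample data and the helper name join_lists are hypothetical, made up for illustration, and the sketch assumes the bracketed cells hold actual Python lists.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modelled on the question: some cells hold lists,
# some hold plain strings, and some are missing.
df = pd.DataFrame({
    "column1": [["data1"], np.nan, ["data2"], ["data3"], np.nan],
    "column2": ["data1", "data2", "data3", "data3", "data4"],
    "column3": ["data1", "data2", ["data3", "data3", "testing"],
                ["data4", "dummy text"], ["data5"]],
})


def join_lists(cell):
    """Join list elements with ', '; return any other value unchanged."""
    if isinstance(cell, list):
        return ", ".join(map(str, cell))
    return cell


# Map the helper over each column so the whole DataFrame is processed
# without naming individual columns.
result = df.apply(lambda col: col.map(join_lists))
print(result)
```

Because non-list cells are returned unchanged, NaN stays NaN here rather than becoming text.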
Series.explode() to transform each list element to its own row, replicating index values.
groupby to group identical index values and aggregate using str.join().
apply to apply the same function to all columns of the DataFrame.
Upvotes: 1
Reputation: 112
# Strip the brackets row by row; .loc avoids chained assignment
for i in range(0, df.shape[0]):
    df1.loc[i, 'column1'] = str(df.loc[i, 'column1']).strip('[]')
I didn't test this with an example DataFrame, but in my experience with pandas it should work.
Edit: this tested code works:
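import pandas as pd

df = pd.DataFrame({'column': ['test', '[test]']})
df1 = pd.DataFrame({'column1': ['a', 'b']})

# Copy each value from df into df1 with the surrounding brackets stripped;
# .loc avoids chained assignment
for i in range(0, df.shape[0]):
    df1.loc[i, 'column1'] = str(df.loc[i, 'column']).strip('[]')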
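Since the question asks for something that works on the whole DataFrame, the same strip idea can be sketched without the explicit loop. This assumes the cells are strings that literally contain the bracket characters (not Python lists); the sample data and the helper name strip_brackets are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: bracketed values stored as plain strings.
df = pd.DataFrame({
    "column1": ["[data1]", None, "[data2]"],
    "column2": ["data1", "data2", "[data3, data3]"],
})


def strip_brackets(cell):
    """Strip leading/trailing '[' and ']' from strings; leave other values alone."""
    if isinstance(cell, str):
        return cell.strip("[]")
    return cell


# Apply the helper to every cell of every column.
cleaned = df.apply(lambda col: col.map(strip_brackets))
print(cleaned)
```

Only leading and trailing brackets are removed, so brackets in the middle of a string are left as they are, and missing values are passed through untouched.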
Upvotes: 0