Reputation: 2876
I am using the following DataFrame:
import pandas as pd

df = pd.DataFrame({'columnA': [1111, 1111, 2222, 3333, 4444, 4444, 5555, 6666],
                   'columnB': ['AAAA', 'AAAA', 'BBBB', 'AAAA', 'BBBB', 'BBBB', 'AAAA', 'BBBB'],
                   'columnC': ['one', 'two', 'one', 'one', 'one', 'sales', 'two', 'one'],
                   'NUM1': [1, 3, 5, 7, 1, 0, 4, 5],
                   'NUM2': [5, 3, 6, 9, 2, 4, 1, 1],
                   'W': list('aaabbbbb')})
and I am trying to use a dynamic column in the following code:
# First aggregate the data
d = {'columnB': 'unique', 'columnC': 'unique'}
df2 = df.groupby('columnA').agg(d)
# Convert list to string for each cell of the inventory field
mylist = ["columnB", "columnC"]
for x in mylist:
    columnName = x
    # print("df2."+columnName+".apply(', '.join)")
    df2[columnName] = df2[columnName].apply(', '.join)
It works fine in Jupyter, but it does not work when I run it in Visual Studio. I get this error:
sequence item 0: expected str instance, float found
After printing the DataFrame's type, I get this:
<class 'pandas.core.frame.DataFrame'>
Here is the full error message:
Traceback (most recent call last):
  File "stage1.py", line 112, in <module>
    main()
  File "stage1.py", line 57, in main
    templateScenarios[columnName] = templateScenarios[columnName].apply(', '.join)
  File "/Users/apolo.siskos/anaconda3/lib/python3.6/site-packages/pandas/core/series.py", line 2355, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src/inference.pyx", line 1574, in pandas._libs.lib.map_infer
TypeError: sequence item 0: expected str instance, float found
Upvotes: 1
Views: 1119
Reputation: 862551
The problem is NaN values: NaN is a float, so ', '.join raises this error when a group contains one. You can remove them with dropna and use a custom function with join:
import numpy as np
import pandas as pd

df = pd.DataFrame({'columnA': [1111, 1111, 2222, 3333, 4444, 4444, 5555, 6666],
                   'columnB': [np.nan, np.nan, 'BBBB', 'AAAA', 'BBBB', 'BBBB', 'AAAA', 'BBBB'],
                   'columnC': ['one', 'two', 'one', 'one', 'one', 'sales', 'two', 'one'],
                   'NUM1': [1, 3, 5, 7, 1, 0, 4, 5],
                   'NUM2': [5, 3, 6, 9, 2, 4, 1, 1],
                   'W': list('aaabbbbb')})
# Drop NaN values, keep the unique ones, and join them into a single string
f = lambda x: ', '.join(x.dropna().unique())
d = {'columnB': f, 'columnC': f}
df2 = df.groupby('columnA').agg(d)
print(df2)
        columnB     columnC
columnA
1111               one, two
2222       BBBB         one
3333       AAAA         one
4444       BBBB  one, sales
5555       AAAA         two
6666       BBBB         one
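If you would rather keep the original two-step approach (aggregate with 'unique', then join in a loop), here is a minimal sketch of how that might look. It first reproduces the error on a smaller version of the data, then skips the NaN values while joining. The names join_non_null, join_failed and column_name are made up for this illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Smaller version of the sample data; columnB has NaN for group 1111
df = pd.DataFrame({'columnA': [1111, 1111, 2222, 3333],
                   'columnB': [np.nan, np.nan, 'BBBB', 'AAAA'],
                   'columnC': ['one', 'two', 'one', 'one']})

df2 = df.groupby('columnA').agg({'columnB': 'unique', 'columnC': 'unique'})

# Joining the raw arrays fails, because the array for group 1111 holds NaN (a float)
try:
    df2['columnB'].apply(', '.join)
    join_failed = False
except TypeError as exc:
    join_failed = True
    print(exc)


def join_non_null(values):
    """Hypothetical helper: join the values of one cell, skipping NaN/None."""
    return ', '.join(str(v) for v in values if pd.notna(v))


for column_name in ['columnB', 'columnC']:
    df2[column_name] = df2[column_name].apply(join_non_null)

print(df2)
```

With this variant a group that only contains NaN ends up as an empty string, the same as with the dropna version above.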
Upvotes: 1