Pandas Dataframe Entire Column to String Data Type

Question

I know you can do this with a series, but I can't seem to do this with a dataframe.

I have the following:

     name     note                        age
0    jon      likes beer on tuesdays      10
1    jon      likes beer on tuesdays
2    steve    tonight we dine in heck     20
3    steve    tonight we dine in heck

I am trying to produce the following:

     name     note                        age
0    jon      likes beer on tuesdays      10
1    jon      likes beer on tuesdays      10
2    steve    tonight we dine in heck     20
3    steve    tonight we dine in heck     20

I know how to do this with string values using group by and join, but this only works on string values. I'm having issues converting the entire column of age to a string data type in the dataframe.

Any suggestions?

jezrael · Accepted Answer

Use GroupBy.transform with 'first' (which applies GroupBy.first) if you want to repeat the first non-missing value of each group on every row of that group:

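g = df.groupby('name')
# Join all notes in each group into one string, repeated on every row
df['note'] = g['note'].transform(' '.join)
# Repeat the first non-missing age of each group on every row
df['age'] = g['age'].transform('first')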
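As a minimal sketch of the age step on the question's data, the example below rebuilds the sample frame and fills only the age column. It assumes the blank ages are NaN (the question does not say how they are stored) and leaves note untouched, since the desired output shows the notes unchanged.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; blank ages are assumed to be NaN.
df = pd.DataFrame({
    "name": ["jon", "jon", "steve", "steve"],
    "note": ["likes beer on tuesdays"] * 2 + ["tonight we dine in heck"] * 2,
    "age": [10, np.nan, 20, np.nan],
})

# transform('first') broadcasts each group's first non-missing age to all of its rows.
df["age"] = df.groupby("name")["age"].transform("first")
print(df)
```

Because the column contained NaN, age is stored as float here. If the blanks are actually empty strings rather than NaN, they would need to be converted to missing values first for 'first' to skip them.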

If you need to process multiple columns, taking the first value for every numeric column and joining every string column, you can build a dictionary that maps column names to functions, pass it to GroupBy.agg, and then use DataFrame.join to map the per-group results back onto the original rows:

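import numpy as np

# Numeric columns take the first non-missing value per group
cols1 = df.select_dtypes(np.number).columns
# Remaining columns (except the group key) are joined as strings
cols2 = df.columns.difference(cols1).difference(['name'])
d1 = dict.fromkeys(cols2, lambda x: ' '.join(x))
d2 = dict.fromkeys(cols1, 'first')
d = {**d1, **d2}
# Aggregate per group, then align the results back to the original rows
df1 = df[['name']].join(df.groupby('name').agg(d), on='name')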
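A worked sketch of the dictionary approach may make the result easier to see. The data below is hypothetical: the notes are split across rows (unlike the question's data) so that the join step has a visible effect, and the blank ages are again assumed to be NaN. The names num_cols, str_cols, agg_map, per_group and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each group's note is split over two rows.
df = pd.DataFrame({
    "name": ["jon", "jon", "steve", "steve"],
    "note": ["likes beer", "on tuesdays", "tonight we dine", "in heck"],
    "age": [10, np.nan, 20, np.nan],
})

# Split the columns by dtype, excluding the grouping key.
num_cols = df.select_dtypes(include=np.number).columns
str_cols = df.columns.difference(num_cols).difference(["name"])

# Map string columns to a join function and numeric columns to 'first'.
agg_map = {
    **dict.fromkeys(str_cols, lambda s: " ".join(s)),
    **dict.fromkeys(num_cols, "first"),
}

# One row per group, indexed by name.
per_group = df.groupby("name").agg(agg_map)

# Align the per-group values back to the original row order.
result = df[["name"]].join(per_group, on="name")
print(result)
```

The join on 'name' keeps one output row per input row, so result has the same length and index as df while every row of a group carries the same aggregated values.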
