Pandas split and concatenate list result

Question

I have a dataframe like this:

index               int64
idline              int64
name               object
idname             object
Amount            float64
UnitPrice         float64
Qty               float64
LineTxCodeId       object
TotalAmt          float64
Number             object
CurrencyRef        object
TxnDate            object
Customer           object
CustomerId         object
DueBalance        float64
TotalTaxesRate    float64
Classname          object
ClassId            object
year                int64
client             object

I have a list of customers with different names. I want to group this dataframe to get the sum of orders by customer and year. To group customers whose names are nearly the same, I decided to keep only the first 3 words of the Customer column. This is my code:

df['year'] = pd.DatetimeIndex(df['TxnDate']).year # add column year
df['client'] = df['Customer'].str.split(' ').str[:3] # add column with the first 3 words

The issue is that df['client'] becomes a list for each row, like this: [San, francisco, design]

I want to have a string like this: 'San Francisco design'

What should I do?

The goal is to have this groupby:

df1 = df.groupby(['client']).agg({'Amount': ['sum']})

It does not work now because each value in client is a list...

Thanks for helping.

Koralp Catalsakal · Accepted Answer

You can chain .str.join(' ') onto the slice when assigning the 'client' column, which turns each list back into a single string:

import pandas as pd
df = pd.DataFrame(['San Francisco Design Company 1', 'San Francisco Design Company 2'], columns=['Customer'])
df['client'] = df['Customer'].str.split(' ').str[:3].str.join(' ')
print(df)
                         Customer                client
0  San Francisco Design Company 1  San Francisco Design
1  San Francisco Design Company 2  San Francisco Design
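
With 'client' stored as a plain string, the groupby from the question should work, and 'year' can be added as a second key to get totals per customer and year. The following is a minimal sketch of that full flow; the sample rows and amounts are made up for illustration, and only the column names (Customer, TxnDate, Amount, year, client) come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Customer": [
        "San Francisco Design Company 1",
        "San Francisco Design Company 2",
        "Acme Tools Ltd West",
    ],
    "TxnDate": ["2019-03-01", "2019-07-15", "2020-01-10"],
    "Amount": [100.0, 250.0, 40.0],
})

# Extract the year from the transaction date, as in the question.
df["year"] = pd.DatetimeIndex(df["TxnDate"]).year

# Keep the first 3 words and join them back into one string,
# so the column holds hashable values that groupby can use as keys.
df["client"] = df["Customer"].str.split(" ").str[:3].str.join(" ")

# Sum the amounts per client and year.
totals = df.groupby(["client", "year"], as_index=False)["Amount"].sum()
print(totals)
```

Here the two "San Francisco Design Company" rows fall into the same group because they share the same first three words and the same year.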
