nsokuk

Reputation: 13

How to concat multiple text fields in pandas dataframe

How can I concatenate the unique values of several text columns of a pandas DataFrame into a single column? For example:

import pandas as pd

data = [[1,"US","California","Los Angeles"],
        [1,"US","California","San Francisco"],
        [1,"US","California","San Diego"],
        [1,"US","Texas","Austin"],
        [2,"IND","Maharashtra","Mumbai"],
        [2,"IND","Maharashtra","Pune"],
        [2,"IND","Maharashtra","Nagpur"]]

df = pd.DataFrame(data, columns = ['Country_Id', 'Country','State','Place'])

From the above dataframe, how do I generate output with one field for Country_Id and a second text field containing the unique values of Country, State, and Place?

Like:

Please ignore the meaning of the combined text field

Upvotes: 1

Views: 134

Answers (1)

Andy L.

Reputation: 25239

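Use groupby and apply with a double join: the inner join concatenates each column's unique values, and the outer join (over a generator expression) concatenates the per-column results.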

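cols = ['Country', 'State', 'Place']  # text columns only; the integer Country_Id cannot be passed to str.join
(df.groupby('Country_Id')[cols]
   .apply(lambda x: ' '.join(' '.join(x[col].unique()) for col in cols))
   .to_frame('Country-State-Place'))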


Out[434]:
                                                       Country-State-Place
Country_Id
1           US California Texas Los Angeles San Francisco San Diego Austin
2           IND Maharashtra Mumbai Pune Nagpur
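
If you would rather avoid apply with a lambda over the whole group, one possible alternative is to aggregate each column separately and then join across the row. This is only a sketch and not part of the answer above; the names text_cols, join_unique, per_column and combined are made up for illustration, and it assumes every selected column holds strings with no missing values.

```python
import pandas as pd

data = [[1, "US", "California", "Los Angeles"],
        [1, "US", "California", "San Francisco"],
        [1, "US", "California", "San Diego"],
        [1, "US", "Texas", "Austin"],
        [2, "IND", "Maharashtra", "Mumbai"],
        [2, "IND", "Maharashtra", "Pune"],
        [2, "IND", "Maharashtra", "Nagpur"]]
df = pd.DataFrame(data, columns=["Country_Id", "Country", "State", "Place"])

# Hypothetical name: the text columns to combine (assumed to contain only strings).
text_cols = ["Country", "State", "Place"]


def join_unique(series):
    """Join the unique values of one column, in order of first appearance."""
    return " ".join(series.unique())


# Step 1: one row per Country_Id, one joined string per text column.
per_column = df.groupby("Country_Id")[text_cols].agg(join_unique)

# Step 2: join the per-column strings across each row into a single field.
combined = (per_column.apply(lambda row: " ".join(row), axis=1)
                      .to_frame("Country-State-Place"))

print(combined)
```

Keeping the intermediate per_column frame around can be handy if you later want the joined Country, State and Place strings as separate fields as well.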

Upvotes: 2
