user3120266

Reputation: 425

Merging two pieces of data together in pandas

I am trying to concatenate data in pandas, but it is not working as I expect.

I have a column that I wanted to convert to a numeric value, which I was able to do. I now want to rejoin it to the data set.

Here is what the original data looks like:

 CallDate            Agent          Group     Direction
0  2015-09-01         Adam          Billing    Inbound
1  2015-09-01         Nathaniel     Billing    Outbound
2  2015-09-01         Jessica       Claims     Inbound
3  2015-09-01         Tom           Billing    Outbound
4  2015-09-01         Jane          CCS        Inbound

Here is my code to convert the Group column to a numeric value:

data['Group']=data['Group'].astype(str)
data.Group=data['Group'].apply(lambda x:len(x))

This worked and gave me what I was looking for:

0     1
1     1
2    13
3     1
4     6

I then tried to merge this back with the group names (basically, I want to know which name each number corresponds to):

y=pd.concat([data,data.Group], ignore_index=True)
y [:5]

But the result looks the same as the original DataFrame.

Is there something obvious I am missing, or an easier workaround that I am not thinking of?

Upvotes: 1

Views: 72

Answers (2)

Nader Hisham

Reputation: 5414

You can use the str.cat method to concatenate two Series of strings in pandas; see the pandas documentation for Series.str.cat.

You can also get the number of characters in each value with the vectorized string method df.Group.str.len():

df['Group'] = df.Group.str.cat(df.Group.str.len().astype(str), sep=' => ')
df
Out[42]:
CallDate    Agent          Group         Direction
2015-09-01  Adam          Billing => 7   Inbound
2015-09-01  Nathaniel     Billing => 7   Outbound
2015-09-01  Jessica       Claims => 6    Inbound
2015-09-01  Tom           Billing => 7   Outbound
2015-09-01  Jane          CCS => 3       Inbound
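
If the goal is mainly to see which number belongs to which group name, a separate numeric column may be easier to work with than a combined string. The following is a minimal sketch, assuming a frame rebuilt by hand from the table in the question; the column name Group_len and the variable lookup are hypothetical names chosen for this example.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (CallDate kept as plain strings).
data = pd.DataFrame({
    "CallDate": ["2015-09-01"] * 5,
    "Agent": ["Adam", "Nathaniel", "Jessica", "Tom", "Jane"],
    "Group": ["Billing", "Billing", "Claims", "Billing", "CCS"],
    "Direction": ["Inbound", "Outbound", "Inbound", "Outbound", "Inbound"],
})

# Keep the text column and store its length in a separate numeric column.
# "Group_len" is a hypothetical column name, not something from the question.
data["Group_len"] = data["Group"].str.len()

# One row per distinct group, pairing each name with its number.
lookup = data[["Group", "Group_len"]].drop_duplicates().reset_index(drop=True)
print(lookup)
```

Because Group is left untouched, the lengths stay numeric and can still be sorted or aggregated later.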

Upvotes: 1

WoodChopper

Reputation: 4375

pd.concat() is for concatenating whole DataFrames (or Series) along an axis, and by default it stacks them row-wise. I think you are trying to combine two columns within a single DataFrame.

data
Out[42]: 
     CallDate      Agent    Group Direction
0  2015-09-01       Adam  Billing   Inbound
1  2015-09-01  Nathaniel  Billing  Outbound
2  2015-09-01    Jessica   Claims   Inbound
3  2015-09-01        Tom  Billing  Outbound
4  2015-09-01       Jane      CCS   Inbound

data.Group = data.Group + data.Group.apply(lambda x:" / "+str(len(x)))

data
Out[44]: 
     CallDate      Agent        Group Direction
0  2015-09-01       Adam  Billing / 7   Inbound
1  2015-09-01  Nathaniel  Billing / 7  Outbound
2  2015-09-01    Jessica   Claims / 6   Inbound
3  2015-09-01        Tom  Billing / 7  Outbound
4  2015-09-01       Jane      CCS / 3   Inbound

You can find more details in the pandas concat API documentation.

Update: to put the result in a new column instead,

data['Group_1'] = data.Group + data.Group.apply(lambda x:" / "+str(len(x)))

data
Out[56]: 
     CallDate      Agent    Group Direction      Group_1
0  2015-09-01       Adam  Billing   Inbound  Billing / 7
1  2015-09-01  Nathaniel  Billing  Outbound  Billing / 7
2  2015-09-01    Jessica   Claims   Inbound   Claims / 6
3  2015-09-01        Tom  Billing  Outbound  Billing / 7
4  2015-09-01       Jane      CCS   Inbound      CCS / 3
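
To illustrate why the pd.concat call in the question seemed to return the original data, here is a small sketch comparing the default row-wise behaviour with axis=1. It assumes a trimmed two-column version of the sample frame, and the names group_len, stacked and side_by_side are made up for the example.

```python
import pandas as pd

# Trimmed version of the question's data (only the columns needed here).
data = pd.DataFrame({
    "Agent": ["Adam", "Nathaniel", "Jessica", "Tom", "Jane"],
    "Group": ["Billing", "Billing", "Claims", "Billing", "CCS"],
})

# Hypothetical name for the numeric Series derived from Group.
group_len = data["Group"].str.len().rename("Group_len")

# Default axis=0 appends the Series as extra rows below the frame, so the
# result has 10 rows and the first five still hold the original values.
stacked = pd.concat([data, group_len], ignore_index=True)
print(len(stacked))

# axis=1 aligns on the index and places the Series beside the frame
# as a new column.
side_by_side = pd.concat([data, group_len], axis=1)
print(side_by_side)
```

Looking only at y[:5] hides the appended rows, which is probably why the result appeared unchanged.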

Upvotes: 2
