Reputation: 13
I have CSV data:
index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji
7 hyun
8 jingo
9 kim
10 min
and a DataFrame holding the clustering result:
index cluster
1 1
3 1
5 1
7 1
8 1
9 2
4 2
2 2
10 2
6 2
Is it possible to add a username column to the pandas DataFrame based on the CSV data?
Upvotes: 1
Views: 70
Reputation: 863741
You can use join:
print (df2.join(df1))
       cluster username
index
1            1    ailee
3            1     sony
5            1     alex
7            1     hyun
8            1    jingo
9            2      kim
4            2     lily
2            2     yura
10           2      min
6            2    eunji
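For anyone who wants to run this end to end, here is a minimal sketch of the join. It assumes df1 holds the CSV data and df2 the cluster result, both indexed by the index column; building the frames by hand like this is my assumption about how they were loaded, not something stated in the question.

```python
import pandas as pd

# Assumed setup: both frames use the question's "index" column as their index.
df1 = pd.DataFrame(
    {"username": ["ailee", "yura", "sony", "lily", "alex",
                  "eunji", "hyun", "jingo", "kim", "min"]},
    index=pd.Index(range(1, 11), name="index"),
)
df2 = pd.DataFrame(
    {"cluster": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]},
    index=pd.Index([1, 3, 5, 7, 8, 9, 4, 2, 10, 6], name="index"),
)

# DataFrame.join is a left join on the index by default,
# so every row of df2 is kept, in df2's own order.
joined = df2.join(df1)
print(joined)
```

Because the default is a left join, an index present in df2 but missing from df1 would get NaN as its username instead of being dropped.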
Or map:
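# map by the values in column cluster (looks up 1 and 2 in df1's index)
df2['username'] = df2.cluster.map(df1.username)
# map by index (looks up each row's own index, same result as the join)
df2['username1'] = df2.index.to_series().map(df1.username)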
print (df2)
       cluster username username1
index
1            1    ailee     ailee
3            1    ailee      sony
5            1    ailee      alex
7            1    ailee      hyun
8            1    ailee     jingo
9            2     yura       kim
4            2     yura      lily
2            2     yura      yura
10           2     yura       min
6            2     yura     eunji
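The difference between the two columns comes from what gets looked up: Series.map takes each value of the calling Series and finds it in the index of the Series passed in. A small sketch, assuming the same hand-built frames as in the question (the names by_index and by_cluster are only for illustration):

```python
import pandas as pd

# Assumed setup: both frames use the question's "index" column as their index.
df1 = pd.DataFrame(
    {"username": ["ailee", "yura", "sony", "lily", "alex",
                  "eunji", "hyun", "jingo", "kim", "min"]},
    index=pd.Index(range(1, 11), name="index"),
)
df2 = pd.DataFrame(
    {"cluster": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]},
    index=pd.Index([1, 3, 5, 7, 8, 9, 4, 2, 10, 6], name="index"),
)

# Looks up each row's own index label in df1["username"].
by_index = df2.index.to_series().map(df1["username"])

# Looks up the cluster values (1 and 2) instead, so only the
# usernames stored at index 1 and 2 can ever appear.
by_cluster = df2["cluster"].map(df1["username"])

df2["username"] = by_index
print(df2)
```

Mapping by index is the variant that attaches each row's own username; mapping by cluster only makes sense if the cluster numbers themselves are meant to be keys into df1.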
Upvotes: 0
Reputation: 818
I use DataFrame.merge for this. Here is the code:
>>> import io as sio  # on Python 3, StringIO lives in the io module
>>> import pandas as pd
>>> s1='''index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji
7 hyun
8 jingo
9 kim
10 min'''
>>> s2 = '''index cluster
1 1
3 1
5 1
7 1
8 1
9 2
4 2
2 2
10 2
6 2'''
>>> # sep=r'\s+' splits on whitespace; delim_whitespace=True is deprecated in recent pandas
>>> df1 = pd.read_csv(sio.StringIO(s1), index_col=0, sep=r'\s+')
>>> df2 = pd.read_csv(sio.StringIO(s2), index_col=0, sep=r'\s+')
>>> df1
      username
index
1        ailee
2         yura
3         sony
4         lily
5         alex
6        eunji
7         hyun
8        jingo
9          kim
10         min
>>> df2
       cluster
index
1            1
3            1
5            1
7            1
8            1
9            2
4            2
2            2
10           2
6            2
>>> df1.merge(df2, left_index=True, right_index=True)
      username  cluster
index
1        ailee        1
3         sony        1
5         alex        1
7         hyun        1
8        jingo        1
9          kim        2
4         lily        2
2         yura        2
10         min        2
6        eunji        2
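If the goal is to add the username to the cluster frame rather than the other way round, one possible variant is to merge from df2 with how="left". The sketch below is written for Python 3 and reads a shortened copy of the data from strings. Choosing a left merge and adding validate="one_to_one" are my assumptions about what is wanted here, not part of the original question.

```python
import io

import pandas as pd

# Shortened copies of the question's data, kept whitespace separated.
users_text = """index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji"""
clusters_text = """index cluster
1 1
3 1
5 1
4 2
2 2
6 2"""

df1 = pd.read_csv(io.StringIO(users_text), sep=r"\s+", index_col=0)
df2 = pd.read_csv(io.StringIO(clusters_text), sep=r"\s+", index_col=0)

# how="left" keeps every row of df2 in df2's order;
# validate="one_to_one" raises if either index has duplicate labels.
merged = df2.merge(df1, left_index=True, right_index=True,
                   how="left", validate="one_to_one")
print(merged)
```

The validate check is optional, but it catches the case where a duplicated index in the CSV would silently multiply rows in the result.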
Upvotes: 1