Reputation: 3031
I have two dataframes, one of which is an extended version of the other. (1) How do I efficiently combine the two based on rows that share the same Name? (2) Also, is there a way to indent code by four spaces in the Stack Overflow editor without typing four spaces on every line? That can be time-consuming.
More details
One is the full dataframe, in which each Name can be listed multiple times (sortedregsubjdf). The other, sortedcentralitydf, contains only the unique Names from that dataframe, since it is a dataframe of network centralities.
sortedregsubjdf
       Name                              Organization  Year  Centrality
6363   (Buz) Business And Commerce       doclist[524]  2012  0.503677
8383   (Buz) Business And Commerce       doclist[697]  2012  0.503677
1170   (Buz) Business And Commerce       doclist[103]  2012  0.503677
1579   (Eco) Economics News              doclist[140]  2013  0.500624
10979  (Gop) Provincial Government News  doclist[941]  2013  0.501232
4374   (Gop) Provincial Government News  doclist[368]  2013  0.501232
10988  (Npt) Not-For-Profits             doclist[942]  2013  0.498810
sortedcentralitydf (Business And Commerce appears only once here because this dataframe contains unique values, whereas sortedregsubjdf lists it multiple times)
     Name                              Centrality
316  (Buz) Business And Commerce       0.503677
448  (Eco) Economics News              0.500624
499  (Gop) Provincial Government News  0.501232
366  (Npt) Not-For-Profits             0.498810
217  (Pdt) New Products And Services   0.504600
This is my current code for combining the two dataframes. Is there a more efficient way to do it?
for i, val in enumerate(sortedcentralitydf.Name):
    for x, xval in enumerate(sortedregsubjdf.Name):
        if val == xval:
            #print val, xval
            sortedregsubjdf.Centrality[sortedregsubjdf.Name == xval] = sortedcentralitydf.Centrality[sortedcentralitydf.Name == val].iloc[0]
Upvotes: 0
Views: 149
Reputation: 7131
Pandas has a merge function. Something like this should work:
import pandas as pd
merged_df = pd.merge(sortedregsubjdf, sortedcentralitydf, on='Name')
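One thing to watch for: in the samples shown in the question, both dataframes have a Centrality column, so a plain merge on Name would return two suffixed columns (Centrality_x and Centrality_y) rather than one. Here is a minimal sketch of one way around that. It assumes the variable names from the question's loop, uses a small toy copy of the sample rows, and fills the long frame's existing Centrality with placeholder zeros purely for illustration:

```python
import pandas as pd

# Toy stand-ins for the question's frames (a few rows copied from the samples).
# The 0.0 values are hypothetical placeholders for a stale Centrality column.
sortedregsubjdf = pd.DataFrame({
    "Name": ["(Buz) Business And Commerce",
             "(Buz) Business And Commerce",
             "(Eco) Economics News"],
    "Organization": ["doclist[524]", "doclist[697]", "doclist[140]"],
    "Year": [2012, 2012, 2013],
    "Centrality": [0.0, 0.0, 0.0],
})
sortedcentralitydf = pd.DataFrame({
    "Name": ["(Buz) Business And Commerce",
             "(Eco) Economics News",
             "(Pdt) New Products And Services"],
    "Centrality": [0.503677, 0.500624, 0.504600],
})

# Drop the stale column first so the merge yields a single Centrality column.
# how="left" keeps every row of the long frame, even if a Name has no match.
merged_df = (
    sortedregsubjdf
    .drop(columns="Centrality", errors="ignore")
    .merge(sortedcentralitydf, on="Name", how="left")
)

# Alternative: look each Name up in a Series indexed by Name.
# This relies on Name being unique in sortedcentralitydf.
lookup = sortedcentralitydf.set_index("Name")["Centrality"]
mapped = sortedregsubjdf["Name"].map(lookup)
```

Either version replaces the nested loop with a single vectorized lookup. The left join is a design choice: an inner join (the default) would silently drop rows whose Name is missing from the centrality frame.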
Upvotes: 2