Reputation: 749
I want to add a new column to my exisitng dataframe.
I am doing this,
def test(self, sess, df):
for index, frame in df.iterrows():
medical_plan = sess.query(MedicalPlan.id).filter(MedicalPlan.issuer_id == frame['issuer_id'],
MedicalPlan.hios_plan_identifier == frame['hios_plan_identifier'],
MedicalPlan.plan_year == frame['plan_year'],
MedicalPlan.group_or_individual_plan_type == frame['group_or_individual_plan_type']).first()
sess.commit()
frame['medical_plan_id'] = list(medical_plan)[0]
df = df.append(frame)
print df
The df before the loop is ,
wellthie_issuer_identifier ... service_area_id
0 UHC99806 ... 1
[1 rows x 106 columns]
Normally the column and data should be added to this row. But I am getting 2 rows instead and only the last loop value inserted. df after the loop, column is getting created but the data is wrong.
wellthie_issuer_identifier ... medical_plan_id
0 UHC99806 ... NaN
0 UHC99806 ... 879519.0
[2 rows x 107 columns]
How do I achieve this. Output I should get as below-
wellthie_issuer_identifier ... service_area_id medical_plan_id
0 UHC99806 ... 1 879519.0
[1 rows x 107 columns]
try 1:
I called the get_id method like below -
def test(self, sess, df):
print ("in test", df)
for index, frame in df.iterrows():
id = self.get_id(sess, frame)
df['medical_plan_id'] = df.apply(id, axis=1)
print df
Upvotes: 0
Views: 84
Reputation: 3591
def test(self, sess, df):
def get_id(frame):
medical_plan = sess.query(MedicalPlan.id).filter(MedicalPlan.issuer_id == frame['issuer_id'],
MedicalPlan.hios_plan_identifier == frame['hios_plan_identifier'],
MedicalPlan.plan_year == frame['plan_year'],
MedicalPlan.group_or_individual_plan_type == frame['group_or_individual_plan_type']).first()
sess.commit()
return list(medical_plan)[0]
df['medical_plan_id']=df.apply(get_id, axis =1)
print(df)
If you want medical_plan_id
to be an int, you can change the last line of get_id
to return int(list(medical_plan)[0])
. Also, you probably could do
medical_plan = sess.query(MedicalPlan.id).filter(
all([MedicalPlan.attribute == frame.attribute for attribute in
['issuer_id','hios_plan_identifier','plan_year','group_or_individual_plan_type']])).first()
or
attributes = ['issuer_id','hios_plan_identifier','plan_year','group_or_individual_plan_type']
medical_plan = sess.query(MedicalPlan.id).filter(all(MedicalPlan[attributes]==frame[attributes])).first())
(I can't say for certain whether that will work without knowing what kind of object MedicalPlan
is.)
Upvotes: 1