Iterating through a dataframe and adding a new row

Question

I want to add a new column to my exisitng dataframe.

I am doing this,

def test(self, sess, df):

    for index, frame in df.iterrows():
        medical_plan = sess.query(MedicalPlan.id).filter(MedicalPlan.issuer_id == frame['issuer_id'],
                                  MedicalPlan.hios_plan_identifier == frame['hios_plan_identifier'],
                                  MedicalPlan.plan_year == frame['plan_year'],
                                  MedicalPlan.group_or_individual_plan_type == frame['group_or_individual_plan_type']).first()
        sess.commit()
        frame['medical_plan_id'] = list(medical_plan)[0]
        df = df.append(frame)
    print df

The df before the loop is ,

  wellthie_issuer_identifier       ...       service_area_id
0                   UHC99806       ...                     1

[1 rows x 106 columns]

Normally the column and data should be added to this row. But I am getting 2 rows instead and only the last loop value inserted. df after the loop, column is getting created but the data is wrong.

 wellthie_issuer_identifier       ...       medical_plan_id
0                   UHC99806       ...                   NaN
0                   UHC99806       ...              879519.0

[2 rows x 107 columns]

How do I achieve this. Output I should get as below-

 wellthie_issuer_identifier       ...       service_area_id  medical_plan_id
0                   UHC99806       ...                     1    879519.0

[1 rows x 107 columns]

try 1:

I called the get_id method like below -

 def test(self, sess, df):
        print ("in test", df)
        for index, frame in df.iterrows():
            id = self.get_id(sess, frame)
            df['medical_plan_id'] = df.apply(id, axis=1)
        print df

Acccumulation · Accepted Answer

def test(self, sess, df):
     def get_id(frame):
            medical_plan = sess.query(MedicalPlan.id).filter(MedicalPlan.issuer_id == frame['issuer_id'],
                                          MedicalPlan.hios_plan_identifier == frame['hios_plan_identifier'],
                                          MedicalPlan.plan_year == frame['plan_year'],
                                          MedicalPlan.group_or_individual_plan_type == frame['group_or_individual_plan_type']).first()
            sess.commit()
            return list(medical_plan)[0]
      df['medical_plan_id']=df.apply(get_id, axis =1)
      print(df)

If you want medical_plan_id to be an int, you can change the last line of get_id to return int(list(medical_plan)[0]). Also, you probably could do

    medical_plan = sess.query(MedicalPlan.id).filter(
            all([MedicalPlan.attribute == frame.attribute for attribute in 
                 ['issuer_id','hios_plan_identifier','plan_year','group_or_individual_plan_type']])).first()

or

        attributes = ['issuer_id','hios_plan_identifier','plan_year','group_or_individual_plan_type']
        medical_plan = sess.query(MedicalPlan.id).filter(all(MedicalPlan[attributes]==frame[attributes])).first())

(I can't say for certain whether that will work without knowing what kind of object MedicalPlan is.)

Iterating through a dataframe and adding a new row

Answers (1)

Related Questions