supar
supar

Reputation: 233

How to update row by row of dataframe using python pandas

I don't know whether it can be achieved or not using python pandas. Here is the scenario I'm trying to do

I created a databases connection to MSSQL using python (pyodbc, sqlalchemy)

I read one table and saved it as dataframe like this

data = pd.read_sql_table('ENCOUNTERP1', conn)

and the dataframe looks like this

ENCOUNTERID DIAGCODE DIAGSEQNO POA DIAGVERFLAG
0        78841   3GRNFC         3   P
1        89960                  6
2        86479  N18BZON         9   K
3        69135    MPPY3         9   9           0
4        32422   DS6SBT         2               P
5        69135                  4   D           H
6        92019      PP0         1
7        42105                  2               L
8        99256        U         1               J
9        33940  II9ZODF         3   2
10       33940       OH         1
11       65108   CI6COE         8   U
12       77871   Y3ZHN1         7               S
13       65108  73BJBZV         8   7
14       99256        7         1               T

Now I have one more dataframe (dp = pd.read_sql_table('tblDiagnosis', conn))which has DIAGCODE column in it and they all are unique

I want to get those DIAGCODE from dataframe dp and update it to dataframe data['DIAGCODE']

I tried to do like this iterate over each row and update another dataframe row by row but here in this code the second for loop will start from 0 index every time so, finally the entire row is filled with one value.

for index, row in dp.iterrows(): 
        for i, r in data.iterrows():
            r['DIAGCODE'] = row['Code']

First of all the two dataframe's are not equal in size this is dataframe for data

Code Description Category IcdSet 0 001 001 - CHOLERA CHOLERA 9 1 0010 0010 - CHOLERA D/T V. CHOLERAE CHOLERA 9 2 0011 0011 - CHOLERA D/T V. EL TOR CHOLERA 9 3 0019 0019 - CHOLERA NOS CHOLERA 10 4 002 002 - TYPHOID/PARATYPHOID FEV TYPHOID AND PARATYPHOID FEVERS 9 5 0020 0020 - TYPHOID FEVER TYPHOID AND PARATYPHOID FEVERS 9

and the output should be something like this

ENCOUNTERID DIAGCODE DIAGSEQNO POA DIAGVERFLAG 0 78841 001 3 P 1 89960 0010 6 2 86479 0011 9 K 3 69135 0019 9 9 0 4 32422 002 2 P 5 69135 0020 4 D H

I would like to add one condition from dataframe dp like this

for index, row in dp.iterrows(): for i, r in data.iterrows(): if row['Code'] == 10: r['DIAGCODE'] = row['Code']

Upvotes: 1

Views: 2170

Answers (1)

AC at CA
AC at CA

Reputation: 735

I assume that the two tables have same row size and are both in desired order you wanted. If it's correct, then you can simply use:

df = pd.concat([data, pd], axis=1)

Then extract the columns you wanted:

df = df.ix[;,['ENCOUNTERID','CODE', 'DIAGSEQNO', 'POA', 'DIAGVERFLAG']].rename(columns={'CODE': 'DIAGCODE'})

If this meets your requirement, please vote.


Sorry, the .ix was deprecated even it can still be used without problem. So please use

df = df[['ENCOUNTERID','CODE', 'DIAGSEQNO', 'POA', 'DIAGVERFLAG']].rename(columns={'CODE': 'DIAGCODE'})

BTW, the issue in your code is that you were using two loops which makes the last value of inside loop to be the final value of outside loop. So here is solution:

for row, r in zip(pd.iterrows(),data.iterrows()):
    r[1]['DIAGCODE']=row[1]['CODE']

Upvotes: 1

Related Questions