Reputation: 233
I don't know whether it can be achieved or not using python pandas. Here is the scenario I'm trying to do
I created a databases connection to MSSQL using python (pyodbc, sqlalchemy)
I read one table and saved it as dataframe like this
data = pd.read_sql_table('ENCOUNTERP1', conn)
and the dataframe looks like this
ENCOUNTERID DIAGCODE DIAGSEQNO POA DIAGVERFLAG
0 78841 3GRNFC 3 P
1 89960 6
2 86479 N18BZON 9 K
3 69135 MPPY3 9 9 0
4 32422 DS6SBT 2 P
5 69135 4 D H
6 92019 PP0 1
7 42105 2 L
8 99256 U 1 J
9 33940 II9ZODF 3 2
10 33940 OH 1
11 65108 CI6COE 8 U
12 77871 Y3ZHN1 7 S
13 65108 73BJBZV 8 7
14 99256 7 1 T
Now I have one more dataframe (dp = pd.read_sql_table('tblDiagnosis', conn)
)which has DIAGCODE column in it and they all are unique
I want to get those DIAGCODE from dataframe dp
and update it to dataframe data['DIAGCODE']
I tried to do like this iterate over each row and update another dataframe row by row but here in this code the second for loop will start from 0 index every time so, finally the entire row is filled with one value.
for index, row in dp.iterrows():
for i, r in data.iterrows():
r['DIAGCODE'] = row['Code']
First of all the two dataframe's are not equal in size this is dataframe for data
Code Description Category IcdSet
0 001 001 - CHOLERA CHOLERA 9
1 0010 0010 - CHOLERA D/T V. CHOLERAE CHOLERA 9
2 0011 0011 - CHOLERA D/T V. EL TOR CHOLERA 9
3 0019 0019 - CHOLERA NOS CHOLERA 10
4 002 002 - TYPHOID/PARATYPHOID FEV TYPHOID AND PARATYPHOID FEVERS 9
5 0020 0020 - TYPHOID FEVER TYPHOID AND PARATYPHOID FEVERS 9
and the output should be something like this
ENCOUNTERID DIAGCODE DIAGSEQNO POA DIAGVERFLAG
0 78841 001 3 P
1 89960 0010 6
2 86479 0011 9 K
3 69135 0019 9 9 0
4 32422 002 2 P
5 69135 0020 4 D H
I would like to add one condition from dataframe dp like this
for index, row in dp.iterrows():
for i, r in data.iterrows():
if row['Code'] == 10:
r['DIAGCODE'] = row['Code']
Upvotes: 1
Views: 2170
Reputation: 735
I assume that the two tables have same row size and are both in desired order you wanted. If it's correct, then you can simply use:
df = pd.concat([data, pd], axis=1)
Then extract the columns you wanted:
df = df.ix[;,['ENCOUNTERID','CODE', 'DIAGSEQNO', 'POA', 'DIAGVERFLAG']].rename(columns={'CODE': 'DIAGCODE'})
If this meets your requirement, please vote.
Sorry, the .ix was deprecated even it can still be used without problem. So please use
df = df[['ENCOUNTERID','CODE', 'DIAGSEQNO', 'POA', 'DIAGVERFLAG']].rename(columns={'CODE': 'DIAGCODE'})
BTW, the issue in your code is that you were using two loops which makes the last value of inside loop to be the final value of outside loop. So here is solution:
for row, r in zip(pd.iterrows(),data.iterrows()):
r[1]['DIAGCODE']=row[1]['CODE']
Upvotes: 1