Ragnar

Reputation: 2690

Check whether the columns of one DataFrame are present in another, then fill the missing ones

I have two DataFrames, and I want to check whether each column of df1 (for example df1["A"]) is present in df2. If it is not, I want to add that column to df2 and fill it with 0.

I got it working with an ugly for loop. I am trying to optimize it, but I cannot figure out how.

testing_list = list(testing_df.columns)

for i in range(len(training_df.columns)):
    if not training_df.columns[i] in testing_list:
        testing_df[training_df.columns[i]] = 0

Upvotes: 0

Views: 36

Answers (1)

jezrael

Reputation: 862921

Use DataFrame.reindex with new columns created by Index.union:

import pandas as pd

testing_df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'F': list('aaabbb'),
})

training_df = pd.DataFrame({
    'A': list('abcdef'),
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
})

cols = testing_df.columns.union(training_df.columns, sort=False)
df = testing_df.reindex(cols, axis=1, fill_value=0)
print(df)
   A  B  F  C  D
0  a  4  a  0  0
1  b  5  a  0  0
2  c  4  a  0  0
3  d  5  b  0  0
4  e  5  b  0  0
5  f  4  b  0  0
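As a quick sanity check, here is a minimal sketch that runs the loop from the question next to the reindex approach and compares the resulting columns. The small frames and the names looped and reindexed are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical small frames, only for illustration
testing_df = pd.DataFrame({"A": list("abc"), "B": [4, 5, 4]})
training_df = pd.DataFrame({"A": list("abc"), "C": [7, 8, 9]})

# Loop version, as in the question: add each missing column filled with 0
looped = testing_df.copy()
for col in training_df.columns:
    if col not in looped.columns:
        looped[col] = 0

# Vectorized version: union of the column labels, then reindex along axis=1
cols = testing_df.columns.union(training_df.columns, sort=False)
reindexed = testing_df.reindex(cols, axis=1, fill_value=0)

print(list(looped.columns))
print(list(reindexed.columns))
```

With sort=False the existing columns of testing_df stay first and the missing ones are appended after them, which is the same order the loop produces.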

If you want to add the missing columns to both DataFrames, with the columns sorted, use DataFrame.align:

testing_df, training_df = testing_df.align(training_df, fill_value=0)
print(testing_df)
   A  B  C  D  F
0  a  4  0  0  a
1  b  5  0  0  a
2  c  4  0  0  a
3  d  5  0  0  b
4  e  5  0  0  b
5  f  4  0  0  b

print(training_df)
   A  B  C  D  F
0  a  0  7  1  0
1  b  0  8  3  0
2  c  0  9  5  0
3  d  0  4  7  0
4  e  0  2  1  0
5  f  0  3  0  0
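One thing to watch: without an axis argument, align also aligns the row index, so if the two frames have different row labels, extra rows filled with 0 would be added as well. A minimal sketch, assuming you only want the columns aligned, passes axis=1. The frames and the names test_aligned and train_aligned below are hypothetical, chosen to have different lengths.

```python
import pandas as pd

# Hypothetical frames with different numbers of rows
testing_df = pd.DataFrame({"A": ["a", "b"], "B": [4, 5]})
training_df = pd.DataFrame({"A": ["a", "b", "c"], "C": [7, 8, 9]})

# axis=1 aligns only the columns; the rows of each frame are left untouched
test_aligned, train_aligned = testing_df.align(
    training_df, join="outer", axis=1, fill_value=0
)

print(test_aligned)
print(train_aligned)
```

Both results get the sorted union of the column labels (A, B, C here), while testing_df keeps its 2 rows and training_df keeps its 3.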

Upvotes: 1
