Alison LT
Alison LT

Reputation: 405

Replace value of column based on value in separate column

I have a pandas DataFrame that looks like:

  ID  |   StateName   |   ZipCode
____________________________________  
  0          MD            20814     
  1                        90210   
  2          DC            20006   
  3                        05777   
  4                        12345

I have a function that will fill in StateName based on ZipCode value:

def FindZip(x):
    search = ZipcodeSearchEngine()
    zipcode = search.by_zipcode(x)
    return zipcode['State']

I want to fill in the blanks in the column StateName - based on the value of the corresponding ZipCode. I've unsuccessfully tried this:

test['StateName'] = test['StateName'].apply(lambda x: FindZip(test['Zip_To_Use']) if x == "" else x)

Basically, I want to apply a function to a column different from the column I am trying to change. I would appreciate any help! Thanks!

Upvotes: 1

Views: 85

Answers (2)

niraj
niraj

Reputation: 18208

You can try following:

test['StateName'] = test.apply(lambda x: FindZip(test['Zip_To_Use']) 
                                      if x['StateName'] == "" 
                                      else x['StateName'], axis = 1)

The above code applies to dataframe instead of StateName and using axis = 1, applies to columns.

Updated:

Updated with multiple condition in if statement (looking at the solution below):

test['StateName'] = test.apply(lambda x: FindZip(test['Zip_To_Use']) 
                                if ((x['StateName'] == "") and  (x['Zip_To_Use'] != ""))
                                else x['StateName'], axis = 1)

Upvotes: 2

Alison LT
Alison LT

Reputation: 405

I came up with a not very "pandorable" workaround. I would still love to see a more "pythonic" or "pandorable" solution if anyone has ideas! I essentially created a new list of the same length as the DataFrame and iterated through each row and then wrote over the column with the new list.

state = [FindState(test['Zip_To_Use'].iloc[i]) if (test['StateName'].iloc[i] == "" and test['Zip_To_Use'].iloc[i] != "") 
         else test['StateName'].iloc[i] for i in range(len(test))]

Restated in a regular for loop (for readability):

state = []
for i in range(len(test)):
    if (test['StateName'].iloc[i] == "" and test['Zip_To_Use'].iloc[i] != ""):
        state.append(FindState(test['Zip_To-Use'].iloc[i]))
    else:
        state.append(test['StateName'].iloc[i])

And then reassigned the column with this new list

test['StateName'] = state

Please let me know if you have a better solution!

Upvotes: 0

Related Questions