Pedro Cintra

Reputation: 177

Is there any way to improve this loop?

There are several inputs, and each one is either a list of strings or None.

I need to add df.col_2 to df.col_1 on every row where the value in the corresponding column (df.INPUT_1, df.INPUT_2, ...) is found in that input.

i.e.:

if df.INPUT_1.iloc[0] in input_1:
    df.col_1.iloc[0] = df.col_1.iloc[0] + df.col_2.iloc[0]

I have the following code that can do the work for two inputs:

if input_1 == None and input_2 == None:
    for i in range(len(df)):
        df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]
elif input_1 != None and input_2 == None:
    for i in range(len(df)):
        if (df.INPUT_1.iloc[i] in input_1):
            df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]
elif input_1 == None and input_2 != None:
    for i in range(len(df)):
        if (df.INPUT_2.iloc[i] in input_2):
            df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]
elif input_1 != None and input_2 != None:
    for i in range(len(df)):
        if (df.INPUT_1.iloc[i] in input_1) and (df.INPUT_2.iloc[i] in input_2):
            df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]

But I'm after a more Pythonic way, so that I don't have to write extensive code every time I need to check another input.

Upvotes: 0

Views: 55

Answers (2)

JohanL

Reputation: 6891

How about:

for i in range(len(df)):
    if ((not input_1 or df.INPUT_1.iloc[i] in input_1) and 
        (not input_2 or df.INPUT_2.iloc[i] in input_2)):
        df.col_1.iloc[i] += df.col_2.iloc[i]

This inverts the structure so that the for loop is outermost and each constraint is tested individually for every row. Depending on the data, it may be a bit less efficient, but it is probably easier to grasp.

Edit:

Or to align more with the original code:

for i in range(len(df)):
    if ((input_1 is None or df.INPUT_1.iloc[i] in input_1) and
        (input_2 is None or df.INPUT_2.iloc[i] in input_2)):
        df.col_1.iloc[i] += df.col_2.iloc[i]

As pointed out by @MoritzVetter below, my un-edited answer treats an empty list differently than the original code. The solution I presented will ignore empty lists while the original code will assume them to be lists in which to look for inputs. The reason I made this choice was because I believed this to be the intention of the original code, but of course I do not know this for sure.

If empty lists are to be considered valid, note that, as mentioned in the comment, a better way to compare with None is var is not None rather than var != None. It may be a minor point, but it makes the code a bit more "Pythonic", and there are cases where the results differ (when a class overloads the equality test).
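To make the difference concrete, here is a minimal sketch in plain Python (no pandas). The helper names passes_truthy and passes_none_only are made up for illustration; they only mirror the two styles of test discussed here, a truthiness test versus an explicit comparison with None.

```python
def passes_truthy(value, allowed):
    # Truthiness test: both None and an empty list switch the filter off.
    return not allowed or value in allowed


def passes_none_only(value, allowed):
    # Explicit None test: only None switches the filter off,
    # so an empty list matches nothing.
    return allowed is None or value in allowed


for allowed in (None, [], ["a"]):
    print(allowed, passes_truthy("a", allowed), passes_none_only("a", allowed))
```

The two helpers agree for None and for a non-empty list; they only disagree when the input is an empty list.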

Upvotes: 2

Moritz Vetter

Reputation: 129

I'd simply do something like this:


def check_valid(entry, sequence):
    # None means "no restriction for this input"
    if sequence is None:
        return True

    return entry in sequence


for i in range(len(df)):
    if check_valid(df.INPUT_1.iloc[i], input_1) and check_valid(df.INPUT_2.iloc[i], input_2):
        df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]

# For n inputs: pair each column's positional indexer with its input
CHECK_THOSE = (
    (df.INPUT_1.iloc, input_1),
    (df.INPUT_2.iloc, input_2),
    # ...
)

for i in range(len(df)):
    # indexer[i] is the value of that column on row i
    if all(check_valid(indexer[i], sequence) for indexer, sequence in CHECK_THOSE):
        df.col_1.iloc[i] = df.col_1.iloc[i] + df.col_2.iloc[i]
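
If the row-by-row loop turns out to be slow, the same idea can probably be expressed with a boolean mask instead. This is only a sketch under assumptions: the function name add_where_matching and the filters dict (column name mapped to a list of allowed values, or to None) are hypothetical and not part of the question. It relies on pandas' Series.isin and DataFrame.loc, and it treats an empty list as "matches nothing", like the is None test does.

```python
import pandas as pd


def add_where_matching(df, filters):
    """Add col_2 to col_1 on the rows that pass every active filter.

    `filters` (a hypothetical structure) maps a column name to a list of
    allowed values, or to None when that column should not restrict anything.
    """
    # Start with every row selected, then narrow down per active filter.
    mask = pd.Series(True, index=df.index)
    for column, allowed in filters.items():
        if allowed is not None:
            mask &= df[column].isin(allowed)
    # Update only the selected rows, in place.
    df.loc[mask, "col_1"] += df.loc[mask, "col_2"]
    return df


df = pd.DataFrame({
    "INPUT_1": ["a", "b", "a", "c"],
    "INPUT_2": ["x", "x", "y", "y"],
    "col_1": [1, 2, 3, 4],
    "col_2": [10, 20, 30, 40],
})
result = add_where_matching(df, {"INPUT_1": ["a", "b"], "INPUT_2": None})
print(result)
```

Adding another input then only means adding one more entry to the dict.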


Upvotes: 1