Kadir

Reputation: 1615

Parameterized for loop in Python

I need to write a parameterized for loop.

# This works but...
df["ID"] = np_get_defined(df["methodA" + "ID"], df["methodB" + "ID"], df["methodC" + "ID"])

# I need a for loop as follows
df["ID"]=np_get_defined(df[sm+"ID"] for sm in strmethods)

With the loop version I get the following error:

ValueError: Length of values does not match length of index

Remaining definitions:

import numpy as np
import pandas as pd

df is a pandas.DataFrame

strmethods=['methodA','methodB','methodC']

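def get_defined(*args):
    # Keep only values that are not null, do not contain 'N/A', and are not the string '0'.
    strs = [str(arg) for arg in args if not pd.isnull(arg) and 'N/A' not in str(arg) and arg != '0']
    # Concatenate the surviving values, or return None if nothing is left.
    return ''.join(strs) if strs else None

# Vectorize so the function is applied element-wise across the input columns.
np_get_defined = np.vectorize(get_defined)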

Upvotes: 0

Views: 357

Answers (2)

mata

Reputation: 69042

df["ID"]=np_get_defined(df[sm+"ID"] for sm in strmethods) passes one generator object as the only argument to the function, so np_get_defined never receives the three columns as separate arguments.

If you want to expand the generated sequence into separate arguments, use the * operator:

df["ID"] = np_get_defined(*(df[sm + "ID"] for sm in strmethods))
# or:
df["ID"] = np_get_defined(*[df[sm + "ID"] for sm in strmethods])

The first unpacks a generator expression; the second unpacks a list comprehension. The result is the same in either case.
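To see the difference between the two call styles in isolation, here is a minimal sketch in plain Python. The helper count_args and the list columns are hypothetical names made up for this illustration; they are not part of the question's code, and no pandas or NumPy is involved.

```python
def count_args(*args):
    # Hypothetical helper: report how many positional arguments arrived.
    return len(args)


# Hypothetical stand-in for strmethods from the question.
columns = ["methodA", "methodB", "methodC"]

# Without *, the generator expression itself is the single argument.
n_without_star = count_args(name + "ID" for name in columns)

# With *, the generator is consumed and each element becomes its own argument.
n_with_star = count_args(*(name + "ID" for name in columns))

print(n_without_star)  # 1
print(n_with_star)     # 3
```

The same applies to any function declared with *args, including one wrapped by np.vectorize: it only sees separate arguments if the caller unpacks the sequence.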

Upvotes: 1

mic4ael

Reputation: 8310

I think it doesn't work because your DataFrame consists of columns with different lengths.

Upvotes: 0
