Reputation: 1615
I need to write a parameterized for loop.
# This works but...
df["ID"]=np_get_defined(df["methodA"+"ID"], df["methodB"+"ID"],df["methodC"+"ID"])
# I need a for loop as follows
df["ID"]=np_get_defined(df[sm+"ID"] for sm in strmethods)
and I get the following error:
ValueError: Length of values does not match length of index
Remaining definitions:
import numpy as np
import pandas as pd
# df is a pandas.DataFrame
strmethods = ['methodA', 'methodB', 'methodC']
def get_defined(*args):
    # Keep only values that are not null, do not contain 'N/A', and are not '0'
    strs = [str(arg) for arg in args if not pd.isnull(arg) and 'N/A' not in str(arg) and arg != '0']
    return ''.join(strs) if strs else None
np_get_defined = np.vectorize(get_defined)
Upvotes: 0
Views: 357
Reputation: 69042
df["ID"]=np_get_defined(df[sm+"ID"] for sm in strmethods)
means you're passing a generator as a single argument to the called function.
If you want to expand the generated sequence into separate positional arguments, use the *
operator:
df["ID"] = np_get_defined(*(df[sm + "ID"] for sm in strmethods))
# or:
df["ID"] = np_get_defined(*[df[sm + "ID"] for sm in strmethods])
The first unpacks the elements of a generator expression; the second unpacks a list comprehension. The result is the same in either case.
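To see the difference in isolation, here is a minimal sketch in plain Python, without pandas or NumPy. The helper count_args is hypothetical and only exists for this illustration; it reports how many positional arguments it received.

```python
def count_args(*args):
    """Hypothetical helper: return how many positional arguments were received."""
    return len(args)

names = ["methodA", "methodB", "methodC"]

# Without *, the whole generator object arrives as one argument.
single = count_args(name + "ID" for name in names)

# With *, the generator is consumed and each element becomes its own argument.
unpacked = count_args(*(name + "ID" for name in names))

print(single, unpacked)  # prints: 1 3
```

In the question's code the same thing happens: without the *, np_get_defined receives one generator object instead of three columns.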
Upvotes: 1
Reputation: 8310
I think the reason why it doesn't work is that your DataFrame
consists of columns with different lengths.
Upvotes: 0