Reputation: 67
I am trying to convert the rows of a pandas DataFrame into instances of my own class. Here is the code:
import os
import pandas as pd
import math
cwd = os.path.abspath('')
files = os.listdir(cwd)
df = pd.DataFrame()
for file in files:
    if file.endswith('.XLSX'):
        df = df.append(pd.read_excel(file), ignore_index=True)
#print(df)
array = df.values.tolist()
print(array)
class Item():
    def __init__(self, name, cost, gender, prime):
        self.__name = name
        self.__cost = cost
        self.__gender = gender
        self.__prime = prime

    def __repr__(self):
        return f"Item({self.__name},{self.__cost},{self.__gender},{self.__prime})"
mylist = [Item(*k) for k in array if k[0] and k[1] and k[2] and k[3]]
#print(mylist)
However, the data frame has missing values, so when I convert it to a list using array = df.values.tolist(),
the empty cells come out as nan rather than None. Because nan is truthy, the filter in mylist does not remove those rows.
Could you show me the code to use instead? Thank you in advance.
Upvotes: 1
Views: 56
Reputation: 908
There are two ways to do this:
filter
import math
...
array = df.values.tolist()
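# Keep a row only if none of its values is a NaN; list() materializes the result so it can be printed and reused.
array = list(filter(lambda e: all(map(lambda ee: not isinstance(ee, (float, int)) or not math.isnan(ee), e)), array))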
...
dropna
...
df = df.dropna()
array = df.values.tolist()
...
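To compare the two approaches side by side, here is a minimal sketch. The sample DataFrame and the is_missing helper are made up for illustration and are not from the question; the sketch assumes missing cells show up as None or float nan.

```python
import math

import pandas as pd

# Hypothetical sample data standing in for the concatenated Excel files.
df = pd.DataFrame(
    {
        "name": ["shirt", "hat", None],
        "cost": [10.0, float("nan"), 5.0],
        "gender": ["M", "F", "F"],
        "prime": [True, False, True],
    }
)


def is_missing(value):
    """Hypothetical helper: treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Way 1: convert first, then filter out rows containing a missing value.
rows_filtered = [
    row for row in df.values.tolist() if not any(is_missing(v) for v in row)
]

# Way 2: drop incomplete rows in pandas, then convert.
rows_dropna = df.dropna().values.tolist()

print(rows_filtered)
print(rows_dropna)
```

On this sample both ways should keep only the first row. The dropna version is shorter and lets pandas decide what counts as missing.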
Upvotes: 0
Reputation: 181
This is much easier to do while it's still a pandas DataFrame. If you insert
df.dropna(inplace=True)
before you call df.values.tolist(),
then any rows with missing values will be removed.
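As a rough end-to-end sketch of that idea, assuming a small made-up DataFrame in place of the Excel files and a simplified version of the Item class from the question (plain attribute names are my own choice here):

```python
import pandas as pd


class Item:
    """Simplified stand-in for the Item class in the question."""

    def __init__(self, name, cost, gender, prime):
        self.name = name
        self.cost = cost
        self.gender = gender
        self.prime = prime

    def __repr__(self):
        return f"Item({self.name},{self.cost},{self.gender},{self.prime})"


# Hypothetical sample data; the second row has a missing cost.
df = pd.DataFrame(
    {
        "name": ["shirt", "hat", "scarf"],
        "cost": [10.0, float("nan"), 0.0],
        "gender": ["M", "F", "F"],
        "prime": [True, False, False],
    }
)

# Remove every row that has at least one missing value.
df.dropna(inplace=True)

# No truthiness filter needed any more: all remaining rows are complete.
mylist = [Item(*row) for row in df.values.tolist()]
print(mylist)
```

Note that the original `if k[0] and k[1] and k[2] and k[3]` check would also throw away valid rows whose cost is 0 or whose prime is False, like the "scarf" row here, so dropping it after dropna is probably what you want.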
Upvotes: 1