How to remove lines with empty elements from a list converted from a pandas data frame using Python?

Question

I am trying to convert a pandas data frame into a list of instances of my custom class, and here is the code for it:

import os
import pandas as pd
import math
cwd = os.path.abspath('') 
files = os.listdir(cwd)  
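frames = []
for file in files:
    if file.endswith('.XLSX'):
        frames.append(pd.read_excel(file))
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()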
#print(df)
array = df.values.tolist()
print(array)

class Item():
    
    def __init__(self, name, cost, gender, prime):
        self.__name = name
        self.__cost = cost
        self.__gender = gender
        self.__prime = prime

    def __repr__(self):
        return f"Item({self.__name},{self.__cost},{self.__gender},{self.__prime})"
    


mylist = [Item(*k) for k in array if k[0] and k[1] and k[2] and k[3]]
#print(mylist)

However, the data frame has missing elements, so when I convert it to a list with array = df.values.tolist(), the empty cells come out as "nan" rather than "None". Because "nan" is truthy, the filter in "mylist" does not remove those rows.

So, can you show me the code to use instead? Thank you in advance.

tchar · Accepted Answer

There are two ways:

Use filter

import math
...
array = df.values.tolist()
# keep a row only if none of its numeric elements is NaN
array = list(filter(lambda e: all(map(lambda ee: not isinstance(ee, (float, int)) or not math.isnan(ee), e)), array))
...
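
To see why the truthiness test in the question lets NaN through, and how an explicit NaN check fixes it, here is a minimal sketch that runs without pandas. The `rows` data and the `has_nan` helper are made up for illustration; they only mimic the shape of `df.values.tolist()` output.

```python
import math

def has_nan(row):
    """Return True if any element of the row is a float NaN (hypothetical helper)."""
    return any(isinstance(value, float) and math.isnan(value) for value in row)

# Hypothetical rows shaped like the output of df.values.tolist()
rows = [
    ["shirt", 10.0, "M", True],
    ["hat", float("nan"), "F", False],
    [float("nan"), 5.0, "M", True],
]

# NaN is truthy, so "if k[0] and k[1] ..." does not filter it out
nan_is_truthy = bool(float("nan"))

# Keep only the rows that contain no NaN at all
clean_rows = [row for row in rows if not has_nan(row)]
print(clean_rows)
```

A list comprehension is used here instead of nested `filter`/`map` calls only because it is easier to read; the logic is the same.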

Use pandas

...
df = df.dropna()
array = df.values.tolist()
...
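
As a rough illustration of the pandas route, the sketch below builds a small data frame by hand (the column names and values are assumptions standing in for the Excel files in the question) and drops the incomplete rows before converting to a list.

```python
import pandas as pd

# Hypothetical data standing in for the Excel files in the question
df = pd.DataFrame(
    {
        "name": ["shirt", "hat", None],
        "cost": [10.0, None, 5.0],
        "gender": ["M", "F", "M"],
        "prime": [True, False, True],
    }
)

# Drop every row that has at least one missing value
array = df.dropna().values.tolist()
print(array)

# Alternatively, only require selected columns to be present
cost_known = df.dropna(subset=["cost"]).values.tolist()
```

If only some columns must be filled in, the `subset` argument of `dropna` limits the check to those columns, as in the last line.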
