zhangruibo101

Reputation: 67

How to remove rows with empty elements from a list converted from a pandas DataFrame using Python?

I am trying to convert the rows of a pandas DataFrame into instances of my own custom class. Here is the code:

import os
import pandas as pd
import math
cwd = os.path.abspath('') 
files = os.listdir(cwd)  
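frames = []
for file in files:
    if file.endswith('.XLSX'):
        frames.append(pd.read_excel(file))
# DataFrame.append was removed in pandas 2.0; pd.concat combines the frames instead
df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()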
#print(df)
array = df.values.tolist()
print(array)

class Item():
    
    def __init__(self, name, cost, gender, prime):
        self.__name = name
        self.__cost = cost
        self.__gender = gender
        self.__prime = prime

    def __repr__(self):
        return f"Item({self.__name},{self.__cost},{self.__gender},{self.__prime})"
    


mylist = [Item(*k) for k in array if k[0] and k[1] and k[2] and k[3]]
#print(mylist)

However, the data frame has missing values. When I convert it to a list with array = df.values.tolist(), the empty cells come out as nan rather than None. Because nan is truthy, the filter in mylist does not remove those rows.
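The behaviour described above can be reproduced without pandas. This is a minimal sketch, assuming the missing cells arrive as ordinary float NaN values; the variable names are only for illustration.

```python
import math

nan = float("nan")  # stand-in for an empty cell after df.values.tolist()

# NaN is a non-zero float, so a plain truthiness test does not filter it out.
nan_is_truthy = bool(nan)

# None, by contrast, is falsy, which is why "if k[0]" would work for None.
none_is_truthy = bool(None)

# NaN never compares equal to itself, so "x == nan" cannot detect it either.
nan_equals_itself = nan == nan

# math.isnan is the reliable check for a float NaN.
detected = math.isnan(nan)

print(nan_is_truthy, none_is_truthy, nan_equals_itself, detected)
```

So a check such as k[1] passes for a NaN cost, and the row is kept.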

Can you show me the code to use instead? Thank you in advance.

Upvotes: 1

Views: 56

Answers (2)

tchar

Reputation: 908

There are two ways to do this:

  1. Use filter
import math
...
array = df.values.tolist()
# keep a row only if none of its numeric elements is NaN
array = list(filter(lambda e: all(not isinstance(ee, (float, int)) or not math.isnan(ee) for ee in e), array))
...

  2. Use pandas
...
df = df.dropna()
array = df.values.tolist()
...
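The filter approach can also be written with a small named helper, which is easier to read than nested lambdas. This is only a sketch: has_nan and the sample rows are made up for illustration and are not part of the original code.

```python
import math


def has_nan(row):
    """Return True if any element of the row is a float NaN."""
    # Strings and other non-float values are skipped, because math.isnan
    # raises TypeError for them. numpy.float64 subclasses float, so values
    # coming out of df.values.tolist() are covered by the isinstance check.
    return any(isinstance(x, float) and math.isnan(x) for x in row)


# Hypothetical rows in the shape produced by df.values.tolist().
array = [
    ["shirt", 10.0, "F", True],
    ["hat", float("nan"), "M", False],
    [float("nan"), 5.0, "M", True],
]

# Keep only the rows without any NaN element.
filtered = [row for row in array if not has_nan(row)]
print(filtered)
```

The filtered list can then be passed to the existing comprehension, for example [Item(*k) for k in filtered].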

Upvotes: 0

Mars Buttfield-Addison

Reputation: 181

This is much easier to do while the data is still a pandas DataFrame. If you insert

df.dropna(inplace=True)

before you call df.values.tolist(), any rows with missing values will be removed.
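As a rough sketch of the dropna approach on a small frame: the column names and values below are invented stand-ins for the Excel data, and the example assumes pandas is installed. It uses the reassigning form df.dropna(), which should give the same rows as inplace=True.

```python
import pandas as pd

# Hypothetical sample data standing in for the concatenated Excel sheets.
df = pd.DataFrame(
    {
        "name": ["shirt", "hat", None],
        "cost": [10.0, None, 5.0],
        "gender": ["F", "M", "M"],
        "prime": [True, False, True],
    }
)

# Drop every row that has a missing value in any column.
clean = df.dropna()
rows = clean.values.tolist()

# Alternatively, only require certain columns to be present via subset.
named = df.dropna(subset=["name"])

print(rows)
print(len(named))
```

With subset, rows that are missing a value only in other columns are kept, so NaN can still appear in the resulting list.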

Upvotes: 1
