Ethan Jaworski
Ethan Jaworski

Reputation: 1

How can I loop through my spreadsheet cell-by-cell and find cells that are blank?

I am trying to loop through an enourmous spreadsheet using various rules. First I would like to find all of the blank cells.

Currently I am trying something like this using pandas:

excel_file = <file name>
df = pd.read_excel(excel_file)
for item in df.items():
     if item == "":
          <add cell to list of blank cells>

Upvotes: 0

Views: 70

Answers (2)

MRasheed
MRasheed

Reputation: 29

If you need to make a list of empty cells as Excel references try this code:

def ExcelNullCells(excel_file_path):
# important note this function assume the first row will be a header in excel file and your data start from first column in excel   
    import pandas as pd
    import numpy as np
# making function to confert decimel column number to excel column referance
    def DEC2Alpha(x):
        import numpy as np
        alpha="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        lst=[[0,r:=x%26,x:=x//26]]
        n=0
        while lst[n][2]>1:
            lst.append([n:=lst[n][0]+1, r:=x%26,x:=x//26])
        tr_arr= np.array(lst).T
        last_remin=tr_arr[2][len(tr_arr[0])-1]
        res_num= np.flip(np.append(tr_arr[1],last_remin))
        res_num[len(res_num)-1]=res_num[len(res_num)-1]+1
        if res_num[0]==0:
            res_num=res_num[1:]
        res_alpha="".join([alpha[num-1] for num in res_num])
        return res_alpha
#reading excel file and convert it to array 
    path = excel_file_path
    data = pd.read_excel(path)
    data_np= np.array(data)
#checking Null value in array and return Null cells value as excel cell referance
    a=pd.isna(data)
    nalist=np.where(a)
    excel_column=list(map(DEC2Alpha,nalist[1]))
    excel_row=list(map(lambda row:row+2,nalist[0]))
    excel_cell=list(map(lambda val :val[0]+str(val[1]), zip(excel_column, excel_row)))
    return excel_cell

Upvotes: -1

nuenen
nuenen

Reputation: 51

If you want to use the dataframe, for finding empty cells you can use numpy.where. If you use numpy.where(df = '') you can get indices of rows and columns where the cells are empty:

data = {
'A': [1, 2, '', 4],
'B': [5, '', 7, 8],
'C': [9, 10, 11, '']
}
df = pd.DataFrame(data)
empty_string_indices = np.where(df == '')
empty_string_list = list(zip(empty_string_indices[0], empty_string_indices[1]))
print(empty_string_list)

And the output would be:

[(1, 1), (2, 0), (3, 2)]

This is the solution for empty strings specifically. If your empty cells actually contain NaN values, use np.where(pd.isnull(df)) instead

Upvotes: 4

Related Questions