soggypotato
soggypotato

Reputation: 23

Making function to check for matching data in excel keeps returning false

I wanted to make a function using python to check whether a number (obtained based on a file string) is exist or not in an excel using pandas, if the number exist it will do something like "... is registered" and do some function, and vice versa.

The code that i tried to use is like this:

from tkinter import filedialog
import pandas as pd
import re

file = filedialog.askopenfilename(initialdir="C:/", title="choose file", filetypes=(("xlsx", "*.xlsx"), ("xls", "*.xls")))
name = (r'/(\w+)_(\w+)_excel.xlsx')
number = re.search(name, file).group(2)

references = pd.read_excel(r'C:/example.xlsx')

if float(number) == references:
    print(number + " is registered") #and more function
else:
    print(number + " is not registered") #and more function

When i used this code it will return ValueError:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I looked up for it a little and found that (as far as i understand) this happen because the function is not specific enough and can return both true or false, and then i tried using .any or .all after the references in the function but somehow it keeps returning false even if the number does exist in excel.

When i tried to check it using print(number == references) it will show a list and it would tell me where the matching data is found, but somehow when i tried to make a function out of it, it keeps returning false.

The references excel just contains a "Title" in cell A1 and the rest is just the number registered in it.

I tried look for it and found something about pandas.DataFrame.isin but i dont understand how to put it as a function and i dont think i can use it on my problem.

I am sorry if this is a fairly easy question but somehow i cannot find the answer for this, also i am new in python and english is not my first language.

Upvotes: 2

Views: 61

Answers (1)

JvdV
JvdV

Reputation: 75840

I'm still relatively new to Python, but let me think along:

  • You are pulling a variable from an excel-filename, let's say 1234ABCD from something like TestFile_1234ABCD_excel.xlsx. Right?
  • Then you are pulling in the content of a file called C:/example.xlsx into a dataframe, correct?
  • Last, you are checking if the variable is found somewhere in the dataframe?

So, what I think went wrong here is that you are comparing a dataframe against a variable using == which is the equality operator. This might well be the reason for returning False but I'm in no place to say that this is the actual cause since I don't fully understand Pyhton mechanics as yet.

However what does work for me replicating your issue was the following; I simply used:

if number in references.values:
    print(number + " is registered") #and more function
else:
    print(number + " is not registered") #and more function

This worked because "the dataframe class provides a member variable i.e DataFrame.values. It returns a numpy representation of all the values in dataframe. We can use the in & not in operators on these values to check if a given element exists or not." Source

Upvotes: 1

Related Questions