Shane Bowe

Reputation: 41

Trying to search pandas DataFrame, but coming up empty

I am writing a Python program attached to a Google Sheet; its purpose is to search through the spreadsheet and, based on user input, find the row that contains the entered "ID". I am trying to get it to return the other values in that row, to be used in pricing formulas and eventually for building a price tag.

The problem is that the program runs fine against a local CSV. After connecting to a Google Sheet, it now tells me "index 0 is out of bounds for axis 0 with size 0". My DataFrame is structured identically to the one built from the local CSV, and when I print the df it even displays all the data, but when I run my search everything comes up empty. For example, if I run print(costi) I get "[]" when I should be getting ["value"] (value being whatever is in that cell).

    from __future__ import print_function

    import os.path

    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build
    from googleapiclient.errors import HttpError

    from google.oauth2 import service_account

    import pandas as pd
    from PIL import Image, ImageDraw, ImageFont
    import math


    SERVICE_ACCOUNT_FILE = 'keys.json'
    SCOPES = ['https://www.googleapis.com/auth/spreadsheets']

    creds = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)


    # The ID and range of a sample spreadsheet.
    SPREADSHEET_ID = '********************'


    service = build('sheets', 'v4', credentials=creds)

    # Call the Sheets API
    sheet = service.spreadsheets()
    result = sheet.values().get(spreadsheetId=SPREADSHEET_ID,
                                range="Inv!A1:G610").execute()
    values = result.get('values', [])

    df = pd.DataFrame(values, columns=['Location', 'Inv Status Code', 'Model #', 'Alternate ID', 'Serial #', 'Model Description', 'Actual Cost'])

    # Input to be used for searching
    searchAlt = int(input("Stock Number: ").lstrip("0"))
    condition = int(input("Condition 1-10: "))

    # Identifying all columns in the row where the Alternate ID column matches, placing results into an array
    costi = df.loc[df['Alternate ID'] == searchAlt]['Actual Cost'].values

The table has the columns 'Location', 'Inv Status Code', 'Model #', 'Alternate ID', 'Serial #', 'Model Description', 'Actual Cost'. This is a huge ask, and I am sure I am missing some important things here and have some pretty bad practices. This is my first coding project ever; I have been able to get along for most of it, but this is halting me completely. Thank you so much for any and all help, and for everyone's time!

Upvotes: 0

Views: 114

Answers (1)

grey_ranger

Reputation: 1030

This is a hard one to debug without access to values or df. I would suggest adding print(values) right after you define it, to make sure you are actually reading values back (and not building an empty DataFrame). Then do the same for df. If both of those print real data, the next step is to check your searchAlt value. It's a common mistake to assume one datatype (here, searchAlt is an int) when the DataFrame column actually holds a different type. You can check this by running:

    print(df['Alternate ID'])

It should show something like:

    0    2  # your values here
    1    2
    Name: Alternate ID, dtype: int64

If the last line says dtype: object or anything other than an integer type, then the searchAlt comparison won't find any matches and you'll end up with an empty table. Note that int and int64 are both fine, since numpy can compare between them. But object means pandas is interpreting the column as strings. In that case, you'll have to convert the datatype yourself. See the pandas docs for details: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.convert_dtypes.html
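To make the failure mode concrete, here is a minimal sketch with made-up rows (the column names match the question, the values are hypothetical). It reproduces the empty result when a string column is compared against an int, and shows one way to fix it using pd.to_numeric:

    import pandas as pd

    # The Sheets API hands back cell contents as strings, so a DataFrame
    # built directly from those rows typically has object (string) columns.
    values = [["123", "10.50"], ["456", "20.00"]]  # hypothetical rows
    df = pd.DataFrame(values, columns=["Alternate ID", "Actual Cost"])

    print(df["Alternate ID"].dtype)  # object -> pandas sees strings

    # Comparing a string column to an int never matches, so this is empty:
    print(df.loc[df["Alternate ID"] == 123, "Actual Cost"].values)  # []

    # Converting the column to a numeric dtype makes the comparison work:
    df["Alternate ID"] = pd.to_numeric(df["Alternate ID"])
    print(df.loc[df["Alternate ID"] == 123, "Actual Cost"].values)  # ['10.50']

Alternatively, you could keep the column as strings and compare against input(...) without the int(...) conversion; the key point is that both sides of the == must have the same type.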

Upvotes: 1
