QHarr

Reputation: 84465

Pandas: Read specific Excel cell value into a variable

Situation:

I am using pandas to parse in separate Excel (.xlsx) sheets from a workbook with the following setup: Python 3.6.0 and Anaconda 4.3.1 on Windows 7 x64.

Problem:

I have been unable to find out how to set a variable to a specific Excel cell value using pandas, e.g. the equivalent of var = Sheet['A3'].value for cell A3 on 'Sheet2'.

Question:

Is this possible? If so, how?

What I have tried:

I have searched through the pandas documentation on dataframe and various forums but haven't found an answer to this.

I know I can work around this using openpyxl (where I can specify a cell coordinate), but I want:

  1. To use pandas, if possible;
  2. To read in the file only once.

I have imported numpy, as well as pandas, so was able to write:

xls = pd.ExcelFile(filenamewithpath) 

data = xls.parse('Sheet1')
dateinfo2 = str(xls.parse('Sheet2', parse_cols = "A", skiprows = 2, nrows = 1, header = None)[0:1]).split('0\n0')[1].strip()

Reading 'Sheet1' into 'data' is fine, as I have a function to collect the range I want.

I am also trying to read the value in cell "A3" from a separate sheet ('Sheet2'), and the code I have at present is clunky. It gets the value out as a string, as required, but is in no way pretty. I only want this cell value and as little additional sheet info as possible.

Upvotes: 17

Views: 116801

Answers (3)

Nilanjan

Reputation: 176

You can use pandas read_excel, which has a skipfooter argument (spelled skip_footer in older pandas versions). This should work, where skipendrows is the number of rows at the end of the sheet that you want to skip.

data = pd.read_excel(filename, 'Sheet2', usecols="A", skiprows=2, skipfooter=skipendrows, header=None)
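
If only one cell is needed, the same read_excel arguments can be narrowed so that a single row and column are parsed. The following is a minimal sketch, not a tested recipe: cell_read_kwargs and read_cell are hypothetical helper names, and it assumes a pandas version in which read_excel accepts usecols and nrows. An empty target cell may come back as NaN or raise an IndexError.

```python
def cell_read_kwargs(column, row):
    """Build read_excel keyword arguments that select a single cell.

    column is an Excel column letter such as "A";
    row is the 1-based Excel row number.
    """
    if row < 1:
        raise ValueError("Excel rows are numbered from 1")
    return {
        "usecols": column.upper(),  # parse only this column
        "skiprows": row - 1,        # skip every row above the target
        "nrows": 1,                 # then read exactly one row
        "header": None,             # do not treat that row as a header
    }


def read_cell(io, sheet_name, column, row):
    """Return one cell's value; io may be a file path or a pd.ExcelFile."""
    import pandas as pd  # local import keeps cell_read_kwargs pandas-free

    frame = pd.read_excel(io, sheet_name=sheet_name,
                          **cell_read_kwargs(column, row))
    return frame.iat[0, 0]  # the 1x1 frame holds just the requested cell
```

Passing the already opened workbook, for example str(read_cell(xls, 'Sheet2', 'A', 3)) with xls = pd.ExcelFile(filenamewithpath), means the file is opened only once even though two sheets are parsed.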

Upvotes: 3

Arthur D. Howland

Reputation: 4557

Reading an Excel file with pandas returns a DataFrame by default. You don't need an entire table, just one cell. The way I do it is to make that cell the header of a column, for example:

# Read Excel and select a single cell (and make it a header for a column)
data = pd.read_excel(filename, 'Sheet2', index_col=None, usecols = "C", header = 10, nrows=0)

This returns an empty DataFrame with a single column header and no data rows. Then isolate that header:

# Extract the value from the array of column headers
data = data.columns.values[0]
print(data)

Upvotes: 14

Yannis P.

Reputation: 2765

Elaborating on @FLab's comment, use something along these lines:

Edit:

I updated the answer to match the updated question, which asks how to read several sheets at once. By passing sheet_name=None to read_excel() you can read all the sheets at once; pandas returns a dict of DataFrames whose keys are the Excel sheet names.

In [10]:
import pandas as pd

df = pd.read_excel('Book1.xlsx', sheet_name=None, header=None)
df
Out[11]:
{u'Sheet1':    0
 0  1
 1  1, u'Sheet2':     0
 0   1
 1   2
 2  10}
In [13]:
data = df["Sheet1"]
secondary_data = df["Sheet2"]
secondary_data.loc[2,0]
Out[13]:
10
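
To address cells by their Excel names rather than by integer positions, a small converter can map a reference such as "A3" onto the zero-based row and column used above. This is only a sketch: a1_to_indices and cell_value are hypothetical names, and it assumes the sheet was read with header=None so that the frame's first row and column correspond to Excel row 1 and column A.

```python
import re

# One or more column letters followed by a 1-based row number, e.g. "A3".
_A1_PATTERN = re.compile(r"([A-Za-z]+)([1-9][0-9]*)")


def a1_to_indices(ref):
    """Convert an A1-style reference to zero-based (row, column) indices."""
    match = _A1_PATTERN.fullmatch(ref.strip())
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a bijective base-26 number: A=1 ... Z=26, AA=27.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def cell_value(frame, ref):
    """Look up one cell in a DataFrame that was read with header=None."""
    row, col = a1_to_indices(ref)
    return frame.iat[row, col]
```

With the dict of DataFrames above, cell_value(df["Sheet2"], "A3") looks up the same position as secondary_data.loc[2,0].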

Alternatively, as noted in this post, if your Excel file has several sheets you can pass sheet_name (spelled sheetname in older pandas versions) a list of the sheet names to parse, e.g.

df = pd.read_excel('Book1.xlsx', sheet_name=["Sheet1", "Sheet2"], header=None)

Credits to user6241235 for digging out the last alternative.

Upvotes: 11
