Reputation: 84465
Situation:
I am using pandas to parse separate Excel (.xlsx) sheets from a workbook with the following setup: Python 3.6.0 and Anaconda 4.3.1 on Windows 7 x64.
Problem:
I have been unable to find out how to set a variable to the value of a specific Excel sheet cell, e.g. var = Sheet['A3'].value from 'Sheet2', using pandas.
Question:
Is this possible? If so, how?
What I have tried:
I have searched through the pandas documentation on DataFrame and various forums but haven't found an answer to this.
I know I can work around this using openpyxl (where I can specify a cell coordinate), but I want to use pandas if possible.
I have imported numpy, as well as pandas, so was able to write:
xls = pd.ExcelFile(filenamewithpath)
data = xls.parse('Sheet1')
dateinfo2 = str(xls.parse('Sheet2', parse_cols = "A", skiprows = 2, nrows = 1, header = None)[0:1]).split('0\n0')[1].strip()
Reading 'Sheet1' into 'data' is fine, as I have a function to collect the range I want.
I am also trying to read, from a separate sheet ('Sheet2'), the value in cell "A3", and the code I have at present is clunky. It gets the value out as a string, as required, but is in no way pretty. I only want this cell value and as little additional sheet info as possible.
Upvotes: 17
Views: 116801
Reputation: 176
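You can use pandas' read_excel, which has a skip_footer argument. This should work, where skipendrows is the number of rows at the end of the sheet that you want to skip. (In newer pandas releases these arguments are named usecols and skipfooter rather than parse_cols and skip_footer.)
data = pd.read_excel(filename, 'Sheet2', parse_cols="A", skiprows=2, skip_footer=skipendrows, header=None)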
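For the single-cell case in the question, the same idea can be narrowed to one row and one column. The following is a minimal sketch, assuming a pandas version that supports the usecols and nrows arguments of read_excel; the helper names single_cell_kwargs and read_single_cell are hypothetical and not part of pandas.

```python
import pandas as pd


def single_cell_kwargs(column_letter, row_number):
    """Build read_excel keyword arguments that select one cell.

    row_number is the 1-based Excel row, so cell 'A3' is ("A", 3).
    """
    return {
        "header": None,              # do not treat any row as a header
        "usecols": column_letter,    # Excel-style column letter, e.g. "A"
        "skiprows": row_number - 1,  # skip the rows above the target cell
        "nrows": 1,                  # read only the target row
    }


def read_single_cell(path, sheet, column_letter, row_number):
    """Return the value of one cell (hypothetical helper, not a pandas API)."""
    frame = pd.read_excel(path, sheet_name=sheet,
                          **single_cell_kwargs(column_letter, row_number))
    # The frame holds one row and one column, so position (0, 0) is the cell.
    return frame.iat[0, 0]


# Hypothetical usage; 'Book1.xlsx' is an assumed file name:
# dateinfo2 = str(read_single_cell("Book1.xlsx", "Sheet2", "A", 3))
```

Taking the value with .iat[0, 0] avoids converting the whole DataFrame to a string and splitting it. If the target cell is empty, the resulting frame may have no rows, so the lookup would need a guard in that case.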
Upvotes: 3
Reputation: 4557
Reading an Excel file with pandas returns a DataFrame by default. You don't need an entire table, just one cell. The way I do it is to make that cell a column header, for example:
# Read Excel and select a single cell (and make it a header for a column)
data = pd.read_excel(filename, 'Sheet2', index_col=None, usecols = "C", header = 10, nrows=0)
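This returns an empty DataFrame with one column header and no data rows. Because header is zero-based, header = 10 takes the header from Excel row 11, so the cell read here is C11. Then isolate that header: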
# Extract a value from a list (list of headers)
data = data.columns.values[0]
print (data)
Upvotes: 14
Reputation: 2765
Elaborating on @FLab's comment, use something along these lines:
Edit:
Updated the answer to correspond to the updated question, which asks how to read several sheets at once.
By providing sheet_name=None to read_excel(), you can read all the sheets at once; pandas returns a dict of DataFrames, where the keys are the Excel sheet names.
import pandas as pd
In [10]:
df = pd.read_excel('Book1.xlsx', sheet_name=None, header=None)
df
Out[11]:
{u'Sheet1': 0
0 1
1 1, u'Sheet2': 0
0 1
1 2
2 10}
In [13]:
data = df["Sheet1"]
secondary_data = df["Sheet2"]
secondary_data.loc[2,0]
Out[13]:
10
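Because the sheets are read with header=None, DataFrame positions line up with the spreadsheet grid, so an Excel reference such as A3 corresponds to zero-based position (2, 0). The sketch below shows one way to do that conversion. It assumes the sheet's data starts at cell A1 and no rows were skipped; cell_to_position is a hypothetical helper, and the small DataFrame stands in for df["Sheet2"] from the session above.

```python
import re

import pandas as pd


def cell_to_position(ref):
    """Convert an A1-style reference such as 'A3' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters behave like a base-26 number with digits A=1 .. Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


# Stand-in for secondary_data = df["Sheet2"], read with header=None.
secondary_data = pd.DataFrame([[1], [2], [10]])

row, col = cell_to_position("A3")
value = secondary_data.iat[row, col]
print(value)  # prints 10
```

Using .iat with integer positions keeps the lookup independent of whatever index labels the DataFrame happens to have.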
Alternatively, as noted in this post, if your Excel file has several sheets, you can pass sheet_name a list of strings naming the sheets to parse, e.g.
df = pd.read_excel('Book1.xlsx', sheet_name=["Sheet1", "Sheet2"], header=None)
Credit to user6241235 for digging out the last alternative.
Upvotes: 11