Reputation: 9622
I read the questions "How to get a value from a cell of a dataframe?" and "How to select the last column of dataframe". I am a bit confused as to what is the accepted method in 2018 for getting a value from the cell of a pandas dataframe, as get_value
has been deprecated and the documentation on at
is somewhat sparse. I want to select the row by label and the column by position/integer. Is my method below in which I use iloc
the accepted approach in 2018? I have this code:
import pandas as pd
import random
import string
import csv
import io
## Make a table with random column labels and rows in random order.
s = '\n'.join((
## Separate columns by comma.
','.join((
## Join letters without a separator.
''.join((
## Randomly choose 3 characters for the labels from ascii_letters.
random.choice(
string.ascii_letters) for i in range(3))) for j in range(3))),
## Insert a header to distinguish rows with identical index keys x and y.
'"Header I, II and III"',
## Randomly shuffle the rows under the first header.
'\n'.join(random.sample(('x,0,1', 'y,2,3',), 2)),
## Insert a header to distinguish rows with identical index keys x and y.
'"Header IV, V and VI"',
## Randomly shuffle the rows under the second header.
'\n'.join(random.sample(('x,4,5', 'y,6,7'), 2)),
))
def preprocess(s):
header = ''
for l in csv.reader(io.StringIO(s)):
if len(l) == 1:
header = l[0]
continue
## Append the header to distinguish identical labels x and y.
l[0] = header + '; ' + l[0]
yield l
print(s, end='\n\n')
## Preprocess the string to avoid duplicate row index keys x and y.
df = pd.DataFrame(preprocess(s))
print(df, end='\n\n')
## Set the index to be that of the first column.
df = df.set_index(0)
## First select the column by index using iloc
## and then select the relevant row by index label.
value = df.iloc[:,-1]['Header I, II and III; x']
print(value)
It generates a string s
looking like this:
YuT,Uva,AsE
"Header I, II and III"
y,2,3
x,0,1
"Header IV, V and VI"
y,6,7
x,4,5
Which is converted to a dataframe by the function preprocess
looking like this:
0 1 2
0 ; YuT Uva AsE
1 Header I, II and III; y 2 3
2 Header I, II and III; x 0 1
3 Header IV, V and VI; y 6 7
4 Header IV, V and VI; x 4 5
It is the value (the integer 1) from the last column of the row with the label Header I, II and III; x
, which I'm interested in. Is this the correct approach in 2018?
value = df.iloc[:,-1]['Header I, II and III; x']
I just read some very interesting questions "Loc vs. iloc vs. ix vs. at vs. iat?" from February 2015 and "pandas iloc vs ix vs loc explanation?" from July 2015. It would be great, if I could do something like this, but I can't:
value = df.at['Header I, II and III; x', -1]
value = df['Header I, II and III; x'][-1]
Upvotes: 2
Views: 6461
Reputation: 9622
This solution definitely works:
value = df.at['Header I, II and III; x', df.columns[-1]]
For those like myself that like old-fashioned dictionaries it is also possible to do the following, if the dataframe does not contain duplicate row labels, which prevents the transpose
from working:
d = df.transpose().to_dict('list')
value = d['Header I, II and III; x'][-1]
Upvotes: 1
Reputation: 402523
First thing's first. ix
is deprecated, but ix
allowed you to mix labels and indexers, and made a lot of guesses about what it was passed.
In today's day and age (with the current stable release being v0.22
), ix
is deprecated, so stick to explicit label or positional based indexers: loc
for label based slicing, iloc
for index based slicing; at
for label based item access, and iat
for index based item access.
If you know what your labels are, use at
to access a single item -
df.at['Header I, II and III; x', df.columns[-1]]
If you know the position, use iat
-
df.iat[2, -1]
In general, use *at
when you want to access a single element, and *loc
when you want to access a row/column slice.
Upvotes: 3