user40780

Reputation: 1920

Use tuple as index in pandas Series

I have a very simple task: I want to create a pandas Series that uses tuples as index labels. For example:

series_tmp = pd.Series()
series_tmp[(0,'a')] = 1

That is, I want to add one more row to the Series, whose index label is (0,'a') and whose value is 1.

The above code gets the error:

KeyError: '[0 1] not in index'

Any help?

I know about MultiIndex, but it does not help in my case, because I need very complex tuples such as ('a',(2,'c'),'d') as keys.

Conclusion: Thanks for all the wonderful answers! To add a row with a tuple as its index label, we can do:

series_tmp = series_tmp.append(pd.Series([1],index=[(0,'a')]))
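On recent pandas releases, Series.append is no longer available (it was removed in pandas 2.0), so the line above can be sketched with pd.concat instead. This is a minimal sketch rather than the original code: the tupleize_cols=False argument is an added assumption, used so that pandas keeps each tuple as a single label instead of trying to build a MultiIndex.

```python
import pandas as pd

# Start from a Series whose only label is the tuple (0, 'a').
# tupleize_cols=False asks pandas to keep tuples as plain labels.
series_tmp = pd.Series([1], index=pd.Index([(0, 'a')], tupleize_cols=False))

# Build the new row separately, with a nested tuple as its label.
new_row = pd.Series([2], index=pd.Index([('a', (2, 'c'), 'd')], tupleize_cols=False))

# pd.concat plays the role that Series.append played in older versions.
series_tmp = pd.concat([series_tmp, new_row])

print(series_tmp[('a', (2, 'c'), 'd')])
```

Each call to pd.concat builds a new Series, so when many rows are added it is usually cheaper to collect them in a list or dict first and construct the Series once.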

Upvotes: 6

Views: 6894

Answers (4)

ronkov

Reputation: 1583

The solution in the question is problematic: if you add a duplicate index label, indexing no longer works:

s = pd.Series()
s = s.append(pd.Series([1],index=[(0,'a')])).append(pd.Series([0],index=[(0,'a')]))
s[(0, 'a')]

That raises KeyError: 'key of type tuple not found and not a MultiIndex'.

I would consider using repr to convert the tuples to strings:

s = pd.Series()
s[repr((0, 'a'))] = 1
s[repr((0, 'a'))] = 0
s[repr((0, 'a'))]
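With this approach the index holds strings, not tuples. If the original tuple is needed again, ast.literal_eval from the standard library can parse the string back. The sketch below assumes the tuple contains only Python literals (numbers, strings, nested tuples); the names key and recovered are illustrative.

```python
import ast

import pandas as pd

key = ('a', (2, 'c'), 'd')

# Store the value under the string form of the tuple.
s = pd.Series({repr(key): 1})

# Assigning to the same string label overwrites instead of duplicating.
s[repr(key)] = 0

# Parse the string label back into the original tuple.
recovered = ast.literal_eval(s.index[0])
print(recovered == key)
```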

Upvotes: 0

James

Reputation: 36623

If you are creating a series object with a multi-index from data, you can do so by constructing a dictionary with tuples as keys, and the data as values. Then pass that to the series constructor.

import pandas as pd

d = {(0,'a'):1, (0,'b'):1.5, (1,'a'):3, (1,'b'):3.5}
s = pd.Series(d)
s
# returns:
0  a    1.0
   b    1.5
1  a    3.0
   b    3.5
dtype: float64

Edit based on comments:

For this situation, an index of explicit tuples is required. In that case, you can construct the index ahead of time, then use that as the index parameter when constructing the series.

ix = pd.Index([(1,'a'), ('a',(2,'b')), (2,('b',1))])
s = pd.Series(data=[1,5,9], index=ix)
s
# returns:
(1, a)         1
(a, (2, b))    5
(2, (b, 1))    9
dtype: int64

# check indexing into the series object
s[('a',(2,'b'))]
# returns:
5
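Depending on the pandas version, pd.Index may try to turn a list of tuples into a MultiIndex; this is what its tupleize_cols parameter (default True) controls. A hedged variant of the snippet above passes tupleize_cols=False explicitly, so the result does not depend on that inference. The variable name keys is only illustrative.

```python
import pandas as pd

keys = [(1, 'a'), ('a', (2, 'b')), (2, ('b', 1))]

# tupleize_cols=False keeps every tuple as one label in a flat object Index.
ix = pd.Index(keys, tupleize_cols=False)
s = pd.Series(data=[1, 5, 9], index=ix)

# The index is a plain Index of tuples, not a MultiIndex.
print(type(s.index).__name__)

# Look up a value by its (nested) tuple label.
print(s[('a', (2, 'b'))])
```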

Upvotes: 3

Shihe Zhang

Reputation: 2771

series_tmp = pd.Series([5,6], index=[(0,'a'),(1,'b')])
series_tmp
# returns:
(0, a)    5
(1, b)    6
dtype: int64

Upvotes: 1

kjmerf

Reputation: 4345

Try it like this:

df = pd.DataFrame(columns=['a', 'b'], index=pd.MultiIndex.from_tuples([('0', 'a'), ('1', 'b')]))

print(df)

Output:

       a    b
0 a  NaN  NaN
1 b  NaN  NaN

Upvotes: 1
