Reputation: 1920
I have a very simple task: I want to create a pandas Series that uses tuples as index labels. For example:
series_tmp = pd.Series()
series_tmp[(0,'a')] = 1
What I want is to add one more row to the Series, whose index label is (0,'a') and whose value is 1.
The code above raises this error:
KeyError: '[0 1] not in index'
Any help?
I know about MultiIndex, but it does not help in my case, because I need very complex tuples like ('a',(2,'c'),'d') as keys.
Conclusion: Thanks for all the wonderful answers! To add a row with a tuple as its index label, we should do:
series_tmp = series_tmp.append(pd.Series([1],index=[(0,'a')]))
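A note for newer pandas releases: Series.append was deprecated in pandas 1.4 and removed in 2.0, so the line above fails there. A minimal sketch of the same idea with pd.concat follows; the starting row (1,'b') is made up for illustration.

```python
import pandas as pd

# Existing Series whose labels are plain tuples (illustrative data).
series_tmp = pd.Series([5], index=[(1, 'b')])

# Wrap the new row in a one-element Series, then concatenate the two.
new_row = pd.Series([1], index=[(0, 'a')])
series_tmp = pd.concat([series_tmp, new_row])

# Look the new row up by its tuple label.
print(series_tmp[(0, 'a')])
```

If many rows are added one at a time, it is usually cheaper to collect them in a list or dict and build the Series once, since every concat copies the data.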
Upvotes: 6
Views: 6894
Reputation: 1583
This solution breaks down if you add a duplicate key; indexing by that key then stops working:
s = pd.Series()
s = s.append(pd.Series([1],index=[(0,'a')])).append(pd.Series([0],index=[(0,'a')]))
s[(0, 'a')]
That raises KeyError: 'key of type tuple not found and not a MultiIndex'.
I would consider using repr to convert the tuples to strings:
s = pd.Series()
s[repr((0, 'a'))] = 1
s[repr((0, 'a'))] = 0
s[repr((0, 'a'))]
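One consequence of the repr approach is that the labels become strings, so the original tuples have to be parsed back when they are needed. A rough sketch, assuming the tuples contain only Python literals (numbers, strings, nested tuples) so that ast.literal_eval can read them; the sample data is made up:

```python
import ast
import pandas as pd

# Illustrative data keyed by arbitrarily nested tuples.
data = {(0, 'a'): 1, ('a', (2, 'c'), 'd'): 2}

# Store each tuple under its repr string.
s = pd.Series({repr(key): value for key, value in data.items()})

# Look up a value by converting the tuple the same way.
value = s[repr(('a', (2, 'c'), 'd'))]

# Recover the original tuples from the string labels.
original_keys = [ast.literal_eval(label) for label in s.index]
```

This only round-trips while repr is stable for the contents of the key; objects without a literal repr would need a different encoding.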
Upvotes: 0
Reputation: 36623
If you are creating a series object with a multi-index from data, you can do so by constructing a dictionary with tuples as keys, and the data as values. Then pass that to the series constructor.
import pandas as pd
d = {(0,'a'):1, (0,'b'):1.5, (1,'a'):3, (1,'b'):3.5}
s = pd.Series(d)
s
# returns:
0  a    1.0
   b    1.5
1  a    3.0
   b    3.5
dtype: float64
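For completeness, here is a small sketch of how values can be read back from a Series built this way. It reuses the dictionary from above; a full tuple selects one value, while a first-level label selects everything under it.

```python
import pandas as pd

d = {(0, 'a'): 1, (0, 'b'): 1.5, (1, 'a'): 3, (1, 'b'): 3.5}
s = pd.Series(d)

# A complete tuple addresses a single element of the MultiIndex.
single = s[(0, 'a')]

# A first-level label returns a sub-Series indexed by the second level.
group = s.loc[0]
```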
If, as in this situation, an index of explicit tuples is required, you can construct the index ahead of time and pass it as the index parameter when constructing the series.
ix = pd.Index([(1,'a'), ('a',(2,'b')), (2,('b',1))])
s = pd.Series(data=[1,5,9], index=ix)
s
# returns:
(1, a) 1
(a, (2, b)) 5
(2, (b, 1)) 9
dtype: int64
# check indexing into the series object
s[('a',(2,'b'))]
# returns:
5
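Whether pd.Index keeps tuples as single labels can depend on the tuples and on the pandas version, because by default it tries to turn a list of tuples into a MultiIndex. A hedged sketch of a way to make the choice explicit, using the tupleize_cols parameter of pd.Index; the keys are made up for illustration:

```python
import pandas as pd

keys = [(0, 'a'), (1, 'b')]

# Default behavior: a list of tuples becomes a MultiIndex.
default_ix = pd.Index(keys)

# tupleize_cols=False keeps each tuple as one opaque label.
flat_ix = pd.Index(keys, tupleize_cols=False)

s = pd.Series([1, 5], index=flat_ix)

# Each tuple is looked up as a whole.
value = s[(1, 'b')]
```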
Upvotes: 3
Reputation: 2771
In: series_tmp = pd.Series([5, 6], index=[(0, 'a'), (1, 'b')])
series_tmp
Out:
(0, a) 5
(1, b) 6
dtype: int64
Upvotes: 1
Reputation: 4345
Try it like this:
df = pd.DataFrame(columns=['a', 'b'], index=pd.MultiIndex.from_tuples([('0', 'a'), ('1', 'b')]))
print(df)
Output:
       a    b
0 a  NaN  NaN
1 b  NaN  NaN
Upvotes: 1