McRip

Reputation: 100

DataFrame with string column-names

I have the following problem: I constructed a DataFrame with integer column names and a PeriodIndex. I then renamed the columns as follows:

df.rename(columns = lambda x: str(x), inplace=True)
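A minimal sketch of what this rename does to the column labels, on a small toy frame (the data and the variable names below are assumptions for illustration, not the original frame). Passing `str` directly should be equivalent to the lambda:

```python
import pandas as pd

# Hypothetical toy frame: integer column labels 0, 1, 2 on a daily PeriodIndex.
idx = pd.period_range("2000-01-01", periods=3, freq="D")
df = pd.DataFrame({i: [1, 2, 3] for i in range(3)}, index=idx)

# Without inplace=True, rename returns a new frame and leaves df untouched.
via_lambda = df.rename(columns=lambda x: str(x))
via_str = df.rename(columns=str)

# Every label in the renamed frame is now a Python string.
all_str = all(isinstance(c, str) for c in via_str.columns)
same_labels = list(via_lambda.columns) == list(via_str.columns)
print(list(via_str.columns), all_str, same_labels)
```

After the rename, the integer labels no longer exist, so any lookup has to use the string form.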

This converts the column labels to strings, and afterwards I observe the following odd behavior. Before the rename, retrieving a single column from the frame gave me a Series. After the rename, some columns come back as a DataFrame instead. Formerly, df.loc[:,1] gave a Series.

Now, df.loc[:,'1'] gives a DataFrame with a PeriodIndex of length 0 and the full set of original columns of df.

Does anybody know whether I am doing something wrong, or did I stumble upon a bug?

Here is a code snippet that reproduces the bug(?):

import pandas as pd

A = pd.DataFrame(dict(zip(range(0,9000), [pd.Series([1,2,3], [pd.Period(1), pd.Period(2), pd.Period(3)]) for x in range(0,9000)])))

A[5000]  # a Series, as expected
A.rename(columns = lambda x: str(x), inplace=True)

A['5000'] # For me, this returns a DataFrame with an empty PeriodIndex and all the original columns!

Thank you very much in advance, and best regards, Marc

Upvotes: 1

Views: 3267

Answers (1)

Jeff

Reputation: 128958

This is from master, where the behavior looks correct:

In [11]: A = pd.DataFrame(dict(zip(range(0,9000), [pd.Series([1,2,3], [pd.Period(1), pd.Period(2), pd.Period(3)]) for x in range(0,9000)])))

In [12]: A['5000']
Out[12]: 
<class 'pandas.core.frame.DataFrame'>
PeriodIndex: 0 entries
Columns: 9000 entries, 0 to 8999
dtypes: int64(9000)

In [13]: A[5000]
Out[13]: 
1-01-01    1
1-01-02    2
1-01-03    3
Freq: D, Name: 5000, dtype: int64

In [14]: A.rename(columns = lambda x: str(x), inplace=True)

In [15]: A['5000']
Out[15]: 
1-01-01    1
1-01-02    2
1-01-03    3
Freq: D, Name: 5000, dtype: int64

In [16]: A[5000]
KeyError: u'no item named 5000'
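The same check can be scripted. This is only a sketch on a smaller hypothetical frame (`A_small`, ten columns, with the index built via `pd.period_range` instead of the `pd.Period(1)` calls above), assuming a pandas version that behaves like the session shown:

```python
import pandas as pd

# Hypothetical smaller frame: integer column labels 0..9 on a daily PeriodIndex.
idx = pd.period_range("2000-01-01", periods=3, freq="D")
A_small = pd.DataFrame({i: [1, 2, 3] for i in range(10)}, index=idx)

# Convert every column label to its string form, in place.
A_small.rename(columns=str, inplace=True)

# The string label selects a single column, which should be a Series.
col = A_small["5"]

# The old integer label no longer exists, so the lookup should raise KeyError.
try:
    A_small[5]
    int_lookup_failed = False
except KeyError:
    int_lookup_failed = True

print(type(col).__name__, int_lookup_failed)
```

If `col` comes back as a DataFrame rather than a Series on your installation, that points to the version-specific behavior described in the question.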

Upvotes: 1
