Nils Gudat

Reputation: 13800

Fill pandas Panel object with data

This is probably very basic, but I can't seem to find a solution anywhere. I'm trying to construct a 3D Panel object in pandas and then fill it with data that I read from several csv files. An example of what I'm trying to do would be the following:

import numpy as np
import pandas as pd

year = np.arange(2000,2005)
obs = np.arange(1,5)
variables = ['x1','x2']

data = pd.Panel(items = obs, major_axis = year, minor_axis = variables)

So that data[i] gives me all the data belonging to one of the observation units in the panel:

data[1]
        x1      x2
2000    NaN     NaN
2001    NaN     NaN
2002    NaN     NaN
2003    NaN     NaN
2004    NaN     NaN

Then, I read in data from a csv which gives me a DataFrame that looks like this (I'm just creating an equivalent object here to make this a working example):

x1data = pd.DataFrame(data = list(zip(year, np.random.randn(5))), columns = ['year', 'x1'])
x1data
    year    x1
0   2000    -0.261514
1   2001    0.474840
2   2002    0.021714
3   2003    -1.939358
4   2004    1.167545

Now I would like to replace the NaNs in the x1 column of data[1] with the data in the x1data dataframe. My first idea (given that I'm coming from R) was to select an object from x1data that has the same dimension as the x1 column in my panel and assign it to the panel:

data[1].x1 = x1data.x1

However, this doesn't work, which I guess is because in x1data the years are a column of the dataframe, whereas in the panel they are whatever shows up to the left of the columns (the "row names"; would this be an index?).

As you can probably tell from my question I'm far from really understanding what's going on in the pandas data structure so any help would be greatly appreciated!

Upvotes: 2

Views: 1206

Answers (1)

Nils Gudat

Reputation: 13800

I'm guessing this question didn't elicit a lot of replies as it was simply too stupid, but just in case anyone ever comes across this and is as clueless as I was, the very simple answer is to access the panel using the label-based .loc indexer, as:

data.loc[item, major_axis, minor_axis]

where each of the arguments can be a single label or a list of labels, in order to write to slices of the panel. (The positional .iloc indexer would not work here, since it accepts only integer positions and not labels such as 'x1' or 2000.) My question above would have been solved by

data.loc[1, np.arange(2000,2005), 'x1'] = np.asarray(x1data.x1)

or

data.loc[1, year, 'x1'] = np.asarray(x1data.x1)

Note that had I not used np.asarray, the values would have stayed NaN: data.loc[] selects an object that has the years as its index, while x1data.x1 has a default index starting at 0, and pandas aligns on index labels when assigning.
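The alignment behaviour described above can be reproduced on a plain DataFrame. This is only a sketch under assumptions: since pd.Panel was removed in pandas 0.25, the name frame below is a hypothetical stand-in for a single item of the panel (data[1]), and still_nan, filled_by_position and filled_by_label are illustrative names that are not part of the original post.

```python
import numpy as np
import pandas as pd

year = np.arange(2000, 2005)

# Hypothetical stand-in for one item of the panel: an all-NaN frame indexed by year.
frame = pd.DataFrame(index=year, columns=["x1", "x2"], dtype=float)

# Stand-in for the csv data: default integer index 0..4, years stored in a column.
x1data = pd.DataFrame({"year": year, "x1": np.random.randn(5)})

# Assigning the Series directly aligns on index labels; 0..4 never matches
# 2000..2004, so the column stays NaN.
frame.loc[year, "x1"] = x1data["x1"]
still_nan = frame["x1"].isna().all()

# Passing a plain array skips alignment and assigns by position.
frame.loc[year, "x1"] = np.asarray(x1data["x1"])
filled_by_position = frame["x1"].notna().all()

# Alternative: give the Series the year index so that label alignment succeeds.
frame.loc[year, "x2"] = x1data.set_index("year")["x1"]
filled_by_label = frame["x2"].notna().all()

print(frame)
```

Setting the year column as the index is arguably the safer of the two options, because the values are matched by year rather than by row order.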

Upvotes: 2
