Reputation: 2699
This is likely an easy fix, but I don't know how to do it.
I have extended the pandas.Series class so that it can contain datasets for my research. Here's the code that I've written so far:
import pandas as pd
import numpy as np
from allantools import oadev
class Tombstone(pd.Series):
    """An extension of ``pandas.Series``, which contains raw data from a
    tombstone test.

    Parameters
    ----------
    data : array-like of floats
        The raw data measured in volts from a lock-in amplifier. If no scale
        factor is provided, this data is presumed to be in units of °/h.
    rate : float
        The sampling rate in Hz
    start : float
        The unix time stamp of the start of the run. Used to create the index
        of the Tombstone object. This can be calculated by running
        ``time.time()`` or similar. If no value is passed, the index of the
        Tombstone object will be in hours since start.
    scale_factor : float
        The conversion factor between the lock-in amplifier voltage and deg/h,
        expressed in deg/h/V.

    Attributes
    ----------
    adev : 2-tuple of arrays of floats
        Returns the Allan deviation in degrees/hour as a 2-tuple. The first
        element is an array of floats representing the integration times;
        the second is an array of floats representing the Allan deviations.
    noise : float
        The calculated angular random walk in units of °/√h, taken from the
        1-Hz point on the Allan deviation curve.
    arw : float
        Alias for ``noise``.
    drift : float
        The minimum Allan deviation in units of °/h.
    """
    def __init__(self, data, rate, start=None, scale_factor=0, *args, **kwargs):
        if start:
            date_index = pd.date_range(
                start=start*1e9, periods=len(data),
                freq='%.3g ms' % (1000/rate), tz='UTC')
            date_index = date_index.tz_convert('America/Los_Angeles')
        else:
            date_index = np.arange(len(data))/60/60/rate
        super().__init__(data, date_index)
        if scale_factor:
            self.name = 'voltage'
        else:
            self.name = 'rotation'
        self.rate = rate
    @property
    def _constructor(self):
        return Tombstone

    @property
    def adev(self):
        tau, dev, _, _ = oadev(np.array(self), rate=self.rate,
                               data_type='freq')
        return tau, dev

    @property
    def noise(self):
        _, dev, _, _ = oadev(np.array(self), rate=self.rate, data_type='freq')
        return dev[0]/60

    # alias
    arw = noise

    @property
    def drift(self):
        tau, dev, _, _ = oadev(np.array(self), rate=self.rate,
                               data_type='freq')
        return min(dev)
I can run this in a Jupyter notebook:
>>> t = Tombstone(np.random.rand(60), rate=10)
>>> t
0.000000 0.497036
0.000028 0.860914
0.000056 0.626183
0.000083 0.537434
0.000111 0.451693
...
The output of the last statement shows the pandas.Series as expected.
But when I pass 61 elements to the constructor, I get an error:
>>> t = Tombstone(np.random.rand(61), rate=10)
>>> t
TypeError: cannot concatenate a non-NDFrame object
Even with large datasets, I can still run commands without a problem:
>>> from matplotlib.pyplot import loglog, show
>>> t = Tombstone(np.random.rand(10000), rate=10)
>>> t.noise
>>> loglog(*t.adev); show()
But I always get an error when I ask the Jupyter notebook to pretty-print t.
After poking through the stack trace, it seems that the problem arises when pandas tries to concatenate the first few elements and the last few elements, with an ellipsis in between. Running the code below reproduces the last few lines of the stack trace:
>>> pd.concat(t.iloc[10:], t.iloc[:-10])
TypeError Traceback (most recent call last)
<ipython-input-12-86a3d2f95e07> in <module>()
----> 1 pd.concat(t.iloc[10:], t.iloc[:-10])
/Users/wheelerj/miniconda3/lib/python3.5/site-packages/pandas/tools/merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
1332 keys=keys, levels=levels, names=names,
1333 verify_integrity=verify_integrity,
-> 1334 copy=copy)
1335 return op.get_result()
1336
/Users/wheelerj/miniconda3/lib/python3.5/site-packages/pandas/tools/merge.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy)
1389 for obj in objs:
1390 if not isinstance(obj, NDFrame):
-> 1391 raise TypeError("cannot concatenate a non-NDFrame object")
1392
1393 # consolidate
TypeError: cannot concatenate a non-NDFrame object
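For reference, the head and tail slices can also be checked directly against pandas' NDFrame base class, which is one way to narrow down where the non-NDFrame object is coming from (just a diagnostic, no surprises expected):
>>> from pandas.core.generic import NDFrame
>>> isinstance(t.iloc[10:], NDFrame), isinstance(t.iloc[:-10], NDFrame)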
Upvotes: 1
Views: 405
Reputation: 2329
I think the problem is in your call to super().__init__(). pd.Series.__init__() has a number of additional arguments that you aren't passing through. In my case, the fastpath parameter was being set but not handled.
If I tweak your __init__() like this, it seems to work:
def __init__(self, data=None, index=None, rate=None, start=None, scale_factor=0, *args, **kwargs):
    # Build a time index only when the caller supplies a rate and no index;
    # pandas' internal calls pass index (and possibly fastpath) themselves,
    # and those go straight through to pd.Series.__init__().
    if index is None and rate is not None:
        if start:
            date_index = pd.date_range(
                start=start*1e9, periods=len(data),
                freq='%.3g ms' % (1000/rate), tz='UTC')
            date_index = date_index.tz_convert('America/Los_Angeles')
        else:
            date_index = np.arange(len(data))/60/60/rate
    else:
        date_index = index
    super().__init__(data, date_index, *args, **kwargs)
    if scale_factor:
        self.name = 'voltage'
    else:
        self.name = 'rotation'
    self.rate = rate
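With that signature, the original call pattern still works, and construction from an explicit index (roughly the kind of call pandas makes internally; the second line below is only an illustration, not something you would normally write) goes through the same path:
>>> t = Tombstone(np.random.rand(61), rate=10)
>>> s = Tombstone(data=t.values, index=t.index)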
You need to ensure that take and indexing through iloc return objects of your type (Tombstone in this case).
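The _constructor property in the question already points back to Tombstone, so one quick sanity check (just a sketch) is to confirm that sliced and taken objects keep the subclass type:
>>> t = Tombstone(np.random.rand(100), rate=10)
>>> isinstance(t.iloc[:5], Tombstone), isinstance(t.take([0, 1, 2]), Tombstone)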
Upvotes: 0
Reputation: 2699
I found a fix, which should work in my case. I still think there is a way to solve it by representing the slices as NDFrame objects. Maybe someone else on SO can figure that out.
If I override the __repr__ special method inside my Tombstone class,
def __repr__(self):
    ret = 'Tombstone('
    ret += 'rate=%.3g' % self.rate
    # etc...
    ret += ')'
    return ret
I can run the following:
>>> t = Tombstone(np.random.rand(61), rate=10)
>>> t
Tombstone(rate=10)
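If you still want the normal element listing for short datasets, a variation (just a sketch using pandas' display.max_rows option; the n field here is only for illustration) is to fall back to the parent repr whenever pandas would not truncate the output anyway:
def __repr__(self):
    # Short series print normally; only the truncated case gets the summary.
    if len(self) <= pd.get_option('display.max_rows'):
        return super().__repr__()
    return 'Tombstone(rate=%.3g, n=%d)' % (self.rate, len(self))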
Upvotes: 0