Reputation: 8315
I have a set of lists, each storing a certain characteristic of events. I also have an index that corresponds to event numbers. So, for a particular event number, say index = 123, I could look at the elements of the various lists at that index (e.g. event_color[123]) to see the characteristics of that event. I want to collect these lists in some object. I would also like to attach a simple metadata object, such as a dictionary, to that object.
What would be a good type of object for this type of data?
Here's the beginnings of an idea:
data = {}
data["color"] = ["red", "green", "blue"]
data["mass"] = [100, 98, 90]
data["speed"] = [10, 11, 9]
data["metadata"] = {"event_type": "2015-12-11T1442Z"}
Perhaps the object could be told which event number to use and then the various current characteristics could be requested of it.
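A minimal sketch of that idea, assuming a hypothetical EventTable class (the class name and its select and event methods are made up for illustration, not taken from any library) that wraps the parallel lists and remembers the current event number:

```python
class EventTable(object):
    """Hypothetical wrapper: parallel lists plus a current event number."""

    def __init__(self, columns, metadata=None):
        self.columns = columns          # e.g. {"color": [...], "mass": [...]}
        self.metadata = metadata or {}  # information about the whole data set
        self.index = 0                  # currently selected event number

    def select(self, index):
        # Choose which event later lookups refer to.
        self.index = index
        return self

    def __getitem__(self, name):
        # Value of one characteristic for the currently selected event.
        return self.columns[name][self.index]

    def event(self):
        # All characteristics of the current event as a dictionary.
        return {name: values[self.index] for name, values in self.columns.items()}


data = EventTable(
    {"color": ["red", "green", "blue"], "mass": [100, 98, 90], "speed": [10, 11, 9]},
    metadata={"event_type": "2015-12-11T1442Z"},
)
data.select(1)
print(data["mass"])   # 98
```

The lists stay column-oriented, so whole-column operations remain cheap, while select() gives the "current event" behaviour described above.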
EDIT: Following a suggestion by gkusner, I created the following data structure class:
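class Data(object):

    def __init__(self):
        self._index = 0   # currently selected event number
        self._data = {}   # maps event number -> {variable name: value}

    def index(self, number=None):
        # Optionally select a new event number, then return the current one.
        if number is not None:
            self._index = number
        return self._index

    def indices(self):
        # All event numbers that have stored variables.
        return list(self._data)

    def variable(self, index=None, name=None, value=None):
        # Optionally select an event, optionally set a variable, then return it.
        if index is not None:
            self._index = index
        if name is None:
            # Nothing to look up without a variable name.
            return None
        if value is not None:
            # setdefault creates the per-event dictionary on first use.
            self._data.setdefault(self._index, {})[name] = value
        return self._data[self._index][name]

    def variables(self, index=None):
        # Names of the variables stored for the selected event.
        if index is not None:
            self._index = index
        return list(self._data[self._index])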
Upvotes: 0
Views: 1235
Reputation: 510
A possible solution would be to use the pandas library, and in particular a DataFrame object. If you are not familiar with that library, here is a short introductory tutorial. The library has lots of features that may be useful (for example, to deal with date/time data). Whether it is useful for you depends on the algorithms you want to use (as suggested by Peter Wood's comment).
For your short example, you could build the data object as a DataFrame as
import pandas as pd
data = pd.DataFrame({'color': ['red', 'green', 'blue'],
                     'mass': [100, 98, 90],
                     'speed': [10, 11, 9]})
Then you can access either the full data object or particular elements in it, e.g.
>>> print(data)
   color  mass  speed
0    red   100     10
1  green    98     11
2   blue    90      9
>>> data.loc[1, 'mass']
98
>>> data.loc[0, 'color']
'red'
You could also perform operations on the columns and save the results as a new column in your data object, e.g.:
>>> data['momentum'] = data['mass'] * data['speed']
>>> print(data)
   color  mass  speed  momentum
0    red   100     10      1000
1  green    98     11      1078
2   blue    90      9       810
>>> data.loc[2, 'momentum']
810
What I am not sure about is the metadata bit you want. I understand this would be metadata for the whole object (not for a particular event). I don't know of an easy way to add 'global metadata' to a DataFrame, but you could add an extra column with the information (even if it is the same for all the events). In your example:
data = pd.DataFrame({'color': ['red', 'green', 'blue'],
                     'mass': [100, 98, 90],
                     'speed': [10, 11, 9],
                     'event_type': "2015-12-11T1442Z"})
which results in
>>> print(data)
   color        event_type  mass  speed
0    red  2015-12-11T1442Z   100     10
1  green  2015-12-11T1442Z    98     11
2   blue  2015-12-11T1442Z    90      9
Upvotes: 0
Reputation: 1244
You could use a dictionary (specifically, a nested dictionary):
data = {}
index = 123
data[index] = {}
data[index]["color"] = ["red", "green", "blue"]
data[index]["mass"] = [100, 98, 90]
data[index]["speed"] = [10, 11, 9]
data[index]["metadata"] = {"event_type": "2015-12-11T1442Z"}
Notice that index is not quoted: it is a variable holding the event number, not a string key.
You could also define it as a class, with each index value defining an instance, but that might be overkill for your needs.
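As a rough sketch of that class-based alternative, assuming one record per event and reusing the sample values from the question (the Event type and the events dictionary are hypothetical names chosen for illustration), a namedtuple keeps it lightweight:

```python
from collections import namedtuple

# Hypothetical record type: one instance per event.
Event = namedtuple("Event", ["color", "mass", "speed"])

# Event number -> Event instance (sample values from the question).
events = {
    0: Event(color="red", mass=100, speed=10),
    1: Event(color="green", mass=98, speed=11),
    2: Event(color="blue", mass=90, speed=9),
}

# Metadata about the whole collection is kept separately.
metadata = {"event_type": "2015-12-11T1442Z"}

print(events[1].mass)                    # 98
print(events[2].mass * events[2].speed)  # 810
```

Compared with the nested dictionary, attribute access catches misspelled field names early, at the cost of fixing the set of fields up front.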
Upvotes: 2