zola25

Reputation: 1911

Convert a Pandas DataFrame into a list of objects

I want to convert a Pandas DataFrame into a list of objects.

This is my class:

class Reading:

    def __init__(self):
        self.HourOfDay: int = 0
        self.Percentage: float = 0

I read up on .to_dict, so I tried

df.to_dict(into=Reading)

but it returned

TypeError: unsupported type

I don't want a list of tuples, or a list of dicts, but a list of Readings. Every question I've found so far seems to be about these two scenarios. But I want my own typed objects.

Thanks

Upvotes: 24

Views: 40578

Answers (3)

Victor Guillaud

Reputation: 136

It would probably be better to initialise the class with arguments, as follows:

class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p

Then, to create a list of Reading objects, you could use this function, which takes the DataFrame as an argument and assumes its columns are ordered HourOfDay, Percentage:

import pandas as pd

def reading_list(df: pd.DataFrame) -> list:
    return [Reading(h=row[0], p=row[1]) for row in df.values.tolist()]

Execution is fast, even with a large dataset.
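
One thing to watch with df.values: a frame that mixes integer and float columns is converted to a single float array, so HourOfDay arrives as 5.0 rather than 5. Below is a minimal sketch of a column-wise alternative, assuming the two-column frame from the question. The sample data is made up for illustration.

```python
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


# Illustrative data only; the column names follow the question.
df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# Selecting columns by name avoids relying on column order, and zipping
# the columns keeps each column's own type instead of upcasting to float.
readings = [Reading(h, p) for h, p in zip(df["HourOfDay"], df["Percentage"])]

# For comparison: df.values is one float64 array for this frame,
# so the hour in the first row comes back as a float.
first_via_values = df.values.tolist()[0]
```

Whether the float hour matters depends on how Reading is used downstream; casting with int() in the constructor is another way to handle it.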

Upvotes: 12

NargesooTv

Reputation: 897

Given a DataFrame with two columns, HourOfDay and Percentage, and a parameterized constructor for your class, you could build a list of objects like this:

class Reading:

    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p

listOfReading = [Reading(row.HourOfDay, row.Percentage) for index, row in df.iterrows()]
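
The pandas documentation notes that iterrows builds a Series for every row, which does not preserve dtypes across the row, and that itertuples is generally faster. Here is a hedged sketch of the same comprehension using itertuples, with made-up sample data:

```python
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


# Illustrative data only.
df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# iterrows yields (index, Series); the Series has a single dtype, so the
# integer hour is upcast to float alongside the float percentage.
_, first_row = next(df.iterrows())
hour_from_iterrows = first_row.HourOfDay

# itertuples yields one namedtuple per row and keeps per-column types.
list_of_reading = [
    Reading(row.HourOfDay, row.Percentage)
    for row in df.itertuples(index=False)
]
```

Attribute access such as row.HourOfDay works here because the column names are valid Python identifiers.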

Upvotes: 19

Brad Solomon

Reputation: 40878

Option 1: make Reading inherit from collections.abc.MutableMapping (the old collections.MutableMapping alias was removed in Python 3.10) and implement the abstract methods of that base class. Seems like a lot of work.
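
To give a sense of the work Option 1 involves, here is a minimal sketch of Reading as a MutableMapping. It assumes pandas constructs each mapping the way dict can be constructed, from an iterable of (key, value) pairs. That assumption is not checked here against any pandas version, and the _fields attribute is a hypothetical helper, not part of the question.

```python
from collections.abc import MutableMapping


class Reading(MutableMapping):
    """Hypothetical mapping-style Reading; field names follow the question."""

    _fields = ("HourOfDay", "Percentage")

    def __init__(self, *args, **kwargs):
        self.HourOfDay = 0
        self.Percentage = 0.0
        # MutableMapping.update accepts a mapping, an iterable of
        # (key, value) pairs, and keyword arguments, like dict does.
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        if key not in self._fields:
            raise KeyError(key)
        return getattr(self, key)

    def __setitem__(self, key, value):
        if key not in self._fields:
            raise KeyError(key)
        setattr(self, key, value)

    def __delitem__(self, key):
        raise TypeError("Reading fields cannot be deleted")

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


# Built from (key, value) pairs, as a dict could be.
reading = Reading([("HourOfDay", 5), ("Percentage", 0.25)])
```

Even with such a class, only orient='records' produces one mapping per row, so Option 2 stays the simpler route.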

Option 2: Call Reading() in a list comprehension:

>>> import pandas as pd
>>> 
>>> df = pd.DataFrame({
...     'HourOfDay': [5, 10],
...     'Percentage': [0.25, 0.40]
... })
>>> 
>>> class Reading(object):
...     def __init__(self, HourOfDay: int = 0, Percentage: float = 0):
...         self.HourOfDay = int(HourOfDay)
...         self.Percentage = Percentage
...     def __repr__(self):
...         return f'<{self.__class__.__name__}> (hour {self.HourOfDay}, pct. {self.Percentage})'
... 
>>> 
>>> readings = [Reading(**kwargs) for kwargs in df.to_dict(orient='records')]
>>> 
>>> readings
[<Reading> (hour 5, pct. 0.25), <Reading> (hour 10, pct. 0.4)]

From the pandas docs for DataFrame.to_dict, which explain the TypeError (Reading is not a Mapping subclass):

into: The collections.Mapping subclass used for all Mappings in the return value. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized.

Upvotes: 26
