Zach

Reputation: 1351

Object not "resetting" in Jupyter Notebook unless I reset the kernel

This is kind of a weird problem and I'm not entirely sure how to ask it appropriately but I'll give it my best shot.

I have a custom class that is basically a wrapper for an API that updates an SQLite database with new data on each call (I can't add it to the question because it's massive and private).

What's weird is that some information seems to be cached (I don't think that's possible, but it reminds me of web dev, when you make edits and they don't show up). It works the first time, but when I reinitialize the object and run it again, it doesn't add any new data to the DB, even though I know there is new data to add.

I know the code works because if I restart the kernel and run it again, it updates no problem.

I've tried deleting the object (del InitializedClass), re-initializing, and initializing with different values but nothing seems to work. It won't update the DB unless the kernel is restarted.

Has anyone ever had an issue like this? I'm happy to provide more information if this isn't enough but I don't know how else to describe it.

Thank you!!


EDIT

The pseudocode below is basically exactly what is happening:

from something import SomeClass    

while True:

    obj = SomeClass() #      <---------  How can I "reset" this on each loop?

    obj.get_new_data_from_api()
    obj.update_raw_db()
    obj.process_raw_data()
    obj.update_processed_db()

    # i tried different combinations of deleting the object
    del obj
    del SomeClass
    from something import SomeClass

EDIT 2:

So as everyone mentioned, it was an issue with the class itself, but I still don't really understand why the error was happening. Basically, the end argument was not being updated when I used the datetime.now() call as the default value of the keyword argument. I thought it would update to the current time on each call, but it did not, even after deleting the class and creating a new instance. The issue is illustrated below:

from datetime import datetime
import pandas as pd

class SomeBrokenClass():

    def __init__(self):    
        pass

    def get_endpoint(self, start, end):
        return 'https://some.api.com?start_date=%s&end_date=%s' % (start, end)

    # THE PROBLEM WAS WITH THIS METHOD ( .get_data() ):
    # When re-initializing the class, the `end` argument
    # was not being updated for some reason. Even if I completely
    # delete the instance of the class, the end time would not update.

    def get_data(self, start, end = int(datetime.now().timestamp() * 1000)):
        return pd.read_json(self.get_endpoint(start, end))

    def get_new_data_from_api(self):
        start_date = self.get_start_date()
        df = self.get_data(start_date)
        return df


class SomeWorkingClass():

    def __init__(self):    
        pass

    def get_endpoint(self, start, end):
        return 'https://some.api.com?start_date=%s&end_date=%s' % (start, end)

    def get_data(self, start, end):
        return pd.read_json(self.get_endpoint(start, end))

    def get_new_data_from_api(self):
        start_date = self.get_start_date()
        end_date = int(datetime.now().timestamp() * 1000) # BUT THIS WORKS FINE
        df = self.get_data(start_date, end_date)
        return df

Upvotes: 2

Views: 2800

Answers (2)

Blckknght

Reputation: 104752

Your issue has to do with the default value for a parameter in one of your methods:

def get_data(self, start, end = int(datetime.now().timestamp() * 1000)):
    ...

The default value is not recalculated each time the function is called. Rather, the expression given as the default is evaluated only once, when the method is defined, and the value is stored on the function object to be used as the default for all later calls. That doesn't work here, since datetime.now is evaluated only when the class body runs (normally when the module is first loaded), not each time the method is called. Creating a new instance doesn't change anything, because every instance shares the same method and therefore the same stored default.
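
Here is a minimal sketch of that behavior. The names Demo, stamp and next_tick are made up for illustration, and a counter stands in for datetime.now() so the result is deterministic:

```python
import itertools

# Hypothetical stand-in for datetime.now(): returns 0, 1, 2, ... on successive calls.
_ticks = itertools.count()

def next_tick():
    return next(_ticks)

class Demo:
    # next_tick() runs exactly once, while this class body is being executed.
    def stamp(self, end=next_tick()):
        return end

first = Demo().stamp()
second = Demo().stamp()  # a brand-new instance still gets the same default

# The evaluated default is stored on the function object, not on any instance.
stored_defaults = Demo.stamp.__defaults__

print(first, second, stored_defaults)
```

Both calls return the value captured at definition time, which is why deleting and recreating the instance in the question had no effect.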

A common way to fix this is to set a sentinel value like None as the default, and then calculate the appropriate value inside the function if the sentinel is found:

def get_data(self, start, end=None):
    if end is None:
        end = int(datetime.now().timestamp() * 1000)
    ...
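
A runnable sketch of the same pattern, again with a counter in place of the clock so the effect is easy to check. Fetcher and next_tick are hypothetical names, not anything from the question's code:

```python
import itertools

# Hypothetical stand-in for the clock: returns 0, 1, 2, ... on successive calls.
_ticks = itertools.count()

def next_tick():
    return next(_ticks)

class Fetcher:
    def get_data(self, start, end=None):
        # None only marks "the caller did not pass end";
        # the real value is computed on every call.
        if end is None:
            end = next_tick()
        return (start, end)

fetcher = Fetcher()
call_one = fetcher.get_data(10)
call_two = fetcher.get_data(10)           # gets a fresh value
explicit = fetcher.get_data(10, end=99)   # an explicit argument is left alone

print(call_one, call_two, explicit)
```

If None could itself be a meaningful value for end, a private marker object (for example, a module-level object() instance) can play the same role.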

Upvotes: 5

Nathan Vērzemnieks

Reputation: 5603

You're not "deleting the object and then reinitializing it" - you're removing a name from the global namespace and then importing it again. Since the module is already loaded, the second import does not re-execute the module's code:

# test.py
print("Hi!")

>>> import test
Hi!
>>> del test
>>> import test
<Nothing printed>

If you want to reload your module, you need to do so explicitly, as in this question:

>>> import importlib
>>> importlib.reload(test)
Hi!
<module 'test' from '/Users/rat/test.py'>
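
The same experiment as a self-contained script, in case you want to try it outside the REPL. This is only a sketch: the module name reload_demo_mod and the environment variable RELOAD_DEMO_RUNS are invented for the demo, and the throwaway module is written to a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module whose top-level code counts how often it runs.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "reload_demo_mod.py"), "w") as fh:
    fh.write(
        "import os\n"
        "os.environ['RELOAD_DEMO_RUNS'] = "
        "str(int(os.environ.get('RELOAD_DEMO_RUNS', '0')) + 1)\n"
    )

os.environ.pop("RELOAD_DEMO_RUNS", None)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import reload_demo_mod
runs_after_first_import = int(os.environ["RELOAD_DEMO_RUNS"])

del reload_demo_mod
import reload_demo_mod  # served from sys.modules; top-level code is not re-run
runs_after_reimport = int(os.environ["RELOAD_DEMO_RUNS"])

importlib.reload(reload_demo_mod)  # this does re-run the module's code
runs_after_reload = int(os.environ["RELOAD_DEMO_RUNS"])

print(runs_after_first_import, runs_after_reimport, runs_after_reload)
```

Note that names pulled in with "from module import name" keep pointing at the old objects after a reload, so they have to be imported again afterwards.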

(edited to add the following) However, what you're trying to do here should never be necessary. You should never need to delete a class before creating a new instance of it. If reloading your module like this helps, the only reasons I can think of are these:

  1. The behavior that's not happening the second time you create a SomeClass instance is actually caused by code at the top level of the something module - that is, outside of any function or class definition, or
  2. SomeClass is recording something in its own class attributes and opting not to do something the second time it's instantiated.

In either of these cases the approach I'd take would be to find the code that's only executing once and extract it into a function so you can call it directly if you need to. Sorry for the vagueness, but without your code it's hard to be more precise. My bet would be on the first, and there are probably other scenarios, but this at least might give you a start.
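
To make the second scenario concrete, here is a hypothetical sketch (Loader and its attributes are invented, since I can't see your code). The class remembers in a class attribute that the work has been done, and the run-once work is extracted into a method that can be called directly:

```python
class Loader:
    # Class attribute: shared by every instance for the life of the process.
    _already_loaded = False

    def __init__(self):
        self.did_work = False
        if not Loader._already_loaded:
            self.load()

    def load(self):
        # The run-once work, extracted so callers can invoke it directly.
        Loader._already_loaded = True
        self.did_work = True

a = Loader()
b = Loader()
b_did_work_on_init = b.did_work  # False: the second instance skipped the work

b.load()  # calling the extracted method forces the work to happen again
print(a.did_work, b_did_work_on_init, b.did_work)
```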

Upvotes: 2
