Chris

Reputation: 2071

Python - function RAM memory efficiency

I have this function:

def preprocess(data_input):
    data = data_input.copy()

    data = ...  # some code
    data = ...  # some code
    data = ...  # some code

    return data

I have a dataframe df that reaches this function after a long chain of Jupyter notebook cells, and I call it like:

preprocess(df)

With the current function, I assign the changes of the local df (inside the function) to the global df (outside the function) with df = preprocess(df). I do it this way to avoid re-running the whole notebook each time I find an error in the function. Once I know the function runs correctly (after endless validation and bug fixes), I just replace preprocess(df) with df = preprocess(df).

This answer says that the best practice is to use the .copy() method (the answer is a bit old, from 2015). However, I'm wondering about memory efficiency. If I have a 4GB dataframe, even ONE copy would make me use 8GB of RAM (the original plus the copy). So, is there any way to replace the .copy() method with a more memory-efficient alternative?
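One option, sketched below under the assumption that the caller always reassigns the result (df = preprocess(df)) and never needs the pre-transform state: skip the defensive copy and mutate the frame in place, so peak memory stays at roughly one frame instead of two. The column name "col" is hypothetical.

```python
import pandas as pd
import numpy as np

def preprocess_inplace(data):
    # Mutates `data` directly instead of copying it first.
    # Safe ONLY because the caller reassigns df = preprocess_inplace(df),
    # so the original pre-transform frame is not needed afterwards.
    # .loc assignment writes into the existing frame: no 2x memory spike.
    data.loc[:, "col"] = data["col"] * 2  # "col" is a hypothetical column
    return data

df = pd.DataFrame({"col": np.arange(5)})
df = preprocess_inplace(df)
```

The trade-off is that the function now has a side effect: if you call preprocess_inplace(df) without reassigning, df is still modified, which is exactly the behavior the .copy() was protecting you from while debugging.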

In addition, if you have a better suggestion for avoiding re-running the whole notebook when debugging a function, I would appreciate it!

EDIT

I forgot to mention that if I don't use the .copy() method, I always get a SettingWithCopyWarning.
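For context on that warning: it typically fires when the frame you write into may be a slice (view) of another frame, so pandas cannot tell whether the write will propagate. A minimal sketch of the pattern and the explicit-copy fix, using a small made-up frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# A filtered selection: pandas cannot tell whether this shares
# memory with df, so writing into it directly can raise
# SettingWithCopyWarning.
subset = df[df["a"] > 1].copy()  # explicit .copy() makes ownership clear
subset["b"] = 0                  # no warning: subset is its own frame

# The original frame is untouched.
```

So the warning is not about correctness of the copy itself; it is pandas asking you to state explicitly whether you want a view or an independent frame.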

Upvotes: 0

Views: 74

Answers (1)

el científico

Reputation: 11

If I understand correctly, you can put that function inside a separate module, so you can test it without having to run your whole notebook just to read the function definition.

Having your function inside a module also lets you build a small dataset and design test cases covering the most common use cases. You can even create a whole separate script just to test your function.
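A minimal sketch of that idea: a stand-in preprocess (here it just doubles a hypothetical "col" column; your real function would live in its own module and be imported) tested on a tiny synthetic frame. In a notebook, the %autoreload extension can re-import the edited module automatically, so fixing a bug doesn't require re-running earlier cells.

```python
import pandas as pd

# Stand-in for the real function, normally imported from your module,
# e.g.: from preprocessing import preprocess  (hypothetical module name)
def preprocess(data_input):
    data = data_input.copy()
    data["col"] = data["col"] * 2  # "col" is a hypothetical column
    return data

def test_preprocess():
    small = pd.DataFrame({"col": [1, 2, 3]})  # tiny synthetic input
    out = preprocess(small)
    assert list(out["col"]) == [2, 4, 6]       # transformation applied
    assert list(small["col"]) == [1, 2, 3]     # input left untouched

test_preprocess()
```

In the notebook itself you would enable automatic reloading once at the top (`%load_ext autoreload` then `%autoreload 2`), edit the module file, and simply re-run the one cell that calls the function.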

Upvotes: 1

Related Questions