mondano

Reputation: 875

Unable to import Python function in Jupyter

I have a Jupyter notebook with Python 3.5. I use it to analyze data from a simulation that I have written in Python.

In the first cell, I run the simulation with

%run control.py

and I get the error

> ImportError                               Traceback (most recent call last)
> ...\code\control.py in <module>()
>      15 from supplier import Supplier
>      16 from heatmap import create_heatmaps
> ---> 17 from write2csv import get_dataframe_from_results, write_raw_data_from_simulation, get_aggregated_lines_per_run
>      18 #write_aggregated_results,
>      19 
> 
> ImportError: cannot import name 'get_dataframe_from_results'

My program is split among several files. When I remove the function 'get_dataframe_from_results' from the imports, it works. This function sits alongside several others in the file/module write2csv.

I don't understand why only this function cannot be imported. All other functions from this file can be imported, so I rule out an issue with the source folder location.

The function itself does not contain anything out of the ordinary:

def get_dataframe_from_results(all_aggr_results):
    # convert results to pandas data frame from nested dictionary
    results_df = pd.Panel(all_aggr_results)
    STRATS = ("AN", "RE")
    RLZ = ("NOR", "DIS")
    vlzlist = []
    for vlz in sorted(all_aggr_results):
        outerlist = []
        for rl in RLZ:
            concatlist = []
            for strt in STRATS:
                concatlist.append(pd.DataFrame.from_dict(results_df[vlz][strt][rl], orient="index"))
            outerlist.append(pd.concat(concatlist, keys=STRATS))
        vlzlist.append(pd.concat(outerlist, keys=RLZ))
    results = pd.concat(vlzlist, keys=sorted(all_aggr_results))
    results.index.names = ["A", "B", "C", "C"]
    results["totalcost"] = results["AAA"] + results["BBB"] + results["CCC"] + results["DDD"]
    results.reset_index(inplace=True)  # transform multiindex to columns

    return results

The only "reason" why it could be special compared to other functions is that it uses pandas.

When I run the script control.py in PyCharm it works without problems. When I run it from the command line, I get

Error while finding spec for 'control.py' (<class 'AttributeError'>: module 'control' has no attribute '__path__')

When I leave out the function get_dataframe_from_results from my code, it works in Jupyter.

How can I get around this error in Jupyter and still use my function?

The version of the notebook server is 4.1.0 and is running on:

Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Feb 16 2016, 09:58:36) [MSC v.1500 64 bit (AMD64)]

Current Kernel Information:

Python 3.5.1 |Anaconda 4.1.0 (64-bit)| (default, Jun 15 2016, 15:29:36) [MSC v.1900 64 bit (AMD64)]

Upvotes: 3

Views: 4637

Answers (1)

Patrick Kelly

Reputation: 1381

This problem occurs when you edit the external code while you are also working in the Jupyter notebook. The notebook's kernel caches a module the first time it is imported and does not reload the external file afterwards, so functions added later are not visible.

The solution is to delete the external Python cache directory __pycache__ and then restart the kernel through the menu entry "Kernel --> Restart and Clear Output". Doing both forces Jupyter to read a fresh copy of the external file, so it recognizes new symbols and other modifications.
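If restarting the kernel each time is inconvenient, reloading the module explicitly may also work. The following is only a minimal sketch of the caching behaviour, not the code from the question: it writes a throwaway module (the hypothetical name demo_mod stands in for write2csv) to a temporary folder, edits it after the first import, and shows that the new function only becomes visible after importlib.reload.

```python
import importlib
import pathlib
import sys
import tempfile

# Skip .pyc files so the demo always compiles the current source.
sys.dont_write_bytecode = True

# Create a throwaway module on disk; "demo_mod" is a hypothetical name.
tmp_dir = tempfile.mkdtemp()
sys.path.insert(0, tmp_dir)
mod_path = pathlib.Path(tmp_dir) / "demo_mod.py"
mod_path.write_text("def first():\n    return 1\n")

import demo_mod  # first import: the module object is stored in sys.modules

# Simulate editing the file while the kernel keeps running.
mod_path.write_text(
    "def first():\n    return 1\n\n"
    "def second():\n    return 2\n"
)

import demo_mod  # no effect: Python returns the cached module object
has_before = hasattr(demo_mod, "second")

# Force Python to re-read the source file.
importlib.invalidate_caches()
demo_mod = importlib.reload(demo_mod)
has_after = hasattr(demo_mod, "second")

print(has_before, has_after)
```

In a notebook, the IPython magics %load_ext autoreload followed by %autoreload 2 aim to do this reloading automatically before each cell runs. Names that were already bound with "from module import name" keep pointing at the old objects after a manual reload, so that import line needs to be run again afterwards.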

(I realize your question is more than a year old. But after struggling with this issue all morning today, I wanted to get a documented answer out for anyone else who runs into this.)

Upvotes: 8
