Reputation: 73
I tried to dockerize my machine learning model, which is written in Python. The script uses pandas to load CSV files. When I run the image in a container, the pd.read_csv("FILENAME.csv") call cannot find the CSV file (I think the problem might be that the CSV file is not in the container). Any suggestions on what I should do to run this Python script and read the CSV files in Docker?
Dockerfile:
FROM python:latest
RUN pip install pandas
RUN pip install numpy
RUN pip install sklearn
COPY . /app
ENTRYPOINT ["python", "app/model1.py","death_clean.csv","condition_data_clean.csv"]
model1.py:
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
df1=pd.read_csv("/Users/yaoyan/Desktop/docker-trial/condition_data_clean.csv",error_bad_lines=False)
df2=pd.read_csv("/Users/yaoyan/Desktop/docker-trial/death_clean.csv",error_bad_lines=False)
df=pd.merge(df1,df2,on=['person_id'], how='left')
When I ran it, I got this error:
FileNotFoundError: File b'/Users/yaoyan/Desktop/docker-trial/condition_data_clean.csv' does not exist
Upvotes: 4
Views: 6313
Reputation: 24701
You should create a volume containing your data using the docker volume command. After this step, mount the volume with the -v option of docker run, e.g. -v my_data_volume:/data. Lastly, change the path in your Python script accordingly; in this case it would be /data/my_csv.csv. More information is available in the Docker documentation.
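On the script side, the volume approach might look roughly like the sketch below, which builds paths under the mount point instead of hard-coding a host path. The DATA_DIR environment variable and the data_path and load_csv helpers are illustrative assumptions, not part of the original script; /data matches the mount target in the -v example.

```python
import os
from pathlib import Path

# Hypothetical environment variable that lets the data directory be overridden.
DATA_DIR_ENV = "DATA_DIR"
# Default matches the mount target used in the -v example (/data).
DEFAULT_DATA_DIR = "/data"


def data_path(filename):
    """Return the path of a data file inside the mounted data directory."""
    base = Path(os.environ.get(DATA_DIR_ENV, DEFAULT_DATA_DIR))
    return base / filename


def load_csv(filename):
    """Read a CSV from the data directory, failing early with a clear message."""
    import pandas as pd  # imported lazily so data_path works without pandas

    path = data_path(filename)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; is the volume mounted at {path.parent}?"
        )
    return pd.read_csv(path)
```

With the volume mounted at /data, load_csv("my_csv.csv") would read /data/my_csv.csv. Setting DATA_DIR to a local folder lets the same script run outside the container.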
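Or, if you insist on copying the file, use the path /app/condition_data_clean.csv in your pandas read_csv call. COPY . /app places the build context under /app in the image, so this works as long as the CSV files sit in the same folder as the Dockerfile. The /Users/yaoyan/... path exists only on your host machine, not inside the container.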
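If the files are copied into the image, one way to avoid hard-coding either /app or a host path is to resolve the file names passed by the ENTRYPOINT relative to the script's own directory. This is only a sketch: resolve_inputs and load_frames are hypothetical names, and it assumes the CSV files end up next to model1.py, as they would with COPY . /app.

```python
from pathlib import Path


def resolve_inputs(names, base_dir):
    """Turn bare file names into paths under base_dir."""
    base = Path(base_dir)
    return [base / name for name in names]


def load_frames(argv, base_dir=None):
    """Load and merge the two CSVs named on the command line.

    argv is expected to hold the death file first and the condition file
    second, matching the order in the ENTRYPOINT of the question.
    """
    import pandas as pd  # imported lazily so resolve_inputs works without pandas

    if len(argv) < 2:
        raise SystemExit("usage: model1.py DEATH_CSV CONDITION_CSV")
    if base_dir is None:
        # Directory holding this script: /app in the container,
        # the project folder when run on the host.
        base_dir = Path(__file__).resolve().parent
    death_csv, condition_csv = resolve_inputs(argv[:2], base_dir)
    df1 = pd.read_csv(condition_csv)
    df2 = pd.read_csv(death_csv)
    return pd.merge(df1, df2, on=["person_id"], how="left")
```

Calling load_frames(sys.argv[1:]) at the bottom of model1.py (after import sys) would then pick up death_clean.csv and condition_data_clean.csv from /app inside the container and from the project folder on the host.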
Upvotes: 4