Reputation: 2877
I want to deploy my tidymodels ML model to GCP so it can serve predictions to others.
I am following along with this video from Julia Silge, where she uses vetiver and Docker to deploy to RStudio Connect.
I am also following this video from Mark Edmondson, who created the googleCloudRunner package in R for setting up a GCP project and defining the APIs and service accounts needed.
I have successfully authenticated to GCP: my .Renviron file contains all the variables needed for auto-authentication (the paths to my client secret and auth file), I have permission to write to my bucket, and I can create the plumber file and build the Docker image. However, I'm having issues running the Docker image on my Windows machine.
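For context, the relevant .Renviron entries look roughly like this (variable names follow the googleCloudRunner and googleCloudStorageR conventions; all paths and values here are placeholders):

```
# .Renviron (placeholder paths and values)
GAR_CLIENT_JSON="C:/keys/client-secret.json"    # OAuth client secret
GCE_AUTH_FILE="C:/keys/service-account.json"    # service account key for auto-auth
GCS_AUTH_FILE="C:/keys/service-account.json"    # auth file read by googleCloudStorageR
GCS_DEFAULT_BUCKET="ml-bucket-r"                # bucket used by board_gcs() below
```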
I get the following error, which appears to indicate that the Docker image cannot find the googleCloudStorageR package. I've manually modified the Dockerfile to install this package but continue to get the same error.
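The manual change was along these lines; a sketch of the kind of line added to the Dockerfile that `vetiver_write_docker()` generated (the rest of the Dockerfile is left as generated):

```dockerfile
# added manually so the image has the package the error complains about
RUN Rscript -e 'install.packages("googleCloudStorageR")'
```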
Here is the script I've copied from Julia's blog.
Any help moving forward with this project would be greatly appreciated.
```{r}
pacman::p_load(tidyverse, tidymodels, textrecipes, vetiver, pins, googleCloudRunner, googleCloudStorageR)
```
```{r}
# set up a new GCP project using this function
# https://youtu.be/RrYrMsoIXsw?si=bJwEEqEzBGpIh_vg
# cr_setup()
```
```{r}
lego_sets <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-09-06/sets.csv.gz')
```
```{r}
glimpse(lego_sets)
```
```{r}
lego_sets %>%
  filter(num_parts > 0) %>%
  ggplot(aes(num_parts)) +
  geom_histogram(bins = 20) +
  scale_x_log10()
```
```{r}
set.seed(123)
lego_split <- lego_sets %>%
  filter(num_parts > 0) %>%
  transmute(num_parts = log10(num_parts), name) %>%
  initial_split(strata = num_parts)
```
```{r}
lego_train <- training(lego_split)
lego_test <- testing(lego_split)
```
```{r}
set.seed(234)
lego_folds <- vfold_cv(lego_train, strata = num_parts)
lego_folds
```
```{r}
lego_rec <- recipe(num_parts ~ name, data = lego_train) %>%
  step_tokenize(name) %>%
  step_tokenfilter(name, max_tokens = 200) %>%
  step_tfidf(name)
lego_rec
```
```{r}
svm_spec <- svm_linear(mode = "regression")
lego_wf <- workflow(lego_rec, svm_spec)
```
```{r}
set.seed(234)
doParallel::registerDoParallel()
lego_rs <- fit_resamples(lego_wf, lego_folds)
collect_metrics(lego_rs)
```
```{r}
final_fitted <- last_fit(lego_wf, lego_split)
collect_metrics(final_fitted)
```
```{r}
final_fitted %>%
  extract_workflow() %>%
  tidy() %>%
  arrange(-estimate)
```
```{r}
v <- final_fitted %>%
  extract_workflow() %>%
  vetiver_model(model_name = "lego-sets")
```
```{r}
v$metadata
```
## Publish and version model in GCS
```{r}
board <- board_gcs("ml-bucket-r")
board %>% vetiver_pin_write(v)
```
```{r}
vetiver_write_plumber(board, "lego-sets")
```
```{r}
vetiver_write_docker(v)
```
```{bash}
docker build -t lego-sets .
```
## Run the docker container and specify the environment variables
```{bash}
docker run --env-file .Renviron --rm -p 8000:8000 lego-sets
```
Upvotes: 0
Views: 63
Reputation: 11663
When you run the code to create your Dockerfile, can you try passing in some `additional_pkgs` to get the right packages installed into the Docker container?

```r
vetiver_write_docker(v, additional_pkgs = required_pkgs(board))
```
Check out the documentation here, which outlines this argument:

> `additional_pkgs`: A character vector of additional package names to add to the Docker image. For example, some boards like `pins::board_s3()` require additional software; you can use `required_pkgs(board)` here.
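Putting it together for the GCS board in the question, the fix would look something like this (a sketch; it assumes the `v` vetiver model and bucket name from the question, and it needs valid GCP credentials to actually run):

```r
library(pins)
library(vetiver)

# recreate the GCS board the model was pinned to
board <- board_gcs("ml-bucket-r")

# required_pkgs(board) reports the packages the board needs at prediction
# time -- for a GCS board this should include googleCloudStorageR
vetiver_write_docker(v, additional_pkgs = required_pkgs(board))
```

Then rebuild and rerun the container as before: `docker build -t lego-sets .` followed by `docker run --env-file .Renviron --rm -p 8000:8000 lego-sets`.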
Upvotes: 1