Reputation: 655
I'm having issues structuring my Makefile to run my shell scripts in the desired order.
Here is my current Makefile:
## Create data splits
raw_data: src/data/get_data.sh
	src/data/get_data.sh
	hadoop fs -cat data/raw/target/* >> data/raw/target.csv
	hadoop fs -cat data/raw/control/* >> data/raw/control.csv
	hadoop fs -rm -r -f data/raw
	touch raw_data_loaded

split_data: raw_data_loaded
	rm -rf data/interim/splits
	mkdir data/interim/splits
	$(PYTHON_INTERPRETER) src/data/split_data.py

## Run Models
random_forest: split_data
	nohup $(PYTHON_INTERPRETER) src/models/random_forest.py > random_forest &

under_gbm: split_data
	nohup $(PYTHON_INTERPRETER) src/models/undersampled_gbm.py > under_gbm &

full_gbm: split_data
	nohup $(PYTHON_INTERPRETER) src/models/full_gbm.py > full_gbm &

# Create predictions from model files
predictions: random_forest under_gbm full_gbm
	nohup $(PYTHON_INTERPRETER) src/models/predictions.py > predictions &
The Problem
Everything works OK until I start the ## Run Models section. These are all independent scripts, which can all run once split_data is finished. I want to run each of the 3 model scripts simultaneously, so I run each in the background with &.
The problem is that my last task, predictions, begins to run at the same time as the three preceding tasks. What I want is for the 3 simultaneous model scripts to finish, and only then for predictions to run.
My Attempt
My proposed solution is to run my final model task, full_gbm, without the &, so that predictions doesn't run until that is finished. This should work, but I'm wondering if there is a less 'hacky' way to achieve this -- is there some way to structure the target variables to achieve the same result?
Upvotes: 1
Views: 1385
Reputation: 30968
You don't say which implementation of Make you're using. If it's GNU Make, you can invoke it with the -j option to allow it to decide which jobs should be run in parallel. Then you can remove the nohup and & from all the commands; predictions won't start until all of random_forest, under_gbm and full_gbm have completed, and the build itself won't end until predictions has completed.
Also, you won't lose the all-important exit status of the commands.
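As a rough sketch of what that could look like, here are the model and prediction rules with nohup and & removed. It is meant to replace only those rules in the question's Makefile; the raw_data and split_data rules stay as they are. The .log file names and the .PHONY line are my own assumptions, not something from the question: I renamed the output files so they don't share a name with the targets. Recipe lines must start with a tab.

```makefile
# Assumption: these targets are task names rather than real files.
.PHONY: random_forest under_gbm full_gbm predictions

## Run Models
# No nohup and no &: each recipe runs in the foreground, so make knows
# when it has finished and whether it succeeded.
random_forest: split_data
	$(PYTHON_INTERPRETER) src/models/random_forest.py > random_forest.log

under_gbm: split_data
	$(PYTHON_INTERPRETER) src/models/undersampled_gbm.py > under_gbm.log

full_gbm: split_data
	$(PYTHON_INTERPRETER) src/models/full_gbm.py > full_gbm.log

# Create predictions from model files
# Starts only after all three model targets have completed.
predictions: random_forest under_gbm full_gbm
	$(PYTHON_INTERPRETER) src/models/predictions.py > predictions.log
```

With GNU Make you would then run something like `make -j3 predictions`. The three model targets depend only on split_data, so make is free to run them at the same time, and it holds predictions back until all three have exited. If one of the model scripts fails, make sees the non-zero exit status and does not start predictions.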
Upvotes: 2