tomas_s

Reputation: 11

Avoiding running all makefiles after running make on a target

I'm new to Make and am trying to use it to automate an ML pipeline. I have defined two rules in two files: the first, Makefile, holds file locations and a target that cleans a dataset; the second, Makefile.label.surv1, has a target that extracts labels using a Python script. Below is the code for both:

Makefile

#--------------------------------------------------------------------------
# Survival Analysis - Churn prediction top makefile
#--------------------------------------------------------------------------

# date of the snapshot to consider
SNAP_TRN := 2019-06-18
SNAP_TST := 2019-07-24

# directories
DIR_DATA := data
DIR_BUILD := build
DIR_FEATURE := $(DIR_BUILD)/feature
DIR_METRIC := $(DIR_BUILD)/metric
DIR_MODEL := $(DIR_BUILD)/model
DIR_CONFIG := configs

# data files for training and predict
DATA_TRN := $(DIR_DATA)/processed/
DATA_TST := $(DIR_DATA)/processed/

# NOS feature selection
FEATS := $(DIR_CONFIG)/featimp_churnvol.csv
# Config files
CONFIG_PANEL := $(DIR_CONFIG)/config_panel.yaml
CONFIG_INPUT := $(DIR_CONFIG)/config_inpute.yaml

# Generates a clean dataset (imputed and one-hot encoded) and labels for train and test
buildDataset: $(DATA_TRN) $(DATA_TST) $(CONFIG_PANEL) $(CONFIG_INPUT) $(FEATS)
	python src/buildDataset.py --train-file $(DATA_TRN) \
	                           --test-file $(DATA_TST) \
	                           --config-panel $(CONFIG_PANEL) \
	                           --config-input $(CONFIG_INPUT) \
	                           --feats $(FEATS)

Makefile.label.surv1

#--------------------------------------------------------------------------
# surv1: survival labels
#--------------------------------------------------------------------------
include Makefile


FEATURE_NAME := surv1

GRAN := weekly
STUDY_DUR := 1
       
Y_SURV_TRN := $(DIR_DATA)/survival/$(FEATURE_NAME)_train_$(SNAP_TRN)_$(GRAN)_$(STUDY_DUR).pkl
Y_SURV_TST := $(DIR_DATA)/survival/$(FEATURE_NAME)_test_$(SNAP_TST)_$(GRAN)_$(STUDY_DUR).pkl

$(Y_SURV_TRN) $(Y_SURV_TST): $(DATA_TRN) $(DATA_TST) $(CONFIG_PANEL)
	python ./src/generate_surv_labels.py --train-file $(DATA_TRN) \
	                                     --test-file $(DATA_TST) \
	                                     --train-label-file $(Y_SURV_TRN) \
	                                     --test-label-file $(Y_SURV_TST) \
	                                     --config-panel $(CONFIG_PANEL) \
	                                     --study-dur $(STUDY_DUR) \
	                                     --granularity $(GRAN)

When I run make -f Makefile.label.surv1, it also runs the buildDataset target, which I don't want in this case. I haven't made any changes to buildDataset, so I don't understand why make re-runs this target. Is there any way to prevent a target from re-running others?

Upvotes: 0

Views: 91

Answers (1)

MadScientist

Reputation: 100836

If you don't provide a target to build on the make command line, make will build the default target. The default target is the FIRST explicit target defined in your makefile(s).

In your Makefile.label.surv1, the first thing you do, before you define any other target, is include Makefile. That means that if any target is defined in Makefile, it will be considered the first explicit target.

And, indeed, Makefile defines buildDataset, so that is the default target; if you run make without naming a specific target, that's the target that will be built.
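
One possible way to act on this, as a sketch rather than the only fix: keep the include where it is and tell make explicitly which goal to build by default, using GNU make's .DEFAULT_GOAL variable. The target name labels is hypothetical (it does not appear in the original makefiles), and the existing rule that produces the two .pkl files is assumed to stay in the file unchanged.

```makefile
# Makefile.label.surv1 (sketch; "labels" is a hypothetical target name)
include Makefile

# The included Makefile made buildDataset the first explicit target.
# Override that so a bare "make -f Makefile.label.surv1" builds the labels.
.DEFAULT_GOAL := labels

FEATURE_NAME := surv1
GRAN := weekly
STUDY_DUR := 1

Y_SURV_TRN := $(DIR_DATA)/survival/$(FEATURE_NAME)_train_$(SNAP_TRN)_$(GRAN)_$(STUDY_DUR).pkl
Y_SURV_TST := $(DIR_DATA)/survival/$(FEATURE_NAME)_test_$(SNAP_TST)_$(GRAN)_$(STUDY_DUR).pkl

# Convenience target that depends on both label files.
# It must come after the variable definitions, because prerequisites
# are expanded as soon as the rule is read.
.PHONY: labels
labels: $(Y_SURV_TRN) $(Y_SURV_TST)
```

With a named target like this in place, you can also skip the default goal entirely and ask for it on the command line: make -f Makefile.label.surv1 labels.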

Also, this rule is very likely not right:

$(Y_SURV_TRN) $(Y_SURV_TST): ...

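I'm not sure what you're hoping this will do, but if you expect make to interpret this as "one invocation of the recipe builds both of these files", that's not what this syntax means. A rule with several explicit targets is shorthand for writing the same rule once per target, so make may run the recipe once for each file.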
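
If the intent is that a single run of the script produces both label files, one way to express that is the grouped-target separator &:, which assumes GNU make 4.3 or newer (older versions do not treat it as grouped targets). The sketch below reuses the variables from the question. It passes STUDY_DUR and GRAN as plain values instead of listing them as prerequisites, on the assumption that they are settings rather than files.

```makefile
# Sketch only: "&:" (grouped targets) needs GNU make 4.3 or newer.
# One invocation of the recipe is understood to create BOTH targets.
# Recipe lines must start with a tab character.
$(Y_SURV_TRN) $(Y_SURV_TST) &: $(DATA_TRN) $(DATA_TST) $(CONFIG_PANEL)
	python ./src/generate_surv_labels.py --train-file $(DATA_TRN) \
	    --test-file $(DATA_TST) \
	    --train-label-file $(Y_SURV_TRN) \
	    --test-label-file $(Y_SURV_TST) \
	    --config-panel $(CONFIG_PANEL) \
	    --study-dur $(STUDY_DUR) \
	    --granularity $(GRAN)
```

On older versions of make, a common workaround is to have the recipe create a single sentinel file and make other targets depend on that.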

Upvotes: 1
