How to merge two dataframes with conditional statements

Question

I have two df I want to merge.

df1 looks like this:

CODE    TAX   PRICE   TOTAL
4577    0.9   99.1    100
8965    NA    25      25
7788    1.5   13.5    15
4021    NA    20      20
6987    0.00  40      40

df1 is a dataframe containing codes, tax cost depending on prices and the total (sum) of taxes + prices.

df2 looks like this:

CODE    TAX.CLASS   MEANING
4577     CLASS1     Car
4577     CLASS2     Car
8965     CLASS1     Internet
2635     CLASS1     Computer
7788     CLASS1     Phone
7788     CLASS2     Phone
1258     CLASS1     Water
4021     CLASS1     Food
6987     CLASS1     Gasoline
6987     CLASS2     Gasoline

where each code displays a tax.class and meaning. I would like to get an output like this:

CODE    MEANING   TAX.CLASS   TAX   PRICE   TOTAL
4577    Car        CLASS2     0.9   99.1    100
8965    Internet   CLASS1     NA    25      25
7788    Phone      CLASS2     1.5   13.5    15
4021    Food       CLASS1     NA    20      20
6987    Gasoline   CLASS1     0.00  40      40

So far I have tried with the answers of these questions:

but it has not worked as I want. The issue basically is to match every code in df1 with the code in df2 but retrieving CLASS1 if there is NA or a TAX value in df1 <= 0.00, on the other hand, retrieve CLASS2 if there is a TAX value in df1 > 0.00 on codes repeated twice.

Any suggestions?

Preferably BaseR.

arg0naut91 · Accepted Answer

You could do:

df1$TAX.CLASS<- ifelse(is.na(df1$TAX) | df1$TAX <= 0.00, "CLASS1", "CLASS2")

df_final <- merge(df1, df2, by = c("CODE", "TAX.CLASS"), all.x = TRUE)

Output:

   CODE TAX.CLASS TAX PRICE TOTAL  MEANING
1: 4021    CLASS1  NA  20.0    20     Food
2: 4577    CLASS2 0.9  99.1   100      Car
3: 6987    CLASS1 0.0  40.0    40 Gasoline
4: 7788    CLASS2 1.5  13.5    15    Phone
5: 8965    CLASS1  NA  25.0    25 Internet

This is a solution in base, but the same logic can be applied in any package.

Note that you could also use with in the first statement to avoid referencing the data frame:

df1$TAX.CLASS<- with(df1, ifelse(is.na(TAX) | TAX <= 0.00, "CLASS1", "CLASS2"))

How to merge two dataframes with conditional statements

Answers (2)

Related Questions