User23

Reputation: 153

Combine data frames in R unless entry already exists

I have a large data frame A that holds sales figures for different items, but only for the weeks in which sales occurred; weeks with no sales are missing. I therefore created a second data frame B that contains all weeks, with sales set to 0. I now want to add the rows of B to A, except for the weeks where A already records a sale. I was hoping to do this via an added combination variable, but can't figure out a fast way to do it.

So I have for example

A  Week ID Sales Combination           B   Week ID Sales Combination
   1    X  5     1_X                       1    X  0     1_X
   2    X  6     2_X                       2    X  0     2_X
   5    X  5     5_X                       3    X  0     3_X
   6    X  4     6_X                       4    X  0     4_X
   1    Y  2     1_Y                       5    X  0     5_X
   3    Y  2     3_Y                       6    X  0     6_X
   5    Y  2     5_Y                       1    Y  0     1_Y
                                           2    Y  0     2_Y
                                           3    Y  0     3_Y
                                           4    Y  0     4_Y
                                           5    Y  0     5_Y

And what I want is this

 Week ID Sales Combination
 1    X  5     1_X
 2    X  6     2_X
 3    X  0     3_X
 4    X  0     4_X
 5    X  5     5_X
 6    X  4     6_X
 1    Y  2     1_Y
 2    Y  0     2_Y
 3    Y  2     3_Y
 4    Y  0     4_Y
 5    Y  2     5_Y

Hope this makes it more or less clear.

Upvotes: 0

Views: 317

Answers (2)

jgadoury

Reputation: 293

Let dfA be the first data frame and dfB the second. You could do:

# Keep only the rows of dfB whose Combination is not already in dfA
new_df = rbind(dfA, dfB[!(dfB$Combination %in% dfA$Combination), ])

# Order the data frame by ID, then by Week
new_df = new_df[order(new_df$ID, new_df$Week), ]

Alternatively, you could start from dfB as your new data frame and then use match to look up the values from dfA and put them in the right rows.
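
The match-based alternative could look roughly like the sketch below. It rebuilds the sample data from the question so that it runs on its own; the helper names idx and found are only illustrative, and it assumes each Combination value occurs at most once in dfA.

```r
# Sample data from the question (Combination = Week_ID)
dfA <- data.frame(
  Week  = c(1, 2, 5, 6, 1, 3, 5),
  ID    = c("X", "X", "X", "X", "Y", "Y", "Y"),
  Sales = c(5, 6, 5, 4, 2, 2, 2),
  stringsAsFactors = FALSE
)
dfA$Combination <- paste(dfA$Week, dfA$ID, sep = "_")

dfB <- data.frame(
  Week  = c(1:6, 1:5),
  ID    = c(rep("X", 6), rep("Y", 5)),
  Sales = 0,
  stringsAsFactors = FALSE
)
dfB$Combination <- paste(dfB$Week, dfB$ID, sep = "_")

# Start from the complete grid of weeks (all sales 0)
new_df <- dfB

# For every row of new_df, find the position of its Combination in dfA
# (NA where dfA has no such row)
idx <- match(new_df$Combination, dfA$Combination)
found <- !is.na(idx)

# Overwrite the zeros with the recorded sales where a match exists
new_df$Sales[found] <- dfA$Sales[idx[found]]
```

Since dfB is already ordered by ID and Week, no sorting step is needed afterwards.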

Upvotes: 1

bpheazye

Reputation: 146

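# Stack A on top of B (rbind has no 'by' argument)
newdataframe <- rbind(A, B)
# Drop rows whose Week/ID combination has already appeared
newdataframe <- newdataframe[!duplicated(newdataframe$Combination), ]

This should solve it. Because the rows of A come first, duplicated() only flags the matching rows of B, so the sales figures from A are kept.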
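
As a quick check, here is a sketch of the stack-then-deduplicate idea on the sample data from the question. It deduplicates on Combination so that both Week and ID are taken into account; the name combined and the final sorting step are additions for illustration.

```r
# Sample data from the question
A <- data.frame(
  Week  = c(1, 2, 5, 6, 1, 3, 5),
  ID    = c("X", "X", "X", "X", "Y", "Y", "Y"),
  Sales = c(5, 6, 5, 4, 2, 2, 2),
  stringsAsFactors = FALSE
)
B <- data.frame(
  Week  = c(1:6, 1:5),
  ID    = rep(c("X", "Y"), times = c(6, 5)),
  Sales = 0,
  stringsAsFactors = FALSE
)
A$Combination <- paste(A$Week, A$ID, sep = "_")
B$Combination <- paste(B$Week, B$ID, sep = "_")

# Stack A on top of B; A's rows come first
combined <- rbind(A, B)

# duplicated() marks the second and later occurrences, i.e. the rows of B
# whose Combination already appeared in A
combined <- combined[!duplicated(combined$Combination), ]

# Restore the ID/Week order of the desired output and reset row names
combined <- combined[order(combined$ID, combined$Week), ]
rownames(combined) <- NULL
```

If there is no Combination column, duplicated(combined[c("Week", "ID")]) should give the same result.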

Upvotes: 0
