agustin

Reputation: 1351

Reference the row above in a data.frame if condition not satisfied

With a data.frame colors:

colors <- data.frame(color=c("Red","Light Red","Dark Red","Blue","Turquise", "Dark Blue","Cyan"),
                     level=c("Primary",rep("Secondary",2),"Primary",rep("Secondary",3)),
                     stringsAsFactors = FALSE)

I want to add a new column primary.color whose value depends on the column level. If level == "Primary", the color itself is the primary color (you know: red, green, blue...). For the other colors (marked as Secondary in colors), the primary color should be taken from the previous row's value of colors$primary.color. Something like:

colors$primary.color <- ifelse(colors$level == "Primary", 
                               colors$color,
                               "Value of Primary Color above")

The desired output should be:

colorsOutput <- data.frame(color=c("Red","Light Red","Dark Red","Blue","Turquise", "Dark Blue","Cyan"),
                           level=c("Primary",rep("Secondary",2),"Primary",rep("Secondary",3)),
                           primary.color=c("Red","Red","Red","Blue","Blue", "Blue", "Blue"),
                           stringsAsFactors = FALSE)

Upvotes: 0

Views: 568

Answers (2)

talat

Reputation: 70326

This is a good use case for "last observation carried forward" (LOCF), i.e. the na.locf function from the zoo package.

Start by getting an index of where the primary colors are located:

idx <- colors$level == "Primary"

Then, assign those primary colors to a new column, leaving the other rows as missing values:

colors[idx, "primary"] <- colors$color[idx]

Now you can use na.locf to fill the NAs with the most recent primary color above them:

colors$primary <- zoo::na.locf(colors$primary)
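For readers without zoo installed, the same three steps can be sketched end to end in base R. The helper name locf below is hypothetical (it is not part of zoo or base R), and the sketch assumes the first row is a Primary color, so there is no leading NA to handle.

```r
# Sample data from the question
colors <- data.frame(
  color = c("Red", "Light Red", "Dark Red", "Blue", "Turquise", "Dark Blue", "Cyan"),
  level = c("Primary", rep("Secondary", 2), "Primary", rep("Secondary", 3)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry the last non-NA value forward.
# Assumes x[1] is not NA; a leading NA would yield index 0 and shorten the result.
locf <- function(x) {
  last_seen <- cummax(seq_along(x) * (!is.na(x)))  # position of the most recent non-NA
  x[last_seen]
}

idx <- colors$level == "Primary"          # step 1: locate the primary rows
colors$primary <- NA_character_           # step 2: new column, NA everywhere...
colors$primary[idx] <- colors$color[idx]  # ...except on the primary rows
colors$primary <- locf(colors$primary)    # step 3: fill the NAs downward
```

With zoo itself, note that na.locf drops leading NAs by default (na.rm = TRUE), so if the data could start with a Secondary row, passing na.rm = FALSE keeps the result the same length as the data.frame.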

Upvotes: 3

Cath

Reputation: 24074

Assuming your data.frame is sorted (each primary color first, followed by its secondary colors), you can do:

colors$primary.color <- colors$color[colors$level=="Primary"][cumsum(colors$level=="Primary")]
colors
#      color     level primary.color
#1       Red   Primary           Red
#2 Light Red Secondary           Red
#3  Dark Red Secondary           Red
#4      Blue   Primary          Blue
#5  Turquise Secondary          Blue
#6 Dark Blue Secondary          Blue
#7      Cyan Secondary          Blue

Explanation: colors$color[colors$level=="Primary"] extracts all the primary colors. cumsum(colors$level=="Primary") increases by one at each Primary row, so it gives every row the index of the most recent primary color. Subsetting the primary colors with that index vector produces the new column.
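To make the intermediate values visible, here is the one-liner split into named steps on the question's data. The variable names is_primary, primaries and group are only illustrative.

```r
# Sample data from the question
colors <- data.frame(
  color = c("Red", "Light Red", "Dark Red", "Blue", "Turquise", "Dark Blue", "Cyan"),
  level = c("Primary", rep("Secondary", 2), "Primary", rep("Secondary", 3)),
  stringsAsFactors = FALSE
)

is_primary <- colors$level == "Primary"   # TRUE FALSE FALSE TRUE FALSE FALSE FALSE
primaries  <- colors$color[is_primary]    # "Red" "Blue"
group      <- cumsum(is_primary)          # 1 1 1 2 2 2 2

# Each row picks the primary color of its group
colors$primary.color <- primaries[group]
```

This relies on the first row being Primary: a leading Secondary row would get group 0, and indexing with 0 drops that element, so the assignment would fail with a length mismatch.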

Upvotes: 4
