Replacing non sequential data in a dataframe with sequential data (repeated for unique values)

Question

I have a data set that looks like this:

dat <- data.frame(x=c(1,1,2,2,7,7,8,8), y=c(rep(c(-1,-2),4)), 
                  z= c(0.5,0.6,0.6,0.4,0.3,0.3,0.5,0.5))

dat
  x  y   z
1 1 -1 0.5
2 1 -2 0.6
3 2 -1 0.6
4 2 -2 0.4
5 7 -1 0.3
6 7 -2 0.3
7 8 -1 0.5
8 8 -2 0.5

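The x values are numeric dates against which I am plotting the y and z values. I need to replace the non-sequential x values with a sequential vector, so that each unique x value maps to the next integer in the sequence and that integer is repeated on every row sharing the original value.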

I have tried to replace the values arithmetically using a for loop that splits the data into one data frame per unique x value. This has two problems. First, the gaps remain whenever the unique x values are used in a formula such as data$x - min(alldata$x). Second, because each resulting data frame contains only a single unique x value, I cannot replace it inside the loop and still get a result that is unique for each x value across the entire dataset.

I'm just starting with loops and I feel as though there's a different way to iterate across the data to achieve the outcome I require but I haven't been able to figure it out yet.

akrun · Accepted Answer

With dplyr, this can be done with group_indices:

library(dplyr)
dat %>% 
    mutate(x = group_indices(., x))
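As far as I know, dplyr 1.0.0 and later deprecate calling group_indices() with grouping columns passed directly, so the following is a sketch of two equivalents under that assumption. The names res_rank, res_group and idx are illustrative and not part of the original answer.

```r
library(dplyr)

dat <- data.frame(x = c(1, 1, 2, 2, 7, 7, 8, 8),
                  y = rep(c(-1, -2), 4),
                  z = c(0.5, 0.6, 0.6, 0.4, 0.3, 0.3, 0.5, 0.5))

# dense_rank() ranks values without gaps, so equal x values share one index
res_rank <- dat %>%
    mutate(x = dense_rank(x))

# cur_group_id() returns the index of the current group; it is stored in a
# new column (idx) because a grouping column cannot be overwritten in mutate()
res_group <- dat %>%
    group_by(x) %>%
    mutate(idx = cur_group_id()) %>%
    ungroup()

res_rank$x
# [1] 1 1 2 2 3 3 4 4
```

Both variants number the groups by sorted value of x, which matches the order of first appearance here only because x is already sorted.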

In base R, an option is match:

dat$x <- with(dat, match(x, unique(x)))
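To make the result concrete, here is a small sketch of what match() returns on the question's data. The copy dat2, the vector u, and the as.integer(factor(...)) comparison are additions for illustration only.

```r
dat <- data.frame(x = c(1, 1, 2, 2, 7, 7, 8, 8),
                  y = rep(c(-1, -2), 4),
                  z = c(0.5, 0.6, 0.6, 0.4, 0.3, 0.3, 0.5, 0.5))

# Work on a copy (hypothetical name) so the original x stays available
dat2 <- dat
dat2$x <- match(dat$x, unique(dat$x))
dat2$x
# [1] 1 1 2 2 3 3 4 4

# match() numbers values by first appearance; factor() numbers them by
# sorted value. The two differ when the input is not sorted:
u <- c(7, 1, 7, 2)
by_appearance <- match(u, unique(u))
by_sorted <- as.integer(factor(u))
by_appearance
# [1] 1 2 1 3
by_sorted
# [1] 3 1 3 2
```

For numeric dates that should stay in chronological order, the sorted variant is probably the safer choice if the rows might not already be ordered by x.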
