erc

Reputation: 10131

Add rows to dataframe based on existing rows

I have this dataframe:

df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1990, 1992))
  group year
1     A 1980
2     A 1986
3     B 1990
4     B 1992

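I'd like to replace each row with two rows holding the two years that precede its year, and add a column pre that records which original year those rows belong to.

This would be the outcome: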

   group  year     pre
1      A  1978 pre1980
2      A  1979 pre1980
3      A  1984 pre1986
4      A  1985 pre1986
5      B  1988 pre1990
6      B  1989 pre1990
7      B  1990 pre1992
8      B  1991 pre1992

Adding the new column would be easy:

df$pre <- paste("pre", df$year, sep="")

But I am stuck on how to add the new rows with the respective years (of course creating a whole new data frame would be just as good). Any hints?

Upvotes: 4

Views: 1632

Answers (5)

CuriousBeing

Reputation: 1632

Here is a simple solution with no packages:

Your Dataframe:

df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1990, 1992))

  group year
1     A 1980
2     A 1986
3     B 1990
4     B 1992

Subtract two years and add column pre:

df1<-cbind(group=as.character(df$group),year=df$year-2, pre=paste("pre",df$year,sep=""))

     group year   pre      
[1,] "A"   "1978" "pre1980"
[2,] "A"   "1984" "pre1986"
[3,] "B"   "1988" "pre1990"
[4,] "B"   "1990" "pre1992"

Next subtract 1 year and add column pre:

df2<-cbind(group=as.character(df$group),year=df$year-1,pre=paste("pre",df$year,sep=""))

    group year   pre      
[1,] "A"   "1979" "pre1980"
[2,] "A"   "1985" "pre1986"
[3,] "B"   "1989" "pre1990"
[4,] "B"   "1991" "pre1992"

Now rbind the two together:

ndf<-data.frame(rbind(df1,df2))

  group year     pre
1     A 1978 pre1980
2     A 1984 pre1986
3     B 1988 pre1990
4     B 1990 pre1992
5     A 1979 pre1980
6     A 1985 pre1986
7     B 1989 pre1990
8     B 1991 pre1992

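Sort it by year. Because cbind returned a character matrix, year is no longer numeric in ndf, so convert it before ordering. This is your output.

Lastdf <- ndf[order(as.numeric(as.character(ndf$year))),]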

  group year     pre
1     A 1978 pre1980
5     A 1979 pre1980
2     A 1984 pre1986
6     A 1985 pre1986
3     B 1988 pre1990
7     B 1989 pre1990
4     B 1990 pre1992
8     B 1991 pre1992
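
As a variation on the steps above, here is a minimal sketch that builds each shifted copy with data.frame() rather than cbind(), so year stays numeric throughout. It assumes the df from the question; the names minus2, minus1 and out are only illustrative.

```r
df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))

# Build each shifted copy as a data frame so year keeps its numeric type
minus2 <- data.frame(group = df$group, year = df$year - 2,
                     pre = paste0("pre", df$year))
minus1 <- data.frame(group = df$group, year = df$year - 1,
                     pre = paste0("pre", df$year))

# Stack the two copies, then sort by group, label and year
# so that each (year - 2, year - 1) pair stays together
out <- rbind(minus2, minus1)
out <- out[order(out$group, out$pre, out$year), ]
rownames(out) <- NULL
out
```

Sorting on pre works here because the labels all share the same prefix and four-digit years; other label formats may need a different sort key.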

Upvotes: 1

akrun

Reputation: 887851

Here is another option with Map

do.call(rbind,Map(function(x,y,z) 
   data.frame(group=x, year=y:z, pre=paste0('pre', z+1)), 
    df$group, df$year-2, df$year-1))
#  group year     pre
#1     A 1978 pre1980
#2     A 1979 pre1980
#3     A 1984 pre1986
#4     A 1985 pre1986
#5     B 1988 pre1990
#6     B 1989 pre1990
#7     B 1990 pre1992
#8     B 1991 pre1992

Or a modification with rep

`row.names<-`(transform(df[rep(1:nrow(df),each=2),],
      year = year-2:1, pre = paste0('pre', year) ), NULL)
#  group year     pre
#1     A 1978 pre1980
#2     A 1979 pre1980
#3     A 1984 pre1986
#4     A 1985 pre1986
#5     B 1988 pre1990
#6     B 1989 pre1990
#7     B 1990 pre1992
#8     B 1991 pre1992
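
The rep version leans on two base R behaviours: indexing with repeated row numbers duplicates rows, and the length-2 vector 2:1 is recycled along the longer year column. Note also that transform() evaluates pre against the original year column, not the newly computed one, which is why the labels keep the unshifted years. A small sketch of the intermediate values, assuming the df from the question (idx and yrs are illustrative names):

```r
df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))

# Each row index appears twice: 1 1 2 2 3 3 4 4
idx <- rep(seq_len(nrow(df)), each = 2)

# Duplicated years: 1980 1980 1986 1986 1990 1990 1992 1992
yrs <- df$year[idx]

# 2:1 is c(2, 1); it is recycled to 2 1 2 1 ..., so each pair
# of duplicated years becomes (year - 2, year - 1)
shifted <- yrs - 2:1
shifted
```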

Upvotes: 4

marc1s

Reputation: 779

If you don't mind the final order, you can do this without extra libraries:

gap = function(df, y) transform(df, year=year-y, pre = sprintf("pre%d", year))
rbind(gap(df,2), gap(df,1))
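
If the order does matter, one possible follow-up is to reorder the stacked result. This is only a sketch, assuming the df from the question; res is an illustrative name, and sorting by group and year is enough for this data but may interleave pairs when two original years in a group are less than two apart.

```r
df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))

# Same helper as above: shift year back by y and label with the original year
gap <- function(df, y) transform(df, year = year - y, pre = sprintf("pre%d", year))

res <- rbind(gap(df, 2), gap(df, 1))

# Restore the order shown in the question: by group, then by year
res <- res[order(res$group, res$year), ]
rownames(res) <- NULL
res
```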

Upvotes: 1

Pierre L

Reputation: 28461

base R ftw:

data.frame(group = rep(df$group, each=2),
           year = df[rep(1:nrow(df), each=2),]$year-2:1,
           pre = paste0("pre",rep(df$year,each=2)))
#   group year     pre
# 1     A 1978 pre1980
# 2     A 1979 pre1980
# 3     A 1984 pre1986
# 4     A 1985 pre1986
# 5     B 1988 pre1990
# 6     B 1989 pre1990
# 7     B 1990 pre1992
# 8     B 1991 pre1992

Upvotes: 6

jazzurro

Reputation: 23574

Here is one approach using the data.table package. With the given data, I used year as the grouping variable. For each year, I calculate the two previous years and create pre**** from that year. This leaves two year columns, so I drop one of them at the end.

library(data.table)

setDT(df)[, list(group = group,
                 year = c((year - 2), (year - 1)),
                 pre = paste0("pre", year, collapse = "")), by = "year"][, -1, with = FALSE][]

#   group year     pre
#1:     A 1978 pre1980
#2:     A 1979 pre1980
#3:     A 1984 pre1986
#4:     A 1985 pre1986
#5:     B 1988 pre1990
#6:     B 1989 pre1990
#7:     B 1990 pre1992
#8:     B 1991 pre1992

If the same year appears more than once, grouping by year no longer separates the rows, so group by row instead, as in the following. This new data frame has 1992 appearing twice.

df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1992, 1992))


setDT(df)[, list(group = group,
                 year = c((year - 2), (year - 1)),
                 pre = paste0("pre", year, collapse = "")), by = rownames(df)][, -1, with = FALSE]


#   group year     pre
#1:     A 1978 pre1980
#2:     A 1979 pre1980
#3:     A 1984 pre1986
#4:     A 1985 pre1986
#5:     B 1990 pre1992
#6:     B 1991 pre1992
#7:     B 1990 pre1992
#8:     B 1991 pre1992

Upvotes: 5
