
Reputation: 87

Creating a variable of initial values from a base variable in panel data in R

I'm trying to create a new variable in R that holds the initial value of another variable (Crime) for each group (country), where "initial" means the first year observed for that group in a panel data set. My current data looks like this:

country	year	Crime
Albania	2016	2.7369478
Albania	2017	2.0109779
Argentina	2002	9.474084
Argentina	2003	7.7898825
Argentina	2004	6.0739941

And I want it to look like this:

country	year	Crime	Initial_Crime
Albania	2016	2.7369478	2.7369478
Albania	2017	2.0109779	2.7369478
Argentina	2002	9.474084	9.474084
Argentina	2003	7.7898825	9.474084
Argentina	2004	6.0739941	9.474084

I saw that ddply (from the plyr package) could do this, but the problem is that it is no longer supported by the latest R updates.

Thank you in advance.

Upvotes: 1

Answers (3)

langtang

Reputation: 24832

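library(data.table)

# Takes the first row of each country, so this assumes rows are already sorted by year within country
setDT(data)[, Initial_Crime := Crime[1], by = country]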

     country year    Crime Initial_Crime
1:   Albania 2016 2.736948      2.736948
2:   Albania 2017 2.010978      2.736948
3: Argentina 2002 9.474084      9.474084
4: Argentina 2003 7.789883      9.474084
5: Argentina 2004 6.073994      9.474084

Upvotes: 1

Ben

Reputation: 30494

Maybe arrange by year, then after grouping by country set Initial_Crime to be the first Crime in the group.

library(tidyverse)

df %>%
  arrange(year) %>%
  group_by(country) %>%
  mutate(Initial_Crime = first(Crime))

Output

  country    year Crime Initial_Crime
  <chr>     <int> <dbl>         <dbl>
1 Argentina  2002  9.47          9.47
2 Argentina  2003  7.79          9.47
3 Argentina  2004  6.07          9.47
4 Albania    2016  2.74          2.74
5 Albania    2017  2.01          2.74
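
For comparison, the same sort-then-take-first idea can be sketched in base R with ave(), without loading any packages. This is only an illustration: the data frame below is rebuilt by hand from the sample in the question, and sorting by country and then year is an assumption about the row order wanted in the result.

```r
# Sample data copied from the question (rebuilt by hand for this sketch)
df <- data.frame(
  country = c("Albania", "Albania", "Argentina", "Argentina", "Argentina"),
  year    = c(2016L, 2017L, 2002L, 2003L, 2004L),
  Crime   = c(2.7369478, 2.0109779, 9.474084, 7.7898825, 6.0739941)
)

# Sort so that the first row of each country is its earliest observed year
df <- df[order(df$country, df$year), ]

# ave() applies the function within each country and returns a vector
# aligned with the original rows, so every row gets its group's first value
df$Initial_Crime <- ave(df$Crime, df$country, FUN = function(x) x[1])

print(df)
```

Unlike the pipeline above, this keeps the rows grouped by country rather than ordered by year across countries.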

Upvotes: 1

Sweepy Dodo

Reputation: 1873

A data.table solution

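library(data.table)
setDT(df)

# Flag the first row per country, copy its Crime value, then carry it forward.
# Assumes rows are sorted by country and year, so each country's rows are contiguous.
df[, x := 1:.N, country
   ][x==1, initial_crime := Crime
     ][, initial_crime := nafill(initial_crime, type = "locf")
       ][, x := NULL
         ]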

Upvotes: 0
