phill

Reputation: 95

First difference data frame

I have the following data frame:

>dados

COUNTRY   Year   CO2 emissions Pop. Growth(%)
Argentina  1994      1.23         0.3
Argentina  1995      1.26         0.2
Argentina  1996      1.28         0.4
Argentina  1997      1.24         0.2
Brazil     1994      1.54         0.7
Brazil     1995      1.59         0.6
Brazil     1996      1.60         0.9
Brazil     1997      1.58         1.3

I'd like to take the first difference of the variables CO2 emissions and Pop. Growth(%) within each country. I've already tried dados[,2:4] <- diff(dados[,2:4]), but it returned the error:

"Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : non-numeric argument to binary operator"

Upvotes: 2

Views: 3374

Answers (1)

acylam

Reputation: 18681

Here's a solution with dplyr:

library(dplyr)

df %>%
  group_by(COUNTRY) %>%
  mutate_at(vars(CO2_emissions:Pop_Growth), funs(.-lag(.)))

Edit: As of dplyr 0.8.0, funs() is soft-deprecated. For newer versions of dplyr, use the following instead:

df %>%
  group_by(COUNTRY) %>%
  mutate_at(vars(CO2_emissions:Pop_Growth), list(~ .x - lag(.x)))

Output:

# A tibble: 8 x 4
# Groups:   COUNTRY [2]
  COUNTRY    Year CO2_emissions Pop_Growth
  <fct>     <int>         <dbl>      <dbl>
1 Argentina  1994         NA        NA    
2 Argentina  1995          0.03     -0.100
3 Argentina  1996          0.02      0.2  
4 Argentina  1997         -0.04     -0.2  
5 Brazil     1994         NA        NA    
6 Brazil     1995          0.05     -0.100
7 Brazil     1996          0.01      0.3  
8 Brazil     1997         -0.02      0.4 
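
For dplyr 1.0.0 and later, where mutate_at() is superseded by across(), the same per-country differencing can be sketched as below. This is only an illustration, not part of the original answer: the name df_diff and the inline copy of the data are assumptions added so the snippet runs on its own, and the arrange() call assumes you want rows in year order within each country before differencing.

```r
library(dplyr)

# Inline copy of the example data so the snippet is self-contained
df <- data.frame(
  COUNTRY = rep(c("Argentina", "Brazil"), each = 4),
  Year = rep(1994:1997, times = 2),
  CO2_emissions = c(1.23, 1.26, 1.28, 1.24, 1.54, 1.59, 1.60, 1.58),
  Pop_Growth = c(0.3, 0.2, 0.4, 0.2, 0.7, 0.6, 0.9, 1.3)
)

# df_diff is a hypothetical name for the differenced result
df_diff <- df %>%
  arrange(COUNTRY, Year) %>%   # make sure rows are in time order per country
  group_by(COUNTRY) %>%        # difference within each country only
  mutate(across(CO2_emissions:Pop_Growth, ~ .x - lag(.x))) %>%
  ungroup()                    # drop the grouping once it is no longer needed
```

The first row of each country is NA because there is no earlier year to subtract.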

Data:

df = structure(list(COUNTRY = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L), .Label = c("Argentina", "Brazil"), class = "factor"), 
    Year = c(1994L, 1995L, 1996L, 1997L, 1994L, 1995L, 1996L, 
    1997L), CO2_emissions = c(1.23, 1.26, 1.28, 1.24, 1.54, 1.59, 
    1.6, 1.58), Pop_Growth = c(0.3, 0.2, 0.4, 0.2, 0.7, 0.6, 
    0.9, 1.3)), .Names = c("COUNTRY", "Year", "CO2_emissions", 
"Pop_Growth"), class = "data.frame", row.names = c(NA, -8L))
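
If you'd rather avoid the dplyr dependency, a base R version is possible with ave(). As far as I can tell, the error in the question comes from diff() having no data frame method: the default method treats the data frame as a plain list, and lists cannot be subtracted. Even on a numeric matrix, a plain diff() would also subtract Argentina's last row from Brazil's first. The sketch below is an assumption-laden illustration rather than the answer's original method; first_diff and df_base are hypothetical names, and the data are repeated inline so the snippet runs on its own.

```r
# Inline copy of the example data so the snippet is self-contained
df <- data.frame(
  COUNTRY = rep(c("Argentina", "Brazil"), each = 4),
  Year = rep(1994:1997, times = 2),
  CO2_emissions = c(1.23, 1.26, 1.28, 1.24, 1.54, 1.59, 1.60, 1.58),
  Pop_Growth = c(0.3, 0.2, 0.4, 0.2, 0.7, 0.6, 0.9, 1.3)
)

# Hypothetical helper: first difference padded with a leading NA,
# so the result has the same length as the input
first_diff <- function(x) c(NA, diff(x))

# Sort by country and year so differences follow time order
df <- df[order(df$COUNTRY, df$Year), ]

cols <- c("CO2_emissions", "Pop_Growth")
df_base <- df
# ave() applies first_diff separately within each level of COUNTRY
df_base[cols] <- lapply(df[cols], function(col) {
  ave(col, df$COUNTRY, FUN = first_diff)
})
```

The Year column is left untouched here; only the two measurement columns are differenced.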

Upvotes: 5
