misheekoh

Reputation: 460

Reshaping Dataframe in R (melt?)

So, I currently have a dataframe that looks like:

      country   continent year lifeExp   pop     gdpPercap
       <fctr>    <fctr> <int>   <dbl>    <int>     <dbl>
1 Afghanistan      Asia  1952  28.801  8425333  779.4453
2 Afghanistan      Asia  1957  30.332  9240934  820.8530
3 Afghanistan      Asia  1962  31.997 10267083  853.1007
4 Afghanistan      Asia  1967  34.020 11537966  836.1971
5 Afghanistan      Asia  1972  36.088 13079460  739.9811
6 Afghanistan      Asia  1977  38.438 14880372  786.1134

There are 140+ countries, and the years run from 1952 to 2007 in 5-year intervals. I want to reshape my dataframe so that I get:

     Country   gdpPercap(1952)     gdpPercap(1957)   ...   gdpPercap(2007)
      <fctr>      <dbl>
1  Afghanistan   974.5803           ....                      ...
2      Albania  5937.0295           ...                       ...
3      Algeria  6223.3675           ...                       ...
4       Angola  4797.2313
5    Argentina 12779.3796
6    Australia 34435.3674
7      Austria 36126.4927
8      Bahrain 29796.0483
9   Bangladesh  1391.2538
10     Belgium 33692.6051

My attempt is this:

gapminder %>% #my dataframe
  filter(year >= 1952) %>%
  group_by(country) %>%
  summarise(gdpPercap = mean(gdpPercap))

OUTPUT:

        country  gdpPercap <- but this takes the mean of gdpPercap from 1952-2007
        <fctr>      <dbl>
1  Afghanistan   802.6746
2      Albania  3255.3666
3      Algeria  4426.0260
4       Angola  3607.1005
5    Argentina  8955.5538
6    Australia 19980.5956
7      Austria 20411.9163
8      Bahrain 18077.6639
9   Bangladesh   817.5588
10     Belgium 19900.7581
# ... with 132 more rows

Any ideas? PS: I'm new to R. I'm also looking at melt(). Any help will be appreciated!

Upvotes: 0

Views: 266

Answers (3)

Kumar Manglam

Reputation: 2832

You should also include year in group_by, and after summarising, reshape the data the way you want using dcast or reshape.

Here is a sample solution:

library(dplyr)
library(reshape2)
# Simulated stand-in for the real data: one row per country and year
gapminder <- expand.grid(country = c("India", "US", "UK"),
                         year = seq(from = 1952L, to = 2007L, by = 5L))
gapminder$gdpPercap <- runif(nrow(gapminder))
gapminder %>% #my dataframe
  filter(year >= 1952) %>%
  group_by(country, year) %>%
  summarise(gdpPercap = mean(gdpPercap)) %>%
   dcast(country ~ year, value.var="gdpPercap")

I had to generate new data because your example is not reproducible. See "How to make a great R reproducible example?". A reproducible example makes the problem easier to understand and gets you quicker answers.
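
On the melt() part of the question: melt() goes in the opposite direction (wide to long), so it is not the function needed here; dcast() is its long-to-wide counterpart. A minimal sketch of the round trip, assuming reshape2 is installed and using made-up toy values (the names long, wide and long_again are only for illustration, not the real gapminder data):

```r
library(reshape2)

# Toy long-format data (made-up values, not the real gapminder numbers)
long <- data.frame(
  country   = rep(c("Here", "There"), each = 3),
  year      = rep(c(1952L, 1957L, 1962L), times = 2),
  gdpPercap = c(779, 780, 781, 782, 783, 784)
)

# dcast(): long -> wide, one column per year
wide <- dcast(long, country ~ year, value.var = "gdpPercap")

# melt(): wide -> long again, undoing the dcast() call above
long_again <- melt(wide, id.vars = "country",
                   variable.name = "year", value.name = "gdpPercap")
```

After melt(), the year column comes back as a factor built from the column names, so it would need converting if integer years are required.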

Upvotes: 2

Dave

Reputation: 2526

The built-in reshape() can do this.

foo.data.frame <- data.frame(
    Country=rep(c("Here", "There"), each=3),
    year=rep(c(1952, 1957, 1962),2),
    gdpPercap=779:784
    # ... other variables
)

reshape(foo.data.frame[, c("Country", "year", "gdpPercap")], 
    timevar="year", idvar="Country", direction="wide", sep=" ")

#   Country gdpPercap 1952 gdpPercap 1957 gdpPercap 1962
# 1    Here            779            780            781
# 4   There            782            783            784

Upvotes: 0

cuttlefish44

Reputation: 6806

tidyr::spread() would solve your problem

library(dplyr); library(tidyr)

gapminder %>% 
  select(country, year, gdpPercap) %>% 
  spread(year, gdpPercap)
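
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), which performs the same long-to-wide reshape. A minimal sketch, assuming a recent tidyr and using made-up toy values in place of the real gapminder data (the toy data frame and the "gdpPercap_" prefix are illustrative choices, not part of the original question):

```r
library(tidyr)

# Toy long-format data (made-up values, not the real gapminder numbers)
toy <- data.frame(
  country   = rep(c("Here", "There"), each = 3),
  year      = rep(c(1952L, 1957L, 1962L), times = 2),
  gdpPercap = c(779, 780, 781, 782, 783, 784)
)

# One row per country, one gdpPercap_<year> column per year
wide <- pivot_wider(toy,
                    names_from   = year,
                    values_from  = gdpPercap,
                    names_prefix = "gdpPercap_")
```

The names_prefix argument is optional; it is used here only so the column names resemble the gdpPercap(1952)-style headers asked for in the question.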

Upvotes: 2
