Reputation: 460
So, I currently have a dataframe that looks like:
country continent year lifeExp pop gdpPercap
<fctr> <fctr> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.801 8425333 779.4453
2 Afghanistan Asia 1957 30.332 9240934 820.8530
3 Afghanistan Asia 1962 31.997 10267083 853.1007
4 Afghanistan Asia 1967 34.020 11537966 836.1971
5 Afghanistan Asia 1972 36.088 13079460 739.9811
6 Afghanistan Asia 1977 38.438 14880372 786.1134
There are 140+ countries, and the years are in 5-year intervals from 1952 to 2007. I want to reshape my dataframe so that I get:
Country gdpPercap(1952) gdpPercap(1957) ... gdpPercap(2007)
<fctr> <dbl>
1 Afghanistan 974.5803 .... ...
2 Albania 5937.0295 ... ...
3 Algeria 6223.3675 ... ...
4 Angola 4797.2313
5 Argentina 12779.3796
6 Australia 34435.3674
7 Austria 36126.4927
8 Bahrain 29796.0483
9 Bangladesh 1391.2538
10 Belgium 33692.6051
My attempt is this:
gapminder %>% #my dataframe
filter(year >= 1952) %>%
group_by(country) %>%
summarise(gdpPercap = mean(gdpPercap))
OUTPUT:
country gdpPercap <- but this takes the mean of gdpPercap from 1952-2007
<fctr> <dbl>
1 Afghanistan 802.6746
2 Albania 3255.3666
3 Algeria 4426.0260
4 Angola 3607.1005
5 Argentina 8955.5538
6 Australia 19980.5956
7 Austria 20411.9163
8 Bahrain 18077.6639
9 Bangladesh 817.5588
10 Belgium 19900.7581
# ... with 132 more rows
Any ideas? PS: I'm new to R. I'm also looking at melt(). Any help will be appreciated!
Upvotes: 0
Views: 266
Reputation: 2832
You should also include year in group_by, and after summarising, reshape the data the way you want using dcast or reshape.
Here is a sample solution:
library(dplyr)
library(reshape2)
# Toy data: expand.grid crosses the two vectors, so every country gets every year
# (recycling the vectors through cbind leaves each country with only some of the years)
gapminder <- expand.grid(year = seq(from = 1952L, to = 2007L, by = 5L),
                         country = c("India", "US", "UK"),
                         stringsAsFactors = FALSE)
gapminder$gdpPercap <- runif(nrow(gapminder))  # random placeholder values
gapminder %>% #my dataframe
filter(year >= 1952) %>%
group_by(country, year) %>%
summarise(gdpPercap = mean(gdpPercap)) %>%
dcast(country ~ year, value.var="gdpPercap")
I had to generate new data because your example is not reproducible. Have a look at "How to make a great R reproducible example?". A reproducible example makes the problem easier to understand and gets you quicker answers.
Upvotes: 2
Reputation: 2526
The built-in reshape() can do this.
foo.data.frame <- data.frame(
Country=rep(c("Here", "There"), each=3),
year=rep(c(1952, 1957, 1962),2),
gdpPercap=779:784
# ... other variables
)
reshape(foo.data.frame[, c("Country", "year", "gdpPercap")],
timevar="year", idvar="Country", direction="wide", sep=" ")
# Country gdpPercap 1952 gdpPercap 1957 gdpPercap 1962
# 1 Here 779 780 781
# 4 There 782 783 784
Upvotes: 0
Reputation: 6806
tidyr::spread() would solve your problem:
library(dplyr); library(tidyr)
gapminder %>%
select(country, year, gdpPercap) %>%
spread(year, gdpPercap)
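In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), which can also prefix the new column names. Here is a minimal sketch on made-up data. The `toy` data frame and its values are assumptions for illustration only, not the real gapminder figures:

```r
library(tidyr)

# Hypothetical stand-in for the gapminder data (values are made up)
toy <- data.frame(
  country   = rep(c("Here", "There"), each = 3),
  year      = rep(c(1952L, 1957L, 1962L), times = 2),
  gdpPercap = c(779, 780, 781, 782, 783, 784)
)

# One row per country, one gdpPercap column per year
wide <- pivot_wider(
  toy,
  id_cols      = country,
  names_from   = year,
  values_from  = gdpPercap,
  names_prefix = "gdpPercap_"
)

print(wide)
```

With the real data, the same call should work after replacing `toy` with gapminder, since each country has one row per year.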
Upvotes: 2