Reputation:
I am very new to R and script writing in general. Please be patient if this is a very basic question. My search for a solution has not been successful.
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
I want to change the df to a matrix which has the airport.departure as rows and the airport.arrival as columns with the cumulated amount for a given month/years in the cells of the matrix.
Upvotes: 0
Views: 95
Reputation: 163
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [the as.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor]. total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work.Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
Upvotes: 0
Reputation: 51592
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure airport.arrival CDG QNY QXO CDG 0 3 0 QNY 0 0 1 QXO 0 5 0 SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
Upvotes: 3