Reputation: 3505
I have a dataframe df of soccer team information by game (MATCHID) with these initial values
TEAMID Venue LEAGUEPOS MATCHID
WHU A 5 1
COV H 12 1
EVE H 15 2
MNU A 2 2
ARS A 3 3
LEI H 4 3
I wish to create just one row for each game so that it would end up looking like
MATCHID HomeTeam AwayTeam HomePos AwayPos
1 COV WHU 12 5 etc.
so I want to create some new columns , delete others and remove duplicated rows.
I am having trouble with first stage trying
df$HomeTeam <- df$TEAMID[df$Venue == "H"]
as this produces
TEAMID Venue LEAGUEPOS MATCHID HomeTeam
WHU A 5 1 COV
COV H 12 1 EVE
EVE H 15 2 LEI
MNU A 2 2 STH
ARS A 3 3 TOT
LEI H 4 3 WIM
With the HomeTeam just showing the sequential TEAMID for each record with a Venue = H
Upvotes: 3
Views: 3036
Reputation: 55695
This can be easily achieved using the function reshape
which is a part of base R.
# READ DATA
mydf = read.table(textConnection("
TEAMID Venue LEAGUEPOS MATCHID
WHU A 5 1
COV H 12 1
EVE H 15 2
MNU A 2 2
ARS A 3 3
LEI H 4 3"),
sep = "", header = T, colClasses = rep('character', 4))
# RESHAPE DATA
reshape(mydf, idvar = 'MATCHID', timevar = 'Venue', direction = 'wide')
Here is the output produced
MATCHID TEAMID.A LEAGUEPOS.A TEAMID.H LEAGUEPOS.H
1 1 WHU 5 COV 12
3 2 MNU 2 EVE 15
5 3 ARS 3 LEI 4
NOTE: An alternate way to do this is to use cast
and melt
functions from reshape
package.
require(reshape)
mydf_m = melt(mydf, id = c('MATCHID', 'Venue'))
cast(mydf_m, MATCHID ~ Venue + variable)
Upvotes: 5
Reputation: 174813
reshape()
in base R does what you want, if a little clunkily. Here is your data:
con <- textConnection(" TEAMID Venue LEAGUEPOS MATCHID
WHU A 5 1
COV H 12 1
EVE H 15 2
MNU A 2 2
ARS A 3 3
LEI H 4 3
")
dat <- read.table(con, header = TRUE, stringsAsFactors = FALSE)
close(con)
We reshape()
this, get the columns in the requested order, and update the columns names:
newdat <- reshape(dat, direction = "wide", timevar = "Venue", idvar = "MATCHID")
## reorder
newdat <- newdat[, c(1,4,2,5,3)]
names(newdat) <- c("MatchID","HomeTeam","AwayTeam","HomePos","AwayPos")
This gives us:
> newdat
MatchID HomeTeam AwayTeam HomePos AwayPos
1 1 COV WHU 12 5
3 2 EVE MNU 15 2
5 3 LEI ARS 4 3
Upvotes: 1