Reputation: 1626
I have the Titanic dataset and want to find the probability of survival based on three conditions: whether the fare is above 200, passenger class, and sex. The following table gives the probabilities as percentages.
library(PASWR2)
library(magrittr)  # provides the %>% pipe used below
tab = with(TITANIC3, ftable(fare = fare > 200, pclass, sex, survived)) %>% prop.table(1) %>% round(3) * 100
tab
Is there an easy way to add the probabilities from the tab table to the TITANIC3 dataset as a new column?
Thanks!
Upvotes: 1
Views: 372
Reputation: 717
This can be achieved with the data.table package.
The object TITANIC3 is of class data.frame. First you need to convert it to class data.table. With data.table you can define new columns based on aggregations and a grouping clause directly in one line.
Just run the code below.
The new column survival_prob holds the conditional probability of survival, in percent.
I always recommend using data.table because it is one of the fastest ways to manipulate data in R. However, if you want to continue your analysis with a data.frame, just use the command setDF(titanic3) to convert the object back to class data.frame.
library(PASWR2)
library(magrittr)
library(data.table)
# convert dataset from data frame to data table
titanic3 <- copy(TITANIC3)
setDT(titanic3)
# define new column survival_prob (in percent) per group via the by argument
titanic3[, survival_prob := round(100 * mean(survived), 1),
         by = .(fare > 200, pclass, sex)]
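To see what the grouped assignment does without loading the full dataset, here is a minimal sketch on a made-up five-row table. The object names toy and toy_dt and the values in them are hypothetical and are not taken from TITANIC3. Only the pattern (a := assignment combined with by, then setDF to go back) mirrors the code above.

```r
library(data.table)

# Hypothetical toy data, not taken from TITANIC3
toy <- data.frame(
  sex      = c("female", "female", "male", "male", "male"),
  survived = c(1, 0, 0, 0, 1)
)

# Work on a data.table copy so the original data.frame stays untouched
toy_dt <- as.data.table(toy)

# := adds the column by reference; by = computes the mean within each group,
# and every row receives the value of its own group
toy_dt[, survival_prob := round(100 * mean(survived), 1), by = sex]

# Convert back to a plain data.frame in place
setDF(toy_dt)
print(toy_dt)
```

Both female rows should get 50 and the three male rows 33.3, since the group mean is repeated on every row of the group rather than collapsed to one row per group.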
Upvotes: 1