Reputation: 79
My original data has about 1000 observations and the following variables:
$Nationality : Factor "American" "Korean" ...
$Food : Factor "Milk" "Fruits" "Rice"
$No. of servings : num 5 6 3
I want to construct a table that shows, for $Nationality == "American", which $Food they eat and the corresponding $No. of servings.
Since my original data is huge, I first tried to subset it using:
American = subset(originaldata, Nationality == "American")
to create a data frame that contains only records with American nationality.
Then I applied the table() function to the subsetted data (i.e. American) using: table(American$Food, American$No. of servings)
The result, instead of containing only the $Nationality == "American" records, also contained the records of all other nationalities.
Why is this so? Is there a way to work around this problem? I want a table that contains only the records with Nationality == "American", showing $Food and $No. of servings in two columns.
Upvotes: 1
Views: 719
Reputation: 3082
For large-scale data, use data.table. If I understand your problem correctly, it should be achievable with the following:
library(data.table)
dt= as.data.table(your_data)
dt[,.SD,Nationality]
With the data that @sotos provided, it would look like this:
dt <- as.data.table(x)
> dt[,.SD,Nationality]
Nationality Food No.ofServings
1: American Fruits 3
2: American rise 5
3: American pasta 9
4: Korean meat 6
5: British Fruits 2
Filtering is just as easy:
> dt[Nationality=="American"]
Nationality Food No.ofServings
1: American Fruits 3
2: American rise 5
3: American pasta 9
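To get only the two columns the question asks for, the filter can be combined with a column selection in the same call. This is a minimal sketch, assuming the data.table package is installed; the sample data is retyped from the data shown elsewhere in this thread, and the name american_dt is made up for illustration.

```r
library(data.table)

# Sample data retyped from the thread (assumed column names)
dt <- data.table(
  Nationality   = factor(c("American", "American", "Korean", "British", "American")),
  Food          = factor(c("Fruits", "rise", "meat", "Fruits", "pasta")),
  No.ofServings = c(3, 5, 6, 2, 9)
)

# i filters the rows, j picks the columns to keep
american_dt <- dt[Nationality == "American", .(Food, No.ofServings)]

# Optional summary: total servings per food, for Americans only
servings_by_food <- dt[Nationality == "American",
                       .(TotalServings = sum(No.ofServings)),
                       by = Food]
```

Grouping with by only produces groups that actually occur in the filtered rows, so foods eaten only by other nationalities do not show up in servings_by_food.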
Upvotes: 0
Reputation: 149
Or with the dplyr package:
install.packages("dplyr")  # only needed once
library(dplyr)
AmericanData = filter(yourdata, Nationality == "American")
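Note that filter() keeps the unused factor levels, which is probably what makes table() show the other categories. A small sketch of one way to handle that, assuming dplyr is installed and the columns are named as in the sample data from this thread (yourdata here is retyped sample data, not the asker's real data):

```r
library(dplyr)

# Sample data retyped from the thread (assumed column names)
yourdata <- data.frame(
  Nationality   = factor(c("American", "American", "Korean", "British", "American")),
  Food          = factor(c("Fruits", "rise", "meat", "Fruits", "pasta")),
  No.ofServings = c(3, 5, 6, 2, 9)
)

AmericanData <- yourdata %>%
  filter(Nationality == "American") %>%  # keep only the American rows
  select(Food, No.ofServings) %>%        # keep only the two columns of interest
  droplevels()                           # discard factor levels that no longer occur
```

After droplevels(), table(AmericanData$Food) should list only the foods that appear in the American rows.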
Upvotes: 0
Reputation: 51582
You can split your data by nationality and then extract 'American':
list1 <- split(originaldata, originaldata$Nationality)
list1$American
# Nationality Food No.ofServings
#1 American Fruits 3
#2 American rise 5
#5 American pasta 9
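As for why table() still shows the other categories: my guess is that it comes from the factor levels. Subsetting a data frame removes the rows but keeps every level of each factor, and table() prints a row or column for every level, even those with a count of zero. A minimal base R sketch, using data that mirrors the sample data in this answer (the name food_table is made up for illustration):

```r
# Sample data mirroring the dput() output in this answer
originaldata <- data.frame(
  Nationality   = factor(c("American", "American", "Korean", "British", "American")),
  Food          = factor(c("Fruits", "rise", "meat", "Fruits", "pasta")),
  No.ofServings = c(3, 5, 6, 2, 9)
)

# Subsetting drops the rows but keeps all factor levels
American <- subset(originaldata, Nationality == "American")
levels_before <- levels(American$Nationality)  # still all three nationalities

# droplevels() removes unused levels from every factor column
American <- droplevels(American)
levels_after <- levels(American$Nationality)   # only the level still in use

# Two-column view of food and servings for Americans only
American[, c("Food", "No.ofServings")]

# The cross-tabulation now contains only levels that actually occur
food_table <- table(American$Food, American$No.ofServings)
```

If the real data has further factor columns, droplevels() on the whole data frame takes care of all of them at once.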
DATA
dput(originaldata)
structure(list(Nationality = structure(c(1L, 1L, 3L, 2L, 1L), .Label = c("American",
"British", "Korean"), class = "factor"), Food = structure(c(1L,
4L, 2L, 1L, 3L), .Label = c("Fruits", "meat", "pasta", "rise"
), class = "factor"), No.ofServings = c(3, 5, 6, 2, 9)), .Names = c("Nationality",
"Food", "No.ofServings"), row.names = c(NA, -5L), class = "data.frame")
Upvotes: 1