Reputation: 11
I want to work with a filtered subset of my dataset.
Example: healthstats.csv
age weight height gender
A 25 150 65 female
B 24 175 78 male
C 26 130 72 male
D 32 200 69 female
E 28 156 66 male
F 40 112 78 female
I would start with
patients = read.csv("healthstats.csv")
but how do I import only the rows where
patients$gender == "female"
when I run
patients = read.csv("healthstats.csv")
Upvotes: 0
Views: 2054
Reputation: 388817
If you want to import only a subset of rows without first loading the whole file into R, you can use read.csv.sql from the sqldf package,
which accepts an SQL query to filter the data.
library(sqldf)
read.csv.sql("healthstats.csv", sql = "select * from file where gender == 'female'")
We can also use read_csv_chunked
from readr, which reads the file in chunks and applies a callback to each chunk:
readr::read_csv_chunked('healthstats.csv',
callback = readr::DataFrameCallback$new(function(x, pos) subset(x, gender == "female")))
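To try the chunked approach without the original file, here is a minimal self-contained sketch. It assumes readr is installed, that the data is comma-separated, and that the first column is called "id" (a made-up name, since the question's header leaves that column unnamed). The chunk size of 2 is only there to force several chunks on such a small file.

```r
library(readr)

# Recreate the question's sample data in a temporary CSV file.
# Assumptions: comma-separated values, and a hypothetical "id" name
# for the first column, which is unnamed in the question.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,age,weight,height,gender",
  "A,25,150,65,female",
  "B,24,175,78,male",
  "C,26,130,72,male",
  "D,32,200,69,female",
  "E,28,156,66,male",
  "F,40,112,78,female"
), csv_path)

# Keep only the female rows of each chunk; DataFrameCallback row-binds
# the filtered chunks into a single data frame at the end.
keep_female <- function(x, pos) subset(x, gender == "female")

females <- read_csv_chunked(
  csv_path,
  callback = DataFrameCallback$new(keep_female),
  chunk_size = 2  # tiny chunk size just to exercise several chunks
)

print(females)
```

Only the rows that pass the filter are kept between chunks, so peak memory depends on the chunk size and the number of matching rows rather than on the size of the file.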
Upvotes: 3