denvercode

Reputation: 11

How do I filter a .csv file before reading it?

I want to work with a filtered subset of my dataset.

Example: healthstats.csv

    age   weight   height   gender
A    25      150       65   female
B    24      175       78   male
C    26      130       72   male
D    32      200       69   female
E    28      156       66   male
F    40      112       78   female

I would start with

patients = read.csv("healthstats.csv")

but how do I import only the rows where

patients$gender == "female" 

when I run

patients = read.csv("healthstats.csv")

Upvotes: 0

Views: 2054

Answers (1)

Ronak Shah

Reputation: 388817

If you want to import only a subset of rows without first loading the whole file into an R data frame, you can use sqldf, whose read.csv.sql function accepts an SQL query to filter the data.

library(sqldf)
read.csv.sql("healthstats.csv", sql = "select * from file where gender == 'female'")

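We can also use read_csv_chunked from readr, which reads the file in chunks and applies a callback to each one. Load readr first so that DataFrameCallback is in scope.

library(readr)
read_csv_chunked('healthstats.csv',
  callback = DataFrameCallback$new(function(x, pos) subset(x, gender == "female")))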
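
To make the chunked idea concrete, here is a rough base R sketch of the same filter-while-reading pattern. It is not how readr implements it, only an illustration. The function name filter_csv_chunked, the chunk_size value, the temporary file, and the "id" header for the first column are all assumptions made for the example.

```r
# Recreate the sample data in a temporary file (the "id" column name is assumed;
# the question shows the first column as unnamed row labels).
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,age,weight,height,gender",
  "A,25,150,65,female",
  "B,24,175,78,male",
  "C,26,130,72,male",
  "D,32,200,69,female",
  "E,28,156,66,male",
  "F,40,112,78,female"
), path)

# Hypothetical helper: read `chunk_size` data lines at a time, keep only the
# rows for which `keep(chunk)` is TRUE, and bind the surviving rows together.
filter_csv_chunked <- function(path, keep, chunk_size = 2L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1L)
  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    # Re-attach the header so every chunk parses with the same column names.
    chunk <- read.csv(text = c(header, lines), stringsAsFactors = FALSE)
    pieces[[length(pieces) + 1L]] <- chunk[keep(chunk), , drop = FALSE]
  }
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the per-chunk row names
  out
}

females <- filter_csv_chunked(path, function(d) d$gender == "female")
print(females)
```

Only one chunk is held in memory at a time, plus the rows that pass the filter, which is the point of the chunked approach when the full file is too large to load. In practice read_csv_chunked is likely the better choice, since it also handles quoting and keeps column types consistent across chunks.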

Upvotes: 3
