Reputation: 406
I'm trying to subset a data frame that I imported with read.csv using the colClasses='character' option.
A small sample of the data can be found here
Full99<-read.csv("File.csv",header=TRUE,colClasses='character')
After removing duplicates, missing values, and all unnecessary columns, I get a data frame of these dimensions:
>dim(NoMissNoDup99)
[1] 81551 6
I'm interested in reducing the data to include only observations with certain Service.Type values.
I've tried with the subset function:
MU99<-subset(NoMissNoDup99,Service.Type=='Apartment'|
Service.Type=='Duplex'|
Service.Type=='Triplex'|
Service.Type=='Fourplex',
select=Service.Type:X.13)
dim(MU99)
[1] 0 6
MU99<-NoMissNoDup99[which(NoMissNoDup99$Service.Type!='Hospital'
& NoMissNoDup99$Service.Type!= 'Hotel or Motel'
& NoMissNoDup99$Service.Type!= 'Industry'
& NoMissNoDup99$Service.Type!= 'Micellaneous'
& NoMissNoDup99$Service.Type!= 'Parks & Municipals'
& NoMissNoDup99$Service.Type!= 'Restaurant'
& NoMissNoDup99$Service.Type!= 'School or Church or Charity'
& NoMissNoDup99$Service.Type!='Single Residence'),]
but that doesn't remove observations.
I've tried that same method but slightly tweaked...
MU99<-NoMissNoDup99[which(NoMissNoDup99$Service.Type=='Apartment'
|NoMissNoDup99$Service.Type=='Duplex'
|NoMissNoDup99$Service.Type=='Triplex'
|NoMissNoDup99$Service.Type=='Fourplex'), ]
but that removes every observation...
The final subset should have somewhere around 8000 observations
I'm pretty new to R and Stack Overflow, so I apologize if there's some convention of posting I've neglected to follow, but if anyone has a magic bullet to get this data to cooperate, I'd love your insights :)
Upvotes: 1
Views: 12030
Reputation: 10913
## exclude
MU99 <- subset(NoMissNoDup99, !(Service.Type %in% c('Hospital', 'Hotel or Motel')))
## include
MU99 <- subset(NoMissNoDup99, Service.Type %in% c('Apartment', 'Duplex'))
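To see the two forms side by side, here is a minimal sketch on a made-up data frame. The name toy, the Usage column, and all the values are hypothetical stand-ins for the real data, which isn't available here.

```r
# Hypothetical toy data standing in for NoMissNoDup99 (values are invented)
toy <- data.frame(
  Service.Type = c("Apartment", "Hospital", "Duplex", "Hotel or Motel", "Triplex"),
  Usage = c("10", "20", "30", "40", "50"),
  stringsAsFactors = FALSE
)

# Include: keep rows whose Service.Type is one of the wanted values
keep <- c("Apartment", "Duplex", "Triplex", "Fourplex")
included <- subset(toy, Service.Type %in% keep)

# Exclude: drop rows whose Service.Type is one of the unwanted values
drop_types <- c("Hospital", "Hotel or Motel")
excluded <- subset(toy, !(Service.Type %in% drop_types))

print(included)
print(excluded)
```

On this toy data both calls keep the same three rows. Note that %in% still requires an exact string match, so it won't help if the stored values differ from what you typed (for example by stray whitespace).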
Upvotes: 1
Reputation: 4534
The different methods you tried should all work if the values you compare against exactly matched what is stored in the column. Your issue is likely extra spaces in the Service.Type values.
You can avoid this kind of issue by using grep, for example:
NoMissNoDup99[grep("Apartment|Duplex|Business",NoMissNoDup99$Service.Type),]
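If stray whitespace really is the cause (that's a guess, not something confirmed from the data), a sketch like the following may help you check and fix it. The toy data frame and its values are hypothetical; trimws is in base R from version 3.2.0.

```r
# Hypothetical toy data with leading/trailing spaces in some values
toy <- data.frame(
  Service.Type = c("Apartment ", " Duplex", "Hospital", "Apartment Complex"),
  stringsAsFactors = FALSE
)

# Exact comparison finds nothing because of the trailing space
n_before <- sum(toy$Service.Type == "Apartment")

# dput shows the stored strings with quotes, so stray spaces become visible
dput(unique(toy$Service.Type))

# Strip leading and trailing whitespace, then match exactly
toy$Service.Type <- trimws(toy$Service.Type)
exact <- toy[toy$Service.Type %in% c("Apartment", "Duplex"), , drop = FALSE]

# grep matches substrings, so it also picks up "Apartment Complex"
partial <- toy[grep("Apartment|Duplex", toy$Service.Type), , drop = FALSE]

print(exact)
print(partial)
```

Keep in mind that grep does partial matching, so it can return more rows than you intend if one category name contains another. Trimming and then using %in% keeps the match exact.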
Upvotes: 1