user6156267

Subset data frame by character not working... - R

I'm trying to subset this data frame to include only the rows that have "kw" in the UOM column, but the resulting data frame is empty: the filter finds no "kw" rows even though the column clearly contains them. Does anybody have any thoughts?

> source<-read.csv("C:\\Users\\mcan\\Desktop\\Analysis\\Reduced Data.csv",header=TRUE, sep=",",stringsAsFactors = FALSE) #open
> str(source)
'data.frame':   1048575 obs. of  12 variables:
 $ Location..         : int  12345 12345 12345 12345 12345 12345 12345 12345 ...
 $ Vendor.Name        : chr  "Alabama Power" "Alabama Power" "Alabama Power" "Alabama Power" ...
 $ Bill.Month         : chr  "8/01/2015" "5/01/2015" "12/01/2016" "11/01/2015" ...
 $ Bill.Date          : chr  "8/27/2015" "5/28/2015" "12/28/2016" "11/24/2015" ...
 $ Service.Begin.Date : chr  "7/29/2015" "4/29/2015" "11/29/2016" "10/27/2015" ...
 $ Service.End.Date   : chr  "8/26/2015" "5/28/2015" "12/28/2016" "11/24/2015" ...
 $ Service.Days       : int  29 30 30 29 33 33 33 32 29 32 ...
 $ Service.Description: chr  "Elec Off Peak" "Elec Adjustment" "Elec Adjustment" "Elec Off Peak" ...
 $ Service.Alias      : chr  "Energy - Off Peak" "Fuel Cost Adjustment - On Peak" "Contribution" "Off Peak Kwh Usage" ...
 $ Billed.Quantity    : chr  "60,000" "0.000" "0.000" "0.000" ...
 $ UOM                : chr  "kWh" "" "" "kWh" ...
 $ Cost               : chr  "$3,000.10 " "($20.090)" "$41.120 " "$0.000 " ...
> library("sqldf")
> library("data.table")
> data<-subset(source,select = -c(Vendor.Name,Bill.Month,Bill.Date,Service.Description,Cost))
> head(data)
  Location.. Service.Begin.Date Service.End.Date Service.Days                  Service.Alias Billed.Quantity UOM
1      12345          7/29/2015        8/26/2015           29              Energy - Off Peak      60,000.000 kWh
2      12345          4/29/2015        5/28/2015           30 Fuel Cost Adjustment - On Peak           0.000    
3      12345         11/29/2016       12/28/2016           30                   Contribution           0.000    
4      12345         10/27/2015       11/24/2015           29             Off Peak Kwh Usage           0.000 kWh
5      12345         12/27/2014        1/28/2015           33                       Kw Usage           0.000  Kw
6      12345         12/27/2014        1/28/2015           33          On-Peak Demand Charge         190.000  Kw
> newdata<-subset(data,UOM=="kw")
> newdata
[1] Location..         Service.Begin.Date Service.End.Date   Service.Days       Service.Alias      Billed.Quantity    UOM               
<0 rows> (or 0-length row.names)

Upvotes: 1

Views: 1778

Answers (1)

user6156267

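Thank you @mako212. The comparison is case-sensitive and the column stores "Kw", not "kw", so matching the exact case works: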

newdata<-subset(data,UOM=="Kw")
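
If the capitalisation in the column is not consistent, a case-insensitive comparison may be more robust than hard-coding "Kw". The following is only a sketch: the small data frame is a made-up stand-in for the question's `data`, and the `trimws()` call assumes there might be stray whitespace around the values, which the question does not confirm.

```r
# Toy data frame standing in for the question's `data` (values are illustrative only).
data <- data.frame(
  Service.Alias = c("Energy - Off Peak", "Contribution", "Kw Usage", "On-Peak Demand Charge"),
  UOM = c("kWh", "", "Kw", "Kw"),
  stringsAsFactors = FALSE
)

# Inspect the distinct values actually stored in the column before filtering.
unique(data$UOM)

# "==" on character vectors is case-sensitive, so "kw" matches no rows here.
exact <- subset(data, UOM == "kw")
nrow(exact)

# Normalise case (and any surrounding whitespace) before comparing.
newdata <- subset(data, tolower(trimws(UOM)) == "kw")
newdata
```

Checking `unique(data$UOM)` first shows which spellings are really present, which makes it easier to decide between an exact match and a normalised one.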

Upvotes: 1
