This is my first time posting a question on Stack Overflow. I have been struggling with the following tasks:
I get an error when using the pipe operator:
Error in erate_data %>% filter(Speed.Unit == "Gbps") %>% as.numeric(erate_data$Download.Speed) :
'list' object cannot be coerced to type 'double'
When I use an ifelse statement instead, I get a large list along with this warning:
Warning message:
In ifelse(erate_data$Speed.Unit == "Gbps", as.numeric(erate_data$Download.Speed) * :
NAs introduced by coercion
library(dplyr)
erate_data <- read.csv('E-Rate_Details.csv', stringsAsFactors = FALSE)
# convert gbps to mbps trial 1
gbps_mbps <- erate_data %>%
  filter(Speed.Unit == "Gbps") %>%
  as.numeric(erate_data$Download.Speed) * 1024
# convert gbps to mbps trial 2
gbps_mbps <- ifelse(erate_data$Speed.Unit == "Gbps", as.numeric(erate_data$Download.Speed) * 1024, erate_data)
# filter latest year with lowest FRN monthly cost
library_latest <-
  erate_data %>%
  filter(Funding.Year == max(Funding.Year) & Monthly.Cost == min(Monthly.Cost))
Any help or guidance would be much appreciated. A screenshot is attached for reference.
Input
dput(erate_data)
structure(list(Entity.Name = c("115TH STREET BRANCH LIBRARY", "115th Street Branch Library", "125th Street Branch Library", "320th Federal Way Library", "320th Federal Way Library", "53rd Street Library", "81ST AVENUE BRANCH LIBRARY", "81ST AVENUE BRANCH LIBRARY", "81ST AVENUE BRANCH LIBRARY"), Zip.Code = c(10026L, 10026L, 10035L, 98003L, 98003L, 10019L, 94621L, 94621L, 94621L), Funding.Year = c(2016L, 2019L, 2019L, 2019L, 2019L, 2019L, 2016L, 2017L, 2017L), Download.Speed = c(40, 200, 200, 100, 1, 1, 50, 1.544, 1.544), Speed.Unit = c("Mbps", "Mbps", "Mbps", "Mbps", "Gbps", "Gbps", "Mbps", "Mbps", "Mbps" ), Monthly.Cost = c("1,365", "1,207.50", "1,207.50", "876", "1,380", "2,126.25", "961.01", "26.12", "158.5")), class = "data.frame", row.names = c(NA, -9L))
Desired Output
dput(erate_data)
structure(list(Entity.Name = c("115th Street Branch Library", "125th Street Branch Library", "320th Federal Way Library", "53rd Street Library", "81ST AVENUE BRANCH LIBRARY"), Zip.Code = c(10026L, 10035L, 98003L, 10019L, 94621L), Funding.Year = c(2019L, 2019L, 2019L, 2019L, 2017L), Download.Speed = c(200, 200, 100, 1024, 1.544), Speed.Unit = c("Mbps", "Mbps", "Mbps", "Mbps", "Mbps"), Monthly.Cost = c("1,207.50", "1,207.50", "876", "2,126.25", "26.12")), row.names = c(NA, 5L), class = "data.frame", na.action = structure(6:20, names = c("6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20"), class = "omit"))
Thanks for editing your question to include an MRE. Two issues stand out to me and might be the cause of your problem.
First, the "Monthly.Cost" column uses commas as a thousands separator (e.g. "1,234.00"), so R reads the values as character strings. To sort or compare these values, you need R to interpret them as numbers. The readr package has a function called parse_number() that handles the conversion from "1,234.00" to 1234. readr is part of the tidyverse, so it is attached when you load the tidyverse package.
Second, the "Entity.Name" values use inconsistent case, i.e. all upper case vs. mixed case. One way to address this is to convert all of the names to upper case with the toupper() function, but this may or may not be suitable depending on your use case (up to you).
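As a quick illustration of those two functions on their own, here is a minimal sketch. The vector names (costs, costs_num, names_raw, names_key) are made up for this example; the values are taken from the question's data.

```r
library(readr)

# Costs as they appear in the data: character strings with thousands separators
costs <- c("1,365", "1,207.50", "26.12")

# parse_number() drops the grouping commas and returns a numeric vector
costs_num <- parse_number(costs)
is.numeric(costs_num)  # TRUE, so min(), sorting and comparisons work numerically

# Two spellings of the same library that differ only in case
names_raw <- c("115TH STREET BRANCH LIBRARY", "115th Street Branch Library")

# toupper() maps both spellings to one key that can be used for grouping
names_key <- toupper(names_raw)
length(unique(names_key))  # 1
```

Using the upper-case version only as a grouping key, rather than overwriting Entity.Name, keeps the original spelling available in the result.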
Here is a potential solution that I hope solves your issues:
library(tidyverse)
erate_data <- structure(list(Entity.Name = c("115TH STREET BRANCH LIBRARY", "115th Street Branch Library", "125th Street Branch Library", "320th Federal Way Library", "320th Federal Way Library", "53rd Street Library", "81ST AVENUE BRANCH LIBRARY", "81ST AVENUE BRANCH LIBRARY", "81ST AVENUE BRANCH LIBRARY"), Zip.Code = c(10026L, 10026L, 10035L, 98003L, 98003L, 10019L, 94621L, 94621L, 94621L), Funding.Year = c(2016L, 2019L, 2019L, 2019L, 2019L, 2019L, 2016L, 2017L, 2017L), Download.Speed = c(40, 200, 200, 100, 1, 1, 50, 1.544, 1.544), Speed.Unit = c("Mbps", "Mbps", "Mbps", "Mbps", "Gbps", "Gbps", "Mbps", "Mbps", "Mbps" ), Monthly.Cost = c("1,365", "1,207.50", "1,207.50", "876", "1,380", "2,126.25", "961.01", "26.12", "158.5")), class = "data.frame", row.names = c(NA, -9L))
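erate_data %>%
  # Strip the thousands separators so the costs become numeric
  mutate(Monthly.Cost = parse_number(Monthly.Cost)) %>%
  # Express every speed in Mbps (1 Gbps is treated as 1024 Mbps, as in the question)
  mutate(Download.Speed = ifelse(Speed.Unit == "Gbps",
                                 Download.Speed * 1024,
                                 Download.Speed)) %>%
  select(-Speed.Unit) %>%
  # Group case-insensitively so differently cased names count as one entity
  group_by(toupper(Entity.Name)) %>%
  # slice_max() on desc(cost) keeps the row(s) with the lowest cost per group
  slice_max(order_by = desc(Monthly.Cost))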
#> # A tibble: 5 × 6
#> # Groups:   toupper(Entity.Name) [5]
#>   Entity.Name Zip.Code Funding.Year Download.Speed Monthly.Cost `toupper(Entit…`
#>   <chr>          <int>        <int>          <dbl>        <dbl> <chr>
#> 1 115th Stre…    10026         2019         200          1208.  115TH STREET BR…
#> 2 125th Stre…    10035         2019         200          1208.  125TH STREET BR…
#> 3 320th Fede…    98003         2019         100           876   320TH FEDERAL W…
#> 4 53rd Stree…    10019         2019        1024          2126.  53RD STREET LIB…
#> 5 81ST AVENU…    94621         2017           1.54         26.1 81ST AVENUE BRA…
Created on 2022-07-12 by the reprex package (v2.0.1)
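On why the first attempt in the question failed: x %>% f(y) is evaluated as f(x, y), so the filtered data frame was passed to as.numeric() as its first argument, and a data frame (a list) cannot be coerced to double. Doing the conversion inside mutate() avoids this. The following is only a sketch of one possible variant, not a replacement for the code above: it keeps the Speed.Unit column, and it picks the latest funding year per entity before picking the lowest cost. The names erate_small and result are made up for the example, the rows are copied from the question's data with Monthly.Cost already parsed to numeric, and the entity names in this subset are consistently cased, so no toupper() key is needed.

```r
library(dplyr)

# A few rows from the question's data (costs already numeric)
erate_small <- data.frame(
  Entity.Name    = c("320th Federal Way Library", "320th Federal Way Library",
                     "53rd Street Library",
                     "81ST AVENUE BRANCH LIBRARY", "81ST AVENUE BRANCH LIBRARY",
                     "81ST AVENUE BRANCH LIBRARY"),
  Funding.Year   = c(2019L, 2019L, 2019L, 2016L, 2017L, 2017L),
  Download.Speed = c(100, 1, 1, 50, 1.544, 1.544),
  Speed.Unit     = c("Mbps", "Gbps", "Gbps", "Mbps", "Mbps", "Mbps"),
  Monthly.Cost   = c(876, 1380, 2126.25, 961.01, 26.12, 158.5)
)

result <- erate_small %>%
  mutate(
    # Inside mutate(), column names refer to vectors, so no erate_data$ prefix
    # is needed. The factor 1024 follows the question's own conversion.
    Download.Speed = if_else(Speed.Unit == "Gbps",
                             Download.Speed * 1024,
                             Download.Speed),
    # Every speed is now in Mbps, so relabel the unit instead of dropping it
    Speed.Unit = "Mbps"
  ) %>%
  group_by(Entity.Name) %>%
  # Keep each entity's most recent funding year first ...
  filter(Funding.Year == max(Funding.Year)) %>%
  # ... then the cheapest row within that year
  filter(Monthly.Cost == min(Monthly.Cost)) %>%
  ungroup()

result
```

The two filter() calls are kept separate on purpose: combining both conditions with & in a single ungrouped filter(), as in the question, compares every row against the overall latest year and the overall lowest cost, which a single row rarely satisfies at the same time.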