John Thomas

Reputation: 1105

Error in source("...") unexpected '=' despite all brackets lining up

I want to pull my hair out on this one...

I read in the data like this:

holidays <- read.csv("~/xxx/holiday_sample.csv") %>% 
  rename(DATE = "ï..DATE") %>% 
  mutate(DATE = as.Date(DATE,format = "%m/%d/%Y"))

##looks like this
structure(list(DATE = structure(c(17532, 17533, 17534, 17546, 
17547, 17548, 17549, 17575, 17576, 17577, 17620, 17621, 17622, 
17678, 17679, 17680, 17681, 17682, 17713, 17714, 17715, 17716, 
17717, 17774, 17775, 17776, 17777, 17778, 17812, 17847, 17855, 
17856, 17857, 17858, 17859, 17860, 17884, 17885, 17886, 17887, 
17888, 17889, 17890, 17891, 17892, 17893, 17894, 17895, 17896
), class = "Date"), REASON = c("New Years Day", "New Years Travel", 
"New Years Travel", "Lee-Jackson Day", "Lee-Jackson-King Travel Day", 
"Lee-Jackson-King Travel Day", "Martin Luther King, Jr. Day", 
"Presidents Day Travel", "Presidents Day Travel", "Presidents Day", 
"Easter Travel", "Easter Travel", "Easter", "Memorial Day Travel", 
"Memorial Day Travel", "Memorial Day Travel", "Memorial Day", 
"Memorial Day Travel", "Independence Day Travel", "Independence Day Travel", 
"Independence Day Travel", "Independence Day", "Independence Day Travel", 
"Labor Day Travel", "Labor Day Travel", "Labor Day Travel", "Labor Day", 
"Labor Day Travel", "Columbus Day", "Veterans Day", "Thanksgiving Travel", 
"Thanksgiving Travel", "Thanksgiving Day", "Thanksgiving Travel", 
"Thanksgiving Travel", "Thanksgiving Travel", "Christmas Travel", 
"Christmas Travel", "Christmas Travel", "Christmas Travel", "Christmas Travel", 
"Christmas Travel", "Christmas Day", "Christmas Travel", "Christmas Travel", 
"Christmas Travel", "Christmas Travel", "Christmas Travel", "New Years Travel"
)), class = "data.frame", row.names = c(NA, -49L))

And I want to loop through another data frame to see which rows happen on a holiday.

bottleneck2 <- structure(list(startTime = structure(c(1519903920, 1519905060, 
1519913640), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"), 
    endTime = structure(c(1519904880, 1519912200, 1519914540), class = c("POSIXct", 
    "POSIXt"), tzone = "America/New_York"), impact = c(92.17, 
    616.43, 63.69), impactPercent = c(184.15, 1495.17, 138.69
    ), impactSpeedDiff = c(3587.72, 25726.22, 2616.01), maxQueueLength = c(5.76053, 
    5.76053, 4.829511), tmcs = list(c("110N04623", "110-04623", 
    "110N04624", "110-04624", "110N04625", "110-04625", "110N04626", 
    "110-04626", "110N04627"), c("110N04623", "110-04623", "110N04624", 
    "110-04624", "110N04625", "110-04625", "110N04626", "110-04626", 
    "110N04627"), c("110N04623", "110-04623", "110N04624", "110-04624", 
    "110N04625", "110-04625", "110N04626", "110-04626")), early_startTime = structure(c(1519903620, 
    1519904760, 1519913340), class = c("POSIXct", "POSIXt"), tzone = "America/New_York")), row.names = c(NA, 
3L), class = "data.frame")


But when I run the following I get a syntax error which makes zero sense....

holiday_match <- lapply(1:nrow(bottleneck2), function(x) {
  
  bottleneck_row <- bottleneck2[x,]
  holidays[which(holidays$DATE = as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE = as.Date(bottleneck_row$endTime) == TRUE),]
  })


ERROR: Error: unexpected '}' in " }"

And then when I am saving the file in R I get another error.

Error in source("~/xxx/example.R") : 
  ~/xxx/example.R:226:32: unexpected '='
225:   bottleneck_row <- bottleneck2[x,]
226:   holidays[which(holidays$DATE =

Saw another post saying it could be a Unicode mismatch but retyped it twice and no shot. This is a copy and paste of another loop in the file which works perfectly....

Upvotes: 1

Views: 160

Answers (1)

r2evans

Reputation: 160447

I think what you are effectively trying to do is determine whether any of the bottleneck2 occurrences happens on a holiday. I think a better approach is a merge/join operation. Since you are looking at two fields, I think we need two joins, but I don't think this will be expensive, and we can clean up afterwards so it just doesn't matter.

For this example, none of your bottleneck2 occurrences happen on a holiday, so I'm going to "nudge" two of them to happen on different holidays ...

bottleneck2 %>%
  # just to "bump" a couple of the rows into a holiday occurrence,
  # purely for demonstration
  mutate_if(~ inherits(., "POSIXt"),
            ~ . + c(0, 29, 31) * 86400) %>%
  # add a "_date" column for each so that we can "join" on the
  # date-version of each timestamp
  mutate_at(vars(early_startTime, endTime),
            list(date = ~ trunc(as.Date(.)))) %>%
  left_join(holidays, by = c(early_startTime_date = "DATE")) %>%
  left_join(holidays, by = c(endTime_date = "DATE")) %>%
  mutate(REASON = coalesce(REASON.x, REASON.y)) %>%
  select(-REASON.x, -REASON.y, -ends_with("_date"))
#             startTime             endTime impact impactPercent impactSpeedDiff maxQueueLength                                                                                              tmcs     early_startTime        REASON
# 1 2018-03-01 06:32:00 2018-03-01 06:48:00  92.17        184.15         3587.72       5.760530 110N04623, 110-04623, 110N04624, 110-04624, 110N04625, 110-04625, 110N04626, 110-04626, 110N04627 2018-03-01 06:27:00          <NA>
# 2 2018-03-30 07:51:00 2018-03-30 09:50:00 616.43       1495.17        25726.22       5.760530 110N04623, 110-04623, 110N04624, 110-04624, 110N04625, 110-04625, 110N04626, 110-04626, 110N04627 2018-03-30 07:46:00 Easter Travel
# 3 2018-04-01 10:14:00 2018-04-01 10:29:00  63.69        138.69         2616.01       4.829511            110N04623, 110-04623, 110N04624, 110-04624, 110N04625, 110-04625, 110N04626, 110-04626 2018-04-01 10:09:00        Easter

Now you have a REASON field (far right) that is the holiday name or NA otherwise.

From here, if you need to know which bottleneck2 rows match a holiday, just use filter(!is.na(REASON)) and you have all matching bottlenecks.
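As a minimal sketch of that last filtering step: `joined` below is a small hypothetical stand-in for the joined result shown above (the column values are copied from that output, the name is mine), so treat it as an illustration rather than your real data.

```r
library(dplyr)

# Hypothetical stand-in for the joined result: one row per bottleneck,
# with REASON left as NA when no holiday matched.
joined <- data.frame(
  impact = c(92.17, 616.43, 63.69),
  REASON = c(NA, "Easter Travel", "Easter"),
  stringsAsFactors = FALSE
)

# Keep only the bottlenecks that fall on a holiday
on_holiday <- joined %>% filter(!is.na(REASON))
on_holiday
```

In base R, `joined[!is.na(joined$REASON), ]` should select the same rows.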


To answer your question as to why the syntax is incorrect, see this (after fixing = to ==):

holiday_match <- lapply(1:nrow(bottleneck2), function(x) {
  bottleneck_row <- bottleneck2[x,]
  holidays[which(holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE == as.Date(bottleneck_row$endTime) == TRUE),]
})

Let's drill inside:

holidays[which(holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE == as.Date(bottleneck_row$endTime) == TRUE),]

Specifically,

which(holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE == as.Date(bottleneck_row$endTime) == TRUE)

Let's remove the first half of the |:

which(holidays$DATE == as.Date(bottleneck_row$endTime) == TRUE)
# ...
holidays$DATE == as.Date(bottleneck_row$endTime) == TRUE

Unlike math operators (e.g., +) and assignment (<-), the == operator does not chain:

TRUE == TRUE == TRUE
# Error: unexpected '==' in "TRUE == TRUE =="
(TRUE == TRUE) == TRUE
# [1] TRUE
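If you want to check this kind of thing without running the whole file, one option is to ask the parser directly. This is only a sketch: `parses()` is a helper name I made up, and it relies on parse(text = ...) raising an error on invalid syntax. Nothing is evaluated, so `df` does not need to exist. The third line mirrors the original `=` version; my reading is that inside a call `=` is only accepted for naming an argument, which is why the parser complains at that exact column.

```r
# Hypothetical helper: TRUE if the text is syntactically valid R, FALSE otherwise.
# parse() only builds the expression; it never evaluates it.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

parses("TRUE == TRUE == TRUE")    # FALSE: chained comparison is a syntax error
parses("(TRUE == TRUE) == TRUE")  # TRUE: parentheses make the grouping explicit
parses("which(df$x = 1)")         # FALSE: `=` after a non-name inside a call
parses("which(df$x == 1)")        # TRUE: comparison, as intended
```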

So a literal fix would be

holiday_match <- lapply(1:nrow(bottleneck2), function(x) {
  bottleneck_row <- bottleneck2[x,]
  holidays[which((holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                    holidays$DATE == as.Date(bottleneck_row$endTime)) == TRUE),]
})

but since == TRUE is completely unnecessary, this can be reduced to

holiday_match <- lapply(1:nrow(bottleneck2), function(x) {
  bottleneck_row <- bottleneck2[x,]
  holidays[which(holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE == as.Date(bottleneck_row$endTime)),]
})
holiday_match
# [[1]]
# [1] DATE   REASON
# <0 rows> (or 0-length row.names)
# [[2]]
# [1] DATE   REASON
# <0 rows> (or 0-length row.names)
# [[3]]
# [1] DATE   REASON
# <0 rows> (or 0-length row.names)

There are no matches because your sample dataset has no overlaps. If you use my "nudged" data from above (saved here as bottleneck2mod), then

holiday_match <- lapply(1:nrow(bottleneck2mod), function(x) {
  bottleneck_row <- bottleneck2mod[x,]
  holidays[which(holidays$DATE == as.Date(bottleneck_row$early_startTime) | 
                   holidays$DATE == as.Date(bottleneck_row$endTime)),]
})

holiday_match
# [[1]]
# [1] DATE   REASON
# <0 rows> (or 0-length row.names)
# [[2]]
#          DATE        REASON
# 11 2018-03-30 Easter Travel
# [[3]]
#          DATE REASON
# 13 2018-04-01 Easter
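If all you need is a per-row flag or label, the lapply() can probably be skipped altogether. Here is a rough base-R sketch using %in% and match(); `holidays_demo` and `bottleneck_demo` are small hypothetical stand-ins for `holidays` and the nudged data, with values taken from the output above.

```r
# Hypothetical stand-ins for `holidays` and the nudged bottleneck data
holidays_demo <- data.frame(
  DATE   = as.Date(c("2018-03-30", "2018-04-01")),
  REASON = c("Easter Travel", "Easter"),
  stringsAsFactors = FALSE
)
bottleneck_demo <- data.frame(
  early_startTime = as.POSIXct(
    c("2018-03-01 06:27:00", "2018-03-30 07:46:00", "2018-04-01 10:09:00"),
    tz = "America/New_York"
  ),
  endTime = as.POSIXct(
    c("2018-03-01 06:48:00", "2018-03-30 09:50:00", "2018-04-01 10:29:00"),
    tz = "America/New_York"
  )
)

# Date version of each timestamp, as in the loop above
start_date <- as.Date(bottleneck_demo$early_startTime)
end_date   <- as.Date(bottleneck_demo$endTime)

# One logical per bottleneck row: does either date fall on a holiday?
on_holiday <- start_date %in% holidays_demo$DATE | end_date %in% holidays_demo$DATE
on_holiday
# [1] FALSE  TRUE  TRUE

# Look up the holiday name, preferring the start date and falling back to the end date
reason_start <- holidays_demo$REASON[match(start_date, holidays_demo$DATE)]
reason_end   <- holidays_demo$REASON[match(end_date, holidays_demo$DATE)]
bottleneck_demo$REASON <- ifelse(is.na(reason_start), reason_end, reason_start)
```

This gives the same kind of REASON column as the join version, just without dplyr.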

Upvotes: 1
