Gloria Dalla Costa

Reputation: 397

Error in function using doParallel and foreach

I am trying to reproduce some code I found for downloading trading data. It was supposed to work, but when I run this function I get an error that I can't resolve. Here it is:

Get_Candlesticks <- function(
  pairs, interval, startTime, endTime, path_save = NULL){
  # Download all trading pairs for the specified period
  start_timer <- Sys.time()
  data <- foreach(i = pairs, .packages = c("httr", "foreach"),
                  .export = c("Binance_Candlesticks_Historical", "Binance_Candlesticks_Timed")) %dopar% { 
                    response <- Binance_Candlesticks_Historical(symbol = i, interval = interval,
                                                                startTime = startTime, endTime = endTime)
                    if(nrow(response) == 0){
                       stop(paste0("Trading pair ", i, " has no data for this period!")) 
                      }
                  cat(paste0("Trading pair ", i, " downloaded."))
                  return(response) 
                  }
  names(data) <- pairs
  end_timer <- Sys.time()
  time_taken <- end_timer - start_timer
  cat(paste0("Downloaded ", length(pairs), " trading pairs in"), time_taken, "\n")
  # If no path for saving is provided, download data to R object
  if(is.null(path_save)){
    cat(paste0("All ", length(pairs), " pairs downloaded as R object."), "\n")
    return(data) }
  # If path for saving is provided, save data to path
  else {
    foreach(i = pairs) %dopar% {
      # Download data
      temp_data <- data[[i]]
      # Save the candlesticks for the current trading pair
      write.csv(
        x = temp_data,
        file = file.path(path_save, paste0(i, ".csv")), row.names = FALSE, quote = TRUE
      )
      cat(paste0("Trading pair ", i, " saved as .csv file.")) }
    cat(paste0("All ", length(pairs), " pairs saved to ", path_save), "\n")
    return(time_taken) 
   }
}

When I call the function:

    Get_Candlesticks(pairs="BTCUSDT",interval="1m",startTime="2021-01-01", endTime="2021-01-02",path_save=NULL)

I get this error:

Error in { : 
  task 1 failed - "non-numeric argument to binary operator"
Called from: e$fun(obj, substitute(ex), parent.frame(), e$data)

For completeness, these are the two functions called by the above function:

library (dplyr) # Data Manipulation 
library (magrittr)  # Pipe - Operators 
library (tidyr) # Data Tidying
library (reshape2 ) # Data Reshaping 
library (tibble)    # Data Frame Format 
library (readr) # Read Data
library (foreach)   # Iterative Computing
library (doParallel)    # Parallel Backend for foreach 
library (compiler)  # Byte Code Compiler
library (purrr) # Functional Programming
library (anytime)
library (lubridate)

# Binance candlesticks Timed
Binance_Candlesticks_Timed <- function( symbol, interval, startTime){
  # Set URL to API endpoint
  api_url <- "https://api.binance.com" 
  req_url <- "api/v1/klines"
  # Define the parameters for the API call
  params <- list(symbol = symbol, interval = interval,
                 startTime = startTime)
  response <- content(GET(api_url, path = req_url, query = params))
  # Restructure the response
  response_df <- as.data.frame(
    foreach(i = 1:length(response), .combine = rbind) %do% {
      foreach(j = 1:12, .combine = c) %do% {
        response[[i]][j] }
    }
    , stringsAsFactors = FALSE, row.names = FALSE)
  # Filter and name the response
  response_df <- response_df[,-12]
  cols_numeric <- c(1,2,3,4,5,6, 7 ,8,9,10,11)
  response_df[, cols_numeric] = apply(response_df[, cols_numeric], 2, function(x) as.numeric(x))
  colnames(response_df) <- c("Open_time", "Open",
                             # Make the call
                             "High",
                             "Low",
                             "Close",
                             "Volume",
                             "Close_time", "Quote_asset_volume", "Number_of_trades", "Taker_buy_base_asset_volume", "Taker_buy_quote_asset_volume")
  # Return the response
  return(response_df) 
  }

# Binance Candlesticks Historical
Binance_Candlesticks_Historical <- function(symbol, interval, startTime, endTime){
  # Check inputs
  if(endTime - startTime <= 0) stop("startTime must be before endTime!") 
  # Initial data setup
  data <- Binance_Candlesticks_Timed(symbol = symbol, interval = interval, startTime = startTime) 
  # Set start time for while loop
  next_startTime <- startTime + 60000000
  # Get all full api calls
  while(next_startTime <= endTime){
    next_data <- Binance_Candlesticks_Timed(symbol = symbol, interval = interval,
                                            startTime = next_startTime) 
    data <- rbind(data, next_data)
    next_startTime <- next_startTime + 60000000
  }
  # Cut data to start and end time
  data <- data[which(data$Open_time <= endTime),] 
  return(data)
}

Upvotes: 0

Views: 306

Answers (1)

MacOS

Reputation: 1159

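The error comes from doing arithmetic on character strings. The first place this happens is the input check in Binance_Candlesticks_Historical:

if(endTime - startTime <= 0) stop("startTime must be before endTime!")

The same applies to startTime + 60000000, which feeds the loop condition while(next_startTime <= endTime). Both fail because in your function call you are passing two strings.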

Get_Candlesticks(pairs="BTCUSDT",
                 interval="1m",
                 startTime="2021-01-01", # this is a string
                 endTime="2021-01-02",   # this is a string
                 path_save=NULL)
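
The message can be reproduced in isolation, without foreach or a parallel backend. This is a minimal sketch; the variable names are illustrative and not taken from the question's code.

```r
# Two character strings, as in the original function call
start_time <- "2021-01-01"
end_time <- "2021-01-02"

# Subtracting strings raises an error; capture its message instead of stopping
msg <- tryCatch(
  end_time - start_time,
  error = function(e) conditionMessage(e)
)
print(msg)
```

Inside %dopar% the same condition is wrapped by foreach, which is why it surfaces as "task 1 failed".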

If you change your function call to the following, this error goes away.

Get_Candlesticks(pairs = "BTCUSDT",
                 startTime = as.Date("2021-01-01"),  # Convert to date object.
                 endTime = as.Date("2021-01-02"),    # Convert to date object here too.
                 interval = "1m")

Unfortunately, this now produces a warning instead.

Warning message:
In FUN(newX[, i], ...) : NAs introduced by coercion

The output confirms that the values really are NA.

Trading pair BTCUSDT downloaded.Downloaded 1 trading pairs in 0.7909036 
All 1 pairs downloaded as R object. 
$BTCUSDT
  Open_time Open High Low Close Volume Close_time Quote_asset_volume Number_of_trades Taker_buy_base_asset_volume
1     -1100   NA   NA  NA    NA     NA         NA                 NA               NA                          NA
  Taker_buy_quote_asset_volume
1                           NA
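
The -1100 in Open_time looks like an error code rather than a timestamp, which would suggest the server returned an error object instead of candle data and the parsing code turned it into a row of NAs. Assuming an error body is a named list with code and msg fields (an assumption, not something shown in the question), a guard like the following could fail early. The helper name check_api_payload and the payload values are hypothetical.

```r
# Hypothetical helper: stop early if the parsed body looks like an API error.
# Assumes error bodies are named lists carrying "code" and "msg" fields.
check_api_payload <- function(parsed) {
  if (!is.null(names(parsed)) && all(c("code", "msg") %in% names(parsed))) {
    stop(sprintf("API error %s: %s", parsed$code, parsed$msg), call. = FALSE)
  }
  invisible(parsed)
}

# Simulated error payload (illustrative values, not a captured response)
bad <- list(code = -1100, msg = "illustrative error text")
result <- tryCatch(
  check_api_payload(bad),
  error = function(e) conditionMessage(e)
)
print(result)

# An unnamed list of rows, standing in for candle data, passes through unchanged
ok <- check_api_payload(list(list(1, 2), list(3, 4)))
```

Such a check would sit between the content(GET(...)) call and the restructuring step in Binance_Candlesticks_Timed.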

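The reason for this is that the REST API expects the start and end times to be timestamps, i.e. numbers. The Binance klines endpoint takes milliseconds since the Unix epoch, which is also what the 60000000 step in Binance_Candlesticks_Historical assumes. Note that as.numeric(as.Date(...)) would give days since the epoch, so convert via POSIXct and multiply by 1000, with a call along these lines:

Get_Candlesticks(pairs = "BTCUSDT",
                 startTime = as.numeric(as.POSIXct("2021-01-01", tz = "UTC")) * 1000,
                 endTime = as.numeric(as.POSIXct("2021-01-02", tz = "UTC")) * 1000,
                 interval = "1m")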
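
To make the unit question concrete, here is a small sketch contrasting a day count with a millisecond timestamp. The helper name to_epoch_ms is hypothetical, and the millisecond unit is an assumption about the endpoint that matches the 60000000 step used in the question's code.

```r
# Hypothetical helper: convert a date string to milliseconds since the Unix epoch.
# tz = "UTC" keeps the result independent of the local time zone.
to_epoch_ms <- function(x, tz = "UTC") {
  as.numeric(as.POSIXct(x, tz = tz)) * 1000
}

start_ms <- to_epoch_ms("2021-01-01")
end_ms <- to_epoch_ms("2021-01-02")

# For comparison: as.Date() converted to numeric counts days, not milliseconds
start_days <- as.numeric(as.Date("2021-01-01"))

# One day should span 86,400,000 ms
span_ms <- end_ms - start_ms
print(format(c(start_ms, end_ms, span_ms), scientific = FALSE))
print(start_days)
```

Because the values stay numeric, the subtraction in the input check and the addition in the while loop both work without further changes.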

Upvotes: 1
