Reputation: 17
I'm trying to write a function in R that performs the same operations on many different data sets, using the following code:
library(parallel)
cluster = makeCluster(2)
setwd("D:\\Speciale")
data_func <- function(kommune) {
rm(list=ls())
library(dplyr)
library(data.table)
library (tidyr)
#Load address and turbine datasets
distances <- fread(file="Adresser og distancer\\kommune.csv", header=TRUE, sep=",", colClasses = c("longitude" = "character", "latitude" = "character", "min_distance" = "character", "distance_turbine" = "character", "id_turbine" = "character"), encoding="Latin-1")
turbines <- fread(file="turbines_DK.csv", header=TRUE, sep=",", colClasses = c("lon" = "character", "lat" = "character", "id_turbine" = "character", "total_height" = "character", "location" = "character"), encoding="Latin-1")
#Some cleaning of the data and construction of new variables
#write out the dataset
setwd("D:\\Speciale\\Analysedata")
fwrite(mock_final, file = "final_kommune.csv", row.names = FALSE)
}
do.call(rbind, parLapply(cl = cluster, c("Albertslund", "Alleroed"), data_func))
When I do this, I get the following error message:
Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: File 'Adresser og distancer\kommune.csv' does not exist or is non-readable. getwd()=='C:/Users/KSAlb/OneDrive/Dokumenter'
I need the function to substitute its argument into the file names. For the first data set it should insert Albertslund in place of kommune in the input file name, perform the operations, write out a CSV file (so "final_kommune.csv" becomes "final_Albertslund.csv"), clear the environment, and then move on to the next data set, Alleroed.
Albertslund and Alleroed are just examples; there are 98 data sets in total that I need to process.
Upvotes: 0
Views: 417
Reputation: 76402
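Something like the code below may help. It builds the input and output file names from the kommune argument instead of hard-coding "kommune" inside the strings. It is untested, since no data were provided.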
library(parallel)
library(dplyr)
library(data.table)
library(tidyr)
# inpath is absolute: the error message shows that the workers do not
# inherit the working directory set with setwd() in the main session
data_func <- function(kommune, inpath = "D:/Speciale/Adresser og distancer",
                      turbines, outpath = "D:/Speciale/Analysedata") {
  # Build the input file name from the kommune argument
  filename <- paste0(kommune, ".csv")
  filename <- file.path(inpath, filename)
  #Load address and turbine datasets
  distances <- fread(
    file = filename,
    header = TRUE,
    sep = ",",
    colClasses = c("longitude" = "character", "latitude" = "character", "min_distance" = "character", "distance_turbine" = "character", "id_turbine" = "character"),
    encoding = "Latin-1"
  )
  #Some cleaning of the data and construction of new variables
  #(this step is expected to create mock_final)
  #write out the dataset
  outfile <- paste0("final_", kommune, ".csv")
  outfile <- file.path(outpath, outfile)
  fwrite(mock_final, file = outfile, row.names = FALSE)
}
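The file name handling in the function above can be checked on its own, without any data. This is only a minimal sketch: `build_paths` is a hypothetical helper introduced here for illustration, and the directory names are taken from the question. It shows that a string literal containing "kommune" is never substituted, while `paste0()` inserts the value of the argument.

```r
# Hypothetical helper, for illustration only: builds the input and output
# paths for one kommune in the same way as data_func does.
build_paths <- function(kommune,
                        inpath = "D:/Speciale/Adresser og distancer",
                        outpath = "D:/Speciale/Analysedata") {
  list(
    infile  = file.path(inpath, paste0(kommune, ".csv")),
    outfile = file.path(outpath, paste0("final_", kommune, ".csv"))
  )
}

# A string literal is used as-is; R does not replace "kommune" inside it
literal <- "Adresser og distancer\\kommune.csv"

p <- build_paths("Albertslund")
print(literal)
print(p$infile)
print(p$outfile)
```

No files are read or written here, so the paths do not need to exist for the sketch to run.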
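cluster <- makeCluster(2)
# The workers are fresh R sessions, so load the packages on each of them
invisible(clusterEvalQ(cluster, {
  library(dplyr)
  library(data.table)
  library(tidyr)
}))
# This only changes the working directory of the main session
setwd("D:\\Speciale")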
# Read turbines file just once
turbines <- fread(
  file = "turbines_DK.csv",
  header = TRUE,
  sep = ",",
  colClasses = c("lon" = "character", "lat" = "character", "id_turbine" = "character", "total_height" = "character", "location" = "character"),
  encoding = "Latin-1"
)
kommune_vec <- c("Albertslund", "Alleroed")
do.call(rbind, parLapply(cl = cluster, kommune_vec, data_func, turbines = turbines))
# Shut the workers down when done
stopCluster(cluster)
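Since the real data are not available, the overall pattern can be tried on stand-in files. The following is a sketch under stated assumptions, not the code for the actual task: the tiny data frames, the temporary directories, and the function name `process_one` are all made up for illustration, and base `read.csv()`/`write.csv()` are swapped in for `fread()`/`fwrite()` so that nothing beyond the parallel package is needed. It passes absolute paths to the workers and returns the path of each file written, so the result of `parLapply()` shows what was produced.

```r
library(parallel)

# Hypothetical stand-in data: two tiny input files in temporary directories
inpath  <- file.path(tempdir(), "in")
outpath <- file.path(tempdir(), "out")
dir.create(inpath, showWarnings = FALSE)
dir.create(outpath, showWarnings = FALSE)

kommune_vec <- c("Albertslund", "Alleroed")
for (k in kommune_vec) {
  write.csv(data.frame(id = 1:3, min_distance = c(10, 20, 30)),
            file.path(inpath, paste0(k, ".csv")), row.names = FALSE)
}

# Reads one input file, applies a placeholder "cleaning" step, writes
# final_<kommune>.csv and returns the path of the file it wrote.
process_one <- function(kommune, inpath, outpath) {
  dat <- read.csv(file.path(inpath, paste0(kommune, ".csv")))
  dat$kommune <- kommune  # placeholder for the real cleaning step
  outfile <- file.path(outpath, paste0("final_", kommune, ".csv"))
  write.csv(dat, outfile, row.names = FALSE)
  outfile
}

cl <- makeCluster(2)
# Absolute paths are passed explicitly, so the workers' own working
# directories do not matter
written <- unlist(parLapply(cl, kommune_vec, process_one,
                            inpath = inpath, outpath = outpath))
stopCluster(cl)

print(basename(written))
```

Each call works only with its own local variables, so nothing carries over from one data set to the next and no `rm(list = ls())` is needed inside the function.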
Upvotes: 1