"For" loop with column names as index

Question

I would like to create a loop in which the index is given by the column names of a dataframe. The idea is to select one column at a time and create a map based on the data in that column. I need i to be the column name, because it identifies the variable and I will use it as part of the title of the map. However, I cannot seem to associate my index i with the name of the column. My code is as follows:

# random data
x <- rep(c("AT130", "DEA1A", "DEA2C", "SE125", "SE232"), 4)
y <- c(1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0 ,1, 0, 1, 0, 1)
z <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ,0, 0, 0, 0, 0)
w <- c(0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1 ,0, 1, 0, 1, 0)
d <- as.data.frame(cbind(x,y,z,w))
colnames(d) <- c("id", "typeA", "typeB", "typeC")


for (i in colnames(d[,2:ncol(d)])) {
var_to_map <- d[,c(1,i)]

## do stuff

}

I get the following error at the first line:

Error: Can't subset columns that don't exist.
x Columns `1`, `2`, and `3` don't exist.
Run `rlang::last_error()` to see where the error occurred.

However, if I just run colnames(d[,2:ncol(d)]) on its own, it works properly:

colnames(d[,2:ncol(d)])
 [1] "typeA"             "typeB"       "typeC"

I could work around this by using column numbers, but I would like to keep the column names, since I am printing 10+ maps within the loop and I use i to build the title of each map, as follows:

# I use geodata files from the `eurostat` package.
geodata <- get_eurostat_geospatial(resolution = "60", nuts_level = "3", year = 2013)
 

for (i in colnames(d[,2:ncol(d)])) {

var_to_map <- d[,c(1,i)]

colnames(var_to_map)[1] <- "geo"
# Joining, by = "geo" 
map_data <- merge(var_to_map, geodata, by=c("geo"), all.y=T, all.x=T) 
## creating ranges

 map_data$cat <- with(map_data, cut(value, 
                                   breaks= qu <- unique(quantile(value, 
                                                                 probs=c(0, 0.2, 0.5, 0.8, 
                                                                         0.9, 0.95, 0.99, 1),
                                                                 na.rm=TRUE,  include.lowest=T )),
                                                        labels=qu[-1]),include.lowest=TRUE )

 # Map 
 print(ggplot(data=map_data) + geom_sf(aes(fill=cat),  size=.1) + 
        scale_fill_brewer(palette = "Darkred", na.value= "grey") + aes(geometry = geometry) +
        guides(fill = guide_legend(reverse=T, title = "Percentiles")) +
        labs(title = paste("The name of this graph is the column name", i) ## here is where I use the index
             )+ 
        theme_minimal() + theme(legend.position=c(.8,.6)) +
        coord_sf(xlim=c(-12,44), ylim=c(35,70)) +
        theme( axis.text.x=element_blank(), axis.text.y=element_blank())) 
 }

I could also use column numbers for i and create another object with column names to refer to when pasting the title of the map, but I am wondering why the above approach fails and what I could do to make it work in that setting.

Ronak Shah · Accepted Answer

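In base R, a single subsetting call can select columns either by position or by name, but not by both at once. In c(1, i) the number 1 is coerced to the string "1", so R looks for a column called "1" rather than the first column. With dplyr::select you can select columns by name and by position in the same call.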
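A minimal sketch of what c(1, i) actually produces, using a small made-up data frame in place of the real data (the object names idx and has_col_1 are only for illustration):

```r
# Stand-in data frame with the same column names as in the question
d <- data.frame(id = c("AT130", "DEA1A"), typeA = c(1, 0), typeB = c(0, 1))

i <- "typeA"

# An atomic vector holds one type, so the number 1 becomes the string "1"
idx <- c(1, i)
print(idx)
print(class(idx))

# Subsetting with idx therefore asks for a column literally named "1"
has_col_1 <- "1" %in% colnames(d)
print(has_col_1)

# Using names for both columns avoids the problem
var_to_map <- d[, c("id", i)]
print(colnames(var_to_map))
```

Because both entries of the index are names here, no coercion gets in the way and the subset contains the id column plus the column named by i.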

So here are your options -

cols <- colnames(d)

for (i in cols[-1]) {
  #Select columns by position
  var_to_map <- d[,c(1,match(i, cols))]
  #OR select column by name
  var_to_map <- d[,c(cols[1],i)]
  #OR select column by position and name
  #(all_of() tells select that i holds a column name, not a column called "i")
  var_to_map <- dplyr::select(d, 1, dplyr::all_of(i))
  #...rest of the code 
}
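As a rough sketch of how the name-based option fits the original goal, the loop below keeps i as the column name and reuses it for the map title. The data frame is a shortened stand-in for the question's data, the titles vector is a hypothetical placeholder for the plotting step, and renaming the second column to "value" is an assumption based on the question's code referring to a value column:

```r
# Shortened stand-in for the data frame in the question
d <- data.frame(
  id    = c("AT130", "DEA1A", "DEA2C"),
  typeA = c(1, 1, 0),
  typeB = c(0, 0, 1),
  typeC = c(0, 0, 0)
)

cols <- colnames(d)
titles <- character(0)  # hypothetical: collects titles instead of drawing maps

for (i in cols[-1]) {
  # Select the id column and the current column, both by name
  var_to_map <- d[, c(cols[1], i)]

  # Assumed renaming so later steps can refer to fixed column names
  colnames(var_to_map) <- c("geo", "value")

  # i is still the column name, so it can go straight into the title
  titles[i] <- paste("The name of this graph is the column name", i)
}

print(titles)
```

In the real loop, the line that fills titles would be replaced by the merge with geodata and the ggplot call, with the same paste(...) expression passed to labs(title = ...).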
