Reputation: 516
I would like to write a loop whose index runs over the column names of a data frame. The idea is to select one column at a time and create a map based on the data in that column. I need `i` to be the column name, because it identifies the variable and I use it as part of the map title. However, I do not seem to be able to associate my index `i` with the name of the column. My code is as follows:
# random data
x <- rep(c("AT130", "DEA1A", "DEA2C", "SE125", "SE232"), 4)
y <- c(1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0 ,1, 0, 1, 0, 1)
z <- c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ,0, 0, 0, 0, 0)
w <- c(0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1 ,0, 1, 0, 1, 0)
d <- as.data.frame(cbind(x,y,z,w))
colnames(d) <- c("id", "typeA", "typeB", "typeC")
for (i in colnames(d[,2:ncol(d)])) {
  var_to_map <- d[,c(1,i)]
  ## do stuff
}
I get the following error at the first line:
Error: Can't subset columns that don't exist.
x Columns `1`, `2`, and `3` don't exist.
Run `rlang::last_error()` to see where the error occurred.
However, if I just run `colnames(d[,2:ncol(d)])`, it works properly:
colnames(d[,2:ncol(d)])
[1] "typeA" "typeB" "typeC"
I could work around this by using column numbers, but I would like to keep the column names: I am printing 10+ maps inside the loop and I use `i` to set the title of each map, as follows:
# I use geodata files from the library `Eurostat`.
geodata <- get_eurostat_geospatial(resolution = "60", nuts_level = "3", year = 2013)
for (i in colnames(d[,2:ncol(d)])) {
  var_to_map <- d[,c(1,i)]
  colnames(var_to_map)[1] <- "geo"
  # Joining, by = "geo"
  map_data <- merge(var_to_map, geodata, by=c("geo"), all.y=T, all.x=T)
  ## creating ranges
  map_data$cat <- with(map_data, cut(value,
                                     breaks= qu <- unique(quantile(value,
                                                                   probs=c(0, 0.2, 0.5, 0.8,
                                                                           0.9, 0.95, 0.99, 1),
                                                                   na.rm=TRUE, include.lowest=T )),
                                     labels=qu[-1]),include.lowest=TRUE )
  # Map
  print(ggplot(data=map_data) + geom_sf(aes(fill=cat), size=.1) +
          scale_fill_brewer(palette = "Darkred", na.value= "grey") + aes(geometry = geometry) +
          guides(fill = guide_legend(reverse=T, title = "Percentiles")) +
          labs(title = paste("The name of this graph is the column name", i) ## here is where I use the index
          )+
          theme_minimal() + theme(legend.position=c(.8,.6)) +
          coord_sf(xlim=c(-12,44), ylim=c(35,70)) +
          theme( axis.text.x=element_blank(), axis.text.y=element_blank()))
}
I could also use column numbers for `i` and create another object with the column names to refer to when pasting the title of the map, but I am wondering why the above approach fails and what I could do to make it work in that setting.
Upvotes: 1
Views: 3956
Reputation: 26484
There's a lot going on in this question, but perhaps this minimal example will help:
library(tidyverse)
# random data
d <- data.frame(x = rep(c("AT130", "DEA1A", "DEA2C", "SE125", "SE232"), 4),
y = sample(1:10, 20, replace = TRUE),
z = sample(1:10, 20, replace = TRUE),
w = sample(1:10, 20, replace = TRUE))
colnames(d) <- c("id", "typeA", "typeB", "typeC")
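for (i in colnames(d[,2:ncol(d)])) {
  # i is a string; sym() turns it into a symbol that aes() can unquote with !!
  type <- sym(i)
  p <- ggplot(d, aes(y = !!type, x = id, fill = id)) +
    geom_boxplot() +
    ggtitle(i)  # the column name doubles as the plot title
  print(p)
}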
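As an alternative sketch, ggplot2's `.data` pronoun accepts the column name as a string directly, so no symbol conversion is needed. The helper `plot_one()` and the list `plots` are hypothetical names used only for illustration, and the example assumes ggplot2 is installed.

```r
library(ggplot2)

set.seed(1)  # make the random example data reproducible
d <- data.frame(
  id    = rep(c("AT130", "DEA1A", "DEA2C", "SE125", "SE232"), 4),
  typeA = sample(1:10, 20, replace = TRUE),
  typeB = sample(1:10, 20, replace = TRUE),
  typeC = sample(1:10, 20, replace = TRUE)
)

# Hypothetical helper: build one boxplot for the column named `col`
plot_one <- function(col) {
  ggplot(d, aes(x = id, y = .data[[col]], fill = id)) +
    geom_boxplot() +
    ggtitle(col)  # the string is used as-is for the title
}

# sapply(..., simplify = FALSE) returns a list named after the columns
plots <- sapply(colnames(d)[-1], plot_one, simplify = FALSE)

# Print each stored plot; names(plots) are the column names
for (i in names(plots)) print(plots[[i]])
```

Storing the plots in a named list keeps the column name attached to each plot, which may be convenient if the maps need to be saved or arranged later rather than printed immediately.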
Upvotes: 1
Reputation: 388807
In base R, a single index vector selects columns either by position or by name, not both: `c(1, i)` coerces the number to the string "1", and there is no column with that name. With `dplyr::select` you can mix names and positions in the same call.
So here are your options:
cols <- colnames(d)
for (i in cols[-1]) {
  # Select columns by position
  var_to_map <- d[, c(1, match(i, cols))]
  # OR select columns by name
  var_to_map <- d[, c(cols[1], i)]
  # OR select by position and name (all_of() marks i as a character vector of names)
  var_to_map <- dplyr::select(d, 1, dplyr::all_of(i))
  #...rest of the code
}
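To see why the mixed index fails, here is a minimal base R sketch on a small made-up data frame (the values and the object names `idx`, `by_name` and `by_position` are illustrative only). It shows that `c()` turns the position into a string, and that the name-based and position-based selections return the same result.

```r
# Small illustrative data frame (not the asker's data)
d <- data.frame(
  id    = c("AT130", "DEA1A", "DEA2C"),
  typeA = c(1, 0, 1),
  typeB = c(0, 1, 0)
)
i <- "typeA"

# c() builds an atomic vector, so the number 1 is coerced to the string "1"
idx <- c(1, i)
print(class(idx))
print(idx)

# No column is literally named "1", so indexing with idx cannot succeed
print("1" %in% colnames(d))

# Consistent indexing: all names, or all positions
by_name     <- d[, c(colnames(d)[1], i)]
by_position <- d[, c(1, match(i, colnames(d)))]
print(identical(by_name, by_position))
```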
Upvotes: 1