Reputation: 63
This one has been bugging me for a couple of days now, and I haven't had any luck on Stack Exchange yet. Essentially, I have two tables: one table defines which columns (by column number) to select from the second table. My initial plan was to string the column numbers together and pass that string into a sub-select statement; however, when I define the string with as.character, R is not happy, i.e.:
# Data Sets, Variable_Selection: Table of Columns to Select from Variable_Table
VARIABLE_SELECTION <- data.frame(Set.1 = c(3,1,1,1,1), Set.2 = c(0,3,2,2,2), Set.3 = c(0,0,3,4,3),
Set.4 = c(0,0,0,5,4), Set.5 = c(0,0,0,0,5))
VARIABLE_TABLE <- data.frame(Var.1 = runif(100,0,10), Var.2 = runif(100,-100,100), Var.3 = runif(100,0,1),
Var.4 = runif(100,-1000,1000), Var.5 = runif(100,-1,1), Var.6 = runif(100,-10,10))
# String each row into a character string of columns to select
VARIABLE_STRING <- apply(VARIABLE_SELECTION,1,paste,sep = ",",collapse = " ")
VARIABLE_STRING <- gsub(" ",",",VARIABLE_STRING)
VARIABLE_STRING <- data.frame(VAR_STRING = gsub(",0","",VARIABLE_STRING))
# Will actually be part of an lapply call, but a single-row selection is shown here for demonstration:
VARIABLE_SINGLE_SET <- as.character(VARIABLE_STRING[4,])
# Subset table for selected columns
VARIABLE_TABLE_SUB_SELECT <- VARIABLE_TABLE[,c(VARIABLE_SINGLE_SET)]
# Error Returned:
# Error in `[.data.frame`(VARIABLE_TABLE, , c(VARIABLE_SINGLE_SET)) :
# undefined columns selected
I know the text formatting is the problem, but I can't find a workaround. Any suggestions?
Upvotes: 1
Views: 818
Reputation: 67818
Does this give the desired result?
lapply(VARIABLE_SELECTION, function(x) VARIABLE_TABLE[ , x[x != 0], drop = FALSE])
This produces a list in which each element is a subset of 'VARIABLE_TABLE' given by one column of 'VARIABLE_SELECTION' (shown here using a 'VARIABLE_TABLE' with fewer rows).
# $Set.1
# Var.3 Var.1 Var.1.1 Var.1.2 Var.1.3
# 1 0.09536403 5.593292 5.593292 5.593292 5.593292
# 2 0.09086404 6.339074 6.339074 6.339074 6.339074
#
# $Set.2
# Var.3 Var.2 Var.2.1 Var.2.2
# 1 0.09536403 65.81870 65.81870 65.81870
# 2 0.09086404 66.79157 66.79157 66.79157
#
# $Set.3
# Var.3 Var.4 Var.3.1
# 1 0.09536403 -674.6672 0.09536403
# 2 0.09086404 -576.7986 0.09086404
#
# $Set.4
# Var.5 Var.4
# 1 0.5155411 -674.6672
# 2 -0.9593219 -576.7986
#
# $Set.5
# Var.5
# 1 0.5155411
# 2 -0.9593219
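If the sets are meant to be read row-wise instead, as the question's `VARIABLE_STRING[4,]` example suggests, the same idea can be applied over rows. The following is only a sketch under that assumption; the name `ROW_SUBSETS` and the `set.seed` call are illustrative and not part of the original post.

```r
set.seed(1)  # assumption: seed added only so the random table is reproducible

VARIABLE_SELECTION <- data.frame(Set.1 = c(3,1,1,1,1), Set.2 = c(0,3,2,2,2), Set.3 = c(0,0,3,4,3),
                                 Set.4 = c(0,0,0,5,4), Set.5 = c(0,0,0,0,5))
VARIABLE_TABLE <- data.frame(Var.1 = runif(100,0,10), Var.2 = runif(100,-100,100), Var.3 = runif(100,0,1),
                             Var.4 = runif(100,-1000,1000), Var.5 = runif(100,-1,1), Var.6 = runif(100,-10,10))

# Build one subset per ROW of VARIABLE_SELECTION (hypothetical name: ROW_SUBSETS)
ROW_SUBSETS <- lapply(seq_len(nrow(VARIABLE_SELECTION)), function(i) {
  # Flatten row i into a plain numeric vector of column numbers
  idx <- unlist(VARIABLE_SELECTION[i, ], use.names = FALSE)
  # Drop the 0 placeholders; drop = FALSE keeps a data frame even for one column
  VARIABLE_TABLE[, idx[idx != 0], drop = FALSE]
})

# Row 4 of VARIABLE_SELECTION is 1, 2, 4, 5, 0
names(ROW_SUBSETS[[4]])
```

No string conversion is needed here, because the column numbers stay numeric from start to finish.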
Upvotes: 1
Reputation: 121608
You should avoid subsetting by column number and work with variable names instead, or at least keep your indices as a list of integers (there is no need to coerce them to a string).
To stay close to your original idea, this corrects your code. Basically, I split your string on the commas and coerce the pieces to a numeric vector:
VARIABLE_TABLE[,as.numeric(unlist(strsplit(
VARIABLE_SINGLE_SET,',')))]
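To illustrate why the original call fails and what the two suggestions above might look like side by side, here is a small sketch. The tiny `VARIABLE_TABLE` and the names `COLUMN_INDEX`, `COLUMN_NAMES`, `BY_INDEX` and `BY_NAME` are made up for the example.

```r
# Small illustrative table (assumption: stands in for the real VARIABLE_TABLE)
VARIABLE_TABLE <- data.frame(Var.1 = 1:3, Var.2 = 4:6, Var.3 = 7:9,
                             Var.4 = 10:12, Var.5 = 13:15)
VARIABLE_SINGLE_SET <- "1,2,4,5"

# c("1,2,4,5") is still ONE string, so `[` looks for a column
# literally named "1,2,4,5" and reports undefined columns
length(c(VARIABLE_SINGLE_SET))

# Option 1: split the string on commas and convert the pieces to integers
COLUMN_INDEX <- as.integer(strsplit(VARIABLE_SINGLE_SET, ",", fixed = TRUE)[[1]])
BY_INDEX <- VARIABLE_TABLE[, COLUMN_INDEX, drop = FALSE]

# Option 2: translate the positions to names once, then select by name
COLUMN_NAMES <- names(VARIABLE_TABLE)[COLUMN_INDEX]
BY_NAME <- VARIABLE_TABLE[, COLUMN_NAMES, drop = FALSE]
```

Selecting by name tends to be more robust if the column order of the table changes later, which is the motivation for preferring names over positions.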
Upvotes: 1