Reputation: 179
I have a data frame called data.df with columns col1, col2, col3, ..., col15. The data frame has no designated class attribute; any column could potentially serve as the class variable. I would like to use an R variable called target that refers to the column to be treated as the class, as follows:
target<-data.df$col3
and then use that variable (target) as input to several learners such as PART and J48 (from the RWeka package):
part<-PART(target~.,data=data.df,control=Weka_control(M=200,R=FALSE))
j48<-J48(target~.,data=data.df,control=Weka_control(M=200,R=FALSE))
The idea is to be able to change 'target' only once at the beginning of my R code. How can this be done?
Upvotes: 18
Views: 40497
Reputation: 22291
I sometimes manage to get a lot done by using strings to reference columns. It works like this:
> df <- data.frame(numbers=seq(5))
> df
  numbers
1       1
2       2
3       3
4       4
5       5
> df$numbers
[1] 1 2 3 4 5
> df[['numbers']]
[1] 1 2 3 4 5
You can then have a variable target hold the name of your desired column as a string. I don't know about RWeka, but many libraries such as ggplot2 can take string references for columns (e.g. the aes_string function instead of aes).
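To tie this back to the question, one way to carry a column name held in a string into a modelling call is to build the formula from that string. This is only a minimal sketch: the toy data frame is made up, lm is used purely as a stand-in learner so the example runs without RWeka, and it assumes PART and J48 accept a formula object in the same way.

```r
# Toy data frame standing in for data.df (values are made up)
data.df <- data.frame(col1 = c(1, 2, 3, 4, 5),
                      col2 = c(2, 1, 4, 3, 5),
                      col3 = c(1.1, 1.9, 3.2, 3.9, 5.1))

# Change the target column here, once
target <- "col3"

# Build the formula col3 ~ . from the string
f <- as.formula(paste(target, "~ ."))

# lm is only a stand-in learner; the assumption is that the same formula
# object could be passed to PART or J48 in place of a literal target ~ .
fit <- lm(f, data = data.df)
```

With this approach only the string assigned to target needs to change; every later call reuses f.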
Upvotes: 23
Reputation: 18628
If you are asking about pointer-style references to a column in R, that is not possible.
However, if you are asking about getting a column whose name is not given explicitly, this is possible with the [ operator, like this:
theNameOfColumnIwantToGetSummaryOf<-"col3"
summary(data.df[,theNameOfColumnIwantToGetSummaryOf])
...or like this:
myIndexOfTheColumnIwantToGetSummaryOf<-3
summary(data.df[,sprintf("col%d",myIndexOfTheColumnIwantToGetSummaryOf)])
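Since the question mentions a column number, the index can also be turned into a name with names(), which avoids depending on the col%d naming pattern. A small sketch on a made-up data frame (the names target_idx, target_name and target_col are just for illustration):

```r
# Made-up data frame for illustration
data.df <- data.frame(col1 = 1:4,
                      col2 = c("a", "b", "a", "b"),
                      col3 = c(10, 20, 30, 40))

# Column number of the class variable, set once
target_idx <- 3

# Look up the column name from its position
target_name <- names(data.df)[target_idx]

# [[ extracts the column itself, by position or by name
target_col <- data.df[[target_idx]]

summary(target_col)
```

Here data.df[[target_name]] and data.df[[target_idx]] return the same vector, so either the name or the number can be the single value changed at the top of the script.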
Upvotes: 6