Reputation: 13
I have a table (named 'df') in R with 3000 rows.
In each row, in the column 'TestResults', there is a string of numbers separated by commas (e.g. 5, 10, 1, 3...).
I would like to create a new column in 'df' called 'TestValue1' that contains only the first number of the string in 'TestResults' (so for the example row, 'TestValue1' would hold the value "5").
This is the code I am running:
for (i in 1:nrow(df)) {
  rname <- rownames(df)[i]
  a <- as.numeric(unlist(strsplit(df[rname, "TestResults"], ",")))
  df[rname, "TestValue1"] <- a[1]
}
The error message I receive is:
Error in strsplit(df[rname, ("TestResults"))], :
non-character argument
However, when I run class(df$TestResults), I receive:
[1] "character"
so the column holding the strings of numbers is of class character.
(The error also occurs when as.numeric is not called.)
Thanks very much for your help!
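One way to narrow this down is to check what the row/column subset itself returns, rather than only the class of the whole column. The sketch below uses a made-up two-row table (the real 'df' may differ) and shows strsplit raising the same error when it is handed a one-cell data frame instead of a character vector. Whether this is what happens with the real 'df' is an assumption; it would apply if 'df' is an object such as a tibble, whose `[` returns a table rather than a vector.

```r
# Hypothetical stand-in for the asker's table; the real 'df' may differ.
df <- data.frame(
  TestResults = c("5, 10, 1, 3", "12, 0, 19, 7"),
  stringsAsFactors = FALSE
)

# A single cell of a character column is a character vector, so this works.
cell <- df["1", "TestResults"]
class(cell)
strsplit(cell, ",")

# A one-cell data frame is a list, not a character vector,
# so strsplit() fails; tryCatch() captures the error message.
cell_df <- df["1", "TestResults", drop = FALSE]
msg <- tryCatch(
  strsplit(cell_df, ","),
  error = function(e) conditionMessage(e)
)
msg
```

Running class(df) and class(df[rname, "TestResults"]) on the real table would show whether the subset is a plain character vector.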
Upvotes: 1
Views: 196
Reputation: 30474
If you wish to use strsplit, you could do the following:
df <- cbind(df, TestValue1 = as.numeric(unlist(lapply(strsplit(df$TestResults, ","), `[[`, 1))))
Output
   TestResults TestValue1
1  5, 10, 1, 3          5
2 12, 0, 19, 7         12
3 13, 4, 5, 11         13
Data
df <- data.frame(
TestResults = c("5, 10, 1, 3", "12, 0, 19, 7", "13, 4, 5, 11"),
stringsAsFactors = FALSE
)
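The same strsplit idea can also be written step by step, which may be easier to debug. This is only a sketch on the sample data from this answer; the intermediate names 'parts' and 'first' are introduced here for illustration.

```r
df <- data.frame(
  TestResults = c("5, 10, 1, 3", "12, 0, 19, 7", "13, 4, 5, 11"),
  stringsAsFactors = FALSE
)

# Split every row at once; strsplit() returns a list
# with one character vector per row.
parts <- strsplit(df$TestResults, ",", fixed = TRUE)

# Take the first piece of each vector; vapply() checks that
# exactly one string comes back per row.
first <- vapply(parts, function(p) p[1], character(1))

# Trim stray spaces and convert to numeric.
df$TestValue1 <- as.numeric(trimws(first))
df
```

Because strsplit is vectorized over the whole column, no for loop or row-name lookup is needed.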
Upvotes: 0
Reputation: 17175
The gsub function seems to work with the sample data I generated. Hopefully it will work on your data!
# Create example data
res <- data.frame(
  TestResults = c("5, 10, 1, 3", "4,3,2,10", "8,21,0,8"),
  stringsAsFactors = FALSE
)
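# Run gsub: delete everything from the first comma onward
# (the pattern has no capture group, so the replacement is just "")
res$TestValue1 <- gsub(",.*", "", res$TestResults)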
#See results
res
Output results:
  TestResults TestValue1
1 5, 10, 1, 3          5
2    4,3,2,10          4
3    8,21,0,8          8
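Note that the regex approach leaves 'TestValue1' as a character column. If a numeric column is wanted, a minimal variant, assuming every row starts with a valid number, is to wrap the call in as.numeric. It uses sub rather than gsub, which is sufficient here because the pattern can match only once per string.

```r
res <- data.frame(
  TestResults = c("5, 10, 1, 3", "4,3,2,10", "8,21,0,8"),
  stringsAsFactors = FALSE
)

# Drop everything from the first comma onward, then convert to numeric.
res$TestValue1 <- as.numeric(sub(",.*", "", res$TestResults))

class(res$TestValue1)
```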
Upvotes: 1