schmitzhugen

Reputation: 13

Splitting comma separated string in R for every row

I have a table (named 'df') in R with 3000 rows.

In each row, in the column 'TestResults', there is a string of numbers separated by commas (e.g. 5, 10, 1, 3...).

I would like to create a new column in 'df' called 'TestValue1' which contains only the first number of the string found in 'TestResults' (so, for the example row, 'TestValue1' would hold the value 5).

This is the code I am running:


for (i in 1:nrow(df)) {
  rname=rownames(df)[i]
  a <- as.numeric(unlist(strsplit(df[rname, "TestResults"],",")))
  df[rname, "TestValue1"] <- a[1]
}

The error message I receive is:

Error in strsplit(df[rname, ("TestResults"))], : non-character argument

However, when I run class(df$TestResults), I get [1] "character", so the column holding the strings of numbers is a character vector.

(This error holds true even in the absence of the as.numeric function being called)

Thanks very much for your help!

Upvotes: 1

Views: 196

Answers (2)

Ben

Reputation: 30474

If you wish to use strsplit you could do the following:

df <- cbind(df, TestValue1 = as.numeric(unlist(lapply(strsplit(df$TestResults, ","), `[[`, 1))))

Output

   TestResults TestValue1
1  5, 10, 1, 3          5
2 12, 0, 19, 7         12
3 13, 4, 5, 11         13

Data

df <- data.frame(
  TestResults = c("5, 10, 1, 3", "12, 0, 19, 7", "13, 4, 5, 11"), 
  stringsAsFactors = FALSE
)
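
The one-liner above can be unpacked into separate steps, which may make it easier to see what each call contributes. This is only a sketch on the same toy data, not the asker's real table; the names pieces, first_piece, one_cell and msg are made up for illustration. The last part reproduces the "non-character argument" message by passing strsplit a one-cell data frame instead of a character vector. Whether that is what happens in the original loop is an assumption (it could be the case if 'df' is, for example, a tibble rather than a plain data.frame).

```r
# Toy data mirroring the example above (not the asker's real table)
df <- data.frame(
  TestResults = c("5, 10, 1, 3", "12, 0, 19, 7", "13, 4, 5, 11"),
  stringsAsFactors = FALSE
)

# Step 1: split every string in the column at once; the result is a list
# with one character vector per row
pieces <- strsplit(df$TestResults, ",")

# Step 2: take the first piece of each row; vapply() checks that exactly
# one character value comes back per row
first_piece <- vapply(pieces, function(p) p[1], character(1))

# Step 3: strip stray whitespace and convert to numeric
df$TestValue1 <- as.numeric(trimws(first_piece))

# Assumed cause of the original error: strsplit() receives a one-cell
# table (a list) rather than a character vector
one_cell <- df[1, "TestResults", drop = FALSE]
msg <- tryCatch(
  strsplit(one_cell, ","),
  error = function(e) conditionMessage(e)
)
```

Working on the whole column (df$TestResults) avoids the row-by-row indexing entirely, so the result does not depend on what df[rname, "TestResults"] happens to return.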

Upvotes: 0

jpsmith

Reputation: 17175

The gsub function seems to work with the sample data I generated. Hopefully it will work on your data!

#Created example data
res <- data.frame(
  TestResults = c("5, 10, 1, 3", "4,3,2,10", "8,21,0,8"),
  stringsAsFactors = FALSE
)

#Run gsub: drop the first comma and everything after it, then convert to numeric
res$TestValue1 <- as.numeric(gsub(",.*", "", res$TestResults))

#See results
res

Output results:

  TestResults TestValue1
1 5, 10, 1, 3          5
2    4,3,2,10          4
3    8,21,0,8          8
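
For what the pattern is doing: ",.*" matches the first comma and everything after it, so replacing that match with an empty string leaves only the text before the first comma. A minimal sketch on made-up input (the vector x and the names first_text and first_value are hypothetical), using sub since a single replacement per string is enough here:

```r
# Made-up input: spaces around values and a row with only one value
x <- c("5, 10, 1, 3", "4,3,2,10", " 8 ,21,0,8", "42")

# Remove the first comma and everything that follows it;
# strings without a comma are left unchanged
first_text <- sub(",.*", "", x)

# Trim surrounding whitespace and convert to numeric
first_value <- as.numeric(trimws(first_text))
```

Rows that contain no comma keep their single value, and the result is a numeric vector rather than character.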

Upvotes: 1
