Achal Neupane

Reputation: 5719

How to delete last character in column values if greater than certain length in R?

I have a data frame called deseq.res with a column called Gene. I want to trim any value in this column that is longer than 10 characters down to its first 10 characters.

deseq.res

deseq.res<-structure(list(Gene = c("SS1G_0300902", "SS1G_024991", "SS1G_09248", 
"SS1G_09768"), sampleA = c("Healthy", "Healthy", "Healthy", "Healthy"
), sampleB = c("Infected", "Infected", "Infected", "Infected"
)), .Names = c("Gene", "sampleA", "sampleB"), row.names = c(NA, 
4L), class = "data.frame")

Result I want:

      Gene sampleA  sampleB
SS1G_03009 Healthy Infected
SS1G_02499 Healthy Infected
SS1G_09248 Healthy Infected
SS1G_09768 Healthy Infected

Code I tried:

This is where I am having trouble: once I can identify the values that are too long, I could simply use gsub or substring. I could do it in a more elaborate way, but I want to use a function for this.

check.len<- function(x){if (length(deseq.res$Gene[x])>10) return (x)}
check.len(deseq.res$Gene)

Upvotes: 1

Views: 218

Answers (2)

DevGin

Reputation: 453

You can use library(dplyr) and mutate:

library(dplyr)
deseq.res <- deseq.res %>% mutate(Gene = substr(Gene,1,10))

Upvotes: 0

akrun

Reputation: 887058

We can use substr to extract the first 10 characters of each value:

deseq.res$Gene <- substr(deseq.res$Gene, 1, 10)
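As a quick check, here is a minimal sketch using only the Gene values from the question (the vector names genes and trimmed are just for illustration). It suggests that no explicit length test is needed, because substr returns values of 10 or fewer characters unchanged:

```r
# Gene values copied from the question's sample data
genes <- c("SS1G_0300902", "SS1G_024991", "SS1G_09248", "SS1G_09768")

# Keep at most the first 10 characters of each element
trimmed <- substr(genes, 1, 10)

print(trimmed)
print(nchar(trimmed))  # number of characters per element after trimming
```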

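In the OP's function, nchar should be used instead of length: length returns the number of elements in a vector, whereas nchar returns the number of characters in each string.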

check.len <- function(x, n) ifelse(nchar(x) > n, substr(x, 1, n) , x)
check.len(deseq.res$Gene, n = 10)
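To illustrate the difference between the two functions, here is a small sketch on the Gene values from the question (genes is an illustrative name, not part of the original data frame). It also checks that, for this data, check.len gives the same result as calling substr directly:

```r
# Gene values copied from the question's sample data
genes <- c("SS1G_0300902", "SS1G_024991", "SS1G_09248", "SS1G_09768")

length(genes)  # 4: the number of elements in the vector
nchar(genes)   # 12 11 10 10: the number of characters in each element

# Same function as above: trim only the elements longer than n characters
check.len <- function(x, n) ifelse(nchar(x) > n, substr(x, 1, n), x)

# For this data, the conditional version and plain substr should agree
same <- identical(check.len(genes, n = 10), substr(genes, 1, 10))
print(same)
```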

Upvotes: 4
