Reputation: 33
I have a dataset df1 like so:
snp <- c("rs7513574_T", "rs1627238_A", "rs1171278_C")
p.value <- c(2.635489e-01, 9.836280e-01, 6.315047e-01)
df1 <- data.frame(snp, p.value)
I want to remove the underscore (_) and the letters after it (which represent the allele) from the snp column of df1, and store the result in a new data frame, df2.
I tried this using the following code:
df2 <- df1[,c("snp", "allele"):=tstrsplit(`snp`, "_", fixed = TRUE)]
However, this also changes df1 itself. Is there another way to do this?
Upvotes: 0
Views: 669
Reputation: 887048
Consider creating a copy of the dataset and running tstrsplit on the copy. The := operator modifies its table in place, so working on a copy leaves the original data unchanged.
library(data.table)
df2 <- copy(df1)
setDT(df2)[,c("snp", "allele") := tstrsplit(snp, "_", fixed = TRUE)]
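To make the effect of the copy visible, here is a minimal, self-contained sketch built on the sample data from the question. It assumes data.table is installed; the final step that drops the allele column is my addition, for the case where only the rsID is wanted in df2.

```r
library(data.table)

# Sample data from the question
snp <- c("rs7513574_T", "rs1627238_A", "rs1171278_C")
p.value <- c(2.635489e-01, 9.836280e-01, 6.315047e-01)
df1 <- data.frame(snp, p.value)

df2 <- copy(df1)   # deep copy, so later in-place changes do not reach df1
setDT(df2)         # convert the copy to a data.table by reference

# Split "rs7513574_T" into "rs7513574" and "T", updating df2 in place
df2[, c("snp", "allele") := tstrsplit(snp, "_", fixed = TRUE)]

# Optional (assumption: the allele is not needed in df2)
df2[, allele := NULL]

# df1 still holds the original values and columns
print(df1)
print(df2)
```

Without the copy(), an assignment such as df2 <- df1 on a data.table only creates a second name for the same object, which is why := appeared to change df1 as well.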
Upvotes: 0
Reputation: 145765
This is my best guess as to what you want:
library(tidyr)
separate(df1, snp, into = c("snp", "allele"), sep = "_")
#         snp allele   p.value
# 1 rs7513574      T 0.2635489
# 2 rs1627238      A 0.9836280
# 3 rs1171278      C 0.6315047
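If the allele column is not needed at all, a base R version is also possible. This is only a sketch under the assumption that every value has the form rsID_allele; it uses sub() with a regular expression of my choosing and needs no extra packages.

```r
# Sample data from the question
snp <- c("rs7513574_T", "rs1627238_A", "rs1171278_C")
p.value <- c(2.635489e-01, 9.836280e-01, 6.315047e-01)
df1 <- data.frame(snp, p.value)

# transform() returns a modified copy, so df1 itself is left unchanged.
# sub("_.*$", "", x) deletes the first underscore and everything after it.
df2 <- transform(df1, snp = sub("_.*$", "", snp))

print(df2)
```

Because transform() builds a new data frame rather than modifying one in place, no explicit copy is required here.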
Upvotes: 1
Reputation: 6206
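library(dplyr)  # provides %>% and across()
# Note: this assumes a transposed layout with one column per SNP (V1, V2, V3),
# as in the output below, not the snp / p.value columns of df1 in the question.
df2 <- df1 %>%
  dplyr::mutate(across(c(V1, V2, V3), ~stringr::str_remove_all(., "_[:alpha:]")))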
> df2
V1 V2 V3
snp rs7513574 rs1627238 rs1171278
p.value 0.2635489 0.983628 0.6315047
Upvotes: 0