Reputation: 380
I am looking for an efficient way to split words into individual characters (some entries contain special characters such as " or )). I have done it using a loop and the substring function, but it is super slow.
Example input code:
words <- data.frame(V1 = c("blibli","blabla","\"","]"))
words$V1 <- as.character(words$V1)
Input looks like:
V1
1 blibli
2 blabla
3 "
4 ]
Code that I have written:
char_df <- NULL
for (i in 1:nrow(words)) {
  print(i)
  temp <- substring(words[i, ][1], 1:nchar(words[i, ]), 1:nchar(words[i, ]))
  char_df <- rbind(char_df,
                   data.frame(char = temp,
                              idx = 1:nchar(words[i, ])))
}
Expected output:
char idx
1 b 1
2 l 2
3 i 3
4 b 4
5 l 5
6 i 6
7 b 1
8 l 2
9 a 3
10 b 4
11 l 5
12 a 6
13 " 1
14 ] 1
I am open to any technique: dplyr, data.table, or base R.
Upvotes: 2
Views: 1006
Reputation: 1795
Additionally, I would suggest the pretty nifty package stringi:
library(stringi)
x<-c("dog","cat","@@$")
unlist(stri_extract_all(x,regex = "."))
[1] "d" "o" "g" "c" "a" "t" "@" "@" "$"
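The call above returns one flat vector, so the per-word position asked for in the question is lost. A minimal sketch of how the same stringi call could be combined with sequence() and lengths() to rebuild the idx column, assuming stringi is installed and using the words data from the question (the names lst and char_df are just illustrative):

```r
library(stringi)

# Input from the question, kept as character rather than factor
words <- data.frame(V1 = c("blibli", "blabla", "\"", "]"),
                    stringsAsFactors = FALSE)

# Without unlist(), stri_extract_all() returns a list with one
# character vector per input string
lst <- stri_extract_all(words$V1, regex = ".")

# lengths(lst) gives the number of characters per word;
# sequence() turns that into 1..n for each word
char_df <- data.frame(char = unlist(lst),
                      idx = sequence(lengths(lst)),
                      stringsAsFactors = FALSE)
print(char_df)
```

Note that the regex "." does not match newline characters by default, so this sketch assumes the words contain none.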
Upvotes: 2
Reputation: 887048
After splitting 'V1' by '' into a list, we get the sequence of the lengths of the list and create a data.frame by unlisting the list:
lst <- strsplit(words$V1, "")
data.frame(char = unlist(lst), idx = sequence(lengths(lst)))
# char idx
#1 b 1
#2 l 2
#3 i 3
#4 b 4
#5 l 5
#6 i 6
#7 b 1
#8 l 2
#9 a 3
#10 b 4
#11 l 5
#12 a 6
#13 " 1
#14 ] 1
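If this is needed in more than one place, the two lines above could be wrapped in a small helper. This is only a sketch: the function name split_chars and the extra word column (which records the input element each character came from) are my own additions, not part of the question.

```r
# Hypothetical helper: split each string into characters and record
# the position of each character within its word
split_chars <- function(x) {
  x <- as.character(x)       # also handles factor input
  lst <- strsplit(x, "")     # one character vector per word
  n <- lengths(lst)          # number of characters per word
  data.frame(char = unlist(lst),
             idx = sequence(n),              # 1..n within each word
             word = rep(seq_along(lst), n),  # index of the source word
             stringsAsFactors = FALSE)
}

res <- split_chars(c("blibli", "blabla", "\"", "]"))
print(res)
```

Since everything here is vectorised, there is no repeated rbind inside a loop. Empty strings simply produce no rows, because strsplit returns a zero-length vector for them.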
Upvotes: 3