Reputation: 1878
R gurus, I am struggling to find an efficient way to split a string into multiple parts, where the valid parts are given in a vector.
In the following example, I have a few cryptocurrency pairs from the Binance exchange. I want to split each pair into its two separate parts, both of which appear in the symbol column of the top100 data frame.
library(dplyr)
library(jsonlite)
library(RCurl)
top100 <- data.frame(fromJSON(getURL(paste0('https://api.coinmarketcap.com/v1/ticker/?start=0&limit=100'))))
markets <- data.frame(pairs = c("NEOBTC","EOSETH","VENETH","ELFETH","ICXETH","BNBETH","NEOETH",
                                "TRXETH","QTUMETH","DASHETH","XRPETH","ETHUSDT","LTCUSDT","ADAETH",
                                "XMRETH","ZECETH","IOTAETH","NEOUSDT","BNBUSDT","XLMBNB","LSKBNB"),
                      symbol1 = NA,
                      symbol2 = NA)
markets$symbol1 <- substr(markets$pairs, 1, 3)
markets$symbol2 <- substr(markets$pairs, 4, 6)
markets$symbol1 %in% top100$symbol
markets$symbol2 %in% top100$symbol
One naive way to do that, shown above, is to take the first three characters of the ticker as symbol1 and the next three characters as symbol2. However, this breaks for tickers with more than three characters, such as DASH.
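To make the failure concrete, here is a short check on two of the pairs above. It is only a sketch using base R's substr, and it shows the fixed-width split working for a three-letter first symbol and breaking for a four-letter one:

```r
pairs <- c("NEOBTC", "DASHETH")

# Fixed-width split: characters 1 to 3 and characters 4 to 6
symbol1 <- substr(pairs, 1, 3)
symbol2 <- substr(pairs, 4, 6)

print(data.frame(pairs, symbol1, symbol2))
# "NEOBTC" splits into "NEO" and "BTC", but "DASHETH" becomes "DAS" and "HET"
```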
Upvotes: 0
Views: 145
Reputation: 79208
You can try the following code:
grep("\\w\\s\\w",sapply(paste0("(",top100$symbol,"$)"),
sub,"\\3 \\1",a<-markets$pairs),value = T)%>%
{.[match(a,sub("\\s","",.))]}%>%
strsplit(.,"\\s")%>%do.call(rbind,.)%>%
{setNames(as.data.frame(.),paste0("Symbols",1:2))}
You can also try:
sub(paste0("(",top100$symbol,")$",collapse = "|"),"",a<-markets$pairs)%>%
{cbind.data.frame(Symbols1=.,Symbols2=sub(paste0("^(",.,")",collapse = "|"),"",a))}
Both snippets above give:
   Symbols1 Symbols2
1       NEO      BTC
2       EOS      ETH
3       VEN      ETH
4       ELF      ETH
5       ICX      ETH
6       BNB      ETH
7       NEO      ETH
8       TRX      ETH
9      QTUM      ETH
10     DASH      ETH
11      XRP      ETH
12      ETH     USDT
13      LTC     USDT
14      ADA      ETH
15      XMR      ETH
16      ZEC      ETH
17     IOTA      ETH
18      NEO     USDT
19      BNB     USDT
20      XLM      BNB
21      LSK      BNB
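If the regex pipelines above are hard to follow, the same idea (strip a known symbol from the end of the pair, then check that what remains is also a known symbol) can be sketched without regular expressions. This is an illustrative alternative, not the code that produced the output above. The vector known_symbols is a hypothetical stand-in for top100$symbol, the helper split_pair is a made-up name, and endsWith needs R 3.3.0 or later:

```r
# Hypothetical stand-in for top100$symbol (the live API call is not reproduced here)
known_symbols <- c("BTC", "ETH", "USDT", "BNB", "NEO", "EOS",
                   "QTUM", "DASH", "IOTA", "XLM")

# Hypothetical helper: split one pair into its two symbols
split_pair <- function(pair, symbols) {
  no_match <- c(symbol1 = NA_character_, symbol2 = NA_character_)
  # Candidate second symbols: known symbols that the pair ends with
  quotes <- symbols[endsWith(pair, symbols)]
  if (length(quotes) == 0) return(no_match)
  # What remains of the pair once each candidate is removed from the end
  bases <- substring(pair, 1, nchar(pair) - nchar(quotes))
  # Keep only splits whose remainder is also a known symbol
  ok <- bases %in% symbols
  if (!any(ok)) return(no_match)
  c(symbol1 = bases[ok][1], symbol2 = quotes[ok][1])
}

pairs <- c("NEOBTC", "QTUMETH", "DASHETH", "ETHUSDT", "XLMBNB")
result <- as.data.frame(t(sapply(pairs, split_pair, symbols = known_symbols)),
                        stringsAsFactors = FALSE)
print(result)
```

Pairs for which no valid split exists come back as NA in both columns, which makes symbols missing from the lookup vector easy to spot.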
Upvotes: 1