how to extract number beside specific string in a cell?

I would like to extract number information in cells which is located beside specific string. My data looks like this.

    item             stock
PRE 24GUSSETX4SX15G   200
PLS 12KLRX10SX15G     200
ADU 24SBX200ML        200
NIS 18BNDX40SX11G     200
REF 500GX12BTL        200

i want to extract the numbers which located besides string 'GUSSET','KLR','SB','BND' and 'BTL'. I want to use this number to do multiplication with the stock. For example like this.

            item         stock    pcs    total
    PRE 24GUSSETX4SX15G   200      24    4800
    PLS 12KLRX10SX15G     200      12    2400 
    ADU 24SBX200ML        200      24    4800
    NIS 18BNDX40SX11G     200      18    3600
    REF 500GX12BTL        200      12    2400

anyone know how to extract the numbers? thanks very much in advance

Upvotes: 0

Views: 47

Answers (2)

akrun
akrun

Reputation: 887251

We can do this in tidyverse

library(tidyverse)
df %>%
    mutate(pcs = as.numeric(str_extract(item, "(\\d+)(?=(GUSSET|KLR|SB|BND|BTL))")),
    total = pcs * stock)
#                 item stock pcs total
#1 PRE 24GUSSETX4SX15G   200  24  4800
#2   PLS 12KLRX10SX15G   200  12  2400
#3      ADU 24SBX200ML   200  24  4800
#4   NIS 18BNDX40SX11G   200  18  3600
#5      REF 500GX12BTL   200  12  2400

data

df <- structure(list(item = c("PRE 24GUSSETX4SX15G", "PLS 12KLRX10SX15G", 
"ADU 24SBX200ML", "NIS 18BNDX40SX11G", "REF 500GX12BTL"), stock = c(200L, 
200L, 200L, 200L, 200L)), class = "data.frame", row.names = c(NA, 
-5L))

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 389055

One way using base R, is to use sub to extract numbers besides those groups and multiply them with stock to get total.

df$pcs <- as.numeric(sub(".*?(\\d+)(GUSSET|KLR|SB|BND|BTL).*", "\\1", df$item))
df$total <- df$stock * df$pcs

df
#               item stock pcs total
#PRE 24GUSSETX4SX15G   200  24  4800
#PLS   12KLRX10SX15G   200  12  2400
#ADU      24SBX200ML   200  24  4800
#NIS   18BNDX40SX11G   200  18  3600
#REF      500GX12BTL   200  12  2400

Or everything in one pipe

library(dplyr)
df %>%
  mutate(pcs = as.numeric(sub(".*?(\\d+)(GUSSET|KLR|SB|BND|BTL).*", "\\1", item)), 
         total = stock * pcs)

Upvotes: 1

Related Questions