Reputation: 1117
I have the data.frame DF. I would like to add another column (call it Station_no) that extracts the number after the underscore from the Variables column.
library(lubridate)
library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1979-01-01"), to = as.Date("1979-12-31"), by = "day"),
                 Grid_2 = runif(365, 1, 10), Grid_20 = runif(365, 5, 15)) %>%
  pivot_longer(-Date, names_to = "Variables", values_to = "Values")
Desired Output:
DF_out <- data.frame(Date = c("1979-01-01","1979-01-01"),Variables = c("Grid_2","Grid_20"),
Values = c(0.95,1.3), Station_no = c(2,20))
Upvotes: 1
Reputation: 887118
An easy option is parse_number from readr, which returns the extracted value converted to numeric:
library(dplyr)
DF %>%
mutate(Station_no = readr::parse_number(Variables))
Or use str_extract (in case we want to match by pattern); note that this returns a character column:
library(stringr)
DF %>%
mutate(Station_no = str_extract(Variables, "(?<=_)\\d+"))
Or using base R (the whitespace argument of trimws requires R >= 3.6.0; the result is character):
DF$Station_no <- trimws(DF$Variables, whitespace = '\\D+')
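Since the desired output in the question has a numeric Station_no, the character results from str_extract and trimws may need a conversion step. A minimal base R sketch, assuming a toy vector vars (a stand-in for DF$Variables, not part of the original answer):

```r
# Toy stand-in for the Variables column (assumption for illustration)
vars <- c("Grid_2", "Grid_20")

# Strip non-digit characters from both ends; needs R >= 3.6.0
station_chr <- trimws(vars, whitespace = "\\D+")

# Convert the character result to numeric to match the desired output
station_no <- as.numeric(station_chr)
station_no
```

The same as.numeric() wrapper can be applied inside mutate() around str_extract().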
Upvotes: 2
Reputation: 39595
A base R solution would be:
#Code
DF$Station_no <- sub("^[^_]*_", "", DF$Variables)
Output (some rows):
# A tibble: 730 x 4
   Date       Variables Values Station_no
   <date>     <chr>      <dbl> <chr>
 1 1979-01-01 Grid_2      3.59 2
 2 1979-01-01 Grid_20    12.8  20
 3 1979-01-02 Grid_2      8.09 2
 4 1979-01-02 Grid_20     6.93 20
 5 1979-01-03 Grid_2      4.68 2
 6 1979-01-03 Grid_20     5.18 20
 7 1979-01-04 Grid_2      8.95 2
 8 1979-01-04 Grid_20     9.07 20
 9 1979-01-05 Grid_2      9.46 2
10 1979-01-05 Grid_20     9.83 20
# ... with 720 more rows
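As the output shows, sub() leaves Station_no as character. The pattern also removes text only up to the first underscore, which matters if a name ever contains more than one. A small sketch of both points, where vars and the name "Grid_A_7" are hypothetical and not from the question's data:

```r
# "Grid_A_7" is a hypothetical name with two underscores
vars <- c("Grid_2", "Grid_20", "Grid_A_7")

# Pattern from the answer: drops everything up to the FIRST underscore
first_us <- sub("^[^_]*_", "", vars)

# Greedy variant: drops everything up to the LAST underscore
last_us <- sub(".*_", "", vars)

# Convert to numeric once only digits remain
station_no <- as.numeric(last_us)
station_no
```

For the question's data (a single underscore per name) both patterns give the same result, so the greedy variant is only worth considering if the naming scheme might change.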
Upvotes: 1