Reputation: 77
I need to add a Thickness column to my Products table based on multiple conditions.
1 : Thickness should be only one of these values
Plate_Thickness <- c(5.8,25.1,27.1,32.5,55.6,98.1,120.4)
2 : Thickness should be between the ThicknessMin and ThicknessMax values already existing in the table.
Current table looks like this:
Product ThicknessMin ThicknessMax
P0001 0 8
P0002 31.01 70
P0003 8.01 31
P0004 70.01 999
P0005 8.01 31
So, the idea is to pick a value for Thickness from the vector randomly, but it should be between ThicknessMin and ThicknessMax. Please help with any pointers on how to go about this. Thanks.
Upvotes: 0
Views: 48
Reputation: 32548
#DATA
df = structure(list(Product = c("P0001", "P0002", "P0003", "P0004",
"P0005"), ThicknessMin = c(0, 31.01, 8.01, 70.01, 8.01), ThicknessMax = c(8L,
70L, 31L, 999L, 31L)), .Names = c("Product", "ThicknessMin",
"ThicknessMax"), class = c("data.table", "data.frame"), row.names = c(NA,
-5L))
Plate_Thickness = c(5.8,25.1,27.1,32.5,55.6,98.1,120.4)
set.seed(1)
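#as.data.frame() so the column selection also works if data.table is loaded
apply(X = as.data.frame(df)[c("ThicknessMin", "ThicknessMax")],
      MARGIN = 1, #Run FUN on each row of X
      FUN = function(x) {
          #Retain only eligible values for each row
          ok = Plate_Thickness[Plate_Thickness > x[1] & Plate_Thickness < x[2]]
          #Sample a position, not a value: sample(5.8, 1) would draw from 1:5
          ok[sample.int(length(ok), 1)]
      })
#[1] 5.8 32.5 27.1 120.4 25.1
#(the first value is always 5.8; the rest depend on the seed and R version)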
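A caveat worth checking when a row has exactly one eligible value: sample() treats a single number n >= 1 as shorthand for 1:n, and non-integer n is truncated, so sample(5.8, 1) draws from 1:5 instead of returning 5.8. A minimal sketch of a safer helper that samples a position instead of a value (pick_one is a hypothetical name, not part of base R):

```r
Plate_Thickness <- c(5.8, 25.1, 27.1, 32.5, 55.6, 98.1, 120.4)

# pick_one is a hypothetical helper: it samples a position rather than a
# value, so a length-1 vector is returned as-is; NA if nothing is eligible.
pick_one <- function(x) {
  if (length(x) == 0L) return(NA_real_)
  x[sample.int(length(x), 1L)]
}

# Only one plate thickness lies strictly between 0 and 8 (product P0001)
eligible <- Plate_Thickness[Plate_Thickness > 0 & Plate_Thickness < 8]

pick_one(eligible)   # always 5.8
sample(eligible, 1)  # an integer from 1:5, never 5.8
```

Returning NA for an empty set is a design choice made here for illustration; stopping with an error may be preferable if every product is expected to have at least one eligible thickness.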
Upvotes: 0
Reputation: 24480
A vectorized base R solution (df is your data.frame):
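set.seed(1) #just for reproducibility
#Plate_Thickness must be sorted in increasing order
#a: position of the first value > ThicknessMin
#b: position of the last value <= ThicknessMax
a<-findInterval(df$ThicknessMin,Plate_Thickness)+1
b<-findInterval(df$ThicknessMax,Plate_Thickness)
#random position in a..b for each row (assumes b >= a, i.e. at least one eligible value)
Plate_Thickness[floor(runif(length(a))*(b-a+1))+a]
#[1] 5.8 32.5 27.1 120.4 25.1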
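When bounds are mapped to positions with findInterval, it helps to confirm that the resulting index range covers exactly the eligible values. A small sketch of such a check, assuming Plate_Thickness is sorted in increasing order and using the bounds from the question as plain vectors (first, last, by_index and by_filter are illustrative names):

```r
Plate_Thickness <- c(5.8, 25.1, 27.1, 32.5, 55.6, 98.1, 120.4)
ThicknessMin <- c(0, 31.01, 8.01, 70.01, 8.01)
ThicknessMax <- c(8, 70, 31, 999, 31)

# findInterval(x, v) counts how many elements of the sorted vector v are <= x
first <- findInterval(ThicknessMin, Plate_Thickness) + 1L  # first value > min
last  <- findInterval(ThicknessMax, Plate_Thickness)       # last value <= max

# Eligible values per row, as implied by the index ranges
by_index <- Map(function(i, j) Plate_Thickness[seq_len(j - i + 1L) + i - 1L],
                first, last)

# The same sets obtained by filtering the vector directly
by_filter <- Map(function(lo, hi) {
  Plate_Thickness[Plate_Thickness > lo & Plate_Thickness <= hi]
}, ThicknessMin, ThicknessMax)

identical(by_index, by_filter)  # TRUE for this data
```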
Upvotes: 2
Reputation: 10761
We can use the rowwise function from the dplyr package to sample from the Plate_Thickness vector. For each row, we sample only from the elements of Plate_Thickness which are between ThicknessMin and ThicknessMax. I put your table in a data.frame called dat:
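library(dplyr)
set.seed(123)
dat %>%
    rowwise() %>%
    mutate(thick_sample = {
        #eligible values for this row
        ok <- Plate_Thickness[between(Plate_Thickness, ThicknessMin, ThicknessMax)]
        #sample a position so a single eligible value is returned as-is
        #(sample(5.8, 1) would draw from 1:5)
        ok[sample(length(ok), 1)]
    })
#P0001 always gets 5.8; the other rows depend on the seed and R version
  Product ThicknessMin ThicknessMax thick_sample
   <fctr>        <dbl>        <int>        <dbl>
1   P0001         0.00            8          5.8
2   P0002        31.01           70         55.6
3   P0003         8.01           31         25.1
4   P0004        70.01          999        120.4
5   P0005         8.01           31         27.1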
#DATA
dat <- structure(list(Product = structure(1:5, .Label = c("P0001", "P0002",
    "P0003", "P0004", "P0005"), class = "factor"), ThicknessMin = c(0,
    31.01, 8.01, 70.01, 8.01), ThicknessMax = c(8L, 70L, 31L, 999L,
    31L)), .Names = c("Product", "ThicknessMin", "ThicknessMax"),
    class = "data.frame", row.names = c(NA, -5L))
Upvotes: 1
Reputation: 13581
Plate_Thickness <- c(5.8,25.1,27.1,32.5,55.6,98.1,120.4)
df <- structure(list(Product = c("P0001", "P0002", "P0003", "P0004",
"P0005"), ThicknessMin = c(0, 31.01, 8.01, 70.01, 8.01), ThicknessMax = c(8L,
70L, 31L, 999L, 31L), Plate_Thickness = c(5.8, 32.5, 27.1, 120.4,
25.1)), .Names = c("Product", "ThicknessMin", "ThicknessMax",
"Plate_Thickness"), row.names = c(NA, -5L), class = c("data.table",
"data.frame"))
library(dplyr)
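#Eligible values for each row
acceptable_vals <- lapply(seq_len(nrow(df)), function(x) Plate_Thickness[between(Plate_Thickness, df$ThicknessMin[x], df$ThicknessMax[x])])
set.seed(1)
#Sample a position rather than a value, so a single eligible value is returned as-is
df$Plate_Thickness <- sapply(acceptable_vals, function(x) x[sample(seq_along(x), 1)])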
Product ThicknessMin ThicknessMax Plate_Thickness
1: P0001 0.00 8 5.8
2: P0002 31.01 70 32.5
3: P0003 8.01 31 27.1
4: P0004 70.01 999 120.4
5: P0005 8.01 31 25.1
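Whichever approach is used, the result can be checked against both conditions from the question. A minimal sketch using the values shown above, rebuilt here as a plain data.frame with the new column named Thickness (check_thickness is a hypothetical helper name):

```r
Plate_Thickness <- c(5.8, 25.1, 27.1, 32.5, 55.6, 98.1, 120.4)
res <- data.frame(
  Product      = c("P0001", "P0002", "P0003", "P0004", "P0005"),
  ThicknessMin = c(0, 31.01, 8.01, 70.01, 8.01),
  ThicknessMax = c(8, 70, 31, 999, 31),
  Thickness    = c(5.8, 32.5, 27.1, 120.4, 25.1)
)

# check_thickness is a hypothetical helper: TRUE for each row whose
# Thickness is one of the allowed values and lies within the row's bounds.
check_thickness <- function(d, allowed) {
  in_vector <- d$Thickness %in% allowed
  in_range  <- d$Thickness >= d$ThicknessMin & d$Thickness <= d$ThicknessMax
  in_vector & in_range
}

all(check_thickness(res, Plate_Thickness))  # TRUE
```

A value such as 2.0 for P0001 would fail the first condition, and 5.8 for P0005 would fail the second.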
Upvotes: 1