Reputation: 13
I have a dataframe with numeric variables for the wet weight and dry weight of samples, say soil. In this dataframe some values are equal to 0 and others are greater than zero. I want to apply a formula to the variables to create a new variable, but only for the pairs of data where both values are greater than zero. So far, I have tried the filter function of dplyr.
I want to create the new variable using the following formula:
moisture content = (wet weight - dry weight)/wet weight
Here is the code I have tried thus far:
dry_weight <- c(0,1,0,2,0,3,4,5,6,7)
wet_weight <- c(1,0,2,4,0,1,4,0,5,0)
weights <- data.frame(dry_weight, wet_weight)
weights$moisture <- weights %>%
filter(weights$wet_weight > 0, weights$dry_weight >0) %>%
mutate((weights$wet_weight-weights$dry_weight)/weights$wet_weight)
I am not sure if mutate is the right approach, but when I execute the code I get:
"Error: Column `(weights$wet_weight - weights$dry_weight)/weights$wet_weight` must
be length 4 (the number of rows) or one, not 10"
Any suggestions would be appreciated.
Upvotes: 1
Views: 302
Reputation: 388787
A vectorized way, using logical indexing:
#Initialize column to NA
weights$moisture <- NA
#Get the index where dry_weight > 0 and wet_weight > 0
inds <- with(weights, dry_weight > 0 & wet_weight >0)
#Calculate using the formula and replace the value.
weights$moisture[inds] <- with(weights,
(wet_weight[inds] - dry_weight[inds])/wet_weight[inds])
weights
# dry_weight wet_weight moisture
#1 0 1 NA
#2 1 0 NA
#3 0 2 NA
#4 2 4 0.5
#5 0 0 NA
#6 3 1 -2.0
#7 4 4 0.0
#8 5 0 NA
#9 6 5 -0.2
#10 7 0 NA
Upvotes: 0
Reputation: 11514
Another approach would be to simply use base R:
weights$moisture <-
ifelse(weights$dry_weight*weights$wet_weight > 0
, 1-weights$dry_weight/weights$wet_weight
, NA)
weights
dry_weight wet_weight moisture
1 0 1 NA
2 1 0 NA
3 0 2 NA
4 2 4 0.5
5 0 0 NA
6 3 1 -2.0
7 4 4 0.0
8 5 0 NA
9 6 5 -0.2
10 7 0 NA
ifelse is a vectorised if, with the form ifelse(condition, value if true, value if false). Here, I check whether the product of the two weights is strictly greater than zero, which for non-negative weights means both values are greater than zero. If so, I return the moisture; otherwise I return NA.
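As a small illustration of how ifelse works element by element, here is a sketch comparing the product test with an explicit test on each column. The vectors dry and wet are made up for the demonstration and are not the question's data. The product is also positive when both values are negative; real weights should never be negative, but the explicit comparison states the intent more directly.

```r
# Made-up values for illustration only (not the question's data)
dry <- c(2, 0, -1)
wet <- c(4, 3, -2)

# Product test: TRUE when both values are positive, but also when both are negative
by_product <- ifelse(dry * wet > 0, 1 - dry / wet, NA)
by_product
# [1] 0.5  NA 0.5

# Explicit test: TRUE only when both values are strictly positive
by_explicit <- ifelse(dry > 0 & wet > 0, 1 - dry / wet, NA)
by_explicit
# [1] 0.5  NA  NA
```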
Upvotes: 1
Reputation: 24770
I hope this will get you started.
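First, there is no need to keep typing weights$ inside dplyr verbs: columns can be referred to by their bare names. The weights$ prefix is also what triggers the error, because weights$wet_weight always refers to the full 10-row column, while the filtered data has only 4 rows.
Second, with mutate, give the new column a name on the left-hand side of =; otherwise the column is named after the expression itself.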
weights %>%
dplyr::filter(wet_weight > 0 & dry_weight > 0) %>%
mutate(moisture = (wet_weight - dry_weight)/wet_weight)
# dry_weight wet_weight moisture
#1 2 4 0.5
#2 3 1 -2.0
#3 4 4 0.0
#4 6 5 -0.2
Remember, if you want to assign this back to weights, just add weights <- to the beginning of the first line. Note that this keeps only the four filtered rows.
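If you would rather keep all ten rows and get NA where either weight is zero (the shape the other answers produce), one possible sketch is to skip filter and put the condition inside mutate with dplyr's if_else. This assumes dplyr is installed; NA_real_ is used because if_else expects both branches to have the same type.

```r
library(dplyr)

# Data from the question
dry_weight <- c(0, 1, 0, 2, 0, 3, 4, 5, 6, 7)
wet_weight <- c(1, 0, 2, 4, 0, 1, 4, 0, 5, 0)
weights <- data.frame(dry_weight, wet_weight)

# Keep every row; compute moisture only where both weights are positive
weights <- weights %>%
  mutate(moisture = if_else(wet_weight > 0 & dry_weight > 0,
                            (wet_weight - dry_weight) / wet_weight,
                            NA_real_))

weights$moisture
# [1]   NA   NA   NA  0.5   NA -2.0  0.0   NA -0.2   NA
```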
Upvotes: 1