Reputation: 1644
I would like to take the maximum negative value of a column containing negatives and positives (diff_start), and the minimum positive value of another column (diff_end) in R.
Data:
data <- read.table(text ="
id lab diff_start diff_end
1 hb -1.7 -1.8
1 hb -0.3 -0.3
1 hb 0.6 0.5
1 hb 0.7 0.8", header = TRUE)
Desired Output:
# id lab diff_start diff_end
# 1 hb -0.3 0.5
What I have done: filter on <= 0 for diff_start and on >= 0 for diff_end, summarise each, and join the results.
I think this is pretty long and inefficient, and hope to make it more succinct.
full_join(
data %>%
group_by(id, lab) %>%
filter(diff_start <= 0) %>%
summarise(diff_start = max(diff_start)) %>%
ungroup(),
data %>%
group_by(id, lab) %>%
filter(diff_end >= 0) %>%
summarise(diff_end = min(diff_end)) %>%
ungroup())
Upvotes: 1
Views: 1905
Reputation: 5481
You can simplify your code this way:
data %>%
group_by(id, lab) %>%
summarise(diff_start = max(diff_start[diff_start <= 0]), diff_end = min(diff_end[diff_end >= 0])) %>%
ungroup()
# A tibble: 1 x 4
id lab diff_start diff_end
<int> <fct> <dbl> <dbl>
1 1 hb -0.3 0.5
There is no need to filter first, since you can subset inside summarise.
To deal with missing negatives or positives:
data %>%
group_by(id, lab) %>%
summarise(
diff_start = if (sum(diff_start <= 0, na.rm = TRUE) == 0) NA else max(diff_start[diff_start <= 0], na.rm = TRUE),
diff_end = if (sum(diff_end >= 0, na.rm = TRUE) == 0) NA else min(diff_end[diff_end >= 0], na.rm = TRUE)
) %>%
ungroup()
Upvotes: 2
Reputation: 332
Give this a go:
max(data$diff_start[data$diff_start < 0])
min(data$diff_end[data$diff_end > 0])
Result:
> max(data$diff_start[data$diff_start < 0])
[1] -0.3
> min(data$diff_end[data$diff_end > 0])
[1] 0.5
Edit:
To maintain the grouping you can use:
by(data, list(data$id, data$lab), function(x) {
c(max(x$diff_start[x$diff_start < 0]),
min(x$diff_end[x$diff_end > 0]))
})
Output
[1] -0.3 0.5
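Note that by() returns an object of class "by" rather than a data frame. If a data frame shaped like the desired output is needed, one possible base R sketch is to split, summarise and rbind. Like the code above, it assumes every group has at least one negative diff_start and one positive diff_end:

```r
data <- read.table(text = "
id lab diff_start diff_end
1 hb -1.7 -1.8
1 hb -0.3 -0.3
1 hb 0.6 0.5
1 hb 0.7 0.8", header = TRUE)

# Split into one data frame per (id, lab) group, dropping empty combinations.
groups <- split(data, list(data$id, data$lab), drop = TRUE)

# Summarise each group into a one-row data frame.
rows <- lapply(groups, function(x) {
  data.frame(
    id = x$id[1],
    lab = x$lab[1],
    diff_start = max(x$diff_start[x$diff_start < 0]),
    diff_end = min(x$diff_end[x$diff_end > 0])
  )
})

# Stack the per-group rows into a single data frame.
result <- do.call(rbind, rows)
rownames(result) <- NULL
result
```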
Upvotes: 4