Reputation: 7790
I have the following data frame:
df <- structure(list(x = c(0.389794300700167, -1.20807617542949, -0.363676017470862,
-1.62667268170309, -0.256478394123992, 1.10177950308713, 0.755781508027337,
-0.238233556018718, 0.98744470341339, 0.741390128383824), y = c(0.0893472664958216,
-0.954943856152377, -0.195150384667239, 0.92552126209408, 0.482978524836611,
-0.596310636720207, -2.18528683816953, -0.674865937875116, -2.11906119191017,
-1.2651980215309), fac = structure(c(2L, 1L, 2L, 3L, 1L, 1L,
1L, 1L, 2L, 2L), .Label = c("A", "B", "C"), class = "factor")), .Names = c("x",
"y", "fac"), row.names = c(NA, -10L), class = "data.frame")
df
#> x y fac manual_assignment
#> 1 0.3897943 0.08934727 B b.x
#> 2 -1.2080762 -0.95494386 A a.y
#> 3 -0.3636760 -0.19515038 B b.y
#> 4 -1.6266727 0.92552126 C c.y
#> 5 -0.2564784 0.48297852 A a.y
#> 6 1.1017795 -0.59631064 A a.x
#> 7 0.7557815 -2.18528684 A a.x
#> 8 -0.2382336 -0.67486594 A a.x
#> 9 0.9874447 -2.11906119 B b.x
#> 10 0.7413901 -1.26519802 B b.x
What I want to do is to count rows based on comparing values in x
and y
. For every row, if the value x
is larger than y
, we increase the count of fac
member by 1. So the end result is this:
x.count y.count
A 3 2 (# a.y)
B 3 1
C 0 1
How can I achieve that? Possible with dplyr?
Upvotes: 2
Views: 84
Reputation: 886938
With data.table
, we can use
library(data.table)
dcast(setDT(df)[, .N, .(fac, measure = c('y.count', 'x.count')[(x > y) + 1])],
fac ~measure, fill = 0)
# fac x.count y.count
#1: A 3 2
#2: B 3 1
#3: C 0 1
Upvotes: 1
Reputation: 388797
With dplyr
we can group by fac
and count the length
of values where x
and y
values are greater.
library(dplyr)
df %>%
group_by(fac) %>%
summarise(x.count = length(which(x > y)),
y.count = length(which(x < y)))
# fac x.count y.count
# <fctr> <int> <int>
#1 A 3 2
#2 B 3 1
#3 C 0 1
Upvotes: 1
Reputation: 214927
This is more straight forward with table
:
with(df, table(fac, ifelse(x > y, "x.count", "y.count")))
#fac x.count y.count
# A 3 2
# B 3 1
# C 0 1
With dplyr/tidyr
, you need a few more lines of code:
library(tidyverse)
df %>%
group_by(fac, measure = if_else(x > y, "x.count", "y.count")) %>%
tally() %>%
spread(measure, n, fill = 0)
#Source: local data frame [3 x 3]
#Groups: fac [3]
# fac x.count y.count
#* <fctr> <dbl> <dbl>
#1 A 3 2
#2 B 3 1
#3 C 0 1
Upvotes: 2