Reputation: 1456
library(tidyverse)
I have a character vector of letters and letter-words:
myletters <- c("A", "A", "B", "C", "C", "C", "C", "AA", "BB", "BB")
I'd like to get a count of each distinct value, arranged in the order of first appearance. Identical values are always adjacent in the original vector; they are never interleaved. For example, this will never happen:
mylettersNever <- c("A", "B", "A", "C", "C", "C", "C", "AA", "BB", "BB")
I tried some things with table(), but it gave the same result as the following code, which doesn't work:
myletters %>%
  tibble(letters = .) %>%
  group_by(letters) %>%
  summarise(n = n())
... because the output is
# A tibble: 5 x 2
  letters     n
  <chr>   <int>
1 A           2
2 AA          1
3 B           1
4 BB          2
5 C           4
... but I would like:
# A tibble: 5 x 2
  letters     n
  <chr>   <int>
1 A           2
2 B           1
3 C           4
4 AA          1
5 BB          2
Help?
Upvotes: 2
Views: 2106
Reputation: 11140
Here's a hacky way, but it works. Assign an id to each group based on where its value first appears, count by that id, and then drop the id after summarizing. You can also use count() directly, which groups and ungroups behind the scenes.
myletters %>%
  tibble(letters = .) %>%
  count(id = match(letters, unique(letters)), letters) %>%
  select(-id)
# A tibble: 5 x 2
  letters     n
  <chr>   <int>
1 A           2
2 B           1
3 C           4
4 AA          1
5 BB          2
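To see why the id trick keeps the original order, it may help to look at the intermediate values in base R. This is only a sketch; the names first_seen, id, and counts are mine and not part of the answer above.

```r
myletters <- c("A", "A", "B", "C", "C", "C", "C", "AA", "BB", "BB")

# unique() keeps values in order of first appearance
first_seen <- unique(myletters)

# match() gives each element the position of its value in first_seen,
# so the ids increase in the order the values first show up
id <- match(myletters, first_seen)
id
#> [1] 1 1 2 3 3 3 3 4 5 5

# tabulate() counts how often each id occurs, in id order
counts <- data.frame(letters = first_seen, n = tabulate(id))
```

Because count() sorts its output by the grouping columns, putting id first is what forces the first-appearance order.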
Upvotes: 1
Reputation: 6542
You can use count() to count by a variable. To keep the original order, convert the character column to a factor with as_factor(), which creates the levels in order of first appearance, so count() returns the rows in that order.
library(tidyverse)
myletters <- c("A", "A", "B", "C", "C", "C", "C", "AA", "BB", "BB")
tibble(letters = myletters) %>%
  mutate(letters = as_factor(letters)) %>%
  count(letters)
#> # A tibble: 5 x 2
#>   letters     n
#>   <fct>   <int>
#> 1 A           2
#> 2 B           1
#> 3 C           4
#> 4 AA          1
#> 5 BB          2
Created on 2018-12-05 by the reprex package (v0.2.1)
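Since the question mentions table(), the same factor idea should also work in base R without the tidyverse. A minimal sketch, assuming you are fine with a table object rather than a tibble; the names f and counts are just for illustration.

```r
myletters <- c("A", "A", "B", "C", "C", "C", "C", "AA", "BB", "BB")

# Set the levels explicitly to the order of first appearance;
# the default factor() would sort the levels alphabetically
f <- factor(myletters, levels = unique(myletters))

# table() reports counts in level order, not alphabetical order
counts <- table(f)

names(counts)      # level names in first-appearance order
as.vector(counts)  # the matching counts
```

The key point is the explicit levels argument: without it, table() falls back to sorted levels and gives the same alphabetical result as the group_by() attempt in the question.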
Upvotes: 3