Reputation: 10173
I will edit the post name shortly as I think up a better title, but for the time being, a short example below highlights what I am struggling with:
dput(mydf)
structure(list(gameID = c("34", "34", "34", "34", "34", "25",
"25", "25")), class = "data.frame", row.names = c(NA, -8L))
mydf
gameID
1 34
2 34
3 34
4 34
5 34
6 25
7 25
8 25
(garboCol is included only so that the data frame has more than one column; otherwise please ignore it.)
This feels like it should be a fairly straightforward data manipulation problem. I would like to create a new column that is simply the gameID column pasted with the running count of that gameID, i.e. the position of each row within its gameID group. I am thus seeking the following output:
mydf
gameID newCol
1 34 34-1
2 34 34-2
3 34 34-3
4 34 34-4
5 34 34-5
6 25 25-1
7 25 25-2
8 25 25-3
The gameID column is already of type character, and newCol should preferably be of type character as well. I am working within a long-ish dplyr chain, and am trying to get the following to work:
mydf <- mydf %>%
dplyr::mutate(newCol = paste0(gameID, '-', {what goes here}))
I am fairly easily able to do this with a for-loop, however a dplyr solution would be much better.
Upvotes: 1
Views: 64
Reputation: 887118
If we need to paste with a sequence, get the sequence with row_number() grouped by 'gameID' and paste to create the 'newCol'
library(dplyr)
mydf %>%
  group_by(gameID) %>%
  mutate(newCol = paste(gameID, row_number(), sep = '-'))
# A tibble: 8 x 3
# Groups: gameID [2]
# gameID garboCol newCol
# <fct> <dbl> <chr>
#1 34 1 34-1
#2 34 2 34-2
#3 34 3 34-3
#4 34 4 34-4
#5 34 5 34-5
#6 25 6 25-1
#7 25 7 25-2
#8 25 8 25-3
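The output above is still grouped by gameID (note the "# Groups" line). Inside a longer dplyr chain that may or may not be wanted. Below is a minimal sketch, assuming the data frame from the question with only the gameID column, of how the grouping could be dropped again with ungroup() once newCol has been created. The names grouped and result are only illustrative.

```r
library(dplyr)

# Same data as in the question (gameID column only).
mydf <- data.frame(
  gameID = c("34", "34", "34", "34", "34", "25", "25", "25"),
  stringsAsFactors = FALSE
)

# Grouped mutate: row_number() restarts at 1 within each gameID.
grouped <- mydf %>%
  group_by(gameID) %>%
  mutate(newCol = paste(gameID, row_number(), sep = "-"))

# The result is still a grouped tibble at this point.
is_grouped_df(grouped)

# ungroup() removes the grouping, so later steps in the chain
# operate on the whole data frame again.
result <- grouped %>%
  ungroup()

is_grouped_df(result)
```

Whether to ungroup depends on what follows in the chain; later summarise() or mutate() calls behave differently on grouped data.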
If we want to make this shorter, an option is rowid from data.table. The advantage is that it won't create the group attributes in the output
library(data.table)
mydf %>%
mutate(newCol = paste(gameID, rowid(gameID), sep='-'))
# gameID garboCol newCol
#1 34 1 34-1
#2 34 2 34-2
#3 34 3 34-3
#4 34 4 34-4
#5 34 5 34-5
#6 25 6 25-1
#7 25 7 25-2
#8 25 8 25-3
Or use it with glue (from the glue package)
library(glue)
mydf %>%
mutate(newCol = glue("{gameID}-{rowid(gameID)}"))
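One thing to be aware of with the glue variant: glue() returns an object of class "glue" (which extends character), not a plain character vector. Since the question asks for newCol to be of type character, here is a small sketch, assuming the gameID-only data from the question, that wraps the call in as.character(). The variable names with_glue and plain are only illustrative, and rowid is called with the data.table:: prefix so that data.table does not need to be attached.

```r
library(dplyr)
library(glue)

# Same data as in the question (gameID column only).
mydf <- data.frame(
  gameID = c("34", "34", "34", "34", "34", "25", "25", "25"),
  stringsAsFactors = FALSE
)

# glue() keeps its own class on the new column.
with_glue <- mydf %>%
  mutate(newCol = glue("{gameID}-{data.table::rowid(gameID)}"))

class(with_glue$newCol)

# as.character() strips the glue class and leaves a plain character vector.
plain <- mydf %>%
  mutate(newCol = as.character(glue("{gameID}-{data.table::rowid(gameID)}")))

class(plain$newCol)
```

For most purposes the glue class behaves like character, so the conversion only matters where a strict type check is involved.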
Upvotes: 2
Reputation: 26343
This might be what you had in mind.
mydf %>%
group_by(gameID) %>%
dplyr::mutate(newCol = paste0(gameID, '-', seq_along(gameID)))
# A tibble: 8 x 3
# Groups: gameID [2]
# gameID garboCol newCol
# <fct> <dbl> <chr>
#1 34 1 34-1
#2 34 2 34-2
#3 34 3 34-3
#4 34 4 34-4
#5 34 5 34-5
#6 25 6 25-1
#7 25 7 25-2
#8 25 8 25-3
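For comparison, the same per-group seq_along() idea can be written without dplyr. This is only a sketch of a base R alternative (not part of the answer above), assuming the gameID-only data from the question. It uses ave(), which applies a function within each level of a grouping variable and returns the results in the original row order. The name idx is illustrative.

```r
# Same data as in the question (gameID column only).
mydf <- data.frame(
  gameID = c("34", "34", "34", "34", "34", "25", "25", "25"),
  stringsAsFactors = FALSE
)

# For each gameID, number its rows 1, 2, 3, ... in order of appearance.
idx <- ave(seq_along(mydf$gameID), mydf$gameID, FUN = seq_along)

# Paste the ID and the within-group index together.
mydf$newCol <- paste0(mydf$gameID, "-", idx)

mydf
```

Because ave() returns its values in the original row order, this also works when rows of the same gameID are not adjacent.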
Upvotes: 2