jed

Reputation: 132

Mutating via dplyr by a particular set of indices

I am working with a dataframe which follows a pattern of:

Key     Data
Loc     Place1
Value1  6
Value2  7
Loc     Place2
Value3  8
Loc     Place3
Value1  9
Value2  10
Loc     Place4
Value3  11

It is a rough dataset, but a pattern exists: in this example, the rows at seq(1, 100, by = 5) each mark the first location of an observation. My goal is to change the key at those positions (for example, to LocA rather than Loc) so that a spread(key, value) gives me unique observations I can use for further analysis:

LocA   Value1 Value2  Loc    Value3
Place1      6      7  Place2      8
Place3      9     10  Place4     11

I have been using dplyr and a chain of other mutates and selects to get to this point, so I'm hoping to remain in the chain. I can see how to do it with appropriate subsetting outside of the chain, but I am having difficulty wrapping my head around a dplyr solution.
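The out-of-chain subsetting mentioned here could look roughly like the sketch below. It is an illustration, not code from the original post: it assumes every observation spans exactly five rows, and the names first_rows, obs, and result are made up for the example.

```r
library(tidyr)

# Sample data from the question (all values kept as character)
df <- data.frame(
  Key  = c("Loc", "Value1", "Value2", "Loc", "Value3",
           "Loc", "Value1", "Value2", "Loc", "Value3"),
  Data = c("Place1", "6", "7", "Place2", "8",
           "Place3", "9", "10", "Place4", "11"),
  stringsAsFactors = FALSE
)

# Assumption: each observation starts every 5th row (1, 6, 11, ...)
first_rows <- seq(1, nrow(df), by = 5)
df$Key[first_rows] <- "LocA"

# spread() needs an identifier so each 5-row block becomes one output row
df$obs <- (seq_len(nrow(df)) - 1) %/% 5

result <- spread(df, Key, Data)
result$obs <- NULL
```

This works, but it breaks the pipeline into separate assignment steps, which is what I would like to avoid.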

Upvotes: 0

Views: 70

Answers (2)

Kan Nishida

Reputation: 106

Here is another way to do this. I admit this one won't scale as well as the one from r2evans.

df <- structure(list(Key = c("Loc", "Value1", "Value2", "Loc", "Value3", 
"Loc", "Value1", "Value2", "Loc", "Value3"), Data = c("Place1", 
"6", "7", "Place2", "8", "Place3", "9", "10", "Place4", "11")), .Names = c("Key", 
"Data"), row.names = c(NA, -10L), class = "data.frame")

library(dplyr)
library(tidyr)
library(stringr)  # provides str_c()

df %>% 
  mutate(gid = ceiling(row_number() / 5)) %>%
  group_by(gid) %>%
  summarize(concatenated_text = str_c(Data, collapse = ",")) %>%
  separate(concatenated_text, into = c("LocA", "Value1", "Value2", "Loc", "Value3"), sep=",")

Upvotes: 1

r2evans

Reputation: 160417

Your data:

df <- structure(list(Key = c("Loc", "Value1", "Value2", "Loc", "Value3", 
"Loc", "Value1", "Value2", "Loc", "Value3"), Data = c("Place1", 
"6", "7", "Place2", "8", "Place3", "9", "10", "Place4", "11")), .Names = c("Key", 
"Data"), row.names = c(NA, -10L), class = "data.frame")

Is this workable?

library(dplyr)
library(tidyr)
df %>%
  mutate(grp = (row_number() - 1) %/% 5) %>%
  group_by(grp) %>%
  mutate(
    Key = ifelse(! duplicated(Key), Key, paste0(Key, "A"))
  ) %>%
  ungroup() %>%
  spread(Key, Data) %>%
  select(-grp)
# Source: local data frame [2 x 5]
#      Loc   LocA Value1 Value2 Value3
# *  <chr>  <chr>  <chr>  <chr>  <chr>
# 1 Place1 Place2      6      7      8
# 2 Place3 Place4      9     10     11
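In newer versions of tidyr, spread() is superseded by pivot_wider(), so the same chain could be written as in the sketch below. It assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later (for across()). The final conversion of the Value columns to integer is an optional extra that is not part of the answer above, and the name wide is only for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (all values kept as character)
df <- data.frame(
  Key  = c("Loc", "Value1", "Value2", "Loc", "Value3",
           "Loc", "Value1", "Value2", "Loc", "Value3"),
  Data = c("Place1", "6", "7", "Place2", "8",
           "Place3", "9", "10", "Place4", "11"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # Assumption: every observation spans exactly 5 consecutive rows
  mutate(grp = (row_number() - 1) %/% 5) %>%
  group_by(grp) %>%
  # Within each block, suffix repeated keys so the column names are unique
  mutate(Key = ifelse(!duplicated(Key), Key, paste0(Key, "A"))) %>%
  ungroup() %>%
  pivot_wider(names_from = Key, values_from = Data) %>%
  select(-grp) %>%
  # Optional: Data was character, so convert the numeric columns
  mutate(across(starts_with("Value"), as.integer))
```

Unlike spread(), which sorts the new columns alphabetically, pivot_wider() keeps the keys in order of first appearance, so the column order will differ from the output shown above.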

Upvotes: 1
