Aishwarya Sharma
Aishwarya Sharma

Reputation: 81

Looping in a data frame in R until a certain condition is met

We have the current data frame df as below

df <- data.frame(ID = c(1,2,3,4,5,6), Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director", "Ashton", "K", "Christian", "Analyst"), 
                  Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B", "Font A", "Font A", "Font A", "Font B"))

The expected Output is

final_df <- data.frame(Name =  c("Chris J Kemp", "Ashton K Christian"), Designation = c("President, CEO & Director", "Analyst"))

So basically I want to add names until there's a certain font type in column Font and this is of course a sample of the huge data frame I'm dealing with. Thanks for the help in advance !

Sorry all your efforts earlier. This question has been re-edited a bit.

Upvotes: 2

Views: 103

Answers (5)

TarJae
TarJae

Reputation: 78917

Here is an alternative strategy with dplyr: Data used from Clemsang (many thanks!)

  1. group_by and divide ID by 3
  2. summarise and collapse (bring rows to one row by group)
  3. use group_split to split the groups (returns a list)
  4. use bind_cols to get a dataframe
  5. tweak names and select
library(dplyr)
df %>% 
    group_by(Font, ceiling(ID/3)) %>%
    summarise(Name = paste0(Name, collapse = " ")) %>% 
    group_split(Font) %>%
    bind_cols() %>% 
    select(Name = Name...3, Designation=Name...6)
  Name         Designation               
  <chr>        <chr>                     
1 Chris J Kemp President, CEO &  Director
2 Bad D King   Best, Teacher &  Friend   

Upvotes: 1

Vincent Guillemot
Vincent Guillemot

Reputation: 3429

Here is a solution with dplyr:

library(dplyr)

df %>%
  group_by(Font, fontnum) %>%
  summarize(Tmp = paste(Name, collapse = " ")) %>% 
  mutate(ID = fontnum %/% 2) %>% 
  pivot_wider(id_cols = ID, names_from = Font, values_from = Tmp) %>%
  transmute(Name = `Font A`, Designation = `Font B`)

where

df <- data.frame(
  ID = 1:12, 
  Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director", 
           "Bad", "D", "King", "Best,", "Teacher & ", "Friend"), 
  Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B",
           "Font A", "Font A", "Font A", "Font B", "Font B", "Font B")
)

and

df$fontnum <- cumsum(c(0, abs(diff(as.numeric(factor(df$Font))))))

And the result will be

# A tibble: 2 × 2
  Name         Designation               
  <chr>        <chr>                     
1 Chris J Kemp President, CEO &  Director
2 Bad D King   Best, Teacher &  Friend  

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388817

You can try -

library(dplyr)
library(tidyr)

df %>%
  mutate(Font = recode(Font, 'Font A' = 'Name', 'Font B' = 'Designation'), 
         ID = data.table::rleid(Font)) %>%
  group_by(ID, Font) %>%
  summarise(Name = toString(Name), .groups = 'drop') %>%
  mutate(ID = ceiling(ID/2)) %>%
  pivot_wider(names_from = Font, values_from = Name) %>%
  select(-ID)

#  Name                 Designation                 
#  <chr>                <chr>                       
#1 Chris, J, Kemp       President,, CEO & , Director
#2 Ashton, K, Christian Analyst                     

Upvotes: 2

Clemsang
Clemsang

Reputation: 5481

Data:

df <- data.frame(
  ID = c(1:12), 
  Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director", 
           "Bad", "D", "King", "Best,", "Teacher & ", "Friend"), 
  Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B",
           "Font A", "Font A", "Font A", "Font B", "Font B", "Font B")
)

You can do:

df$group <- cumsum(c(TRUE, df$Font[-1] != df$Font[-length(df$Font)]))
final_df <- as.data.frame(matrix(lapply(split(df$Name, df$group), paste, collapse = " "), ncol = 2))
colnames(final_df) <- c("Name", "Designation")

A grouping row of consecutif font is created. Then split allows to have a list per font then you can reformat data using paste.

Output:

                        Name             Designation
1               Chris J Kemp              Bad D King
2 President, CEO &  Director Best, Teacher &  Friend

Upvotes: 2

elielink
elielink

Reputation: 1202

How about this method:

df_final = as.data.frame(matrix(unlist(lapply(unique(df$Font),function(i){paste(collapse = ' ', df[df$Font%in%i,"Name"])})), byrow=T,ncol = 2))

colnames(df_final)=c("names", "designation")

Let me know if it's ok

Upvotes: 1

Related Questions