Reputation: 81
We have the current data frame df as below
df <- data.frame(ID = c(1,2,3,4,5,6), Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director", "Ashton", "K", "Christian", "Analyst"),
Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B", "Font A", "Font A", "Font A", "Font B"))
The expected Output is
final_df <- data.frame(Name = c("Chris J Kemp", "Ashton K Christian"), Designation = c("President, CEO & Director", "Analyst"))
So basically I want to add names until there's a certain font type in column Font and this is of course a sample of the huge data frame I'm dealing with. Thanks for the help in advance !
Sorry all your efforts earlier. This question has been re-edited a bit.
Upvotes: 2
Views: 103
Reputation: 78917
Here is an alternative strategy with dplyr
: Data used from Clemsang (many thanks!)
group_by
and divide ID
by 3summarise
and collapse
(bring rows to one row by group)group_split
to split the groups (returns a list)bind_cols
to get a dataframelibrary(dplyr)
df %>%
group_by(Font, ceiling(ID/3)) %>%
summarise(Name = paste0(Name, collapse = " ")) %>%
group_split(Font) %>%
bind_cols() %>%
select(Name = Name...3, Designation=Name...6)
Name Designation
<chr> <chr>
1 Chris J Kemp President, CEO & Director
2 Bad D King Best, Teacher & Friend
Upvotes: 1
Reputation: 3429
Here is a solution with dplyr
:
library(dplyr)
df %>%
group_by(Font, fontnum) %>%
summarize(Tmp = paste(Name, collapse = " ")) %>%
mutate(ID = fontnum %/% 2) %>%
pivot_wider(id_cols = ID, names_from = Font, values_from = Tmp) %>%
transmute(Name = `Font A`, Designation = `Font B`)
where
df <- data.frame(
ID = 1:12,
Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director",
"Bad", "D", "King", "Best,", "Teacher & ", "Friend"),
Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B",
"Font A", "Font A", "Font A", "Font B", "Font B", "Font B")
)
and
df$fontnum <- cumsum(c(0, abs(diff(as.numeric(factor(df$Font))))))
And the result will be
# A tibble: 2 × 2
Name Designation
<chr> <chr>
1 Chris J Kemp President, CEO & Director
2 Bad D King Best, Teacher & Friend
Upvotes: 1
Reputation: 388817
You can try -
library(dplyr)
library(tidyr)
df %>%
mutate(Font = recode(Font, 'Font A' = 'Name', 'Font B' = 'Designation'),
ID = data.table::rleid(Font)) %>%
group_by(ID, Font) %>%
summarise(Name = toString(Name), .groups = 'drop') %>%
mutate(ID = ceiling(ID/2)) %>%
pivot_wider(names_from = Font, values_from = Name) %>%
select(-ID)
# Name Designation
# <chr> <chr>
#1 Chris, J, Kemp President,, CEO & , Director
#2 Ashton, K, Christian Analyst
Upvotes: 2
Reputation: 5481
Data:
df <- data.frame(
ID = c(1:12),
Name = c("Chris", "J", "Kemp", "President,", "CEO & ", "Director",
"Bad", "D", "King", "Best,", "Teacher & ", "Friend"),
Font = c("Font A", "Font A", "Font A", "Font B", "Font B", "Font B",
"Font A", "Font A", "Font A", "Font B", "Font B", "Font B")
)
You can do:
df$group <- cumsum(c(TRUE, df$Font[-1] != df$Font[-length(df$Font)]))
final_df <- as.data.frame(matrix(lapply(split(df$Name, df$group), paste, collapse = " "), ncol = 2))
colnames(final_df) <- c("Name", "Designation")
A grouping row of consecutif font is created. Then split
allows to have a list per font then you can reformat data using paste
.
Output:
Name Designation
1 Chris J Kemp Bad D King
2 President, CEO & Director Best, Teacher & Friend
Upvotes: 2
Reputation: 1202
How about this method:
df_final = as.data.frame(matrix(unlist(lapply(unique(df$Font),function(i){paste(collapse = ' ', df[df$Font%in%i,"Name"])})), byrow=T,ncol = 2))
colnames(df_final)=c("names", "designation")
Let me know if it's ok
Upvotes: 1