Aserian

Reputation: 1127

Merge columns with the same name in R

I'm fairly new to R. I'm working with a data set that is incredibly redundant, with a lot of columns (~400). Several column names are duplicated, but the data in those columns is not, so I need to sum the columns when collapsing them.

The columns all have a similar name that allows easy identification, so I'm hoping I can use that to my advantage.

I attempted to perform the following:

ColNames <- unique(colnames(df))
CombinedDf <- data.frame(sapply(ColNames, function(i) rowSums(df[, colnames(df) == i, drop = FALSE])))

This works if I sum over the range of columns that only contain integers, but the issue is that other columns have strings and such in them, so rowSums throws a fit.

Assuming that the identifier is "XXX", how can I aggregate all the columns that share a name while leaving the other columns as is?

Thank you for your time.

Edit: Sample data has been requested. I cannot share the exact data because it is sensitive, but here is an example:

Name    COL1XXX    COL2XXX    COL1XXX    COL3XXX    COL2XXX   Type
Henry   5          15         25         31         1         Orange
Tom     8          16         12         4          3         Green

Should return

Name    COL1XXX   COL2XXX   COL3XXX    Type
Henry   30        16        31         Orange
Tom     20        19        4          Green

Upvotes: 0

Views: 4481

Answers (1)

jtclaypool

Reputation: 190

I'm not really sure, but you may try transposing the data and then aggregating by unique names.

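# Transpose only the numeric "XXX" columns; t() on mixed types yields character data
is_xxx <- grepl("XXX", colnames(df))
t_df <- as.data.frame(t(df[, is_xxx]))

# Row names of a data frame must be unique, so group by the original column names
new_df <- aggregate(t_df, by = list(colnames(df)[is_xxx]), FUN = sum)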

Again, without sample data I'm unsure if it'll work, but based on what you said, that might work.
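
For the sample data in the question, a column-wise approach that avoids transposing might look like the sketch below. It assumes the columns to sum are exactly those whose names contain "XXX" and that they are numeric; for any other duplicated name it simply keeps the first column. The names `nms`, `cols`, `combine_cols`, `uniq`, `combined`, and `result` are illustrative, not from the question.

```r
# Sample data from the question; check.names = FALSE keeps the duplicate names.
df <- data.frame(
  Name    = c("Henry", "Tom"),
  COL1XXX = c(5, 8),
  COL2XXX = c(15, 16),
  COL1XXX = c(25, 12),
  COL3XXX = c(31, 4),
  COL2XXX = c(1, 3),
  Type    = c("Orange", "Green"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

nms  <- colnames(df)
cols <- as.list(df)  # plain list, so same-named columns can be picked by position

# Hypothetical helper: sum same-named "XXX" columns, otherwise keep the first match.
combine_cols <- function(nm) {
  idx <- which(nms == nm)
  if (grepl("XXX", nm, fixed = TRUE)) {
    Reduce(`+`, cols[idx])   # element-wise sum across the matching columns
  } else {
    cols[[idx[1]]]           # non-"XXX" columns are passed through unchanged
  }
}

uniq <- unique(nms)          # keeps first-appearance order of the column names
combined <- lapply(uniq, combine_cols)
names(combined) <- uniq
result <- data.frame(combined, check.names = FALSE, stringsAsFactors = FALSE)
print(result)
```

On the sample rows this should give the table the question asks for: Name, COL1XXX, COL2XXX, COL3XXX, Type, with the duplicated columns summed. If the real data has missing values in the "XXX" columns, the `+` would propagate `NA`, so that case would need separate handling.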

Upvotes: 1
