Storm Wiston

Reputation: 79

R - reorder dataframe keeping first column

I have two data frames.

The first one (A) looks like this. The leftmost column shown is the row names, not a regular column:

                          GTEX-11DXY-0426-SM-5H12R   GTEX-11EQ8-0826-SM-5N9FG [...30]
ENSG00000223972.4                        0                        1
ENSG00000227232.4                      663                      802
ENSG00000243485.2                        0                        1
ENSG00000237613.2                        0                        0
ENSG00000268020.2                        0                        1
ENSG00000240361.1                        3                        0

It continues for 30 more columns in the same format.

I want to reorder its columns to follow the order of a column in another data frame, which looks like this:

> head(targets10)
# A tibble: 6 x 7
# Groups:   Group [1]
  Sample_Name Grupo_analisis body_site molecular_data_~ sex   Group

1 GTEX-11XUK~              3 Thyroid   RNA Seq (NGS)    fema~ ELI  
2 GTEX-R55G-~              3 Thyroid   RNA Seq (NGS)    fema~ ELI  
3 GTEX-PLZ4-~              3 Thyroid   RNA Seq (NGS)    fema~ ELI  
4 GTEX-14AS3~              3 Thyroid   RNA Seq (NGS)    fema~ ELI  
5 GTEX-14BMU~              3 Thyroid   Allele-Specific~ fema~ ELI  
6 GTEX-13QJC~              3 Thyroid   Allele-Specific~ fema~ ELI  
# ... with 1 more variable: ShortName <fct>

The Sample_Name column contains the same names as the column headers of data frame A.

I just want them in the same order, so that the first column of A corresponds to the first entry of targets10$Sample_Name, the second column to the second entry, and so on.

I tried the following:

library(data.table)
setDT(countdata)
setcolorder(countdata, as.character(coldata$Sample_Name))

This reorders the columns, but it removes the row names from the data frame, and I need to keep them.

Please help me.

Thank you so much.
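
For reference, setDT() drops row names by default because data.tables do not have them. A minimal sketch of one way around this is the keep.rownames argument of setDT(), which copies the row names into a regular column named "rn". The toy countdata and coldata below, with sample names s1 to s3, are made-up stand-ins for the real objects:

```r
library(data.table)

# Hypothetical toy data standing in for the real count table and sample sheet
countdata <- data.frame(s1 = c(0, 663), s2 = c(1, 802), s3 = c(5, 7),
                        row.names = c("ENSG00000223972.4", "ENSG00000227232.4"))
coldata <- data.frame(Sample_Name = c("s3", "s1", "s2"))

# keep.rownames = TRUE stores the old row names in a column called "rn"
setDT(countdata, keep.rownames = TRUE)

# Keep "rn" first, then the sample columns in the order given by coldata
setcolorder(countdata, c("rn", as.character(coldata$Sample_Name)))
print(countdata)
```

The gene IDs then live in the rn column rather than in row names, so downstream code that expects row names would need to convert back to a data.frame first.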

Upvotes: 0

Views: 48

Answers (2)

Matt

Reputation: 7413

You could run dput(dfB$Sample_Name), which prints the values of the Sample_Name column to your console. Then copy the output and do:

library(dplyr)
dfA <- dfA %>%
  select("GTEX-11XUK", "GTEX-R55G")  # ...followed by the remaining sample names pasted from dput()

Or a less hacky approach as noted by Gregor:

dfA <- dfA %>%
  select(all_of(dfB$Sample_Name))
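
As a small self-contained illustration of the all_of() approach, here is a sketch on made-up data (the sample names s1 to s3 and the gene labels are hypothetical). The as.character() call is an added precaution in case Sample_Name is stored as a factor rather than as character:

```r
library(dplyr)

# Hypothetical toy data: counts with gene ids as row names, plus a sample sheet
dfA <- data.frame(s1 = c(0, 663), s2 = c(1, 802), s3 = c(5, 7),
                  row.names = c("geneA", "geneB"))
dfB <- data.frame(Sample_Name = c("s3", "s1", "s2"))

# Select the columns of dfA in the order listed in dfB$Sample_Name;
# all_of() raises an error if any listed name is not a column of dfA
dfA_ordered <- dfA %>%
  select(all_of(as.character(dfB$Sample_Name)))

print(colnames(dfA_ordered))
print(rownames(dfA_ordered))  # check that the gene ids are still there
```

It is worth checking rownames() on the result as above, since whether row names survive can depend on the dplyr version and on whether the input is a plain data.frame or a tibble.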

Upvotes: 0

Bernhard

Reputation: 4427

Setting aside that your data are tibbles and that you plan to convert them to data.tables, this works with plain data.frames:

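# Toy data: ids in a column, measurements in columns c, d, a, b
A <- data.frame(id = LETTERS, c = rnorm(26), d = rnorm(26), a = 1:26, b = 26:1)
B <- data.frame(sample = c("a", "b", "c", "d"), ignore = rnorm(4))

# as.character() guards against B$sample being a factor (the default before R 4.0),
# which would index A by the factor's integer codes instead of by name
new.A <- cbind(id = A$id, A[, as.character(B$sample)])
head(new.A)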

Edit

I just realized the ids are not in a column but in the row names, which makes this approach even easier:

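# Toy data: ids stored as row names, as in the question
A <- data.frame(c = rnorm(26), d = rnorm(26), a = 1:26, b = 26:1)
rownames(A) <- LETTERS
B <- data.frame(sample = c("a", "b", "c", "d"), ignore = rnorm(4))

# Index the columns by name; row names are kept. as.character() guards against a factor column.
new.A <- A[, as.character(B$sample)]
head(new.A)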
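
With real data it may help to confirm that every sample name actually exists as a column before indexing, since a single mismatch makes the subsetting fail. A minimal base R sketch on made-up data (the gene labels g1 to g3 and the helper variable names wanted and missing_in_A are hypothetical):

```r
# Hypothetical toy data with ids as row names
A <- data.frame(c = 1:3, d = 4:6, a = 7:9, b = 10:12,
                row.names = c("g1", "g2", "g3"))
B <- data.frame(sample = c("a", "b", "c", "d"))

wanted <- as.character(B$sample)

# Names listed in B that have no matching column in A (should be empty)
missing_in_A <- setdiff(wanted, colnames(A))
stopifnot(length(missing_in_A) == 0)

# drop = FALSE keeps the result a data.frame even if only one column is selected
new.A <- A[, wanted, drop = FALSE]

# After reordering, the column names should match the sample order exactly
stopifnot(identical(colnames(new.A), wanted))
head(new.A)
```

If missing_in_A is not empty, printing it usually shows the cause quickly, for example names that differ by whitespace or by a suffix.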

Upvotes: 1
