user5054

Reputation: 1096

R: How to set the row names to the last column in read.table()

I usually use dat = read.table(filename, row.names=1) when the row names are in the first column. What is the corresponding call when the row names are in the last column of the file? I tried dat = read.table(filename, row.names=ncol(dat)), but this fails because the dat variable does not exist yet when the call is evaluated.

Upvotes: 1

Views: 1468

Answers (3)

Tim Biegeleisen

Reputation: 522762

I would personally just cut and paste the header row into the correct position at the top. Then, speak with your data pipeline folks about why headers are appearing at the bottom of the file. If you want an R solution to this, I can offer the following code.

I don't think there is a nice way to do this using read.table or read.csv. These functions were designed with the header being on the top of the file. You could try the following:

library(readr)

# "file" is the path to the input file; read every line at once
lines <- read_lines(file)
lines <- lines[lines != ""]

# split each line on single spaces and bind the pieces into rows
rows <- strsplit(lines, " ")
df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# take the very last line, which holds the header names
cols <- rows[[length(rows)]]

# remove the last row, which is really just the header names
df <- head(df, -1)

# now assign the names
names(df) <- cols

Upvotes: 1

akrun

Reputation: 887951

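A base R option would be to use count.fields. It returns the number of fields on each line of the file, so its first element gives the number of columns, which is also the index of the last column: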

read.table('filename.csv', sep=",", 
       row.names = count.fields('filename.csv', sep=",")[1], header = TRUE)
#   col1 col2
#C    1    A
#F    2    B
#D    3    C
#G    4    D

data

df1 <- data.frame(col1 = 1:4, col2 = LETTERS[1:4],
           col3 = c('C', 'F', 'D', 'G'), stringsAsFactors=FALSE)
write.csv(df1, 'filename.csv', quote = FALSE, row.names = FALSE)
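
The count.fields idea above can be wrapped in a small helper so the file path is only written once. This is a minimal sketch, not part of base R: the function name read_last_col_rownames and the temporary demo file are assumptions made for illustration, and it assumes the first line of the file has the same number of fields as the data lines.

```r
# Hypothetical helper (not part of base R): read a delimited file whose
# row names are stored in the last column.
read_last_col_rownames <- function(path, sep = ",", header = TRUE, ...) {
  # count.fields() returns the number of fields on each line; the count
  # for the first line is taken as the number of columns in the file.
  n_cols <- count.fields(path, sep = sep)[1]
  read.table(path, sep = sep, header = header, row.names = n_cols, ...)
}

# Self-contained demonstration on a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("col1,col2,col3", "1,A,C", "2,B,F", "3,C,D", "4,D,G"), tmp)

dat <- read_last_col_rownames(tmp)
print(rownames(dat))  # values taken from the last column
print(names(dat))     # remaining columns
```

Extra arguments such as stringsAsFactors are passed through to read.table via the dots.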

Upvotes: 2

Kevin Arseneau

Reputation: 6264

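The tibble package has a function for this, column_to_rownames:

# header = TRUE is needed so the column can be referred to by name;
# "target" stands for the name of the column that holds the row names
dat <- read.table(filename, header = TRUE)
dat <- tibble::column_to_rownames(dat, var = "target")

The benefit is that the column is selected by name, so the order of the columns in your source file matters less.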
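
For comparison, the same two-step idea (read first, then promote a column to row names) can be sketched in base R by position rather than by name. This is only an illustrative sketch: the temporary file and its contents are made up for the demonstration, and it assumes the values in the last column are unique, since row names must be.

```r
# Demo file without a header; the last field of each line is the row name
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 A C", "2 B F", "3 C D", "4 D G"), tmp)

# Step 1: read the file normally (columns are named V1, V2, V3)
dat <- read.table(tmp)

# Step 2: dat exists now, so ncol(dat) can be used to find the last column
last <- ncol(dat)
rownames(dat) <- as.character(dat[[last]])

# Step 3: drop the column that was promoted to row names
dat <- dat[, -last, drop = FALSE]

print(dat)
```

Using drop = FALSE keeps the result a data frame even when only one data column remains.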

Upvotes: 1
