Layla
Layla

Reputation: 5446

removing columns in csv format in R language

I have the following table in csv format:

I have the following gene information in a csv format table:

                    1       3       1       2     2         3
1415670_at  1   365.1   293.4   288.9   394.5   312     381.6
1415671_at  2   556.1   584.2   567.8   592.8   471.6   513.1
1415672_at  3   1048.3  763.1   1074.9  852.3   826.1   898.3
1415673_at  4   60.8    51.7    51.6    224     248.4   150.7
1415674_at  5   129.1   107.2   230.4   175.5   250.5   172.4

As you can see I got some columns labeled with 1,2 and 3. I have made a VB script that deletes the columns that are different from 1 and 2 in Excel. The question that I have is how I can do that using only R? So that my resulting table will be:

                    1   1       2          2        
1415670_at  1   365.1   293.4   394.5     312       
1415671_at  2   556.1   584.2   592.8   471.6   
1415672_at  3   1048.3  763.1   852.3   826.1   
1415673_at  4   60.8    51.7    224     248.4   
1415674_at  5   129.1   107.2   175.5   250.5   

By the way, this is only an example, I can have other columns labeled 4, 5 and 6, but I only want to keep those that are labeled 1 and 2

I have tried the solution posted, that is to use:

m<-read.csv("test1.csv")
smallerdat <- m[ grep("^X1$|^X2$|X1\\.|X2\\." , names(m) ) ]

where m is the table in csv format, but the results that I got is:

    X1  X1.1        X2      X2.2        
365.1   293.4   394.5     312       
556.1   584.2   592.8   471.6   
1048.3  763.1   852.3   826.1   
60.8    51.7    224     248.4   
129.1   107.2   175.5   250.5

So it is deleting the first two columns that I need those. How not to delete those columns? and also how to keep the original format, I mean only 1 and 2 in the header and not those Xs

Upvotes: 1

Views: 9551

Answers (3)

Ulquay
Ulquay

Reputation: 45

Just merge as many columns of your table as you like with a cbind();

newtable <- cbind( table[1], table[2], table[3], ..)

where table[1] is 1st column of table, and table[2] is 2nd etc.

As cbind() takes as many arguments as you like.

And then name columns if you like with;

colnames(newtable) <- list("First", "Second", "Third")

Upvotes: 2

IRTFM
IRTFM

Reputation: 263382

After reading data in to a dataframe named "indat"

 smallerdat <- indat[ grep("^X1|^X2", names(indat) )]

Will work if columns are no greater than 0-9. If they are greater in character values, you would get "11" or "21" in that net so might need something more complex:

 smallerdat <- indat[ grep("^X1$|^X2$|X1\\.|X2\\." , names(indat) ) ]

Upvotes: 2

Alberto Otero
Alberto Otero

Reputation: 143

Manolo,

You have to read the data into a data frame and remove the columns. The only thing that you have to take into account is that the column names could not be the same.

# Your data
data <- read.table(text = "1       3       1       2       2       3
                           365.1   293.4   288.9   394.5   312     381.6
                           556.1   584.2   567.8   592.8   471.6   513.1
                           1048.3  763.1   1074.9  852.3   826.1   898.3
                           60.8    51.7    51.6    224     248.4   150.7
                           129.1   107.2   230.4   175.5   250.5   172.4",
        header=TRUE, sep="", nrows=5)

after that if you run

print(data)

you will obtain

      X1    X3   X1.1    X2  X2.1  X3.1
1  365.1 293.4  288.9 394.5 312.0 381.6
2  556.1 584.2  567.8 592.8 471.6 513.1
3 1048.3 763.1 1074.9 852.3 826.1 898.3
4   60.8  51.7   51.6 224.0 248.4 150.7
5  129.1 107.2  230.4 175.5 250.5 172.4

if you remove the columns named X3 and X3.1 with

data <- data[, !(colnames(data) %in% c("X3","X3.1"))]

and execute

print(data)

you now will get

      X1   X1.1    X2  X2.1
1  365.1  288.9 394.5 312.0
2  556.1  567.8 592.8 471.6
3 1048.3 1074.9 852.3 826.1
4   60.8   51.6 224.0 248.4
5  129.1  230.4 175.5 250.5

Hope it helps! :)

Upvotes: 0

Related Questions