b222

Reputation: 976

Attempting to drop numerous columns in R

I have a list of column names in a CSV file (sds.drop.csv) that I want to drop from a data frame already imported into R.

I attempted to read the column names into R as follows:

sds.drop <- as.list(read.csv("sds.drop.csv", header = F))

I then intended to run the code below to drop them from the data frame called 'dat':

dat1 <-  dat[, !(names(dat) %in% sds.drop)]

However, no columns are dropped. I'm guessing the issue resides in the way I am reading in the data. I attempted to read the data in without the as.list() command and it still did not work. Any thoughts?

Here is what the sds.drop.csv df looks like...

> head(read.csv("sds.drop.csv", header = F))
   V1
1 Q20
2 Q23
3 Q24
4 Q25
5 Q26
6 Q27

Upvotes: 0

Views: 142

Answers (3)

micstr

Reputation: 5206

Try:

# for example (you are reading this in from Excel)
dat <- data.frame("goodcol" = c(1,2), "badcol1"= c(-1,-2),
                  "badcol2"= c(-2,-4), "goodcol2" = c(2,4))

sds.drop <- c("badcol1", "badcol2")

dat <- dat[, !(names(dat) %in% sds.drop)]

You may have been missing a bracket, and you need to refer to the data frame in names().

EDIT: For those who search later, this is similar to the question "Drop data frame columns by name".

Upvotes: 0

Amrita Sawant

Reputation: 10913

    #keep only the headers from the csv
    cols <- colnames(read.csv('C:/myfile.csv', colClasses = 'character', nrows = 1, header = TRUE))

    #subset dataframe excluding colnames in cols
    df2 <- yourdataframe[!names(yourdataframe) %in% cols]
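Note that this approach assumes the names to drop sit in the header row of the CSV, which is not how the file in the question is laid out (there they are in a single column). A minimal sketch of the header-row case, using inline text as a hypothetical stand-in for the file and made-up column names:

```r
# Hypothetical CSV whose header row holds the names to drop (assumption, not the asker's file)
csv_text <- "Q20,Q23\n1,2\n"

# Keep only the header names from the CSV
cols <- colnames(read.csv(text = csv_text, colClasses = "character",
                          nrows = 1, header = TRUE))

# Made-up data frame standing in for 'yourdataframe'
yourdataframe <- data.frame(Q20 = 1:2, Q23 = 3:4, Q30 = 5:6)

# List-style indexing (no comma) always returns a data frame
df2 <- yourdataframe[!names(yourdataframe) %in% cols]
names(df2)
```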

Upvotes: 1

bgoldst

Reputation: 35314

I think your main problem is that you're not extracting the column of column names from the data.frame that read.csv() returns. It also makes sense to coerce to character, since that's what you need to operate on. Thus, you should assign sds.drop from as.character(read.csv('sds.drop.csv',header=F)[,1]):

dat <- data.frame(Q20=1:3, Q23=4:6, Q24=7:9, Q25=10:12, Q26=13:15, Q27=16:18, Q30=19:21, Q31=22:24)
dat
##   Q20 Q23 Q24 Q25 Q26 Q27 Q30 Q31
## 1   1   4   7  10  13  16  19  22
## 2   2   5   8  11  14  17  20  23
## 3   3   6   9  12  15  18  21  24
sds.drop <- as.character(read.csv('sds.drop.csv', header=F)[,1])
sds.drop
## [1] "Q20" "Q23" "Q24" "Q25" "Q26" "Q27"
dat[, !names(dat) %in% sds.drop]
##   Q30 Q31
## 1  19  22
## 2  20  23
## 3  21  24
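
To see why the original attempt dropped nothing, here is a rough sketch that compares matching against a list with matching against a character vector. It reads inline text instead of sds.drop.csv so it runs on its own; the shortened data and the helper names (drop_df, matches_list, matches_vec) are assumptions for illustration only.

```r
# Inline stand-in for sds.drop.csv: one name per row, no header
drop_df <- read.csv(text = "Q20\nQ23\nQ24", header = FALSE)

dat <- data.frame(Q20 = 1:3, Q23 = 4:6, Q24 = 7:9, Q30 = 10:12, Q31 = 13:15)

# With a list (or data frame) on the right-hand side, %in% treats each
# whole column as a single value, so no individual column name matches
matches_list <- names(dat) %in% as.list(drop_df)

# Extracting the column as a character vector gives the intended comparison
drop_names <- as.character(drop_df[[1]])
matches_vec <- names(dat) %in% drop_names

# drop = FALSE keeps the result a data frame even if one column remains
dat1 <- dat[, !matches_vec, drop = FALSE]
names(dat1)
```

The same thing happens without as.list(), because a data frame is itself a list of columns, which would explain why removing as.list() did not help.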

Upvotes: 1
