user938301

Reputation: 293

Selecting non-consecutive columns in R tables

Let's say I have some table, T. Assume T has 5 columns. I understand how to select any consecutive subset of columns and store them as a new table. For that I would use brackets, with a colon range to the right of the comma:

newT <- T[,2:4]   # creates newT from columns 2 through 4 in T

But how do I select non-consecutive columns for subsetting? Let's say I want to select Column 1 and Column 3. How do I go about doing this? Another type of selection I may want to do, and am not sure how to, is selecting random columns from T.

Upvotes: 28

Views: 76183

Answers (4)

Luke

Reputation: 1673

You can also use logical values. E.g., for a data frame with three columns, df[c(TRUE,FALSE,TRUE)] selects the first and third columns. The logical vector should have a number of elements equal to the number of columns in the data frame; otherwise its elements are recycled up to the number of columns.
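To make the recycling concrete, here is a minimal sketch. The data frame and its column names (a to e) are made up for illustration; they only mirror the 5-column table from the question.

```r
# Hypothetical 5-column data frame, mirroring the table in the question
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15)

# Full-length logical vector: keep columns 1 and 3 only
keep <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
first_third <- df[, keep]

# A shorter vector is recycled: c(TRUE, FALSE) is stretched to
# TRUE, FALSE, TRUE, FALSE, TRUE, so columns 1, 3 and 5 are kept
recycled <- df[, c(TRUE, FALSE)]

# Logical vectors are often built from a test on the column names
by_name <- df[, names(df) %in% c("a", "c")]
```

Because of the recycling, a three-element vector such as c(TRUE,FALSE,TRUE) applied to this 5-column data frame would also pick up column 4, so spelling out the full-length vector is the safer habit.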

Upvotes: 0

Grau

Reputation: 21

For random columns, check out ?sample:

df <- data.frame(matrix(runif(25), 5))
df
#         X1        X2         X3         X4        X5
#1 0.7973941 0.6142358 0.07211461 0.01478683 0.6623704
#2 0.8992845 0.8347466 0.54495115 0.52242817 0.4944838
#3 0.8695551 0.9228987 0.00838420 0.58049324 0.9256282
#4 0.1559048 0.7116077 0.08964883 0.06799828 0.3752833
#5 0.2179599 0.4533054 0.60817319 0.62235228 0.8357441

df[ ,sample(names(df), 3)]
#         X5         X3        X2
#1 0.6623704 0.07211461 0.6142358
#2 0.4944838 0.54495115 0.8347466
#3 0.9256282 0.00838420 0.9228987
#4 0.3752833 0.08964883 0.7116077
#5 0.8357441 0.60817319 0.4533054
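A variation on the same idea, as a sketch: sample column positions rather than names, and fix the seed so the draw can be repeated. The seed value is arbitrary, and the names picked and random_cols are just illustrative.

```r
set.seed(42)  # arbitrary seed, only so the draw is repeatable

df <- data.frame(matrix(runif(25), 5))

# Draw 3 distinct column positions out of 1..ncol(df), without replacement
picked <- sample(ncol(df), 3)

# sort() restores the original left-to-right column order;
# drop = FALSE keeps the result a data frame even when a single column is drawn
random_cols <- df[, sort(picked), drop = FALSE]
```

Leave out sort() if the shuffled column order is wanted, as in the output above.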

Upvotes: 2

ATMathew

Reputation: 12856

If I understand your question correctly, you should try something similar to the following:

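df1 <- data.frame(state = c("KS", "CO", "CA", "FL", "CA"), value = c(1, 2, 3, 7, 9))
df1

# The index vector sits before the comma, so both calls select rows 1, 3, 4 and 5
df1[c(c(1, 3), 4:5), ]   # nested c() is flattened
df1[c(1, 3, 4:5), ]      # same result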
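Note that the position of the index vector relative to the comma decides whether rows or columns are picked. A short sketch with a hypothetical four-column data frame (df2 and its columns are invented here so there are enough columns to subset):

```r
# Hypothetical data frame with enough columns to subset
df2 <- data.frame(state = c("KS", "CO", "CA"),
                  value = c(1, 2, 3),
                  year  = c(2010, 2011, 2012),
                  rank  = c(3, 2, 1))

rows_1_3  <- df2[c(1, 3), ]   # before the comma: rows 1 and 3, all columns
cols_1_3  <- df2[, c(1, 3)]   # after the comma: all rows, columns 1 and 3
cols_list <- df2[c(1, 3)]     # no comma: list-style indexing, also columns 1 and 3
```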

Upvotes: 4

Tommy

Reputation: 40821

Simply generate the indices you want first. The c function allows you to concatenate values. The values can be either column indices or column names (but not a mix of the two).

df <- data.frame(matrix(runif(100), 10))
cols <- c(1, 4:8, 10)
df[,cols]

You can also select which column indices to remove by specifying a negative index:

df[, -c(3, 5)] # all but the third and fifth columns
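A few more variations, as a sketch: selecting the same columns by name, what happens when indices and names are mixed, and one way to drop columns by name (negative indices only work with positions). The column names X1 to X10 are the defaults data.frame assigns to the unnamed matrix columns.

```r
df <- data.frame(matrix(runif(100), 10))  # columns X1 ... X10

# Columns 1 and 3 by position, then the same columns by name
by_index <- df[, c(1, 3)]
by_name  <- df[, c("X1", "X3")]

# Mixing types does not do what it looks like: c() coerces everything
# to character, so 1 becomes "1", which is not a column name here
mixed <- c(1, "X3")

# To drop columns by name, one option is to negate a logical test
# on names(df); this matches df[, -c(3, 5)]
without_3_5 <- df[, !(names(df) %in% c("X3", "X5"))]
```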

Upvotes: 54
