Rebecca

Reputation: 25

cbind multiple, individual columns in a single data frame using column numbers

I have a single data frame with 100 columns and 25 rows. I would like to cbind different groupings of columns (sometimes as many as 30) into several new data frames without having to type out each column name every time. Some of the columns I want stand alone, e.g. 6 and 72, and some are adjacent, e.g. columns 23, 24, 25, 26 (23:26).

Usually I would use:

z <- cbind(visco$fish, visco$bird)

for example, but I have too many columns, and need to create too many new data frames, to type the name of every column I need each time. I generally do not attach my data.

I would like to use column numbers, something like:

z <- cbind(6 , 72 , 23:26, data=visco) 

and also retain the original column names, not the automatically generated V1, V2. I have tried adding deparse.level=2, but my column names then become "visco$fish" rather than the original "fish".

I feel there should be a simple answer to this, but so far I have failed to find anything that works as I would like.

Upvotes: 1

Views: 35010

Answers (5)

Surin Voravipapan

Reputation: 1

Try this: z <- visco[c(6, 72, 23:26)]
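To make the suggestion above concrete, here is a minimal sketch on a made-up stand-in for visco. The real data is not shown in the question, so the column names (var1 to var100) and the random values are assumptions for illustration only. Indexing a data frame with a single bracket and a vector of positions returns a data frame and carries the original column names over.

```r
# Hypothetical stand-in for `visco`: 25 rows, 100 numeric columns.
set.seed(1)
visco <- as.data.frame(matrix(runif(25 * 100), nrow = 25, ncol = 100))
names(visco) <- paste0("var", 1:100)  # made-up column names

# Mix single positions and ranges in one index vector.
z <- visco[c(6, 72, 23:26)]

names(z)  # "var6" "var72" "var23" "var24" "var25" "var26"
dim(z)    # 25 rows, 6 columns
```

The columns come back in the order the positions were listed, so 6 and 72 precede 23:26 in z.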

Upvotes: 0

RIGAN KHAN

Reputation: 1

In R we have vectors and matrices. You can create your own vectors with the function c.

c(1,5,3,4)

They are also the output of many functions such as

rnorm(10)

You can turn vectors into matrices using functions such as rbind, cbind or matrix.

Create the matrix from the vector 1:1000 like this:

X = matrix(1:1000,100,10)

What is the entry in row 25, column 3?

Upvotes: -1

user5249203

Reputation: 4648

As I understand your question, you want to subset a large data frame into smaller ones, which can be done in several ways. One way is the data.table package, which lets you subset by column index while retaining the column names.

If your data is in a data frame, you can just do

DT <- data.table(df)
# You still have to define the subsets of columns you need

sub_1 <- c(2, 3)
sub_2 <- c(2:5, 9)
sub_3 <- c(1:2, 5:6, 10)

DT[, sub_2, with = FALSE]

Output

        bird       cat        dog       rat        car
1: 0.2682538 0.1386834 0.01633384 0.5336649 0.43432878
2: 0.2418727 0.7530654 0.26999873 0.2679446 0.00859734
3: 0.1211858 0.2563736 0.92637523 0.8572615 0.63165705
4: 0.4556401 0.2343427 0.09324584 0.8731174 0.50098461
5: 0.1646126 0.9258622 0.86957980 0.3636781 0.89608415

Data

library(data.table)
# 5 rows x 10 columns of random values
DT <- data.table(matrix(runif(5 * 10), 5, 10))
colnames(DT) <- c("fish", "bird", "cat", "dog", "rat", "tiger", "insect", "boat", "car", "cycle")
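Since the question involves building many subsets, the index vectors sub_1, sub_2 and sub_3 could also be kept in one named list and applied in a single pass. The following is only a sketch in base R rather than data.table (so it runs without extra packages); the data frame and the names subsets and pieces are made up for illustration.

```r
# Made-up data reusing the column names from this answer (values are random).
set.seed(42)
df <- as.data.frame(matrix(runif(5 * 10), nrow = 5, ncol = 10))
names(df) <- c("fish", "bird", "cat", "dog", "rat",
               "tiger", "insect", "boat", "car", "cycle")

# One entry per desired subset; each entry is a vector of column positions.
subsets <- list(
  sub_1 = c(2, 3),
  sub_2 = c(2:5, 9),
  sub_3 = c(1:2, 5:6, 10)
)

# Build every subset in one pass; the result is a named list of data frames.
pieces <- lapply(subsets, function(idx) df[idx])

names(pieces$sub_2)  # "bird" "cat" "dog" "rat" "car"
```

Each element of pieces is an ordinary data frame, so pieces$sub_1 or pieces[["sub_3"]] can be passed on to later analysis steps directly.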

Upvotes: 0

Sowmya S. Manian

Reputation: 3833

 df <- data.frame(AA = 11:15, BB = 2:6, CC = 12:16, DD = 3:7, EE = 23:27)
 df
 #   AA BB CC DD EE
 # 1 11  2 12  3 23
 # 2 12  3 13  4 24
 # 3 13  4 14  5 25
 # 4 14  5 15  6 26
 # 5 15  6 16  7 27

 df1 <- data.frame(cbind(df,df,df,df))
 df1
 #   AA BB CC DD EE AA.1 BB.1 CC.1 DD.1 EE.1 AA.2 BB.2 CC.2 DD.2 EE.2 AA.3 BB.3
 # 1 11  2 12  3 23   11    2   12    3   23   11    2   12    3   23   11    2
 # 2 12  3 13  4 24   12    3   13    4   24   12    3   13    4   24   12    3
 # 3 13  4 14  5 25   13    4   14    5   25   13    4   14    5   25   13    4
 # 4 14  5 15  6 26   14    5   15    6   26   14    5   15    6   26   14    5
 # 5 15  6 16  7 27   15    6   16    7   27   15    6   16    7   27   15    6

 # CC.3 DD.3 EE.3
 # 1   12    3   23
 # 2   13    4   24
 # 3   14    5   25
 # 4   15    6   26
 # 5   16    7   27


 Result <- df1[, c(1:5, 14:17, 20)]
 Result
 #   AA BB CC DD EE DD.2 EE.2 AA.3 BB.3 EE.3
 # 1 11  2 12  3 23    3   23   11    2   23
 # 2 12  3 13  4 24    4   24   12    3   24
 # 3 13  4 14  5 25    5   25   13    4   25
 # 4 14  5 15  6 26    6   26   14    5   26
 # 5 15  6 16  7 27    7   27   15    6   27

Note: when column names are repeated, data.frame() makes them unique by appending .1, .2, and so on to each later occurrence.
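The renaming can be seen in isolation with a small sketch. The two-column data frame and the variable names below are made up for illustration; the suffixing itself follows the same pattern as make.unique(), and it is applied when the combined result is passed through data.frame() with its default check.names = TRUE.

```r
# make.unique() shows the suffixing pattern applied to repeated names.
suffixed <- make.unique(c("AA", "BB", "AA", "AA"))
suffixed  # "AA" "BB" "AA.1" "AA.2"

# A tiny made-up data frame bound to itself.
small <- data.frame(AA = 1:2, BB = 3:4)

# cbind() on data frames keeps the duplicated names as they are...
raw_names <- names(cbind(small, small))
raw_names  # "AA" "BB" "AA" "BB"

# ...while wrapping the result in data.frame() makes them unique.
fixed_names <- names(data.frame(cbind(small, small)))
fixed_names  # "AA" "BB" "AA.1" "BB.1"
```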

Upvotes: 2

Rory Shaw

Reputation: 851

Here's an example of how to do this using the select function from dplyr, which should be your go-to package for this type of data wrangling:

> library(dplyr)
> df <- head(iris)
> df
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> 
> ## select by variable name
> newdf <- df %>% select(Sepal.Length, Sepal.Width, Species)
> newdf
  Sepal.Length Sepal.Width Species
1          5.1         3.5  setosa
2          4.9         3.0  setosa
3          4.7         3.2  setosa
4          4.6         3.1  setosa
5          5.0         3.6  setosa
6          5.4         3.9  setosa

> ## select by variable indices
> newdf <- df %>% select(1:2, 5)
> newdf
  Sepal.Length Sepal.Width Species
1          5.1         3.5  setosa
2          4.9         3.0  setosa
3          4.7         3.2  setosa
4          4.6         3.1  setosa
5          5.0         3.6  setosa
6          5.4         3.9  setosa

However, I'm not sure why you would need to do this. Can you not run your analyses on the original data frame?

Upvotes: 0
