split a dataframe to multiple tables based on conditional search

Question

I have a table like:

ID A10 A11 A12 A20 A21 A22 A30 A31 A32
Apple 1 1 1 1 1 1 1 1 1
Banana 2 2 2 2 2 2 2 2 2
Pear 3 3 3 3 3 3 3 3 3

My first question is, can I use something like "*" to subset multiple column? Like if I want to subset columns containing A2 (A20, A21, A22), how can I do it quickly?

My second question is how can I split this dataframe to multiple tables? I would like to split this dataframe base on "A1" "A2" and "A3" (related with my first question), which will be:

df1<-
ID A10 A11 A12
Apple 1 1 1
Banana 2 2 2
Pear 3 3 3

df2<-
ID A20 A21 A22
Apple 1 1 1
Banana 2 2 2
Pear 3 3 3

df3<-
ID A30 A31 A32
Apple 1 1 1
Banana 2 2 2
Pear 3 3 3

How can I do that?

For another case, I have a dataframe that the character length of the column names are different, like:

ID BSS20 BSS40 BSS60 W01 W02 W03 BW20 BW40 BW60 SS20 SS40 SS60
Apple 1 1 1 1 1 1 1 1 1
Banana 2 2 2 2 2 2 2 2 2
Pear 3 3 3 3 3 3 3 3 3

How can I split the data into three new dataframes, each one must contain W0, like:

df1<-
ID BSS20 BSS40 BSS60 W01 W02 W03
Apple 1 1 1 1 1 1
Banana 2 2 2 2 2 2
Pear 3 3 3 3 3 3

df2<-
ID BW20 BW40 BW60 W01 W02 W03
Apple 1 1 1 1 1 1
Banana 2 2 2 2 2 2
Pear 3 3 3 3 3 3

df3<-
ID SS20 SS40 SS60 W01 W02 W03
Apple 1 1 1 1 1 1
Banana 2 2 2 2 2 2
Pear 3 3 3 3 3 3

My columns are composed of *20 *40 *60 and three special characters W01 W02 W03, how can I express them well in r?

Ronak Shah · Accepted Answer

Yes, you can split the dataframe to multiple tables using split.default. In this case we can extract first two characters from the column names to split them

split.default(df[-1], substr(names(df)[-1], 1, 2))

#$A1
#  A10 A11 A12
#1   1   1   1
#2   2   2   2
#3   3   3   3

#$A2
#  A20 A21 A22
#1   1   1   1
#2   2   2   2
#3   3   3   3

#$A3
#  A30 A31 A32
#1   1   1   1
#2   2   2   2
#3   3   3   3

If you want to add first column to each of the list you can cbind it to every list

lapply(split.default(df[-1], substr(names(df)[-1], 1, 2)), cbind, df[1])

EDIT

For the updated data we can create two groups, common columns(common_cols) and different columns (diff_cols)

common_cols <-c("ID", "W01", "W02", "W03")
diff_cols <- setdiff(names(df), common_cols)

then do the split only on diff_cols and cbind the common_cols to each list.

lapply(split.default(df[diff_cols], substr(names(df[diff_cols]), 1, 2)), 
        cbind, df[common_cols])

#$BS
#  BSS20 BSS40 BSS60     ID W01 W02 W03
#1     1     1     1  Apple   1   1   1
#2     2     2     2 Banana   2   2   2
#3     3     3     3   Pear   3   3   3

#$BW
#  BW20 BW40 BW60     ID W01 W02 W03
#1    1    1    1  Apple   1   1   1
#2    2    2    2 Banana   2   2   2
#3    3    3    3   Pear   3   3   3

#$SS
#  SS20 SS40 SS60     ID W01 W02 W03
#1    1    1    1  Apple   1   1   1
#2    2    2    2 Banana   2   2   2
#3    3    3    3   Pear   3   3   3

split a dataframe to multiple tables based on conditional search

Answers (2)

Related Questions