Math. H

Reputation: 37

Create matrix of features using regex?

Suppose I have a data frame with 101 variables. I pick one, Y, as the dependent variable, and the remaining 100, X_1, X_2, ..., X_{100}, as the independent variables.

Now I would like to create a matrix containing the 100 independent variables. Is there a direct way to do this? For example, when I fit a linear regression model I can just use "." as a kind of wildcard for all the other columns, i.e. lm(Y ~ ., _____)

Upvotes: 0

Views: 110

Answers (1)

Artem

Reputation: 3414

You can use the grep function to pick out the column names of the data frame that belong to the independent variables, and then convert that subset of columns to a matrix. Please see the code below:

# simulation of the data frame with 100 measurements and 101 variables

n <- 100
# parentheses matter: 1:101 * n yields only 101 values, i.e. a single row
df <- data.frame(matrix(1:(101 * n), ncol = 101))
names(df) <- c(paste0("X_", 1:100), "Y")

# extract matrix of Xs
m_x <- as.matrix(df[, grep("^X", names(df))])
dimnames(m_x)

Output:

[[1]]
NULL

[[2]]
  [1] "X_1"   "X_2"   "X_3"   "X_4"   "X_5"   "X_6"   "X_7"   "X_8"   "X_9"   "X_10"  "X_11"  "X_12"  "X_13"  "X_14"  "X_15" 
 [16] "X_16"  "X_17"  "X_18"  "X_19"  "X_20"  "X_21"  "X_22"  "X_23"  "X_24"  "X_25"  "X_26"  "X_27"  "X_28"  "X_29"  "X_30" 
 [31] "X_31"  "X_32"  "X_33"  "X_34"  "X_35"  "X_36"  "X_37"  "X_38"  "X_39"  "X_40"  "X_41"  "X_42"  "X_43"  "X_44"  "X_45" 
 [46] "X_46"  "X_47"  "X_48"  "X_49"  "X_50"  "X_51"  "X_52"  "X_53"  "X_54"  "X_55"  "X_56"  "X_57"  "X_58"  "X_59"  "X_60" 
 [61] "X_61"  "X_62"  "X_63"  "X_64"  "X_65"  "X_66"  "X_67"  "X_68"  "X_69"  "X_70"  "X_71"  "X_72"  "X_73"  "X_74"  "X_75" 
 [76] "X_76"  "X_77"  "X_78"  "X_79"  "X_80"  "X_81"  "X_82"  "X_83"  "X_84"  "X_85"  "X_86"  "X_87"  "X_88"  "X_89"  "X_90" 
 [91] "X_91"  "X_92"  "X_93"  "X_94"  "X_95"  "X_96"  "X_97"  "X_98"  "X_99"  "X_100"
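
Since the question mentions the "." shorthand from lm(), here is a minimal sketch of two base R alternatives that do not need a regular expression. It assumes the same column naming as above (X_1 to X_100 plus Y), uses simulated rnorm data purely for illustration, and assumes all predictors are numeric. Note that "." in a formula is not a regex; it stands for "all columns of data other than the response".

```r
# Hypothetical toy data: 100 rows, predictors X_1..X_100 and response Y
set.seed(1)
n <- 100
df <- data.frame(matrix(rnorm(101 * n), ncol = 101))
names(df) <- c(paste0("X_", 1:100), "Y")

# Option 1: drop the response by name, keep everything else
m_x <- as.matrix(df[, setdiff(names(df), "Y")])

# Option 2: formula interface, where "." means "every column except Y";
# "- 1" removes the intercept column that model.matrix() adds by default
m_x2 <- model.matrix(Y ~ . - 1, data = df)

dim(m_x)
dim(m_x2)
```

With purely numeric predictors both options should hold the same values. If the data frame contained factor columns, model.matrix() would expand them into dummy columns, so the two results would then differ.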

Upvotes: 0
