Reputation: 133
I have a data frame in which some column names start with numbers and others are plain strings. I want to subset the columns whose names start with a number followed by a dot.
The code below works for this sample, but in my actual data frame the column AA ID also gets selected, and I don't know why.
library(dplyr)

df <- data.frame(`AA ID` = c(1,2,3,4,5,6,7,8,9,10),
                 "BB" = c("AMK","KAMl","HAJ","NHS","KUL","GAF","BGA","NHU","VGY","NHU"),
                 "CC" = c("TAMAN","GHUSI","KELVIN","DEREK","LOKU","MNDHUL","JASMIN","BINNY","BURTAM","DAVID"),
                 "DD" = c(62,41,37,41,32,74,52,75,59,36),
                 "EE" = c("CA","NY","GA","DE","MN","LA","GA","VA","TM","BA"),
                 "FF" = c("ENGLISH","FRENCH","ENGLISH","FRENCH","ENGLISH","ENGLISH","SPANISH","ENGLISH","SPANISH","RUSSIAN"),
                 "GG" = c(33,44,51,51,37,58,24,67,41,75),
                 `1A` = c("","D","","NA","","D","","","D",""),
                 `2B` = c("","A","","","A","A","A","A","",""),
                 `3C` = c("","","","","","","","","",""),
                 `4D` = c("","G","G","G","G","G","G","G","",""),
                 "Concatenate" = c("","DAG","G","NAG","AG","DAG","AG","AG","D",""))
# data.frame() has already turned `1A` into X1A, etc., so rename from those names
df <- df %>% rename(`1. A` = "X1A", `1. B` = "X2B", `1. C` = "X3C", `1. D` = "X4D")
# keep the columns whose names start with digits followed by a dot
Error_summary <- select(df, matches("^[0-9]*\\."))
I am also trying to count the non-empty values in these columns, like this:
df_row <- df %>%
  summarize(across(c(matches("^[0-9]*\\."), Concatenate), ~ sum(!is.na(.) & . != "" & . != "NA")))
But this also selects the column AA ID, which I don't want.
Upvotes: 1
Views: 1719
Reputation: 10996
Because data.frame() converts column names that start with a number into valid names prefixed with X (for example, X1..A), you can match on that prefix instead:
library(tidyverse)
df %>%
select(matches("^X[0-9]"))
which gives:
   X1..A X2..B X3..C X4..D
1
2      D     A           G
3                        G
4     NA                 G
5            A           G
6      D     A           G
7            A           G
8            A           G
9      D
10
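The name conversion behind this can be checked with base R alone. The sketch below uses a few illustrative column names (not your full data) to show what make.names() does to them, which pattern matches which set of names, and that check.names = FALSE keeps the original names. That last option is an assumption about what might suit your data and is not part of the approach above. It also shows that the * in "^[0-9]*\\." accepts zero digits, so any name that begins with a dot matches; whether that is what happens in your actual data frame, I can't tell from the sample.

```r
# Illustrative raw column names (a subset, chosen for the example)
raw <- c("AA ID", "BB", "1. A", "2. B", "Concatenate")

# data.frame() with the default check.names = TRUE passes names through make.names()
fixed <- make.names(raw)
print(fixed)

# The digits-then-dot pattern only matches the raw names;
# the X-prefixed pattern only matches the converted ones
raw_hits   <- raw[grepl("^[0-9]*\\.", raw)]
fixed_hits <- fixed[grepl("^X[0-9]", fixed)]

# "*" allows zero digits, so a name starting with "." matches; "+" requires a digit
star_dot <- grepl("^[0-9]*\\.", ".hidden")
plus_dot <- grepl("^[0-9]+\\.", ".hidden")

# check.names = FALSE keeps the raw names, so the stricter pattern can be used directly
df_raw <- data.frame(`AA ID` = 1:2, `1. A` = c("", "D"), check.names = FALSE)
kept <- names(df_raw)[grepl("^[0-9]+\\.", names(df_raw))]
print(kept)
```

With dplyr, the same patterns can be passed to matches() unchanged.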
With the same logic you can do your counts:
df %>%
summarize(across(c(matches("^X[0-9]"), Concatenate), ~ sum(!is.na(.) & . != "" & . != "NA")))
which gives:
  X1..A X2..B X3..C X4..D Concatenate
1     3     5     0     7           8
Note that this still counts the "NAG" value in the Concatenate column; I'm not sure whether you want to exclude it.
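To see what the counting predicate does with "NA" and "NAG", it can be tried on two of the sample columns in isolation. The helper names count_filled and count_no_na are made up for this sketch, and the second one is only a possible variant, assuming you would want to drop every value that contains the substring "NA".

```r
# Same predicate as in the across() call, wrapped in a helper (hypothetical name)
count_filled <- function(x) sum(!is.na(x) & x != "" & x != "NA")

# Two columns copied from the sample data
col_1a <- c("", "D", "", "NA", "", "D", "", "", "D", "")
concat <- c("", "DAG", "G", "NAG", "AG", "DAG", "AG", "AG", "D", "")

n_1a     <- count_filled(col_1a)  # the literal string "NA" is excluded
n_concat <- count_filled(concat)  # "NAG" is not equal to "NA", so it is counted

# Variant (assumption): also exclude any value containing the substring "NA"
count_no_na <- function(x) sum(!is.na(x) & x != "" & !grepl("NA", x, fixed = TRUE))
n_concat_strict <- count_no_na(concat)

print(c(n_1a, n_concat, n_concat_strict))
```

The variant can be dropped into across() in place of the original lambda if that stricter rule is what you need.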
Upvotes: 1