David Z

Reputation: 7061

How to extract str() information in R

Suppose I have a data frame like this:

df<-data.frame(a=rnorm(20), 
               b=LETTERS[1:20], 
               c=rep(c(FALSE, TRUE), each=10))
str(df)
'data.frame':   20 obs. of  3 variables:
 $ a: num  1.1525 0.0377 -0.2212 -2.6184 -0.3649 ...
 $ b: Factor w/ 20 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ c: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...

What I want is to extract the variable names and their types from the str() output:

Names  Type
a      num
b      Factor
c      logi

How can I do this in R?

Upvotes: 2

Views: 2077

Answers (2)

akrun

Reputation: 887951

Since the OP asked about extracting the information from str() itself, we can use capture.output to get the printed output as a character vector, strip the unwanted parts of each line with sub, and then use read.table to convert the result into a two-column data.frame.

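# Capture what str() prints, drop the header line, and trim whitespace
out <- trimws(capture.output(str(df))[-1])
# Reduce each line to "name:type"; \\s* before the colon allows for the
# padding str() adds when column names differ in length
read.table(text=sub("\\$\\s+(\\S+)\\s*:\\s+(\\S+).*", "\\1:\\2", out), sep=":",
  col.names = c("Names", "Type"), header=FALSE, stringsAsFactors=FALSE)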
#  Names   Type
#1     a    num
#2     b Factor
#3     c   logi
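
If this is needed for more than one data frame, the same idea can be wrapped in a small helper. The following is only a sketch: `str_types` is a hypothetical name, and it assumes column names contain no whitespace or colons. Lines that do not look like a top-level `$ name: type` entry (for example nested list columns or attribute lines) are skipped.

```r
# Hypothetical helper: parse the printed str() output of a data frame
str_types <- function(x) {
  # The first captured line is the "'data.frame': ..." header, so drop it
  lines <- capture.output(str(x))[-1]
  # Group 1 is the column name, group 2 the type abbreviation;
  # str() pads names to a common width, hence \\s* before the colon
  pattern <- "^\\s*\\$\\s+([^[:space:]:]+)\\s*:\\s*(\\S+)"
  m <- regmatches(lines, regexec(pattern, lines))
  # Non-matching lines yield character(0); keep full matches only
  m <- m[lengths(m) == 3]
  data.frame(Names = vapply(m, `[`, character(1), 2),
             Type  = vapply(m, `[`, character(1), 3),
             stringsAsFactors = FALSE)
}

# Usage with the data frame from the question
df <- data.frame(a = rnorm(20),
                 b = LETTERS[1:20],
                 c = rep(c(FALSE, TRUE), each = 10))
str_types(df)
```

Parsing printed output is inherently fragile, so this is best kept for cases where the str() abbreviations themselves are what you need.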

Upvotes: 0

Alexey Shiklomanov

Reputation: 1652

As far as I know, str only prints its output and returns NULL, so there is nothing to extract from its return value. However, you can get what you want with the class or typeof functions, depending on exactly what kind of information you need.

df <- data.frame(a=rnorm(20),
                 b=LETTERS[1:20],
                 c=rep(c(FALSE, TRUE), each=10))
sapply(df, class)
#         a         b         c 
# "numeric"  "factor" "logical" 
sapply(df, typeof)
#         a         b         c 
#  "double" "integer" "logical" 
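
To get the two-column Names/Type layout the question asks for, the result of class can be put into a data frame. This is a minimal sketch: `col_types` is a hypothetical name, and the types are the full class names rather than the abbreviations str() prints. `stringsAsFactors = TRUE` is set explicitly as an assumption, so that column b is a factor as in the question; from R 4.0 onward the default is FALSE and b would be a character column.

```r
df <- data.frame(a = rnorm(20),
                 b = LETTERS[1:20],
                 c = rep(c(FALSE, TRUE), each = 10),
                 stringsAsFactors = TRUE)  # assumption: reproduce the factor column

# Hypothetical helper: one row per column, with its name and first class
col_types <- function(x) {
  # class() can return several values (e.g. c("ordered", "factor")),
  # so keep only the first one to get exactly one string per column
  types <- vapply(x, function(col) class(col)[1], character(1))
  data.frame(Names = names(x),
             Type  = unname(types),
             stringsAsFactors = FALSE)
}

col_types(df)
#   Names    Type
# 1     a numeric
# 2     b  factor
# 3     c logical
```

Swapping `class(col)[1]` for `typeof(col)` gives the storage types instead.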

Upvotes: 5
