I have a data frame for a raw data set in which the variable names are extremely long. I would like to display the structure of the data frame using the str() function, with a character limit imposed on the displayed variable names so that the output is easier to read.
Here is a reproducible example of the kind of thing I am talking about.
# Data frame with long names
set.seed(1)
DATA <- data.frame(ID    = 1:50,
                   Value = rnorm(50),
                   This_variable_has_a_really_long_and_annoying_name_to_illustrate_the_problem_of_a_data_frame_with_a_long_and_annoying_name = runif(50))

# Show structure of DATA
str(DATA)
> str(DATA)
'data.frame': 50 obs. of 3 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ Value : num -0.626 0.184 -0.836 1.595 0.33 ...
$ This_variable_has_a_really_long_and_annoying_name_to_illustrate_the_problem_of_a_data_frame_with_a_long_and_annoying_name: num 0.655 0.353 0.27 0.993 0.633 ...
I would like to use the str() function but impose an upper limit on the number of characters displayed in the variable names, so that I get output something like the one below. I have read the documentation, but I have not been able to identify an option that does this. (There seem to be options that limit the length of strings in the data, but I cannot see one that limits the length of the variable names.)
'data.frame': 50 obs. of 3 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ Value : num -0.626 0.184 -0.836 1.595 0.33 ...
$ This_variable_has... : num 0.655 0.353 0.27 0.993 0.633 ...
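To make the parenthetical above concrete, here is a minimal sketch. It assumes a small stand-in data frame (the column names are hypothetical and used only for illustration) and the vec.len and nchar.max arguments of str(). These arguments shorten how the values are displayed, but the variable name is still printed in full.

```r
# Small stand-in for DATA (hypothetical column names, for illustration only)
df <- data.frame(
  ID = 1:5,
  A_very_long_variable_name_used_only_for_this_illustration = letters[1:5]
)

# vec.len and nchar.max limit how the *values* are displayed ...
out <- capture.output(str(df, vec.len = 1, nchar.max = 20))
cat(out, sep = "\n")

# ... but the full variable name still appears in the captured output
long_name <- "A_very_long_variable_name_used_only_for_this_illustration"
name_shown_in_full <- any(grepl(long_name, out, fixed = TRUE))
name_shown_in_full
```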
Question: Is there a simple way to display the structure of the data frame while imposing a limit on the length of the variable names (to get output something like the above)?
As far as I can see you're right: there doesn't seem to be a built-in way to control this. You also can't easily do it after the fact, because str() prints its output and returns NULL invisibly. So the easiest option seems to be renaming the columns beforehand. Relying on setNames(), you could create a simple function to accomplish this:
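short_str <- function(data, n = 20, ...) {
  name_vec <- names(data)
  # Names longer than n are cut to n - 4 characters and suffixed with "... ",
  # so every truncated name is exactly n characters wide
  short_names <- ifelse(
    nchar(name_vec) > n,
    paste0(substring(name_vec, 1, n - 4), "... "),
    name_vec
  )
  # str() is called on a renamed copy; further arguments are passed through
  str(setNames(data, short_names), ...)
}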
short_str(DATA)
'data.frame': 50 obs. of 3 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ Value : num -0.626 0.184 -0.836 1.595 0.33 ...
$ This_variable_ha... : num 0.655 0.353 0.27 0.993 0.633 ...
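As a usage sketch, the helper can also be called with a different limit, and any further arguments are handed on to str() through the dots. The snippet below repeats the helper so that it runs on its own. The data frame and its column names are hypothetical, and vec.len and give.attr are ordinary str() arguments chosen only as examples.

```r
# Same helper as above, repeated so the snippet is self-contained
short_str <- function(data, n = 20, ...) {
  name_vec <- names(data)
  short_names <- ifelse(
    nchar(name_vec) > n,
    paste0(substring(name_vec, 1, n - 4), "... "),
    name_vec
  )
  str(setNames(data, short_names), ...)
}

# Hypothetical data frame, for illustration only
df <- data.frame(
  ID = 1:5,
  A_very_long_variable_name_used_only_for_this_illustration = runif(5)
)

# Tighter limit on the names; vec.len and give.attr are passed on to str()
short_str(df, n = 12, vec.len = 2, give.attr = FALSE)

# Only a renamed copy is printed, so the original names are untouched
names(df)
```

One caveat with this approach: two long names that share the same first n - 4 characters will be displayed identically, so a larger n may be needed when many columns share a common prefix.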