summarytools: How to edit variable names or to add text in dfSummary() output?

I need to summarize a dataset using dfSummary(), but I want to replace the variable names in the output without renaming the columns of the dataset itself. I also need to add notes on some of the variables (e.g., which variables are reverse-coded).

I haven't found a way to do this either in the documentation or in forums online. Thanks!

Upvotes: 0

Views: 826

Answers (1)

Dominic Comtois

Reputation: 10401

There is no easy way of replacing variable names in the output. It is best to rename the variables in the data frame itself. For notes, see summarytools' label() function, which attaches a descriptive label to a variable.
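As a minimal sketch of the label() approach, the following works on a copy of iris so the original data stay untouched. The label texts are made up for illustration (iris has no reverse-coded variables), and it assumes that dfSummary() shows labels in a separate Label column when at least one variable has one.

```r
library(summarytools)

# Work on a copy so the original data frame is left as-is
dat <- iris

# Attach notes as variable labels (hypothetical wording, for illustration only)
label(dat$Sepal.Length) <- "Sepal length (example note: reverse-coded)"
label(dat$Species)      <- "Iris species"

# Labels are stored as a "label" attribute on each column
dfs <- dfSummary(dat)
print(dfs, graph.col = FALSE)
```

Variables without a label simply get an empty entry, so only the ones that need a note have to be labelled.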

Edit 1
You can also use the footnote= argument of the package's print() method (for html output) or the caption= argument (for ascii / markdown output).

Examples:

print(dfSummary(iris), caption="This is caption text")
view(dfSummary(iris), footnote="This is <em>footnote</em> text")

Edit 2
Also, keep in mind that dfSummary() returns a data frame. So after creating it, you can simply modify the contents of the Variable column:

dfs <- dfSummary(iris)
dfs$Variable
[1] "Sepal.Length\\\n[numeric]" "Sepal.Width\\\n[numeric]"  "Petal.Length\\\n[numeric]"
[4] "Petal.Width\\\n[numeric]"  "Species\\\n[factor]"
dfs$Variable <- sub("\\.", " ", dfs$Variable)
print(dfs, graph.col = FALSE, na.col = FALSE)

Data Frame Summary  
iris  
Dimensions: 150 x 5  
Duplicates: 1  

---------------------------------------------------------------------------
No   Variable       Stats / Values          Freqs (% of Valid)   Valid     
---- -------------- ----------------------- -------------------- ----------
1    Sepal Length   Mean (sd) : 5.8 (0.8)   35 distinct values   150       
     [numeric]      min < med < max:                             (100.0%)  
                    4.3 < 5.8 < 7.9                                        
                    IQR (CV) : 1.3 (0.1)                                   

2    Sepal Width    Mean (sd) : 3.1 (0.4)   23 distinct values   150       
     [numeric]      min < med < max:                             (100.0%)  
                    2 < 3 < 4.4                                            
                    IQR (CV) : 0.5 (0.1)                                   

3    Petal Length   Mean (sd) : 3.8 (1.8)   43 distinct values   150       
     [numeric]      min < med < max:                             (100.0%)  
                    1 < 4.3 < 6.9                                          
                    IQR (CV) : 3.5 (0.5)                                   

4    Petal Width    Mean (sd) : 1.2 (0.8)   22 distinct values   150       
     [numeric]      min < med < max:                             (100.0%)  
                    0.1 < 1.3 < 2.5                                        
                    IQR (CV) : 1.5 (0.6)                                   

5    Species        1. setosa               50 (33.3%)           150       
     [factor]       2. versicolor           50 (33.3%)           (100.0%)  
                    3. virginica            50 (33.3%)                     
---------------------------------------------------------------------------
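The same idea extends to arbitrary display names. The sketch below uses a named vector (new_names, a hypothetical lookup invented for this example) and replaces each original name inside the Variable column. It assumes each entry of that column still starts with the original variable name, as in the output shown above.

```r
library(summarytools)

dfs <- dfSummary(iris)

# Hypothetical lookup: original column name -> display name
new_names <- c(
  Sepal.Length = "Sepal length (cm)",
  Species      = "Iris species"
)

# Replace the first literal occurrence of each old name;
# fixed = TRUE keeps the "." from being treated as a regex wildcard
for (old in names(new_names)) {
  dfs$Variable <- sub(old, new_names[[old]], dfs$Variable, fixed = TRUE)
}

print(dfs, graph.col = FALSE, na.col = FALSE)
```

Because the match is a plain substring match, a name that is a prefix of another (say, x and x2) would need a stricter pattern. The data frame itself is never renamed, only the summary object.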

Upvotes: 1
