Reputation: 31
I need to summarize a dataset using dfSummary(), but I need to replace the variable names in the output (without having to rename the whole dataset again). I also need to add notes on some of the variables (e.g., which variables are reversed).
I haven't found a way to do this in the documentation or in online forums. Thanks!
Upvotes: 0
Views: 826
Reputation: 10401
There is no easy way of replacing variable names. Best to rename variables in the data frame itself. For notes, see the label() function.
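A minimal sketch of the label() approach, assuming summarytools is installed and that its label() replacement form can be applied column by column. The note texts are hypothetical and only illustrate the idea; labels set this way should then appear alongside the variables in the dfSummary() output.

```r
# Work on a copy so the original data set stays untouched.
df <- iris

# Hypothetical notes, keyed by column name.
notes <- c(
  Sepal.Length = "Sepal length in cm",
  Petal.Width  = "Petal width in cm (hypothetical note: reverse-coded)"
)

# Assumption: summarytools is available; skip silently otherwise.
if (requireNamespace("summarytools", quietly = TRUE)) {
  for (v in names(notes)) {
    # Attach the note to the column as its label.
    summarytools::label(df[[v]]) <- notes[[v]]
  }
  print(summarytools::dfSummary(df, graph.col = FALSE))
}
```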
Edit 1
You can also use the footnote= argument of the package's print() function (for HTML output) or the caption= argument for ASCII / markdown output.
Examples:
print(dfSummary(iris), caption="This is caption text")
view(dfSummary(iris), footnote="This is <em>footnote</em> text")
Edit 2
Also, keep in mind that dfSummary() returns a data frame, so after creating it you can simply modify the contents of the Variable column:
dfs <- dfSummary(iris)
dfs$Variable
[1] "Sepal.Length\\\n[numeric]" "Sepal.Width\\\n[numeric]" "Petal.Length\\\n[numeric]"
[4] "Petal.Width\\\n[numeric]" "Species\\\n[factor]"
dfs$Variable <- sub("\\.", " ", dfs$Variable)
print(dfs, graph.col = FALSE, na.col = FALSE)
Data Frame Summary
iris
Dimensions: 150 x 5
Duplicates: 1
---------------------------------------------------------------------------
No   Variable       Stats / Values          Freqs (% of Valid)   Valid
---- -------------- ----------------------- -------------------- ----------
1    Sepal Length   Mean (sd) : 5.8 (0.8)   35 distinct values   150
     [numeric]      min < med < max:                             (100.0%)
                    4.3 < 5.8 < 7.9
                    IQR (CV) : 1.3 (0.1)
2    Sepal Width    Mean (sd) : 3.1 (0.4)   23 distinct values   150
     [numeric]      min < med < max:                             (100.0%)
                    2 < 3 < 4.4
                    IQR (CV) : 0.5 (0.1)
3    Petal Length   Mean (sd) : 3.8 (1.8)   43 distinct values   150
     [numeric]      min < med < max:                             (100.0%)
                    1 < 4.3 < 6.9
                    IQR (CV) : 3.5 (0.5)
4    Petal Width    Mean (sd) : 1.2 (0.8)   22 distinct values   150
     [numeric]      min < med < max:                             (100.0%)
                    0.1 < 1.3 < 2.5
                    IQR (CV) : 1.5 (0.6)
5    Species        1. setosa               50 (33.3%)           150
     [factor]       2. versicolor           50 (33.3%)           (100.0%)
                    3. virginica            50 (33.3%)
---------------------------------------------------------------------------
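If you want arbitrary display names rather than just swapping the dot for a space, the same idea extends to a lookup table. The sketch below uses only base R and a stand-in vector in the format that dfs$Variable shows above; the names variable, display_names and renamed are hypothetical.

```r
# Stand-in for dfs$Variable: each entry is "<name>\\\n[<type>]" as printed above.
variable <- c("Sepal.Length\\\n[numeric]",
              "Petal.Width\\\n[numeric]",
              "Species\\\n[factor]")

# Hypothetical display names, keyed by the original column names.
display_names <- c(Sepal.Length = "Sepal length (cm)",
                   Species      = "Iris species")

# Locate the first literal backslash, which separates the name from the type.
# Assumes every entry contains one, as in the output shown above.
pos <- as.integer(regexpr("\\", variable, fixed = TRUE))
old_names <- substr(variable, 1, pos - 1)
suffix    <- substr(variable, pos, nchar(variable))

# Use the display name when one is defined, otherwise keep the original.
has_new   <- old_names %in% names(display_names)
new_names <- ifelse(has_new, display_names[old_names], old_names)

renamed <- paste0(new_names, suffix)
print(renamed)
```

With a real summary object you would then assign the result back (dfs$Variable <- renamed) before calling print(dfs), in the same way as the sub() call above.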
Upvotes: 1