Reputation: 477
I have dataframe which looks like.
structure(list(format_date = c("Feb-21 (W5)", "Mar-21 (W1)",
"Mar-21 (W2)"), Total_count = c(53L, 1079L, 1288L), Count_Diff_1 = c(29L,
508L, 519L), Count_Diff_2 = c(22L, 505L, 624L), Count_Diff_3 = c(2L,
66L, 145L), Average_Diff_1 = structure(c(10.3464846743295, 6.03846784776903,
2.75151412973667), class = "difftime", units = "hours"), Average_Diff_2 = structure(c(6.46958333333333,
7.78101430143014, 2.58788906695157), class = "difftime", units = "hours"),
Average_Diff_3 = structure(c(162.45419129418, 133.452629846369,
54.3061989570153), class = "difftime", units = "hours"),
Per_Diff_1 = c(0.547169811320755, 0.470806302131603, 0.402950310559006
), Per_Diff_2 = c(0.415094339622642, 0.468025949953661, 0.484472049689441
), Per_Diff_3 = c(0.0377358490566038, 0.0611677479147359,
0.112577639751553)), row.names = c(NA, -3L), groups = structure(list(
format_date = c("Feb-21 (W5)", "Mar-21 (W1)", "Mar-21 (W2)"
), .rows = structure(list(1L, 2L, 3L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, 3L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
I want to convert the values in days if it is >24 hours in Average_1
,Average_2
and Average_3
column and covert the Per_Diff_1
,Per_Diff_2
and Per_Diff_3
with percentage 2 decimal point with % sign.
Also, it would be great if somehow I can mange to order the format_date
in descending order. Right now it is coming as Feb-21 (W5)
, Mar-21 (W1)
and Mar-21 (W2)
it should be Mar-21 (W2)
,Mar-21 (W1)
and Feb-21 (W5)
.
Required df<-
format_date Total_count Count_Diff_1 Count_Diff_2 Count_Diff_3 Average_1 Average_2 Average_3 Per_Diff_1 Per_Diff_2 Per_Diff_3
Feb-21 (W5) 53 29 22 2 10.34 hours 6.46 hours 6.7 Days 54.71% 41.50% 3.77%
Upvotes: 3
Views: 184
Reputation: 73352
You may simply change units
of the difftime class to days in a case handling using if
. For the percentages try formatC
.
d[6:8] <- lapply(d[6:8], function(x) {if (all(x > 24)) units(x) <- "days";round(x, 2)})
d[9:11] <- lapply(d[9:11], function(x) paste0(formatC(x*100, 1, format="f"), "%"))
d <- d[order(d$format_date, decreasing=T), ]
d
# format_date Total_count Count_Diff_1 Count_Diff_2 Count_Diff_3 Average_Diff_1
# 3 Mar-21 (W2) 1288 519 624 145 2.75 hours
# 2 Mar-21 (W1) 1079 508 505 66 6.04 hours
# 1 Feb-21 (W5) 53 29 22 2 10.35 hours
# Average_Diff_2 Average_Diff_3 Per_Diff_1 Per_Diff_2 Per_Diff_3
# 3 2.59 hours 2.26 days 40.3% 48.4% 11.3%
# 2 7.78 hours 5.56 days 47.1% 46.8% 6.1%
# 1 6.47 hours 6.77 days 54.7% 41.5% 3.8%
Upvotes: 1
Reputation: 887541
We can use across
on the columns that starts_with
'Average' to convert the values if it is greater than 24 then divide by 24, append 'Days' as suffix, or else return by appending 'hours' as suffix. Similarly, the columns that starts_with
'Per' can be formatted to percentage with label_percent
from scales
. Then, we arrange
the 'format_date' in desc
ending order after conversion to yearmon
class (from zoo::as.yearmon
)
library(dplyr)
library(stringr)
library(scales)
df1 %>%
ungroup %>%
mutate(across(starts_with('Average'),
~ case_when(as.numeric(.) > 24 ~ str_c(round(as.numeric(.)/24, 2),
' Days'),
TRUE ~ str_c(as.character(round(., 2)), ' hours'))),
across(starts_with('Per'), label_percent())) %>%
arrange(desc(zoo::as.yearmon(str_remove(format_date, '\\s+.*'), '%b-%y')),
desc(readr::parse_number(str_remove_all(format_date, '.*\\(|\\)'))))
-output
# A tibble: 3 x 11
# format_date Total_count Count_Diff_1 Count_Diff_2 Count_Diff_3 Average_Diff_1 Average_Diff_2 Average_Diff_3 Per_Diff_1 Per_Diff_2 Per_Diff_3
# <chr> <int> <int> <int> <int> <chr> <chr> <chr> <chr> <chr> <chr>
#1 Mar-21 (W2) 1288 519 624 145 2.75 hours 2.59 hours 2.26 Days 40.3% 48.4% 11.3%
#2 Mar-21 (W1) 1079 508 505 66 6.04 hours 7.78 hours 5.56 Days 47.1% 46.8% 6.1%
#3 Feb-21 (W5) 53 29 22 2 10.35 hours 6.47 hours 6.77 Days 54.7% 41.5% 3.8%
Upvotes: 4