Reputation: 67
How can I replace the values of a data frame that contains NA values with each value's percentage contribution to the sum of its row?
Example:
# dummy df
a <- c("x","y","z")
b <- c(10,5,2)
c <- c("NA",1,"NA")
d <- c("NA",4,8)
dummy <- data.frame(a,b,c,d)
a | b | c | d |
---|---|---|---|
x | 10 | NA | NA |
y | 5 | 1 | 4 |
z | 2 | NA | 8 |
What I want:
a | b | c | d |
---|---|---|---|
x | 100% | NA | NA |
y | 50% | 10% | 40% |
z | 20% | NA | 80% |
Upvotes: 2
Views: 74
Reputation: 67
I worked around the problem by removing the first column, replacing the NAs with 0, doing the calculation, and then reattaching the first column.
dummy[is.na(dummy)] <- 0 # set NAs to zero (needs real NA values, not "NA" strings)
header <- dummy[1] # store the 1st column
df <- round(dummy[-1] / rowSums(dummy[-1]) * 100, digits = 3) # calculate the row percentages
df <- cbind(header, df) # rejoin the 1st column to the results
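Note that the workaround above turns the missing cells into 0 rather than leaving them as NA. A minimal sketch of a variant that keeps them, assuming the columns hold real NA values and using `rowSums()` with `na.rm = TRUE` (the names `row_totals`, `pct`, and `result` are just illustrative):

```r
# Example data with real NA values (not the string "NA")
dummy <- data.frame(
  a = c("x", "y", "z"),
  b = c(10, 5, 2),
  c = c(NA, 1, NA),
  d = c(NA, 4, 8)
)

# Row totals that skip missing values
row_totals <- rowSums(dummy[-1], na.rm = TRUE)

# Divide every numeric column by its row total; NA cells stay NA
pct <- round(dummy[-1] / row_totals * 100, digits = 3)

# Reattach the first column
result <- cbind(dummy[1], pct)
print(result)
```

Because the NAs are never overwritten, there is no need to restore them afterwards.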
Upvotes: 0
Reputation: 346
First, it's better to use explicit NAs and not strings that say "NA".
Second, you can solve this using dplyr's rowwise() and across():
library(scales)
library(dplyr)
# dummy df with explicit NAs
a <- c("x","y","z")
b <- c(10,5,2)
c <- c(NA,1, NA)
d <- c(NA, 4,8)
dummy <- data.frame(a,b,c,d)
dummy %>%
  # add column of sum by row
  rowwise() %>%
  mutate(row_sum = sum(c_across(b:d), na.rm = TRUE),
         # divide each column by sum of row
         across(b:d, ~ percent(.x / row_sum))) %>%
  ungroup() %>%
  # remove sum column
  select(-row_sum)
# A tibble: 3 x 4
# a b c d
# <chr> <chr> <chr> <chr>
# 1 x 100% NA NA
# 2 y 50% 10% 40%
# 3 z 20% NA 80%
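If the data already arrived with "NA" strings, as in the question, the first point can be handled before any calculation. A minimal base R sketch, assuming the value columns are `b`, `c`, and `d` (the `value_cols` name is only for illustration):

```r
# Data frame as built in the question: c and d are character columns
dummy <- data.frame(
  a = c("x", "y", "z"),
  b = c(10, 5, 2),
  c = c("NA", 1, "NA"),
  d = c("NA", 4, 8)
)

value_cols <- c("b", "c", "d")

# Turn the "NA" strings into real missing values, then convert to numeric
dummy[value_cols] <- lapply(dummy[value_cols], function(col) {
  col <- as.character(col)   # also guards against factor columns
  col[col == "NA"] <- NA
  as.numeric(col)
})

str(dummy)
```

After this step, `is.na()` and `na.rm = TRUE` behave as expected on the value columns.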
Upvotes: 3
Reputation: 21918
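You can also use this on the original string columns (note that multiplying by 10 only works because every row in the example sums to 10):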
library(dplyr)
dummy %>%
mutate(across(b:d, ~ ifelse(.x != "NA", paste0(as.numeric(.x) * 10, "%"), .x)))
a b c d
1 x 100% NA NA
2 y 50% 10% 40%
3 z 20% NA 80%
Upvotes: 1
Reputation: 51592
You can simply do:
cbind.data.frame(dummy[1], 10 * (dummy[-1]))
# a b c d
#1 x 100 NA NA
#2 y 50 10 40
#3 z 20 NA 80
NOTE: Your columns must be numeric, and the factor of 10 only works because every row in the example sums to 10.
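Multiplying by 10 fits this example only because each row happens to sum to 10. A hedged sketch of a more general base R version that divides by the actual row sum and formats the result as percent labels; `row_percent_labels` is a hypothetical helper, and the `other` data frame is made-up illustration data:

```r
# Hypothetical helper: percent labels per row, keeping NA cells as NA
row_percent_labels <- function(df, id_cols = 1) {
  values <- df[-id_cols]
  # Share of each value in its row total, ignoring missing cells
  shares <- values / rowSums(values, na.rm = TRUE) * 100
  labels <- lapply(shares, function(col) {
    ifelse(is.na(col), NA_character_, paste0(round(col, 1), "%"))
  })
  cbind(df[id_cols], as.data.frame(labels, stringsAsFactors = FALSE))
}

dummy <- data.frame(
  a = c("x", "y", "z"),
  b = c(10, 5, 2),
  c = c(NA, 1, NA),
  d = c(NA, 4, 8)
)
result <- row_percent_labels(dummy)
print(result)

# Made-up data whose rows sum to 4, where "times 10" would be wrong
other <- data.frame(a = c("p", "q"), b = c(3, 1), c = c(NA, 1), d = c(1, 2))
other_result <- row_percent_labels(other)
print(other_result)
```

The labels are character strings, so keep the numeric shares around if further arithmetic is needed.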
Upvotes: 1