Tales Rands

Reputation: 43

Add one proportion column for every column

I have a dataframe with multiple columns and multiple rows. For every column, I want to add a new column right after it containing each value's proportion of that column's total.

I have something like:

a b c 
1 4 5 
8 2 3 
1 4 2

and I'm trying to transform it into something like:

a a.2 b b.2 c c.2
1 0.1 4 0.4 5 0.5 
8 0.8 2 0.2 3 0.3
1 0.1 4 0.4 2 0.2

But I can't figure out a way to NAME those new columns in add_column inside a loop.

So far, my code is as follows:

j=1
while (j <= length(colnames(eleicao))) {
  i <- colnames(sample)[j]
  nam <- paste("prop", i, sep = ".")
  j=j+1
  sample <- add_column(sample, parse(nam) = as.list(sample[i]/colSums(sample[i]))[[1]] .after = i)
}

I always get the same problem: Error: Column 'nam' already exists.

How can I accomplish my goal? How can I make add_column understand that I'm trying to name the column using the VALUE of 'nam'?

Upvotes: 1

Views: 1071

Answers (3)

Peter H.

Reputation: 2164

The following solution relies on dplyr, which is included in the tidyverse.

library(tidyverse)

df <- tibble(
  a = c(1, 8, 1),
  b = c(4, 2, 4),
  c = c(5, 3, 2)
)

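# funs() is deprecated as of dplyr 0.8.0; across() (dplyr >= 1.0.0) replaces mutate_all()
df %>%
  mutate(across(everything(), ~ .x / sum(.x), .names = "{.col}_prop"))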

Which returns

# A tibble: 3 x 6
      a     b     c a_prop b_prop c_prop
  <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
1     1     4     5    0.1    0.4    0.5
2     8     2     3    0.8    0.2    0.3
3     1     4     2    0.1    0.4    0.2
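
Note that this appends the proportion columns after the original ones rather than interleaving them. If the add_column loop from the question is kept, one way to name each new column from the value of nam is tidy-eval unquoting with `!!nam :=`, which add_column should accept because it forwards its arguments to tibble(). A minimal sketch, assuming the tibble package is installed; the loop variable col is an illustrative name:

```r
library(tibble)

df <- tibble(
  a = c(1, 8, 1),
  b = c(4, 2, 4),
  c = c(5, 3, 2)
)

# names(df) is evaluated once, before the loop starts, so the columns
# added inside the loop are never visited themselves.
for (col in names(df)) {
  nam <- paste("prop", col, sep = ".")
  # `!!nam :=` uses the VALUE of nam as the new column name;
  # `.after = col` places the new column right after its source column.
  df <- add_column(df, !!nam := df[[col]] / sum(df[[col]]), .after = col)
}

names(df)
```

Looping over names(df) instead of a running index avoids the shifting-position problem in the original while loop, where each inserted column moves the remaining ones one place to the right.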

Upvotes: 2

akrun

Reputation: 887088

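Here is an option using prop.table, with the data stored in df1. The reordering step assumes the column names are already in alphabetical order: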

cbind(df1, prop.table(as.matrix(df1), 2))[order(rep(names(df1), 2))]
#  a a.1 b b.1 c c.1
#1 1 0.1 4 0.4 5 0.5
#2 8 0.8 2 0.2 3 0.3
#3 1 0.1 4 0.4 2 0.2
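
The order() call above sorts by column name, so it only interleaves correctly when the names are already in alphabetical order. A sketch of a position-based variant that keeps the original column order for any names, assuming the question's data in df1; the .2 suffix follows the question's desired output, and props, n, idx, and res are illustrative names:

```r
df1 <- data.frame(a = c(1, 8, 1), b = c(4, 2, 4), c = c(5, 3, 2))

# Column-wise proportions (margin = 2 divides each column by its sum)
props <- as.data.frame(prop.table(as.matrix(df1), margin = 2))
names(props) <- paste0(names(df1), ".2")

# Interleave by position: 1, n + 1, 2, n + 2, ...
n <- ncol(df1)
idx <- as.vector(rbind(seq_len(n), seq_len(n) + n))

res <- cbind(df1, props)[idx]
names(res)
```

Naming the proportion columns before binding also avoids relying on the automatic .1 suffixes that the data frame subsetting adds to duplicated names.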

Upvotes: 2

pogibas

Reputation: 28339

A slightly sloppy solution (using apply):

# Using OP's data stored in df
res <- do.call(cbind, apply(df, 2, function(x) data.frame(x, y = x / sum(x))))
#   a.x a.y b.x b.y c.x c.y
# 1   1 0.1   4 0.4   5 0.5
# 2   8 0.8   2 0.2   3 0.3
# 3   1 0.1   4 0.4   2 0.2

# Rename: escape the dot and anchor at the end so only the ".x"/".y" suffixes match
colnames(res) <- sub("\\.x$", "", sub("\\.y$", ".2", names(res)))

Upvotes: 2
