Zero

Reputation: 71

A Custom sort of the values within a dataframe in R

I am a newbie trying to learn R and I have a data frame like this:

  a b c d
a 0 6 2 0
b 1 0 3 0
c 0 0 0 2
d 0 0 0 0 

I want to sort the data frame in two steps:

1. First, compute the TOTAL of each row and find the row with the maximum TOTAL value, which gives this:

  a b c d  TOTAL
a 0 6 2 0    8
b 1 0 3 0    4
c 0 0 0 2    2
d 0 0 0 0    0

2. Second, starting from the row with the maximum TOTAL and working down to the minimum, record each non-zero value next to the row and column where it occurs, ordered from max to min within each row. The result is a new data frame like this:
    'x'
a-b  6    #considering values for "a" where it meets "b"
a-c  2
b-c  3  #b has the second max TOTAL value
b-a  1
c-d  2  # finally, values in front of c

I'd appreciate your help on this one.

Upvotes: 0

Views: 100

Answers (3)

Jon Spring

Reputation: 67030

EDIT: adding source data at bottom

library(tidyr); library(dplyr)
df %>%
  gather(col, val, -row) %>%   # Pull into long form, with one row for each row-col
  arrange(row, -val) %>%       # Sort by row and descending value
  filter(val != 0) %>%         # Only keep non-zeros
  unite("row", c("row", "col")) # Combine row and col columns

  row val
1 a_b   6
2 a_c   2
3 b_c   3
4 b_a   1
5 c_d   2

# Inputting data with "row" column
df <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "row  a b c d
a 0 6 2 0
b 1 0 3 0
c 0 0 0 2
d 0 0 0 0 ")
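
Note that `arrange(row, -val)` sorts the groups by row name, which matches the order by total only because the sample rows already happen to be in that order. A minimal sketch of one way to order by the row total instead, assuming the same `df` with a "row" column; the names `total`, `pair` and `res` are made up here for illustration, and `pivot_longer()` is swapped in for `gather()`:

```r
library(tidyr); library(dplyr)

# Same sample data as above, with a "row" column
df <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "row a b c d
a 0 6 2 0
b 1 0 3 0
c 0 0 0 2
d 0 0 0 0")

res <- df %>%
  pivot_longer(-row, names_to = "col", values_to = "val") %>% # Long form
  group_by(row) %>%
  mutate(total = sum(val)) %>%             # Row total, computed before filtering
  ungroup() %>%
  filter(val != 0) %>%                     # Only keep non-zeros
  arrange(desc(total), row, desc(val)) %>% # Total first; row name breaks ties
  unite("pair", row, col, sep = "-") %>%   # e.g. "a-b"
  select(pair, val)

res
```

Including `row` as the second sort key keeps the entries of one row together when two rows share the same total.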

Upvotes: 2

Iroha

Reputation: 34761

I'm not completely certain, but is this what you want? You say you have a data frame, but it looks more like a matrix, and it's not clear whether you want to keep the result of your first action or whether it's just an intermediate step.

mat <- as.matrix(df)

df1 <- data.frame(addmargins(mat, 2))

df1

  a b c d Sum
a 0 6 2 0   8
b 1 0 3 0   4
c 0 0 0 2   2
d 0 0 0 0   0

df2 <- as.data.frame(as.table(mat))

df2 <- df2[df2$Freq != 0,]

df2[with(df2, order(ave(Freq, Var1, FUN = sum), Freq, decreasing = TRUE)), ]

   Var1 Var2 Freq
5     a    b    6
9     a    c    2
10    b    c    3
2     b    a    1
15    c    d    2

Data:

df <- read.table(text="a b c d
 0 6 2 0
 1 0 3 0
 0 0 0 2
 0 0 0 0", header = TRUE, row.names = letters[1:4])

Upvotes: 1

BENY

Reputation: 323396

The first part is just rowSums. For the second, I use melt, then order by the per-group maximum and by the value itself:

s <- setNames(reshape2::melt(as.matrix(df)), c('rows', 'vars', 'values'))
s <- s[s$values != 0, ]
s[order(-ave(s$values, s$rows, FUN = max), -s$values), ]

   rows vars values
5     a    b      6
9     a    c      2
10    b    c      3
2     b    a      1
15    c    d      2
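
The rowSums step is mentioned above but not shown, and the ordering above uses the per-row maximum rather than the row total the question asks for (the two agree on the sample data). A minimal base-R sketch of both steps, assuming `df` has the row names a to d; the names `totals`, `long`, `row_total` and `result` are made up here for illustration:

```r
df <- read.table(text = "a b c d
 0 6 2 0
 1 0 3 0
 0 0 0 2
 0 0 0 0", header = TRUE, row.names = letters[1:4])

# Step 1: append the row totals as a TOTAL column
totals <- cbind(df, TOTAL = rowSums(df))

# Step 2: long form without reshape2, one line per row/column pair
long <- as.data.frame(as.table(as.matrix(df)), stringsAsFactors = FALSE)
names(long) <- c("rows", "vars", "values")
long <- long[long$values != 0, ]           # drop the zeros

# Order by row total (descending), then row name, then value (descending)
row_total <- rowSums(df)[long$rows]
result <- long[order(-row_total, long$rows, -long$values), ]

totals
result
```

Sorting on the row name as a second key keeps each row's entries together if two rows have the same total.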

Upvotes: 0
