kin182
kin182

Reputation: 403

How to calculate mean of values from specified rows and order it in R?

I have a set of data like t(USArrests):

          Alabama Alaska Arizona Arkansas California Colorado Connecticut
Murder      13.2   10.0     8.1      8.8        9.0      7.9         3.3
Assault    236.0  263.0   294.0    190.0      276.0    204.0       110.0
UrbanPop    58.0   48.0    80.0     50.0       91.0     78.0        77.0
Rape        21.2   44.5    31.0     19.5       40.6     38.7        11.1

I would like to calculate the mean of Murder and Assault only for each state and sort the states from high to low based on their mean values.

I am new to R and am lost on how to do this. Could someone help me? Thanks!

Upvotes: 1

Views: 128

Answers (3)

Jaguar
Jaguar

Reputation: 204

Consider using the data.table package:

library(data.table)
DT <- data.table(cbind(USArrests), State = row.names(USArrests))
mean_stats <- DT[, list(mean_murder = mean(Murder), 
                        mean_assault = mean(Assault)), by = State]
mean_stats[order(-mean_murder, -mean_assault)]

Here, I've ordered the results in decreasing order, first by mean murder rate and then by mean assault rate. However, as you can see, it is trivial to change that. Here is some sample output:

> head(mean_stats[order(-mean_murder, -mean_assault)])
            State mean_murder mean_assault
1:        Georgia        17.4          211
2:    Mississippi        16.1          259
3:        Florida        15.4          335
4:      Louisiana        15.4          249
5: South Carolina        14.4          279
6:        Alabama        13.2          236

If you are new to R, do yourself a favor and use the data.table package. Generally, it is fast for merging and aggregation and has a compact and understandable syntax.

Upvotes: 0

Mike H.
Mike H.

Reputation: 14360

If you want the mean of Murder and Assault together (assuming this is the case since each state only have 1 obs for each), you could do:

sort(colMeans(df[c("Murder","Assault"),]), decreasing = T)

Or if your data is really untransposed use rowMeans instead:

sort(rowMeans(USArrests[,c("Murder","Assault")]), decreasing = T)

Upvotes: 2

neilfws
neilfws

Reputation: 33782

dplyr is good solution for this. There is no need to t() the data.

library(dplyr)
library(tibble)
USArrests %>%
  rownames_to_column(var = "State") %>% 
  # perform operations by row
  rowwise() %>% 
  # add a column with the mean
  mutate(Mean = mean(c(Murder, Assault))) %>% 
  # should ungroup after using rowwise()
  ungroup() %>% 
  # sort by Mean descending
  arrange(desc(Mean))

Upvotes: 1

Related Questions