statwoman

Reputation: 398

Mean of an element in list of lists

I have a list of lists, where each list contains tickers (names) and their values. The tickers are the same in every list, but the values differ. I want the average value of each ticker, but I don't know how to pick out a specific ticker in each list and extract its value. For instance, the mean value of "jpm" across these 3 lists would be mean(c(0.08620690, 0.10000000, 0.10000000)) = 0.095402. How can I do so?

What I have so far:

dput(degree.l)
list(c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241, 
amex = 0.0862068965517241, gs = 0.103448275862069, ms = 0.103448275862069, 
bofa = 0.103448275862069, citi = 0.103448275862069, wf = 0.120689655172414, 
spgl = 0.120689655172414, brk = 0.137931034482759), c(schwab = 0.0166666666666667, 
pnc = 0.05, ms = 0.0666666666666667, spgl = 0.0833333333333333, 
jpm = 0.1, bofa = 0.1, wf = 0.1, amex = 0.1, gs = 0.116666666666667, 
brk = 0.116666666666667, citi = 0.15), c(schwab = 0.0428571428571429, 
gs = 0.0714285714285714, pnc = 0.0714285714285714, citi = 0.0857142857142857, 
amex = 0.0857142857142857, spgl = 0.0857142857142857, jpm = 0.1, 
brk = 0.1, ms = 0.114285714285714, wf = 0.114285714285714, bofa = 0.128571428571429
))

degree.unl <- unlist(degree.l)

Upvotes: 5

Views: 812

Answers (4)

ThomasIsCoding

Reputation: 102920

A data.table option using rbindlist + colMeans

library(data.table)
colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))
    schwab        pnc        jpm       amex         gs         ms       bofa
0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
      citi         wf       spgl        brk
0.11305419 0.11165846 0.09657909 0.11819923

Then, to retrieve the mean for a single ticker, e.g. schwab, index the result by name:

colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))["schwab"]

Upvotes: 0

r2evans

Reputation: 161155

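Before unlisting (note that rbind combines the vectors by position, so this assumes the tickers are in the same order in every vector),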

apply(do.call(rbind, degree.l), 2, mean)
#     schwab        pnc        jpm       amex         gs         ms       bofa 
# 0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943 
#       citi         wf       spgl        brk 
# 0.10114943 0.11721401 0.11721401 0.13883415 

Edit: since you say you can't assume that tickers are in order, we can fix that:

nms <- unique(unlist(lapply(degree.l, names)))
nms
#  [1] "schwab" "pnc"    "jpm"    "amex"   "gs"     "ms"     "bofa"   "citi"  
#  [9] "wf"     "spgl"   "brk"   

apply(do.call(rbind, lapply(degree.l, `[`, nms)), 2, mean)
#     schwab        pnc        jpm       amex         gs         ms       bofa 
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323 
#       citi         wf       spgl        brk 
# 0.11305419 0.11165846 0.09657909 0.11819923 

For fun, we can jumble them to confirm this works:

set.seed(42)
degree.l.jumbled <- lapply(degree.l, sample)
degree.l.jumbled
# [[1]]
#     schwab         gs        brk         wf        pnc       amex       bofa 
# 0.00000000 0.10344828 0.13793103 0.12068966 0.03448276 0.08620690 0.10344828 
#       spgl       citi         ms        jpm 
# 0.12068966 0.10344828 0.10344828 0.08620690 
# [[2]]
#       amex         wf       spgl     schwab        jpm       bofa         gs 
# 0.10000000 0.10000000 0.08333333 0.01666667 0.10000000 0.10000000 0.11666667 
#        pnc        brk       citi         ms 
# 0.05000000 0.11666667 0.15000000 0.06666667 
# [[3]]
#         ms       bofa       citi       amex        jpm        brk       spgl 
# 0.11428571 0.12857143 0.08571429 0.08571429 0.10000000 0.10000000 0.08571429 
#         wf         gs        pnc     schwab 
# 0.11428571 0.07142857 0.07142857 0.04285714 
apply(do.call(rbind, lapply(degree.l.jumbled, `[`, nms)), 2, mean)
#     schwab        pnc        jpm       amex         gs         ms       bofa 
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323 
#       citi         wf       spgl        brk 
# 0.11305419 0.11165846 0.09657909 0.11819923 

Upvotes: 4

Ben Bolker

Reputation: 227071

Another option:

get_ticker <- function(t) mean(sapply(degree.l, "[[", t))
sapply(names(degree.l[[1]]), get_ticker)
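
As a quick check of this idea, here is a minimal self-contained sketch on a toy list. The `toy` data and the `ticker_mean` helper name are made up for illustration and are not part of the question. Because `[[` looks each ticker up by name, the result should not depend on the order of the elements within each vector.

```r
# Toy data (hypothetical): same tickers in each vector, deliberately in different orders
toy <- list(
  c(jpm = 0.1, gs = 0.3),
  c(gs = 0.5, jpm = 0.2),
  c(jpm = 0.3, gs = 0.1)
)

# Hypothetical helper: pull one ticker by name from every vector, then average
ticker_mean <- function(ticker, lst) mean(sapply(lst, `[[`, ticker))

# Apply the helper to every ticker name found in the first vector
res <- sapply(names(toy[[1]]), ticker_mean, lst = toy)
res
```

Passing the list as an argument rather than relying on a global variable makes the helper reusable with any list of the same shape.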

Upvotes: 4

akrun

Reputation: 887991

We can use aggregate with stack in base R

aggregate(values ~ ind, do.call(rbind, lapply(degree.l, stack)), FUN = mean)

-output

      ind     values
1  schwab 0.01984127
2     pnc 0.05197044
3     jpm 0.09540230
4    amex 0.09064039
5      gs 0.09718117
6      ms 0.09480022
7    bofa 0.11067323
8    citi 0.11305419
9      wf 0.11165846
10   spgl 0.09657909
11    brk 0.11819923

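Or another option is Reduce to do elementwise addition (+) and divide by the length of the list. This assumes no NAs and that the tickers are in the same order in every vector, because the addition is positional; with the sample data the order differs, so the output below does not match the per-ticker means above.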

 Reduce(`+`, degree.l)/length(degree.l)
    schwab        pnc        jpm       amex         gs         ms       bofa       citi         wf       spgl        brk 
0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943 0.10114943 0.11721401 0.11721401 0.13883415 
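
If the order of the tickers cannot be relied on, one possible variation is to reorder every vector by name before adding. The following is a minimal sketch on made-up toy data (`toy`, `nms`, `aligned` and `avg` are illustrative names, not from the question), assuming every vector contains exactly the same set of names and no NAs.

```r
# Toy data (hypothetical): same tickers, different order in each vector
toy <- list(
  c(jpm = 0.1, gs = 0.3),
  c(gs = 0.5, jpm = 0.2)
)

# Use the first vector's names as the reference order
nms <- names(toy[[1]])

# Reorder every vector by name so that positions line up
aligned <- lapply(toy, `[`, nms)

# Positional addition is now also per-ticker addition
avg <- Reduce(`+`, aligned) / length(aligned)
avg  # jpm = 0.15, gs = 0.40

# For contrast, without the reordering step the sums mix different tickers
positional <- Reduce(`+`, toy) / length(toy)
positional  # jpm = 0.30, gs = 0.25 (labels come from the first vector only)
```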

Or, since the OP has already unlisted the data, group that vector by its names and use tapply. This matches by name, so the order within each vector does not matter.

tapply(degree.unl, names(degree.unl), FUN = mean)
      amex       bofa        brk       citi         gs        jpm         ms        pnc     schwab       spgl         wf 
0.09064039 0.11067323 0.11819923 0.11305419 0.09718117 0.09540230 0.09480022 0.05197044 0.01984127 0.09657909 0.11165846 

Upvotes: 3
