Reputation: 398
I have a list of lists, where each list contains tickers (names) and their values. The tickers are the same in every list, but the values differ. I want the average value of each ticker across the lists. The issue is that I don't know how to pick out a specific ticker in each list and extract its value. For instance, I want the mean value of "jpm" across these 3 lists. It would be mean(c(0.08620690,0.10000000,0.10000000)) = 0.095402. How can I do so?
What I have so far:
dput(degree.l)
list(c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241,
amex = 0.0862068965517241, gs = 0.103448275862069, ms = 0.103448275862069,
bofa = 0.103448275862069, citi = 0.103448275862069, wf = 0.120689655172414,
spgl = 0.120689655172414, brk = 0.137931034482759), c(schwab = 0.0166666666666667,
pnc = 0.05, ms = 0.0666666666666667, spgl = 0.0833333333333333,
jpm = 0.1, bofa = 0.1, wf = 0.1, amex = 0.1, gs = 0.116666666666667,
brk = 0.116666666666667, citi = 0.15), c(schwab = 0.0428571428571429,
gs = 0.0714285714285714, pnc = 0.0714285714285714, citi = 0.0857142857142857,
amex = 0.0857142857142857, spgl = 0.0857142857142857, jpm = 0.1,
brk = 0.1, ms = 0.114285714285714, wf = 0.114285714285714, bofa = 0.128571428571429
))
degree.unl <- unlist(degree.l)
Upvotes: 5
Views: 812
Reputation: 102920
A data.table option using rbindlist + colMeans
> library(data.table)
> colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))
schwab pnc jpm amex gs ms bofa
0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
citi wf spgl brk
0.11305419 0.11165846 0.09657909 0.11819923
Then, if you want to retrieve the mean by any name, e.g., schwab, you can try it like below
colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))["schwab"]
Upvotes: 0
Reputation: 161155
Before unlisting,
apply(do.call(rbind, degree.l), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943
# citi wf spgl brk
# 0.10114943 0.11721401 0.11721401 0.13883415
Edit: since you say you can't assume that tickers are in order, we can fix that:
nms <- unique(unlist(lapply(degree.l, names)))
nms
# [1] "schwab" "pnc" "jpm" "amex" "gs" "ms" "bofa" "citi"
# [9] "wf" "spgl" "brk"
apply(do.call(rbind, lapply(degree.l, `[`, nms)), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
# citi wf spgl brk
# 0.11305419 0.11165846 0.09657909 0.11819923
For fun, we can jumble them to confirm this works:
set.seed(42)
degree.l.jumbled <- lapply(degree.l, sample)
degree.l.jumbled
# [[1]]
# schwab gs brk wf pnc amex bofa
# 0.00000000 0.10344828 0.13793103 0.12068966 0.03448276 0.08620690 0.10344828
# spgl citi ms jpm
# 0.12068966 0.10344828 0.10344828 0.08620690
# [[2]]
# amex wf spgl schwab jpm bofa gs
# 0.10000000 0.10000000 0.08333333 0.01666667 0.10000000 0.10000000 0.11666667
# pnc brk citi ms
# 0.05000000 0.11666667 0.15000000 0.06666667
# [[3]]
# ms bofa citi amex jpm brk spgl
# 0.11428571 0.12857143 0.08571429 0.08571429 0.10000000 0.10000000 0.08571429
# wf gs pnc schwab
# 0.11428571 0.07142857 0.07142857 0.04285714
apply(do.call(rbind, lapply(degree.l.jumbled, `[`, nms)), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
# citi wf spgl brk
# 0.11305419 0.11165846 0.09657909 0.11819923
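To see why the name-based indexing matters, here is a minimal sketch on made-up toy data (toy, positional and aligned are illustrative names, not part of the question). It assumes every vector carries the same set of names, only in a different order.

```r
# Hypothetical toy data: same names, different order.
toy <- list(c(a = 1, b = 2), c(b = 4, a = 3))

# rbind stacks by position and takes column names from the first vector,
# so column "a" ends up averaging a = 1 with b = 4.
positional <- colMeans(do.call(rbind, toy))

# Indexing each vector by a common name vector aligns the values first.
nms <- names(toy[[1]])
aligned <- colMeans(do.call(rbind, lapply(toy, `[`, nms)))

positional
aligned
```

Here positional comes out as 2.5 for both columns, while aligned gives a = 2 and b = 3, which are the per-name means.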
Upvotes: 4
Reputation: 227071
Another option:
get_ticker <- function(t) mean(sapply(degree.l, "[[", t))
sapply(names(degree.l[[1]]), get_ticker)
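A possible variant of the same idea that passes the list in explicitly and uses vapply, so each lookup must return exactly one number. This is only a sketch on made-up toy data; ticker_mean and toy are hypothetical names.

```r
# Hypothetical toy data standing in for degree.l.
toy <- list(c(a = 1, b = 2), c(b = 4, a = 3))

# Mean of one ticker across all vectors in the list.
# v[[ticker]] looks the value up by name and errors if the name is missing.
ticker_mean <- function(lst, ticker) {
  mean(vapply(lst, function(v) v[[ticker]], numeric(1)))
}

# Apply to every ticker name found in the first vector.
means <- vapply(names(toy[[1]]), function(tk) ticker_mean(toy, tk), numeric(1))
means
```

Because the lookup is by name, the order of tickers within each vector does not matter.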
Upvotes: 4
Reputation: 887991
We can use aggregate with stack in base R
aggregate(values ~ ind, do.call(rbind, lapply(degree.l, stack)), FUN = mean)
-output
ind values
1 schwab 0.01984127
2 pnc 0.05197044
3 jpm 0.09540230
4 amex 0.09064039
5 gs 0.09718117
6 ms 0.09480022
7 bofa 0.11067323
8 citi 0.11305419
9 wf 0.11165846
10 spgl 0.09657909
11 brk 0.11819923
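Or another option is Reduce (assuming no NAs and that the tickers are in the same order in every vector) to do elementwise addition (+) and divide by the length of the list. Note that + adds by position and keeps the names of the first vector, so with the sample data, where the order differs, the values below are positional means rather than per-ticker means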
Reduce(`+`, degree.l)/length(degree.l)
schwab pnc jpm amex gs ms bofa citi wf spgl brk
0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943 0.10114943 0.11721401 0.11721401 0.13883415
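If the order of the names can differ between vectors, one way to keep the Reduce approach is to reorder each vector by a common name vector first. A minimal sketch on made-up toy data (toy, nms and reduced are hypothetical names), assuming every vector has the same set of names and no NAs:

```r
# Hypothetical toy data: same names, different order.
toy <- list(c(a = 1, b = 2), c(b = 4, a = 3))

# Reorder every vector to the name order of the first one,
# then add elementwise and divide by the number of vectors.
nms <- names(toy[[1]])
reduced <- Reduce(`+`, lapply(toy, `[`, nms)) / length(toy)
reduced
```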
Or as the OP unlisted the dataset, then using that object, group by the names and use tapply
tapply(degree.unl, names(degree.unl), FUN = mean)
amex bofa brk citi gs jpm ms pnc schwab spgl wf
0.09064039 0.11067323 0.11819923 0.11305419 0.09718117 0.09540230 0.09480022 0.05197044 0.01984127 0.09657909 0.11165846
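Since tapply groups purely by name, it should also cope with a ticker that is absent from some vectors. A minimal sketch on made-up toy data (toy, toy.unl and res are hypothetical names). It assumes the outer list is unnamed, because unlist would otherwise prefix the inner names with the list names.

```r
# Hypothetical toy data: "a" appears only in the first vector.
toy <- list(c(a = 1, b = 2), c(b = 4))

# unlist keeps the inner names ("a", "b", "b") for an unnamed list.
toy.unl <- unlist(toy)

# Group by name and average whatever values exist for each name.
res <- tapply(toy.unl, names(toy.unl), FUN = mean)
res
```

The result is sorted alphabetically by name, as in the output above, rather than in the original ticker order.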
Upvotes: 3