Bill Pearson

Reputation: 127

Why does apply() sometimes produce a list and other times a vector?

I have this piece of code:

p.data=samp_data[,c('t_het_f','t_ane_f','t_loh_f')]
str(p.data)
head(p.data)
colnames(p.data)
head(apply(p.data,1,which.max))

which for one set of data produces this result:

'data.frame':   449 obs. of  3 variables:
 $ t_het_f: num  0.663 0.688 0.746 0.429 0.484 ...
 $ t_ane_f: num  0.291 0.3 0.247 0.398 0.261 ...
 $ t_loh_f: num  0.04601 0.01236 0.00657 0.17376 0.2546 ...
    t_het_f   t_ane_f     t_loh_f
1 0.6629108 0.2910798 0.046009390
...
6 0.7019118 0.2589706 0.039117647
[1] "t_het_f" "t_ane_f" "t_loh_f"
[1] 1 1 1 1 1 1

But for another set of data produces:

'data.frame':   587 obs. of  3 variables:
 $ t_het_f: num  0.505 0.566 0.205 0.367 0.59 ...
 $ t_ane_f: num  0.491 0.182 0.745 0.42 0.251 ...
 $ t_loh_f: num  0.00427 0.25193 0.05003 0.21227 0.15891 ...
    t_het_f   t_ane_f     t_loh_f
1 0.5048134 0.4909143 0.004272287
...
6 0.8159115 0.1829711 0.001117381
[1] "t_het_f" "t_ane_f" "t_loh_f"
[[1]]
t_het_f 
      1 

[[2]]
t_het_f 
      1

Why would what looks to me like the same data structure (p.data) produce a vector in one case, and a list in another?

Upvotes: 0

Views: 652

Answers (2)

Bill Pearson

Reputation: 127

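Because the same function (which.max) was applied in both cases, it was not obvious that it could return values of different lengths for the two datasets. The difference was caused by NA values that were present in the second dataset but not in the first: which.max() skips NAs, so a row in which every value is NA yields a zero-length result, while every other row yields a single index. With results of unequal length, apply() falls back to returning a list.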
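A minimal sketch that reproduces the effect. The data frames `ok` and `bad` and the helper `safe_which_max` are hypothetical stand-ins, not the original samp_data, and the assumption is that at least one row of the second dataset was entirely NA:

```r
# Hypothetical toy data with the same column names as in the question.
ok <- data.frame(t_het_f = c(0.66, 0.20),
                 t_ane_f = c(0.29, 0.75),
                 t_loh_f = c(0.05, 0.05))

bad <- ok
bad[2, ] <- NA                 # one row with every value missing

which.max(c(NA, NA, NA))       # zero-length result: nothing to choose from

res_ok  <- apply(ok,  1, which.max)   # every call returns length 1
res_bad <- apply(bad, 1, which.max)   # lengths 1 and 0

is.list(res_ok)                # FALSE
is.list(res_bad)               # TRUE

# One possible workaround: always return exactly one value per row.
safe_which_max <- function(x) {
  i <- which.max(x)
  if (length(i) == 0L) NA_integer_ else unname(i)
}
res_fixed <- apply(bad, 1, safe_which_max)
is.list(res_fixed)             # FALSE
```

Returning NA_integer_ for all-NA rows is only one choice; dropping those rows first (for example with complete.cases()) would work as well, depending on what the downstream code expects.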

Upvotes: 0

akrun

Reputation: 887118

The value returned by apply() depends on the length of what each call to FUN returns, as described in the Value section of ?apply:

If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise. If n is 0, the result has length 0 but not necessarily the ‘correct’ dimension.

If the calls to FUN return vectors of different lengths, apply returns a list of length prod(dim(X)[MARGIN]) with dim set to MARGIN if this has length greater than one.
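The three cases quoted above can be seen on a small example. This is only an illustrative sketch; the matrix `m` and the anonymous filtering function are made up for the purpose:

```r
m <- matrix(1:6, nrow = 2)     # hypothetical 2 x 3 matrix; rows are 1,3,5 and 2,4,6

# n > 1: range() returns two values per row, so the result is an
# array of dimension c(n, dim(X)[MARGIN]) = c(2, 2).
r_range <- apply(m, 1, range)
dim(r_range)

# n == 1: sum() returns one value per row, so the result is a plain vector.
r_sum <- apply(m, 1, sum)
is.null(dim(r_sum))            # TRUE

# Different lengths: row 1 keeps one value (5), row 2 keeps two (4, 6),
# so apply() returns a list.
r_mixed <- apply(m, 1, function(x) x[x > 3])
is.list(r_mixed)               # TRUE
```

When the shape of the result matters, checking lengths() on a list result is a quick way to find the rows whose calls returned an unexpected number of values.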

Upvotes: 0
