Reputation: 13113
I would like to tabulate by row within a data frame. I can obtain adequate results using table within apply in the following example:
df.1 <- read.table(text = '
state county city year1 year2 year3 year4 year5
1 2 4 0 0 0 1 2
2 5 3 10 20 10 NA 10
2 7 1 200 200 NA NA 200
3 1 1 NA NA NA NA NA
', na.strings = "NA", header=TRUE)
tdf <- t(df.1)
apply(tdf[4:nrow(tdf),1:nrow(df.1)], 2, function(x) {table(x, useNA = "ifany")})
Here are the results:
[[1]]
x
0 1 2
3 1 1
[[2]]
x
10 20 <NA>
3 1 1
[[3]]
x
200 <NA>
3 2
[[4]]
x
<NA>
5
However, in the following example each row consists of a single value.
df.2 <- read.table(text = '
state county city year1 year2 year3 year4 year5
1 2 4 0 0 0 0 0
2 5 3 1 1 1 1 1
2 7 1 2 2 2 2 2
3 1 1 NA NA NA NA NA
', na.strings = "NA", header=TRUE)
tdf.2 <- t(df.2)
apply(tdf.2[4:nrow(tdf.2),1:nrow(df.2)], 2, function(x) {table(x, useNA = "ifany")})
The output I obtain is:
# [1] 5 5 5 5
As such, I cannot tell from this output that the first 5 is for 0, the second 5 is for 1, the third 5 is for 2 and the last 5 is for NA. Is there a way I can have R return the value represented by each 5 in the second example?
Upvotes: 2
Views: 141
Reputation: 66819
Here's a table solution:
table(
rep(rownames(df.1),5),
unlist(df.1[,4:8]),
useNA="ifany")
This gives
    0 1 2 10 20 200 <NA>
  1 3 1 1  0  0   0    0
  2 0 0 0  3  1   0    1
  3 0 0 0  0  0   3    2
  4 0 0 0  0  0   0    5
...and for your df.2:
    0 1 2 <NA>
  1 5 0 0    0
  2 0 5 0    0
  3 0 0 5    0
  4 0 0 0    5
Well, this is a solution unless you really like having a list of tables for some reason.
Upvotes: 3
Reputation: 263411
Protect the result by wrapping with list:
apply(tdf.2[4:nrow(tdf.2),1:nrow(df.2)], 2,
function(x) {list(table(x, useNA = "ifany")) })
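With this approach each element of the result is itself a one-element list holding the table. A minimal sketch of unwrapping it, assuming the df.2 from the question (the names years, wrapped and tables are only illustrative):

```r
df.2 <- read.table(text = '
state county city year1 year2 year3 year4 year5
1 2 4 0 0 0 0 0
2 5 3 1 1 1 1 1
2 7 1 2 2 2 2 2
3 1 1 NA NA NA NA NA
', na.strings = "NA", header = TRUE)

# 5 x 4 matrix: one column per original row of df.2
years <- t(df.2[4:8])

# Wrapping each table in list() stops apply from collapsing the results
wrapped <- apply(years, 2, function(x) list(table(x, useNA = "ifany")))

# Pull the table out of each one-element list
tables <- lapply(wrapped, `[[`, 1)

names(tables[[1]])   # the value that was tabulated in the first row
tables[[4]]          # the all-NA row keeps its <NA> label
```

The extra lapply step is optional; it only removes one level of nesting so that tables[[i]] is the table itself.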
Upvotes: 4
Reputation: 7113
I think the problem is stated in apply's help:
... If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise ...
The inconsistencies in the return values of base R's apply family are the reason why I shifted completely to plyr's **ply functions. So this works as desired:
library(plyr)
alply( df.2[ 4:8 ], 1, function(x) table( unlist(x), useNA = "ifany" ) )
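The simplification described in the help text can be reproduced on a small scale. The sketch below uses made-up toy matrices (m_const and m_mixed are not from the question) to show when apply returns a plain vector and when it returns a list:

```r
# Toy data: each column plays the role of one row of the original data frame
m_const <- cbind(c(0, 0, 0), c(1, 1, 1))   # every column holds a single distinct value
m_mixed <- cbind(c(0, 0, 1), c(1, 1, 1))   # first column has two distinct values

tab <- function(x) table(x, useNA = "ifany")

# All tables have length 1, so apply simplifies to a vector of counts
res_const <- apply(m_const, 2, tab)

# Table lengths differ (2 and 1), so apply cannot simplify and returns a list
res_mixed <- apply(m_mixed, 2, tab)

is.list(res_const)   # FALSE
is.list(res_mixed)   # TRUE
```

This appears to be what happens with df.2: every row yields a length-one table, so the labels are dropped and only the counts survive.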
Upvotes: 2
Reputation: 89097
You can use lapply to systematically output a list. You would have to loop over the row indices:
sub.df <- as.matrix(df.2[grepl("year", names(df.2))])
lapply(seq_len(nrow(sub.df)),
function(i)table(sub.df[i, ], useNA = "ifany"))
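If it helps to see which row each table belongs to, the list can also be named. This is only a sketch: the keys vector built from state, county and city is a hypothetical labelling choice, not something the question asks for.

```r
df.2 <- read.table(text = '
state county city year1 year2 year3 year4 year5
1 2 4 0 0 0 0 0
2 5 3 1 1 1 1 1
2 7 1 2 2 2 2 2
3 1 1 NA NA NA NA NA
', na.strings = "NA", header = TRUE)

sub.df <- as.matrix(df.2[grepl("year", names(df.2))])

# Hypothetical labels built from the identifier columns
keys <- paste(df.2$state, df.2$county, df.2$city, sep = "-")

# One table per row, named by its identifier key
res <- setNames(
  lapply(seq_len(nrow(sub.df)),
         function(i) table(sub.df[i, ], useNA = "ifany")),
  keys)

names(res)
res[["2-7-1"]]
```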
Upvotes: 6