Reputation: 165
This seems like such a simple task, it's killing me that I can't figure it out.
I have the output after using apply and now all I want to do is add the output as a new row called uniq at the end of the data.frame.
df
ID A B C
1 asd dfg ghj
2 qwe sde cdf
3 wed thy red
4 asd sde grf
5 swq sde hty
uniq = apply(df, 2, function(x)length(unique(x)))
uniq output: Named int [1:4]
ID A B C
5 4 3 5
new.df = rbind(df, uniq)
what I would like to see...
ID A B C
1 asd dfg ghj
2 qwe sde cdf
3 wed thy red
4 asd sde grf
5 swq sde hty
5 4 3 5
Error - There were 4 warnings (use warnings() to see them)
I look at the data and although a new row has been added, the totals are not there and instead I am getting NAs in each cell (except for two but I have no idea why).
I saw that maybe I can't just use rbind because they are not the same types of objects, and I even tried converting the output to a matrix like someone suggested, but it doesn't work. Arghhh!
new.df <- rbind(df, matrix(uniq, ncol=25))
Error in match.names(clabs, names(xi)) : names do not match previous names
I checked the headers and they matched - after all the uniq data came from the original df.
Any help greatly appreciated.
Upvotes: 0
Views: 163
Reputation: 99371
It's likely that you've got factor columns. I'll start by saying that what you're attempting is not a very good idea anyway: the columns of a data frame hold the variables, so doing this adds the count to each column as if it were one more observation.
But you can solve your problem and get the result you desire by coercing the factor columns to characters and appending the calculation. Beginning with a data frame df
sapply(df, class)
# ID A B C
# "integer" "factor" "factor" "factor"
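Those factor classes are the likely source of the NAs. As a minimal sketch (the vector x here is made up for illustration and is not the asker's data), assigning a value that is not one of a factor's levels produces NA with a warning, while the same assignment on a character vector just works:

```r
# Hypothetical factor standing in for one column of df
x <- factor(c("asd", "qwe", "wed"))

# "4" is not an existing level, so R warns
# ("invalid factor level, NA generated") and stores NA
y <- x
y[4] <- "4"
is.na(y[4])

# After coercing to character, the value is stored as-is
z <- as.character(x)
z[4] <- "4"
z
```

This is the same thing that happens cell by cell when a count is bound onto factor columns; the integer ID column accepts its value without complaint.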
We can use a little function f to manipulate the columns
f <- function(x) {
c(if(is.factor(x)) levels(x)[x] else x, length(unique(x)))
}
And now ID is still numeric, but the other three columns are characters. Setting stringsAsFactors = FALSE when creating the new data frame keeps them as characters instead of coercing them to new factors
data.frame(lapply(df, f), stringsAsFactors = FALSE)
# ID A B C
# 1 1 asd dfg ghj
# 2 2 qwe sde cdf
# 3 3 wed thy red
# 4 4 asd sde grf
# 5 5 swq sde hty
# 6 5 4 3 5
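For anyone who wants to run this end to end, here is a self-contained sketch. The data frame is rebuilt from the question's printout, and stringsAsFactors = TRUE is an assumption made so the columns match the classes shown above (since R 4.0.0 the default is FALSE, in which case the columns would already be characters).

```r
# Rebuild the example data; stringsAsFactors = TRUE is assumed so that
# A, B and C are factors, as in the sapply(df, class) output above
df <- data.frame(
  ID = 1:5,
  A = c("asd", "qwe", "wed", "asd", "swq"),
  B = c("dfg", "sde", "thy", "sde", "sde"),
  C = c("ghj", "cdf", "red", "grf", "hty"),
  stringsAsFactors = TRUE
)

# Convert a factor to its character labels, then append the number of
# unique values in the column as one extra element
f <- function(x) {
  c(if (is.factor(x)) levels(x)[x] else x, length(unique(x)))
}

# Apply f to every column and reassemble, keeping characters as characters
new.df <- data.frame(lapply(df, f), stringsAsFactors = FALSE)

new.df
sapply(new.df, class)
```

Note that the counts in A, B and C end up stored as strings ("4", "3", "5"), which is one more reason to keep a summary like this in its own vector if you plan to compute with it later.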
Upvotes: 1