SSG_08

Reputation: 47

Dataframe has NAs instead of values

I have a dataframe loaded from an Excel file. It is similar to this:

Gender Country Effect Use Products

Male   UK      1      2   7
Female USA     2      4   6
Male   Russia  3      5   2
Female China   4      2   3
Male   China   3      1   6
Female USA     2      5   2
Male   UK      3      3   1
Female Russia  4      1   7

I want to calculate the mean per Country, like in the example below (excluding Gender):

Country Effect Use Products

UK      2      2.5 4
USA     2      4.5 4
Russia  3.5    3   4.5
China   3.5    1.5 4.5

I used code similar to the following to perform this operation (where "d" is the name of the data frame):

country_avg <- aggregate(d[, 3:5], list(d$`Country `), mean)

However, instead of the desired output, the resulting data frame looks like this:

Group.1 Effect Use Products

UK      NA     NA  NA
USA     NA     NA  NA
Russia  NA     NA  NA
China   NA     NA  NA

The numbers in my dataframe are not identified as numeric values (I've tested that using is.numeric). Moreover, R returns a lot of the following warning messages:

1: In mean.default(X[[i]], ...) :
  argument is not numeric or logical: returning NA

How can I fix this problem?

Upvotes: 0

Views: 48

Answers (1)

akrun

Reputation: 886938

Remove the backquotes: the backquoted name includes a trailing space, which may not be present in the actual column name.

aggregate(d[, 3:5], list(d$Country), FUN = mean)

Or use the formula method, which keeps the original column name for the grouping column:

aggregate(. ~ Country, d[-1], FUN = mean)
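
To illustrate the naming difference between the two interfaces, here is a small sketch. It rebuilds the sample data inline so that it runs on its own; the object names by_list, by_formula and by_named are only illustrative.

```r
# Sample data rebuilt from the question so the block runs on its own
d <- data.frame(
  Gender   = rep(c("Male", "Female"), 4),
  Country  = c("UK", "USA", "Russia", "China", "China", "USA", "UK", "Russia"),
  Effect   = c(1L, 2L, 3L, 4L, 3L, 2L, 3L, 4L),
  Use      = c(2L, 4L, 5L, 2L, 1L, 5L, 3L, 1L),
  Products = c(7L, 6L, 2L, 3L, 6L, 2L, 1L, 7L),
  stringsAsFactors = FALSE
)

by_list    <- aggregate(d[, 3:5], list(d$Country), FUN = mean)
by_formula <- aggregate(. ~ Country, d[-1], FUN = mean)

names(by_list)     # grouping column is named "Group.1"
names(by_formula)  # grouping column keeps the name "Country"

# A named list gives the same column name with the non-formula interface
by_named <- aggregate(d[, 3:5], list(Country = d$Country), FUN = mean)
```

Both calls compute the same per-country means; only the name of the grouping column differs.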

data

d <- structure(list(Gender = c("Male", "Female", "Male", "Female", 
"Male", "Female", "Male", "Female"), Country = c("UK", "USA", 
"Russia", "China", "China", "USA", "UK", "Russia"), Effect = c(1L, 
2L, 3L, 4L, 3L, 2L, 3L, 4L), Use = c(2L, 4L, 5L, 2L, 1L, 5L, 
3L, 1L), Products = c(7L, 6L, 2L, 3L, 6L, 2L, 1L, 7L)), 
class = "data.frame", row.names = c(NA, 
-8L))
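
The warning in the question ("argument is not numeric or logical") suggests that the value columns may have been read from Excel as character rather than numeric. If that is the case, one possible approach is to convert them before aggregating. The sketch below assumes the columns contain only digit strings; d_chr is a hypothetical stand-in, not the asker's actual file.

```r
# Hypothetical stand-in for data read from Excel with numbers stored as text
d_chr <- data.frame(
  Gender   = c("Male", "Female", "Male", "Female"),
  Country  = c("UK", "USA", "UK", "USA"),
  Effect   = c("1", "2", "3", "2"),
  Use      = c("2", "4", "3", "5"),
  Products = c("7", "6", "1", "2"),
  stringsAsFactors = FALSE
)

# Strip stray whitespace from column names (e.g. "Country " -> "Country")
names(d_chr) <- trimws(names(d_chr))

# Convert the value columns to numeric; strings that are not numbers become NA
value_cols <- c("Effect", "Use", "Products")
d_chr[value_cols] <- lapply(d_chr[value_cols], as.numeric)

country_avg <- aggregate(. ~ Country, d_chr[-1], FUN = mean)
print(country_avg)
```

If the columns are factors rather than character, as.numeric(as.character(x)) would be needed instead, since as.numeric on a factor returns the internal level codes.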

Upvotes: 2
