Mriti Agarwal

Reputation: 216

How to get the frequency of each unique element in every column of a data frame in R?

I have a dataframe:

 X65L X65L.1 X65L.2   X67L X67L.1 X65L.3
 [1,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
 [2,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
 [3,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
 [4,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
 [5,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
 [6,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
 [7,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
 [8,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
 [9,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0071
[10,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0071
[11,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0084
[12,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0084

I want to count the frequency of each unique element in a column and get an output like this:

     6     3     6     0     0     6
     6     3     6    12     6     2
     0     3     0     0     3     2
     0     3     0     0     3     2

The MATLAB equivalent is:

[m1 n1]=hist(s,unique(s));

I would like to know how this can be done in R.

Upvotes: 0

Views: 83

Answers (4)

akrun

Reputation: 887118

We could also do this with

table(c(mat), colnames(mat)[col(mat)])

-output

#           X65L X65L.1 X65L.2 X65L.3 X67L X67L.1
#  0.0065    6      3      6      6    0      0
#  0.0067    6      3      6      2   12      6
#  0.0071    0      3      0      2    0      3
#  0.0084    0      3      0      2    0      3
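The table above lists the columns alphabetically, because table() sorts the labels of a character vector. If the original column order matters, one possible tweak (a sketch, not part of the original answer) is to pass the labels as a factor whose levels follow colnames(mat). The matrix is rebuilt here with rep() only to keep the example self-contained; it is intended to match the data given in this answer.

```r
# Rebuild the example matrix compactly (same values as the dput() data)
mat <- cbind(
  X65L   = rep(c(0.0065, 0.0067), each = 6),
  X65L.1 = rep(c(0.0065, 0.0067, 0.0071, 0.0084), each = 3),
  X65L.2 = rep(c(0.0065, 0.0067), each = 6),
  X67L   = rep(0.0067, 12),
  X67L.1 = rep(c(0.0067, 0.0071, 0.0084), times = c(6, 3, 3)),
  X65L.3 = rep(c(0.0065, 0.0067, 0.0071, 0.0084), times = c(6, 2, 2, 2))
)

# Column label for every cell, as a factor whose levels keep the
# original column order instead of the alphabetical one
col_lab <- factor(colnames(mat)[col(mat)], levels = colnames(mat))

# Cross-tabulate values (rows) against column labels (columns)
tab <- table(c(mat), col_lab)
tab
```

With this change the columns of tab should appear as X65L, X65L.1, X65L.2, X67L, X67L.1, X65L.3, which is the order used in the question's expected output.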

data

mat <- structure(c(0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067, 
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0065, 0.0065, 0.0065,
0.0067, 0.0067, 0.0067, 0.0071, 0.0071, 0.0071, 0.0084, 0.0084,
0.0084, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0071,
0.0071, 0.0071, 0.0084, 0.0084, 0.0084, 0.0065, 0.0065, 0.0065,
0.0065, 0.0065, 0.0065, 0.0067, 0.0067, 0.0071, 0.0071, 0.0084,
0.0084), .Dim = c(12L, 6L), .Dimnames = list(NULL, c("X65L",
"X65L.1", "X65L.2", "X67L", "X67L.1", "X65L.3")))

Upvotes: 0

fabla

Reputation: 1816

In order to obtain the desired output we need to fill in 0 whenever we don't have certain values within a column:

Code

# First obtain all possible values
name <- levels(as.factor(unlist(df)))
tmp1 <- rep(0, length(name))
names(tmp1) <- name

tmp1
# 0.0065 0.0067 0.0071 0.0084 
#      0      0      0      0 

# Now fill this table whenever we have additional information within a column

sapply(df, function(x){
  tmp1[names(table(x))] <- table(x) 
  tmp1
})

#        X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
# 0.0065    6      3      6    0      0      6
# 0.0067    6      3      6   12      6      2
# 0.0071    0      3      0    0      3      2
# 0.0084    0      3      0    0      3      2

Data

df <- read.table(text = "X65L X65L.1 X65L.2   X67L X67L.1 X65L.3
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
0.0067 0.0071 0.0067 0.0067 0.0071 0.0071
0.0067 0.0084 0.0067 0.0067 0.0084 0.0071
0.0067 0.0084 0.0067 0.0067 0.0084 0.0084
0.0067 0.0084 0.0067 0.0067 0.0084 0.0084", header = TRUE)

Upvotes: 1

ThomasIsCoding

Reputation: 101373

You can try the code below

apply(mat, 2, function(x) table(factor(x, levels = unique(c(mat)))))

which gives

       X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
0.0065    6      3      6    0      0      6
0.0067    6      3      6   12      6      2
0.0071    0      3      0    0      3      2
0.0084    0      3      0    0      3      2
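Since the question mentions a data frame, the same idea can be sketched for data frame input. The small df below is hypothetical and only for illustration. Note that unique() keeps values in order of first appearance, so sort() is added here to get ascending row labels, and vapply() is used as an optional safeguard so that every column must return the same number of counts.

```r
# Hypothetical small data frame standing in for the question's data
df <- data.frame(a = c(0.2, 0.1, 0.1), b = c(0.3, 0.3, 0.2))

# Sorted levels shared by all columns; unique() alone would keep
# first-appearance order (0.2, 0.1, 0.3 here)
lev <- sort(unique(unlist(df)))

# Count each level per column; vapply() checks that every column
# yields exactly length(lev) integer counts
counts <- vapply(
  df,
  function(x) as.vector(table(factor(x, levels = lev))),
  FUN.VALUE = integer(length(lev))
)
rownames(counts) <- lev
counts
```

Here column a should contain 0.1 twice and 0.2 once, and column b should contain 0.2 once and 0.3 twice, with zeros for the values a column lacks.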

Data

mat <- structure(c(0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067, 
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0065, 0.0065, 0.0065,
0.0067, 0.0067, 0.0067, 0.0071, 0.0071, 0.0071, 0.0084, 0.0084,
0.0084, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0071,
0.0071, 0.0071, 0.0084, 0.0084, 0.0084, 0.0065, 0.0065, 0.0065,
0.0065, 0.0065, 0.0065, 0.0067, 0.0067, 0.0071, 0.0071, 0.0084,
0.0084), .Dim = c(12L, 6L), .Dimnames = list(NULL, c("X65L",
"X65L.1", "X65L.2", "X67L", "X67L.1", "X65L.3")))

Upvotes: 2

JMenezes

Reputation: 1059

table() counts how often each value occurs in a vector. You can use apply() to run it on every column, e.g. apply(df, 2, table), where df is your data frame or matrix. The result may come back as a list rather than a matrix, though, if the columns do not all contain the same number of distinct values.
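To illustrate when the list form appears, here is a minimal sketch on a made-up matrix m (not the question's data). apply() can only simplify to a matrix when every per-column table has the same length.

```r
# Small illustrative matrix (hypothetical values)
m <- cbind(a = c(1, 1, 2), b = c(1, 3, 3), c = c(2, 2, 2))

# Each column has a different set of values, so the per-column
# tables differ in length and apply() returns a list
res <- apply(m, 2, table)
is.list(res)

# Number of distinct values found in each column
lengths(res)

# Counts for a single column are looked up by name
res[["b"]]
```

Fixing a common set of levels before tabulating, as the factor-based answer in this thread does, gives every column the same length and so yields a matrix with zeros for absent values.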

Upvotes: 0
