Reputation: 216
I have a dataframe:
X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
[1,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
[2,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
[3,] 0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
[4,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
[5,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
[6,] 0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
[7,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
[8,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
[9,] 0.0067 0.0071 0.0067 0.0067 0.0071 0.0071
[10,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0071
[11,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0084
[12,] 0.0067 0.0084 0.0067 0.0067 0.0084 0.0084
I want to count the frequency of each unique element in a column and get an output like this:
6 3 6 0 0 6
6 3 6 12 6 2
0 3 0 0 3 2
0 3 0 0 3 2
The MATLAB equivalent is:
[m1 n1]=hist(s,unique(s));
I would like to know how this could be done in R.
Upvotes: 0
Views: 83
Reputation: 887118
We could also do this with
table(c(mat), colnames(mat)[col(mat)])
-output
# X65L X65L.1 X65L.2 X65L.3 X67L X67L.1
# 0.0065 6 3 6 6 0 0
# 0.0067 6 3 6 2 12 6
# 0.0071 0 3 0 2 0 3
# 0.0084 0 3 0 2 0 3
mat <- structure(c(0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0065, 0.0065, 0.0065,
0.0067, 0.0067, 0.0067, 0.0071, 0.0071, 0.0071, 0.0084, 0.0084,
0.0084, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0071,
0.0071, 0.0071, 0.0084, 0.0084, 0.0084, 0.0065, 0.0065, 0.0065,
0.0065, 0.0065, 0.0065, 0.0067, 0.0067, 0.0071, 0.0071, 0.0084,
0.0084), .Dim = c(12L, 6L), .Dimnames = list(NULL, c("X65L",
"X65L.1", "X65L.2", "X67L", "X67L.1", "X65L.3")))
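As a rough illustration of why this one-liner works, here is a sketch on a small made-up matrix `m` (hypothetical, not the question's data). `c()` flattens a matrix column by column, and `col()` gives the column index of every cell, so the two vectors line up element by element and `table()` cross-tabulates value against column name.

```r
# Hypothetical 3 x 2 matrix, used only to illustrate the idea
m <- matrix(c(1, 1, 2,
              2, 2, 2),
            nrow = 3,
            dimnames = list(NULL, c("a", "b")))

values  <- c(m)                  # flattened column by column: 1 1 2 2 2 2
columns <- colnames(m)[col(m)]   # matching column names:      a a a b b b

# Cross-tabulate value (rows) against column name (columns)
counts <- table(values, columns)
counts
```

Here the value 1 occurs twice in column a and never in column b, so the missing combination is reported as 0. Note that `table()` sorts the column names, which is why X65L.3 appears before X67L in the output above; indexing the result with `[, colnames(mat)]` should restore the original column order.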
Upvotes: 0
Reputation: 1816
To obtain the desired output, we need to fill in 0 whenever a value does not occur in a column:
Code
# First obtain all possible values
name <- levels(as.factor(unlist(df)))
tmp1 <- rep(0, length(name))
names(tmp1) <- name
tmp1
# 0.0065 0.0067 0.0071 0.0084
# 0 0 0 0
# Now overwrite the zeros with the observed counts for each column
sapply(df, function(x){
  tmp1[names(table(x))] <- table(x)
  tmp1
})
# X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
# 0.0065 6 3 6 0 0 6
# 0.0067 6 3 6 12 6 2
# 0.0071 0 3 0 0 3 2
# 0.0084 0 3 0 0 3 2
Data
df <- read.table(text = "X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0065 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0065 0.0067 0.0065 0.0067 0.0067 0.0065
0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
0.0067 0.0071 0.0067 0.0067 0.0071 0.0067
0.0067 0.0071 0.0067 0.0067 0.0071 0.0071
0.0067 0.0084 0.0067 0.0067 0.0084 0.0071
0.0067 0.0084 0.0067 0.0067 0.0084 0.0084
0.0067 0.0084 0.0067 0.0067 0.0084 0.0084", header = TRUE)
Upvotes: 1
Reputation: 101373
You can try the code below
apply(mat, 2, function(x) table(factor(x, levels = sort(unique(c(mat))))))
which gives
X65L X65L.1 X65L.2 X67L X67L.1 X65L.3
0.0065 6 3 6 0 0 6
0.0067 6 3 6 12 6 2
0.0071 0 3 0 0 3 2
0.0084 0 3 0 0 3 2
Data
> dput(mat)
structure(c(0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0065, 0.0065, 0.0065,
0.0067, 0.0067, 0.0067, 0.0071, 0.0071, 0.0071, 0.0084, 0.0084,
0.0084, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0065, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067,
0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0067, 0.0071,
0.0071, 0.0071, 0.0084, 0.0084, 0.0084, 0.0065, 0.0065, 0.0065,
0.0065, 0.0065, 0.0065, 0.0067, 0.0067, 0.0071, 0.0071, 0.0084,
0.0084), .Dim = c(12L, 6L), .Dimnames = list(NULL, c("X65L",
"X65L.1", "X65L.2", "X67L", "X67L.1", "X65L.3")))
Upvotes: 2
Reputation: 1059
table() does this for a single vector. You can use apply() to run it on every column: apply(df, 2, table), where df is your data frame or matrix. The result may be returned as a list, though, if the distinct values differ between columns.
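A small sketch of this caveat, using a made-up matrix `m` (hypothetical, not the question's data): when the columns contain different sets of values, the per-column tables have different lengths and `apply()` returns a list. Fixing a common set of factor levels first (the name `lev` is just for this example) gives tables of equal length, which `apply()` can combine into a matrix with zeros for absent values.

```r
# Hypothetical 3 x 2 matrix: column "a" has two distinct values, column "b" one
m <- matrix(c(1, 1, 2,
              2, 2, 2),
            nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# Per-column tables have different lengths, so the result is a list
as_list <- apply(m, 2, table)
is.list(as_list)

# With shared levels every table has the same length, so the result is a matrix
lev <- sort(unique(c(m)))
as_matrix <- apply(m, 2, function(x) table(factor(x, levels = lev)))
is.matrix(as_matrix)
```

Sorting the levels keeps the rows in increasing order of value regardless of the order in which values first appear in the data.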
Upvotes: 0