M.O

Reputation: 471

Count frequency of same value in several columns

I'm quite new to R and I'm facing a problem which I guess is quite easy to fix but I couldn't find the answer.

I have a data frame called clg with three columns: date, X1, and X2. X1 and X2 hold the names of country teams, and both are drawn from the same list of countries.

I'm simply trying to count the frequency of each country in the two columns as a total.

So far, I've only been able to count the frequency of the X1 column, and I haven't found a way to sum both columns.

clt <- as_tibble(na.omit(count(clg, clg$X1)))

I would like to get a data frame where the first column holds the unique countries and the second column holds the sum of their occurrences in X1 and X2.

Upvotes: 2

Views: 588

Answers (3)

akrun

Reputation: 887751

With tidyverse, we can gather into 'long' format and then do the count

library(tidyverse)
gather(clg, key, Var1, -date) %>%
     count(Var1)
# A tibble: 4 x 2
#  Var1      n
#  <chr> <int>
#1 alg       2
#2 jpn       1
#3 nor       1
#4 swe       2

data

clg <- structure(list(date = 1:3,
    X1 = structure(c(2L, 3L, 1L), .Label = c("alg", "nor", "swe"), class = "factor"),
    X2 = structure(c(3L, 1L, 2L), .Label = c("alg", "jpn", "swe"), class = "factor")),
    class = "data.frame", row.names = c(NA, -3L))
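
In current tidyr, gather() is superseded by pivot_longer(), so the same reshape-then-count idea could also be sketched as follows. This is only a sketch: the column name country is my own choice, the sample data uses plain character columns rather than factors, and values_drop_na = TRUE is added on the assumption that missing team names should be skipped (as the na.omit() in the question suggests).

```r
library(dplyr)
library(tidyr)

# Sample data with the same values as above (character columns, an assumption)
clg <- data.frame(date = 1:3,
                  X1 = c("nor", "swe", "alg"),
                  X2 = c("swe", "alg", "jpn"))

res <- clg %>%
  # stack X1 and X2 into a single column named "country" (hypothetical name)
  pivot_longer(cols = c(X1, X2), names_to = "key",
               values_to = "country", values_drop_na = TRUE) %>%
  # one row per country, with its total number of occurrences in n
  count(country)

res
```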

Upvotes: 1

lroha

Reputation: 34586

You can use unlist() and table() to get the overall counts. Wrapping the result in data.frame() gives you the desired two-column output.

clg <- data.frame(date=1:3, 
                  X1=c("nor", "swe", "alg"), 
                  X2=c("swe", "alg", "jpn"))

data.frame(table(unlist(clg[c("X1", "X2")])))
#   Var1 Freq
# 1  alg    2
# 2  nor    1
# 3  swe    2
# 4  jpn    1
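
If you want different column names than Var1 and Freq, one possible variation is to name the table dimension and pass responseName to as.data.frame(). This is a sketch only: the names country and n are my own choices, and the data is assumed to be character columns.

```r
# Sample data with the same values as above (character columns, an assumption)
clg <- data.frame(date = 1:3,
                  X1 = c("nor", "swe", "alg"),
                  X2 = c("swe", "alg", "jpn"),
                  stringsAsFactors = FALSE)

# Pool both columns into one vector, dropping the X11, X12, ... element names
teams <- unlist(clg[c("X1", "X2")], use.names = FALSE)

# Naming the argument to table() sets the name of the first output column;
# responseName sets the name of the count column
res <- as.data.frame(table(country = teams), responseName = "n")

res
```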

Upvotes: 2

M. Schumacher

Reputation: 150

You can reach your goal in two steps. In the first step, you count the occurrences of each country in each column separately. In the second step, you join the two summaries on the country name and add the two counts to get the total.

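   library(dplyr)

   X1_sum <- clg %>%
      dplyr::group_by(X1) %>%
      dplyr::summarize(n_x1 = n())

   X2_sum <- clg %>%
      dplyr::group_by(X2) %>%
      dplyr::summarize(n_x2 = n())

   final_summary <- X1_sum %>%
      # match country names in X1 to those in X2; a full join keeps
      # countries that appear in only one of the two columns
      dplyr::full_join(X2_sum, by = c("X1" = "X2")) %>%
      # a country missing from one column has NA there, so count it as 0
      dplyr::mutate(n_sum = dplyr::coalesce(n_x1, 0L) + dplyr::coalesce(n_x2, 0L))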

Upvotes: 0
