Reputation: 55

How to count occurrences over several columns in R

I have a table where the first five lines look like this:

WayOfResearch1	WayOfResearch2
Search engines	Recommendations from friends and family
Recommendations from friends and family	Blogs
Search engines	Search engines
Online reviews	Online reviews
Blogs	Search engines

Now what I am looking for is a way to count the occurrences of each individual item across both columns. It would be great if I could get a single list with the combined counts.

What I have tried so far is running table on both columns with the apply command. That gives me the occurrences per column, but I then have to add them together manually for each value. Is there a way to combine them?

Thank you in advance for your reply.

Upvotes: 2

Answers (3)

Behnam Hedayat

Reputation: 857

The reason you get two seemingly identical types (e.g. Social media) is that levels that look the same in different columns are actually different strings.
For example: "social media" vs. "social media ".
You should remove the whitespace after each level.
I did that with the code below:

library(stringr)  # library for strings manipulation

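# trim the whitespace before and after each value; lapply keeps the
# data frame structure (apply() would return a character matrix instead)
research_struct_corrected <- research_struct
research_struct_corrected[] <- lapply(research_struct, str_trim)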

This removes the whitespace before and after each level and produces something like the following (shown in dput format to make the difference visible; in practice it prints as an ordinary data frame):

 structure(
    list(
      WayOfResearch1 = c(
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines",
        "Recommendations from friends and family",
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines"
      ),
      WayOfResearch2 = c(
        "Online reviews",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Social media",
        "Recommendations from friends and family",
        "Online reviews",
        "Social media",
        "Online reviews"
      ),
      WayOfResearch3 = c(
        "Social media",
        "Print magazines/newspapers",
        "Social media",
        "Online reviews",
        "Online reviews",
        "Professional photographers' recommendations",
        "Social media",
        "Professional photographers' recommendations",
        "Email subscriptions",
        "Social media"
      )
    ),
    row.names = c(NA,
                  10L),
    class = "data.frame"
  )

and then:

library(knitr)  # library for nicely formatted tables

# stack all columns into one vector and count each value
df_organized <- as.data.frame(table(unlist(research_struct_corrected)))

colnames(df_organized) <- c("Type", "Value")

kable(df_organized)

It creates a table with one row per type and its count across all columns.

Does it work now?
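
For reference, the same trimming and counting can probably also be done in base R, without stringr or knitr. This is only a minimal sketch on a made-up three-row data set; the names `demo`, `trimmed`, and `counts` are placeholders and not part of the original code.

```r
# Hypothetical mini data set with the same trailing-whitespace problem
demo <- data.frame(
  WayOfResearch1 = c("Search engines  ", "Search engines  ", "Blogs           "),
  WayOfResearch2 = c("Social media ", "Search engines", "Social media "),
  stringsAsFactors = FALSE
)

# Without trimming, "Search engines  " and "Search engines" are counted
# as two different values
n_before <- length(table(unlist(demo)))

# trimws() is the base R counterpart of str_trim(); assigning into
# trimmed[] keeps the data frame structure
trimmed <- demo
trimmed[] <- lapply(demo, trimws)

# Stack all columns into one vector, count, and sort by frequency
counts <- as.data.frame(sort(table(unlist(trimmed)), decreasing = TRUE),
                        stringsAsFactors = FALSE)
colnames(counts) <- c("Type", "Value")
counts
```

Comparing `n_before` with `nrow(counts)` shows how many apparent duplicates were caused by whitespace alone.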

Upvotes: 2

Falimus

Reputation: 55

To show what I currently get with Taufi's reply:

What I would need is:

Type	Value
Blogs	21
Email subscriptions	10
Bulletin boards/forums/web communities	7
Recommendations from friends and families	312

and so on.

Here is the code/structure from the first 10 lines of my research table:

research_struct <-
  structure(
    list(
      WayOfResearch1 = c(
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Recommendations from friends and family",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         "
      ),
      WayOfResearch2 = c(
        "Online reviews                          ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Social media                            ",
        "Recommendations from friends and family ",
        "Online reviews                          ",
        "Social media                            ",
        "Online reviews                          "
      ),
      WayOfResearch3 = c(
        "Social media                                  ",
        "Print magazines/newspapers                    ",
        "Social media                                  ",
        "Online reviews                                ",
        "Online reviews                                ",
        "Professional photographers’ recommendations ",
        "Social media                                  ",
        "Professional photographers’ recommendations ",
        "Email subscriptions                           ",
        "Social media                                  "
      )
    ),
    row.names = c(NA,
                  10L),
    class = "data.frame"
  )

I hope it is a bit clearer now. Thanks for looking into it.

Upvotes: 0

A.Chrlt

Reputation: 316

A simplistic attempt:

example <- data.frame(Search1=c("Engines","Friends","Engines","Reviews","Blogs"),
                      Search2=c("Friends","Blogs","Engines","Reviews","Engines"))

output <- data.frame(table(example$Search1) + table(example$Search2))
output
     Var1 Freq
1   Blogs    2
2 Engines    4
3 Friends    2
4 Reviews    2

Or using unlist and table, as suggested by Taufi in the comments:

table(unlist(example))

  Blogs Engines Friends Reviews 
      2       4       2       2
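
One caveat with the first variant: adding two tables is only safe when both columns contain exactly the same set of values. Otherwise the tables differ in length or line up in the wrong positions. Below is a minimal sketch, using a made-up data frame `example2` (an assumption, not the asker's data), of how stacking with unlist avoids this, and how shared factor levels make the per-column tables addable.

```r
# Hypothetical data: "Blogs" appears only in Search1, "Forums" only in Search2
example2 <- data.frame(Search1 = c("Engines", "Blogs", "Engines"),
                       Search2 = c("Forums", "Engines", "Forums"),
                       stringsAsFactors = FALSE)

# Stacking the columns first works regardless of which values each column holds
total <- table(unlist(example2))
total

# To add per-column tables safely, give every column the same factor levels
all_levels <- sort(unique(unlist(example2)))
per_column <- lapply(example2,
                     function(col) table(factor(col, levels = all_levels)))
summed <- Reduce(`+`, per_column)
summed
```

Both `total` and `summed` should hold the same counts; the unlist version is shorter, while the factor version also keeps the per-column tables in `per_column` if those are needed separately.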

Upvotes: 1
