Falimus

Reputation: 55

How to count occurrences over several columns in R

I have a table where the first five lines look like this:

WayOfResearch1                           WayOfResearch2
Search engines                           Recommendations from friends and family
Recommendations from friends and family  Blogs
Search engines                           Search engines
Online reviews                           Online reviews
Blogs                                    Search engines

What I am looking for is a way to count the occurrences of the individual items across both columns. Ideally I would get a single list with the counts over both columns.

So far I have tried building a table for each column with the apply command. That gives me the occurrences per column, but I then have to add them up manually for each value. Is there a way to combine them?

Thank you in advance for your reply.

Upvotes: 2

Views: 296

Answers (3)

Behnam Hedayat

Reputation: 857

The reason you get two seemingly identical types (e.g. Social media) is that levels which look the same in different columns are actually different strings.
For example: "social media" vs. "social media ".
You should strip the whitespace around each level.
The code below does that:

library(stringr)  # string manipulation helpers

# Trim the whitespace before and after every value.
# apply() returns a character matrix, so convert the result back to a data frame.
research_struct_corrected <- as.data.frame(apply(research_struct, 2, str_trim))

It removes all whitespace before and after each level and produces something like this (shown in dput format so the difference is visible; in practice it prints as an ordinary data frame):

 structure(
    list(
      WayOfResearch1 = c(
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines",
        "Recommendations from friends and family",
        "Search engines",
        "Search engines",
        "Search engines",
        "Search engines"
      ),
      WayOfResearch2 = c(
        "Online reviews",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Recommendations from friends and family",
        "Social media",
        "Recommendations from friends and family",
        "Online reviews",
        "Social media",
        "Online reviews"
      ),
      WayOfResearch3 = c(
        "Social media",
        "Print magazines/newspapers",
        "Social media",
        "Online reviews",
        "Online reviews",
        "Professional photographers' recommendations",
        "Social media",
        "Professional photographers' recommendations",
        "Email subscriptions",
        "Social media"
      )
    ),
    row.names = c(NA,
                  10L),
    class = "data.frame"
  )

and then:

library(knitr) #library for nice table
df_organized <- as.data.frame(table(unlist(research_struct_corrected)))

colnames(df_organized) <- c("Type", "Value")

kable(df_organized)

It creates a two-column table (Type, Value) with one row per type and its count over all columns.


Now, does it work?
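To make the whitespace issue concrete, here is a minimal base R sketch. The `responses` data frame and its values are made up for illustration, and `trimws()` from base R stands in for `str_trim()`; it is not the asker's real data.

```r
# Hypothetical two-column example; some values carry a trailing space.
responses <- data.frame(
  WayOfResearch1 = c("Social media", "Blogs", "Social media "),
  WayOfResearch2 = c("Blogs ", "Social media", "Blogs"),
  stringsAsFactors = FALSE
)

# Without trimming, "Social media" and "Social media " are counted separately.
raw_counts <- table(unlist(responses))

# trimws() strips leading and trailing whitespace, column by column.
trimmed <- lapply(responses, trimws)
clean_counts <- table(unlist(trimmed))

# Same Type/Value layout as in the answer above.
result <- setNames(
  as.data.frame(clean_counts, stringsAsFactors = FALSE),
  c("Type", "Value")
)
print(result)
```

Before trimming there are four distinct levels; after trimming only "Blogs" and "Social media" remain, each counted over both columns.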

Upvotes: 2

Falimus

Reputation: 55

To show what I currently get with Taufi's reply:

[Image: result of Taufi's code]

What I would need is:

Type                                        Value
Blogs                                       21
Email subscriptions                         10
Bulletin boards/forums/web communities      7
Recommendations from friends and families   312

and so on.

Here is the structure (dput output) of the first 10 rows of my research table:

research_struct <-
  structure(
    list(
      WayOfResearch1 = c(
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Recommendations from friends and family",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         ",
        "Search engines                         "
      ),
      WayOfResearch2 = c(
        "Online reviews                          ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Recommendations from friends and family ",
        "Social media                            ",
        "Recommendations from friends and family ",
        "Online reviews                          ",
        "Social media                            ",
        "Online reviews                          "
      ),
      WayOfResearch3 = c(
        "Social media                                  ",
        "Print magazines/newspapers                    ",
        "Social media                                  ",
        "Online reviews                                ",
        "Online reviews                                ",
        "Professional photographers’ recommendations ",
        "Social media                                  ",
        "Professional photographers’ recommendations ",
        "Email subscriptions                           ",
        "Social media                                  "
      )
    ),
    row.names = c(NA,
                  10L),
    class = "data.frame"
  )

I hope it is a bit clearer now. Thanks for looking into it.

Upvotes: 0

A.Chrlt

Reputation: 316

A simplistic attempt:

example <- data.frame(Search1=c("Engines","Friends","Engines","Reviews","Blogs"),
                      Search2=c("Friends","Blogs","Engines","Reviews","Engines"))

output <- data.frame(table(example$Search1) + table(example$Search2))
output
     Var1 Freq
1   Blogs    2
2 Engines    4
3 Friends    2
4 Reviews    2

Or using unlist and table, as suggested by Taufi in the comments:

table(unlist(example))

  Blogs Engines Friends Reviews 
      2       4       2       2 
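One caveat worth sketching: adding two tables with `+` relies on both columns containing exactly the same set of values. The variant below is a hedged sketch with made-up data (here `Search2` never contains "Blogs"). It puts every column on a shared set of factor levels before summing and checks the result against `table(unlist(...))`.

```r
# Hypothetical data: "Blogs" appears only in Search1.
example <- data.frame(
  Search1 = c("Engines", "Friends", "Engines", "Reviews", "Blogs"),
  Search2 = c("Friends", "Engines", "Engines", "Reviews", "Engines"),
  stringsAsFactors = FALSE
)

# All values that occur in any column, in sorted order.
all_levels <- sort(unique(unlist(example)))

# One table per column on the shared levels, so the tables line up and can be added.
per_column <- lapply(example, function(col) table(factor(col, levels = all_levels)))
summed <- Reduce(`+`, per_column)

# unlist() + table() needs no alignment step and should give the same counts.
pooled <- table(unlist(example))

print(summed)
print(pooled)
```

With differing value sets, the plain `table(example$Search1) + table(example$Search2)` would not line up, which is why the `unlist` route is usually the simpler choice.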

Upvotes: 1
