Reputation: 55
I have a table where the first five lines look like this:
WayOfResearch1 | WayOfResearch2 |
---|---|
Search engines | Recommendations from friends and family |
Recommendations from friends and family | Blogs |
Search engines | Search engines |
Online reviews | Online reviews |
Blogs | Search engines |
Now what I am looking for is a way to count the occurrences of the individual items across both columns. It would be great if I could have a single list with the counts over both columns.
What I tried so far was evaluating a table of both columns with the apply command. That gives me the occurrences per column, but I then have to add them together manually, variable by variable. Is there a way to combine them?
Thank you in advance for your reply.
Upvotes: 2
Views: 296
Reputation: 857
The reason you get two seemingly identical types (e.g. Social media) is that levels that look the same in different columns are actually different strings.
For example: "social media" vs. "social media " (note the trailing space).
You should strip the whitespace after each level, as in the code below:
library(stringr) # library for strings manipulation
#trim all the white spaces before and after levels
research_struct_corrected <- apply(research_struct, 2, str_trim)
It removes all the white space before and after each level and creates something like this (I show it in dput format so the difference is visible; in practice it prints like a regular data frame):
structure(
list(
WayOfResearch1 = c(
"Search engines",
"Search engines",
"Search engines",
"Search engines",
"Search engines",
"Recommendations from friends and family",
"Search engines",
"Search engines",
"Search engines",
"Search engines"
),
WayOfResearch2 = c(
"Online reviews",
"Recommendations from friends and family",
"Recommendations from friends and family",
"Recommendations from friends and family",
"Recommendations from friends and family",
"Social media",
"Recommendations from friends and family",
"Online reviews",
"Social media",
"Online reviews"
),
WayOfResearch3 = c(
"Social media",
"Print magazines/newspapers",
"Social media",
"Online reviews",
"Online reviews",
"Professional photographers' recommendations",
"Social media",
"Professional photographers' recommendations",
"Email subscriptions",
"Social media"
)
),
row.names = c(NA,
10L),
class = "data.frame"
)
and then:
library(knitr) #library for nice table
df_organized <- as.data.frame(table(unlist(research_struct_corrected)))
colnames(df_organized) <- c("Type", "Value")
kable(df_organized)
This creates a table with one row per type and its total count across all columns.
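For reference, the same trim-and-count steps can be sketched in base R without extra packages. This is only a minimal sketch on a made-up three-row sample (the data and the names trimmed and counts are assumptions, not the asker's real table). It uses lapply with trimws instead of apply, because apply returns a character matrix while lapply plus as.data.frame keeps a data frame.

```r
# Hypothetical sample; the trailing spaces mimic the question's data.
research_struct <- data.frame(
  WayOfResearch1 = c("Search engines ", "Search engines ", "Blogs"),
  WayOfResearch2 = c("Online reviews ", "Blogs ", "Search engines"),
  stringsAsFactors = FALSE
)

# Trim leading/trailing whitespace column by column, keeping a data frame.
trimmed <- as.data.frame(lapply(research_struct, trimws),
                         stringsAsFactors = FALSE)

# Pool all columns into one character vector and count each distinct value.
counts <- as.data.frame(table(unlist(trimmed)), stringsAsFactors = FALSE)
colnames(counts) <- c("Type", "Value")
counts
```

With the whitespace removed, "Search engines " and "Search engines" fall into the same row instead of being counted as two types.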
Upvotes: 2
Reputation: 55
To show what I currently get with Taufi's reply:
What I would need is:
Type | Value |
---|---|
Blogs | 21 |
Email subscriptions | 10 |
Bulletin boards/forums/web communities | 7 |
Recommendations from friends and families | 312 |
and so on.
Here is the code/structure from the first 10 lines of my research table:
research_struct <-
structure(
list(
WayOfResearch1 = c(
"Search engines ",
"Search engines ",
"Search engines ",
"Search engines ",
"Search engines ",
"Recommendations from friends and family",
"Search engines ",
"Search engines ",
"Search engines ",
"Search engines "
),
WayOfResearch2 = c(
"Online reviews ",
"Recommendations from friends and family ",
"Recommendations from friends and family ",
"Recommendations from friends and family ",
"Recommendations from friends and family ",
"Social media ",
"Recommendations from friends and family ",
"Online reviews ",
"Social media ",
"Online reviews "
),
WayOfResearch3 = c(
"Social media ",
"Print magazines/newspapers ",
"Social media ",
"Online reviews ",
"Online reviews ",
"Professional photographers’ recommendations ",
"Social media ",
"Professional photographers’ recommendations ",
"Email subscriptions ",
"Social media "
)
),
row.names = c(NA,
10L),
class = "data.frame"
)
I hope it is a bit more clear now. Thanks for looking into it.
Upvotes: 0
Reputation: 316
A simplistic attempt:
example <- data.frame(Search1=c("Engines","Friends","Engines","Reviews","Blogs"),
Search2=c("Friends","Blogs","Engines","Reviews","Engines"))
output <- data.frame(table(example$Search1) + table(example$Search2))
output
Var1 Freq
1 Blogs 2
2 Engines 4
3 Friends 2
4 Reviews 2
Or using unlist and table, as stated by Taufi in the comments:
table(unlist(example))
Blogs Engines Friends Reviews
2 4 2 2
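One caveat with adding two tables directly: it only lines up when both columns contain exactly the same set of values. If a value occurs in just one column, the tables differ in length and the addition fails or pairs up the wrong categories. A possible workaround, sketched here on hypothetical data (the "Forums" value and the names all_levels, per_column and combined are assumptions for illustration), is to count every column against a shared set of levels first:

```r
# Hypothetical data: "Forums" appears only in the second column.
example <- data.frame(
  Search1 = c("Engines", "Friends", "Engines"),
  Search2 = c("Friends", "Forums", "Engines"),
  stringsAsFactors = FALSE
)

# Shared set of levels across both columns.
all_levels <- sort(unique(unlist(example)))

# Counting against the shared levels yields tables of equal length,
# so element-wise addition matches category to category.
per_column <- lapply(example, function(col) {
  table(factor(col, levels = all_levels))
})
combined <- Reduce(`+`, per_column)
combined
```

The result holds the same counts as table(unlist(example)), which remains the shorter route when per-column tables are not needed.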
Upvotes: 1