Reputation: 43
I have an R dataframe where I have imported a CSV file from some questionnaire data.
One of my columns is called 'NewsMethods', where respondents were asked to list the ways they get news. The data in this column looks like this:
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
... and so on.
What I would like to do is replace each entry in this column with the number of elements it contains. For example, I would like to replace the first list with the number 5.
If anyone has any ideas for how I could do this I would be very grateful. TIA
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
Newspaper;Radio;Television;News websites (such as BBC News)
Television
Radio;Television;Word of mouth
Television;Social Media websites or apps;Word of mouth
Television;Word of mouth
Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
I am expecting this to instead read: 5 5 4 2 4 1 3 3 2 6
Upvotes: 2
Views: 1984
Reputation: 886938
We can use str_count from stringr:
library(stringr)
df1$Count <- str_count(df1$NewsMethods, ";")+1
df1$Count
#[1] 5 5 4 2 4 1 3 3 2 6
Or a base R option is with gregexpr:
lengths(lapply(gregexpr(";", df1$NewsMethods), function(x) x[x>0]) )+1
#[1] 5 5 4 2 4 1 3 3 2 6
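The x[x>0] filter in the base R version is needed because gregexpr reports a position of -1, rather than an empty result, for a string that contains no separator. A minimal sketch of that behaviour, using a made-up two-element vector (x is not part of the question's data); regmatches is shown only as one possible way to avoid the manual filter:

```r
# Hypothetical toy input: one entry with a separator, one without
x <- c("Radio;Television", "Television")

# fixed = TRUE treats ";" as a literal string rather than a regular expression
m <- gregexpr(";", x, fixed = TRUE)

# For the entry without a separator, gregexpr reports a single position of -1
no_match <- as.integer(m[[2]])

# Drop the -1 placeholders before counting, as in the answer above
counts_filter <- lengths(lapply(m, function(pos) pos[pos > 0])) + 1

# regmatches returns character(0) for non-matching entries, so no filter is needed
counts_regmatches <- lengths(regmatches(x, m)) + 1
```

Both count vectors should agree; which form to use is mostly a matter of taste.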
Data:
df1 <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
Upvotes: 2
Reputation: 83215
A base R alternative with a combination of strsplit and lengths:
lengths(strsplit(dfr$NewsMethods, split = ';'))
which gives:
> lengths(strsplit(dfr$NewsMethods, split = ';'))
[1] 5 5 4 2 4 1 3 3 2 6
Assigning the result to a count variable in your dataframe:
dfr$count <- lengths(strsplit(dfr$NewsMethods, split = ';'))
Now your dataframe looks like this:
> dfr
NewsMethods count
1 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
2 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
3 Radio;Television;Social Media websites or apps;Word of mouth 4
4 Television;Social Media websites or apps 2
5 Newspaper;Radio;Television;News websites (such as BBC News) 4
6 Television 1
7 Radio;Television;Word of mouth 3
8 Television;Social Media websites or apps;Word of mouth 3
9 Television;Word of mouth 2
10 Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 6
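Two things may be worth checking with imported questionnaire data. strsplit only accepts character vectors, so a column that was read in as a factor (the default for read.csv before R 4.0.0) has to be converted first. Blank answers are another edge case: strsplit yields zero elements for an empty string but one element (NA) for a missing value. A minimal sketch under those assumptions; the helper name count_items and the responses vector are made up for illustration:

```r
# Hypothetical helper (not from the answer): count ";"-separated items,
# tolerating factor input, empty strings and NA
count_items <- function(x, sep = ";") {
  x <- as.character(x)                          # strsplit() errors on factors
  n <- lengths(strsplit(x, sep, fixed = TRUE))  # number of pieces per entry
  n[is.na(x)] <- NA_integer_                    # report missing answers as NA, not 1
  n
}

# Made-up input: a factor containing a blank and a missing response
responses <- factor(c("Radio;Television;Word of mouth", "Television", "", NA))
count_items(responses)
# [1]  3  1  0 NA
```

On the question's data this should give the same counts as the plain lengths(strsplit(...)) call; the extra steps only matter for factor columns and blank or missing answers.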
Used data:
dfr <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
Upvotes: 2