J.M

Reputation: 43

Counting number of elements in a dataframe column

I have an R dataframe into which I have imported a CSV file of questionnaire data.

One of my columns is called 'NewsMethods', where respondents were asked to list the methods by which they get news. This column looks like this in my dataset:

  1. Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth

  2. Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth

  3. Radio;Television;Social Media websites or apps;Word of mouth

  4. Television;Social Media websites or apps

... and so on.

What I would like to do is replace each entry in this column with the number of elements it contains. For example, I would like to replace the first list with the number 5.

If anyone has any ideas for how I could do this I would be very grateful. TIA

Edit

Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
Newspaper;Radio;Television;News websites (such as BBC News)
Television
Radio;Television;Word of mouth
Television;Social Media websites or apps;Word of mouth
Television;Word of mouth
Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth

I am expecting this to instead read: 5 5 4 2 4 1 3 3 2 6

Upvotes: 2

Views: 1984

Answers (2)

akrun

Reputation: 886938

We can use str_count from stringr to count the ";" separators and add 1:

library(stringr)
df1$Count <- str_count(df1$NewsMethods, ";")+1
df1$Count
#[1] 5 5 4 2 4 1 3 3 2 6

Or a base R option with gregexpr:

lengths(lapply(gregexpr(";", df1$NewsMethods), function(x) x[x>0]) )+1
#[1] 5 5 4 2 4 1 3 3 2 6
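Both approaches above rely on "number of separators + 1", which would report 1 for an empty response. If the column might contain empty strings or NA values (an assumption, since the sample data has neither), a minimal base R sketch could look like this. The vector `x` is a hypothetical toy example, not the asker's data:

```r
# Hypothetical toy vector with an empty and a missing response
x <- c("Radio;Television", "Television", "", NA)

# Count separators by comparing string lengths with and without ";"
n_sep <- nchar(x) - nchar(gsub(";", "", x, fixed = TRUE))

# Elements = separators + 1; empty strings count as 0, NA stays NA
count <- ifelse(x == "", 0L, n_sep + 1L)
count
```

Whether a missing response should stay NA or become 0 depends on how the questionnaire data is to be analysed.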

data

df1 <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
                                  'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
                                  'Radio;Television;Social Media websites or apps;Word of mouth',
                                  'Television;Social Media websites or apps',
                                  'Newspaper;Radio;Television;News websites (such as BBC News)',
                                  'Television',
                                  'Radio;Television;Word of mouth',
                                  'Television;Social Media websites or apps;Word of mouth',
                                  'Television;Word of mouth',
                                  'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')), 
             .Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")

Upvotes: 2

Jaap

Reputation: 83215

A base R alternative with a combination of strsplit and lengths:

lengths(strsplit(dfr$NewsMethods, split = ';'))

which gives:

> lengths(strsplit(dfr$NewsMethods, split = ';'))
 [1] 5 5 4 2 4 1 3 3 2 6

Assigning the result to a count-variable in your dataframe:

dfr$count <- lengths(strsplit(dfr$NewsMethods, split = ';'))

Now your dataframe looks like:

> dfr
                                                                                               NewsMethods count
1            Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth     5
2            Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth     5
3                                             Radio;Television;Social Media websites or apps;Word of mouth     4
4                                                                 Television;Social Media websites or apps     2
5                                              Newspaper;Radio;Television;News websites (such as BBC News)     4
6                                                                                               Television     1
7                                                                           Radio;Television;Word of mouth     3
8                                                   Television;Social Media websites or apps;Word of mouth     3
9                                                                                 Television;Word of mouth     2
10 Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth     6
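One thing to watch for: strsplit only accepts character vectors, so if the CSV was read with the column as a factor (for example with stringsAsFactors = TRUE), it has to be converted first. A small sketch of that case, using a hypothetical toy dataframe `toy` rather than the data from the question:

```r
# Hypothetical toy dataframe where the column is a factor
toy <- data.frame(
  NewsMethods = factor(c("Radio;Television;Word of mouth", "Television"))
)

# Convert to character before splitting; fixed = TRUE treats ";" literally
toy$count <- lengths(strsplit(as.character(toy$NewsMethods), split = ";", fixed = TRUE))
toy$count
```

Note that with this approach an empty string splits into zero elements, so it is counted as 0.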

Used data:

dfr <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
                                      'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
                                      'Radio;Television;Social Media websites or apps;Word of mouth',
                                      'Television;Social Media websites or apps',
                                      'Newspaper;Radio;Television;News websites (such as BBC News)',
                                      'Television',
                                      'Radio;Television;Word of mouth',
                                      'Television;Social Media websites or apps;Word of mouth',
                                      'Television;Word of mouth',
                                      'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')), 
                 .Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")

Upvotes: 2
