peter

Reputation: 13

How do I count the number of cells from the CSV file in R?

My dataset is named studentperformance and looks like this:

gender  race    lunch   math  reading writing 
  2      2        2      72     72      74
  2      3        2      69     90      88
  2      2        2      90     95      93
  1      1        1      47     57      44
  1      3        2      76     78      75
  2      2        2      71     83      78
  2      2        2      88     95      92
  1      2        1      40     43      39
  1      4        1      64     64      67
  2      2        1      38     60      50

I want to count how many times the value 2 appears in the gender column. I tried this code:

count(studentperformance$gender[1:10], vars = "2")

But the code throws an error. How can I achieve this?

Upvotes: 1

Views: 1356

Answers (3)

hello_friend

Reputation: 5788

Consider also:

studentperformance <- transform(studentperformance,
                                count_by_gender = ave(studentperformance$gender,
                                                      studentperformance$gender,
                                                      FUN = length))

Data:

structure(
  list(
    gender = c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
               2L),
    race = c(2L, 3L, 2L, 1L, 3L, 2L, 2L, 2L, 4L, 2L),
    lunch = c(2L,
              2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L),
    math = c(72L, 69L, 90L,
             47L, 76L, 71L, 88L, 40L, 64L, 38L),
    reading = c(72L, 90L, 95L,
                57L, 78L, 83L, 95L, 43L, 64L, 60L),
    writing = c(74L, 88L, 93L,
                44L, 75L, 78L, 92L, 39L, 67L, 50L),
    count_by_gender = c(6L, 6L,
                        6L, 4L, 4L, 6L, 6L, 4L, 4L, 6L)
  ),
  class = "data.frame",
  row.names = c(NA,-10L)
)
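To see what ave() is doing here, a minimal sketch on a bare vector may help. The vector below is typed in by hand to match the ten rows in the question, and the names count_by_gender and count_for_2 are only illustrative:

```r
# Hand-typed gender values matching the ten rows shown in the question
gender <- c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L)

# ave() applies FUN within each group and returns a vector
# of the same length as the input, one value per row
count_by_gender <- ave(gender, gender, FUN = length)
print(count_by_gender)

# The count for gender == 2 is the value carried on any row where gender is 2
count_for_2 <- unique(count_by_gender[gender == 2])
print(count_for_2)
```

This approach is mostly useful when you want the count attached to every row; if you only need a single number, it does more work than necessary.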

Upvotes: 0

LewT

Reputation: 56

You can create simple tables without any indexing or comparisons. Try the following with count (here df stands for your data frame), which returns a column gender holding the unique values of gender and a column n holding the count of each value:

library(dplyr)
count(df, gender)

#### OUTPUT ####
# A tibble: 2 x 2
  gender     n
   <int> <int>
1      1     4
2      2     6

You can do pretty much the same thing using base R's table. The output is just a little different: The unique values are now the variable headers 1 and 2, and the counts are the row just beneath, with 4 and 6:

table(df$gender)

#### OUTPUT ####
1 2 
4 6 
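If you only need the single count for the value 2, you can pull it out of the table by name. This is a small sketch on a hand-typed vector rather than the real data frame, and the names counts and n_two are just for illustration:

```r
# Hand-typed gender values matching the ten rows shown in the question
gender <- c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L)

counts <- table(gender)

# Table names are character strings, so index with "2" rather than the number 2;
# a numeric index would select by position, not by value
n_two <- counts[["2"]]
print(n_two)
```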

Upvotes: 1

DJV

Reputation: 4863

As @user2974951 said, you can use base R for that:

sum(studentperformance$gender==2)

[1] 6
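One caveat, assuming the real CSV might contain missing values (the sample shown in the question does not): a comparison with NA yields NA, so sum() returns NA unless you pass na.rm = TRUE. A short sketch with a made-up vector:

```r
# Made-up vector with one missing value, purely for illustration
gender <- c(2L, 2L, NA, 1L, 2L)

# Without na.rm the NA propagates through the sum
with_na <- sum(gender == 2)
print(with_na)

# With na.rm = TRUE the NA comparison is dropped before summing
without_na <- sum(gender == 2, na.rm = TRUE)
print(without_na)
```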

You can also create a table for every level in gender:

table(studentperformance$gender, factor(studentperformance$gender))

    1 2
  1 4 0
  2 0 6

Sample data:

studentperformance <- read.table(text = "gender  race    lunch   math  reading writing 
  2      2        2      72     72      74
  2      3        2      69     90      88
  2      2        2      90     95      93
  1      1        1      47     57      44
  1      3        2      76     78      75
  2      2        2      71     83      78
  2      2        2      88     95      92
  1      2        1      40     43      39
  1      4        1      64     64      67
  2      2        1      38     60      50", header = TRUE)

Upvotes: 1
