David Thurmab

Reputation: 23

Beginner to R - basics of table manipulation

I am just starting with R and trying to learn ways of working with CSV files.

Sample Data Set

Org_Name  Question#  Response (scale from 1 through 5)
Org1      1         1
Org1      2         3
Org1      3         5
Org2      1         4
Org2      2         2
Org2      3         3
Org3      1         4
Org3      2         1
Org3      3         5

I am trying to figure out how to do some data analysis using R.

So my questions are:

  1. Is R even a good tool for this? I am not sure if Excel would be a better choice (I am more comfortable with Excel).

  2. How does one work with tables in R? For example, I want to check which Org Names scored high (4-5) on Question #2 and low (1-2) on Question #1. How frequently does that happen? Is there a method to do this?

  3. Are there any good tutorials/resources for learning R? I understand that R is a great choice for data analysis and I would like to learn more about it.

Upvotes: 2

Views: 142

Answers (2)

phoebe

Reputation: 9

If you're a beginner, downloading some packages will help you a lot. Here is some example code for your questions using the dplyr package:

1) R is a great tool for any kind of data manipulation or analyses, and reading csv files is very easy:

dat <- read.csv("path")
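
A minimal sketch of the read step that runs on its own. It writes a tiny temporary file first, and it assumes a header row with the column names Org_Name, Question and Response (these names are an assumption; the header in the question uses Question# and a longer Response label).

```r
# Write a small CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Org_Name,Question,Response",
             "Org1,1,1",
             "Org1,2,3",
             "Org2,1,4",
             "Org2,2,2"), csv_path)

# Read it back; stringsAsFactors = FALSE keeps Org_Name as plain text.
dat <- read.csv(csv_path, stringsAsFactors = FALSE)

str(dat)    # column names and types
head(dat)   # first rows of the table
```

str() and head() are a quick way to confirm the file was parsed the way you expected before doing any filtering.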

2) Once you read your CSV file into an object, like "dat" above, the dplyr package has a bunch of functions to do pretty much any manipulation, e.g., your question of "check which Org Names have scored high (4-5) in Question#2 and Low (1-2) in Question#1." Assuming the data has one row per org with a separate column for each question (here called Question1 and Question2), this will give you the Org_Names that satisfy the conditions you specified:

library(dplyr)
# assumes one row per org, with columns Question1 and Question2
dat %>%
   filter(Question2 >= 4 & Question1 <= 2) %>% select(Org_Name)

As for how frequently, I'm guessing you want a count:

dat %>%
   filter(Question2 >= 4 & Question1 <= 2) %>% select(Org_Name) %>% nrow()
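
The sample in the question has one row per answer rather than one column per question, so a filter on per-question columns needs the data in a different shape first. Here is a sketch of one way to get the same result from that long layout, assuming dplyr is installed and the columns are named Org_Name, Question and Response (hypothetical names, not taken from an actual file):

```r
library(dplyr)

# Sample data from the question, one row per (org, question) pair.
dat <- data.frame(
  Org_Name = rep(c("Org1", "Org2", "Org3"), each = 3),
  Question = rep(1:3, times = 3),
  Response = c(1, 3, 5, 4, 2, 3, 4, 1, 5),
  stringsAsFactors = FALSE
)

# Collapse to one row per org holding its Question 1 and Question 2 scores,
# then apply the two conditions.
matches <- dat %>%
  group_by(Org_Name) %>%
  summarise(q1 = Response[Question == 1],
            q2 = Response[Question == 2]) %>%
  filter(q2 >= 4, q1 <= 2)

matches$Org_Name   # orgs that satisfy both conditions
nrow(matches)      # how many there are
```

On the sample data no org meets both conditions, because no Question 2 response is 4 or higher.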

Upvotes: 0

Tim Biegeleisen

Reputation: 522712

1) R is a great tool for handling your CSV data. In a few minutes, you can download RStudio and be up and running.

Here is some sample code which shows you how to get started:

sample <- data.frame(Org_Name = c(rep("Org1", 3), rep("Org2", 3), rep("Org3", 3)),
                     Question = c(1,2,3,1,2,3,1,2,3),
                     Response = c(1,3,5,4,2,3,4,1,5))

2) This defines a data frame called sample and assigns your data to it. To find out all Orgs which scored 4 or higher on question 2, you can use this:

> sample$Org_Name[sample$Response >= 4 & sample$Question == 2]
factor(0)

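This returns factor(0) (or character(0) in R 4.0 and later, where data.frame no longer converts strings to factors by default), which means that no Orgs match. However, if you want to find out which Orgs have a low response for question 2, you can try: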

> sample$Org_Name[sample$Response <= 2 & sample$Question == 2]
[1] Org2 Org3
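
The question asks for both conditions at once (high on question 2 and low on question 1). Here is a base-R sketch of combining the two lookups, using the same data under a hypothetical name, survey, chosen only so it does not shadow R's built-in sample() function:

```r
# Same data as above, stored under a hypothetical name.
survey <- data.frame(
  Org_Name = rep(c("Org1", "Org2", "Org3"), each = 3),
  Question = rep(1:3, times = 3),
  Response = c(1, 3, 5, 4, 2, 3, 4, 1, 5),
  stringsAsFactors = FALSE
)

# Orgs meeting each condition separately.
high_q2 <- survey$Org_Name[survey$Question == 2 & survey$Response >= 4]
low_q1  <- survey$Org_Name[survey$Question == 1 & survey$Response <= 2]

# Orgs meeting both conditions, and how many there are.
both <- intersect(high_q2, low_q1)
length(both)

# Lay the scores out as an org-by-question table to see them all at once.
xtabs(Response ~ Org_Name + Question, data = survey)
```

Because each org answers each question once, each cell of the xtabs table is simply that org's response to that question.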

3) Google is a great place to start for finding R resources, and the official R documentation is good too.

Upvotes: 2
