Reputation: 115
I wasn't able to find much pertinent to this on Stack Overflow or elsewhere on the web.
I'm getting this error:
> library(knitr)
> knit2html("pa1_template.rmd")
Error in knit2html("pa1_template.rmd") :
It seems you should call rmarkdown::render() instead of knitr::knit2html() because pa1_template.rmd appears to be an R Markdown v2 document.
I just ran it with rmarkdown::render(), and it created the HTML file. However, my assignment wants me to run it through knit2html() and create an md file.
When I run the Rmd file through the RStudio "Knit HTML" menu option, it creates the HTML file fine.
Any pointers appreciated.
Here is the content of the rmd file:
## Loading and preprocessing the data
Read the data file in.
```{r readfile}
steps<-read.csv("activity.csv",header=TRUE, sep=",")
steps_good<-subset(steps, !is.na(steps))
```
Sum the number of steps per day
```{r summarize/day}
steps_day<-aggregate(steps~date, data=steps_good, sum)
```
Create a histogram of the results
```{r histogram}
hist(steps_day$steps, main="Frequency of Steps/day", xlab="Steps/Day", border="blue", col="orange")
```
# What is the mean total number of steps taken per day?
Calculate the mean of the steps per day
```{r means_steps/day}
mean_steps<-mean(steps_day$steps)
mean_steps
```
Calculate the median of the steps per day
```{r median_steps/day}
med_steps<-median(steps_day$steps)
med_steps
```
# What is the average daily activity pattern?
Get the average steps per 5 minute interval
```{r avg_5_min}
step_5min<-aggregate(steps~interval, data=steps_good, mean)
```
Plot steps against time interval, averaged across all days
```{r plot_interval}
plot(step_5min$interval,step_5min$steps, type="l", main="steps per time interval",ylab="Steps",xlab="Interval")
```
On average, which interval during the day has the most steps?
```{r max_interval}
step_5min$interval[which.max(step_5min$steps)]
```
# Imputing missing values
How many NAs are there in the original table?
```{r NAs}
steps_na<-which(is.na(steps))
length(steps_na)
```
Merge 5 minute interval with original steps table
```{r merge}
steps_filled<-merge(steps, step_5min,by="interval")
```
Replace NA values with mean of steps values for that time interval
```{r replace_na}
steps_na<-which(is.na(steps_filled$steps.x))
steps_filled$steps.x[steps_na]<-steps_filled$steps.y[steps_na]
```
Create a histogram of the results
```{r new_hist}
steps_day_new<-aggregate(steps.x~date, data=steps_filled, sum)
hist(steps_day_new$steps.x, main="Frequency of Steps/day", xlab="Steps/Day", border="blue", col="orange")
```
It looks like the imputing of NA values increases the middle bar (mean/median) height, but other bars seem unchanged.
Calculate the new mean of the steps per day
```{r new_means_steps/day}
mean_steps<-mean(steps_day_new$steps.x)
mean_steps
```
Calculate the new median of the steps per day
```{r new_median_steps/day}
med_steps<-median(steps_day_new$steps.x)
med_steps
```
It looks like the mean did not change, but the median took on the value of the mean, now that some non-integer values were plugged in.
# Are there differences in activity patterns between weekdays and weekends?
Regenerate steps_filled, and flag whether a date is a weekend or a weekday.
Convert resulting column to factor.
```{r fill_weekdays}
steps_filled<-merge(steps, step_5min,by="interval")
steps_filled$steps.x[steps_na]<-steps_filled$steps.y[steps_na]
# weekdays() returns locale-dependent day names; the comparison below assumes an English locale
steps_filled$wkday<-weekdays(as.Date(steps_filled$date))
# Vectorized flag: "Weekend" for Saturday/Sunday, "Weekday" otherwise, stored as a factor
steps_filled$day_type<-as.factor(ifelse(steps_filled$wkday %in% c("Saturday","Sunday"), "Weekend", "Weekday"))
```
Get average steps per interval and day_type
```{r plot_interva_day_type}
steps_interval_day<-aggregate(steps_filled$steps.x,by=list(steps_filled$interval,steps_filled$day_type),mean)
```
Plot the weekend and weekday results in a panel plot.
```{r day_type_plot}
weekday_intervals<-subset(steps_interval_day, steps_interval_day$Group.2=="Weekday",select=c("Group.1","x"))
weekend_intervals<-subset(steps_interval_day, steps_interval_day$Group.2=="Weekend",select=c("Group.1","x"))
par(mfrow=c(1,2))
plot(weekday_intervals$Group.1,weekday_intervals$x,type="l",xlim=c(0,2400), ylim=c(0,225),main="Weekdays",xlab="Intervals",ylab="Mean Steps/day")
plot(weekend_intervals$Group.1,weekend_intervals$x,type="l",xlim=c(0,2400), ylim=c(0,225),main="Weekends",xlab="Intervals",ylab="")
```
Upvotes: 0
Views: 2824
Reputation: 1
Try this:
setwd("working_directory")
library(knitr)
knit("PA1_template.Rmd", output = NULL)
Adding output = NULL was key for me.
Good luck!
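For anyone who wants to see what that call does before running it on the real assignment, here is a minimal sketch using a tiny stand-in file (the name example.Rmd and its contents are made up for illustration). knit() only converts the Rmd to Markdown and returns the path of the file it wrote; with output = NULL that file ends up in the current working directory.

```r
library(knitr)

# Work in a temporary directory so nothing in the project is overwritten.
old_wd <- setwd(tempdir())

# Hypothetical stand-in for PA1_template.Rmd: one heading and one R chunk.
fence <- strrep("`", 3)
writeLines(c(
  "## Example",
  "",
  paste0(fence, "{r add_numbers}"),
  "1 + 1",
  fence
), "example.Rmd")

# output = NULL lets knit() choose the output name; for an .Rmd input it writes a .md file.
md_file <- knit("example.Rmd", output = NULL, quiet = TRUE)
md_path <- normalizePath(md_file)
print(md_path)

# Restore the original working directory.
setwd(old_wd)
```

No HTML is produced by this step; it only generates the md file the assignment asks for.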
Upvotes: 0
Reputation: 5910
In RStudio, you can add keep_md: true to your YAML header:
---
title: "Untitled"
output:
  html_document:
    keep_md: true
---
With this option, you get both HTML and md files.
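The same setting can also be passed when rendering from the console rather than through the header. This is only a sketch: it assumes the rmarkdown package and pandoc are installed, and that PA1_template.Rmd (the file name used elsewhere in this thread) is in the working directory.

```r
# Render to HTML and keep the intermediate Markdown file as well.
# html_document(keep_md = TRUE) mirrors the keep_md: true YAML setting.
input <- "PA1_template.Rmd"  # assumed to be in the working directory

if (file.exists(input) && rmarkdown::pandoc_available()) {
  rmarkdown::render(
    input,
    output_format = rmarkdown::html_document(keep_md = TRUE)
  )
  # Check that both output files were written next to the input.
  print(file.exists(c("PA1_template.html", "PA1_template.md")))
}
```

The guard simply skips rendering when the file or pandoc is missing, so the snippet does nothing harmful if pasted into the wrong directory.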
Upvotes: 1