Kaja

Reputation: 3057

Smoothing a plot in R

I have a time series. When I plot it, I get the following diagram:

[image: plot of the time series]

My data:

539 532 531 538 544 554 575 571 543 559 511 525 512 540
535 514 524 527 532 547 564 548 572 564 549 532 519 520
520 543 550 542 528 523 531 548 554 574 575 560 534 518
511 519 527 554 543 527 540 524 523 539 569 552 553 540
522 522 492 519 532 527 532 550 535 517 551 548 571 574
539 535 515 512 510 527 533 543 540 533 519 539 555 542
574 543 555 539 507 522 518 519 516 546 523 530 532 539
540 568 554 563 550 526 509 492 525 519 527 526 515 530
531 553 563 562 576 568 539 516 512 500 516 542 522 527
523 531

How can I smooth this graph to see the sine function more clearly?

Upvotes: 0

Views: 238

Answers (2)

jlhoward

Reputation: 59355

Here are some things to get you started.
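The code below assumes a numeric vector named `values` that holds the series from the question. One possible way to create it, as a sketch, is to paste the posted numbers into `scan(text = ...)`:

```r
# Build `values` from the numbers posted in the question.
# scan() splits on whitespace and returns a numeric vector by default.
values <- scan(text = "
539 532 531 538 544 554 575 571 543 559 511 525 512 540
535 514 524 527 532 547 564 548 572 564 549 532 519 520
520 543 550 542 528 523 531 548 554 574 575 560 534 518
511 519 527 554 543 527 540 524 523 539 569 552 553 540
522 522 492 519 532 527 532 550 535 517 551 548 571 574
539 535 515 512 510 527 533 543 540 533 519 539 555 542
574 543 555 539 507 522 518 519 516 546 523 530 532 539
540 568 554 563 550 526 509 492 525 519 527 526 515 530
531 553 563 562 576 568 539 516 512 500 516 542 522 527
523 531
", quiet = TRUE)

length(values)  # number of observations read
```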

# `values` is assumed to be a numeric vector holding the series from the question
df <- data.frame(index=seq_along(values),values)
# loess smoothing; note the use of predict(fit)
fit.loess <- loess(values~index,df,span=.1)
plot(df, type="l", col="blue",main="loess")
lines(df$index,predict(fit.loess),col="red")

# non-linear regression using a single sine term
fit.nls <- nls(values~a*sin(b*index+c)+d,df,
           start=c(a=1000,b=pi/10,c=0,d=mean(df$values)))
plot(df, type="l", col="blue",main="sin [1 term]")
lines(df$index,predict(fit.nls),col="red")

# non-linear regression using 2 sine terms
fit.nls <- nls(values~a1*sin(b1*index+c1)+a2*sin(b2*index+c2)+d,df,
               start=c(a1=1000,b1=pi/10,c1=1,
                       a2=1000,b2=pi/2,c2=1,d=mean(df$values)))
plot(df, type="l", col="blue",main="sin [2 terms]")
lines(df$index,predict(fit.nls),col="red")

From the non-linear fits you can get an estimate of the angular frequency (b) using summary(fit.nls); the period, in index units, is then 2*pi/b.
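To make the link between the fitted coefficient and the period concrete, here is a minimal sketch on synthetic data. The series, the seed, and the starting values are made up for illustration and are not the question's data; nls can fail to converge when the starting frequency is far from the truth, so the start is deliberately placed near it.

```r
# Synthetic noisy sine with a known period of 12 index units (illustration only).
set.seed(1)
index  <- 1:120
values <- 535 + 20 * sin(2 * pi * index / 12 + 0.5) +
  rnorm(length(index), sd = 5)
df <- data.frame(index, values)

# Same model form as in the answer: a*sin(b*index + c) + d.
# Starting values are assumptions chosen close to the true parameters.
fit <- nls(values ~ a * sin(b * index + c) + d, data = df,
           start = list(a = 15, b = 2 * pi / 12.25, c = 0.3, d = mean(values)))

b_hat  <- coef(fit)[["b"]]   # estimated angular frequency
period <- 2 * pi / abs(b_hat)  # convert to period in index units
period
```

For real data, a rough starting value for `b` can be read off the plot by measuring the distance between two peaks and converting it with `2*pi/distance`.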

Read the documentation on loess, nls, and predict.
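For loess, the `span` argument controls how much of the data each local fit uses, so it is the main knob to experiment with. The sketch below uses synthetic data (made up for illustration, not the question's series) to compare a narrow and a wide span:

```r
# Synthetic noisy sine, roughly the same length as the question's series.
set.seed(42)
index  <- 1:128
values <- 535 + 20 * sin(2 * pi * index / 12) + rnorm(128, sd = 8)
df <- data.frame(index, values)

# Narrow span: follows the oscillation. Wide span: flattens it out.
fit_narrow <- loess(values ~ index, data = df, span = 0.1)
fit_wide   <- loess(values ~ index, data = df, span = 0.75)

plot(df, type = "l", col = "grey50", main = "loess with two spans")
lines(df$index, predict(fit_narrow), col = "red")
lines(df$index, predict(fit_wide), col = "blue")

# Residual spread is one rough way to see how closely each curve tracks the data.
res_sd <- c(narrow = sd(residuals(fit_narrow)), wide = sd(residuals(fit_wide)))
res_sd
```

If the span is wide compared with the period of the oscillation, the smoother averages the sine away, so a small span is the better fit for this kind of question.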

Upvotes: 3

andresram1

Reputation: 131

You can use a smoothing function from any R package you like. The most basic approach is a moving average (a running mean over a sliding window); note that this is a smoother, not the same thing as the MA component of an ARIMA model.

Here is an easy example to explore, using a different data set (I hope it helps):

#Read the data
cd4Data <- read.table("./RData/cd4.data",  col.names=c("time", "cd4", "age", "packs", "drugs", "sex", "cesd", "id"))
cd4Data <- cd4Data[order(cd4Data$time),]
head(cd4Data)

#Plot the data
par(mfrow=c(1,1))
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)

#A moving average (5-point window: i-2 to i+2)
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)
n <- nrow(cd4Data)
#Pre-allocate full-length vectors; positions without a complete window stay NA
aveTime <- aveCd4 <- rep(NA,n)
for(i in 3:(n-2)){
    aveTime[i] <- mean(cd4Data$time[(i-2):(i+2)])
    aveCd4[i] <- mean(cd4Data$cd4[(i-2):(i+2)])
}
lines(aveTime,aveCd4,col="blue",lwd=3)

#Average many more points (401-point window: i-200 to i+200; needs more than 400 rows)
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)
n <- nrow(cd4Data)
aveTime <- aveCd4 <- rep(NA,n)
#Stop at n-200 so the window never runs past the end of the data
for(i in 201:(n-200)){
    aveTime[i] <- mean(cd4Data$time[(i-200):(i+200)])
    aveCd4[i] <- mean(cd4Data$cd4[(i-200):(i+200)])
}
lines(aveTime,aveCd4,col="blue",lwd=3)
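The same centered moving average can be written without an explicit loop. As a sketch, `stats::filter` with equal weights and `sides = 2` computes it directly; the vector below is just the first 28 values from the question, and the window width `k` is an arbitrary choice for illustration:

```r
# First 28 values of the question's series (enough to show the idea).
values <- c(539, 532, 531, 538, 544, 554, 575, 571, 543, 559, 511, 525, 512, 540,
            535, 514, 524, 527, 532, 547, 564, 548, 572, 564, 549, 532, 519, 520)

k <- 5  # window width (odd, so the window is centered on each point)

# Equal weights 1/k give a plain running mean; sides = 2 centers the window.
# stats:: is spelled out because other packages (e.g. dplyr) also export filter().
ma <- stats::filter(values, rep(1 / k, k), sides = 2)

plot(values, type = "l", col = "grey50", main = "centered moving average")
lines(as.numeric(ma), col = "blue", lwd = 3)
```

The first and last `(k - 1) / 2` entries of `ma` are `NA` because the window is incomplete there, which matches what the loop version leaves at the ends.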

Upvotes: 2
