Saurabh

Reputation: 1626

Unexpected result while using lowess to smooth a data.table column in R

I have a data.table, test_dt, in which I want to smooth the y column using the lowess function.

test_dt <- structure(list(x = c(28.75, 30, 31.25, 32.5, 33.75, 35, 36.25, 
37.5, 38.75, 40, 41.25, 42.5, 43.75, 45, 46.25, 47.5, 48.75, 
50, 52.5, 55, 57.5, 60, 62.5, 63.75, 65, 67.5, 70, 72.5, 75, 
77.5, 80, 82.5, 85, 87.5, 90, 92.5, 95, 97.5, 100, 102.5, 103.75, 
105, 106.25, 107.5, 108.75, 110, 111.25, 112.5, 113.75, 115, 
116.25, 117.5, 118.75, 120, 121.25, 122.5, 125, 130, 135, 140, 
145), y = c(116.78, 115.53, 114.28, 113.05, 111.78, 110.53, 109.28, 
108.05, 106.78, 105.53, 104.28, 103.025, 101.775, 100.525, 99.28, 
98.05, 96.8, 95.525, 93.1, 90.65, 88.225, 85.775, 83.35, 82.15, 
80.9, 78.5, 76.075, 73.675, 71.25, 68.85, 66.5, 64.075, 61.725, 
59.4, 57.075, 54.725, 52.475, 50.225, 48, 45.75, 44.65, 43.55, 
42.475, 41.45, 40.35, 39.275, 38.25, 37.225, 36.175, 35.175, 
34.175, 33.225, 32.275, 31.3, 30.35, 29.45, 27.625, 24.175, 21, 
18.125, 15.55), z = c(116.778248424972, 115.531456655985, 114.284502467544, 
113.034850770519, 111.784500981402, 110.533319511795, 109.284500954429, 
108.034850457264, 106.784502297216, 105.531265565238, 104.278221015846, 
103.026780249377, 101.775992395759, 100.528761292272, 99.2853168637851, 
98.043586202838, 96.8021989104315, 95.5702032427799, 93.1041279347743, 
90.6575956222915, 88.2179393348852, 85.783500434839, 83.3503011023971, 
82.136280706039, 80.922846825298, 78.4965179152157, 76.0823895453039, 
73.6686672097464, 71.264486719796, 68.8702598156142, 66.4865368523571, 
64.1182523898466, 61.7552221811808, 59.4004347738795, 57.0823289450761, 
54.7908645949795, 52.5071096685879, 50.2308279167219, 47.9940967492558, 
45.7658417529877, 44.6514226583931, 43.5622751034012, 42.4876666190815, 
41.4173110074806, 40.3555584369672, 39.3004471381618, 38.2552969838653, 
37.2202353638959, 36.1963659189447, 35.1889616530209, 34.2004259883859, 
33.2295174626826, 32.2669278456991, 31.3171387914754, 30.3742375589802, 
29.4555719783757, 27.6243725086786, 23.9784367995753, 27.625, 
27.625, 27.625)), row.names = c(NA, -61L), class = c("data.table", 
"data.frame"))

As can be seen in the image below, I am getting an unexpected result. The expected result is that the line (z column) in the graph below should closely follow the points (y column).

[Image: Smoothing using lowess]

Here is my code:

library(data.table)
library(ggplot2)
test_dt[, z := lowess(x = x, y = y, f = 0.1)$y]
ggplot(test_dt) + geom_point(aes(x, y)) + geom_line(aes(x, z))

Q1. Can someone suggest why lowess is not smoothing properly?

Q2. Since lowess is not working as expected, is there another function in R that is better suited to smoothing the y column without producing a spike at the boundary points, as lowess did?

Upvotes: 0

Views: 191

Answers (2)

Saurabh

Reputation: 1626

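The problem was solved by setting the number of robustifying iterations to zero. With iter = 0, lowess skips the robustness reweighting step (the default is iter = 3), so it behaves more like loess, whose default family = "gaussian" also fits without robustness iterations.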

test_dt[, z := lowess(x = x, y = y, f = 0.1, iter = 0)$y]
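To see what the iter argument changes, the two settings can be compared side by side. This is a minimal sketch on made-up data: the vectors x and y below are hypothetical stand-ins, not test_dt, so it may not reproduce the spike. It only shows one way to measure how far each fit strays from the points.

```r
# Hypothetical, nearly linear data with uneven spacing in x
# (an assumption for illustration; this is not the question's test_dt)
x <- c(seq(30, 50, by = 1.25), seq(52.5, 120, by = 2.5), seq(125, 145, by = 5))
y <- 150 - 1.2 * x + 0.002 * x^2

# Default call: lowess runs 3 robustifying iterations (iter = 3),
# which down-weight points that have large residuals
fit_robust <- lowess(x, y, f = 0.1)

# iter = 0: a plain locally weighted linear fit, no robustness reweighting
fit_plain <- lowess(x, y, f = 0.1, iter = 0)

# lowess returns its result sorted by x; x is already sorted here,
# so the fitted values line up with y element by element
max_gap_robust <- max(abs(fit_robust$y - y))
max_gap_plain  <- max(abs(fit_plain$y - y))

print(c(robust = max_gap_robust, plain = max_gap_plain))
```

If the robust fit shows a much larger gap than the plain one on your own data, the robustifying iterations are a likely cause of the mismatch.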

Upvotes: 0

Allan Cameron

Reputation: 174293

You could use loess instead:

test_dt[, z := predict(loess(y ~ x, data = test_dt))]

ggplot(test_dt) + geom_point(aes(x, y)) + geom_line(aes(x, z))
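If the default loess fit turns out smoother than wanted, the span argument controls how much of the data each local fit uses. A minimal sketch, on hypothetical data rather than test_dt; the value span = 0.3 is an arbitrary choice for illustration:

```r
# Hypothetical data standing in for test_dt (an assumption, not the real table)
dat <- data.frame(x = seq(30, 145, by = 2.5))
dat$y <- 150 - 1.2 * dat$x + 0.002 * dat$x^2

# loess uses span = 0.75 by default; a smaller span makes each local fit
# use fewer neighbouring points, so the curve follows the data more closely
fit_default <- loess(y ~ x, data = dat)
fit_tight   <- loess(y ~ x, data = dat, span = 0.3)

# predict() without newdata returns the fitted values at the original x
dat$z_default <- predict(fit_default)
dat$z_tight   <- predict(fit_tight)

head(dat)
```

The same span argument can be passed to geom_smooth(method = "loess", span = 0.3) when only the plot is needed.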

Note, though, that if all you want to do is plot the line, this is the method geom_smooth uses by default for data sets with fewer than 1,000 observations. So, without even creating a z column, you could do:

ggplot(test_dt, aes(x, y)) + geom_point() + geom_smooth()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2021-11-07 by the reprex package (v2.0.0)

Upvotes: 2
