Reputation: 117
I use sqldf function for data manipuation in R studio. And I met a problem that sqldf function doesn't take into account the condition I set up in my query. For example the code
d2 <- sqldf("select a.'n1.0', a.date, a.tot_cap*100/a.'n1.0' as afasf from data a where a.date < '2013-03-01'")
gives exact the same result as code
d2 <- sqldf("select a.'n1.0', a.date, a.tot_cap*100/a.'n1.0' as afasf from data a").
Setting single quotes, double quotes, no quotes at all doesn't help.
Does anybody know what's the problem and how to resolve it?
Thanks in advance.
Here is the result of dput(data2)
command where data2 is the first 40 rows of the initial dataset.
structure(list(REGN = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), date = structure(c(14761,
14791, 14822, 14853, 14883, 14914, 14944, 14975, 15006, 15034,
15065, 15095, 15126, 15156, 15187, 15218, 15248, 15279, 15309,
15706, 15737, 15765, 15796, 15826, 15857, 15887, 15918, 15949,
15979, 16010, 16040, 16071, 16102, 16130, 16161, 16191, 16222,
16252, 16283, 16314), class = "Date"), tot_cap = structure(c(3.29680129097603e-316,
3.27881038454837e-316, 3.31569035934127e-316, 3.42544269478722e-316,
3.46449744773999e-316, 3.76857311386686e-316, 3.83375272419446e-316,
3.89781752479879e-316, 3.88106524094525e-316, 3.91119504385185e-316,
3.54278081534629e-316, 3.66904833254234e-316, 3.84160896084211e-316,
3.83063042693901e-316, 3.87586035817945e-316, 3.98758302742101e-316,
4.04557250040469e-316, 4.19023427922167e-316, 4.27870651560933e-316,
5.55371435659506e-316, 5.45611929687006e-316, 5.73639651252866e-316,
5.74625569130456e-316, 5.74576631928235e-316, 5.79667637503333e-316,
5.86047635645149e-316, 5.66957630781748e-316, 6.16857297583704e-316,
6.25638237375425e-316, 6.32633446059442e-316, 6.54992194176386e-316,
6.44650520772079e-316, 6.41496840466797e-316, 6.51734725500882e-316,
6.52890122716964e-316, 6.49014508749322e-316, 6.47399926921995e-316,
6.49043322657787e-316, 6.47938176858544e-316, 6.65456810876004e-316
), class = "integer64"), bas_cap = structure(c(0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 5.24913731265079e-316, 5.71723269425777e-316,
5.71723269425777e-316, 5.72079658738749e-316, 5.71723269425777e-316,
5.71723269425777e-316, 5.71723269425777e-316, 5.84030513832873e-316
), class = "integer64"), osn_cap = structure(c(0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 5.24913731265079e-316, 5.71723269425777e-316,
5.71723269425777e-316, 5.72079658738749e-316, 5.71723269425777e-316,
5.71723269425777e-316, 5.71723269425777e-316, 5.84030513832873e-316
), class = "integer64"), n1.0 = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 14.13, 13.91, 13.74, 13.25,
13.11, 13.59, 13.07, 13.06), n1.1 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11.65, 12.3, 12.11, 11.75,
11.65, 12.04, 11.61, 11.53), n1.2 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11.65, 12.3, 12.12, 11.77,
11.66, 12.06, 11.62, 11.54), n1 = c(13.96, 13.66, 15.07, 15.75,
15.49, 15.87, 15.7, 16.32, 16.17, 16.67, 14.97, 15.06, 15.22,
15.02, 14.68, 14.42, 13.5, 13.29, 13.1, 13.25, 12.53, 13.28,
13.35, 13.42, 13.67, 14.14, 13.1, 14.13, 14.13, 14.55, 14.53,
14.51, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 40L
), class = "data.frame")
Upvotes: 1
Views: 143
Reputation: 269491
There are several problems here:
single quotes are used for quoting constants. Use double quotes to quote variable names.
SQLite does not support a Date type so the dates are read in as numbers (number of days since the UNIX Epoch) and then the statement is attempting to compare that number to a character string.
although not wrong you don't really need the alias a
since there is only one data frame so there cannot be any confusion as to what is being referenced.
the question refers to data2
but data
is used in the R code. We have changed the R code to use data2
.
See the links given in the comments under the question for more information.
Using the default SQLite backend, we preface sqldf
with fn$
to cause it to run anything between backticks in R and replace that expression with the result of that run.
d2 <- fn$sqldf("select \"n1.0\", date, tot_cap * 100/ \"n1.0\" as afasf
from data2
where date < `as.Date('2013-03-01')`")
Alternately, use the H2 backend. (As with SQLite, the H2 database is included in the R driver so you don't have to install it separately but you do need to ensure that you have Java installed -- fortunately that is extremely simple as it has an automated installation procedure.) That database does have a Date type and is able to compare such dates with correclty formatted character strings representing dates.
library(RH2) # <----------note
d3 <- fn$sqldf("select \"n1.0\", date, tot_cap * 100 / \"n1.0\" as afasf
from data2
where date < '2013-03-01'")
Be sure to detach RH2 from the search path if you want to resume using SQLite.
Upvotes: 2