Stanislav Jirak
Stanislav Jirak

Reputation: 853

How to execute RxSqlServerData method in R?

I'd like to create and deploy a model into SQL Server in order to use it with new PREDICT() in-built function. However, it seems I'm stuck with RxSqlServerData method in R. Whenever I run my script, I get this error:

Error in rxExecJob(rxCallInfo(matchCall, .rxDeprecated = "covariance"), : Data must be an RxSqlServerData data source for this compute context.

Here's my code so far:

#Logistic plain select sql query
#input_query = 'SELECT app.ClientAgeToApplicationDate AS Age, IIF(conc.FirstInstallmentDelay>60,1,0) AS FPD60 FROM dim.Application app JOIN dim.Contract con ON app.ApplicationID = con.ApplicationID JOIN dim.Contract_Calculated conc ON con.ContractID = conc.ContractId'

#LinReg aggregated query
input_query = '
        *SQL QUERY, too long to paste...*
    '

connStr <- paste("Driver=SQL Server; Server=", "czphaddwh01\\dev",
                 ";Database=", "DWH_Staging", ";Trusted_Connection=true", sep = "");

#Set compute context to SQL Server. Does not load any data into a memory of the local client. OBDC can't.
cc <- RxInSqlServer(connectionString = connStr);
rxSetComputeContext(cc)

input_data <- RxSqlServerData(sqlQuery = input_query, connectionString = connStr)
risk <- rxImport(input_data)
#head(risk)

#Binary regression for non-aggregated sql query
#logit_model <- rxLogit(Age ~ FPD60, data = risk)

#LinReg for aggregated sql query
LinReg_model <- rxLinMod(RiskFPD60 ~ Age, data = risk)

I'm new to R. Any help would be greatly appreciated.

Upvotes: 1

Views: 1057

Answers (1)

Hong Ooi
Hong Ooi

Reputation: 57686

When you run

cc <- RxInSqlServer(connectionString = connStr);
rxSetComputeContext(cc)

you tell R to run any Microsoft analytics functions (basically those starting with rx) in the SQL compute context. This means that all the processing will be handled inside the database. R essentially is acting as a shell to SQL.

Naturally, this requires that the dataset you're working with has to be actually in the database: a table, view, or query returning a resultset.

When you then run

risk <- rxImport(input_data)
LinReg_model <- rxLinMod(RiskFPD60 ~ Age, data = risk)

you import your data into a local data frame, and then try to fit a model on it. But you previously told R to do the number-crunching in-database, and your data is local. So it will complain.

The solution is to pass your RxSqlServerData object directly to rxLinMod:

LinReg_model <- rxLinMod(RiskFPD60 ~ Age, data = input_data)

Upvotes: 1

Related Questions