Reputation: 872
I am trying to save data as a 'csv' format file or extract as a data frame from a solr query similar to this:
http://localhost:8983/solr/techproducts/select?q=ipod&fl=id,cat,name,popularity,price,score&wt=csv
I get a response as a web page output as shown below:
I am able to copy, paste, save and retrieve the data in csv format from the csv output displayed in html file (webpage solr output) manually. However, I couldn't automate the process using R. I am neither able to parse and get the entire content by unlist
-ing the output in the correct order nor extract and save the content displayed in the web page as a csv file. I tried solrium
and httr
packages in vain. I also tried to Export-Csv
option in Windows PowerShell to save the data in a csv file format without any luck.
Requesting suggestions to overcome the issue.
Upvotes: 1
Views: 780
Reputation: 872
For the benefit of novices like me I am sharing the solution.
The PowerShell approach:
Step 1: I used the following PowerShell code saved as .ps1 file.
curl 'https://some_site.com/solr/some_folder/select?q=*:*&wt=csv' -L -u username:password --location-trusted -b cookie-jar.txt > OutputFilename.csv
Step 2: I used the following R script to run the PowerShell file and read the csv file generated by PowerShell as a data frame for further processing.
system2("PowerShell", args=c("-file", "C:\\FolderName\\FileName.ps1"))
df <- as.data.frame(read.csv("OutputFilename.csv", header = TRUE, sep = ",", fill = TRUE, fileEncoding="UTF-16LE"))
Using the solrium
package by Scott Chamberlain approach:
library(solrium)
solr_connect("https://site.organization.com", errors = "complete")
solr_search(name = "studies", q = "metadata:value", rows=1000, wt = "csv", callopts = httr::authenticate("usrname", "pwd"))
Upvotes: 0
Reputation: 5903
This doesn't address anything specific about powershell, but as a start using solrium
, does this work, and if not, how does it fail:
(to be fully reproducible), setup Solr
cd solr-6.6.0
bin/solr start -e cloud -noprompt
bin/solr create -c techproducts -d sample_techproducts_configs
bin/post -c techproducts example/exampledocs/*.xml
In R:
solr_connect()
(df <- solr_search(
name = "techproducts",
q = "ipod",
fl = c("id", "cat", "name", "popularity", "price", "score"),
wt = "csv"))
#> # A tibble: 3 x 6
#> id cat name popularity price score
#> * <chr> <chr> <chr> <int> <dbl> <dbl>
#> 1 IW-02 electronics,connector iPod & iPod Mini USB 2.0 Cable 1 11.50 3.2388113
#> 2 F8V7067-APL-KIT electronics,connector Belkin Mobile Power Cord for iPod w/ Dock 1 19.95 2.3162508
#> 3 MA147LL/A electronics,music Apple 60 GB iPod with Video Playback Black 10 399.00 0.9044058
write.csv(df, "myfile.csv", row.names = FALSE)
readLines("myfile.csv")
#> [1] "\"id\",\"cat\",\"name\",\"popularity\",\"price\",\"score\""
#> [2] "\"IW-02\",\"electronics,connector\",\"iPod & iPod Mini USB 2.0 Cable\",1,11.5,3.2388113"
#> [3] "\"F8V7067-APL-KIT\",\"electronics,connector\",\"Belkin Mobile Power Cord for iPod w/ Dock\",1,19.95,2.3162508"
#> [4] "\"MA147LL/A\",\"electronics,music\",\"Apple 60 GB iPod with Video Playback Black\",10,399,0.9044058"
Upvotes: 1
Reputation: 23395
Per BenH's comment, its already in CSV format so no need to use Export-CSV
. Just use the -OutFile
parameter on Invoke-WebRequest
:
Invoke-RestMethod "http://localhost:8983/solr/techproducts/select?q=ipod&fl=id,cat,name,popularity,price,score&wt=csv" -OutFile YourFile.csv
Upvotes: 2