krhlk

Reputation: 1594

Reading data from URL

Is there a reasonably easy way to get data from a URL? I tried the most obvious version, but it does not work:

readcsv("https://dl.dropboxusercontent.com/u/.../testdata.csv")

I did not find any usable reference. Any help?

Upvotes: 21

Views: 9410

Answers (6)

Aku

Reputation: 802

A very easy solution, similar to mike gold's post, though as of 2023 you need to specify a sink argument:

using CSV, DataFrames

my_table = CSV.read(download(some_url), DataFrame)

Upvotes: 4

Andrej Oskin

Reputation: 2332

Nowadays you can also use UrlDownload.jl, which is pure Julia, takes care of the download details, processes data in memory, and can also work with compressed files.

Usage is straightforward:

using UrlDownload

A = urldownload("https://data.ok.gov/sites/default/files/unspsc%20codes_3.csv")

Upvotes: 2

mike gold

Reputation: 1611

If you are looking to read into a dataframe, this will also work in Julia:

using CSV

dataset = CSV.read(download("https://mywebsite.edu/ml/machine-learning-databases/my.data"))

Upvotes: 17

rickhg12hs

Reputation: 11912

If you want to read a CSV from a URL, you can use the Requests package as @waTeim shows and then read the data through an IOBuffer. See example below.

Or, as @Colin T Bowers comments, you could use the currently (December 2017) more actively maintained HTTP.jl package like this:

julia> using HTTP

julia> res = HTTP.get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");

julia> mycsv = readcsv(res.body);

julia> for (colnum, myheader) in enumerate(mycsv[1,:])
           println(colnum, '\t', myheader)
       end
1   transaction_unique_identifier
2   seller_company_name
3   customer_company_name
4   customer_duns_number
5   tariff_reference
6   contract_service_agreement
7   trans_id
8   transaction_begin_date
9   transaction_end_date
10  time_zone
11  point_of_delivery_control_area
12  specific location
13  class_name
14  term_name
15  increment_name
16  increment_peaking_name
17  product_name
18  transaction_quantity
19  price
20  units
21  total_transmission_charge
22  transaction_charge

Using the Requests.jl package:

julia> using Requests

julia> res = get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");

julia> mycsv = readcsv(IOBuffer(res.data));

julia> for (colnum, myheader) in enumerate(mycsv[1,:])
           println(colnum, '\t', myheader)
       end
1   transaction_unique_identifier
2   seller_company_name
3   customer_company_name
4   customer_duns_number
5   tariff_reference
6   contract_service_agreement
7   trans_id
8   transaction_begin_date
9   transaction_end_date
10  time_zone
11  point_of_delivery_control_area
12  specific location
13  class_name
14  term_name
15  increment_name
16  increment_peaking_name
17  product_name
18  transaction_quantity
19  price
20  units
21  total_transmission_charge
22  transaction_charge
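A note for newer Julia versions: `readcsv` was removed in Julia 1.0, so the calls above fail there. Below is a minimal sketch of the same idea using `readdlm` from the DelimitedFiles standard library. The inline sample bytes are a made-up stand-in for `res.body` (they are not the FERC data), which keeps the sketch runnable without a network connection.

```julia
using DelimitedFiles  # provides readdlm on Julia 1.x

# Hypothetical stand-in for `res.body` from HTTP.get: the raw bytes of a
# small CSV document (assumed sample data, not the FERC file).
body = Vector{UInt8}("id,name,price\n1,widget,2.5\n2,gadget,4.0\n")

# Wrap the bytes in an IOBuffer so readdlm can read them like a stream,
# and pass ',' as the delimiter (what readcsv used to do implicitly).
mycsv = readdlm(IOBuffer(body), ',')

# Same header loop as in the sessions above.
for (colnum, myheader) in enumerate(mycsv[1, :])
    println(colnum, '\t', myheader)
end
```

With a real response, `IOBuffer(res.body)` would take the place of `IOBuffer(body)`.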

Upvotes: 24

Simon Danisch

Reputation: 594

If the URL points directly to a CSV file, something like this should work:

A = readdlm(download(url),';')
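On Julia 1.0 and later, `readdlm` lives in the DelimitedFiles standard library, so it needs a `using DelimitedFiles` first. Here is a rough sketch, assuming a semicolon-delimited file with a header row. A hand-written temporary file stands in for the path that `download(url)` returns, so the file contents are hypothetical.

```julia
using DelimitedFiles

# `download(url)` returns the path of a local temporary file. To keep this
# sketch runnable offline, a temp file written by hand stands in for it
# (the contents are assumed sample data).
path, io = mktemp()
write(io, "a;b;c\n1;2;3\n4;5;6\n")
close(io)

# header=true returns the first row separately from the data cells.
data, header = readdlm(path, ';', header=true)

# Remove the temporary file once the data is in memory.
rm(path)

println(vec(header))
println(size(data))
```

For a real file, `path = download(url)` would replace the `mktemp` lines.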

Upvotes: 3

waTeim

Reputation: 9225

The Requests package seems to work pretty well. There are others (see the entire package list) but Requests is actively maintained.

Obtaining it

julia> Pkg.add("Requests")

julia> using Requests

Using it

You can use one of the exported functions that correspond to the various HTTP verbs (get, post, etc.), each of which returns a Response:

julia> res = get("http://julialang.org")
Response(200 OK, 21 Headers, 20913 Bytes in Body)

julia> typeof(res)
Response (constructor with 8 methods)

And then, for example, you can print the data using @printf:

julia> @printf("%s",res.data);
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-us" lang="en-us">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
...

Upvotes: 10
