codeBarer

Reputation: 2398

Using PySpark to read a JSON file directly from a website

Is it possible to use sqlContext to read a JSON file directly from a website? For instance, I can read a local file like this:

myRDD = sqlContext.read.json("sample.json")

but I get an error when I try something like this:

myRDD = sqlContext.read.json("http://192.168.0.13:9200/sample.json")

I'm using Spark 1.4.1. Thanks in advance!

Upvotes: 2

Views: 9893

Answers (1)

zero323

Reputation: 330413

It is not possible. The paths you pass should point to either the local file system or another file system supported by Hadoop. As long as sample.json has the expected format (a single JSON object per line), you can try something like this:

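import json
import requests

# Fetch the file over HTTP on the driver, since the reader does not accept http:// paths
r = requests.get("http://192.168.0.13:9200/sample.json")
r.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
# Parse one JSON object per line, skipping blank lines
df = sqlContext.createDataFrame([json.loads(line) for line in r.iter_lines() if line])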
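If you want to check the parsing step on its own before involving Spark, here is a minimal sketch. It assumes the response body is one JSON object per line; parse_json_lines and the sample data are hypothetical names made up for illustration, not part of Spark or requests.

```python
import json


def parse_json_lines(lines):
    """Parse an iterable of JSON Lines (bytes or str) into a list of dicts.

    Hypothetical helper for illustration; blank lines are skipped.
    """
    records = []
    for line in lines:
        # requests' iter_lines() yields bytes by default, so decode if needed
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


# Made-up stand-in for the lines that r.iter_lines() would yield
sample = [b'{"id": 1, "name": "a"}', b"", b'{"id": 2, "name": "b"}']
records = parse_json_lines(sample)

# With a live SQLContext, the parsed records could then be handed to Spark:
# df = sqlContext.createDataFrame(records)
```

Keep in mind that everything is downloaded and parsed on the driver here, so this probably only makes sense for files that fit comfortably in driver memory.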

Upvotes: 7
