Reputation: 833
I have an issue when i try to convert my DStream[String] into Dataframes.
My goal is to transform the twitter stream[rdd] into dataframes, but with my code (below) the transformation doesn't work, at the end i receive i dataframe with only one word.
For example :hi every body
my dataframe will contain only the words "hi"
here the piece of the code
val splited_test=texts.transform(rdd => rdd.map(x=> Row.fromSeq(x.split(" "))))
splited_test.foreachRDD { rdd =>{
val fields = new Array[StructField](1)
fields(0)=(DataTypes.createStructField("text", StringType, true))
val schema = DataTypes.createStructType(fields)
val df= sqlContext.createDataFrame(rdd, schema)
}}
Upvotes: 0
Views: 581
Reputation: 910
Only the first word is stored because you used x.split (" ").
You created one field.
Modify the code as follows.
val splited_test=texts.transform(rdd => rdd.map(x=> Row.fromSeq(Seq(x))))
Upvotes: 1