Reputation: 1712
In the PySpark docs, I see many examples working on sample DataFrames like df4 here.
Where are they defined? I'd like to see them in full to better understand the docs.
Upvotes: 2
Views: 26
Reputation: 31460
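They are defined in the _test() function at the bottom of pyspark/sql/group.py, the module that contains the GroupedData class. It is a module-level function, not a method of the class. It builds the sample DataFrames and passes them to doctest as globals, which is why the docstring examples can use df4 without defining it: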
from pyspark.sql import Row, SparkSession

# `sc` is predefined in the pyspark shell; a standalone script must create it.
sc = SparkSession.builder.getOrCreate().sparkContext
df4 = sc.parallelize([Row(course="dotNET", year=2012, earnings=10000),
                      Row(course="Java", year=2012, earnings=20000),
                      Row(course="dotNET", year=2012, earnings=5000),
                      Row(course="dotNET", year=2013, earnings=48000),
                      Row(course="Java", year=2013, earnings=30000)]).toDF()
Upvotes: 2