Reputation: 709
I was given assignment to run some code and show results using the Apache Spark using Python Language, I installed the Apache Spark server using the following steps: https://phoenixnap.com/kb/install-spark-on-windows-10. I tried my code and everything was fine. Now I am assigned another assignment, it needs MLlib linear regression and they provide us with some code that should be running then we will add additional code for it. When I try to run the code I have some errors and warnings, part of them appeared in the previous assignment but it still working. I believe the issue is that there are additiona things related to Mlib Library should be added so the code will run correctly. Anybody has any idea what files should be added to the spark so it runs the code related to MLib? I am using Windows 10, and spark-3.0.1-bin-hadoop2.7
This is my code :
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.ml.regression import LinearRegression
from pyspark.ml.feature import StandardScaler
conf = SparkConf().setMaster("local").setAppName("LinearRegression")
sc = SparkContext(conf = conf)
sqlContext = SQLContext(sc)
# Load training data
df = sqlContext.read.format("libsvm").option("numFeatures", 13).load("boston_housing.txt")
# Data needs to be scaled for better results and interpretation
# Initialize the `standardScaler`
standardScaler = StandardScaler(inputCol="features", outputCol="features_scaled")
# Fit the DataFrame to the scaler
scaler = standardScaler.fit(df)
# Transform the data in `df` with the scaler
scaled_df = scaler.transform(df)
# Initialize the linear regression model
lr = LinearRegression(labelCol="label", maxIter=10, regParam=0.3, elasticNetParam=0.8)
# Fit the data to the model
linearModel = lr.fit(scaled_df)
# Print the coefficients for the model
print("Coefficients: %s" % str(linearModel.coefficients))
print("Intercept: %s" % str(linearModel.intercept))
here is the screenshot for what I have when I run the code:
Upvotes: 0
Views: 567
Reputation: 42342
Try to do pip install numpy
(or pip3 install numpy
if that fails). The traceback says numpy module is not found.
Upvotes: 1