Reputation: 7374
Since MLlib uses Breeze under the hood, is there a way to use MLlib directly with Breeze data structures, so that I don't need the whole Spark ecosystem and can use MLlib purely locally?
Upvotes: 4
Views: 385
Reputation: 2178
Totally agree with @Eliasah.
You can even run MLlib from within your IDE project. I have a Gradle project that runs MLlib with these dependencies:
dependencies {
    // Scala 2.11 to match the _2.11 Spark artifacts below
    implementation 'org.scala-lang:scala-library:2.11.12'
    // 'implementation' dependencies are also on the runtime classpath,
    // so separate 'runtime' entries for the same artifacts are not needed
    implementation group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.4'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.4'
    implementation group: 'org.apache.spark', name: 'spark-mllib_2.11', version: '2.4.4'
}
Upvotes: 0
Reputation: 40370
You can't do that. spark-mllib cannot be used without spark-core, even though the spark-mllib dependency can be pulled in on its own.
Nevertheless, if you want to run MLlib algorithms on a single machine, you can run Spark locally, for example in local mode with the master set to local[*]. There is no need for a real cluster in this case, but the solution obviously won't scale beyond that one machine.
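For illustration, here is a minimal sketch of what running MLlib on one machine could look like. It assumes Spark 2.4.x on Scala 2.11 with spark-mllib on the classpath, and that Breeze is available as a transitive dependency of spark-mllib. The object name LocalMLlibSketch, the helper fitCenters, and the sample points are made up for this example. The Breeze vectors are copied into MLlib vectors through their arrays, since MLlib's own Breeze conversions are not part of its public API.

```scala
import breeze.linalg.{DenseVector => BDV}
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

// Hypothetical example object: runs KMeans in local mode, no cluster required.
object LocalMLlibSketch {

  // Fits KMeans on Breeze vectors and returns the cluster centers.
  def fitCenters(points: Seq[BDV[Double]], k: Int): Array[Vector] = {
    // local[*] runs Spark inside this JVM, using all available cores.
    val spark = SparkSession.builder()
      .appName("local-mllib-sketch")
      .master("local[*]")
      .getOrCreate()
    try {
      // Copy each Breeze vector into an MLlib vector via its backing array.
      val rows = points.map(v => Tuple1(Vectors.dense(v.toArray)))
      val df = spark.createDataFrame(rows).toDF("features")

      // A fixed seed keeps the run repeatable.
      val model = new KMeans().setK(k).setSeed(1L).fit(df)
      model.clusterCenters
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data: two visibly separated groups of points.
    val points = Seq(BDV(0.0, 0.1), BDV(0.2, 0.0), BDV(9.0, 9.1), BDV(9.2, 8.9))
    fitCenters(points, k = 2).foreach(println)
  }
}
```

This still pulls in spark-core and spark-sql. It only avoids installing or managing a cluster, which is the trade-off described above.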
Upvotes: 5