Chengzhi
Chengzhi

Reputation: 2591

Flink Scala ClassNotFoundException: org.apache.flink.api.common.typeinfo.TypeInformation

I am new to Flink and I was following the SocketWindowWordCount example.

I am using Scala 2.11.8 and Flink 1.3.2 and try to run it on EMR, when I run the following code, it threw errors:

Caused by: java.lang.ClassNotFoundException: org.apache.flink.api.common.typeinfo.TypeInformation

The main class looks like this:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

object FlinkStreamingPOC {

  def main(args: Array[String]) : Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream = env.readTextFile("s3a://somebucket/prefix")
    val counts = stream.flatMap{ _.split("\\W+") }
      .map { (_, 1) }
      .keyBy(0)
      .timeWindow(Time.seconds(10))
      .sum(1)

    counts.print

    env.execute("Window Stream WordCount")
  }
}

build.sbt looks like this:

scalaVersion := "2.11.8"

val flinkVersion = "1.3.2"

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % flinkVersion,
  "org.apache.flink" %% "flink-streaming-scala" % flinkVersion
)

I tried to import org.apache.flink.api.scala._ and org.apache.flink.streaming.api.scala._ but still got the same error message. Please suggest, thanks!

Upvotes: 5

Views: 5114

Answers (5)

henry zhu
henry zhu

Reputation: 661

The fat jar you built from a Flink project is supposed to run inside flink cluster environment, thus all Flink related dependencies would be provided by the environment.

Other answers suggest to simply comment out provided scope from the dependencies, thus include these dependencies into the fat jar. This may work, but not correct.

If you are running the jar with command like java --classpath target/your-project-jar.jar your.package.SocketWindowWordCount, then you are outside the Flink cluster environment.

The correct way is to use flink run ... command like ./bin/flink run -c your.package.SocketWindowWordCount target/your-project-jar.jar.

Try ./bin/flink run --help for details.

Upvotes: 0

YQ.Wang
YQ.Wang

Reputation: 1177

I encountered the same problem with class AverageSensorReadings in project: https://github.com/streaming-with-flink/examples-scala. It's a maven project, so I commented out all <scope>provided</scope> for every dependency in the pom file and it works.

Upvotes: 0

Zentopia
Zentopia

Reputation: 795

If you use IDEA, you can include dependencies with "Provided" scope.

enter image description here

Upvotes: 11

igx
igx

Reputation: 4231

Open the build.sbt file and remove the provided from the dependencies

Upvotes: 1

Amarjit Dhillon
Amarjit Dhillon

Reputation: 2816

You might be having the same issue as I was having which basically involves adding the jar to /lib folder, please see here for more details. In case of Amazon EMR you are using flink Dashboard . As you can see /opt have all the required jars which you need to copy in lib folder

enter image description here

Upvotes: 0

Related Questions