orestis

Reputation: 972

Getting null attribute in org.apache.spark.graphx.Edge initialization

I am using Spark with Scala to parse a JSON file containing Wikidata items, combine it with some extra information, and write a new JSON file. In doing so, I create a set of WikidataItem objects, each of which contains a set of edges to other Wikidata items. The edges are instances of org.apache.spark.graphx.Edge. This class has three (var) attributes: srcId, dstId, and attr.

My problem is the following: whenever I call the constructor of Edge, e.g. new Edge(srcId=1, dstId=2, attr=3), the attr field is actually null. Instead, a new field named attr$mcl$sp is created, and it holds the value of attr. The value is still accessible by calling Edge.attr, but when I serialize my WikidataItems, each edge in the JSON file has 4 fields, namely srcId, dstId, attr, and attr$mcl$sp, where attr=null. Any idea why this happens and how to deal with it?

To study this, I wrote a simple test that just creates a new Edge, and stepped through it with a debugger. The problem persists in this simple case as well.

The code that reproduces the problem is shown below. I stress, though, that the underlying issue is that when an Edge is created, its internal attr field remains null. This is easy to see by running just the foo method below under a debugger.

import org.apache.spark.graphx.Edge
import java.io.StringWriter
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

// Serialize any object to a JSON string using Jackson with the Scala module.
def toJson(obj: Any): String = {
  val mapper = new ObjectMapper()
  mapper.registerModule(DefaultScalaModule)

  val out = new StringWriter
  mapper.writeValue(out, obj)
  out.toString
}

// Minimal reproduction: attr is inferred as Int here.
def foo(): Unit = {
  val edge = new Edge(1, 2, 3)
  println(toJson(edge))
}

Upvotes: 0

Views: 317

Answers (1)

Daniel de Paula

Reputation: 17872

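Apparently it happens only with Scala's primitive types. The likely cause is that the attribute type parameter of Edge is annotated with @specialized, so for a primitive attr the compiler generates a specialized subclass that stores the value in a separate field with a mangled name and leaves the generic attr field null. As a workaround, you can use Java's Integer, which works quite well with Scala: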

scala> val edge = Edge[java.lang.Integer](1, 2, 3)
scala> println(toJson(edge))
{"srcId":1,"dstId":2,"attr":3}
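To see the effect without Spark or Jackson, here is a minimal sketch that uses plain reflection. DemoEdge is a hypothetical stand-in that I made up for illustration, not the real GraphX class, and SpecializationDemo and fieldNames are likewise just illustrative names. It assumes Scala 2, where @specialized generates extra subclasses for the listed primitive types (Scala 3 ignores the annotation, so both edges would look the same there).

```scala
// Hypothetical stand-in for org.apache.spark.graphx.Edge (an assumption for
// illustration only): a class whose attribute type is specialized.
class DemoEdge[@specialized(Int, Long, Double) ED](
    var srcId: Long,
    var dstId: Long,
    var attr: ED)

object SpecializationDemo {
  // Names of all fields declared on a class and on each of its superclasses.
  def fieldNames(cls: Class[_]): List[String] =
    if (cls == null) Nil
    else cls.getDeclaredFields.map(_.getName).toList ::: fieldNames(cls.getSuperclass)

  def main(args: Array[String]): Unit = {
    // ED is inferred as the primitive Int, so a specialized variant may be used.
    val primitive = new DemoEdge(1L, 2L, 3)
    // ED is a reference type, so the generic class is used directly.
    val boxed = new DemoEdge[java.lang.Integer](1L, 2L, 3)

    // Print the runtime class and every field a reflection-based
    // serializer could discover on each instance.
    println(primitive.getClass.getName + " -> " + fieldNames(primitive.getClass))
    println(boxed.getClass.getName + " -> " + fieldNames(boxed.getClass))

    // The accessor returns the right value in both cases.
    println(primitive.attr)
    println(boxed.attr)
  }
}
```

If the primitive edge lists an extra mangled field next to attr while the boxed one does not, that would be consistent with the extra field a reflection-based serializer emits for the real Edge in the question.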

Upvotes: 1
