user1898248

Reputation: 99

leftjoin on two GlobalKTables

I am trying to join a stream to two different GlobalKTables, treating them as lookup tables: devices (looked up by user agent) and geocoding (looked up by IP address).

The problem is with serialization, but I don't understand why. It seems to get stuck on DEFAULT_VALUE_SERDE_CLASS_CONFIG, even though the serde for the topic I want to write to is specified correctly.

//
// Set up serialization / de-serialization
private static Serde<String> stringSerde = Serdes.String();
private static Serde<PodcastData> podcastSerde = StreamsSerdes.PodCastSerde();
private static Serde<GeoCodedData> geocodedSerde = StreamsSerdes.GeoIPSerde();
private static Serde<DeviceData> deviceSerde = StreamsSerdes.DeviceSerde();
private static Serde<JoinedPodcastGeoDeviceData> podcastGeoDeviceSerde = StreamsSerdes.PodcastGeoDeviceSerde();
private static Serde<JoinedPodCastDeviceData> podcastDeviceSerde = StreamsSerdes.PodcastDeviceDataSerde();

...

GlobalKTable<String, DeviceData> deviceIDTable = builder.globalTable(kafkaProperties.getProperty("deviceid-topic"));
GlobalKTable<String, GeoCodedData> geoIPTable = builder.globalTable(kafkaProperties.getProperty("geoip-topic"));

//
// Stream from source topic
KStream<String, PodcastData> podcastStream = builder.stream(
                kafkaProperties.getProperty("source-topic"),
                Consumed.with(stringSerde, podcastSerde));

//
podcastStream

// left join the podcast stream to the device table, looking up the device
.leftJoin(deviceIDTable,
    // get a DeviceData object from the user agent
    (podcastID, podcastData) -> podcastData.getUser_agent(),

    // join podcast and device and return a JoinedPodCastDeviceData object
    (podcastData, deviceData) -> {

        JoinedPodCastDeviceData data =
            JoinedPodCastDeviceData.builder().build();
        data.setPodcastObject(podcastData);
        data.setDeviceData(deviceData);

        return data;
})

// left join the podcast stream to the geo table, looking up the geo data
.leftJoin(geoIPTable,
    // get a Geo object from the ip address
    (podcastID, podcastDeviceData) -> podcastDeviceData.getPodcastObject().getIp_address(),

    // join podcast and geo 
    (podcastDeviceData, geoCodedData) -> {

        JoinedPodcastGeoDeviceData data =
            JoinedPodcastGeoDeviceData.builder().build();
        data.setGeoData(geoCodedData);
        data.setDeviceData(podcastDeviceData.getDeviceData());
        data.setPodcastData(podcastDeviceData.getPodcastObject());

        return data;
})

//
.to(kafkaProperties.getProperty("sink-topic"),
                        Produced.with(stringSerde, podcastGeoDeviceSerde));

...
...

streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());


The error I get is:

ERROR java.lang.String cannot be cast to DeviceData

Upvotes: 0

Views: 79

Answers (1)

Nishu Tayal

Reputation: 20850

streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, stringSerde.getClass().getName());

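Because of the setting above, the application uses the String serde as the default value serde unless you explicitly specify one when creating a KStream, KTable, or GlobalKTable.

Both GlobalKTables in the question are created without serdes, so their values are deserialized as String. Since the expected value type for deviceIDTable is DeviceData, define the value serde on the GlobalKTable as shown below; geoIPTable needs the same change with geocodedSerde: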

GlobalKTable<String, DeviceData> deviceIDTable = builder.globalTable(
        kafkaProperties.getProperty("deviceid-topic"),
        Materialized.<String, DeviceData, KeyValueStore<Bytes, byte[]>>as(DEVICE_STORE)
                .withKeySerde(stringSerde)
                .withValueSerde(deviceSerde));
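
To see why the failure surfaces as a ClassCastException during the join rather than when the table is declared, here is a minimal sketch in plain Java. It is not Kafka Streams code: loadTable, describe, and this DeviceData class are hypothetical stand-ins, and the deserializer function plays the role of the value serde. The point it tries to illustrate is that the declared value type is erased at runtime, so a table declared as holding DeviceData can silently be filled with String objects, and the cast only fails when the value is first used as a DeviceData.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DefaultSerdeCastDemo {

    // Hypothetical stand-in for the DeviceData class from the question.
    public static final class DeviceData {
        public final String name;
        public DeviceData(String name) { this.name = name; }
    }

    // Fills a lookup table by running each raw value through the deserializer.
    // Because generics are erased, the cast to V is unchecked: nothing here
    // verifies that the deserializer really produces the declared value type.
    @SuppressWarnings("unchecked")
    public static <V> Map<String, V> loadTable(Map<String, byte[]> raw,
                                               Function<byte[], ?> deserializer) {
        Map<String, V> table = new HashMap<>();
        for (Map.Entry<String, byte[]> entry : raw.entrySet()) {
            table.put(entry.getKey(), (V) deserializer.apply(entry.getValue()));
        }
        return table;
    }

    // Declares the table as Map<String, DeviceData> and looks up one value,
    // roughly what the value joiner does with the table side of the join.
    public static String describe(Function<byte[], ?> deserializer) {
        Map<String, byte[]> raw = new HashMap<>();
        raw.put("agent-1", "phone".getBytes(StandardCharsets.UTF_8));

        Map<String, DeviceData> table = loadTable(raw, deserializer);
        try {
            // The compiler inserts the cast to DeviceData on this line.
            DeviceData device = table.get("agent-1");
            return "ok:" + device.name;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // A String deserializer, like the default String value serde.
        System.out.println(describe(
                bytes -> new String(bytes, StandardCharsets.UTF_8)));
        // A deserializer that matches the declared type, like deviceSerde.
        System.out.println(describe(
                bytes -> new DeviceData(new String(bytes, StandardCharsets.UTF_8))));
    }
}
```

The first call prints ClassCastException and the second prints ok:phone. In the real topology the equivalent of the matching deserializer is the explicit value serde passed to builder.globalTable, which is why the sink serde in Produced.with does not help: the failure happens earlier, when the table value is read.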

Upvotes: 1
