danissimo

Reputation: 442

Hazelcast 3.12: Multicast autodiscovery on localhost stopped working

Yesterday, I started two embedded Hazelcast nodes many times, and each time the second node came up, it joined the first node, forming a cluster. However, starting this morning, they no longer join the cluster. Here’s the minimal code to start a node:

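import com.hazelcast.config.Config
import com.hazelcast.core.Hazelcast

object BrokenHazelcastAutodiscovery extends App {
  val cfg = new Config() // empty config, i.e. all defaults
  Hazelcast.newHazelcastInstance(cfg)
}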

I ran it on a MacBook. I’m not sure whether I left the nodes running overnight. Suppose I did, and they had been pinging the corporate network with multicast discovery packets from time to time. “What if the network admins somehow disabled multicast on my laptop?” I thought. Anyway, here is a piece of code that checks whether I can send and receive multicast packets on localhost:

object WhetherMulticastEnabledCheck {
  import java.net._

  def main(args: Array[String]): Unit = {
    val port = 33766
    val addr = new InetSocketAddress(InetAddress.getByName("225.6.7.8"), port)
    val netif = NetworkInterface.getByInetAddress(InetAddress.getLocalHost)
    val sok = new MulticastSocket(addr.getPort)
    sok.joinGroup(addr, netif)
    sok.setTimeToLive(0)

    val anotherStarted =
      try { new ServerSocket(port); false }
      catch { case _: BindException => true }
    val seq = anotherStarted match {
      case true  => new Sequence(0xFFFF0000)
      case false => new Sequence(0x0000FFFF)
    }
    new Receiver(sok).start()
    new Sender(sok, addr, seq).start()
  }

  private class Sequence(mask: Int) {
    private var i = 0
    def next: Int = {
      i += 1
      if (i > 0x0FFF) i = 1
      ((i << 16) | i) & mask
    }
  }

  private class Sender(sok: DatagramSocket, to: SocketAddress, seq: Sequence) {
    private val t = new Thread(() => {
      val buf = new Array[Byte](4)
      val p = new DatagramPacket(buf, buf.length, to)
      println("sending started...")
      while (true) {
        val t = seq.next
        buf(0) = (t >> 24).toByte
        buf(1) = (t >> 16).toByte
        buf(2) = (t >>  8).toByte
        buf(3) = (t >>  0).toByte
        sok.send(p)
        println(s"sent     ${"%08X".format(t).replace('0', '.')}")
        Thread.sleep(2500)
      }
    }, "sending-thread")
    def start(): Unit = t.start()
  }

  private class Receiver(sok: DatagramSocket) {
    private val t = new Thread(() => {
      val buf = new Array[Byte](4)
      val p = new DatagramPacket(buf, buf.length)
      println("receiving started...")
      while (true) {
        sok.receive(p)
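        // Mask each byte: Byte is signed, so values >= 0x80 would sign-extend.
        val t =
          ((buf(0) & 0xFF) << 24) |
          ((buf(1) & 0xFF) << 16) |
          ((buf(2) & 0xFF) <<  8) |
          ((buf(3) & 0xFF) <<  0)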
        println(s"received ${"%08X".format(t).replace('0', '.')}")
      }
    }, "receiving-thread")
    def start(): Unit = t.start()
  }
}

Here is the output:

 proc 1               | proc 2
----------------------|----------------------
 receiving started... |     
 sending started...   | sending started...  
 sent     .......1    | receiving started...     
 received .......1    | received ...1....     
 received ...1....    | sent     ...1....     
 sent     .......2    | received .......2     
 received .......2    | sent     ...2....     
 received ...2....    | received ...2....     
 sent     .......3    | received .......3     
...

I manually arranged the outputs side by side and shifted the second process's output down by one line to visually highlight the delay in its starting moment. This output indicates that multicast is functioning properly on my laptop.
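One caveat: the check only exercises the interface that `InetAddress.getLocalHost` resolves to, and Hazelcast may pick a different one. Below is a minimal sketch, using only the JDK, that lists every interface with its up, loopback and multicast flags so the two can be compared. The object and method names are made up for illustration.

```scala
import java.net.{InetAddress, NetworkInterface}

object ListMulticastInterfaces {
  // Copies a java.util.Enumeration into a Scala List (avoids converter imports).
  def toList[A](e: java.util.Enumeration[A]): List[A] = {
    val b = List.newBuilder[A]
    while (e.hasMoreElements) b += e.nextElement()
    b.result()
  }

  // One line per interface: name, flags, and the addresses bound to it.
  def describe(ni: NetworkInterface): String = {
    val addrs = toList(ni.getInetAddresses).map(_.getHostAddress).mkString(", ")
    s"${ni.getName}: up=${ni.isUp} loopback=${ni.isLoopback} " +
      s"multicast=${ni.supportsMulticast} addrs=[$addrs]"
  }

  def main(args: Array[String]): Unit = {
    toList(NetworkInterface.getNetworkInterfaces).foreach(ni => println(describe(ni)))
    // The interface the multicast check joins the group on (null if none matches).
    val local = NetworkInterface.getByInetAddress(InetAddress.getLocalHost)
    println(s"getLocalHost interface: ${Option(local).map(_.getName).getOrElse("none")}")
  }
}
```

If the interface reported on the last line is not the one Hazelcast logs at startup, the check and the nodes are not testing the same path.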

So, could you guys suggest what might have gone wrong and caused the nodes to stop discovering each other and joining into a cluster?
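One thing I have not ruled out is that the nodes pick a different interface or group than my check does. Here is a sketch of the same node with the multicast join spelled out and the interface pinned. It assumes Hazelcast 3.12 on the classpath. The group 224.2.2.3 and port 54327 are what I believe the defaults to be, and 127.0.0.1 is only a placeholder for whichever interface should be tested.

```scala
import com.hazelcast.config.Config
import com.hazelcast.core.Hazelcast

object ExplicitMulticastJoin extends App {
  val cfg = new Config()
  val network = cfg.getNetworkConfig

  // Spell out the multicast join instead of relying on defaults.
  // Assumption: 224.2.2.3 / 54327 mirror the 3.12 defaults.
  network.getJoin.getMulticastConfig
    .setEnabled(true)
    .setMulticastGroup("224.2.2.3")
    .setMulticastPort(54327)
  network.getJoin.getTcpIpConfig.setEnabled(false)

  // Pin the node to one interface so both nodes bind to the same one.
  // Assumption: 127.0.0.1 is a placeholder; use the interface under test.
  network.getInterfaces.setEnabled(true).addInterface("127.0.0.1")

  Hazelcast.newHazelcastInstance(cfg)
}
```

If two instances started this way join while the default-config ones do not, that would point at interface selection instead of multicast itself.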

UPD: I found the problem. After I rebooted my laptop a second time, BrokenHazelcastAutodiscovery worked again. 🤷🏼 However, once I started Prometheus again, the nodes failed to join, and they kept failing to join the cluster even after I stopped Prometheus.

Upvotes: 0

Views: 95

Answers (0)
