14 June 2016

intro

  1. zeromq.org

  2. distributed messaging

    1. any language on any platform

    2. pub-sub, push-pull, router-dealer pattern

    3. high-speed aio engines in a tiny library

    4. any architecture centralized, distributed, small, large

    5. brokerless does not store messages on disk

internal messaging within storm worker processes

  1. from understanding storm internal message buffers

    1. intra-worker (inter-thread on same storm node) lmax disruptor

    2. inter-worker (node-to-node across the network) zeromq or netty

    3. inter-topology nothing built into storm

       # you must take care of this yourself
       # e.g. kafka/rabbitmq/database etc.
      
  2. illustration

         +---------------------------------------------------------------------+
         |                             worker process                          |
         |                +-----------------------------------+                |
         |                |          executor m of n          |                |
         |                |   +---------+       +---------+   |                |
         |                |   | user    |       |         |   |                |
         | +------------+ |   | logic   |→→→    | send    |→→→→ +------------+ |
         | |  worker    | |   | thread  |   ↓   | thread  |   |↓|  worker    | |
         | |  receive   | |   +---------+   ↓   +---------+   |↓|  send      | |
         | |  thread    | |        ↑        ↓        ↑        |↓|  thread    | |
         | | __________ | |   ___________   ↓   ___________   |↓| __________ | |
         | |  RX QUEUE →→→→→→  INC QUEUE     →→  OUT QUEUE    | →  TX QUEUE  | |
         | | __________ |↓|   ___________       ___________   |↑| __________ | |
         | +------↑-----+↓+-----------------------------------+↑+------↓-----+ |
         |        ↑       →→→ to other executors from other →→→        ↓       |
         +--------↑----------------------------------------------------↓-------+
                  ↑ topology.receiver.buffer.size(8)                   ↓        
                  ↑                topology.transfer.buffer.size(1024) ↓        
         ---------□--------------------network-layer-------------------□--------
         incoming tcp port(6700/tcp)                   outgoing tcp port(random)
    
    1. incoming tcp port(6700/tcp)

       one of the ports defined by `supervisor.slots.ports`
      
    2. outgoing tcp port(random)

    3. worker receive thread

       each worker has a single receive thread
       that listens on the `worker port`
       it puts incoming msgs from the network
       on the executors' `incoming queues`
      
    4. inc queue

       each element of the disruptor receive queue is a list of tuples
       here tuples are appended in batch
      
       topology.executor.receive.buffer.size(1024)
      
    5. send thread

       takes msgs from its `outgoing queue`
       and puts them on the shared `transfer queue`
      
    6. out queue

       each element of this disruptor `send queue` contains a `single` tuple
      
       topology.executor.send.buffer.size(1024)
      
    7. worker send thread

       reads from the disruptor `transfer queue`
       to send tuples over the network
       queue is a list of tuple
      


blog comments powered by Disqus