Distributed Message Queue
Topic: Distributed Message Queue
Interviewer: longwei
Interviewee: chenbin
Level: L5 (Senior)
System design Public speaking / behavioral interview
Topic
Mock System Design Interview Summary
Interview Overview
Date: 6/19/2022
Target level: L5
Duration: 45 minutes
Topic covered: distributed message queue
Drawing tool used: Jamboard
Key problem vote: ?
Requirements
Functional requirements
Distributed message queue across regions
In order delivery: input order is A, B, C, then output order is A, B, C
Support topics
Consumer will pull messages
Support multiple consumers?
Same order in a partition
Lower priority
Once and only once
Non functional requirements
Most important:
Reliable
Fault tolerant
Scaling is not as important
Smaller message.
Does not need to be as scalable as kafka
Initially we just consider a single consumer
System Design
External APIs
Producer: post(message, ID)
Consumer: pull(ID)
System design
13 minutes
Interviewee: we should use ack=3 to ensure the message is reliably persisted by 3 replicas
Interviewer: What’s in meta service?
Interviewee: similar to a zoo service. Supports leader/follower election
Interviewer: How do we maintain ordering?
Interviewee: we need to decide which component will do the ordering. We can synchronize the calls using the clock of the leader of the replica.
Interviewer: how do you define a leader? If we have leader and follower relations, why do we need metadata service?
Interviewee: some place we need to record which one is the leader.
Metadata: leader election for replicas
Locate message ID -> message storage service
Interviewee:
Consumer comes with consumerID
Returns list
Meta service converts message ID to message storage service.
Interviewer: does the service return a list of IDs, or the message bodies?
Interviewee: the API service will return a list of messages, not a single one.
Let’s say producer sends A, B, C, D
When consumer arrives
Returns A
Returns B, C
Returns D
Each they may get more than one message
Interviewer:
consumer sends CID to API server
API server looks up the message storage from metadata service
API server retrieves the messages from the message storage
Interviewee:
Correct.
Interviewer: how do you handle the ordering
Interviewee: we can write to the leader, and the leader decides the order. Then the leader sends the ordering to replicas
Interviewer: do you use timestamp?
Interviewee: we can use timestamp or sequence number
Interviewer: why not order in the API server. (I’m fine)
Interviewer: how to support high availability? Is meta service a single point of failure?
Interviewee: we use raft to coordinate between Meta service nodes. We can use zoo keeper with multiple nodes, running consensus algorithms.
Interviewer: zookeeper with more than one instance?
Interviewee: we can set up zookeeper on 5 machines
Interviewer: zookeeper itself is a cluster
Interviewee: yes
Interviewer: replica x 3. Metadata service has multiple replicas. We have 3 copies of the data. If one copy of the data is bad, what do you do?
Interviewee: if one copy is bad
Read and detect failure. We need to go back to the metadata service to allocate another storage node.
Or metadata service provides the addresses of multiple services. We may detect the corrupted replica during read. If we find out we have fewer than 3, we will metadata service to allocate another storage node, and replica data into the new node.
Interviewer: if the API returns a list of messages to the consumers, how do you make sure the same messages will not be returned again
Interviewee: API service will see the list of messages. If we have consumed the actual message, let’s say we have chunk services c1, c2, c3. We need to get acknowledgment of consumer, then we bump the watermark at the meta data.
Interviewer: why we don’t remove the messages?
Interviewee: benefit is we save the disk space. Alternative is we save it for a couple weeks. It may be good to retain the data to handle edge cases.
Audience Rating
Interviewee Self Feedback
Basically covered all requirements
The interviewer skipped some requirements. I followed the focus of the interviewer.
At the beginning I missed some requirements. I didn’t follow up right away.
A bit nervous
Soft skill
Audience: We can describe the use case.
Interviewee: it looks like the interviewer locked down the requirements
Audience: Different use case. There are multiple solutions on the market such as RocketMQ, Kafka. Throughput, latency, business requirements.
Interviewee: small scale. Focus was on fault tolerant.
Audience: Any business case?
Interviewee: small scale, similar to finance applications
Audience: it looks similar to Kafka
Interviewee: best to ask about the business
Hard skill
Interviewee: ordering. If there is only one consumer, it’s not hard to guarantee the order. However, it’s more difficult to guarantee order for multiple consumers. Every partition can only be consumed by (at most) one consumer.
Audience: timestamp or broker timestamp. If it’s based on broker timestamp, then it is fine.
Interviewee: I tried to clarify with interviewer. It looks like for one partition, we guarantee the order.
Audience, in Kafka, if we use the same key, we will send to the same partition.
If we need to ensure end-to-end, it’s more difficult, for example, we cannot allow two producers to send at the same time.
How to partition: handled by procedure
How to sequentialize, how big is the throughput, producer must coordinate with the broker.
Broker is within the storage. Every broker is a machine in the message storage.
Audience M: LB is put in broker. In producer client library will resolve the address of the broker. Using LB it may
Audience: zookeeper needs to push update to the producer.
Audience: if the master is down, will the zookeeper send it to the producer?
Audience M: producer will pull data from the coordinator of the cluster.
Audience M: yes we can also design metadata as zookeeper, and zookeeper pushes the configuration with the producer.
Kafka controller: relieved some duty from zookeeper.
Discuss tradeoff.
Use load balancer
Or directly put it in consumer
Consumer Pulls metadata from metadata
Metadata pushes metadata to consumer
Audience: How do we implement a queue? Table, in-memory, file
Interviewee: we can save everything on disk.
We can use version or timestamp to ensure
Walkahead log, save the offset
Audience: walkahead log?
Interviewee: file 1, file 2, file 3.
Audience: Are we saving multiple messages in the same file?
Interviewee: yes. Append to file, and change offset.
Audience: use reactor mode, use a single thread to write to the file.
Audience M: we can have multiple producers.
Interviewee: we can build an in-memory queue, the flush from queue to disk
Audience: scalability, partition
Interviewee: yes it’s important but the time is limited.
Audience: how to handle some producer sending first but arriving last?
Interviewee: we can provide some locking at the client side when sending the messages.
Audience: DDIA total order guaranteed by consensus algorithm.
Audience: rabbit queue can support ordering
Audience:
Reference: qingyun 算法
Audience: exactly once. Idempotent key for producer. Idempotent key for consumer.
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/