Design Auction System
Topic: Design Auction System
Interviewer: video from 4-10-2022
Interviewee: system
Level: L5 (Senior)
Interview Notes
Starting at [7:00]
Q: Explain the QPS estimates
Query (read) QPS
Bidding (write) QPS
[20:40]
Event bus: AWS Lambda, Google Guava EventBus
Design table schema
[dived in too deep]
Q: What's the high-level flow?
Seller:
POST v1/enlist: productID, quantity, lowestprice, seller
POST v1/createauction
[struggled a bit with arrow]
[Are there multiple quantities per product?]
[27:33]
Q: focus on bidding side
WebSocket connection to the bid agent
[30:30]
Kafka queue for bidding
Flink for streaming
[what are the topics?]
Streaming service to calculate the top price
[does the bid agent need to get the latest price?]
[Why are there 2 Kafka queues?]
Redis holds the current top price
The bidding service and the bid agent can both read it from Redis
API:
POST v1/bid
Flow: POST bid -> Kafka -> streaming service
On POST bid:
Compare the bid with the current top price
If it is higher than the top price, put it into Kafka
At the end of the auction, the final top price is calculated
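A minimal sketch of the bid flow described above, using in-memory stand-ins rather than real Redis or Kafka clients. The names `Bid`, `BidService`, `post_bid`, and `run_streaming_step`, and the choice to store prices as integers, are assumptions made for illustration and do not come from the interview.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bid:
    auction_id: str
    bidder_id: str
    price: int  # assumed to be in minor units (e.g. cents)


@dataclass
class BidService:
    # Stand-in for Redis: auction_id -> current top price.
    top_price: Dict[str, int] = field(default_factory=dict)
    # Stand-in for the Kafka bidding topic.
    queue: List[Bid] = field(default_factory=list)

    def post_bid(self, bid: Bid) -> bool:
        """Accept the bid only if it beats the cached top price."""
        current = self.top_price.get(bid.auction_id, 0)
        if bid.price <= current:
            return False  # rejected before it reaches the queue
        self.queue.append(bid)  # "put into Kafka"
        return True

    def run_streaming_step(self) -> None:
        """Stand-in for the streaming job: fold queued bids into the top price."""
        while self.queue:
            bid = self.queue.pop(0)
            if bid.price > self.top_price.get(bid.auction_id, 0):
                self.top_price[bid.auction_id] = bid.price
```

The streaming step re-checks the price because the cached value may be stale by the time a queued bid is processed.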
[40:00]
Data structure to store the list of prices
[dived too deep?]
Q: How do we store data in the database?
A: Product table, inventory table, auction table
Wide-column NoSQL database. Reasons:
Not many joins
No need for transactions
LSM-tree database for faster writes
That said, we may need to support distributed transactions:
When a seller inserts data, we also need to make sure it is indexed in Elasticsearch
2 Kafka queues: to support exactly-once processing
Streaming: distributed transactions
Q: How do we update the database?
[47:00]
NoSQL database
Q: NoSQL. Throughput is high. Is it still high?
A: streaming process. Debugging, testing purpose.
We use a write-back cache, so we still write to the DB.
In theory, every bid is written to the DB
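A rough sketch of the write-back idea: bids land in a cache first and are flushed to the database in batches, so every bid still reaches the DB eventually. The class name `WriteBackBidCache`, the `flush_size` parameter, and the list used as a database are all hypothetical stand-ins.

```python
from typing import Dict, List, Tuple


class WriteBackBidCache:
    """Buffer bids in memory and flush them to the database in batches."""

    def __init__(self, flush_size: int) -> None:
        self.flush_size = flush_size
        self.cache: Dict[str, int] = {}      # bid_id -> price (stand-in for the cache)
        self.dirty: List[str] = []           # bid_ids not yet persisted
        self.db: List[Tuple[str, int]] = []  # stand-in for the append-only DB

    def write(self, bid_id: str, price: int) -> None:
        # The write is acknowledged once it is in the cache.
        self.cache[bid_id] = price
        self.dirty.append(bid_id)
        if len(self.dirty) >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        # Persist every pending bid, then mark the buffer clean.
        self.db.extend((bid_id, self.cache[bid_id]) for bid_id in self.dirty)
        self.dirty.clear()
```

The trade-off is that bids still sitting in the dirty buffer are lost if the cache fails before a flush.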
[51:00]
Q: How do we handle a large volume of data in the database?
A: The data is append-only
HDFS
We can shard the database by auction ID
Time range: separate hot and cold data. We can archive cold data
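Sharding by auction ID and splitting hot from cold data could look roughly like this. The hash choice (SHA-256), the function names `shard_for` and `is_cold`, and the age threshold are assumptions for illustration only.

```python
import hashlib


def shard_for(auction_id: str, num_shards: int) -> int:
    """Map an auction ID to a shard index with a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(auction_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def is_cold(auction_end_ts: float, now_ts: float, cold_after_seconds: float) -> bool:
    """Treat an auction as cold once it has been closed for long enough."""
    return now_ts - auction_end_ts > cold_after_seconds
```

Because all bids for one auction hash to the same shard, the top price for an auction can be computed without cross-shard reads.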
Discussion
We need a consumer to read from Redis
Alternatives
Kafka supports persistence, but the top price still has to be calculated from the stream
Redis supports top-K through a sorted set (ZSET), but persistence is harder to support
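To make the top-K idea concrete, here is an in-memory model of a sorted set scored by price. In Redis this would be done with sorted-set commands such as `ZADD` and `ZREVRANGE`; the class `TopKBids` and the rule of keeping only each bidder's best price are assumptions, not something stated in the interview.

```python
import heapq
from typing import Dict, List, Tuple


class TopKBids:
    """In-memory model of a sorted set: member = bidder, score = price."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}

    def add(self, bidder_id: str, price: int) -> None:
        # One entry per bidder; keep only that bidder's highest price (assumption).
        if bidder_id not in self._scores or price > self._scores[bidder_id]:
            self._scores[bidder_id] = price

    def top(self, k: int) -> List[Tuple[str, int]]:
        # Highest-priced entries first.
        return heapq.nlargest(k, self._scores.items(), key=lambda kv: kv[1])
```

Usage: after `add("u1", 100)`, `add("u2", 150)`, and `add("u1", 120)`, calling `top(2)` returns `[("u2", 150), ("u1", 120)]`.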
Exactly-once through 2 Kafka queues
Idempotency key in the first Kafka queue
Two-phase commit: read the Flink paper
If we were to do the interview again, we would not use 2 Kafka queues
Why do we need two-phase commit? Why exactly-once?
If I bid the highest price, we need to make sure the bid is delivered “at least once”
We can ensure “at most once” processing through an idempotency key
Combining these two, we achieve exactly-once.
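The combination of at-least-once delivery and an idempotency key can be sketched as a consumer that ignores keys it has already processed. `IdempotentConsumer` and its fields are hypothetical names; the in-memory set stands in for a durable dedupe store.

```python
from typing import List, Set


class IdempotentConsumer:
    """Apply each message at most once, even if it is delivered several times."""

    def __init__(self) -> None:
        self.seen: Set[str] = set()   # idempotency keys already processed
        self.applied: List[int] = []  # side effects, here just recorded prices

    def handle(self, idempotency_key: str, price: int) -> bool:
        if idempotency_key in self.seen:
            return False  # duplicate delivery: skip the side effect
        # In a real system the side effect and the key must be committed atomically.
        self.applied.append(price)
        self.seen.add(idempotency_key)
        return True


def deliver_at_least_once(consumer: IdempotentConsumer, key: str, price: int, attempts: int) -> None:
    """Simulate a producer that retries, so the same message arrives several times."""
    for _ in range(attempts):
        consumer.handle(key, price)
```

Retries give the "at least once" half; the key check gives the "at most once" half.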
Can we aggregate directly on the server? That would reduce traffic load on the downstream network.
Bid service can store all requests, e.g. blob storage or HDFS
Why do we need to store top-K? Why not just the top bid?
I over-planned it. If the first payment fails, we need to fall back to the second-highest bid
Top-1: if 2 people bid the same price, which bid is valid?
Fake bidding: should we have a backup?
2 bidders at the same time? Using Redis, we can serialize the bids
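The notes leave the tie-break question open. One possible convention, assumed here and not stated in the interview, is that the earliest bid at the highest price wins. `TimedBid`, `received_at_ms`, and `pick_winner` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class TimedBid(NamedTuple):
    bidder_id: str
    price: int
    received_at_ms: int  # server-side arrival time (assumption)


def pick_winner(bids: List[TimedBid]) -> Optional[TimedBid]:
    """Highest price wins; among equal prices, the earliest arrival wins."""
    if not bids:
        return None
    return min(bids, key=lambda b: (-b.price, b.received_at_ms))
```

Using a server-side arrival time avoids trusting client clocks, but it only gives a total order if a single component, such as one Redis shard, assigns it.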
Sharding: can use auction ID to shard Redis.
QPS may be very high
ZSet: distributed lock
Set key + UUID of auction + price
Redis supports distributed locks
Set key, request ID (auction ID)
Redis will reject subsequent sets
Price and timestamp
Create if not exists
Jedis has many options, e.g. set key if not exists, timeout
SETNX: https://redis.io/commands/setnx/
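The "set if not exists, with timeout" behavior can be modeled in memory as below. In Redis itself this corresponds to `SET key value NX EX seconds`. `FakeRedis`, `set_nx`, and `release` are illustrative names and not the API of any real client library.

```python
import time
from typing import Callable, Dict, Tuple


class FakeRedis:
    """Tiny in-memory model of set-if-not-exists with a TTL; not a real client."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)
        self._clock = clock

    def set_nx(self, key: str, value: str, ttl_seconds: float) -> bool:
        """Set the key only if it is absent or expired. Returns True on success."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False  # someone else holds the lock
        self._data[key] = (value, now + ttl_seconds)
        return True

    def release(self, key: str, value: str) -> bool:
        """Delete the key only if the caller's value still owns it."""
        entry = self._data.get(key)
        if entry is not None and entry[0] == value and entry[1] > self._clock():
            del self._data[key]
            return True
        return False
```

The timeout matters because a holder that crashes would otherwise block every later bidder on that key. Against a real Redis, the owner check and delete in `release` need to run atomically, typically as a Lua script.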
2 designs
Only handle highest price
Or handle top-K
Redis is single-threaded
What if Redis has multiple instances?
Key should only contain auction ID + product ID, no price or client ID
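A small illustration of why the key should leave out price and client ID: two bids on the same item must map to the same key, otherwise Redis cannot make them contend. The key format and both function names are assumptions.

```python
def lock_key(auction_id: str, product_id: str) -> str:
    """Key built only from auction ID and product ID, as the notes recommend."""
    return f"auction:{auction_id}:product:{product_id}"


def lock_key_with_price(auction_id: str, product_id: str, price: int) -> str:
    """Anti-pattern shown for contrast: the price makes every bid's key unique."""
    return f"auction:{auction_id}:product:{product_id}:price:{price}"


# Two competing bids on the same item share one key, so they are serialized.
same = lock_key("a1", "p9") == lock_key("a1", "p9")

# With the price in the key, the two bids never collide and both "win" the lock.
collide = lock_key_with_price("a1", "p9", 100) == lock_key_with_price("a1", "p9", 120)
```

Here `same` is `True` and `collide` is `False`, which is the failure mode the note warns about.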