Distributed Key-Value Store
Materials — open to everyone, no sign-in
Topic: Distributed Key-Value Store
Interviewer: ken
Interviewee: ryandai
Level: L4 (Experienced Individual Contributor)
Additional Resources:
系统设计活动报名
职位申请报名
https://commitway.com/job-refer
YouTube channel
https://www.youtube.com/channel/UCKzpuki3fHTfCCngJDCZ_Mg
QRCode
Mock System Design Interview Summary
Interview Overview
Date: 2/26/2022
Target level: L4 - experienced IC
Duration: 45 minutes
Topic covered: Key-value store
Drawing tool used:
Requirements
[0:00]
Functional requirements
Design the next generation of distributed KV store that will support services:
authentication, authorization, messaging, etc
Won’t be used to support payment
Range query and batch query/update
Technical implications: B+ tree more ideal than hashtables
Eventual consistency
Technical implication: no need for transactions
in-memory?
Scaling requirements
K/V Scale linearly
Multi data center after 5 years
Latency: Users should have real time experience
Availability: 99.9%
Robustness: user data should not get lost
Technical implications: how to ensure multiple replicas
read/write ratio = 99/1
Technical implication: B+ tree is better than LSM tree
CAP: Availability > Partition > Consistency
Capacity planning
1 petabytes of data, Multi data center after 5 years
1M queries per second: 【-】
10k write QPS, 990k read QPS
Technical implications:
need to distribute to multiple nodes
Some mechanism to route requests to the right nodes.
How to reduce master pressure?
Key <= 1kb, Value <= 8kb 【-】
Average k+v size 2kb => 2kb * 10k / s
[39:40 - 102]
API
getDataRequest(user_token, key, version) - res, error return a string as result
Return a string as result, version_id
Error: error_code, error message
[range query: pagination?]
===
Architecture
Data model
Key-value storage
Key global unique ID, value: string/integer/json
Key methodology
Hash function to convert key to storage key
Handle hash collision
[seems to be offrail since we need to support range query]
Open address
Separate chain
[31:27]
Sharding key: storage key
To support range query:
0-100 in node 1
101-200 in node 2
[hash to node number] [then hash to storage key]
There may be hot-spot 【+】
Hot-spot
Alternative sharding strategy:
we can use consistent hashing
Will resolve hot-key issue
[26:52]
[hoped to walk through the “life of a query”]
【scale up】- too early
Storage is 1 PB
10 TB per node is reasonable
100 nodes
May expand to more data centers in the future
Data sharding in region, or globally
Depends on data capacity
Servers
If we only shard in one region, we can synchronize data between different regions. We can query close to the customer
[22:53]
Range query: node 1 - node 2
50-100
101-150
Config service, key range in each node
Q: range query in a node?
A: we will not hash the key into a storage key
Hash table
Disk only - list of data, without memory
In-memory (binary tree) + disk (append only) - LSM tree
Q: in-memory
A: list will be in memory. Key will be sort-key
Physical storage - sort key
Mongo or dynamoDB, key - string, value - json file
[15]
Access each row to access the value
Append to the result
A:
Range query (lowkey, high key, pagination, term)
===
Noticed the high QPS
write to master
multiple read replicas (data synced from master node, sync or async): sync write
===
Consider cache between storage and disk
===
Add Cache [ ok, but a bit too early ]
Add LRU cache
Independent service
[9:30]
How do we insert?
Data storage write to key-value
Find node 1
Node 1: Resort: Database will find the
B tree strategy, logN
No sq in file
Append to the end of file
==
[1:30]
Leader election to handle master node failure
Read replica failure
===
====
Soft skill: meet expectation
Hard skill:
====
===
Interviewee:
Needed to reiterate. Hashcode -> range query
Key value: not very clear how to get
是否一步到位
Interviewer:
===
Audience discussion
Possible drill down points:
Consistent hashing
Kafka queue
Read cache - local + global cache
Configuration service, zookeeper
Consensus algorithm
Key-value, noSQL database
Sharding: key as sharding
Discussion
R » W, so no need for queue
Discussion
Highly available, strong consistency
Dynamo or cassandra pattern
Availability:
Split brain, network partition.
Cassandra - no master, sloppy quorum
Leader election: supports strong consistency not as availability
Discussion
Strong or eventual consistency: can be solved by leader election
Discussion
Should add more estimation
Discussion
cache design
Should we use queue?
Depends on latency requirement
Should clarify consistency requirement
Discussion:
Similar to SQL database + cache. Write through
Range query can be handled by SQL
Range query vs get/set. We can add cache and SQL
Range query 多坑
Discussion:
Can we propose Redis + rockDB on database
We are designing non-sql database
Focus should be on distributed
Discussion
dynamoDB uses berkeleyDB (B+ tree based)
Cassandra uses LSM tree
Discussion
Bloom filter index tree
===
===
Reference:
https://docs.google.com/document/d/1pqsaPts9fLFt2x3YFmG6D0KK-u3-wPIP6Nq4JAinGhY/edit#
Reference Design
===
Grading Criteria
Soft skill
Clarify requirements
Discuss trade offs
Present clearly; right technical terms
Pace the interview
Hard skills
Design quality
Basic facts and tradeoffs
Project and product lifecycle awareness
Problem Specific Criteria
Basics
Knowledge of Hashtable, B+, LSM tree; Tradeoffs
Data partitioning / traffic routing
Achieving availability
Achieving durability
Bonus:
Knowledge of specific technology or algorithm
===
Areas covered in 1-1 system design training
Part 1: High throughput infrastructure
Notification
Rate limiter
TopK
Key value store
Typeahead suggestion
Distributed message queue
Part 2: High volume infrastructure
Cloud file system (e.g. design google drive, dropbox)
Distributed log collection
Ads logging
Part 3: Collaboration Applications
Multi-user chat
News Feed
News Feed Real Time Comments
Like-unlike
Calendar
Part 4: Distributed Transaction Applications
Ebay auction
inventory management
Ticketmaster
Uber payment
Part 5: Content sharing Applications
YouTube
Google Photos
TinyURL
Part 6: Geography Applications
Design Uber
Design Yelp