Like/Unlike System
Topic: Like/Unlike System
Interviewer: 苏鹏 Peng
Interviewee: 桂
Level: L5 (Senior)
Topic
Mock System Design Interview Summary
Interview Overview
Date: 2/27/2022
Target level: L5
Duration: 45 minutes
Topic covered: like service
Drawing tool used: https://app.tryeraser.com/
Requirements
Functional requirements
Our website has a marketplace with millions of items for sale. For each item, users can choose to like it. An item which is liked can be unliked.
As a user, I want to like/unlike an item
As a user, I want to know if an item has been liked by me
As a user, I want to see how many likes an item has
Please design a system to support this like/unlike feature
Non functional requirements
Start with a small number of users
How accurate does it need to be?
Not high priority
Actual count not need to be most up to date
But should give feedback for increase and decrease (like/unlike)
System Design
External APIs
API:
like(user, itemId)
unlike(user, itemId)
getLikeTotalCount(itemId)
islikedby(user, itemId)
Db:
user, itemID
itemId, totlaLikes
100Mb scale
System design
Choice of database
PostgreSQL. Native support of change feed
Usually favor relational DB, easy to manage
Supports consistency. Relational db much easier
2DBs: They are different logical storage
What happens when user like and then unlike?
Remove the row, or mark removal with a flag.
Tradeoff:
Doesn’t impact like, unlike, islikedby as much
DB:
User, itemId, liked
PK: user, itemId
Not a huge difference (if user clicks like/not like very often)
If you delete a lot of things from database, does it affect DB?
like/unlike history, is it useful?
Database engine,
MySQL: B+ tree
(LMS tree is the other popular data organization).
Creation/deletion,
Intuitively add/remove is more heavy weight
Now let’s scale it up
like/unlike: 100,000 QPS
getLikeTotalCount: 1M QPS
isLikedBy: 1M QPS
Does API need to be strongly consistent?
If I like someone or some item, the button turns red.
Read after write consistency
Total likes: doesn’t need to be very accurate
DB size is small enough
OneDB or just shard based on item
Real browsing traffic, no hot user
We can shard by userID, can result in the right partition
4 or 5 shard if we shard by user
Userlike: shard by user
totalLikes: shard by itemID
Active 100k
Total users ~1M
20 TB
Storage is not a bottleneck
Which part is likely to be a bottleneck
If we have 1M users, 100 shards
The likeService
Want this to be resilient to avoid single point of failure
Add load balancer
Interviewer hint: can change DB
If we don’t care about read/write consistency, we can have other options
Compute hash in like service, compute which shard the request hits
Fixed number of shards to get started
Doesn’t do auto scaling
If we need to spin up new instance, it disrupts the hash algorithm
Ideally it should automatically scale up
Static sharding: maybe one shard has too much data
Walk through data flow
Frontend: load balancer. Like service is stateless
Each instance of like service, need to compute which shard to go to based on userID (hashcode)
Write to the userLike DB instance based on the hashcode
Storage itself is not high
We just need more read replica for totalLike table
Add likeCount service
UserLike databases uses master slave
Which DB can fit the need for userLike table?
If someone liked something, and open a different database
mongo DB -> within one client session it guarantees read / write consistency
We can still use relational DB due to strong consistency requirement
Number of sharding doesn’t need to change that often
Increase or decrease # of replicas for each shard
For strong consistency, isLikedBy(user, itemId)
Like,unlike 100k QPS, peak may be 3 times.
100 databases in cluster
10 databases is still too much
Partition out of box, mongoDB, provides sharding out of box
Can cache help?
The cache can still fail. If we can compromise in consistency, then cache will help
Add cache
Cache crashes, we may lose some write
Losing write
Based on QPS, cache may not be necessary
Distributed cache, local cache
Sticky like service, can cache locally
Strong consistency
Very difficult to use cache
Add partition
Interviewer and Audience Feedback
Some challenges
Mock interview, somewhat nervous
Requirement clarification is good
MVP - can work
Scalability and resiliency, not too satisfied
Bottleneck to optimize
重要考点
Read heavy - need to use a lot of cache
Using sharding and replica - cache may be better
Try to hint to use cache
Candidate x
Strong consistency
Can use local cache for strong consistency
Like service can have a cache, like and unlike, just write to local cache
Then it can send to queue, then update the db
If local cache - fail from DB
Local - session stickiness
Like - and like is successful
If the server didn’t send success
Not write to DB
Write to cache and write to a highly available queue
==
Interviewer does not use regular ways to give question
Talk too fast, can pause 5 seconds how to solve
Oncall
How to onboard new developer
First ask about the situation clearly
Don’t need to jumpt to answer
Interviewer split into smaller scale question
Then a large scale question
First high level design
The biggest is scalability
==
Is there sharding in your
Like service - should scale up and down based on
Stateful vs stateless
100k QPS, a big write
Batch write - backend service
Change data capture from userLikeDatabase, this one is good
On top of totalLike, we can add a cache
Like, unlike, only for myself
keyValue pair
High QPS - 100k write cannot be handled
NoSQL, cassandra or dynamoDB
Why is mysql difficult?
A single mysql
Manual sharding, 10 different database, static sharding
Split the database
Backup, snapshot, upgrade, maintenance
10->20 need to manual split
Cache, message queue sharding
Metas tool?
Write speed -> 100k hard to reach for traditional database
Load balance, can put in a lot of like service
We can handle high QPS
Local cache
When you write, write to cache
When you return, you must have successfully write to both cache and queue
User session sticky to like service
What happens to like service?
Change DB to key-value db
Total like needs to go to cache