Newsfeed Real Time Comments
Topic: Newsfeed Real Time Comments
Interviewer: ken
Level: L5 (Senior)
Additional Resources:
System Design Interview - Real time comment
6/12/2024
From FB post 2011 https://engineering.fb.com/2011/02/07/core-data/live-commenting-behind-the-scenes/
“every minute, we serve over 100 million pieces of content that may receive comments. In that same minute, users submit around 600,000 comments that need to get routed to the correct viewers.”
“Write local, read global”
YouTube for the event:
https://www.youtube.com/live/s5XBWf-UjCc
[6:08 -> 6:53]
Functional requirements
Feed
Look
Real time
10 latest comments displayed
[6:08->6:13]
Non-functional requirements
[6:08->6:19]
[
Scale
Initial scope:
1 million reads per minute
100 million users. 20 million active
10 comments per post
100 bytes per comment
20 comments per day
Scale up x100 (matching FB 2011):
High Availability
High throughput
Low latency
]
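A rough back-of-envelope pass over the initial-scope numbers above. This is only a sketch: it assumes "20 comments per day" means per active user, which the notes do not state, and it counts comment payload only (no indexes, metadata, or replication).

```python
# Back-of-envelope estimate from the initial-scope numbers in the notes.
# Assumption (not stated in the notes): "20 comments per day" is per active user.
SECONDS_PER_DAY = 24 * 60 * 60

active_users = 20_000_000
comments_per_user_per_day = 20
bytes_per_comment = 100
reads_per_minute = 1_000_000

comments_per_day = active_users * comments_per_user_per_day
writes_per_second = comments_per_day / SECONDS_PER_DAY
reads_per_second = reads_per_minute / 60
storage_per_day_gb = comments_per_day * bytes_per_comment / 1e9

print(f"comments/day:  {comments_per_day:,}")
print(f"writes/second: {writes_per_second:,.0f}")
print(f"reads/second:  {reads_per_second:,.0f}")
print(f"storage/day:   {storage_per_day_gb:,.0f} GB (payload only)")
```

Under these assumptions the initial scope works out to roughly 4,600 writes per second, 16,700 reads per second, and 40 GB of new comment payload per day.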
[may want to ask about the number of users]
API:
POST /comment/create feedId, comment
GET /comments/feedId -> list of historical comments
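A minimal in-memory sketch of the two endpoints, to make the request and response shapes concrete. The class and method names are hypothetical, no web framework is involved, and `user_id` is assumed to come from the authenticated session rather than the request body. The default limit of 10 follows the "10 latest comments displayed" requirement.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List
import time


@dataclass
class Comment:
    comment_id: int
    feed_id: str
    user_id: str
    text: str
    created_at: float = field(default_factory=time.time)


class CommentService:
    """In-memory stand-in for the two endpoints (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._by_feed: Dict[str, List[Comment]] = {}
        self._ids = count(1)

    def create_comment(self, feed_id: str, user_id: str, text: str) -> Comment:
        # Corresponds to POST /comment/create
        comment = Comment(next(self._ids), feed_id, user_id, text)
        self._by_feed.setdefault(feed_id, []).append(comment)
        return comment

    def get_comments(self, feed_id: str, limit: int = 10) -> List[Comment]:
        # Corresponds to GET /comments/feedId: latest `limit` comments, newest first
        return self._by_feed.get(feed_id, [])[-limit:][::-1]
```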
[6:08->6:21]
Data entities
Comment: commentId, feedId, userId, comment
PostMetaData: ID, userId,
User: userId,
[was a bit confused on feed vs post]
[6:08->6:26] [6:53]
High level design
SQL and NoSQL are both fine given the low number of comments
NoSQL can be a fit
For now, no SQL store is used
If we need to support high throughput
Queue in the middle
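One way to picture the "queue in the middle": the API handler only enqueues the comment and returns, while a background worker drains the queue and writes to the store in batches. The sketch below uses Python's standard `queue` and `threading` modules as a stand-in for a real message queue; the batch size and the list used as a "store" are assumptions for illustration.

```python
import queue
import threading

STOP = object()  # sentinel telling the worker to exit


def writer_worker(q, store, batch_size=100):
    """Drain the queue and persist comments in batches (store is a plain list here)."""
    batch = []
    while True:
        item = q.get()
        if item is STOP:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            store.extend(batch)  # one bulk write instead of many small ones
            batch = []
    store.extend(batch)  # flush whatever is left


def run_demo(n=250):
    q = queue.Queue()
    store = []
    worker = threading.Thread(target=writer_worker, args=(q, store))
    worker.start()
    for i in range(n):
        # The API handler would return to the client right after this put().
        q.put({"feedId": "feed-1", "comment": f"comment {i}"})
    q.put(STOP)
    worker.join()
    return store
```

The queue absorbs write bursts so the database sees a steadier, batched load, at the cost of a short delay before a comment is durable.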
[6:08->6:32] [6:53]
How to get comments in real time?
Polling of the comments
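A minimal sketch of polling, assuming comment IDs increase over time so the client can ask only for comments newer than the last one it saw. The class names are hypothetical and the store is in memory; a real client would call `poll_once` on a timer.

```python
from typing import Dict, List, Tuple


class CommentStore:
    """In-memory store keyed by feed ID (illustrative only)."""

    def __init__(self) -> None:
        self._comments: Dict[str, List[Tuple[int, str]]] = {}
        self._next_id = 1

    def add(self, feed_id: str, text: str) -> int:
        comment_id = self._next_id
        self._next_id += 1
        self._comments.setdefault(feed_id, []).append((comment_id, text))
        return comment_id

    def since(self, feed_id: str, last_seen_id: int) -> List[Tuple[int, str]]:
        # Return only comments newer than the client's cursor.
        return [(cid, text) for cid, text in self._comments.get(feed_id, [])
                if cid > last_seen_id]


class PollingClient:
    def __init__(self, store: CommentStore, feed_id: str) -> None:
        self.store = store
        self.feed_id = feed_id
        self.last_seen_id = 0  # cursor: highest comment ID seen so far

    def poll_once(self) -> List[Tuple[int, str]]:
        new_comments = self.store.since(self.feed_id, self.last_seen_id)
        if new_comments:
            self.last_seen_id = new_comments[-1][0]
        return new_comments
```

Most polls return nothing, which is why read throughput becomes the concern as the number of viewers grows.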
[6:08->6:37] [6:53]
[what if the read throughput is too high?]
Split read vs write
Real time service
Subscribe to a queue
Each comment is pushed into the queue
Kafka or Redis
There could be lots of topics
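The push model can be sketched with a toy in-memory broker: the real-time service subscribes on behalf of connected viewers, and each new comment is published to a topic. This is not the Kafka or Redis API, and the one-topic-per-post naming (`post:42`) is an assumption used only for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Toy publish/subscribe broker (illustrative; not Kafka or Redis)."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        handlers = self._subs.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
viewer_a, viewer_b = [], []  # stand-ins for two open client connections
broker.subscribe("post:42", viewer_a.append)
broker.subscribe("post:42", viewer_b.append)
delivered = broker.publish("post:42", {"userId": "u1", "comment": "hello"})
```

With one topic per post the number of topics grows with the number of posts, which is the "lots of topics" concern noted above.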
[6:08->6:37] [6:53]
Hash users to topics
Partition the topics
Hash the post into topics
Or hash the user into topics
User ID may be better
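To keep the number of topics bounded, a key can be hashed onto a fixed set of topics. The sketch below assumes 64 topics (an arbitrary number) and uses CRC32 because Python's built-in `hash()` for strings is randomized per process; either the post ID or the user ID can serve as the key.

```python
import zlib

NUM_TOPICS = 64  # hypothetical fixed topic count


def topic_for(key: str, num_topics: int = NUM_TOPICS) -> int:
    """Map a key to a topic index with a stable, process-independent hash."""
    return zlib.crc32(key.encode("utf-8")) % num_topics


def topic_for_post(post_id: str) -> int:
    return topic_for(f"post:{post_id}")


def topic_for_user(user_id: str) -> int:
    return topic_for(f"user:{user_id}")
```

Hashing by post keeps all comments for a post in one topic but lets a viral post create a hot topic. Hashing by user spreads load across viewers but means a comment may need to be routed to several topics, one per viewer's topic.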
[6:08->6:32] [6:53]
Kafka
Multiple readers per topic
How does Kafka work?
Support
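A toy model of why multiple readers work: in Kafka a topic partition is an append-only log, and each consumer group tracks its own read offset, so independent readers do not consume each other's messages. The class below is an in-memory illustration of that idea only; it is not the Kafka client API, and the group names are made up.

```python
class PartitionLog:
    """Toy append-only log; each reader group keeps its own offset."""

    def __init__(self) -> None:
        self._records = []
        self._offsets = {}  # group name -> next offset to read

    def append(self, record) -> int:
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> list:
        """Return the next unread records for this group and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch
```

Reading does not remove records from the log, so a second group (for example, one feeding analytics) can replay the same comments at its own pace.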
[6:08->6:52] [6:53]