Recommendation System
Materials — open to everyone, no sign-in
Topic: Recommendation System
Interviewer: haha
Interviewee: YingYing
Level: L5 (Senior)
Sign Up Form:
QRCode
Mock System Design Interview Summary
Interview Overview
Date: 12/4/2022
Target level: L5 - senior
Duration: 45 minutes
Topic covered: Recommendation system
Drawing tool used: Excalidraw
Requirements
Functional requirements
Design a youtube/tiktok short video recommendation system
DAU: 100 million users, can be scaled up to 1B users
Video average length 5-10 minutes, average browser time 1 hour
[78]
[42:22]
Provide customized recommendations list of short videos for users
Assume 20-50 videos per list
Non-functional requirements
[81][40:22]
Performance, availability
Interviewer: focus on recommendation algorithm not system
Average videos per day = 1 hour / 5 minutes = 12
[84][37:]
100 million * 12 queries / 86400 seconds = 0.14 million = 14k qps
Interviewer: Next stage we will scale to 1 billion users, but let’s use 100 million now
[33:51]
System Design
External APIs
System design
Feature generation
Rank
Filtering
[89][29:53]
Collect 3 types of information
User profile
Video profile
User-video interaction
Use NoSQL database such as redis to store the information
[28:03]
Add candidate generation to narrow down the candidates of videos
Add ranker to choose top candidates
Add filter to apply some business logic such as inappropriate content, and add diversity
[25:00]
Interviewer: Add requirement, right now we only use static recommendation
However, there are newly uploaded videos
Interviewee: for static data we can compute the recommendation at night
[23:00]
[92][22:00]
Interviewer: after compute at night, new videos are uploaded
Interviewee: we can compute just the delta, rerank
[20:17]
Interviewer: Let’s say there is a newly uploaded video everyday, will it change the read/write ratio of the database?
Interviewee: we are only appending to the store
Assuming metadata is about 2k per video
1 video per day * 10 M creators * 2k = 20GB additional storage per day
Interviewer: what database do you want to choose, given there are new uploads from users each day
Interviewee: video size is not too large. We can consider SQL instead of NoSQL
SQL is good at handling delta changes.
Interviewer: why do we need to consider SQL database? Just to handle delta better?
Interviewee: want to backup my statement. Any database can handle read and write. It is not very write intensive.
Interviewer: do you prefer SQL or noSQL database, given we have writing at the same time as reading.
Interviewee: we don’t need to handle read and write at the same time. Reading is overnight. Writing is ongoing. If we use SQL, locking table may cause performance issues. Therefore we should use NoSQL database
Interviewer: if we want to create dynamic recommendation, what changes do we need to make to the system?
Interviewee: 2 challenges. Data collection, and performance of candidate generation and ranker.
Interviewee: if we use a pre-trained model, we can train the model offline.
For data collection, we will use streaming data. Can use Kafka to handle data in the queue.
[10:31]
Interviewer: If a user clicks unlike on one video, how does your algorithm respond to the action?
User profile did not change
Video profile did not change
User-video interaction changed
Candidate generation will generate different results
[7:59]
Interviewer: If I have a desktop and mobile. My kid is using one client, I am using another client. From your perspective this is the same client. How do you handle this design?
[7:00]
Interviewee: we can add device ID as part of the user-video interaction module
[5:47]
Interviewer: If I login to a new device, what behavior will the system use?
Interviewee: we will add device ID in the user profile
Alternative 1: In the filter stage, we can consider the safer approach
Alternative 2: we combine the previous devices’ behavior, and filter out the inappropriate ones at the final stage.
Alternative 3: just use the latest video viewed by the user
[4:01]
Interviewer: Which components are distributed vs centralized?
Interviewee:
distributed components: load balancer, application server, user profile, video profile, user-video interaction
Candidate generations are not easily distributed. To fix performance issue we can have a pretrained model that look at historical videos in a separate process, and deploy the model to the candidate generation service.
Ranker is a centralized service
Filter is centralized service
[0:30]
Interviewer: draw small diagrams
[Time is up 7:03]
1 minute overtime
Interviewer and Audience Feedback
Interviewer
Medium to easy design
The system design overall structure is right
Some small errors in details
Should improve on how to scale the system
Interviewee
Target is machine learning engineer
Should improve
Read and write ratio. I don’t have deep enough understanding
===
Offline computation
Candidate generation
TPP: taobao prediction module
Dynamically adjust to user clicks, like/unlike
Audience: I2i similarity
Answer: cold start
R2i, r22
Facebook and doordash as reference.
Reference videos:
https://www.youtube.com/watch?v=lh9CNRDqKBk
https://www.laioffer.com/en/videos/2018-08-01-design-a-recommendation-system/
Interviewee: what did you expect when you ask about
Interviewer:
We should first compute read and write ratio. If not complete dependent on transaction.
SQL/NoSQL - what are the fundamental
Append operation - is not key difference between SQL/NoSQL
Consideration
1: do we need transactions? If yes, we lean toward SQL
- QPS - very high - may need 2 phase commit to guarantee.
Right now there is no need for transaction / SQL database
If we need to change user, then it’s not append only
Ranker and filter can be decentralized.
Filter is for each user. Therefore we don’t need to be centralized.
Interviewer: I thought about only a single user.
Interviewer: which one is decentralized. I was going to ask about extending to 1 billion users
All computation can be distributed
Audience: multi-client?
Interviewer: tradeoff is device based vs user based.
[87]
Users may not login
Sparse training data
Audience: what if I just treat it as one person?
Interviewer: the same account may be used by multiple people
Interviewee: kid use vs adult use
Interviewer: most cases I have seen mixed together multiple devices
Audience: is it per-item recommendation or per-user
Interviewer: should be based on item
Audience: should I give interviewer multiple choice or open-ended what the interviewer thinks is important
Interviewer: multiple choice is best
Audience: If we finish drawing the high level diagram, then should we ask the interviewer where to go next?
Interviewer: interviewers usually should drive
林老师: if there is silence, then give multiple choice is better than open ended
Audience: how to make it more dynamic?
Interviewer: we started with static. It is fast
Using dynamic, we can improve the accuracy
Interviewee:
Offline compute based on static model
On the fly is very slow.
Audience:
How to improve machine learning model
Feature: valuation
Offline evaluation, online evaluation
Like, unlike is a feedback loop
ML model can predict the engagement of the user
Hand-craft feature or deep-learning feature
Clickthrough improvement - can be used roll out new models
Development
Audience: Are there key points you prepared to verify?
Audience: There are usually cache for database
Interviewer: default we will add cache for database
Audience
We can use noSQL
It’s probably a high read low write
Therefore we use noSQL
Audience
What do you like to test the most?
Interviewer:
After model is deployed, it’s simply an API
Audience: NoSQL vs SQL