Recommendation System

Topic: Recommendation System

Interviewer: haha

Interviewee: YingYing

Level: L5 (Senior)

QRCode

Mock System Design Interview Summary

Interview Overview

Date: 12/4/2022

Target level: L5 - senior

Duration: 45 minutes

Topic covered: Recommendation system

Drawing tool used: Excalidraw

Requirements

Functional requirements

Design a youtube/tiktok short video recommendation system

DAU: 100 million users, can be scaled up to 1B users

Video average length 5-10 minutes, average browser time 1 hour

[78]

[42:22]

Provide customized recommendations list of short videos for users

Assume 20-50 videos per list

Non-functional requirements

[81][40:22]

Performance, availability

Interviewer: focus on recommendation algorithm not system

Average videos per day = 1 hour / 5 minutes = 12

[84][37:]

100 million * 12 queries / 86400 seconds = 0.14 million = 14k qps

Interviewer: Next stage we will scale to 1 billion users, but let’s use 100 million now

[33:51]

System Design

External APIs

System design

Feature generation

Rank

Filtering

[89][29:53]

Collect 3 types of information

User profile

Video profile

User-video interaction

Use NoSQL database such as redis to store the information

[28:03]

Add candidate generation to narrow down the candidates of videos

Add ranker to choose top candidates

Add filter to apply some business logic such as inappropriate content, and add diversity

[25:00]

Interviewer: Add requirement, right now we only use static recommendation

However, there are newly uploaded videos

Interviewee: for static data we can compute the recommendation at night

[23:00]

[92][22:00]

Interviewer: after compute at night, new videos are uploaded

Interviewee: we can compute just the delta, rerank

[20:17]

Interviewer: Let’s say there is a newly uploaded video everyday, will it change the read/write ratio of the database?

Interviewee: we are only appending to the store

Assuming metadata is about 2k per video

1 video per day * 10 M creators * 2k = 20GB additional storage per day

Interviewer: what database do you want to choose, given there are new uploads from users each day

Interviewee: video size is not too large. We can consider SQL instead of NoSQL

SQL is good at handling delta changes.

Interviewer: why do we need to consider SQL database? Just to handle delta better?

Interviewee: want to backup my statement. Any database can handle read and write. It is not very write intensive.

Interviewer: do you prefer SQL or noSQL database, given we have writing at the same time as reading.

Interviewee: we don’t need to handle read and write at the same time. Reading is overnight. Writing is ongoing. If we use SQL, locking table may cause performance issues. Therefore we should use NoSQL database

Interviewer: if we want to create dynamic recommendation, what changes do we need to make to the system?

Interviewee: 2 challenges. Data collection, and performance of candidate generation and ranker.

Interviewee: if we use a pre-trained model, we can train the model offline.

For data collection, we will use streaming data. Can use Kafka to handle data in the queue.

[10:31]

Interviewer: If a user clicks unlike on one video, how does your algorithm respond to the action?

User profile did not change

Video profile did not change

User-video interaction changed

Candidate generation will generate different results

[7:59]

Interviewer: If I have a desktop and mobile. My kid is using one client, I am using another client. From your perspective this is the same client. How do you handle this design?

[7:00]

Interviewee: we can add device ID as part of the user-video interaction module

[5:47]

Interviewer: If I login to a new device, what behavior will the system use?

Interviewee: we will add device ID in the user profile

Alternative 1: In the filter stage, we can consider the safer approach

Alternative 2: we combine the previous devices’ behavior, and filter out the inappropriate ones at the final stage.

Alternative 3: just use the latest video viewed by the user

[4:01]

Interviewer: Which components are distributed vs centralized?

Interviewee:

distributed components: load balancer, application server, user profile, video profile, user-video interaction

Candidate generations are not easily distributed. To fix performance issue we can have a pretrained model that look at historical videos in a separate process, and deploy the model to the candidate generation service.

Ranker is a centralized service

Filter is centralized service

[0:30]

Interviewer: draw small diagrams

[Time is up 7:03]

1 minute overtime

Interviewer and Audience Feedback

Interviewer

Medium to easy design

The system design overall structure is right

Some small errors in details

Should improve on how to scale the system

Interviewee

Target is machine learning engineer

Should improve

Read and write ratio. I don’t have deep enough understanding

===

Offline computation

Candidate generation

TPP: taobao prediction module

Dynamically adjust to user clicks, like/unlike

Audience: I2i similarity

Answer: cold start

R2i, r22

Facebook and doordash as reference.

Reference videos:

https://www.youtube.com/watch?v=lh9CNRDqKBk

https://www.laioffer.com/en/videos/2018-08-01-design-a-recommendation-system/

Interviewee: what did you expect when you ask about

Interviewer:

We should first compute read and write ratio. If not complete dependent on transaction.

SQL/NoSQL - what are the fundamental

Append operation - is not key difference between SQL/NoSQL

Consideration

1: do we need transactions? If yes, we lean toward SQL

QPS - very high - may need 2 phase commit to guarantee.

Right now there is no need for transaction / SQL database

If we need to change user, then it’s not append only

Ranker and filter can be decentralized.

Filter is for each user. Therefore we don’t need to be centralized.

Interviewer: I thought about only a single user.

Interviewer: which one is decentralized. I was going to ask about extending to 1 billion users

All computation can be distributed

Audience: multi-client?

Interviewer: tradeoff is device based vs user based.

[87]

Users may not login

Sparse training data

Audience: what if I just treat it as one person?

Interviewer: the same account may be used by multiple people

Interviewee: kid use vs adult use

Interviewer: most cases I have seen mixed together multiple devices

Audience: is it per-item recommendation or per-user

Interviewer: should be based on item

Audience: should I give interviewer multiple choice or open-ended what the interviewer thinks is important

Interviewer: multiple choice is best

Audience: If we finish drawing the high level diagram, then should we ask the interviewer where to go next?

Interviewer: interviewers usually should drive

林老师： if there is silence, then give multiple choice is better than open ended

Audience: how to make it more dynamic?

Interviewer: we started with static. It is fast

Using dynamic, we can improve the accuracy

Interviewee:

Offline compute based on static model

On the fly is very slow.

Audience:

How to improve machine learning model

Feature: valuation

Offline evaluation, online evaluation

Like, unlike is a feedback loop

ML model can predict the engagement of the user

Hand-craft feature or deep-learning feature

Clickthrough improvement - can be used roll out new models

Development

Audience: Are there key points you prepared to verify?

Audience: There are usually cache for database

Interviewer: default we will add cache for database

Audience

We can use noSQL

It’s probably a high read low write

Therefore we use noSQL

Audience

What do you like to test the most?

Interviewer:

After model is deployed, it’s simply an API

Audience: NoSQL vs SQL

Materials — open to everyone, no sign-in