Cloud File Storage
Topic: Cloud File Storage
Interviewer: Tekken
Interviewee: Tom (慢慢失败)
Level: L5 (Senior)
Mock System Design Interview Summary
Interview Overview
Date: 2/6/2022
Target level: L5
Duration: 45 minutes
Topic covered: Cloud storage system
Drawing tool used: Excalidraw.com
Requirements
Functional requirements
Google drive: can edit/download/upload
Dropbox: local folder and cloud folder sync
Permission control
public/private
List files in the client/web
View files in the client/web
Sync the local files to cloud
Upload files/Download files/Delete files
Non-functional requirements
Availability: 99.9%
Reliability: do not lose user data
Latency: listing files should feel real-time to the user
Capacity:
DAU - 10M
Total users - 50M
Assume: on average, every user has 1 GB files in our cloud
Storage: 50M * 1 GB = 50 PB
Network bandwidth:
Assumption: on average, each daily active user uploads 100 MB per day
Upload network bandwidth = 10M * 100 MB / (24 h * 3600 s) ≈ 11.6 GB/s
1:2 upload/download ratio
Download network bandwidth = 20M * 100 MB / (24 h * 3600 s) ≈ 23.1 GB/s
Number of files per user:
Assume on average, a file has a size of 50 MB
=> 1 GB / 50 MB = 20 files for a single user (200 files per user would correspond to a 5 MB average file size)
Interviewer: 10 billion files total
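The estimates above can be re-derived with a short script. This is only a sketch: the constant names are illustrative, and decimal units (1 GB = 1000 MB, 1 PB = 10^6 GB) are an assumption. With a 50 MB average file the script gives 20 files per user and 1 billion files overall; a total of 10 billion files corresponds to 200 files per user, which is roughly a 5 MB average file.

```python
# Back-of-the-envelope capacity estimate from the assumptions in the notes.
# Decimal units are assumed: 1 GB = 1000 MB, 1 PB = 1_000_000 GB.

DAU = 10_000_000             # daily active users
TOTAL_USERS = 50_000_000     # total registered users
STORAGE_PER_USER_GB = 1      # average data stored per user
UPLOAD_PER_DAU_MB = 100      # average daily upload per active user
DOWNLOAD_TO_UPLOAD = 2       # 1:2 upload/download ratio
AVG_FILE_SIZE_MB = 50        # average file size
SECONDS_PER_DAY = 24 * 3600

total_storage_pb = TOTAL_USERS * STORAGE_PER_USER_GB / 1_000_000
upload_gb_per_s = DAU * UPLOAD_PER_DAU_MB / SECONDS_PER_DAY / 1000
download_gb_per_s = upload_gb_per_s * DOWNLOAD_TO_UPLOAD
files_per_user = STORAGE_PER_USER_GB * 1000 // AVG_FILE_SIZE_MB
total_files = TOTAL_USERS * files_per_user

print(f"storage:        {total_storage_pb:.0f} PB")
print(f"upload:         {upload_gb_per_s:.1f} GB/s")
print(f"download:       {download_gb_per_s:.1f} GB/s")
print(f"files per user: {files_per_user}")
print(f"total files:    {total_files:,}")
```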
System Design
Starting about 7:25
External APIs
System design
Interviewer: If you upload a file from one device, how does this sync the file to a different device?
Interviewee: let’s discuss the components at high level first.
Interviewer: how to upload file?
Interviewee: let’s define the API first
API:
UploadFileRequest(user_token, file_path, file(data_stream), description, title, permission)
DownloadFileRequest(user_token, file_path) -> file(data_stream)
DeleteFile(user_token, file_path)
ListFile(user_token, folder_path, pagination, sort) -> list<file_meta_data>
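To make the four calls concrete, here is a minimal in-memory sketch of the API surface. The class name `InMemoryFileService`, the `FileMetadata` fields, and the pagination parameters (`page`, `page_size`, `sort_key`) are assumptions made for illustration; token validation and permission checks are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FileMetadata:
    file_path: str
    title: str
    description: str
    permission: str  # "public" or "private"
    size_bytes: int


class InMemoryFileService:
    """Toy stand-in for the real services; user_token validation is omitted."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._meta: Dict[str, FileMetadata] = {}

    def upload_file(self, user_token: str, file_path: str, data: bytes,
                    description: str = "", title: str = "",
                    permission: str = "private") -> FileMetadata:
        meta = FileMetadata(file_path, title, description, permission, len(data))
        self._data[file_path] = data
        self._meta[file_path] = meta
        return meta

    def download_file(self, user_token: str, file_path: str) -> bytes:
        return self._data[file_path]  # raises KeyError if the file is missing

    def delete_file(self, user_token: str, file_path: str) -> None:
        self._data.pop(file_path, None)
        self._meta.pop(file_path, None)

    def list_files(self, user_token: str, folder_path: str, page: int = 0,
                   page_size: int = 20,
                   sort_key: str = "file_path") -> List[FileMetadata]:
        # Keep only files under the folder, sort, then return one page.
        prefix = folder_path.rstrip("/") + "/"
        matches = [m for p, m in self._meta.items() if p.startswith(prefix)]
        matches.sort(key=lambda m: getattr(m, sort_key))
        return matches[page * page_size:(page + 1) * page_size]
```

In a real design each method would likely map to a separate service (upload, download, metadata), which is the question the interviewer raises next.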
Interviewer: do we need multiple services (permission service, upload service, etc.)?
Interviewee: yes. The sync functionality can also be achieved by these services.
Cloud file storage
FileMetadataTable: (nosql + transaction)
File_id, user_id,
permission[public/private],
Title, description, file_path,
list<chunk_address>, creation_time, last_updated_time
File system:
Store these chunks
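The split between the metadata table and the chunk store can be sketched as follows. The field names follow the table above; the `FileRecord` class, the `store_file` and `read_file` helpers, the address format, and the tiny chunk size are assumptions for illustration only.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileRecord:
    """One row of the metadata table; the file bytes live elsewhere."""
    file_id: str
    user_id: str
    file_path: str
    permission: str = "private"  # "public" or "private"
    title: str = ""
    description: str = ""
    chunk_addresses: List[str] = field(default_factory=list)
    creation_time: float = field(default_factory=time.time)
    last_updated_time: float = field(default_factory=time.time)


def store_file(user_id: str, file_path: str, data: bytes,
               chunk_store: Dict[str, bytes], chunk_size: int = 4) -> FileRecord:
    """Write chunks to the chunk store and record their addresses in order."""
    record = FileRecord(file_id=str(uuid.uuid4()), user_id=user_id,
                        file_path=file_path)
    for offset in range(0, len(data), chunk_size):
        address = f"{record.file_id}/{offset // chunk_size}"
        chunk_store[address] = data[offset:offset + chunk_size]
        record.chunk_addresses.append(address)
    return record


def read_file(record: FileRecord, chunk_store: Dict[str, bytes]) -> bytes:
    """Reassemble a file by following the chunk addresses in the metadata."""
    return b"".join(chunk_store[a] for a in record.chunk_addresses)
```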
Interviewer: do you need ACID?
Interviewee: yes. Strong consistency means a read always returns the latest write. So we may not need strong consistency
Interviewer: should you draw a new database component?
Interviewee: let’s finish this part of design first
To get all these chunks, there are two options:
Option 1: upload in chunks.
Pro: an interrupted upload can always resume from where it stopped.
Con: it needs client-side support (costs client-side CPU).
Option 2: upload the entire file, then split it into chunks on the server side.
Pro: easy to implement, supports the web client.
Con: if the upload fails, we have to start from the beginning.
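The resume benefit of uploading in chunks can be shown with a small simulation. `FlakyServer` is a hypothetical server that drops the connection once on a chosen chunk, and the 4-byte chunk size is only for demonstration; none of this comes from the interview itself.

```python
from typing import Dict, List

CHUNK_SIZE = 4  # bytes; unrealistically small, for demonstration only


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class FlakyServer:
    """Hypothetical server: keeps received chunks, fails once on one index."""

    def __init__(self, fail_on: int) -> None:
        self.received: Dict[int, bytes] = {}
        self._fail_on = fail_on
        self._failed = False

    def put_chunk(self, index: int, chunk: bytes) -> None:
        if index == self._fail_on and not self._failed:
            self._failed = True
            raise ConnectionError("simulated network drop")
        self.received[index] = chunk


def upload(data: bytes, server: FlakyServer) -> int:
    """Send only the chunks the server lacks; return how many were sent."""
    sent = 0
    for index, chunk in enumerate(split_into_chunks(data)):
        if index in server.received:
            continue  # already uploaded in an earlier attempt
        server.put_chunk(index, chunk)
        sent += 1
    return sent
```

Calling `upload` a second time after a `ConnectionError` skips the chunks that already arrived, whereas a whole-file upload would have to resend everything.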
Adding “File Processor” to generate metadata
Workflow:
User upload: call “uploadFileRequest”. There are 2 APIs: one for the device client, one for the web client
Syncing file to client:
A few possible ways:
WebSocket:
Pro: bidirectional, can upload/download
Con: expensive to keep connections open
Pull/long polling:
Pro: cheap, easy to implement
Con: for download, we need another request
Interviewee’s personal preference: WebSocket
Interviewer: How do you know which part of the file requires sync?
Interviewee: 2 clients can modify the same file
When handling the race condition, the resulting behavior may not be what users expect
Interviewer: User A modifies an existing file and uploads it in chunks
Once the upload is done,
the next step is to notify the other devices which file and which chunks changed
Interviewee: in the file processor, we do dedup and comparison to find out how many chunks were updated.
Add “fanout message queue”
Dedup:
Options
Based on the check sum (recommended)
Based on the entire data
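A checksum-based approach might look like the sketch below. SHA-256 is an assumed choice of checksum, and `DedupStore` and `changed_chunks` are illustrative names, not components from the interview. The store keeps one copy per distinct chunk, and the comparison reports which chunk positions need to be re-synced.

```python
import hashlib
from typing import Dict, List


def chunk_checksum(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


class DedupStore:
    """Content-addressed chunk store: identical chunks are stored once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, chunk: bytes) -> str:
        digest = chunk_checksum(chunk)
        self.blobs.setdefault(digest, chunk)  # no-op if already present
        return digest


def changed_chunks(old: List[bytes], new: List[bytes]) -> List[int]:
    """Indices in `new` that are added or whose checksum differs from `old`."""
    old_sums = [chunk_checksum(c) for c in old]
    return [
        i for i, chunk in enumerate(new)
        if i >= len(old_sums) or chunk_checksum(chunk) != old_sums[i]
    ]
```

Comparing checksums avoids shipping or reading the full chunk data for every comparison, which is the reason it is preferred over comparing the entire data.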
Interviewer and Audience Feedback
Audience:
Reference answer: https://www.youtube.com/watch?v=PE4gwstWhmc (a Dropbox senior engineer's design talk at Stanford University).
Interviewer
Need a working solution
5 minutes: requirement
25 minutes: system architecture design
5-10 minutes for normal case
If more time: 1-2 questions
Database, table design
Notification
Assumption: the interviewee can ask the interviewer questions
I didn’t quite follow the diagram, but didn’t want to interrupt
Interrupt or not
Metadata - on cloud storage
Need to separate storage and database
===
Soft skill feedback from audience
Some interviewers interrupt
Some interviewees don’t interrupt
What matters is the interviewee’s reaction: the interviewee should solicit feedback regularly
If we don’t interrupt, but the design is not what I expected, then it’s probably a no hire
Some interviewers don’t interrupt and just give no-hire feedback
The interviewee can ask for more feedback
Interviewer asked twice how to sync. Interviewee pushed back twice.
Interviewee can adapt to the pace of the interviewer
Interviewee: it feels like the question is too early to be asked
Interviewer: it feels the diagram cannot handle all use cases. I will probably draw high-level
It looks like notification portion is missing, so I provided hint for
Audience: interviewee can discuss the focus of the interview
The design doesn’t have an impressive aspect
Give interviewer some multiple choice
Audience: the interviewee wants to have complete coverage
Interviewer can suggest that this part is too much detail, e.g. upload, download is too
10 minutes spent on
Audience: the requirements were not complete. The interviewee can ask the interviewer what the key features are
Interviewer wanted to show the share feature
Interviewer wanted to provide hint for
Availability: 99.9%?
Audience: 2 nines or 3 nines, will that be enough?
There is no proof for the 99.9% figure.
Handle scalability
It depends on level. Availability
Interviewer: collect positive signal. (compared to other candidates)
High availability may be a possible point of discussion.
Audience: three nines or four nines is not provable
High availability
Low latency
Reliability
The numbers are not
Hard skill feedback
One topic per device?
10 topics for 10 devices?
If there is only 1 queue per user, then a message may be consumed by one device and never reach the others
Long-lived connection
SSE (Server-Sent Events) may be better
If there are too many users
Message queue 50,000 connections
Database is harder to scale (partition)
Redis for metadata cache
Audience
Metadata stores the chunk addresses
If the file storage server crashes, the system can end up in an inconsistent state
If it’s not a transaction, the server can crash after the chunks are uploaded to cloud storage but before the metadata is updated
1 file to multiple chunks
If there are 2 chunks, then are there 2 messages?
Answer: yes
https://whimsical.com/google-drive-BBsXFU8DQX7tp9CMyTXASs
Chunking is done on the client
Don’t use S3 chunking
Client already chunked
MVP: chunk is an optimization
If we don’t do chunk, then it will still
Client: there is a client service.
Directly talking to file storage service
Sync, notification, first talk to client service
If there is a web client, then where is the client service?
If it’s an app, then client service sits on device
If it’s a web client, where is the client service?
There is local storage for web browsers, but there is a limit
Many browsers don’t have local storage
Chunker: only an app can do chunking
If there is a web client, then we don’t need to do chunking
At the beginning we don’t need to discuss chunk
Multiple clients modify different parts of the file?
Can use a lock
For example, Google Sheets can be updated at different places
Try to lock a unit (such as a line in the file)
Google Sheets - last modification wins
Google Drive - does not solve this problem. The sync has a problem. It will just create different copies of the same file.
Workable solution
First upload: create the metadata, then upload, then confirm the upload has finished
Message queue: what’s the topic?
Producer, consumer. What’s the topic.
For example, with 10 devices there will be 10 topics, one for each device.
Why do we use device ID as topic?
If there is only one topic, then once one device consumes a message, it disappears for the others
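The per-device topic idea can be sketched with one in-memory queue per device. `FanoutQueue` and its methods are hypothetical names, and a production system would use a real message broker; the point is only that consuming from one device's queue leaves the other devices' copies untouched.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set


class FanoutQueue:
    """One queue (topic) per device; a change is copied to every other device."""

    def __init__(self) -> None:
        self._devices: Dict[str, Set[str]] = defaultdict(set)     # user -> devices
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)  # device -> messages

    def register(self, user_id: str, device_id: str) -> None:
        self._devices[user_id].add(device_id)

    def publish(self, user_id: str, source_device: str, message: str) -> None:
        # The device that made the change does not need its own notification.
        for device_id in self._devices[user_id]:
            if device_id != source_device:
                self._queues[device_id].append(message)

    def consume(self, device_id: str) -> List[str]:
        """Drain and return the pending messages for a single device."""
        pending = list(self._queues[device_id])
        self._queues[device_id].clear()
        return pending
```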
You can use one queue. The notification server tracks which clients are connected, pushes to the connected clients, and then consumes the message
Every time an offline client reconnects, it diffs its own metadata against the metadata in the cloud
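The reconnect diff could be as simple as comparing two maps of file path to checksum. The function name `diff_metadata`, the map layout, and the three result groups are assumptions for this sketch; deciding what to do with each group (download, upload, or resolve a conflict) is left to the caller.

```python
from typing import Dict, List


def diff_metadata(local: Dict[str, str],
                  cloud: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two {file_path: checksum} maps and group the differing paths."""
    return {
        "only_in_cloud": sorted(cloud.keys() - local.keys()),
        "only_on_device": sorted(local.keys() - cloud.keys()),
        "content_differs": sorted(
            path for path in local.keys() & cloud.keys()
            if local[path] != cloud[path]
        ),
    }
```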
There may be multiple metadata service.
One in China and one in US
We should have multiple copies of the metadata
S3 chunked upload
First it creates the metadata, including all chunks
Then the chunks can be uploaded
After all chunks are uploaded (for example, 100 chunks), mark the upload as finished
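The three steps (create metadata, upload chunks, mark finished) can be sketched as a small state machine. This is a hypothetical in-memory model, not the actual S3 API; `MetadataService`, `UploadSession`, and the status values are illustrative. A file whose upload is interrupted simply stays in the "pending" state instead of appearing complete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UploadSession:
    file_path: str
    expected_chunks: List[str]
    uploaded: Set[str] = field(default_factory=set)
    status: str = "pending"  # becomes "finished" once every chunk has arrived


class MetadataService:
    def __init__(self) -> None:
        self.sessions: Dict[str, UploadSession] = {}
        self.chunks: Dict[str, bytes] = {}

    def create(self, file_path: str, chunk_ids: List[str]) -> UploadSession:
        """Step 1: record the file and every chunk it will consist of."""
        session = UploadSession(file_path, list(chunk_ids))
        self.sessions[file_path] = session
        return session

    def put_chunk(self, file_path: str, chunk_id: str, data: bytes) -> None:
        """Step 2: store one chunk that was announced in step 1."""
        session = self.sessions[file_path]
        if chunk_id not in session.expected_chunks:
            raise ValueError(f"unexpected chunk {chunk_id!r}")
        self.chunks[chunk_id] = data
        session.uploaded.add(chunk_id)

    def complete(self, file_path: str) -> bool:
        """Step 3: mark the file finished only if all chunks are present."""
        session = self.sessions[file_path]
        if session.uploaded == set(session.expected_chunks):
            session.status = "finished"
        return session.status == "finished"
```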
Modified design after discussion