Design Uber · commitway

Topic: Design Uber

Interviewer: ken

Interviewee: 2fet - Jun Shao

Level: L5 (Senior)

Additional Resources:

Live Stream

System Design Interview


Supporting Documents:

Design Diagram: https://whimsical.com/mingdao-KfCG71EnhDk4jtFzerK8f4

Audience Survey: https://forms.gle/y1tH4RGKxJEGa8Tg7

System design requirements

Uber driver assignment

Reference: Design uber mock interview https://docs.google.com/document/d/1NFwNTupod7jF-nQ0VG0nvoKwtDxndxS4fNAClvjJg4U/edit

Key engineering stats

https://investor.uber.com/financials/default.aspx

https://s23.q4cdn.com/407969754/files/doc_presentations/2022/Uber-Investor-Day-2022.pdf

$25.9B gross bookings

1.8B trips

5 billion trips in 2020 https://www.businessofapps.com/data/uber-statistics/

100M users in a quarter https://www.businessofapps.com/data/uber-statistics/

4M drivers https://therideshareguy.com/how-many-uber-drivers-are-there/

Functional requirements

Customer should be able to request a ride

Customer should see all drivers nearby

Driver assignment

Driver should send their location to the server

After the driver accepts a ride, the driver and rider can see each other

Functional requirements

The driver picks up the rider at the rider's address

The rider places an order to a destination

A driver can pick up an order

Non-functional requirements

PA/EL model (PACELC): under a partition, favor availability; else, favor latency (quick request/response)

Availability:

Scalable: big events, rush hours

Latency

Security

Resilience: stays stable across different traffic volumes

[39:20]

[my requirements]

Scaling requirements:

Driver assignment

5 billion trips in 2020; each trip requires a driver assignment

5 billion / 365 days / 100k seconds per day (86,400 rounded up) ≈ 137, rounded up to 150 assignments/second

Traffic likely spikes during rush hours and holidays; assume a 10x peak of 1,500 assignments/second

Driver location tracking

4M drivers, each sending a location update every 30 seconds

4M location updates / 30 seconds = 130k location updates per second
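
The back-of-envelope numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the 10x peak factor is inferred from the jump from 150 to 1,500 in the notes, and the variable names are illustrative.

```python
SECONDS_PER_DAY = 100_000  # 86,400 rounded up, as in the notes

trips_per_year = 5_000_000_000
avg_assignments_per_sec = trips_per_year / 365 / SECONDS_PER_DAY
peak_assignments_per_sec = avg_assignments_per_sec * 10  # assumed 10x peak factor

drivers = 4_000_000
update_interval_sec = 30
location_updates_per_sec = drivers / update_interval_sec

print(round(avg_assignments_per_sec))   # about 137; the notes round this to 150
print(round(peak_assignments_per_sec))  # roughly 1.4k; the notes use 1,500
print(round(location_updates_per_sec))  # about 133k; the notes round this to 130k
```

Location updates outnumber assignments by roughly three orders of magnitude, which is why driver location tracking dominates the write path.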

customer

300M users

93M monthly active users

[Interviewee should clarify the scaling requirements from interviewer]

API design

For rider:

CreateOrder(userToken, fromLocation, toLocation, type) -> orderId

For driver:

PickUpOrder(userToken, orderId)

GetOrderList(userToken, driverLocation … fromOrderId) -> OrderList

Streaming function: the server holds the connection open and sends order events as a stream

DriverLocation is a stream as well

fromOrderId: only orders after this ID are sent, so the driver does not pick up old orders
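
A minimal in-memory sketch of the three calls above, assuming monotonically increasing order IDs so that fromOrderId can act as a cursor. The class name, field names, and the use of a generator in place of a long-lived stream are all illustrative assumptions; filtering by driver location is left out.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Iterator, Optional


@dataclass
class Order:
    order_id: int
    rider_token: str
    from_location: str
    to_location: str
    ride_type: str
    driver_token: Optional[str] = None


class OrderService:
    """In-memory stand-in for the rider and driver APIs (hypothetical)."""

    def __init__(self) -> None:
        self._ids = count(1)  # assumption: order IDs only ever increase
        self._orders: Dict[int, Order] = {}

    def create_order(self, user_token: str, from_location: str,
                     to_location: str, ride_type: str) -> int:
        order_id = next(self._ids)
        self._orders[order_id] = Order(
            order_id, user_token, from_location, to_location, ride_type)
        return order_id

    def pick_up_order(self, user_token: str, order_id: int) -> bool:
        order = self._orders.get(order_id)
        if order is None or order.driver_token is not None:
            return False  # unknown order, or already taken by another driver
        order.driver_token = user_token
        return True

    def get_order_list(self, user_token: str, from_order_id: int) -> Iterator[Order]:
        """Yield open orders whose ID is greater than from_order_id."""
        for order_id in sorted(self._orders):
            order = self._orders[order_id]
            if order_id > from_order_id and order.driver_token is None:
                yield order
```

In a real deployment the generator would be replaced by a held connection (for example a WebSocket) that pushes new order events as they arrive.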

[31:33]

Schema

UserTable (riders and drivers): 100M riders (userId, email, address, name, phone number)

Each user record takes about 1 KB.

[Where did the 100M come from?]

100M users × 1 KB = 100 GB; 10 million active users

Drivers: 500K

100M riders + 500K drivers

≈ 100 GB total

Car information (userId, carId, type, desc, maker)

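Order: 10M orders per day (10% of users active, 1 order each per day), about 3.6 billion orders per year

Order (orderId, riderId, driverId, from, to, timestamp, status)

200 bytes for each order

3.6 billion orders × 200 bytes ≈ 720 GB, roughly 1 TB per year; 5 years = 5 TB

10M orders per day / (2 rush hours × 3,600 seconds) ≈ 1.4k, rounded to 1k write QPS; × 5 for order modifications = 5k write QPS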

NoSQL: chosen for higher write throughput

Create order

Modify order: about 5x the create-order volume

Also need to change order status
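
The order storage and write QPS estimates can be checked with the sketch below. It assumes the worst-case reading of the notes, in which the whole day's orders land within the two rush hours, and that each order is modified about 5 times; both are interpretations of the notes rather than measured figures.

```python
orders_per_day = 10_000_000  # 10% of 100M users, 1 order each per day
bytes_per_order = 200

orders_per_year = orders_per_day * 365  # 3.65 billion
storage_per_year_gb = orders_per_year * bytes_per_order / 1e9
storage_5_years_tb = storage_per_year_gb * 5 / 1000

rush_seconds = 2 * 3600  # assumption: all orders arrive within 2 rush hours
create_qps = orders_per_day / rush_seconds
write_qps = create_qps * 5  # assumption: about 5 modifications per created order

print(storage_per_year_gb)  # 730.0 GB, rounded up to 1 TB per year in the notes
print(round(create_qps))    # about 1.4k; the notes round this down to 1k
print(round(write_qps))     # about 6.9k; the notes arrive at 5k from the rounded 1k
```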

[Some concern that the interviewee did not ask for the scaling requirements from interviewer]

[QPS calculation: we may partition by geography to reduce the QPS each partition handles]

[21:10]

High level design

[We likely need a queue consumer]

Add location service

[why is location service connected from LB? ]

[13:04]

[what does location service do?]

Use Google's S2 geometry library to assign a geo ID

[the system interaction flow is not very clear]

[08:08]

Q: why are there 2 arrows LB -> location service?

[location service contains driver location or orders?]

[a bit confused about the WebSocket servers]

Drivers need to find the websocket server based on their locations

[does the driver switch to a different WebSocket server if they drive to a different location?]

Each geoID covers about 1 square kilometer

Need quad tree to find nearby geoID
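
To make the geoID idea concrete, here is a simplified stand-in that uses a fixed latitude/longitude grid instead of S2 cells. It is not the S2 API: the cell size, function names, and the 3x3 neighbor lookup are assumptions chosen to illustrate indexing drivers by cell and searching adjacent cells.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
# About 1.1 km of latitude; cells get narrower in longitude away from the equator.
CELL_DEG = 0.01


def geo_id(lat: float, lon: float) -> Cell:
    """Map a coordinate to a hypothetical grid cell ID."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def nearby_geo_ids(cell: Cell) -> List[Cell]:
    """Return the cell itself plus its 8 neighbors."""
    row, col = cell
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


drivers_by_cell: Dict[Cell, Set[str]] = defaultdict(set)
cell_of_driver: Dict[str, Cell] = {}


def update_driver(driver_id: str, lat: float, lon: float) -> None:
    """Record a location update, moving the driver between cells if needed."""
    new_cell = geo_id(lat, lon)
    old_cell = cell_of_driver.get(driver_id)
    if old_cell is not None and old_cell != new_cell:
        drivers_by_cell[old_cell].discard(driver_id)
    drivers_by_cell[new_cell].add(driver_id)
    cell_of_driver[driver_id] = new_cell


def nearby_drivers(lat: float, lon: float) -> Set[str]:
    """Collect drivers in the rider's cell and the 8 surrounding cells."""
    found: Set[str] = set()
    for cell in nearby_geo_ids(geo_id(lat, lon)):
        found |= drivers_by_cell.get(cell, set())
    return found
```

A fixed grid keeps neighbor lookup trivial, while S2 or a quadtree adapts better to uneven driver density.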

Q: how to resolve conflicts between drivers

Optimistic locking

First come, first served

Each update to the order increments its version number

Use the version number to reject stale updates
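
A minimal sketch of the version-number check, assuming an in-memory order table. The class and method names are hypothetical; in a real database the compare-and-set would be a single atomic conditional update rather than two Python statements.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class OrderRow:
    order_id: int
    version: int = 0
    driver_id: Optional[str] = None


class OrderStore:
    """In-memory stand-in for the order table (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[int, OrderRow] = {}

    def create(self, order_id: int) -> None:
        self._rows[order_id] = OrderRow(order_id)

    def read(self, order_id: int) -> OrderRow:
        return replace(self._rows[order_id])  # return a snapshot copy

    def assign_driver(self, order_id: int, driver_id: str,
                      expected_version: int) -> bool:
        """Succeed only if nobody has updated the row since it was read."""
        row = self._rows[order_id]
        if row.version != expected_version:
            return False  # stale read: another driver got there first
        row.driver_id = driver_id
        row.version += 1
        return True
```

Two drivers who both read version 0 will race; the first write bumps the version to 1, so the second write fails its check and that driver has to pick another order.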

Q: what’s in the quad tree index

Location ID
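
As a rough illustration of a quadtree index that stores location IDs, the sketch below implements a point quadtree with a rectangular range query. The capacity, depth limit, and half-open bounds are assumptions made for the example, not details from the interview.

```python
from typing import List, Tuple

MAX_DEPTH = 16  # stop splitting here so duplicate points cannot recurse forever


class QuadTree:
    def __init__(self, x0: float, y0: float, x1: float, y1: float,
                 capacity: int = 4, depth: int = 0) -> None:
        self.bounds = (x0, y0, x1, y1)  # half-open: [x0, x1) by [y0, y1)
        self.capacity = capacity
        self.depth = depth
        self.points: List[Tuple[float, float, str]] = []  # (x, y, location_id)
        self.children: List["QuadTree"] = []

    def _contains(self, x: float, y: float) -> bool:
        x0, y0, x1, y1 = self.bounds
        return x0 <= x < x1 and y0 <= y < y1

    def insert(self, x: float, y: float, location_id: str) -> bool:
        if not self._contains(x, y):
            return False
        if not self.children:
            if len(self.points) < self.capacity or self.depth >= MAX_DEPTH:
                self.points.append((x, y, location_id))
                return True
            self._split()
        return any(c.insert(x, y, location_id) for c in self.children)

    def _split(self) -> None:
        """Create four child quadrants and push existing points down."""
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        d = self.depth + 1
        self.children = [
            QuadTree(x0, y0, mx, my, self.capacity, d),
            QuadTree(mx, y0, x1, my, self.capacity, d),
            QuadTree(x0, my, mx, y1, self.capacity, d),
            QuadTree(mx, my, x1, y1, self.capacity, d),
        ]
        for px, py, pid in self.points:
            any(c.insert(px, py, pid) for c in self.children)
        self.points = []

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> List[str]:
        """Return location IDs inside the half-open query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 <= x0 or qx0 >= x1 or qy1 <= y0 or qy0 >= y1:
            return []  # query rectangle does not overlap this node
        found = [pid for px, py, pid in self.points
                 if qx0 <= px < qx1 and qy0 <= py < qy1]
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found
```

Dense areas split into smaller nodes while sparse areas stay coarse, so a nearby search touches only the nodes that overlap the query rectangle.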

===

Missing capacity

User places order

Clearer picture

Geo ID to distribute orders

Network: bidirectional

===

Driver throughput can be very large

Lots of drivers, updates every few seconds

Redis pubsub

===

Key problems to address (key interview points):

driver location tracking

driver assignment

state transition for trip

high availability

high throughput

Bonus:

Geography-specific data structures, e.g. quadtree, geohash

Soft Skills:

gathering requirements

making decisions and justifying tradeoffs

describing the solution using clear presentation, concise language and accurate technical terms

Hard Skills:

design quality: scalability, reliability, efficiency, etc. (L4, L5)

basic facts about existing software solutions and hardware capabilities (L4 - partly, L5)

project/product lifecycle awareness, e.g. how a project is developed and maintained (L5)

====

Discussion after Interview
