Distributed Resilient Storage in the Real World
Presenter: Kai
Topic: Distributed resilient data storage
System design: IP blocker
Real world design: highly distributed authentication service
Container isolation: gVisor. Distributed systems built on containers are slow
===
System design: IP Blocker
Senior/staff level
Country X passes a law that forbids certain IPs from accessing Google’s services
Country X provides a 3rd-party service that handles 20k QPS
The service itself handles 2M connections per second
Warm start with precomputed results
===
Missing point:
3rd party: people ask only about throughput, not about latency or SLA. It is often unavailable.
IPv4 has about 4B addresses, so we can pre-build a cache of answers. At 20k QPS it takes less than 3 days to fill the cache
For IPv6: can use the previous solution
What if we have multiple data centers and still one external service?
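The "less than 3 days" figure can be checked from the numbers in the notes. The sketch below assumes the cache is filled by querying the 3rd party at its full 20k QPS, and (an assumption not stated in the talk) that each answer is stored as one bit per address.

```python
IPV4_ADDRESSES = 2 ** 32        # about 4.29 billion addresses
THIRD_PARTY_QPS = 20_000        # throughput of the external service (from the notes)

# Time to ask the 3rd party about every IPv4 address once.
fill_seconds = IPV4_ADDRESSES / THIRD_PARTY_QPS
fill_days = fill_seconds / 86_400

# Assumption: one bit per address (blocked / not blocked).
bitmap_bytes = IPV4_ADDRESSES // 8
bitmap_mib = bitmap_bytes / 2 ** 20

print(f"fill time: {fill_days:.2f} days")    # fill time: 2.49 days
print(f"bitmap size: {bitmap_mib:.0f} MiB")  # bitmap size: 512 MiB
```

Under these assumptions the full IPv4 answer set fits in memory on a single machine, which is what makes the warm start with precomputed results practical.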
===
Control plane - configuration
Data plane
Key: cloud resource manager / cloud resource frontend
Similar to Airflow or Uber Cadence
User -> cloud resource manager: User requests a service to be allocated
Services in the private network don’t require authentication
Latency for replication: 5 minutes
Key problems in design:
Data plane authentication: 70k-80k requests per second. How do we reduce latency?
How do we detect data tampering?
How do we improve service resilience?
Key problems yet to be solved:
Big customers starving other customers
Geographic boundaries
Data plane authentication:
High throughput.
Low latency: needs to be < 5 ms
Solution: colocate auth service with (real) data service
Actual issue: an authentication request may need multiple backends
What happens if a permission is revoked?
User -> cloud service manager -> storage -> data change field -> data sync (may merge with data from other services) -> push to cache
Q: when there is auth data change, does the change propagate to all cache instances?
A: first filter by project, then push the snapshot
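A minimal sketch of the "filter by project, then push the snapshot" answer, assuming each cache instance is colocated with a data service and knows which projects it serves. The class and function names (`AuthCache`, `push_snapshot`) and the data layout are hypothetical, not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class AuthCache:
    """In-memory auth cache colocated with one data service instance."""
    projects: set                                # projects this instance serves
    version: int = 0
    grants: dict = field(default_factory=dict)   # (project, user) -> set of permissions

    def apply_snapshot(self, project, version, project_grants):
        # Replace everything known about the project in one step, so a
        # revoked permission disappears when the new snapshot arrives.
        for key in [k for k in self.grants if k[0] == project]:
            del self.grants[key]
        for user, perms in project_grants.items():
            self.grants[(project, user)] = set(perms)
        self.version = max(self.version, version)

    def is_allowed(self, project, user, permission):
        # Local lookup only: no network hop on the request path.
        return permission in self.grants.get((project, user), set())


def push_snapshot(caches, project, version, project_grants):
    """Push a project's snapshot only to the caches that serve that project."""
    targets = [c for c in caches if project in c.projects]
    for cache in targets:
        cache.apply_snapshot(project, version, project_grants)
    return len(targets)


cache_a = AuthCache(projects={"p1"})
cache_b = AuthCache(projects={"p2"})
pushed = push_snapshot([cache_a, cache_b], "p1", 1, {"alice": ["read"]})
print(pushed, cache_a.is_allowed("p1", "alice", "read"))   # 1 True

# Revocation: the next snapshot no longer contains the grant.
push_snapshot([cache_a, cache_b], "p1", 2, {})
print(cache_a.is_allowed("p1", "alice", "read"))           # False
```

In this sketch a revocation only takes effect once the next snapshot reaches the cache, so the propagation delay bounds how long a revoked permission stays usable.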
Data signing
Hacker may attack the storage
Authentication data may cause a lot of damage
Sign with private key
Verify with public key
Ensure all updates happen on secure machines - protects upstream
Embed the signature at write time
Public API exposes key metadata so clients can verify
Does not protect against upstream modifications
Protects against subsequent modifications
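The sign-on-write, verify-on-read flow can be sketched with the standard library alone. This is a toy: it uses textbook RSA over a SHA-256 digest with a small key built from two Mersenne primes, which is NOT secure. A real deployment would use a vetted signature library and managed keys. The record layout and function names are hypothetical.

```python
import hashlib
import json

# Toy key pair built from two Mersenne primes. Illustration only: NOT secure.
P = 2 ** 107 - 1
Q = 2 ** 127 - 1
N = P * Q                                   # public modulus
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent (Python 3.8+)


def digest(record: dict) -> int:
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def sign_on_write(record: dict) -> dict:
    """Runs on the secure writer: embed the signature next to the data."""
    return {"data": record, "sig": pow(digest(record), D, N)}


def verify_on_read(stored: dict) -> bool:
    """Runs on any client: needs only the public values (N, E)."""
    return pow(stored["sig"], E, N) == digest(stored["data"])


stored = sign_on_write({"project": "p1", "user": "alice", "role": "reader"})
print(verify_on_read(stored))      # True

stored["data"]["role"] = "owner"   # tampering in storage after the write
print(verify_on_read(stored))      # False
```

The signature is computed from whatever the writer was given, so it catches changes made after the write but not bad data that arrived from upstream, matching the two points above.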
Distributed data store: 5 levels of consistency
Above eventual consistency, below session consistency
Distributed resilient storage: multi-tenant
GDPR
Shuffle sharding.
Big or malicious tenant may bring down service of other users
Cell-based architecture / shuffle sharding
Cell-based architecture - 5 users share the same instance.
Cache instance: 2 vCPUs, 0.5 GB, underutilized
Increasing the number of users per instance decreases reliability
Container-based isolation: hard to make pay-as-you-go
Notebook service. To ensure isolation between enterprises, we used containers
5-6 seconds to open Jupyter for most other vendors
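A minimal sketch of shuffle sharding, assuming each tenant is mapped to a small, deterministic subset of cache nodes derived from a hash of its ID. The node names, the pool of 8 nodes, and the shard size of 2 are made-up values for illustration.

```python
import hashlib
import random
from math import comb


def shard_for(tenant_id: str, nodes: list, shard_size: int) -> list:
    """Deterministically pick `shard_size` nodes for a tenant."""
    seed = int.from_bytes(hashlib.sha256(tenant_id.encode()).digest(), "big")
    return sorted(random.Random(seed).sample(nodes, shard_size))


def shared_nodes(shard_a: list, shard_b: list) -> set:
    """Nodes that two tenants have in common."""
    return set(shard_a) & set(shard_b)


nodes = [f"cache-{i}" for i in range(8)]
shards = {t: shard_for(t, nodes, 2) for t in ("tenant-a", "tenant-b", "tenant-c")}
for tenant, shard in shards.items():
    print(tenant, shard)

# Number of distinct 2-node shards that can be drawn from 8 nodes.
print(comb(len(nodes), 2))   # 28
```

A noisy or malicious tenant can only exhaust the nodes in its own shard. Another tenant loses service completely only if it was assigned exactly the same subset, and with 28 possible shards here most tenants share at most one node with the offender.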
2-3 minutes for us
Can we use a Kubernetes fleet to support better isolation and faster resource isolation?
gVisor for isolation
Kata Containers
Aliyun: run lambda on a Kubernetes fleet
If a client spans multiple machines, use runc
For small clients: use Kata. Kata is slower
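The runtime choice above could be expressed through the Kubernetes `runtimeClassName` field of a Pod spec. A minimal sketch, assuming the cluster defines RuntimeClass objects named `runc` and `kata` (those names, the pod name, and the image are hypothetical):

```python
def runtime_class_for(spans_multiple_machines: bool) -> str:
    # Large clients that already own whole machines get plain runc;
    # small clients sharing a machine get the stronger (but slower) Kata isolation.
    return "runc" if spans_multiple_machines else "kata"


def pod_manifest(name: str, image: str, spans_multiple_machines: bool) -> dict:
    """Build a Pod manifest as a plain dict, ready to be dumped to YAML or JSON."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumption: RuntimeClass objects with these names exist in the cluster.
            "runtimeClassName": runtime_class_for(spans_multiple_machines),
            "containers": [{"name": "main", "image": image}],
        },
    }


small = pod_manifest("notebook-small", "example/notebook:latest", False)
large = pod_manifest("notebook-large", "example/notebook:latest", True)
print(small["spec"]["runtimeClassName"])   # kata
print(large["spec"]["runtimeClassName"])   # runc
```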