In Memory Key-Value Store

Topic: In Memory Key-Value Store

Interviewer: guanyao

Interviewee: 🌈Shijie🆒🤠✨

Level: L5 (Senior)

Additional Resources:

Slab memory management

Topic

Mock System Design Interview Summary

Interview Overview

Date: 3/20/2022

Target level: L5

Duration: 45 minutes

Topic covered: Key value pair store (large key value)

Drawing tool used: diagrams.net

Key problem vote: Key value store

Requirements

Functional requirements

Store key value pair

Has expiry without memory crash

Key value size: MB in size

Non functional requirements

Scaling: First start with one server

High available

Low latency

High throughput

Consistency (optional)

Security (optional)

System Design

External APIs

System design

Key value store

Small segment per storage

What does hash table look like?

String -> hash function -> hash code

Address -> sequence of storage chunk/block binary digits

key is large

Hash table -> hash function

Pseudo code:

Allocate memory

Index to map hash code to storage address

Q: How do you store key, value in memory

Given one request, do you always allocate memory?

Q: Create a list in memory. Use hashcode % list size

Q: what happens for hash collision

A: change the hash function. Or compare the key

Q: do we increase the size of the array?

A: usually change to a hash function, to reduce probability of collision

Q: let’s design a solution to scale up the design for single node

API

get(key)

put(key, value, TTL)

App server is separate from cache node

Q: If one cache node cannot hold all data, how do we handle distributed cache node?

A: We can partition the data

We can compute the node location based on the hash function of the key

Q: How to find the right nodes?

A: We can use consistent hashing to locate the node on the hash ring

Q: How do we handle hotspots? If some key is very popular, how do we handle it?

consistent hashing: vNodes to even distribution

Split the key with the node that is adjacent. Split to multiple ring node.

Read heavy: add more replicas, to split the read load

What kind of replica?

Build a small master-follower cluster for a node.

Q: How do we guarantee the consistency between master and follower?

A: apply 2 phase commit on all followers

Q: for in memory, consistency is not as important. For database consistency is more important

Q: continue

Configuration server (zookeeper)

Q: How does configuration monitor the health of the nodes?

A: can use heartbeat and timeout to check

Q: how do we implement TTL?

A: we can run a process in the background to check TTL

Q: on each node, you can have some job to check all TTL for key-value. That’s fine.

Q: how each hashtable works? How we organize the memory of one node.

A: we can cut the key into different segments. We can get hashcode of each segment

We can use Merkle Tree on segments of large key -> root hash code -> actual key -> value

Q: what is a merkle tree?

A: cut key into segments and build hash key

Interviewer and Audience Feedback

Interviewer:

No hire

Like to see internals of hashtable

Memcashd or redis: usually we ask about low level

Emphasis is not in consistent hashing

Next I hope to get some strength of the candidate

Big key-value may have confused the interviewee

My intention was to use multi-threading to handle

However, we were not able to get the in-memory data structure

Memory management: we didn’t discuss about range query

Emphasis is to know basic knowledge.

Interviewee:

I don’t have good knowledge for lower layer

I was trying to approach from higher

Interviewer:

Pre-allocate memory in cache node

Can reference memcached and redis

For multi-node consistent

In configuration, preallocate memory

===

Soft skill

===

Hard skill

Audience: hash mapping question

Try to ask interviewer, what are the key points

Hire:

Interviewee should cover:

Hash table basic structure. How to resolve collisions.

Write pseudo code of store(key,value).

Expect to know slab memory management, why and how?

How to auto-increase the memory?

When you read, how do you make sure data is not modified by the writer at the same time?

Bonus: lock based on slot.

Bonus: Knows difference between spinlock, mutex, read/write lock. Suppose there are enough CPUs and QPS is not high, what else can I use?

How to shard keys in distributed hash, how to shard servers. Expect to know how consistent hash works.

How does client know which server to talk to?

https://redis.io/topics/replication#:~:text=Redis%20uses%20asynchronous%20replication%2C%20with,the%20amount%20of%20data%20processed.

Strong hire:

How to implement expiration?

https://stackoverflow.com/questions/36172745/how-does-redis-expire-keys

Lazy, random, or, more active.

Implement R/W lock.

===

Preallocate memory

If you can new and malloc, you have gotten memory

Preallocate: allocate and connected. Guaranteed in physical memory

Slab memory management.

Memory pool. 2M x 100. Linked by linkedlist

Go to the pool to get memory. Smaller print

Auto-resize. Key value store. Collision. How to resize

Larger size:

Single thread is not performant enough

Need multi-thread with locking

Different types of lock

Distributed hashing

Preallocation is a basic step

Java hashmap

Linked list

Java hashmap

Pretty low level

Audience

Algorithm design or system design?

System design

Some companies cover lower level design

Audience:

Functional requirement, Non functional requirement, are fairly high level

Did we clarify the requirements

Interviewer: may have set the right expectation at the beginning

Audience:

Why larger key value, we need multi-thread?

A: multithread is always better

Q: Is the value structure complex?

A: not complex

Audience:

Redis: Multi-thread is put at IO

Lookup is still single threaded

Reactor method. Handling request using the multi-thread

Key-value is large.

Audience:

You may be able have handle without lock by sharding by key

Sharding by key may create hotspot problem

Can we separate read and write?

Read and write lock

How to increase size of hash table? Do we lock the entire table?

We can partition into slots. During expansion we can just lock a portion of the array

Read-write lock: will it block writes?

Usually write lock has higher priority

What is expansion of the memory?

Expanding buckets

Do we need to shuffle content of the bucket?

Yes

During this time can we still read/write?

These are blocked

Redis internal

There are 2 hash tables. Old and new

It gradually moves from old to new

It’s spread over time

Why not start with a very large bucket?

Large key-value? Should we save on disk?

4G: 4k entries for 1M key

Increasing capacity: read-ahead lock

Are there any questions related to replica?

No we don’t cover this part

How to handle consistency?

In non-functional requirement: it says it’s not covered

Missing some collection of durability

Redis: try to improve throughput with follower

Cluster: sharding

Data replica

Master slave: how does it sync? Copy.

RDB: send snapshot, + backlog

Sentinel of Redis: can be different nodes or same node

Can there be auto election?

You can use zookeeper

Range query: hard to handle?

Redis supports sorted set, we can query by range

Normally there is no consistency need

Redis: usually asynchronous

Normally 2 types of data

Rarely changed data, or frequently changed data but not accurate

If we don’t care about accuracy, then we don’t need locking

If we care about accuracy, then we need lock

Cache: normally bottleneck is at IO

Redis: threading for IO

Redis: key, value size has some requirement to be small

For very large key-value