Lecture 9-11: Key-Val Stores

Relational databases (RDBMS): schema-based structured tables
Ex: MySQL, SQL Server, SQLite, Oracle
Restrictions
- Each row has a primary key that is unique
- Needs foreign key to join on another table
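A minimal sketch of the two restrictions above, using Python's built-in sqlite3 module. The table and column names (users, posts, user_id) are made up for illustration and do not come from the lecture.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,   -- unique per row
        name    TEXT NOT NULL
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(user_id),  -- foreign key
        title   TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO posts VALUES (10, 1, 'hello')")


def duplicate_key_rejected() -> bool:
    """Return True if inserting a second row with primary key 1 is refused."""
    try:
        conn.execute("INSERT INTO users VALUES (1, 'bob')")
    except sqlite3.IntegrityError:
        return True
    return False


# The join works by matching the foreign key in posts to the primary key in users.
joined = conn.execute(
    "SELECT users.name, posts.title "
    "FROM posts JOIN users ON posts.user_id = users.user_id"
).fetchall()
```

The duplicate insert is rejected because of the primary-key constraint, and the join is only possible because posts carries a key that points into users.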
Problems: relational databases are not a good fit for today’s workloads, which have
- Large amounts of unstructured data
- Lots of random reads and writes
- Joins that are much less frequent than other operations
- A need to scale out, which relational databases were not built for
  - Scale up: grow capacity by replacing machines with more powerful ones → not practical because it hits hardware limitations
  - Scale out: grow capacity by adding more machines to the cluster → needs a distributed key-value store!

Key-value stores often use column-oriented storage instead of row-oriented
- Makes indexing and searching entries within a column faster
- Makes range queries on a column much faster (avoids fetching the irrelevant data in the rest of each “row”)
- Maintains pointers across column entries to reconstruct each “row”
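The difference between the two layouts can be sketched in a few lines of Python. This is a toy in-memory model, not how any particular store lays out data on disk: the sample records are invented, and the list position stands in for the "pointer" that ties column entries back into a row.

```python
# Row-oriented: each record is stored together.
rows = [
    {"id": 1, "name": "alice", "age": 31},
    {"id": 2, "name": "bob", "age": 19},
    {"id": 3, "name": "carol", "age": 45},
    {"id": 4, "name": "dave", "age": 27},
]

# Column-oriented: each column is stored together.
# Position i in every list belongs to the same logical "row".
columns = {
    "id": [1, 2, 3, 4],
    "name": ["alice", "bob", "carol", "dave"],
    "age": [31, 19, 45, 27],
}


def range_query_rows(rows, field, lo, hi):
    """Row layout: every full record is read even though one field matters."""
    return [i for i, record in enumerate(rows) if lo <= record[field] <= hi]


def range_query_columns(columns, field, lo, hi):
    """Column layout: only the queried column is scanned."""
    return [i for i, value in enumerate(columns[field]) if lo <= value <= hi]


def materialize(columns, position):
    """Follow the shared position across columns to rebuild one "row"."""
    return {name: values[position] for name, values in columns.items()}


matches = range_query_columns(columns, "age", 25, 35)
```

Both queries return the same positions; the column version simply never touches the id or name data until a full row is requested via materialize.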

Distributed key-value store: a partitioner uses consistent hashing to map each key → server
- Unlike Chord, however, each server keeps full membership information (no finger tables), because servers in a data center have enough memory to store the full list → any server can route a request to the right server in O(1) network hops
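A rough sketch of a consistent-hashing partitioner with full membership, in Python. The lecture does not specify a hash function or ring size, so SHA-1 truncated to 32 bits is an assumption here, as are the names Ring, ring_position, and server_for. Keys are assigned to the first server at or after their ring position, wrapping around at the end.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a server name or key onto a 32-bit ring (assumed hash: SHA-1)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Full membership list, as every server in the data center would hold it."""

    def __init__(self, servers):
        # Sort servers by their position on the ring.
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [position for position, _ in self._points]

    def server_for(self, key: str) -> str:
        """Local lookup: first server clockwise from the key's position."""
        index = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first server if the key hashes past the last one.
        return self._points[index % len(self._points)][1]


ring = Ring(["server-a", "server-b", "server-c"])
owner = ring.server_for("user:42")
```

Because the whole ring is known locally, finding the owner is a local binary search followed by one direct message, rather than the O(log N) hops a Chord finger-table lookup would take. Adding a server only moves the keys that fall between the new server and its predecessor on the ring.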