📂 RDBMS
- Schema-based structured tables
- Ex: MySQL, SQL Server, SQLite, Oracle
- Restrictions
- Each row has a primary key that is unique
- Needs foreign key to join on another table
- Problems: just not cut out for today’s workloads, which have
- Large amounts of unstructured data
- Lots of random reads and writes
- Joins are not as frequent as other operations
- Not built to scale out
- Scale up: grow capacity by replacing each machine with a more powerful one → not practical because it hits hardware limits
- Scale out: grow capacity by adding more machines to the cluster → needs a distributed key-value store!
📂 NoSQL
- Unstructured, no strict schema (some rows will be missing some columns)
- Joins not always supported
- Often use column-oriented storage instead of row-oriented
- Helps index entries in column quicker
- Helps perform range queries much quicker (avoids fetching other irrelevant data in “row”)
- Keeps pointers across column entries so the “row” can still be reassembled
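To make the column-oriented point concrete, here is a minimal in-memory sketch. The table, the column names, and the helper functions are all hypothetical, and a shared list index stands in for the "pointers" that tie a row together; real stores use on-disk formats that are far more involved.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "name": "ana", "age": 31},
    {"id": 2, "name": "bo", "age": 17},
    {"id": 3, "name": "cy", "age": 45},
]

# Column-oriented layout: each column is stored contiguously.
# Position i across all columns makes up "row" i.
columns = {
    "id": [1, 2, 3],
    "name": ["ana", "bo", "cy"],
    "age": [31, 17, 45],
}


def range_query_rows(rows, col, lo, hi):
    # Touches every full record even though only one field is needed.
    return [r["id"] for r in rows if lo <= r[col] <= hi]


def range_query_columns(columns, col, lo, hi):
    # Scans only the queried column, then follows the shared index to the id.
    hits = [i for i, v in enumerate(columns[col]) if lo <= v <= hi]
    return [columns["id"][i] for i in hits]


def fetch_row(columns, i):
    # Reassembles "row" i by reading position i from every column.
    return {name: values[i] for name, values in columns.items()}
```

Both range queries return the same ids; the difference is that the column version never reads `name`, which is the "avoids fetching irrelevant data" benefit noted above.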
🔥 Cassandra
- Distributed key-value store that uses consistent hashing for mapping key → server (partitioner)
- Unlike Chord, however, each server keeps full membership information (no finger tables) because a data center has enough space for this → gives us O(1) network hops
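A rough sketch of the key → server mapping, assuming an MD5-based hash onto a 32-bit ring and one ring position per server. The `Ring` class, `ring_position`, and the server addresses are made up for illustration; this is not Cassandra's actual partitioner. Because the full membership list is held locally, finding the owner is a local binary search followed by one direct hop.

```python
import bisect
import hashlib


def ring_position(key: str, bits: int = 32) -> int:
    # Hash a string to a point on a ring of size 2**bits (assumed hash: MD5).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    def __init__(self, servers):
        # Full membership: every server's ring position is known locally.
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def owner(self, key: str) -> str:
        # The owner is the first server at or clockwise after the key's position.
        i = bisect.bisect_left(self._positions, ring_position(key))
        if i == len(self._positions):
            i = 0  # wrap around the ring
        return self._points[i][1]
```

Usage: `Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3"]).owner("user:42")` returns one of the three addresses, and the same key always maps to the same server. Adding a fourth server only moves the keys that fall into its arc of the ring; every other key keeps its owner.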