Managed Database Services

Let the cloud run your database. Here's when that's worth it.

The hook

Running Postgres yourself on a VM means owning the whole stack: patching the OS, patching the DB, taking backups, testing restores, configuring replication, building failover, wiring up monitoring, paging someone at 3 a.m. when disk fills up.

That's not a side quest. That's a full-time job for someone — usually a person you don't have.

Managed services (RDS, Aurora, Cloud SQL, DynamoDB, Cosmos DB) hand all of that to the provider in exchange for a markup. The interesting question isn't "should I use managed?" — it's "is the markup smaller than my engineering time plus the cost of getting reliability wrong?"

For most teams, most of the time, yes.

The concept

A managed database service runs the database engine for you. The provider handles:

Provisioning — spin up an instance with one API call
Patching — minor versions get applied during a maintenance window
Backups — point-in-time recovery, usually with retention you set
Replication — read replicas and multi-AZ standbys with a checkbox
Failover — automatic promotion when the primary dies
Monitoring — CPU, IOPS, slow queries, replication lag, all in the console

You're left with the parts only you can do: schema design, query tuning, choosing the right engine, and writing application code.
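
The "one API call" claim can be made concrete with a minimal sketch. It only builds the settings such a call would carry, so it runs without cloud credentials. The keys follow the parameter names of boto3's `create_db_instance` for RDS; the instance name, size, username, and maintenance window are illustrative assumptions, not recommendations.

```python
def rds_instance_params(name: str, retention_days: int = 7) -> dict:
    """Build keyword arguments for boto3's rds.create_db_instance (not called here)."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    return {
        "DBInstanceIdentifier": name,
        "Engine": "postgres",
        "DBInstanceClass": "db.t4g.medium",   # assumed size; pick per workload
        "AllocatedStorage": 100,              # GiB, assumed
        "MasterUsername": "app_admin",        # hypothetical username
        "ManageMasterUserPassword": True,     # let the service store the password
        "MultiAZ": True,                      # standby in a second availability zone
        "BackupRetentionPeriod": retention_days,  # days of point-in-time recovery
        "PreferredMaintenanceWindow": "sun:03:00-sun:04:00",
        "AutoMinorVersionUpgrade": True,      # minor patches applied in the window
        "StorageEncrypted": True,
    }


params = rds_instance_params("saas-primary")
print(params["MultiAZ"], params["BackupRetentionPeriod"])
```

With credentials configured, the dictionary would be passed as `boto3.client("rds").create_db_instance(**params)`. Backups, the standby, and patching are each a single field.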

Managed databases come in two flavors:

Managed traditional DBs — the same Postgres / MySQL / Redis / SQL Server you'd run yourself, just on someone else's infrastructure. Examples: AWS RDS, Aurora, Google Cloud SQL, Azure Database for PostgreSQL, ElastiCache. Portability is decent — the engine is open-source or standard, so a migration to another cloud or back to self-hosted is mostly an export/import.

Cloud-native DBs — purpose-built engines, often serverless, that only run on one provider. Examples: DynamoDB (AWS), Cosmos DB (Azure), Spanner / BigQuery / Firestore (GCP). They typically have better elasticity (scale to zero, or to massive throughput, automatically) but more vendor lock-in. Migrating off DynamoDB to anything else is a rewrite.

The taxonomy of database families (relational, key-value, document, column-family, graph, time-series, search) is covered on the database-types page. This page is about the cloud-services layer on top of those families: which one to pick, on which cloud, when to pay for managed.

Diagram

flowchart LR
    subgraph S1[Self-hosted Postgres on EC2]
        A1[App] --> A2[Postgres on EC2]
    end
    subgraph S2[RDS]
        B1[App] --> B2[Postgres on RDS]
    end
    subgraph S3[Aurora Serverless]
        C1[App] --> C2[Aurora Serverless]
    end
    subgraph S4[DynamoDB]
        D1[App] --> D2[DynamoDB]
    end
    style S1 fill:#fee,stroke:#c33
    style S2 fill:#ffd,stroke:#cc3
    style S3 fill:#efe,stroke:#3c3
    style S4 fill:#dfd,stroke:#2a8

What you own shrinks at every step:

Layer	EC2 Postgres	RDS	Aurora Serverless	DynamoDB
OS patches	You	Provider	Provider	Provider
DB patches	You	Provider (windowed)	Provider	Provider
Backups & PITR	You	Provider	Provider	Provider
Replication / failover	You	Provider	Provider	Provider
Capacity sizing	You	You	Auto	Auto
Schema & queries	You	You	You	You

By the time you're on DynamoDB, you're basically just writing application code against an API. That's the spectrum.
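
The same responsibility matrix can be written as data, which makes the shrinking ownership easy to count. This is a small sketch; the structure and function names are hypothetical, and the values are copied from the table above.

```python
PLATFORMS = ["EC2 Postgres", "RDS", "Aurora Serverless", "DynamoDB"]

# One row per layer, one owner per platform, in PLATFORMS order.
OWNERSHIP = {
    "OS patches":             ["You", "Provider", "Provider", "Provider"],
    "DB patches":             ["You", "Provider", "Provider", "Provider"],
    "Backups & PITR":         ["You", "Provider", "Provider", "Provider"],
    "Replication / failover": ["You", "Provider", "Provider", "Provider"],
    "Capacity sizing":        ["You", "You", "Auto", "Auto"],
    "Schema & queries":       ["You", "You", "You", "You"],
}


def layers_you_own(platform: str) -> list[str]:
    """Return the layers still operated by your team on the given platform."""
    col = PLATFORMS.index(platform)
    return [layer for layer, owners in OWNERSHIP.items() if owners[col] == "You"]


for platform in PLATFORMS:
    print(f"{platform}: {len(layers_you_own(platform))} layer(s) on you")
```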

Example: a typical SaaS company's DB stack

Imagine a 30-person B2B SaaS on AWS. Their database choices, by use case:

Primary OLTP — Aurora Postgres (managed)

The customer-facing app: users, orgs, projects, invoices, audit logs. Relational, transactional, ACID-required.

They pick Aurora Postgres over self-hosted Postgres on EC2. The Aurora instance runs about $200/month; an equivalent EC2 box would be $80/month. The $120 markup buys: automated multi-AZ failover (under a minute), continuous backups to S3, storage that auto-grows up to 128 TB, and replicas that lag by milliseconds. Self-hosting that combo properly takes a real DBA, and DBAs cost much more than $120/month.
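
As a worked version of that arithmetic, the sketch below computes the markup from the two prices above and converts it into engineer-hours. The $75 hourly rate is an assumption for illustration; substitute your own loaded cost.

```python
AURORA_MONTHLY = 200.0  # USD, from the example above
EC2_MONTHLY = 80.0      # USD, from the example above


def breakeven_hours(managed: float, self_hosted: float, hourly_rate: float) -> float:
    """Engineer-hours per month at which self-hosting costs as much as managed."""
    return (managed - self_hosted) / hourly_rate


markup = AURORA_MONTHLY - EC2_MONTHLY
hours = breakeven_hours(AURORA_MONTHLY, EC2_MONTHLY, hourly_rate=75.0)  # assumed rate

print(f"markup: ${markup:.0f}/month, ${markup * 12:.0f}/year")
print(f"break-even: {hours:.1f} engineer-hours/month")
```

At the assumed rate, the markup pays for itself if self-hosting would take more than about an hour and a half of engineering time per month.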

Hot key-value lookups — DynamoDB

Session store, rate-limit counters, per-user feature flags. Pure get(key) → value traffic, hundreds of millions of operations per month, p99 latency under 10ms.

DynamoDB with on-demand billing. Autoscales without ops work, single-digit-ms reads at any volume, no instance to size. The alternative — running Cassandra or Redis Cluster yourself — would consume an engineer for the rest of time. The lock-in is real, but the use case (cache-shaped workload) is exactly DynamoDB's home turf.
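
To show what a cache-shaped, `get(key) → value` workload looks like in code, here is a minimal sketch of a fixed-window rate-limit counter. It uses an in-memory dictionary as a stand-in for the table so it runs anywhere; the class, function, and key format are hypothetical. Against DynamoDB, the increment would typically be an `UpdateItem` call with an `ADD` update expression, with old windows expired through the table's TTL setting.

```python
import time


class InMemoryKV:
    """Stand-in for a key-value table; a real system would call the database."""

    def __init__(self):
        self._items = {}

    def get(self, key, default=None):
        return self._items.get(key, default)

    def increment(self, key, by=1):
        # A managed KV store performs this atomically on the server side.
        self._items[key] = self._items.get(key, 0) + by
        return self._items[key]


def allow_request(store, user_id, limit, window_seconds=60, now=None):
    """Fixed-window limiter: one counter per user per time window."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"ratelimit#{user_id}#{window}"
    return store.increment(key) <= limit


store = InMemoryKV()
decisions = [allow_request(store, "user-1", limit=2, now=0) for _ in range(3)]
print(decisions)                                             # third call is over the limit
print(allow_request(store, "user-1", limit=2, now=60))       # a new window starts
```

Every operation touches exactly one key, which is why this kind of traffic suits a key-value store and needs no joins or secondary queries.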

Analytics — BigQuery (or Snowflake)

Product analytics, billing reports, ad-hoc data science. Hundreds of GB to a few TB.

They lift the data into BigQuery nightly via a Fivetran-style pipe. Analysts run SQL against petabyte-capable infrastructure without sizing a cluster. On-demand pricing bills for the bytes each query scans. A self-hosted Spark or Presto cluster for this would be a quarter-time job and cost more in EC2 than BigQuery costs in queries.
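
Because on-demand warehouse pricing of this kind is driven by how much data each query reads, a rough monthly bill can be estimated from query volume. The sketch below does that arithmetic; the per-TiB rate and the workload numbers are placeholder assumptions, so check the provider's current price list before relying on them.

```python
TIB = 2 ** 40
GIB = 2 ** 30


def query_cost(bytes_scanned: float, usd_per_tib: float) -> float:
    """Cost of one query under scan-based pricing."""
    return bytes_scanned / TIB * usd_per_tib


def monthly_cost(queries_per_day: int, avg_gib_scanned: float,
                 usd_per_tib: float, days: int = 30) -> float:
    """Estimated monthly spend for a steady stream of similar queries."""
    per_query = query_cost(avg_gib_scanned * GIB, usd_per_tib)
    return queries_per_day * days * per_query


# Assumed workload: 200 queries a day, 5 GiB scanned each, placeholder rate.
estimate = monthly_cost(queries_per_day=200, avg_gib_scanned=5, usd_per_tib=6.25)
print(f"estimated spend: ${estimate:.2f}/month")
```

The same function shows why partitioning and selecting fewer columns matter: halving the bytes scanned halves the bill.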

Cache — ElastiCache Redis

Session cache, hot product lookups, queue for background jobs.

Same Redis they'd run themselves, but with replication, automatic failover, and patch management handled. Markup is tolerable; the alternative is babysitting a Redis cluster, and nobody wants that.

The pattern: pick the right family per use case (that's the database-types job), then default to the managed offering for that family unless you have a specific reason not to. "Specific reason" is a short list — covered in the last section.

Mechanics: managed DB picks across the big 3

Family	AWS	Azure	GCP	When it shines	Lock-in
Relational (OLTP)	RDS (Postgres/MySQL/MariaDB/SQL Server/Oracle), Aurora	Azure SQL Database, Azure Database for PostgreSQL/MySQL	Cloud SQL (Postgres/MySQL/SQL Server), Spanner	Standard transactional workloads. Aurora and Spanner extend with cloud-scale storage / global consistency.	RDS/Cloud SQL: low. Aurora: medium. Spanner: high.
Key-value / wide-column	DynamoDB	Cosmos DB (Table / Cassandra API)	Firestore, Bigtable	Massive scale, predictable access patterns, serverless billing.	High: the native APIs are provider-specific, though the Cassandra-compatible and HBase-compatible interfaces soften this.
Cache (in-memory)	ElastiCache (Redis / Memcached), MemoryDB	Azure Cache for Redis	Memorystore (Redis / Memcached)	Drop-in for self-hosted Redis with HA and patching handled.	Low — it's just Redis.
Search	OpenSearch Service	Azure AI Search	No first-party — use Elastic Cloud on GCP	Full-text, log analytics, faceted search.	Low to medium — Lucene-based engines are mostly portable.
Time-series	Timestream	Azure Data Explorer	No first-party: self-managed TimescaleDB or InfluxDB Cloud	IoT, observability, anything indexed by time.	Medium: APIs differ across providers.
Analytics / Warehouse	Redshift	Synapse Analytics	BigQuery	Petabyte-scale SQL analytics; separate from your OLTP DB.	High — SQL portable, performance characteristics and pricing models are not.
Graph	Neptune	Cosmos DB (Gremlin API)	No first-party	Relationship-first queries — fraud, recommendations, social.	Medium to high.

A few patterns to notice:

Open-source engines have low lock-in. Postgres on RDS, Cloud SQL, and Azure all speak the same SQL. You can leave.
Cloud-native DBs have high lock-in but better elasticity. DynamoDB, Spanner, BigQuery scale and price in ways the open-source equivalents can't match.
Some families have a clear leader. BigQuery is the warehouse standard. DynamoDB is the cloud-native KV standard. Spanner is unique for global-strong-consistency relational.
GCP has gaps. No first-party search, no first-party time-series. You bring Elastic / InfluxDB / TimescaleDB.
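
The table above can also be read as a lookup from (family, cloud) to a first-party service, with `None` marking the gaps. A minimal sketch follows; the dictionary layout and function names are hypothetical, and the entries are abbreviated from the table.

```python
FIRST_PARTY = {
    "relational":  {"aws": "RDS / Aurora", "azure": "Azure SQL Database", "gcp": "Cloud SQL / Spanner"},
    "key-value":   {"aws": "DynamoDB", "azure": "Cosmos DB", "gcp": "Firestore / Bigtable"},
    "cache":       {"aws": "ElastiCache / MemoryDB", "azure": "Azure Cache for Redis", "gcp": "Memorystore"},
    "search":      {"aws": "OpenSearch Service", "azure": "Azure AI Search", "gcp": None},
    "time-series": {"aws": "Timestream", "azure": "Azure Data Explorer", "gcp": None},
    "warehouse":   {"aws": "Redshift", "azure": "Synapse Analytics", "gcp": "BigQuery"},
    "graph":       {"aws": "Neptune", "azure": "Cosmos DB (Gremlin API)", "gcp": None},
}


def first_party_service(family: str, cloud: str):
    """Return the provider's own managed service, or None if you must bring one."""
    try:
        return FIRST_PARTY[family][cloud]
    except KeyError:
        raise ValueError(f"unknown family/cloud: {family!r}, {cloud!r}") from None


def gaps(cloud: str) -> list[str]:
    """Families for which the given cloud has no first-party service in the table."""
    return [family for family, row in FIRST_PARTY.items() if row[cloud] is None]


print(first_party_service("warehouse", "gcp"))
print(gaps("gcp"))
```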

Concept	What it is	Why it matters here
Database types	The conceptual taxonomy of DB families	Pick the family first; then pick the managed service in that family
SQL fundamentals	The query language and relational model	Most managed DBs you'll touch are still SQL — Aurora, Cloud SQL, Azure SQL all run it
Cloud storage services	Object, block, and file storage on the big clouds	Managed DBs sit on top of cloud storage; backups land in object storage
Database sharding	Splitting data across machines by partition key	Cloud-native DBs (DynamoDB, Spanner, BigQuery) handle sharding for you — that's a big chunk of what you're paying for
Cloud cost management	Tracking and controlling cloud spend	Databases are usually the biggest line item on a cloud bill — managed markup makes this even more true
ACID / CAP / BASE	Consistency models for transactions and distributed systems	Managed services pick a model for you. RDS Postgres = ACID. DynamoDB = eventual by default, strong on request. Spanner = global strong. Know what you're getting.
Polyglot persistence	Using multiple DB types in one system on purpose	Real systems mix Aurora + DynamoDB + ElastiCache + BigQuery. Managed services make this practical for small teams.
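
The "eventual by default, strong on request" behavior of DynamoDB shows up as a single request flag. The sketch below builds the arguments for a `GetItem` call in the low-level client format without sending it; the table name and the string partition key named `pk` are assumptions.

```python
def get_item_request(table: str, pk: str, strong: bool = False) -> dict:
    """Keyword arguments for a DynamoDB GetItem call (low-level client format)."""
    return {
        "TableName": table,
        "Key": {"pk": {"S": pk}},   # assumes a string partition key named "pk"
        "ConsistentRead": strong,   # False means an eventually consistent read
    }


fast = get_item_request("sessions", "user-42")
exact = get_item_request("sessions", "user-42", strong=True)
print(fast["ConsistentRead"], exact["ConsistentRead"])
```

With boto3 the request would be sent as `boto3.client("dynamodb").get_item(**exact)`. Strongly consistent reads consume more read capacity than eventually consistent ones, so the flag is worth setting only where a stale read would be a bug.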

When (and when not) to use managed

Use a managed DB when:

You're a small or medium team without a dedicated DBA. This is most teams. The markup is cheaper than the headcount.
Time-to-market matters more than $/GB. You'd rather ship features than tune postgresql.conf.
You want cloud-native superpowers — autoscaling, multi-region replication, serverless billing, point-in-time recovery — without building them.
Reliability matters and you don't want to be the on-call. Failover, backups, patching are exactly the things small teams forget until they bite.
You're already on a cloud and the DB lives next to your app. Network latency and IAM integration are tighter with the cloud's own services.

Skip managed (or push back) when:

You're at extreme scale, where the markup is real money. At Netflix / Discord / Stripe scale, "managed Postgres" can cost millions in markup, enough to fund a real DB team that self-hosts on raw infrastructure.
You need niche tuning the provider doesn't expose. Some kernel parameters, custom extensions, or replication topologies aren't possible on RDS / Cloud SQL.
Regulatory or sovereignty requirements block a specific provider, or require data to stay on hardware you control.
You're running an open-source DB the provider doesn't manage well. Some specialized engines (newer vector DBs, certain graph DBs) only have weak managed offerings — self-hosting on Kubernetes can beat a half-baked managed product.

For most teams, the default answer is yes, use managed. Self-hosting Postgres is not character-building. Pay the markup, focus on your product.
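
The two lists above amount to a short checklist: default to managed, and self-host only when a specific reason applies. A hedged sketch of that rule follows; the field names and the cost comparison are simplifications chosen for illustration, not a formal decision model.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    yearly_markup_usd: float        # managed price minus raw infrastructure price
    yearly_db_team_usd: float       # assumed cost of staffing self-hosting properly
    needs_unexposed_tuning: bool = False
    must_control_hardware: bool = False
    managed_offering_is_weak: bool = False


def reasons_to_self_host(s: Situation) -> list[str]:
    """Collect the specific reasons from the 'skip managed' list that apply."""
    reasons = []
    if s.yearly_markup_usd > s.yearly_db_team_usd:
        reasons.append("markup exceeds the cost of a DB team")
    if s.needs_unexposed_tuning:
        reasons.append("provider does not expose required tuning")
    if s.must_control_hardware:
        reasons.append("regulatory or sovereignty constraints")
    if s.managed_offering_is_weak:
        reasons.append("no solid managed offering for this engine")
    return reasons


def recommend(s: Situation) -> str:
    """Managed is the default; any listed reason flips the recommendation."""
    return "self-host (or push back)" if reasons_to_self_host(s) else "managed"


startup = Situation(yearly_markup_usd=1_440, yearly_db_team_usd=150_000)
print(recommend(startup))
```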

Key takeaway

Managed services trade money for time — and at most company sizes that's the right trade.
Two flavors: managed traditional (RDS, Cloud SQL, ElastiCache — low lock-in) and cloud-native (DynamoDB, Spanner, BigQuery — high lock-in, high elasticity).
Pick the family first (database-types), then pick the managed offering in that family.
Lock-in is real but priced in. DynamoDB and BigQuery are great enough that the lock-in is often a fair trade.
You usually outgrow your team before you outgrow managed. The "we need to self-host for cost" moment arrives much later than engineers think.
