Managed Database Services
Let the cloud run your database. Here's when that's worth it.
The hook
Running Postgres yourself on a VM means owning the whole stack: patching the OS, patching the DB, taking backups, testing restores, configuring replication, building failover, wiring up monitoring, paging someone at 3 a.m. when disk fills up.
That's not a side quest. That's a full-time job for someone — usually a person you don't have.
Managed services (RDS, Aurora, Cloud SQL, DynamoDB, Cosmos DB) hand all of that to the provider in exchange for a markup. The interesting question isn't "should I use managed?" — it's "is the markup smaller than my engineering time plus the cost of getting reliability wrong?"
For most teams, most of the time, yes.
The concept
A managed database service runs the database engine for you. The provider handles:
- Provisioning — spin up an instance with one API call
- Patching — minor versions get applied during a maintenance window
- Backups — point-in-time recovery, usually with retention you set
- Replication — read replicas and multi-AZ standbys with a checkbox
- Failover — automatic promotion when the primary dies
- Monitoring — CPU, IOPS, slow queries, replication lag, all in the console
You're left with the parts only you can do: schema design, query tuning, choosing the right engine, and writing application code.
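As a rough illustration of "one API call", the sketch below builds the arguments for boto3's RDS `create_db_instance` call. The parameter names are boto3's; the instance class, storage size, username, and maintenance window are assumed values chosen for illustration, not recommendations.

```python
def build_rds_request(identifier: str, multi_az: bool = True,
                      backup_retention_days: int = 7) -> dict:
    """Build keyword arguments for boto3's rds.create_db_instance()."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "DBInstanceClass": "db.t4g.medium",      # assumed size
        "AllocatedStorage": 100,                 # GiB, assumed
        "MasterUsername": "app_admin",           # hypothetical name
        "ManageMasterUserPassword": True,        # RDS keeps the password in Secrets Manager
        "MultiAZ": multi_az,                     # standby in another AZ, automatic failover
        "BackupRetentionPeriod": backup_retention_days,  # days of automated backups / PITR
        "PreferredMaintenanceWindow": "sun:03:00-sun:04:00",  # assumed window (UTC)
        "AutoMinorVersionUpgrade": True,         # minor versions applied in the window
    }


def provision(identifier: str) -> dict:
    """Send the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder above runs without it
    rds = boto3.client("rds")
    return rds.create_db_instance(**build_rds_request(identifier))


request = build_rds_request("orders-db")
print(request["MultiAZ"], request["BackupRetentionPeriod"])
```

Patching, backups, replication, and failover from the list above each show up here as a single parameter rather than a system you build.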
Managed databases come in two flavors:
Managed traditional DBs — the same Postgres / MySQL / Redis / SQL Server you'd run yourself, just on someone else's infrastructure. Examples: AWS RDS, Aurora, Google Cloud SQL, Azure Database for PostgreSQL, ElastiCache. Portability is decent — the engine is open-source or standard, so a migration to another cloud or back to self-hosted is mostly an export/import.
Cloud-native DBs — purpose-built engines, often serverless, that only run on one provider. Examples: DynamoDB (AWS), Cosmos DB (Azure), Spanner / BigQuery / Firestore (GCP). They typically have better elasticity (scale to zero, or to massive throughput, automatically) but more vendor lock-in. Migrating off DynamoDB to anything else is a rewrite.
The taxonomy of database families (relational, key-value, document, column-family, graph, time-series, search) is covered on the database-types page. This page is about the cloud-services layer on top of those families: which one to pick, on which cloud, when to pay for managed.
Diagram
flowchart LR
subgraph S1[Self-hosted Postgres on EC2]
A1[App] --> A2[Postgres on EC2]
end
subgraph S2[RDS]
B1[App] --> B2[Postgres on RDS]
end
subgraph S3[Aurora Serverless]
C1[App] --> C2[Aurora Serverless]
end
subgraph S4[DynamoDB]
D1[App] --> D2[DynamoDB]
end
style S1 fill:#fee,stroke:#c33
style S2 fill:#ffd,stroke:#cc3
style S3 fill:#efe,stroke:#3c3
style S4 fill:#dfd,stroke:#2a8
What you own shrinks at every step:
| Layer | EC2 Postgres | RDS | Aurora Serverless | DynamoDB |
|---|---|---|---|---|
| OS patches | You | Provider | Provider | Provider |
| DB patches | You | Provider (windowed) | Provider | Provider |
| Backups & PITR | You | Provider | Provider | Provider |
| Replication / failover | You | Provider | Provider | Provider |
| Capacity sizing | You | You | Auto | Auto |
| Schema & queries | You | You | You | You |
By the time you're on DynamoDB, you're basically just writing application code against an API. That's the spectrum.
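The same table can be written as a small data structure, which makes the "what you own shrinks" claim easy to check. This is only a sketch that transcribes the table; the names `MATRIX` and `owned_by_you` are made up for illustration.

```python
OPTIONS = ("EC2 Postgres", "RDS", "Aurora Serverless", "DynamoDB")

# One row per layer, one column per option, in the same order as OPTIONS.
# "you" = your team, "provider" = the cloud, "auto" = scaled automatically.
MATRIX = {
    "OS patches":             ("you", "provider", "provider", "provider"),
    "DB patches":             ("you", "provider", "provider", "provider"),
    "Backups & PITR":         ("you", "provider", "provider", "provider"),
    "Replication / failover": ("you", "provider", "provider", "provider"),
    "Capacity sizing":        ("you", "you", "auto", "auto"),
    "Schema & queries":       ("you", "you", "you", "you"),
}


def owned_by_you(option: str) -> list[str]:
    """Return the layers your team still owns for the given option."""
    column = OPTIONS.index(option)
    return [layer for layer, owners in MATRIX.items() if owners[column] == "you"]


for option in OPTIONS:
    print(option, "->", owned_by_you(option))
```

Running it shows six layers for EC2 Postgres, two for RDS, and only schema and queries for Aurora Serverless and DynamoDB.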
Example: a typical SaaS company's DB stack
Imagine a 30-person B2B SaaS on AWS. Their database choices, by use case:
Primary OLTP — Aurora Postgres (managed)
The customer-facing app: users, orgs, projects, invoices, audit logs. Relational, transactional, ACID-required.
They pick Aurora Postgres over self-hosted Postgres on EC2. The Aurora instance runs about $200/month; an equivalent EC2 box would be $80/month. The $120 markup buys: automated multi-AZ failover (under a minute), continuous backups to S3, storage that auto-grows up to 128 TB, and replicas that lag by milliseconds. Self-hosting that combo properly takes a real DBA, and DBAs cost much more than $120/month.
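To make the markup argument concrete, here is a back-of-the-envelope calculation. The $200 and $80 figures come from the paragraph above; the $75/hour loaded engineering cost is an assumption, so substitute your own number.

```python
def breakeven_hours(managed_monthly: float, self_hosted_monthly: float,
                    engineer_hourly_cost: float) -> float:
    """Hours of engineering time per month that the managed markup would pay for."""
    markup = managed_monthly - self_hosted_monthly
    return markup / engineer_hourly_cost


# $75/hour is an assumed loaded cost, not a figure from the example.
hours = breakeven_hours(200, 80, 75)
print(f"Markup buys {hours:.1f} engineer-hours per month")
```

Under that assumption the markup pays for well under two hours of engineering time a month. If self-hosting would cost the team more than that in patching, backup testing, and on-call, managed is the cheaper option.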
Hot key-value lookups — DynamoDB
Session store, rate-limit counters, per-user feature flags. Pure get(key) → value traffic, hundreds of millions of operations per month, p99 latency under 10ms.
They use DynamoDB with on-demand billing: it scales without ops work, serves reads in single-digit milliseconds at any volume, and has no instance to size. The alternative, running Cassandra or Redis Cluster yourself, would tie up an engineer indefinitely. The lock-in is real, but a cache-shaped workload is exactly DynamoDB's home turf.
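A rate-limit counter is a good example of how little there is to operate. The sketch below builds the arguments for the low-level `update_item` call on a boto3 DynamoDB client, using an atomic `ADD`. The table name, the `pk` key attribute, the key format, and the `hits` attribute are all hypothetical.

```python
def rate_limit_increment(table: str, user_id: str, window: str) -> dict:
    """Build update_item arguments that atomically add 1 to a per-user counter."""
    return {
        "TableName": table,
        # Hypothetical key layout: one item per user per time window.
        "Key": {"pk": {"S": f"ratelimit#{user_id}#{window}"}},
        # ADD creates the attribute starting from 0 if the item does not exist yet.
        "UpdateExpression": "ADD #hits :one",
        "ExpressionAttributeNames": {"#hits": "hits"},
        "ExpressionAttributeValues": {":one": {"N": "1"}},
        "ReturnValues": "UPDATED_NEW",  # return the new counter value
    }


def record_hit(table: str, user_id: str, window: str) -> int:
    """Send the update. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder above runs without it
    client = boto3.client("dynamodb")
    response = client.update_item(**rate_limit_increment(table, user_id, window))
    return int(response["Attributes"]["hits"]["N"])


params = rate_limit_increment("app-counters", "user-42", "2024-01-01T10:05")
print(params["Key"]["pk"]["S"])
```

There is no connection pool, replica, or instance size anywhere in this code, which is the point of picking a serverless key-value store for this workload.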
Analytics — BigQuery (or Snowflake)
Product analytics, billing reports, ad-hoc data science. Hundreds of GB to a few TB.
They load the data into BigQuery nightly via a Fivetran-style pipeline. Analysts run SQL against petabyte-capable infrastructure without sizing a cluster, and billing is based on the data each query scans. A self-hosted Spark or Presto cluster for this would be a quarter-time job and cost more in EC2 than BigQuery costs in queries.
Cache — ElastiCache Redis
Session cache, hot product lookups, queue for background jobs.
Same Redis they'd run themselves, but with replication, automatic failover, and patch management handled. Markup is tolerable; the alternative is babysitting a Redis cluster, and nobody wants that.
The pattern: pick the right family per use case (that's the database-types job), then default to the managed offering for that family unless you have a specific reason not to. "Specific reason" is a short list — covered in the last section.
Mechanics: managed DB picks across the big 3
| Family | AWS | Azure | GCP | When it shines | Lock-in |
|---|---|---|---|---|---|
| Relational (OLTP) | RDS (Postgres/MySQL/MariaDB/SQL Server/Oracle), Aurora | Azure SQL Database, Azure Database for PostgreSQL/MySQL | Cloud SQL (Postgres/MySQL/SQL Server), Spanner | Standard transactional workloads. Aurora and Spanner extend with cloud-scale storage / global consistency. | RDS/Cloud SQL: low. Aurora: medium. Spanner: high. |
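| Key-value / wide-column | DynamoDB | Cosmos DB (Table / Cassandra API) | Firestore, Bigtable | Massive scale, predictable access patterns, serverless billing. | High for DynamoDB and Firestore, whose APIs don't exist anywhere else. Lower where a compatible open-source API exists (Cosmos DB's Cassandra API, Bigtable's HBase-compatible client). |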
| Cache (in-memory) | ElastiCache (Redis / Memcached), MemoryDB | Azure Cache for Redis | Memorystore (Redis / Memcached) | Drop-in for self-hosted Redis with HA and patching handled. | Low — it's just Redis. |
| Search | OpenSearch Service | Azure AI Search | No first-party — use Elastic Cloud on GCP | Full-text, log analytics, faceted search. | Low to medium — Lucene-based engines are mostly portable. |
| Time-series | Timestream | Azure Data Explorer | No first-party — TimescaleDB on Cloud SQL or InfluxDB Cloud | IoT, observability, anything indexed by time. | Medium — APIs differ across providers. |
| Analytics / Warehouse | Redshift | Synapse Analytics | BigQuery | Petabyte-scale SQL analytics; separate from your OLTP DB. | High — SQL portable, performance characteristics and pricing models are not. |
| Graph | Neptune | Cosmos DB (Gremlin API) | No first-party | Relationship-first queries — fraud, recommendations, social. | Medium to high. |
A few patterns to notice:
- Open-source engines have low lock-in. Postgres on RDS, Cloud SQL, and Azure all speak the same SQL. You can leave.
- Cloud-native DBs have high lock-in but better elasticity. DynamoDB, Spanner, BigQuery scale and price in ways the open-source equivalents can't match.
- Some families have a clear leader. BigQuery is the warehouse standard. DynamoDB is the cloud-native KV standard. Spanner is unique for global-strong-consistency relational.
- GCP has gaps. No first-party search, no first-party time-series. You bring Elastic / InfluxDB / TimescaleDB.
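The table also works as a lookup: choose the family, then see which first-party services exist on your cloud. A minimal sketch that transcribes the rows above (the family keys and function names are made up for illustration, and an empty list stands for "No first-party"):

```python
SERVICES = {
    "relational":  {"aws": ["RDS", "Aurora"],
                    "azure": ["Azure SQL Database", "Azure Database for PostgreSQL/MySQL"],
                    "gcp": ["Cloud SQL", "Spanner"]},
    "key-value":   {"aws": ["DynamoDB"], "azure": ["Cosmos DB"],
                    "gcp": ["Firestore", "Bigtable"]},
    "cache":       {"aws": ["ElastiCache", "MemoryDB"],
                    "azure": ["Azure Cache for Redis"], "gcp": ["Memorystore"]},
    "search":      {"aws": ["OpenSearch Service"], "azure": ["Azure AI Search"], "gcp": []},
    "time-series": {"aws": ["Timestream"], "azure": ["Azure Data Explorer"], "gcp": []},
    "warehouse":   {"aws": ["Redshift"], "azure": ["Synapse Analytics"], "gcp": ["BigQuery"]},
    "graph":       {"aws": ["Neptune"], "azure": ["Cosmos DB (Gremlin API)"], "gcp": []},
}


def first_party_options(family: str, cloud: str) -> list[str]:
    """First-party managed services for a DB family on a given cloud."""
    return SERVICES[family][cloud]


def gaps(cloud: str) -> list[str]:
    """Families where the table lists no first-party service for this cloud."""
    return [family for family, by_cloud in SERVICES.items() if not by_cloud[cloud]]


print(first_party_options("relational", "gcp"))
print(gaps("gcp"))
```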
Related concepts
| Concept | What it is | Why it matters here |
|---|---|---|
| Database types | The conceptual taxonomy of DB families | Pick the family first; then pick the managed service in that family |
| SQL fundamentals | The query language and relational model | Most managed DBs you'll touch are still SQL — Aurora, Cloud SQL, Azure SQL all run it |
| Cloud storage services | Object, block, and file storage on the big clouds | Managed DBs sit on top of cloud storage; backups land in object storage |
| Database sharding | Splitting data across machines by partition key | Cloud-native DBs (DynamoDB, Spanner, BigQuery) handle sharding for you — that's a big chunk of what you're paying for |
| Cloud cost management | Tracking and controlling cloud spend | Databases are usually the biggest line item on a cloud bill — managed markup makes this even more true |
| ACID / CAP / BASE | Consistency models for transactions and distributed systems | Managed services pick a model for you. RDS Postgres = ACID. DynamoDB = eventual by default, strong on request. Spanner = global strong. Know what you're getting. |
| Polyglot persistence | Using multiple DB types in one system on purpose | Real systems mix Aurora + DynamoDB + ElastiCache + BigQuery. Managed services make this practical for small teams. |
When (and when not) to use managed
Use a managed DB when:
- You're a small or medium team without a dedicated DBA. This is most teams. The markup is cheaper than the headcount.
- Time-to-market matters more than $/GB. You'd rather ship features than tune postgresql.conf.
- You want cloud-native superpowers (autoscaling, multi-region replication, serverless billing, point-in-time recovery) without building them.
- Reliability matters and you don't want to be the on-call. Failover, backups, patching are exactly the things small teams forget until they bite.
- You're already on a cloud and the DB lives next to your app. Network latency and IAM integration are tighter with the cloud's own services.
Skip managed (or push back) when:
- Extreme scale where the markup is real money. At Netflix / Discord / Stripe scale, the markup on "managed Postgres" can reach millions of dollars, enough to fund a real DB team that self-hosts on raw infrastructure.
- You need niche tuning the provider doesn't expose. Some kernel parameters, custom extensions, or replication topologies aren't possible on RDS / Cloud SQL.
- Regulatory or sovereignty requirements that block a specific provider, or require data to stay on hardware you control.
- You're running an open-source DB the provider doesn't manage well. Some specialized engines (newer vector DBs, certain graph DBs) only have weak managed offerings — self-hosting on Kubernetes can beat a half-baked managed product.
For most teams, the default answer is yes, use managed. Self-hosting Postgres is not character-building. Pay the markup, focus on your product.
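The rule of thumb in this section can be expressed as a short checklist: default to managed, and reconsider only when one of the four "skip" conditions applies. The sketch below encodes that logic; the class and field names are hypothetical, and judging whether a condition really applies is still up to you.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    extreme_scale: bool = False                # markup is a material cost
    needs_unexposed_tuning: bool = False       # kernel params, extensions, topologies
    sovereignty_blocks_provider: bool = False  # regulation rules out the provider
    weak_managed_offering: bool = False        # provider manages this engine poorly


def reasons_to_skip_managed(s: Situation) -> list[str]:
    """Return the "skip managed" conditions from the list above that apply."""
    checks = [
        (s.extreme_scale, "markup is real money at this scale"),
        (s.needs_unexposed_tuning, "required tuning is not exposed"),
        (s.sovereignty_blocks_provider, "regulatory or sovereignty constraint"),
        (s.weak_managed_offering, "managed offering for this engine is weak"),
    ]
    return [reason for applies, reason in checks if applies]


def recommend(s: Situation) -> str:
    """Default to managed unless a specific reason says otherwise."""
    reasons = reasons_to_skip_managed(s)
    if not reasons:
        return "use managed"
    return "consider self-hosting: " + "; ".join(reasons)


print(recommend(Situation()))
print(recommend(Situation(needs_unexposed_tuning=True)))
```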
Key takeaway
- Managed services trade money for time — and at most company sizes that's the right trade.
- Two flavors: managed traditional (RDS, Cloud SQL, ElastiCache — low lock-in) and cloud-native (DynamoDB, Spanner, BigQuery — high lock-in, high elasticity).
- Pick the family first (database-types), then pick the managed offering in that family.
- Lock-in is real but priced in. DynamoDB and BigQuery are great enough that the lock-in is often a fair trade.
- You usually outgrow your team before you outgrow managed. The "we need to self-host for cost" moment arrives much later than engineers think.