6.4 Database Design and Optimization
Goal of this section: Learn to design robust databases using the AI cross-validation method, understand key concepts like indexing, security, and connection management, and know where to start when your application slows down.
Old Wang's WeChat Reading Notes Disaster
Old Wang built a "WeChat Reading Notes" app—users could read books, write notes, highlight passages, and share. He had AI design the schema, which looked solid: users table, books table, notes table, highlights table, with proper relationships and constraints. Old Wang thought it was fine and launched.
First 100 users: everything worked. Pages loaded instantly, experience was smooth.
At 1,000 users: occasionally someone complained "loading is a bit slow." Old Wang didn't think much of it—probably just their network.
At 5,000 users: everything fell apart. Opening the notes list took 3 seconds; search timed out completely. Old Wang was baffled—only tens of thousands of rows, how could it be slow? He thought the server wasn't powerful enough and considered upgrading.
He asked a database-savvy friend to take a look. The friend said three words: "No indexes."
Old Wang's notes table had two foreign keys, user_id and book_id, but neither was indexed. Every "get this user's notes" query scanned the entire table from top to bottom—5,000 users × 50 notes each = 250,000 rows of full table scan. Like searching through a 250,000-page book with no table of contents for one person's notes, page by page.
The friend added two indexes in under a minute. Query time dropped from 3 seconds to 30 milliseconds—100× faster.
Old Wang was shocked: just two lines of code, such a huge performance gain?
His friend said: "Database design isn't just drawing tables and linking foreign keys. Indexes, security, connection management—these 'invisible' things determine whether your app can handle real users. Your schema can look beautiful, but without these fundamentals, it collapses when users scale."
This section covers these "invisible but critical" elements.
AI Cross-Validation Method (The Refining Pit)
Beginners struggle to spot schema design flaws at a glance. You ask AI to design a schema; it looks reasonable, but is it really okay? Are data types correct? Are foreign key columns indexed? Are constraints complete? Any performance pitfalls?
You might not see the issues, but another AI often can. This is the core idea of "cross-validation":
- Have AI #1 design the schema based on your requirements
- Send the generated code to AI #2, asking it to review as a "senior PostgreSQL architect"
- Have AI #1 revise based on AI #2's feedback
- Repeat for 1-2 rounds to get a robust design
Why does this work? Because every AI has blind spots. AI #1 might focus on business logic correctness while overlooking performance optimizations; AI #2, as a reviewer, more easily spots issues like "this foreign key lacks an index" or "this column has the wrong type."
It's like code review in programming—bugs in your own code are hard to spot, but colleagues catch them immediately. Cross-validation between AIs works the same way.
Usually after two rounds of "left-right sparring," you get a solid database design. You don't need to be a database expert—just know how to "make AIs criticize each other."
Indexes: The Database's Table of Contents
Old Wang's story already illustrates the importance of indexes. Here's a deeper dive, because indexes are the most important and most overlooked aspect of database performance optimization.
What Indexes Are and What Problems They Solve
A query without an index is like searching for a keyword in a 500-page book with no table of contents—you have to flip from page one to the end. This is called a Sequential Scan; the more data, the slower it gets. You won't notice with 100 rows, but at 100,000 rows it starts lagging, and at 1 million rows you might wait several seconds.
An Index is a "table of contents" for your database. With an index, the database can jump directly to the target location without scanning row by row. Speed improvements of 100-1000× are common.
The principle isn't complex. Imagine a dictionary: without a pinyin index, to find "database" you'd flip from start to finish. With the pinyin index, you check the "S" section, find "shu," and jump directly to the right page. Database indexes work similarly—they maintain a sorted "directory" to quickly locate where data resides.
Which Columns Need Indexes
Indexes aren't free—they consume extra storage (typically 10%-30% of indexed data), and every insert or update must sync the index, slowing writes. So don't index every column; only "worthwhile" ones:
| Must Index | Reason |
|---|---|
Foreign key columns (user_id, post_id) | JOINs and cascade deletes depend on them; without indexes, full table scans occur |
| Columns frequently in WHERE clauses | Like status, email, phone—these often appear in query conditions |
| ORDER BY columns | Like created_at descending; sorting without an index requires reading all data first |
| Don't Need to Index | Reason |
|---|---|
| Rarely queried columns | Like bio, avatar—almost never in WHERE or ORDER BY |
| Boolean columns (only true/false) | Too low cardinality; indexes don't help—imagine a book's table of contents with only "odd pages" and "even pages" |
| Small tables (< 1000 rows) | Full scans are fast anyway; indexes waste space and write performance |
PostgreSQL does NOT auto-index foreign keys
This is the pitfall Old Wang hit, and the most common performance trap. Many assume defining a foreign key automatically creates an index—it does not. MySQL auto-indexes foreign keys, but PostgreSQL does not. You must create them manually.
When reviewing AI-generated schemas, first check: does every foreign key column have a corresponding index? This single rule prevents 80% of performance issues.
Different Index Types
Most of the time, the default B-tree index suffices—B-tree is the most versatile database index type. You don't need to understand its internal structure; just know it fits most queries. It handles equality queries (WHERE email = 'xxx') and range queries (WHERE created_at > '2024-01-01'). When you say "add an index to a column" without specifying type, the database creates a B-tree by default.
But two special scenarios are worth knowing:
GIN Index: If your table has jsonb columns (storing user tags, product attributes, AI-generated structured data), standard B-tree indexes are useless for queries inside JSON. GIN (Generalized Inverted Index) is designed for "quickly searching inside complex data structures"—like finding a specific tag among many JSON objects.
Example: Your users table has a preferences column of type jsonb, storing {"theme": "dark", "language": "zh", "tags": ["vip", "early-adopter"]}. To find "all VIP users," a B-tree index can't help, but a GIN index drops this query from seconds to milliseconds.
Partial Index: If you use soft deletes (deleted_at field), most queries only care about undeleted data. Regular indexes include all rows, including deleted ones. Partial indexes can index only rows where deleted_at IS NULL—smaller, faster indexes uncluttered by deleted data.
You don't need to decide when to use these yourself—AI usually picks appropriate index types when generating code. But when you see GIN or partial indexes in AI output, knowing what they do and why prevents confusion.
Row-Level Security (RLS): The Last Wall Against Data Leaks
What Problem It Solves
Imagine Xiaohong's food delivery platform has a bug: an API endpoint forgot permission checks, so any user can see others' orders—including delivery addresses, phone numbers, purchase history.
If access control only exists in code, one bug leaks all user data. And code bugs are inevitable—even the best programmers make mistakes, even strict code reviews miss things.
Row-Level Security (RLS) enforces permissions at the database level. You define a rule: "users can only see their own orders," and the database automatically filters, regardless of entry point—API endpoints, admin panels, data migration scripts, all are constrained.
This means even with code bugs, the database won't leak other users' data. RLS is like the bank vault's final door—even if thieves breach all outer defenses (code-level permission checks), they still can't open the vault door (database-level RLS).
When You Need RLS
If you use Supabase, RLS is almost mandatory. Because Supabase's client SDK connects directly to the database (via PostgREST), frontend code can send SQL queries directly. Without RLS, anyone can modify query conditions in the browser console and see all users' data—this is running naked.
If you use Neon with your own backend API, RLS isn't mandatory. Because user requests go through your backend first, you can enforce access control at the API layer (e.g., "only return current logged-in user's orders"). But RLS as an extra security layer is still worth considering—more defense is better than less, especially when handling sensitive data (payment info, personal privacy).
Key Points for Reviewing RLS Policies
If AI generates RLS policies, check three things:
- Is it actually enabled? Defining policies isn't enough; you need
ALTER TABLE ... ENABLE ROW LEVEL SECURITYor policies don't take effect—like setting a phone password but not enabling lock screen, the password is useless. This is the most common omission—policy definition and enabling are separate steps, both required - Are all operations covered? SELECT, INSERT, UPDATE, DELETE each need separate policies; missing one is a security hole. For example, defining "users can only view their own orders" (SELECT policy) but forgetting UPDATE policy means users can modify others' orders
- Will it slow queries? Columns used in RLS policies (like
user_id) must be indexed, or every query will full-scan to filter permissions. This circles back to index importance—an unindexed RLS policy is secure but unbearably slow
Connection Management: The Culprit Behind "Crashes When Users Scale"
What Problem It Solves
Old Wang's app had another problem. This time not slowness, but direct errors: too many connections. Only 50 concurrent users and the database rejected new connections.
Old Wang was confused: 50 users? Others handle thousands or tens of thousands fine, my 50 crashes?
The reason: every database connection consumes server resources. Ever seen "service busy, please try again later" in some apps? One cause is exhausted connections. Each connection uses about 1-3 MB memory, plus TCP connection state, authentication info, etc. Free-tier database servers have limited resources—maybe a few hundred MB total memory, supporting roughly 100-200 connections.
The problem was Old Wang's application code: every user request created a new database connection; after handling the request, connections weren't released. 50 users refreshing pages, each triggering several requests, instantly maxed out connections. Subsequent requests queued indefinitely; users saw pages spinning forever.
Connection Pooling: Reuse, Don't Recreate
Connection Pooler solves this. The principle is simple: pre-establish a batch of database connections (say, 10), put them in a "pool." When the app needs a connection, it borrows one from the pool; when done, returns it for the next request to reuse.
Like bike-sharing: a city doesn't need everyone to own a bike; just place shared bikes around. People ride when needed, return at destination, next person uses the same bike. 100 people might only need 20 bikes—not everyone rides simultaneously.
Connection pooling works the same. 100 concurrent requests might only need 10 database connections—because each request holds a connection briefly (typically milliseconds to tens of milliseconds), mostly idle and available for reuse.
Both Neon and Supabase have built-in connection pooling. When deploying, ensure you use the pooler address (usually different port; e.g., Supabase direct is 5432, pooler is 6543). Section 6.0 covered this in detail.
Don't "Occupy the Latrine Without Doing Business"
Transactions hold a database connection during execution and may hold locks. If a transaction includes external API calls, sending emails, or other slow operations, that connection sits idle while others queue.
Imagine self-checkout at a supermarket: the person ahead scans items, then stands at the machine making a phone call before paying. Everyone behind waits. The right approach: scan items, pay, take your things (database operations), then step aside to make your call (external calls).
Principle: Only put database operations in transactions. External calls go after transaction commit.
Data Security Fundamentals
Your database holds your most valuable asset—user data. Once leaked or corrupted, consequences can be catastrophic. This section covers three basic security measures.
SQL Injection: ORMs Already Block It
SQL injection is the most classic security vulnerability, existing for over two decades and still a primary attack vector against web applications.
The principle is simple: attackers input malicious SQL fragments; if backend code directly concatenates user input into SQL statements, these fragments execute as SQL. For example, entering ' OR 1=1 -- in a login form's "username" field—if unprotected, bypasses password verification to log into any account.
Good news: using an ORM (Drizzle/Prisma) automatically prevents this. ORMs use parameterized queries internally—user input is treated as "data" not "code," so no input can execute as SQL. It's like putting user input in a sealed envelope for the database; the database reads the contents but won't execute them as instructions.
But if reviewing AI output you see handwritten SQL (string concatenation like `SELECT * FROM users WHERE name = '${userName}'`), be alert—this lacks parameterized protection and is vulnerable to SQL injection. Convert to ORM syntax or parameterized queries.
Principle of Least Privilege: Don't Run Apps with Admin Accounts
This is intuitive: you don't give delivery drivers keys to every room in your house, just the front door. You don't tell food delivery workers your bank password, just pay for that meal. Databases work the same.
The account your app uses to connect should have only minimum necessary permissions: read data, write data, modify data—but not drop tables, alter schema, or create users. Admin accounts are only for migrations and maintenance.
Why does this matter? Because application code is exposed to the internet—attackers' easiest entry point. If your app's database account has admin privileges, once compromised (via an unpatched vulnerability), attackers can do anything—drop all tables, export all data, even plant backdoors.
But if the app account only has read-write permissions, even if compromised, attackers can't do catastrophic damage—they can read data (bad enough), but at least can't drop tables, alter structure, or disrupt database operations.
Backups Are the Bottom Line
Data is your product's soul. Code can be rewritten, ugly UI can be reskinned, but if user data is lost—accounts, orders, chat history, created content—your product is finished. No technical means can recover lost user data from scratch.
Three elements of backup strategy:
- Automatic backups: Neon and Supabase both offer automatic backups; confirm they're enabled. Don't rely on manual backups—you'll forget
- Multi-location backups: Important data shouldn't live in one place. Cloud provider data centers can fail (rare but possible); don't put all eggs in one basket
- Recovery drills: Most overlooked. Regularly test restoring from backups to confirm they actually work. Too many people back up but never test, only discovering corrupted files, incompatible formats, or broken recovery procedures when they need them. Backup value isn't in "having done it," but in "can it restore"
Performance Diagnostics: What to Do When Queries Are Slow
Your app is live, then suddenly users report "page loading slowly." Don't panic; follow this order to diagnose—most problems can be located:
Step 1: Confirm it's actually a database problem. Slow pages aren't necessarily slow databases. Could be heavy frontend rendering, large images, slow API server response, or user's poor network. Add logging to slow endpoints, recording database query time. If the query takes 10ms but the endpoint takes 3 seconds, the problem isn't the database—look elsewhere.
Step 2: Check for missing indexes. If confirmed as slow database query, first suspect missing indexes. PostgreSQL has a tool called EXPLAIN ANALYZE—like an "X-ray" for queries—showing which indexes were used, how many rows scanned, time spent per step. If you see Sequential Scan, you're missing an index. Send EXPLAIN ANALYZE output to AI; it will tell you which column needs indexing.
Step 3: Check for N+1 problems. If one page makes dozens or hundreds of database queries (visible in browser DevTools Network panel), it's likely N+1. Section 6.3 covered this in detail—fix with JOINs or batch queries.
Step 4: Check connections and transactions. Are connections near the limit? Any long-uncommitted transactions blocking others? This info is visible in database platform monitoring dashboards. Both Neon and Supabase provide intuitive monitoring showing active connections, slowest queries, lock wait situations, etc.
Most performance issues resolve in the first two steps—either it's not a database problem, or it's missing indexes. Truly needing to dig into steps 3-4 is rare.
Load Supabase Postgres Best Practices
If you use an AI coding tool with Skills support, you can load the supabase-postgres-best-practices Skill. It contains 31 production-validated PostgreSQL optimization rules covering indexing strategies, connection management, RLS performance optimization, pagination patterns, etc. AI will automatically apply these best practices when generating code.
Core Takeaways from This Section
- AI cross-validation: Have different AIs review schemas; 2 rounds of iteration yields robust designs
- Indexes are performance-critical: Foreign key columns must be indexed; PostgreSQL doesn't auto-create them
- RLS prevents data leaks: Database-level access control; even code bugs won't leak data
- Connection pooling prevents crashes: Production environments must use pooling, or dozens of users can max connections
- Backups are the bottom line: Automatic backups + recovery drills; neither can be skipped
