Database Design & Best Practices
Learn normalization, indexing strategies, and schema design principles
Database Design Principles
Good database design is fundamental to building efficient, maintainable, and scalable applications. This guide covers the essential principles and practices for designing robust relational databases.
Database design is both an art and a science. While there are established rules and best practices, real-world requirements often require thoughtful trade-offs between different design goals.
Database Normalization
Normalization is the process of organizing data to minimize redundancy and dependency. It involves breaking down tables into smaller, related tables and defining relationships between them.
First Normal Form (1NF)
A table is in 1NF if:
- Each column contains atomic (indivisible) values
- Each row is unique
- No repeating groups of columns
-- ❌ Not in 1NF: Multiple values in phone_numbers column
CREATE TABLE customers_bad (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
phone_numbers VARCHAR(255) -- "123-456-7890, 098-765-4321"
);
-- ✅ 1NF: Atomic values, separate table for multiple phones
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE customer_phones (
customer_id INT,
phone_number VARCHAR(20),
phone_type VARCHAR(10), -- 'home', 'work', 'mobile'
PRIMARY KEY (customer_id, phone_number),
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);Second Normal Form (2NF)
A table is in 2NF if:
- It's in 1NF
- All non-key columns are fully dependent on the primary key (no partial dependencies)
-- ❌ Not in 2NF: Partial dependency on composite key
CREATE TABLE order_items_bad (
order_id INT,
product_id INT,
product_name VARCHAR(100), -- Depends only on product_id
product_price DECIMAL(10,2), -- Depends only on product_id
quantity INT, -- Depends on both order_id and product_id
PRIMARY KEY (order_id, product_id)
);
-- ✅ 2NF: Remove partial dependencies
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE
);
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
price DECIMAL(10,2)
);
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
unit_price DECIMAL(10,2), -- Price at time of order
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);Third Normal Form (3NF)
A table is in 3NF if:
- It's in 2NF
- No transitive dependencies (non-key columns don't depend on other non-key columns)
-- ❌ Not in 3NF: Transitive dependency
CREATE TABLE employees_bad (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT,
department_name VARCHAR(100), -- Depends on department_id, not employee_id
department_location VARCHAR(100) -- Depends on department_id, not employee_id
);
-- ✅ 3NF: Remove transitive dependencies
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100),
location VARCHAR(100)
);
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);Boyce-Codd Normal Form (BCNF)
BCNF is a stricter version of 3NF. A table is in BCNF if every determinant is a candidate key.
-- Scenario: Students can have multiple majors, professors teach multiple subjects
-- ❌ Not in BCNF: Professor determines subject, but (student, professor) is the key
CREATE TABLE student_courses_bad (
student_id INT,
professor_id INT,
subject VARCHAR(50), -- Determined by professor_id
PRIMARY KEY (student_id, professor_id)
);
-- ✅ BCNF: Separate the dependency
CREATE TABLE professors (
professor_id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE subjects (
subject_id INT PRIMARY KEY,
subject_name VARCHAR(100)
);
CREATE TABLE professor_subjects (
professor_id INT,
subject_id INT,
PRIMARY KEY (professor_id, subject_id),
FOREIGN KEY (professor_id) REFERENCES professors(professor_id),
FOREIGN KEY (subject_id) REFERENCES subjects(subject_id)
);
CREATE TABLE student_enrollments (
student_id INT,
professor_id INT,
subject_id INT,
enrollment_date DATE,
PRIMARY KEY (student_id, professor_id, subject_id),
FOREIGN KEY (professor_id, subject_id) REFERENCES professor_subjects(professor_id, subject_id)
);Entity-Relationship (ER) Modeling
ER modeling helps visualize database structure before implementation.
Entity Types
-- Strong Entity: Exists independently
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Weak Entity: Depends on strong entity for identification
CREATE TABLE order_items (
order_id INT,
item_sequence INT, -- Sequential number within the order
product_id INT NOT NULL,
quantity INT NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
PRIMARY KEY (order_id, item_sequence),
FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE
);Relationship Types
One-to-One (1:1)
-- Each user has exactly one profile
CREATE TABLE users (
user_id INT PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
CREATE TABLE user_profiles (
user_id INT PRIMARY KEY,
bio TEXT,
profile_picture_url VARCHAR(255),
birth_date DATE,
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE
);One-to-Many (1:M)
-- One customer can have many orders
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
total_amount DECIMAL(10,2),
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);Many-to-Many (M:M)
-- Students can enroll in many courses, courses can have many students
CREATE TABLE students (
student_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(100) NOT NULL,
credits INT NOT NULL
);
-- Junction/Bridge table
CREATE TABLE enrollments (
student_id INT,
course_id INT,
enrollment_date DATE NOT NULL,
grade CHAR(2),
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (student_id) REFERENCES students(student_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id)
);Indexing Strategies
Indexes are crucial for query performance but come with storage and maintenance overhead.
Primary Indexes
-- Primary key automatically creates unique clustered index
CREATE TABLE products (
product_id INT AUTO_INCREMENT PRIMARY KEY, -- Clustered index
sku VARCHAR(50) UNIQUE NOT NULL, -- Unique non-clustered index
name VARCHAR(100) NOT NULL
);Secondary Indexes
-- Single column index for frequent WHERE clauses
CREATE INDEX idx_products_category ON products(category);
-- Composite index for multi-column queries
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
-- Covering index includes all needed columns
CREATE INDEX idx_orders_covering ON orders(customer_id, order_date)
INCLUDE (total_amount, status);
-- Partial index with condition (PostgreSQL)
CREATE INDEX idx_active_orders ON orders(customer_id)
WHERE status = 'active';
-- Function-based index (PostgreSQL)
CREATE INDEX idx_email_lower ON users(LOWER(email));Index Design Guidelines
Index Best Practices
- Create indexes on frequently queried columns - Use composite indexes for multi-column WHERE clauses - Consider index column order (most selective first) - Don't over-index - each index adds overhead to INSERT/UPDATE/DELETE - Monitor index usage and remove unused indexes
-- ✅ Good: Uses index efficiently
SELECT * FROM orders WHERE customer_id = 123 AND order_date >= '2024-01-01';
-- ❌ Bad: Function prevents index usage
SELECT * FROM orders WHERE YEAR(order_date) = 2024;
-- ✅ Better: Range query uses index
SELECT * FROM orders WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';Schema Design Patterns
Lookup Tables
-- Instead of storing status as VARCHAR everywhere
CREATE TABLE order_statuses (
status_id TINYINT PRIMARY KEY,
status_name VARCHAR(20) NOT NULL,
description TEXT
);
INSERT INTO order_statuses VALUES
(1, 'pending', 'Order received, awaiting payment'),
(2, 'paid', 'Payment confirmed, processing'),
(3, 'shipped', 'Order shipped to customer'),
(4, 'delivered', 'Order delivered successfully'),
(5, 'cancelled', 'Order cancelled by customer or system');
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
status_id TINYINT NOT NULL DEFAULT 1,
order_date DATE,
FOREIGN KEY (status_id) REFERENCES order_statuses(status_id)
);Audit Trail Pattern
-- Main table
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
version INT DEFAULT 1
);
-- Audit history table
CREATE TABLE products_audit (
audit_id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT NOT NULL,
name VARCHAR(100),
price DECIMAL(10,2),
action_type ENUM('INSERT', 'UPDATE', 'DELETE'),
action_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
action_user VARCHAR(50),
old_values JSON,
new_values JSON
);
-- Trigger to populate audit table (MySQL example)
DELIMITER $$
CREATE TRIGGER products_audit_trigger
AFTER UPDATE ON products
FOR EACH ROW
BEGIN
INSERT INTO products_audit (
product_id, name, price, action_type, action_user,
old_values, new_values
) VALUES (
NEW.product_id, NEW.name, NEW.price, 'UPDATE', USER(),
JSON_OBJECT('name', OLD.name, 'price', OLD.price),
JSON_OBJECT('name', NEW.name, 'price', NEW.price)
);
END$$
DELIMITER ;Soft Delete Pattern
CREATE TABLE users (
user_id INT PRIMARY KEY,
username VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL,
is_deleted BOOLEAN DEFAULT FALSE,
deleted_at TIMESTAMP NULL,
deleted_by INT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create view for active users
CREATE VIEW active_users AS
SELECT user_id, username, email, created_at
FROM users
WHERE is_deleted = FALSE;
-- Soft delete procedure
DELIMITER $$
CREATE PROCEDURE soft_delete_user(IN p_user_id INT, IN p_deleted_by INT)
BEGIN
UPDATE users
SET is_deleted = TRUE,
deleted_at = CURRENT_TIMESTAMP,
deleted_by = p_deleted_by
WHERE user_id = p_user_id AND is_deleted = FALSE;
END$$
DELIMITER ;Data Integrity and Constraints
Foreign Key Constraints with Actions
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
-- CASCADE: Delete orders when customer is deleted
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
-- RESTRICT: Prevent deletion if referenced
FOREIGN KEY (order_id) REFERENCES orders(order_id)
ON DELETE RESTRICT,
-- SET NULL: Set to NULL when product is deleted
FOREIGN KEY (product_id) REFERENCES products(product_id)
ON DELETE SET NULL
);Check Constraints
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
discount_percent DECIMAL(5,2) DEFAULT 0,
stock_quantity INT DEFAULT 0,
-- Ensure positive price
CONSTRAINT chk_price_positive CHECK (price > 0),
-- Ensure valid discount range
CONSTRAINT chk_discount_range CHECK (discount_percent >= 0 AND discount_percent <= 100),
-- Ensure non-negative stock
CONSTRAINT chk_stock_non_negative CHECK (stock_quantity >= 0)
);
CREATE TABLE users (
user_id INT PRIMARY KEY,
email VARCHAR(100) NOT NULL,
age INT,
status ENUM('active', 'inactive', 'suspended') DEFAULT 'active',
-- Ensure valid email format (basic check)
CONSTRAINT chk_email_format CHECK (email LIKE '%@%.%'),
-- Ensure reasonable age range
CONSTRAINT chk_age_range CHECK (age IS NULL OR (age >= 13 AND age <= 120))
);Performance Optimization
Denormalization for Performance
Sometimes breaking normalization rules improves performance:
-- Normalized approach (multiple JOINs required)
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE
);
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
-- Denormalized approach (duplicate customer name for performance)
CREATE TABLE orders_denormalized (
order_id INT PRIMARY KEY,
customer_id INT,
customer_name VARCHAR(100), -- Denormalized: duplicated from customers table
customer_email VARCHAR(100), -- Denormalized: duplicated from customers table
order_date DATE,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);Partitioning
-- Range partitioning by date (MySQL)
CREATE TABLE sales (
sale_id INT,
sale_date DATE,
amount DECIMAL(10,2),
customer_id INT
) PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p2022 VALUES LESS THAN (2023),
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Hash partitioning for even distribution
CREATE TABLE user_sessions (
session_id VARCHAR(128) PRIMARY KEY,
user_id INT,
created_at TIMESTAMP,
expires_at TIMESTAMP
) PARTITION BY HASH(user_id) PARTITIONS 4;Security Considerations
User Management and Permissions
-- Create application-specific users with minimal privileges
CREATE USER 'app_read'@'localhost' IDENTIFIED BY 'secure_password';
CREATE USER 'app_write'@'localhost' IDENTIFIED BY 'secure_password';
-- Grant specific permissions
GRANT SELECT ON ecommerce.* TO 'app_read'@'localhost';
GRANT SELECT, INSERT, UPDATE ON ecommerce.orders TO 'app_write'@'localhost';
GRANT SELECT, INSERT, UPDATE ON ecommerce.order_items TO 'app_write'@'localhost';
-- Row-level security (PostgreSQL example)
CREATE POLICY user_data_policy ON users
FOR ALL TO application_role
USING (user_id = current_setting('app.current_user_id')::INT);Data Encryption
-- Column-level encryption (MySQL)
CREATE TABLE sensitive_data (
id INT PRIMARY KEY,
ssn VARBINARY(255), -- Store encrypted
credit_card VARBINARY(255) -- Store encrypted
);
-- Insert encrypted data
INSERT INTO sensitive_data (id, ssn, credit_card) VALUES (
1,
AES_ENCRYPT('123-45-6789', 'encryption_key'),
AES_ENCRYPT('1234-5678-9012-3456', 'encryption_key')
);
-- Retrieve decrypted data
SELECT
id,
AES_DECRYPT(ssn, 'encryption_key') as decrypted_ssn,
AES_DECRYPT(credit_card, 'encryption_key') as decrypted_card
FROM sensitive_data;Schema Versioning and Migration
Version Control for Database Schema
-- Migration: 001_create_initial_tables.sql
CREATE TABLE schema_versions (
version_number INT PRIMARY KEY,
description VARCHAR(255),
applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO schema_versions VALUES (1, 'Initial schema creation', NOW());
-- Migration: 002_add_user_preferences.sql
CREATE TABLE user_preferences (
user_id INT PRIMARY KEY,
theme VARCHAR(20) DEFAULT 'light',
language CHAR(2) DEFAULT 'en',
timezone VARCHAR(50) DEFAULT 'UTC',
FOREIGN KEY (user_id) REFERENCES users(user_id)
);
INSERT INTO schema_versions VALUES (2, 'Add user preferences table', NOW());
-- Migration: 003_add_email_verification.sql
ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT FALSE;
ALTER TABLE users ADD COLUMN verification_token VARCHAR(64);
INSERT INTO schema_versions VALUES (3, 'Add email verification columns', NOW());Common Design Anti-Patterns to Avoid
Design Anti-Patterns
Avoid these common database design mistakes: - EAV (Entity-Attribute-Value): Makes queries complex and slow - One True Lookup Table: Single table for all lookup values - Polymorphic Associations: Foreign keys that can reference multiple tables - Storing JSON/XML when relational structure would work better - Not using appropriate data types (storing numbers as strings)
-- ❌ EAV Anti-pattern
CREATE TABLE entity_attributes (
entity_id INT,
attribute_name VARCHAR(50),
attribute_value TEXT
);
-- ✅ Better: Proper table structure
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2),
weight DECIMAL(8,2),
color VARCHAR(30)
);Real-World Database Design Example
Let's put it all together with a comprehensive e-commerce database design:
-- Users and Authentication
CREATE TABLE users (
user_id INT AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
email_verified BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
is_active BOOLEAN DEFAULT TRUE
);
CREATE TABLE user_profiles (
user_id INT PRIMARY KEY,
first_name VARCHAR(100),
last_name VARCHAR(100),
phone VARCHAR(20),
date_of_birth DATE,
profile_picture_url VARCHAR(500),
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE
);
-- Address Management
CREATE TABLE addresses (
address_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
address_type ENUM('billing', 'shipping') NOT NULL,
street_address VARCHAR(255) NOT NULL,
city VARCHAR(100) NOT NULL,
state_province VARCHAR(100),
postal_code VARCHAR(20),
country CHAR(2) NOT NULL,
is_default BOOLEAN DEFAULT FALSE,
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE,
INDEX idx_user_address (user_id, address_type)
);
-- Product Catalog
CREATE TABLE categories (
category_id INT AUTO_INCREMENT PRIMARY KEY,
parent_category_id INT,
category_name VARCHAR(100) NOT NULL,
slug VARCHAR(100) NOT NULL UNIQUE,
description TEXT,
is_active BOOLEAN DEFAULT TRUE,
FOREIGN KEY (parent_category_id) REFERENCES categories(category_id),
INDEX idx_parent_category (parent_category_id),
INDEX idx_slug (slug)
);
CREATE TABLE products (
product_id INT AUTO_INCREMENT PRIMARY KEY,
sku VARCHAR(100) NOT NULL UNIQUE,
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL,
cost DECIMAL(10,2),
weight DECIMAL(8,2),
dimensions VARCHAR(50),
stock_quantity INT NOT NULL DEFAULT 0,
min_stock_level INT DEFAULT 0,
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
CONSTRAINT chk_price_positive CHECK (price > 0),
CONSTRAINT chk_stock_non_negative CHECK (stock_quantity >= 0),
INDEX idx_sku (sku),
INDEX idx_name_search (name),
INDEX idx_price (price),
INDEX idx_stock (stock_quantity)
);
CREATE TABLE product_categories (
product_id INT,
category_id INT,
PRIMARY KEY (product_id, category_id),
FOREIGN KEY (product_id) REFERENCES products(product_id) ON DELETE CASCADE,
FOREIGN KEY (category_id) REFERENCES categories(category_id) ON DELETE CASCADE
);
-- Order Management
CREATE TABLE order_statuses (
status_id TINYINT PRIMARY KEY,
status_name VARCHAR(50) NOT NULL,
description VARCHAR(255)
);
INSERT INTO order_statuses VALUES
(1, 'pending', 'Order created, awaiting payment'),
(2, 'paid', 'Payment received, processing order'),
(3, 'shipped', 'Order shipped to customer'),
(4, 'delivered', 'Order successfully delivered'),
(5, 'cancelled', 'Order cancelled'),
(6, 'refunded', 'Order refunded');
CREATE TABLE orders (
order_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
order_number VARCHAR(50) NOT NULL UNIQUE,
status_id TINYINT NOT NULL DEFAULT 1,
-- Pricing
subtotal DECIMAL(10,2) NOT NULL,
tax_amount DECIMAL(10,2) DEFAULT 0,
shipping_amount DECIMAL(10,2) DEFAULT 0,
discount_amount DECIMAL(10,2) DEFAULT 0,
total_amount DECIMAL(10,2) NOT NULL,
-- Addresses (denormalized for historical accuracy)
billing_address JSON,
shipping_address JSON,
-- Timestamps
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
shipped_at TIMESTAMP NULL,
delivered_at TIMESTAMP NULL,
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (status_id) REFERENCES order_statuses(status_id),
INDEX idx_user_orders (user_id, created_at),
INDEX idx_order_number (order_number),
INDEX idx_status (status_id),
INDEX idx_created_date (created_at)
);
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
total_price DECIMAL(10,2) NOT NULL,
-- Product info snapshot (for historical accuracy)
product_name VARCHAR(255) NOT NULL,
product_sku VARCHAR(100) NOT NULL,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(product_id),
CONSTRAINT chk_quantity_positive CHECK (quantity > 0),
CONSTRAINT chk_unit_price_positive CHECK (unit_price > 0)
);
-- Shopping Cart
CREATE TABLE cart_items (
user_id INT,
product_id INT,
quantity INT NOT NULL,
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (user_id, product_id),
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(product_id) ON DELETE CASCADE,
CONSTRAINT chk_cart_quantity_positive CHECK (quantity > 0)
);
-- Reviews and Ratings
CREATE TABLE product_reviews (
review_id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT NOT NULL,
user_id INT NOT NULL,
order_id INT, -- Optional: link to verified purchase
rating TINYINT NOT NULL,
title VARCHAR(255),
review_text TEXT,
is_verified_purchase BOOLEAN DEFAULT FALSE,
is_approved BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (product_id) REFERENCES products(product_id) ON DELETE CASCADE,
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE,
FOREIGN KEY (order_id) REFERENCES orders(order_id),
CONSTRAINT chk_rating_range CHECK (rating >= 1 AND rating <= 5),
CONSTRAINT uk_user_product_review UNIQUE (user_id, product_id),
INDEX idx_product_reviews (product_id, is_approved),
INDEX idx_user_reviews (user_id)
);
-- Inventory Tracking
CREATE TABLE inventory_transactions (
transaction_id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT NOT NULL,
transaction_type ENUM('in', 'out', 'adjustment') NOT NULL,
quantity_change INT NOT NULL,
reason VARCHAR(255),
reference_id INT, -- Could be order_id, return_id, etc.
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
created_by INT,
FOREIGN KEY (product_id) REFERENCES products(product_id),
FOREIGN KEY (created_by) REFERENCES users(user_id),
INDEX idx_product_transactions (product_id, created_at),
INDEX idx_transaction_type (transaction_type)
);Indexes for Performance
-- Composite indexes for common query patterns
CREATE INDEX idx_orders_user_status_date ON orders(user_id, status_id, created_at DESC);
CREATE INDEX idx_products_active_price ON products(is_active, price) WHERE is_active = TRUE;
CREATE INDEX idx_reviews_product_approved ON product_reviews(product_id, is_approved, rating);
-- Full-text search indexes
ALTER TABLE products ADD FULLTEXT(name, description);
ALTER TABLE product_reviews ADD FULLTEXT(title, review_text);
-- Covering indexes for common queries
CREATE INDEX idx_order_items_covering ON order_items(order_id) INCLUDE (product_id, quantity, unit_price, total_price);Views for Complex Queries
-- Product summary with aggregated data
CREATE VIEW product_summary AS
SELECT
p.product_id,
p.sku,
p.name,
p.price,
p.stock_quantity,
p.is_active,
COALESCE(AVG(pr.rating), 0) as avg_rating,
COUNT(pr.review_id) as review_count,
COALESCE(SUM(oi.quantity), 0) as total_sold
FROM products p
LEFT JOIN product_reviews pr ON p.product_id = pr.product_id AND pr.is_approved = TRUE
LEFT JOIN order_items oi ON p.product_id = oi.product_id
LEFT JOIN orders o ON oi.order_id = o.order_id AND o.status_id IN (3, 4) -- Shipped or delivered
GROUP BY p.product_id, p.sku, p.name, p.price, p.stock_quantity, p.is_active;
-- Customer lifetime value view
CREATE VIEW customer_ltv AS
SELECT
u.user_id,
u.email,
up.first_name,
up.last_name,
COUNT(o.order_id) as total_orders,
COALESCE(SUM(o.total_amount), 0) as lifetime_value,
COALESCE(AVG(o.total_amount), 0) as avg_order_value,
MAX(o.created_at) as last_order_date,
MIN(o.created_at) as first_order_date
FROM users u
LEFT JOIN user_profiles up ON u.user_id = up.user_id
LEFT JOIN orders o ON u.user_id = o.user_id AND o.status_id IN (2, 3, 4) -- Paid, shipped, delivered
WHERE u.is_active = TRUE
GROUP BY u.user_id, u.email, up.first_name, up.last_name;Stored Procedures for Business Logic
DELIMITER $
-- Procedure to create an order from cart
CREATE PROCEDURE create_order_from_cart(
IN p_user_id INT,
IN p_shipping_address_id INT,
IN p_billing_address_id INT,
OUT p_order_id INT
)
BEGIN
DECLARE v_subtotal DECIMAL(10,2) DEFAULT 0;
DECLARE v_tax_rate DECIMAL(5,4) DEFAULT 0.0875; -- 8.75% tax
DECLARE v_shipping_rate DECIMAL(10,2) DEFAULT 9.99;
DECLARE v_total DECIMAL(10,2);
DECLARE v_order_number VARCHAR(50);
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
ROLLBACK;
RESIGNAL;
END;
START TRANSACTION;
-- Calculate subtotal from cart
SELECT SUM(ci.quantity * p.price) INTO v_subtotal
FROM cart_items ci
JOIN products p ON ci.product_id = p.product_id
WHERE ci.user_id = p_user_id;
-- Generate order number
SET v_order_number = CONCAT('ORD', YEAR(NOW()), LPAD(FLOOR(RAND() * 10000), 4, '0'));
-- Calculate total
SET v_total = v_subtotal + (v_subtotal * v_tax_rate) + v_shipping_rate;
-- Create order
INSERT INTO orders (
user_id, order_number, subtotal, tax_amount,
shipping_amount, total_amount, shipping_address, billing_address
) VALUES (
p_user_id, v_order_number, v_subtotal,
v_subtotal * v_tax_rate, v_shipping_rate, v_total,
(SELECT JSON_OBJECT('address_id', address_id, 'street', street_address, 'city', city, 'state', state_province, 'postal_code', postal_code, 'country', country) FROM addresses WHERE address_id = p_shipping_address_id),
(SELECT JSON_OBJECT('address_id', address_id, 'street', street_address, 'city', city, 'state', state_province, 'postal_code', postal_code, 'country', country) FROM addresses WHERE address_id = p_billing_address_id)
);
SET p_order_id = LAST_INSERT_ID();
-- Create order items from cart
INSERT INTO order_items (order_id, product_id, quantity, unit_price, total_price, product_name, product_sku)
SELECT
p_order_id,
ci.product_id,
ci.quantity,
p.price,
ci.quantity * p.price,
p.name,
p.sku
FROM cart_items ci
JOIN products p ON ci.product_id = p.product_id
WHERE ci.user_id = p_user_id;
-- Update inventory
INSERT INTO inventory_transactions (product_id, transaction_type, quantity_change, reason, reference_id)
SELECT
ci.product_id,
'out',
-ci.quantity,
'Order created',
p_order_id
FROM cart_items ci
WHERE ci.user_id = p_user_id;
-- Update product stock
UPDATE products p
JOIN cart_items ci ON p.product_id = ci.product_id
SET p.stock_quantity = p.stock_quantity - ci.quantity
WHERE ci.user_id = p_user_id;
-- Clear cart
DELETE FROM cart_items WHERE user_id = p_user_id;
COMMIT;
END$
DELIMITER ;Performance Monitoring Queries
-- Identify slow-performing products (high cart abandonment)
SELECT
p.product_id,
p.name,
COUNT(DISTINCT ci.user_id) as added_to_cart,
COUNT(DISTINCT oi.order_id) as actually_purchased,
ROUND((COUNT(DISTINCT oi.order_id) / COUNT(DISTINCT ci.user_id)) * 100, 2) as conversion_rate
FROM products p
LEFT JOIN cart_items ci ON p.product_id = ci.product_id
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.name
HAVING added_to_cart > 10
ORDER BY conversion_rate ASC;
-- Monitor inventory levels
SELECT
p.product_id,
p.sku,
p.name,
p.stock_quantity,
p.min_stock_level,
CASE
WHEN p.stock_quantity <= 0 THEN 'Out of Stock'
WHEN p.stock_quantity <= p.min_stock_level THEN 'Low Stock'
ELSE 'In Stock'
END as stock_status,
AVG(oi.quantity) as avg_daily_sales
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
LEFT JOIN orders o ON oi.order_id = o.order_id
AND o.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
AND o.status_id IN (2, 3, 4)
WHERE p.is_active = TRUE
GROUP BY p.product_id, p.sku, p.name, p.stock_quantity, p.min_stock_level
ORDER BY stock_status, p.stock_quantity ASC;Database Maintenance Best Practices
Regular Maintenance Tasks
-- Analyze and optimize tables
ANALYZE TABLE products, orders, order_items;
OPTIMIZE TABLE cart_items, inventory_transactions;
-- Check for unused indexes
SELECT
s.table_name,
s.index_name,
s.cardinality
FROM information_schema.statistics s
LEFT JOIN (
SELECT DISTINCT table_name, index_name
FROM information_schema.statistics
WHERE table_schema = 'your_database'
) used_indexes ON s.table_name = used_indexes.table_name
AND s.index_name = used_indexes.index_name
WHERE s.table_schema = 'your_database'
AND used_indexes.index_name IS NULL;
-- Archive old data
CREATE TABLE orders_archive LIKE orders;
INSERT INTO orders_archive
SELECT * FROM orders
WHERE created_at < DATE_SUB(NOW(), INTERVAL 2 YEAR);
-- Clean up old cart items (older than 30 days)
DELETE FROM cart_items
WHERE updated_at < DATE_SUB(NOW(), INTERVAL 30 DAY);Summary
Effective database design is crucial for building scalable, maintainable applications. Key principles include:
Normalization: Reduce redundancy and maintain data integrity while balancing performance needs.
Indexing Strategy: Create indexes strategically based on query patterns, not just primary keys.
Data Integrity: Use constraints, foreign keys, and validation to maintain data quality.
Security: Implement proper access controls, encryption for sensitive data, and audit trails.
Performance: Consider denormalization, partitioning, and caching strategies for high-traffic applications.
Maintainability: Use consistent naming conventions, document your schema, and plan for evolution.
Remember that database design is often about trade-offs. Perfect normalization might hurt performance, while denormalization might complicate maintenance. Always design with your specific use case, scale requirements, and team capabilities in mind.
Next Steps
You now have a comprehensive understanding of SQL and database design. Continue your learning journey by exploring:
Last updated on