Multi-Table Joins
In real applications, data is usually distributed across multiple tables. SQL's JOIN operation allows us to combine data from multiple tables for querying. This chapter will introduce various JOIN types and their use cases in detail.
Why Do We Need JOIN?
Data Normalization
To avoid data redundancy, we usually distribute data across multiple tables:
-- Users table
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(100)
);
-- Orders table
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT,
total_amount DECIMAL(10, 2),
order_date DATE,
FOREIGN KEY (user_id) REFERENCES users(id)
);Query Related Data
To query users and their order information, we need to use JOIN:
SELECT
users.name,
orders.total_amount,
orders.order_date
FROM users
JOIN orders ON users.id = orders.user_id;Prepare Sample Data
Create Tables
-- Users table
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(50),
city VARCHAR(50)
);
-- Orders table
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT,
product VARCHAR(50),
amount DECIMAL(10, 2)
);Insert Data
-- Insert user data
INSERT INTO users VALUES
(1, 'Zhang San', 'Beijing'),
(2, 'Li Si', 'Shanghai'),
(3, 'Wang Wu', 'Guangzhou'),
(4, 'Zhao Liu', 'Shenzhen');
-- Insert order data
INSERT INTO orders VALUES
(101, 1, 'Laptop', 5000),
(102, 1, 'Mouse', 50),
(103, 2, 'Keyboard', 200),
(104, 3, 'Monitor', 1500);Data Overview:
users table:
| id | name | city |
|---|---|---|
| 1 | Zhang San | Beijing |
| 2 | Li Si | Shanghai |
| 3 | Wang Wu | Guangzhou |
| 4 | Zhao Liu | Shenzhen |
orders table:
| id | user_id | product | amount |
|---|---|---|---|
| 101 | 1 | Laptop | 5000 |
| 102 | 1 | Mouse | 50 |
| 103 | 2 | Keyboard | 200 |
| 104 | 3 | Monitor | 1500 |
INNER JOIN
Basic Syntax
SELECT columns
FROM table1
INNER JOIN table2 ON table1.column = table2.column;Example
SELECT
users.name,
users.city,
orders.product,
orders.amount
FROM users
INNER JOIN orders ON users.id = orders.user_id;Result:
| name | city | product | amount |
|---|---|---|---|
| Zhang San | Beijing | Laptop | 5000 |
| Zhang San | Beijing | Mouse | 50 |
| Li Si | Shanghai | Keyboard | 200 |
| Wang Wu | Guangzhou | Monitor | 1500 |
Note: Zhao Liu has no orders, so he is not in the result.
Shorthand Form
The INNER keyword can be omitted:
SELECT users.name, orders.product
FROM users
JOIN orders ON users.id = orders.user_id;Using Table Aliases
SELECT
u.name,
o.product,
o.amount
FROM users u
JOIN orders o ON u.id = o.user_id;LEFT JOIN
Basic Syntax
SELECT columns
FROM table1
LEFT JOIN table2 ON table1.column = table2.column;Example
SELECT
users.name,
users.city,
orders.product,
orders.amount
FROM users
LEFT JOIN orders ON users.id = orders.user_id;Result:
| name | city | product | amount |
|---|---|---|---|
| Zhang San | Beijing | Laptop | 5000 |
| Zhang San | Beijing | Mouse | 50 |
| Li Si | Shanghai | Keyboard | 200 |
| Wang Wu | Guangzhou | Monitor | 1500 |
| Zhao Liu | Shenzhen | NULL | NULL |
Note: Zhao Liu has no orders but still appears in the result with NULL order information.
Find Users Without Orders
SELECT
users.name,
users.city
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE orders.id IS NULL;Result:
| name | city |
|---|---|
| Zhao Liu | Shenzhen |
RIGHT JOIN
Basic Syntax
SELECT columns
FROM table1
RIGHT JOIN table2 ON table1.column = table2.column;Example
SELECT
users.name,
orders.product,
orders.amount
FROM users
RIGHT JOIN orders ON users.id = orders.user_id;Explanation: RIGHT JOIN returns all records from the right table (orders), even if there are no matching records in the left table (users).
Note: In practice, LEFT JOIN is more commonly used. RIGHT JOIN can be implemented by swapping table positions with LEFT JOIN.
FULL OUTER JOIN
Basic Syntax
SELECT columns
FROM table1
FULL OUTER JOIN table2 ON table1.column = table2.column;Example
SELECT
users.name,
orders.product
FROM users
FULL OUTER JOIN orders ON users.id = orders.user_id;Explanation: Returns all records from both tables, filling unmatched places with NULL.
Note: MySQL does not support FULL OUTER JOIN, but it can be implemented using UNION:
SELECT users.name, orders.product
FROM users
LEFT JOIN orders ON users.id = orders.user_id
UNION
SELECT users.name, orders.product
FROM users
RIGHT JOIN orders ON users.id = orders.user_id;CROSS JOIN
Basic Syntax
SELECT columns
FROM table1
CROSS JOIN table2;Example
SELECT
users.name,
orders.product
FROM users
CROSS JOIN orders;Explanation: Returns the Cartesian product of two tables, i.e., every combination of each user with each order.
Result rows: 4 users × 4 orders = 16 rows
Use cases: Generate all possible combinations, such as shift schedules, test data, etc.
SELF JOIN
Concept
A table joins with itself, usually used to handle hierarchical relationships.
Example: Employees and Managers
-- Employees table
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50),
manager_id INT
);
INSERT INTO employees VALUES
(1, 'CEO Zhang', NULL),
(2, 'Manager Li', 1),
(3, 'Supervisor Wang', 2),
(4, 'Employee Zhao', 2);Query employees and their managers:
SELECT
e.name AS Employee,
m.name AS Manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;Result:
| Employee | Manager |
|---|---|
| CEO Zhang | NULL |
| Manager Li | CEO Zhang |
| Supervisor Wang | Manager Li |
| Employee Zhao | Manager Li |
Multi-Table Joins
Joining Three Tables
-- Add products table
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(50),
category VARCHAR(50)
);
-- Query user, order, and product information
SELECT
u.name AS User,
o.amount AS Amount,
p.name AS Product,
p.category AS Category
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN products p ON o.product_id = p.id;Join Order
-- First join users and orders, then join products
SELECT *
FROM users
JOIN orders ON users.id = orders.user_id
JOIN products ON orders.product_id = products.id;Join Conditions
Equi-Join
SELECT *
FROM users
JOIN orders ON users.id = orders.user_id;Non-Equi Join
-- Query orders with prices in a certain range
SELECT *
FROM orders o
JOIN price_ranges pr ON o.amount BETWEEN pr.min_price AND pr.max_price;Multi-Condition Join
SELECT *
FROM users u
JOIN orders o ON u.id = o.user_id
AND o.order_date >= '2024-01-01';Filtering with WHERE
Filter After JOIN
SELECT
u.name,
o.product,
o.amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;JOIN Condition vs WHERE Condition
-- Filter in JOIN condition (recommended)
SELECT *
FROM users u
LEFT JOIN orders o ON u.id = o.user_id AND o.amount > 100;
-- Filter in WHERE
SELECT *
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;Difference:
- Filtering in JOIN condition does not affect LEFT JOIN returning all records from the left table
- Filtering in WHERE affects the final result
Aggregation with JOIN
Count Orders per User
SELECT
u.name,
COUNT(o.id) AS Order_Count,
COALESCE(SUM(o.amount), 0) AS Total_Amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;Result:
| name | Order_Count | Total_Amount |
|---|---|---|
| Zhang San | 2 | 5050 |
| Li Si | 1 | 200 |
| Wang Wu | 1 | 1500 |
| Zhao Liu | 0 | 0 |
Performance Optimization
1. Use Indexes
-- Create index for join columns
CREATE INDEX idx_orders_user_id ON orders(user_id);2. Select Only Needed Columns
-- Not recommended
SELECT * FROM users JOIN orders ON users.id = orders.user_id;
-- Recommended
SELECT users.name, orders.product
FROM users JOIN orders ON users.id = orders.user_id;3. Filter Before Joining
-- Filter data first to reduce the amount of data to join
SELECT u.name, o.product
FROM users u
JOIN (
SELECT * FROM orders WHERE amount > 1000
) o ON u.id = o.user_id;4. Use EXPLAIN for Analysis
EXPLAIN SELECT *
FROM users
JOIN orders ON users.id = orders.user_id;Practical Examples
Example 1: Query User Order Statistics
SELECT
u.name AS User_Name,
u.city AS City,
COUNT(o.id) AS Order_Count,
COALESCE(SUM(o.amount), 0) AS Total_Spending,
COALESCE(AVG(o.amount), 0) AS Average_Spending
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name, u.city
ORDER BY Total_Spending DESC;Example 2: Query Recent Orders
SELECT
u.name,
o.product,
o.amount,
o.order_date
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.order_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
ORDER BY o.order_date DESC;Example 3: Query High-Value Customers
SELECT
u.name,
u.email,
SUM(o.amount) AS Total_Spending
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name, u.email
HAVING Total_Spending > 5000
ORDER BY Total_Spending DESC;Common Errors
Error 1: Forgetting ON Condition
-- Error: Missing ON condition
SELECT * FROM users JOIN orders;
-- Correct
SELECT * FROM users JOIN orders ON users.id = orders.user_id;Error 2: Column Name Ambiguity
-- Error: id column exists in both tables
SELECT id, name FROM users JOIN orders ON users.id = orders.user_id;
-- Correct: Qualify column names with table name or alias
SELECT users.id, users.name FROM users JOIN orders ON users.id = orders.user_id;Error 3: Confusing LEFT JOIN and INNER JOIN
-- If you want all users (including those without orders), use LEFT JOIN
SELECT * FROM users LEFT JOIN orders ON users.id = orders.user_id;
-- If you only want users with orders, use INNER JOIN
SELECT * FROM users JOIN orders ON users.id = orders.user_id;JOIN Type Comparison
| JOIN Type | Description | Use Cases |
|---|---|---|
| INNER JOIN | Returns matching records from both tables | Query related data |
| LEFT JOIN | Returns all records from left table | Query main table and its related data |
| RIGHT JOIN | Returns all records from right table | Less commonly used, can be replaced with LEFT JOIN |
| FULL OUTER JOIN | Returns all records from both tables | Query all data, including unmatched |
| CROSS JOIN | Returns Cartesian product | Generate all possible combinations |
| SELF JOIN | Table joins with itself | Handle hierarchical relationships |
Summary
This chapter introduced multi-table joins in SQL:
- INNER JOIN: Returns matching records from both tables
- LEFT JOIN: Returns all records from left table, NULL for unmatched right table
- RIGHT JOIN: Returns all records from right table, NULL for unmatched left table
- FULL OUTER JOIN: Returns all records from both tables
- CROSS JOIN: Returns Cartesian product
- SELF JOIN: Table joins with itself
- Multi-Table Joins: Can join multiple tables
- Performance Optimization: Use indexes, select only needed columns, filter before joining
Mastering JOIN operations is an important milestone in SQL learning, enabling you to handle complex multi-table queries.
Next Step: Learn Subqueries to learn how to nest queries within queries.