聚合函数
聚合函数用于对一组值进行计算,返回单个结果值。它们在数据分析和统计中非常重要。
常用聚合函数
COUNT
计算行数:
sql
-- 计算所有行数
SELECT COUNT(*) FROM users;
-- 计算非 NULL 值的行数
SELECT COUNT(email) FROM users;
-- 计算不同的值
SELECT COUNT(DISTINCT city) FROM users;SUM
计算数值列的总和:
sql
-- 计算所有订单的总金额
SELECT SUM(total_amount) FROM orders;
-- 计算特定用户的订单总额
SELECT SUM(total_amount) FROM orders WHERE user_id = 1;AVG
计算数值列的平均值:
sql
-- 计算用户的平均年龄
SELECT AVG(age) FROM users;
-- 计算订单的平均金额
SELECT AVG(total_amount) FROM orders;MIN 和 MAX
查找最小和最大值:
sql
-- 查找最小年龄
SELECT MIN(age) FROM users;
-- 查找最大年龄
SELECT MAX(age) FROM users;
-- 查找最早的订单
SELECT MIN(order_date) FROM orders;分组聚合
GROUP BY 基础
按列分组进行聚合:
sql
-- 按城市分组,统计每个城市的用户数
SELECT
city,
COUNT(*) AS user_count
FROM users
GROUP BY city;结果:
| city | user_count |
|---|---|
| 北京 | 150 |
| 上海 | 120 |
| 广州 | 80 |
多列分组
sql
-- 按城市和性别分组
SELECT
city,
gender,
COUNT(*) AS user_count
FROM users
GROUP BY city, gender;使用 HAVING 过滤分组
HAVING 用于过滤聚合结果:
sql
-- 查找用户数大于 100 的城市
SELECT
city,
COUNT(*) AS user_count
FROM users
GROUP BY city
HAVING COUNT(*) > 100;实战示例
示例 1:销售统计
sql
SELECT
DATE_FORMAT(order_date, '%Y-%m') AS month,
COUNT(*) AS order_count,
SUM(total_amount) AS total_sales,
AVG(total_amount) AS avg_order_value
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY month;示例 2:用户行为分析
sql
SELECT
u.id AS user_id,
u.username,
COUNT(o.id) AS order_count,
SUM(o.total_amount) AS total_spent,
AVG(o.total_amount) AS avg_order_value,
MAX(o.order_date) AS last_order_date
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at >= '2024-01-01'
GROUP BY u.id
ORDER BY total_spent DESC;性能优化
1. 使用 WHERE 而不是 HAVING
sql
-- 不推荐
SELECT city, COUNT(*)
FROM users
GROUP BY city
HAVING city = '北京';
-- 推荐
SELECT city, COUNT(*)
FROM users
WHERE city = '北京'
GROUP BY city;2. 为分组列创建索引
sql
CREATE INDEX idx_user_city ON users(city);小结
聚合函数是数据分析的基础工具,掌握它们对于数据统计和分析至关重要。
下一步: 学习 GROUP BY 分组