PostgreSQL 函数
简介
PostgreSQL 提供了一套丰富的内置函数,用于数据操作、计算和转换。函数可以在 SELECT 语句、WHERE 子句和其他 SQL 上下文中使用。
函数类别
- 聚合函数 - 对行集进行操作
- 字符串函数 - 文本操作
- 数值函数 - 数学运算
- 日期/时间函数 - 日期和时间操作
- 条件函数 - 条件逻辑
- 窗口函数 - 跨行集的计算
- JSON 函数 - JSON 数据操作
- 数组函数 - 数组操作
- 类型转换函数 - 数据类型转换
聚合函数
聚合函数从一组输入值计算单个结果。
常用聚合函数
| 函数 | 描述 | 示例 |
|---|---|---|
| COUNT() | 计数行数 | COUNT(*) |
| SUM() | 求和 | SUM(amount) |
| AVG() | 平均值 | AVG(price) |
| MIN() | 最小值 | MIN(age) |
| MAX() | 最大值 | MAX(salary) |
sql
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
product VARCHAR(100),
quantity INTEGER,
price NUMERIC(10,2),
sale_date DATE
);
INSERT INTO sales (product, quantity, price, sale_date) VALUES
('笔记本电脑', 5, 999.99, '2024-01-15'),
('鼠标', 20, 29.99, '2024-01-16'),
('键盘', 15, 79.99, '2024-01-17'),
('显示器', 8, 299.99, '2024-01-18');
-- 聚合示例
SELECT
COUNT(*) AS total_sales,
SUM(quantity) AS total_quantity,
AVG(price) AS avg_price,
MIN(price) AS min_price,
MAX(price) AS max_price,
SUM(quantity * price) AS total_revenue
FROM sales;高级聚合函数
sql
-- STRING_AGG: 连接字符串
SELECT STRING_AGG(product, ', ') AS products FROM sales;
-- ARRAY_AGG: 聚合到数组
SELECT ARRAY_AGG(product) AS product_array FROM sales;
-- COUNT DISTINCT: 计数唯一值
SELECT COUNT(DISTINCT product) AS unique_products FROM sales;
-- 带 GROUP BY 的聚合
SELECT
DATE_TRUNC('month', sale_date) AS month,
COUNT(*) AS sales_count,
SUM(quantity * price) AS monthly_revenue
FROM sales
GROUP BY DATE_TRUNC('month', sale_date);字符串函数
字符串函数操作文本数据。
常用字符串函数
| 函数 | 描述 | 示例 |
|---|---|---|
| LENGTH() | 字符串长度 | LENGTH('hello') → 5 |
| UPPER() | 转换为大写 | UPPER('hello') → 'HELLO' |
| LOWER() | 转换为小写 | LOWER('HELLO') → 'hello' |
| TRIM() | 移除空格 | TRIM(' text ') → 'text' |
| SUBSTRING() | 提取子字符串 | SUBSTRING('hello', 1, 3) → 'hel' |
| CONCAT() | 连接字符串 | CONCAT('a', 'b') → 'ab' |
| REPLACE() | 替换子字符串 | REPLACE('hello', 'l', 'r') → 'herro' |
sql
CREATE TABLE users (
id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(255)
);
INSERT INTO users (first_name, last_name, email) VALUES
('张', '三', 'zhang.san@example.com'),
('李', '四', 'LI.SI@EXAMPLE.COM'),
(' 王 ', '五', 'wang@example.com');
-- 字符串函数示例
SELECT
first_name,
last_name,
CONCAT(first_name, last_name) AS full_name,
UPPER(first_name) AS upper_name,
LOWER(email) AS lower_email,
LENGTH(first_name) AS name_length,
TRIM(first_name) AS trimmed_name,
SUBSTRING(email, 1, POSITION('@' IN email) - 1) AS username,
REPLACE(email, '@example.com', '@newdomain.com') AS new_email
FROM users;高级字符串函数
sql
-- LEFT 和 RIGHT: 从开头/结尾提取
SELECT
LEFT('PostgreSQL', 4) AS left_part, -- 'Post'
RIGHT('PostgreSQL', 3) AS right_part; -- 'SQL'
-- LPAD 和 RPAD: 填充字符串
SELECT
LPAD('42', 5, '0') AS padded_left, -- '00042'
RPAD('42', 5, '0') AS padded_right; -- '42000'
-- SPLIT_PART: 分割字符串
SELECT SPLIT_PART('one,two,three', ',', 2) AS second_part; -- 'two'
-- REGEXP_REPLACE: 正则表达式替换
SELECT REGEXP_REPLACE('Hello 123 World', '\d+', 'XXX') AS replaced; -- 'Hello XXX World'
-- INITCAP: 每个单词首字母大写
SELECT INITCAP('hello world') AS capitalized; -- 'Hello World'数值函数
数值函数执行数学运算。
常用数值函数
| 函数 | 描述 | 示例 |
|---|---|---|
| ABS() | 绝对值 | ABS(-5) → 5 |
| ROUND() | 四舍五入 | ROUND(3.7) → 4 |
| CEIL() | 向上取整 | CEIL(3.2) → 4 |
| FLOOR() | 向下取整 | FLOOR(3.8) → 3 |
| POWER() | 幂运算 | POWER(2, 3) → 8 |
| SQRT() | 平方根 | SQRT(16) → 4 |
| MOD() | 取模 | MOD(10, 3) → 1 |
sql
CREATE TABLE measurements (
id SERIAL PRIMARY KEY,
value NUMERIC(10,4)
);
INSERT INTO measurements (value) VALUES
(3.14159), (-5.678), (10.5), (99.999);
-- 数值函数示例
SELECT
value,
ABS(value) AS absolute,
ROUND(value, 2) AS rounded,
CEIL(value) AS ceiling,
FLOOR(value) AS floor,
POWER(value, 2) AS squared,
SQRT(ABS(value)) AS square_root,
MOD(value::INTEGER, 3) AS modulo
FROM measurements;
-- 数学运算
SELECT
PI() AS pi_value,
POWER(2, 10) AS two_to_ten,
SQRT(144) AS sqrt_144,
EXP(1) AS e_value,
LN(10) AS natural_log,
LOG(100) AS log_base_10;日期/时间函数
日期和时间函数操作时间数据。
常用日期/时间函数
| 函数 | 描述 | 示例 |
|---|---|---|
| NOW() | 当前时间戳 | NOW() |
| CURRENT_DATE | 当前日期 | CURRENT_DATE |
| CURRENT_TIME | 当前时间 | CURRENT_TIME |
| AGE() | 计算年龄 | AGE(timestamp) |
| EXTRACT() | 提取部分 | EXTRACT(YEAR FROM date) |
| DATE_TRUNC() | 截断到单位 | DATE_TRUNC('month', date) |
sql
CREATE TABLE events (
id SERIAL PRIMARY KEY,
event_name VARCHAR(100),
event_date TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
INSERT INTO events (event_name, event_date) VALUES
('会议', '2024-06-15 09:00:00'),
('研讨会', '2024-07-20 14:30:00'),
('网络研讨会', '2024-08-10 10:00:00');
-- 日期/时间函数示例
SELECT
event_name,
event_date,
NOW() AS current_time,
CURRENT_DATE AS today,
AGE(event_date, NOW()) AS time_until_event,
EXTRACT(YEAR FROM event_date) AS year,
EXTRACT(MONTH FROM event_date) AS month,
EXTRACT(DAY FROM event_date) AS day,
DATE_TRUNC('month', event_date) AS month_start,
TO_CHAR(event_date, 'YYYY-MM-DD HH24:MI:SS') AS formatted_date
FROM events;
-- 日期算术
SELECT
CURRENT_DATE AS today,
CURRENT_DATE + INTERVAL '7 days' AS next_week,
CURRENT_DATE - INTERVAL '1 month' AS last_month,
CURRENT_DATE + INTERVAL '1 year' AS next_year;高级日期/时间函数
sql
-- DATE_PART: 提取特定部分
SELECT
DATE_PART('year', NOW()) AS year,
DATE_PART('quarter', NOW()) AS quarter,
DATE_PART('week', NOW()) AS week,
DATE_PART('dow', NOW()) AS day_of_week; -- 0=星期日
-- MAKE_DATE 和 MAKE_TIMESTAMP
SELECT
MAKE_DATE(2024, 6, 15) AS created_date,
MAKE_TIMESTAMP(2024, 6, 15, 10, 30, 0) AS created_timestamp;
-- JUSTIFY_DAYS 和 JUSTIFY_HOURS: 规范化间隔
SELECT
JUSTIFY_DAYS(INTERVAL '35 days') AS normalized_days, -- '1 mon 5 days'
JUSTIFY_HOURS(INTERVAL '27 hours') AS normalized_hours; -- '1 day 03:00:00'条件函数
条件函数提供 if-then-else 逻辑。
CASE 表达式
sql
SELECT
product,
price,
CASE
WHEN price < 50 THEN '经济型'
WHEN price < 200 THEN '中档'
ELSE '高档'
END AS price_category
FROM sales;COALESCE 函数
返回第一个非 NULL 值。
sql
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
discount_price NUMERIC(10,2),
regular_price NUMERIC(10,2)
);
INSERT INTO products (name, discount_price, regular_price) VALUES
('产品 A', NULL, 99.99),
('产品 B', 79.99, 99.99),
('产品 C', NULL, 149.99);
-- COALESCE 示例
SELECT
name,
COALESCE(discount_price, regular_price) AS final_price,
COALESCE(discount_price, regular_price, 0) AS price_with_default
FROM products;NULLIF 函数
如果两个值相等则返回 NULL。
sql
SELECT
NULLIF(10, 10) AS result1, -- NULL
NULLIF(10, 5) AS result2; -- 10
-- 避免除以零
SELECT
quantity,
total_amount / NULLIF(quantity, 0) AS price_per_unit
FROM sales;GREATEST 和 LEAST
sql
SELECT
GREATEST(10, 20, 5, 30) AS max_value, -- 30
LEAST(10, 20, 5, 30) AS min_value; -- 5窗口函数
窗口函数对与当前行相关的一组行执行计算。
常用窗口函数
sql
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary NUMERIC(10,2)
);
INSERT INTO employees (name, department, salary) VALUES
('张三', '销售', 60000),
('李四', '销售', 55000),
('王五', 'IT', 70000),
('赵六', 'IT', 75000),
('钱七', 'HR', 50000);
-- ROW_NUMBER: 分配唯一行号
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;
-- RANK 和 DENSE_RANK
SELECT
name,
department,
salary,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
-- PARTITION BY: 按组窗口
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
AVG(salary) OVER (PARTITION BY department) AS dept_avg_salary
FROM employees;高级窗口函数
sql
-- LAG 和 LEAD: 访问上一行/下一行
SELECT
name,
salary,
LAG(salary) OVER (ORDER BY salary) AS previous_salary,
LEAD(salary) OVER (ORDER BY salary) AS next_salary,
salary - LAG(salary) OVER (ORDER BY salary) AS salary_diff
FROM employees;
-- FIRST_VALUE 和 LAST_VALUE
SELECT
name,
department,
salary,
FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary) AS lowest_in_dept,
LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS highest_in_dept
FROM employees;
-- NTILE: 分成桶
SELECT
name,
salary,
NTILE(4) OVER (ORDER BY salary) AS quartile
FROM employees;JSON 函数
PostgreSQL 提供广泛的 JSON 操作函数。
JSON 查询函数
sql
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
attributes JSONB
);
INSERT INTO products (name, attributes) VALUES
('笔记本电脑', '{"brand": "Dell", "ram": "16GB", "storage": "512GB", "price": 999}'),
('手机', '{"brand": "Apple", "model": "iPhone 14", "storage": "256GB", "price": 899}'),
('平板', '{"brand": "Samsung", "screen": "10.5 inch", "storage": "128GB", "price": 499}');
-- 提取 JSON 值
SELECT
name,
attributes->>'brand' AS brand,
attributes->>'storage' AS storage,
(attributes->>'price')::NUMERIC AS price
FROM products;
-- JSON 路径查询
SELECT
name,
attributes->'brand' AS brand_json, -- 返回 JSON
attributes->>'brand' AS brand_text -- 返回文本
FROM products;
-- 检查键是否存在
SELECT name FROM products WHERE attributes ? 'model';
-- 包含运算符
SELECT name FROM products WHERE attributes @> '{"brand": "Apple"}';JSON 操作函数
sql
-- JSONB_SET: 更新 JSON 值
UPDATE products
SET attributes = JSONB_SET(attributes, '{price}', '899')
WHERE name = '笔记本电脑';
-- JSONB_INSERT: 插入新键
UPDATE products
SET attributes = JSONB_INSERT(attributes, '{warranty}', '"2 years"')
WHERE name = '笔记本电脑';
-- 连接 JSON
UPDATE products
SET attributes = attributes || '{"color": "black"}'
WHERE name = '手机';
-- 移除键
UPDATE products
SET attributes = attributes - 'model'
WHERE name = '手机';数组函数
PostgreSQL 提供处理数组的函数。
sql
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
tags TEXT[],
prices NUMERIC[]
);
INSERT INTO products (name, tags, prices) VALUES
('笔记本电脑', ARRAY['电子产品', '计算机', '便携'], ARRAY[999, 1299, 1599]),
('书籍', ARRAY['小说', '畅销书'], ARRAY[9.99, 12.99]);
-- 数组函数
SELECT
name,
tags,
array_length(tags, 1) AS tag_count,
tags[1] AS first_tag,
array_append(tags, 'new') AS with_new_tag,
array_prepend('featured', tags) AS with_featured,
array_cat(tags, ARRAY['sale']) AS concatenated,
'电子产品' = ANY(tags) AS has_electronics
FROM products;
-- UNNEST: 将数组转换为行
SELECT
name,
UNNEST(tags) AS tag
FROM products;
-- 数组聚合
SELECT
ARRAY_AGG(name) AS all_products,
ARRAY_AGG(name ORDER BY name) AS sorted_products
FROM products;类型转换函数
在不同数据类型之间转换。
sql
-- CAST 和 :: 操作符
SELECT
CAST('123' AS INTEGER) AS int_value,
'123'::INTEGER AS int_value2,
CAST('2024-01-01' AS DATE) AS date_value,
'2024-01-01'::DATE AS date_value2;
-- TO_CHAR: 格式化输出
SELECT
TO_CHAR(NOW(), 'YYYY-MM-DD') AS formatted_date,
TO_CHAR(NOW(), 'HH24:MI:SS') AS formatted_time,
TO_CHAR(1234.56, '9,999.99') AS formatted_number;
-- TO_NUMBER: 解析数字
SELECT
TO_NUMBER('1,234.56', '9,999.99') AS parsed_number;
-- TO_DATE 和 TO_TIMESTAMP
SELECT
TO_DATE('2024-01-15', 'YYYY-MM-DD') AS parsed_date,
TO_TIMESTAMP('2024-01-15 10:30:00', 'YYYY-MM-DD HH24:MI:SS') AS parsed_timestamp;系统信息函数
获取数据库系统信息。
sql
-- 数据库信息
SELECT
VERSION() AS postgres_version,
CURRENT_DATABASE() AS current_db,
CURRENT_SCHEMA() AS current_schema,
CURRENT_USER AS current_user,
SESSION_USER AS session_user;
-- 连接信息
SELECT
INET_CLIENT_ADDR() AS client_ip,
INET_CLIENT_PORT() AS client_port,
INET_SERVER_ADDR() AS server_ip,
INET_SERVER_PORT() AS server_port;
-- 数据库大小
SELECT
PG_DATABASE_SIZE(CURRENT_DATABASE()) AS db_size_bytes,
PG_SIZE_PRETTY(PG_DATABASE_SIZE(CURRENT_DATABASE())) AS db_size_pretty;用户定义函数
在 PostgreSQL 中创建自定义函数。
基本函数
sql
-- 创建简单函数
CREATE OR REPLACE FUNCTION add_numbers(a INTEGER, b INTEGER)
RETURNS INTEGER AS $$
BEGIN
RETURN a + b;
END;
$$ LANGUAGE plpgsql;
-- 使用函数
SELECT add_numbers(5, 3); -- 返回 8带默认参数的函数
sql
CREATE OR REPLACE FUNCTION calculate_discount(
price NUMERIC,
discount_percent NUMERIC DEFAULT 10
)
RETURNS NUMERIC AS $$
BEGIN
RETURN price * (1 - discount_percent / 100);
END;
$$ LANGUAGE plpgsql;
-- 使用带和不带默认值
SELECT calculate_discount(100); -- 返回 90
SELECT calculate_discount(100, 20); -- 返回 80返回表的函数
sql
CREATE OR REPLACE FUNCTION get_high_salary_employees(min_salary NUMERIC)
RETURNS TABLE(
employee_name VARCHAR,
employee_salary NUMERIC,
department_name VARCHAR
) AS $$
BEGIN
RETURN QUERY
SELECT name, salary, department
FROM employees
WHERE salary >= min_salary
ORDER BY salary DESC;
END;
$$ LANGUAGE plpgsql;
-- 使用函数
SELECT * FROM get_high_salary_employees(60000);带控制流的函数
sql
CREATE OR REPLACE FUNCTION get_grade(score INTEGER)
RETURNS VARCHAR AS $$
BEGIN
IF score >= 90 THEN
RETURN 'A';
ELSIF score >= 80 THEN
RETURN 'B';
ELSIF score >= 70 THEN
RETURN 'C';
ELSIF score >= 60 THEN
RETURN 'D';
ELSE
RETURN 'F';
END IF;
END;
$$ LANGUAGE plpgsql;
SELECT get_grade(85); -- 返回 'B'最佳实践
使用适当的函数
- 为数据类型选择正确的函数
- 尽可能使用内置函数而不是自定义逻辑
性能考虑
- 避免在大表的 WHERE 子句中使用函数(阻止索引使用)
- 尽可能使用窗口函数而不是子查询
NULL 处理
- 始终考虑函数中的 NULL 值
- 适当使用 COALESCE 或 NULLIF
类型安全
- 使用显式类型转换避免错误
- 在用户定义函数中验证输入参数
文档
- 注释复杂函数
- 使用描述性的函数和参数名称
常见函数模式
模式 1:数据清理
sql
-- 清理和标准化电子邮件地址
SELECT
LOWER(TRIM(email)) AS clean_email,
REGEXP_REPLACE(phone, '[^0-9]', '', 'g') AS clean_phone
FROM users;模式 2:数据聚合
sql
-- 计算运行总计
SELECT
sale_date,
amount,
SUM(amount) OVER (ORDER BY sale_date) AS running_total
FROM sales;模式 3:数据转换
sql
-- 将 JSON 转换为列
SELECT
id,
attributes->>'brand' AS brand,
(attributes->>'price')::NUMERIC AS price,
ARRAY_LENGTH(STRING_TO_ARRAY(attributes->>'tags', ','), 1) AS tag_count
FROM products;总结
PostgreSQL 函数提供强大的数据操作能力:
- 聚合函数: COUNT、SUM、AVG、MIN、MAX
- 字符串函数: CONCAT、UPPER、LOWER、SUBSTRING、TRIM
- 数值函数: ROUND、CEIL、FLOOR、ABS、POWER
- 日期/时间函数: NOW、EXTRACT、DATE_TRUNC、AGE
- 条件函数: CASE、COALESCE、NULLIF
- 窗口函数: ROW_NUMBER、RANK、LAG、LEAD
- JSON 函数: ->、->>、@>、JSONB_SET
- 数组函数: ARRAY_AGG、UNNEST、array_length
- 用户定义函数: 使用 PL/pgSQL 的自定义逻辑
理解并有效使用这些函数对于 PostgreSQL 中的高效数据处理和分析至关重要。