irpas技术客

【3万字总结】postgresql基础入门_poizxc2014_postgres入门

未知 6770

目录

🏈1、postgresql简介

👛1.1、PG起源

👛1.2、PG特点

👛1.3、许可证

🏈2、postgresql安装

👛2.1、yum安装

👛2.2、源码安装

👛2.3、交互式脚本安装

🏈3、postgresql参数解释

👛3.1、shared_buffers

👛3.2、temp_buffers:

👛3.3、work_mem

👛3.4、maintenance_work_mem

👛3.5、max_stack_depth:

🏈4、?postgresql初始化与启动关闭

👛4.1、创建操作系统用户

👛4.2、initdb数据库

👛4.3、启动数据库

👛4.4、查看数据库状态

👛4.5、关闭数据库

🏈5、元命令?

🏈6、数据类型

数字类型

字符类型

时间与日期

网络类型

🏈7、postgresql高级特性

👛7.1、with查询

👛7.2、returning 返回更新后的数据

👛7.3、upsert

👛7.4、数据抽样

👛7.5、聚合函数

👛7.6、窗口函数

🏈8、查看全部参数


👩?💻博客主页:大家好我是poizxc2014的博客主页

?欢迎关注🖱点赞🎀收藏?留言? 📖个人主页:poizxc2014的博客_CSDN博客-数据库,mysql,java领域博主

💻首发时间:🎞2022年05月01日🎠

🎨你做三四月的事,八九月就会有答案,一起加油吧 🔥💖🔮😘🔏🀄🎧如果觉得博主的文章还不错的话,👍请三连支持一下博主哦🤞

如果觉得博主的文章还不错的话,请三连支持一下博主哦

最后的话,在很多方面还做的不好的地方,欢迎大佬指正,一起学习哦,冲冲冲

🏈1、postgresql简介

👛1.1、PG起源 PostgreSQL是由PostgreSQL社区全球志愿者开发团队开发的开源对象-关系型数据库.它源于 UCBerkeley大学1977年的Ingres计划,这个项目是由著名的数据库科学家Michael Stonebraker(2015年图灵奖获得者)领导的在1994年,两个 UC Berkeley大学的研究生 Andrew Y和Jol!chen增加了一个SQL语言解释器来替代早先的基于 Ingres的QUEL系统,建立了Postgres95.为了反映数据库的新SQL查询语言特性, Postgres95在1996年重命名为 PostgreSQL,并第一次发行了以PostgreSQL命名的6.0版本在2005年, PostgreSQL发行了以原生方式运行在Windows系统下的8.0版本.2010年PostgreSQL9.0的发行,PostgreSQL进入了黄金发展阶段,目前, PostgreSQL最新的稳定版是 PostgreSQL10.==PostgreSQL是目前可免费获得的最高级的开源数据库.它非常稳定可靠== 👛1.2、PG特点 PostgreSQL几乎支持多种操作系统,包括各种 Linux发行版及多种UNX、类UNX系统以及 Windows系统,例如AIⅨX、BSD、 HP-UX、 SGIIRIX、 Mac os x、 Solaris、Tru64.它有丰富的编程接口,如C、C++、Go、Java、Perl、 Python、Ruby、Tcl和开放数据库连接(ODBC)的编程接口支持广泛的数据类型,数组、json、 json及几何类型,还可以使用SQL命令 CREATE TYPE创建自定义类型.支持大部分的SQL标准,可以支持复杂SQL查询、支持SQL子查询、 Window function,有非常丰富的统计函数和统计语法支持;支持主键、外键、触发器、视图、物化视图,还可以用多种语言来编写存储过程,例如C、Java、 python、R语言等.持并行计算和基于MVCC的多版本并发控制,支持同步、半同步、异步的流复制支持逻辑复制和订阅, Hot Standby,支持多种数据源的外部表( Foreign data wrappers),可以将其他数据源当作自己的数据表使用,例如 Oracle、 MySQL、 Informix、 SQLite、MS SQL Server等. 👛1.3、许可证

postgresql使用postgresql licence生命,类似于BSD和MIT许可证

🏈2、postgresql安装

postgresql一般有yum安装和编译安装。现在为了安装方便,提供了脚本的傻瓜式安装。

👛2.1、yum安装

查看安装源

# yum search postgresql …… postgresql.i686 : PostgreSQL client programs postgresql.x86_64 : PostgreSQL client programs …… postgresql-contrib.x86_64 : Extension modules distributed with PostgreSQL postgresql-debuginfo.x86_64 : Debug information for package postgresql postgresql-devel.i686 : PostgreSQL development header files and libraries postgresql-devel.x86_64 : PostgreSQL development header files and libraries …… postgresql-libs.i686 : The shared libraries required for any PostgreSQL clients postgresql-libs.x86_64 : The shared libraries required for any PostgreSQL clients …… postgresql-server.x86_64 : The programs needed to create and run a PostgreSQL server

可以看到postgresql是客户端,postgresql-server是服务器,postgresql-contrib是相应的扩展

安装

# yum install postgresql postgresql-server postgresql-contrib -y

安装之后,postgres可执行文件位于

# which postgres /usr/bin/postgres 👛2.2、源码安装

下载

进入PostgreSQL: Downloadshttps://www.postgresql.org/download?官网下载安装包

查看编译选项

./configure --help 选项说明--prefix= PREFIX指定安装目录,默认的安装目录为“/usr/ ocal/pgsql”--includedir=DIR指定C和C++的头文件目录,默认的安装目录为“ PREFIX/ Include。--with- report= PORTNUM指定初始化数据目录时的默认端口,这个值可以在安装之后进行修改需要重启数据库),修改它只在自行制作RPM包时有用,其他时候意义并不大。--with- blocksize= BLOCKSIZE指定数据文件的块大小,默认的是8kB,如果在OLAP场景下可以适当加这个值到32kB,以提高OLAP的性能,但在OLTP场景下建议使用&kB默认值。--with- segsize= SEGSIZE指定单个数据文件的大小,默认是1GB。--with-wal- blocksize= BLOCKSIZE指定WAL文件的块大小,默认是8kB。--with- wal-segsize= SEGSIZE指定单个WAL文件的大小,默认是16MB。

安装

使用gmake 和gmake install

👛2.3、交互式脚本安装

9.5之后提供了交互式的脚本安装,但11版本之后不再提供。参考链接:PostgreSQL: Linux downloads (Red Hat family)https://www.postgresql.org/download/linux/redhat/

# ./postgresql-10.13-1-linux-x64.run ---------------------------------------------------------------------------- Welcome to the PostgreSQL Setup Wizard. ---------------------------------------------------------------------------- Please specify the directory where PostgreSQL will be installed. Installation Directory [/opt/PostgreSQL/10]: ---------------------------------------------------------------------------- Select the components you want to install; clear the components you do not want to install. Click Next when you are ready to continue. PostgreSQL Server [Y/n] :Y pgAdmin 4 [Y/n] :Y Stack Builder [Y/n] :Y Command Line Tools [Y/n] :Y Is the selection above correct? [Y/n]: Y ---------------------------------------------------------------------------- Please select a directory under which to store your data. 🏈3、postgresql参数解释 👛3.1、shared_buffers

integer 类型,设置数据库服务器将使用的共享内存缓冲区数量,此缓冲区为缓冲数据块所用。此缓冲区是放在共享内存中的。每个缓冲区大小的典型值是 8K 字节,默认值通常是 4000,对于 8KB 的数据块则共享内存缓冲区大小为 400*8KB=32MB。这个数值必须大于 16,并且至少是 max_connections 数值的两倍。通常都会把此值设置的大一些,这样可以改进性能。一般设置为物理内存的 25%,若把 shared_buffers 设置的更大,如超过物理内存的 40%,就会发现缓冲的效果并不明显了,这是因为 PostgreSQL 是运行文件系统之上的,若文件系统也有缓存,将导致双缓存过多,造成负面影响。

👛3.2、temp_buffers:

integer 类型,设置每个数据库会话使用的临时缓冲区的最大数目。此本地缓冲区只用于访问临时表。临时缓冲区是在某个连接会话的服务进程中分配的,属于本地内存。临时缓冲区的大小也是按数据块大小分配的,默认是 1000,对于 8K 的数据块大小为 8MB。

👛3.3、work_mem

integer 类型,声明内部排序操作和 Hash 表在开始使用临时磁盘文件之前可使用的内存数目。这个内存也是本地内存,默认是 1MB。请注意对于复杂的查询,可能会同时并发运行好几个排序或散列(hash)操作;每个排序或散列操作都会分配这个参数声明的内存来存储中间数据,只有存不下才会使用临时文件。同样,好几个正在运行的会话可能会同时进行排序操作,因此使用的总内存量可能是 work_mem 的好几倍。 ORDER BY、DISTINCT 和 MERGE JOINS 都要用到排序操作。Hash 表在以 Hash join、Hash 为基础的聚集、以 Hash 为基础的 IN 子查询处理中都要用到。

👛3.4、maintenance_work_mem

integer 类型,声明在维护性操作(比如 CACUUM、CREATE INDEX、ALTER TABLE ADD FOREIGN KEY等)中使用的最大内存数。默认是 16 MB。在一个数据库会话里,只有一个这样的操作可以执行行,并且一个数据库实例通常不会有太多这样的工作并发执行,把这个数值设置得比 work_mem 大一些通常是合适的。更大的设置可以提高上述操作的速度。

👛3.5、max_stack_depth:

integer 类型,声明服务器执行堆栈的最大安全深度。默认值 2MB。如果发现不能运行复杂的函数,可以适当提高此配置的值,不过通常情况下保持默认值就够了。

把 max_stack_depth 参数设置得大于实际的操作系统内核限制值时,意味着一个正在运行的递归函数可能会导致 PostgreSQL 后台服务进程奔溃。在一些操作系统平台上,PG 能够检测出内核限制,这时它将不允许将其设置为一个不安全的值。但PG并不能在所有操作系统的平台都检测它的限制值,所以还是建议设置一个明确的值。

🏈4、?postgresql初始化与启动关闭 👛4.1、创建操作系统用户 # groupadd -g 1000 postgres # useradd -g 1000 -u 1000 postgres # id postgres uid=1000(postgres) gid=1000(postgres) groups=1000(postgres)

注:如果使用yum安装,用户和用户组就直接存在了,无需额外创建

👛4.2、initdb数据库

一定要initdb,否则会报以下错误

May 01 17:26:12 postgresql-check-db-dir[13693]: "/var/lib/pgsql/data" is missing or empty. May 01 17:26:12 postgresql-check-db-dir[13693]: Use "postgresql-setup initdb" to initialize the database cluster. # initdb -D ./data/ -W The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". fixing permissions on existing directory ./data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 32MB creating configuration files ... ok creating template1 database in ./data/base/1 ... ok initializing pg_authid ... ok Enter new superuser password: Enter it again: setting password ... ok initializing dependencies ... ok creating system views ... ok loading system objects' descriptions ... ok creating collations ... ok creating conversions ... ok creating dictionaries ... ok setting privileges on built-in objects ... ok creating information schema ... ok loading PL/pgSQL server-side language ... ok vacuuming database template1 ... ok copying template1 to template0 ... ok copying template1 to postgres ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: postgres -D ./data or pg_ctl -D ./data -l logfile start 👛4.3、启动数据库 # service postgresql start Redirecting to /bin/systemctl start postgresql.service

或者

# pg_ctl -D data/ start server starting 👛4.4、查看数据库状态 # su - postgres # pg_ctl -D data/ status pg_ctl: server is running (PID: 13827) /usr/bin/postgres "-D" "/var/lib/pgsql/data" "-p" "5432" 👛4.5、关闭数据库

关闭数据库因为紧急程度,有多种模式

smart:等待活动事务结束,并且等待客户端主动断开连接后才关闭

fast:回滚所有活动的事务,并主动断开客户端的连接

immediate:立刻关闭数据库,下一次启动会进入到恢复模式

三种情况可以分别简写为:-ms -mf -mi

# pg_ctl -D data/ -ms stop waiting for server to shut down................................................... done server stopped 🏈5、元命令?

查看全部

postgres=# \? General \copyright show PostgreSQL usage and distribution terms \g [FILE] or ; execute query (and send results to file or |pipe) \h [NAME] help on syntax of SQL commands, * for all commands \q quit psql Query Buffer \e [FILE] [LINE] edit the query buffer (or file) with external editor \ef [FUNCNAME [LINE]] edit function definition with external editor \p show the contents of the query buffer \r reset (clear) the query buffer \s [FILE] display history or save it to file \w FILE write query buffer to file Input/Output \copy ... perform SQL COPY with data stream to the client host \echo [STRING] write string to standard output \i FILE execute commands from file \ir FILE as \i, but relative to location of current script \o [FILE] send all query results to file or |pipe \qecho [STRING] write string to query output stream (see \o) Informational (options: S = show system objects, + = additional detail) \d[S+] list tables, views, and sequences \d[S+] NAME describe table, view, sequence, or index \da[S] [PATTERN] list aggregates \db[+] [PATTERN] list tablespaces \dc[S+] [PATTERN] list conversions \dC[+] [PATTERN] list casts \dd[S] [PATTERN] show object descriptions not displayed elsewhere \ddp [PATTERN] list default privileges \dD[S+] [PATTERN] list domains \det[+] [PATTERN] list foreign tables \des[+] [PATTERN] list foreign servers \deu[+] [PATTERN] list user mappings \dew[+] [PATTERN] list foreign-data wrappers \df[antw][S+] [PATRN] list [only agg/normal/trigger/window] functions \dF[+] [PATTERN] list text search configurations \dFd[+] [PATTERN] list text search dictionaries \dFp[+] [PATTERN] list text search parsers \dFt[+] [PATTERN] list text search templates \dg[+] [PATTERN] list roles \di[S+] [PATTERN] list indexes \dl list large objects, same as \lo_list \dL[S+] [PATTERN] list procedural languages \dn[S+] [PATTERN] list schemas \do[S] [PATTERN] list operators \dO[S+] [PATTERN] list collations \dp [PATTERN] list table, view, and sequence access privileges \drds [PATRN1 [PATRN2]] list per-database role settings \ds[S+] [PATTERN] list sequences \dt[S+] [PATTERN] list tables \dT[S+] [PATTERN] list data types \du[+] [PATTERN] list roles \dv[S+] [PATTERN] list views \dE[S+] [PATTERN] list foreign tables \dx[+] [PATTERN] list extensions \l[+] list all databases \sf[+] FUNCNAME show a function's definition \z [PATTERN] same as \dp Formatting \a toggle between unaligned and aligned output mode \C [STRING] set table title, or unset if none \f [STRING] show or set field separator for unaligned query output \H toggle HTML output mode (currently off) \pset NAME [VALUE] set table output option (NAME := {format|border|expanded|fieldsep|fieldsep_zero|footer|null| numericlocale|recordsep|recordsep_zero|tuples_only|title|tableattr|pager}) \t [on|off] show only rows (currently off) \T [STRING] set HTML <table> tag attributes, or unset if none \x [on|off|auto] toggle expanded output (currently off) Connection \c[onnect] {[DBNAME|- USER|- HOST|- PORT|-] | conninfo} connect to new database (currently "postgres") \encoding [ENCODING] show or set client encoding \password [USERNAME] securely change the password for a user \conninfo display information about current connection Operating System \cd [DIR] change the current working directory \setenv NAME [VALUE] set or unset environment variable \timing [on|off] toggle timing of commands (currently off) \! [COMMAND] execute command in shell or start interactive shell Variables \prompt [TEXT] NAME prompt user to set internal variable \set [NAME [VALUE]] set internal variable, or list all if no parameters \unset NAME unset (delete) internal variable Large Objects \lo_export LOBOID FILE \lo_import FILE [COMMENT] \lo_list \lo_unlink LOBOID large object operations

基本规则

使用\开头,后面接简写简写命令,一般都可以加+,获得更加详细的信息\d命令指describ,是最重要的元命令。默认是显示表、视图和序列,但往往后面可以跟其他单字母,扩展命令的使用\d加S(大写),显示的是包括系统表的关系\d等命令后可以明确紧跟对象名,精确显示该对象可以在psql里执行shell命令,改变当前目录等

查看数据库

postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+-------------+-------------+----------------------- postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres (3 rows)

查看表空间

postgres=# \db List of tablespaces Name | Owner | Location ------------+----------+---------- pg_default | postgres | pg_global | postgres | (2 rows)

查看表等

postgres=# \d List of relations Schema | Name | Type | Owner --------+--------+-------+---------- public | test | table | postgres public | v_test | view | postgres (2 rows) postgres=# \dt List of relations Schema | Name | Type | Owner --------+------+-------+---------- public | test | table | postgres (1 row) postgres=# \dv List of relations Schema | Name | Type | Owner --------+--------+------+---------- public | v_test | view | postgres (1 row) postgres=# \d test Table "public.test" Column | Type | Modifiers --------+---------+----------- a | integer | postgres=# \dv v_test List of relations Schema | Name | Type | Owner --------+--------+------+---------- public | v_test | view | postgres (1 row) postgres=# \di test_i List of relations Schema | Name | Type | Owner | Table --------+--------+-------+----------+------- public | test_i | index | postgres | test (1 row) postgres=# \di+ test_i List of relations Schema | Name | Type | Owner | Table | Size | Description --------+--------+-------+----------+-------+-------+------------- public | test_i | index | postgres | test | 16 kB | (1 row)

查看函数定义

\sf

切换显示模式

类似于MySQL的\G

postgres=# \x Expanded display is on. postgres=# \dv v_test List of relations -[ RECORD 1 ]---- Schema | public Name | v_test Type | view Owner | postgres postgres=# \dv+ v_test List of relations -[ RECORD 1 ]--------- Schema | public Name | v_test Type | view Owner | postgres Size | 0 bytes Description | postgres=# \x Expanded display is off.

获取元命令对应的SQL代码

[postgres@ ~]# psql -E psql (9.2.24) Type "help" for help. postgres=# postgres=# postgres=# \d ********* QUERY ********** SELECT n.nspname as "Schema", c.relname as "Name", CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END as "Type", pg_catalog.pg_get_userbyid(c.relowner) as "Owner" FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind IN ('r','v','S','f','') AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' AND n.nspname !~ '^pg_toast' AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY 1,2; ************************** List of relations Schema | Name | Type | Owner --------+--------+-------+---------- public | test | table | postgres public | v_test | view | postgres (2 rows)

导入导出数据

copy和\copy都可以导入

COPY命令是SQL命令,\copy是元命令COPY命令必须具有 SUPERUSER超级权限(将数据通过 stdin、 stdout方式导入导出情况除外),而\copy元命令不需要 SUPERUSER权限。COPY命令读取或写入数据库服务端主机上的文件,而\copy元命令是从psql客户端大数据量导出到文件或大文件数据导入数据库,COPY比\copy能高 🏈6、数据类型 数字类型 类型长度说明范围与其他db比较smallint2字节小范围整数类型32768到+32767integer4字节整数类型2147483648到+2147483647bigint8字节大范围整数类型-9233203685477808到+9223203685477807decimal可变用户指定精度小数点前131072位;小数点后16383位numeric可变用户指定精度小数点前131072位;小数点后16383位real4字节变长,不精确6位十进制精度double precision8字节变长,不精确15位十进制精度smallserial2字节smallint I自增序列1到32767serial4字节Integer自增序列1到2147483647bigserial8字节bigint自增序列1到922372036854775807money8字节金钱类型

运算以及函数介绍:

#select 3+5,3-5,3*5,5/3,5.0/3,5%3,mod(5,3); ?column? | ?column? | ?column? | ?column? | ?column? | ?column? | mod ----------+----------+----------+----------+--------------------+----------+----- 8 | -2 | 15 | 1 | 1.6666666666666667 | 2 | 2 (1 row) #select round(10.1),round(10.9),ceil(4.3),ceil(4.9),ceil(-3.6),ceil(-3.4),floor(3.2),floor(-3.2); round | round | ceil | ceil | ceil | ceil | floor | floor -------+-------+------+------+------+------+-------+------- 10 | 11 | 5 | 5 | -3 | -3 | 3 | -4 (1 row) 字符类型 类型长度说明范围与其他db比较character varying(n),varchar(n)变长字符最大数有限制character(n), char(n)定长字符数没达到最大值则使用空白填充text变长无长度限制

字符函数

char_length与octet_length

=#select char_length('test'),octet_length('test'),char_length('中国'),octet_length('中国'); char_length | octet_length | char_length | octet_length -------------+--------------+-------------+-------------- 4 | 4 | 2 | 6 (1 row)

substring与substr很相似,但substr不支持from……for这种用法

#select substring('test' from 3 for 5),substring('test',3,5),substring('test',3,4); substring | substring | substring -----------+-----------+----------- st | st | st (1 row) #select substr('test',3,5),substr('test',3,4); substr | substr --------+-------- st | st (1 row) #select substr('test' from 3 for 4); ERROR: syntax error at or near "from" LINE 1: select substr('test' from 3 for 4); ^

position返回第一个匹配的位置,若没匹配则返回0

#select position('a' in 'abcda'),position('ab' in 'abcda'),position('ad' in 'abcda'); position | position | position ----------+----------+---------- 1 | 1 | 0 (1 row)

split_part

#select split_part('abcdabcab','a',1),split_part('abcdabcab','a',2),split_part('bcdabcab','a',1); split_part | split_part | split_part | split_part ------------+------------+------------+------------ | bcd | bcd | bc (1 row) 时间与日期 类型长度说明范围与其他db比较timestamp[(p)][without time zone]8字节包括日期和时间,不带时区,简写成 timestamptimestamp[(p)] with time zone8字节包括日期和时间,带时区,简写成 timestampdate4字节日期,但不包含一天中的时间time[(p)][without time zone]8字节天中的时间,不包含日期,不带时区time[(p)]with time zone12字节天中的时间,不包含日期,带时区interval[fields][(p)]16字节时间间隔

类型转换

#select now(),now():: timestamp with time zone,now():: timestamp without time zone; now | now | now -------------------------------+-------------------------------+---------------------------- 2022-05-01 14:26:34.770816+08 | 2022-05-01 14:26:34.770816+08 | 2022-05-01 14:26:34.770816 (1 row)

可以看到now是默认带时区的

#select now()::date,now():: time with time zone,now():: time without time zone,now():: time(3) without time zone; now | now | now | now ------------+--------------------+-----------------+-------------- 2022-05-01 | 14:56:28.395949+08 | 14:56:28.395949 | 14:56:28.396 (1 row)

可以看到精度的区别,默认和最大为6,超过6会显示warning而直接截断

#select date '2022-05-01'+ interval '1 day' - interval '2 hour',interval '1 month'/double precision '3'; ?column? | ?column? ---------------------+---------- 2022-05-01 22:00:00 | 10 days (1 row)

时间函数

#select current_date,current_time; date | timetz ------------+-------------------- 2022-05-01 | 15:11:33.052683+08 (1 row) #select extract(hour from current_time),extract(hour from current_date),extract(week from now()); date_part | date_part | date_part -----------+-----------+----------- 15 | 0 | 21 (1 row) 网络类型 类型长度说明范围与其他db比较cidr7或19字节IPV4和IPv6网络inet7或19字节IPv4和IPv6网络macaddr6字节MAC地址macaddr88字节MAC地址(EUI-64格式)

举例:

#select '192.168.1.1'::cidr,'192.168.1.100/24'::inet; cidr | inet ----------------+------------------ 192.168.1.1/32 | 192.168.1.100/24 (1 row)

可以看到inet不一定会转换成普通的点分十进制的形式

🏈7、postgresql高级特性 👛7.1、with查询

with查询是postgresql的高级特性,oracle 很早就支持,mysql的新版本也是支持的。说明这种with查询的必要性,能极大提升sql的表现力

with t as (select * from test) select * from t; +------+ | a | +------+ | 1 | +------+ 1 row in set (0.00 sec)

多表中间结果

with t1 as ( select a from test), t2 as ( select a from b) select * from t1 , t2 where t1.a=t2.a; +------+------+ | a | a | +------+------+ | 1 | 1 | +------+------+ 1 row in set (0.01 sec)

递归使用with

with的一个重要特定就是可以递归使用,引用自己的结果。

# with recursive t (x) as ( select 1 union select x + 1 from t where x < 5) select sum(x) from t; sum ----- 15 (1 row) 👛7.2、returning 返回更新后的数据

创建表test

# \d test Table "public.test" Column | Type | Modifiers --------+--------------+----------- a | integer | b | character(8) | Indexes: "test_i" btree (a)

插入数据

insert into test values(1,'test1'),(2,'test2');

使用returning

# update test set a = 5 where a=1 returning * ; a | b ---+---------- 5 | test1 (1 row) 👛7.3、upsert # insert into test_insert(a,b) values (1,'test1') on conflict(a) do nothing; # insert into test_insert(a,b) values (1,'test1') on conflict(a) do update set a=EXCLUDED.a;

注意,这个EXCLUDED表示的是values里的数值,且必须大写

👛7.4、数据抽样

普通的数据抽样

# select * from test order by random() limit 1;

tablesample数据抽样。9.5以后才支持

# select * from test tablesample system(0.00001); # select * from test tablesample bernoulli(0.0001);

system抽样方式是随机抽取表的数据块上的数据,bernoulli抽样是随机抽取表的数据行。这会导致system抽样,每次返回的行数可能不一样,因为每个块上的数据行数不一定一样。

👛7.5、聚合函数

string_agg函数

# create table city( country varchar(64),city varchar(64)); # insert into city values('中国','台北'),('中国','上海'),('中国','香港'),('日本','东京'),('日本','大阪'); # select * from city; country | city ---------+------ 中国 | 台北 中国 | 上海 中国 | 香港 日本 | 东京 日本 | 大阪 (5 rows) # select country,string_agg(city,',') from city group by country; country | string_agg ---------+---------------- 中国 | 台北,上海,香港 日本 | 东京,大阪 (2 rows)

array_agg

与string_agg类似,但返回的是数组

# select country,array_agg(city) from city group by country; country | array_agg ---------+------------------ 中国 | {台北,上海,香港} 日本 | {东京,大阪} (2 rows) 👛7.6、窗口函数

窗口函数包括内置的,比如(),rank(),lag(),以及聚合函数、自定义函数后接over属性

创建基本表并插入数据

# CREATE TABLE score (id serial primary key, subject character varying(32), stu_name character varying(32), score numeric(3,0)); # INSERT INTO score(subject, stu_name, score) VALUEs('Chinese','francs',70),('Chinese','matiler',70),('Chinese','tutu', 80),('English','matiler',75),('English','francs',90),('English','tutu', 60),('Math','francs',80),('Math','matiler', 99),('Math','tutu', 65);

avg() over

如果想查每个学生的成绩并显示课程的平均分,传统方法就是通过表连接,即先查询平均分,然后再与原表连接。如下:

# select id,a.subject,stu_name,score,avg from score as a join (select subject,avg(score) from score group by subject) as b on a.subject=b.subject ; id | subject | stu_name | score | avg ----+---------+----------+-------+--------------------- 1 | Chinese | francs | 70 | 73.3333333333333333 2 | Chinese | matiler | 70 | 73.3333333333333333 3 | Chinese | tutu | 80 | 73.3333333333333333 4 | English | matiler | 75 | 75.0000000000000000 5 | English | francs | 90 | 75.0000000000000000 6 | English | tutu | 60 | 75.0000000000000000 7 | Math | francs | 80 | 81.3333333333333333 8 | Math | matiler | 99 | 81.3333333333333333 9 | Math | tutu | 65 | 81.3333333333333333 (9 rows)

如果使用窗口函数,可以方便很多:

# select id,subject,stu_name,score,avg(score) over(partition by subject) from score; id | subject | stu_name | score | avg ----+---------+----------+-------+--------------------- 1 | Chinese | francs | 70 | 73.3333333333333333 2 | Chinese | matiler | 70 | 73.3333333333333333 3 | Chinese | tutu | 80 | 73.3333333333333333 4 | English | matiler | 75 | 75.0000000000000000 5 | English | francs | 90 | 75.0000000000000000 6 | English | tutu | 60 | 75.0000000000000000 7 | Math | francs | 80 | 81.3333333333333333 8 | Math | matiler | 99 | 81.3333333333333333 9 | Math | tutu | 65 | 81.3333333333333333 (9 rows)

查看查询计划

# explain analyze select id,a.subject,stu_name,score,avg from score as a join (select subject,avg(score) from score group by subject) as b on a.subject=b.subject ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=797.06..1643.18 rows=36864 width=53) (actual time=14.061..27.067 rows=36864 loops=1) Hash Cond: ((a.subject)::text = (b.subject)::text) -> Seq Scan on score a (cost=0.00..612.64 rows=36864 width=21) (actual time=0.013..3.193 rows=36864 loops=1) -> Hash (cost=797.03..797.03 rows=3 width=38) (actual time=14.037..14.037 rows=3 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Subquery Scan on b (cost=796.96..797.03 rows=3 width=38) (actual time=14.024..14.030 rows=3 loops=1) -> HashAggregate (cost=796.96..797.00 rows=3 width=38) (actual time=14.023..14.027 rows=3 loops=1) Group Key: score.subject -> Seq Scan on score (cost=0.00..612.64 rows=36864 width=11) (actual time=0.003..3.231 rows=36864 loops=1) Planning time: 0.219 ms Execution time: 28.528 ms (11 rows)

与窗口函数的查询计划

# explain analyze select id,subject,stu_name,score,avg(score) over(partition by subject) from score; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- WindowAgg (cost=3408.76..4053.88 rows=36864 width=53) (actual time=22.146..40.023 rows=36864 loops=1) -> Sort (cost=3408.76..3500.92 rows=36864 width=21) (actual time=17.463..20.134 rows=36864 loops=1) Sort Key: subject Sort Method: quicksort Memory: 3826kB -> Seq Scan on score (cost=0.00..612.64 rows=36864 width=21) (actual time=0.015..5.280 rows=36864 loops=1) Planning time: 0.130 ms Execution time: 42.077 ms (7 rows)

虽然窗口函数方便阅读,但执行性能比表连接还是要差一些。而且随着数据量增大,这种趋势越发明显。

数据量类型执行时间9表连接0.108ms9窗口函数0.110ms36864表连接28ms36864窗口函数45ms294912表连接180ms294912窗口函数510ms

row_number

# select row_number() over(partition by subject order by score) ,* from score; row_number | id | subject | stu_name | score ------------+----+---------+----------+------- 1 | 2 | Chinese | matiler | 70 2 | 1 | Chinese | francs | 70 3 | 3 | Chinese | tutu | 80 1 | 6 | English | tutu | 60 2 | 4 | English | matiler | 75 3 | 5 | English | francs | 90 1 | 9 | Math | tutu | 65 2 | 7 | Math | francs | 80 3 | 8 | Math | matiler | 99 (9 rows)

rank()与dense_rank()

两个函数很相似,都可以在组内排名,但rank在字段相同时会产生间隙gap,而dense_rank不会

# select rank() over(partition by subject order by score) ,* from score; rank | id | subject | stu_name | score ------+----+---------+----------+------- 1 | 2 | Chinese | matiler | 70 1 | 1 | Chinese | francs | 70 3 | 3 | Chinese | tutu | 80 1 | 6 | English | tutu | 60 2 | 4 | English | matiler | 75 3 | 5 | English | francs | 90 1 | 9 | Math | tutu | 65 2 | 7 | Math | francs | 80 3 | 8 | Math | matiler | 99 (9 rows) # select dense_rank() over(partition by subject order by score) ,* from score; dense_rank | id | subject | stu_name | score ------------+----+---------+----------+------- 1 | 2 | Chinese | matiler | 70 1 | 1 | Chinese | francs | 70 2 | 3 | Chinese | tutu | 80 1 | 6 | English | tutu | 60 2 | 4 | English | matiler | 75 3 | 5 | English | francs | 90 1 | 9 | Math | tutu | 65 2 | 7 | Math | francs | 80 3 | 8 | Math | matiler | 99 (9 rows)

lag()

# select lag(id,1) over(partition by stu_name),* from score; lag | id | subject | stu_name | score -----+----+---------+----------+------- | 1 | Chinese | francs | 70 1 | 5 | English | francs | 90 5 | 7 | Math | francs | 80 | 4 | English | matiler | 75 4 | 8 | Math | matiler | 99 8 | 2 | Chinese | matiler | 70 | 6 | English | tutu | 60 6 | 9 | Math | tutu | 65 9 | 3 | Chinese | tutu | 80 (9 rows)

first_value() 和last_value()

first_value获取组内的第一个值,last_value组内最后一个值

# select first_value(score) over(partition by subject ),* from score; first_value | id | subject | stu_name | score -------------+----+---------+----------+------- 70 | 1 | Chinese | francs | 70 70 | 2 | Chinese | matiler | 70 70 | 3 | Chinese | tutu | 80 75 | 4 | English | matiler | 75 75 | 5 | English | francs | 90 75 | 6 | English | tutu | 60 80 | 7 | Math | francs | 80 80 | 8 | Math | matiler | 99 80 | 9 | Math | tutu | 65 (9 rows) # select last_value(id) over(partition by subject ),* from score; last_value | id | subject | stu_name | score ------------+----+---------+----------+------- 3 | 1 | Chinese | francs | 70 3 | 2 | Chinese | matiler | 70 3 | 3 | Chinese | tutu | 80 6 | 4 | English | matiler | 75 6 | 5 | English | francs | 90 6 | 6 | English | tutu | 60 9 | 7 | Math | francs | 80 9 | 8 | Math | matiler | 99 9 | 9 | Math | tutu | 65 (9 rows)

但在这里有一个bug,就是就是如果在窗口函数里面加了order by,会出现诡异现象

# select last_value(id) over(partition by subject order by score ),* from score; last_value | id | subject | stu_name | score ------------+----+---------+----------+------- 1 | 2 | Chinese | matiler | 70 1 | 1 | Chinese | francs | 70 3 | 3 | Chinese | tutu | 80 6 | 6 | English | tutu | 60 4 | 4 | English | matiler | 75 5 | 5 | English | francs | 90 9 | 9 | Math | tutu | 65 7 | 7 | Math | francs | 80 8 | 8 | Math | matiler | 99 (9 rows)

这个的结果与下面语句相同

# select last_value(id) over(partition by subject ,score ),* from score; last_value | id | subject | stu_name | score ------------+----+---------+----------+------- 1 | 2 | Chinese | matiler | 70 1 | 1 | Chinese | francs | 70 3 | 3 | Chinese | tutu | 80 6 | 6 | English | tutu | 60 4 | 4 | English | matiler | 75 5 | 5 | English | francs | 90 9 | 9 | Math | tutu | 65 7 | 7 | Math | francs | 80 8 | 8 | Math | matiler | 99 (9 rows)

原因应该是分组的时候将order 里面的条件当成了partition的条件了,也就是将(subject,score)当分组条件

nth_value()

取结果集分组的指定行数据

# select nth_value(id,2) over(partition by subject ),* from score; nth_value | id | subject | stu_name | score -----------+----+---------+----------+------- 2 | 1 | Chinese | francs | 70 2 | 2 | Chinese | matiler | 70 2 | 3 | Chinese | tutu | 80 5 | 4 | English | matiler | 75 5 | 5 | English | francs | 90 5 | 6 | English | tutu | 60 8 | 7 | Math | francs | 80 8 | 8 | Math | matiler | 99 8 | 9 | Math | tutu | 65 (9 rows)

但下面的结果无法理解

# select nth_value(id,2) over(partition by subject order by id desc),* from score; nth_value | id | subject | stu_name | score -----------+----+---------+----------+------- | 3 | Chinese | tutu | 80 2 | 2 | Chinese | matiler | 70 2 | 1 | Chinese | francs | 70 | 6 | English | tutu | 60 5 | 5 | English | francs | 90 5 | 4 | English | matiler | 75 | 9 | Math | tutu | 65 8 | 8 | Math | matiler | 99 8 | 7 | Math | francs | 80

使用subject字段来排序,却是正常的

# select nth_value(id,2) over(partition by subject order by subject desc ),* from score; nth_value | id | subject | stu_name | score -----------+----+---------+----------+------- 7 | 9 | Math | tutu | 65 7 | 7 | Math | francs | 80 7 | 8 | Math | matiler | 99 4 | 5 | English | francs | 90 4 | 4 | English | matiler | 75 4 | 6 | English | tutu | 60 3 | 1 | Chinese | francs | 70 3 | 3 | Chinese | tutu | 80 3 | 2 | Chinese | matiler | 70 (9 rows)

窗口函数别名

# select sum(score) over (r) ,avg(score) over (r) ,* from score window r as (partition by subject); sum | avg | id | subject | stu_name | score -----+---------------------+----+---------+----------+------- 220 | 73.3333333333333333 | 1 | Chinese | francs | 70 220 | 73.3333333333333333 | 2 | Chinese | matiler | 70 220 | 73.3333333333333333 | 3 | Chinese | tutu | 80 225 | 75.0000000000000000 | 4 | English | matiler | 75 225 | 75.0000000000000000 | 5 | English | francs | 90 225 | 75.0000000000000000 | 6 | English | tutu | 60 244 | 81.3333333333333333 | 7 | Math | francs | 80 244 | 81.3333333333333333 | 8 | Math | matiler | 99 244 | 81.3333333333333333 | 9 | Math | tutu | 65 (9 rows) 🏈8、查看全部参数 [postgres@~]# psql --help psql is the PostgreSQL interactive terminal. Usage: psql [OPTION]... [DBNAME [USERNAME]] General options: -c, --command=COMMAND run only single command (SQL or internal) and exit -d, --dbname=DBNAME database name to connect to (default: "postgres") -f, --file=FILENAME execute commands from file, then exit -l, --list list available databases, then exit -v, --set=, --variable=NAME=VALUE set psql variable NAME to VALUE -V, --version output version information, then exit -X, --no-psqlrc do not read startup file (~/.psqlrc) -1 ("one"), --single-transaction execute command file as a single transaction -?, --help show this help, then exit Input and output options: -a, --echo-all echo all input from script -e, --echo-queries echo commands sent to server -E, --echo-hidden display queries that internal commands generate -L, --log-file=FILENAME send session log to file -n, --no-readline disable enhanced command line editing (readline) -o, --output=FILENAME send query results to file (or |pipe) -q, --quiet run quietly (no messages, only query output) -s, --single-step single-step mode (confirm each query) -S, --single-line single-line mode (end of line terminates SQL command) Output format options: -A, --no-align unaligned table output mode -F, --field-separator=STRING set field separator (default: "|") -H, --html HTML table output mode -P, --pset=VAR[=ARG] set printing option VAR to ARG (see \pset command) -R, --record-separator=STRING set record separator (default: newline) -t, --tuples-only print rows only -T, --table-attr=TEXT set HTML table tag attributes (e.g., width, border) -x, --expanded turn on expanded table output -z, --field-separator-zero set field separator to zero byte -0, --record-separator-zero set record separator to zero byte Connection options: -h, --host=HOSTNAME database server host or socket directory (default: "local socket") -p, --port=PORT database server port (default: "5432") -U, --username=USERNAME database user name (default: "postgres") -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) For more information, type "\?" (for internal commands) or "\help" (for SQL commands) from within psql, or consult the psql section in the PostgreSQL documentation. Report bugs to <pgsql-bugs@postgresql.org>.

查看全部数据库

# psql -l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+-------------+-------------+----------------------- postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres (3 rows)

与在psql里执行\l是一样的

格式设置

-A显示非对齐模式

# psql -A -c "select * from test order by a " a|b 1|test 2|2 3|test3 4|4 (4 rows) # psql -c "select * from test order by a " a | b ---+---------- 1 | test 2 | 2 3 | test3 4 | 4 (4 rows)

-t 只显示数据

# psql -t -c "select * from test order by a " 1 | test 2 | 2 3 | test3 4 | 4

-q不显示输出信息

# psql -q -c "create table test2(a int) " # psql -c "select * from test2 order by a " a --- (0 rows)

可以看到没有显示"create table"这种信息

-f执行sql文件

传递变量

psql内部定义变量 postgres=# \set value 1 postgres=# select * from test where a=:value; a | b ---+---------- 1 | test (1 row)

取消变量定义

postgres=# \set value postgres=# select * from test where a=:value; ERROR: syntax error at or near ";" LINE 1: select * from test where a=; ^ psql 命令行传递变量 # psql -v value=1 -f 1.sql a | b ---+---------- 1 | test (1 row)

其中1.sql内容是

select * from test where a=:value;

通过传递变量的方式定义脚本

修改或者新建.psqlrc文件,写入如下内容(查询活动会话,记忆pg_stat_activity视图):

\set active_session 'select pid, usename, datname, query, client_addr from pg_stat_activity where pid <> pg_backend_pid() and state=\'active\' order by query;'

这样可以将定义的变量传递到psql中

postgres=# :active_session pid | usename | datname | query | client_addr -----+---------+---------+-------+------------- (0 rows)

其实也可以在psql中直接定义,当前会话中可以一直使用

如下是第二个例子。修改或者新建.psqlrc:

\set wait_event 'select pid,usename,datname,query,client_addr from pg_stat_activity where pid <> pg_backend_pid() ;' \set connections 'select datname, usename, client_addr, count(*) from pg_stat_activity where pid <> pg_backend_pid() group by 1,2,3 order by 1,2,4 desc;'

进入psql执行变量,即执行sql

postgres=# :wait_event pid | usename | datname | query | client_addr -----+---------+---------+-------+------------- (0 rows) postgres=# :connections datname | usename | client_addr | count ----------+----------+-------------+------- postgres | postgres | | 1 (1 row)

psql可以tab自动补全

这个类似于mysql的auto-hash,但比auto-hash更好用

客户端提示符定制

参数解释%M数据库服务器别名,不是指主机名,显示的是psql的-h参数设置的值; 当连接建立在Unix域套字上时则是【local】%>数据库服务器的端口号。%n数据库会话的用户名,在数据库会话期间 这个值可能会因为命令SET SESSION UTHORIZATION的结果而改变。%/当前数据库名称。%#如果是超级用户则显示“#”,其他用户显示“>”,在数据库会话期间, 这个值可能会因为命令 SET SESSION AUTHORIZATION的结果而改变。%p当前数据库连接的后台进程号%R在 PROMPT1中通常显示“=”,如果进程被断开则显示“!”。
postgres=# \echo :PROMPT1 %/%R%# postgres=# \set PROMPT1 %/%M%R%# postgres[local]=#


1.本站遵循行业规范,任何转载的稿件都会明确标注作者和来源;2.本站的原创文章,会注明原创字样,如未注明都非原创,如有侵权请联系删除!;3.作者投稿可能会经我们编辑修改或补充;4.本站不提供任何储存功能只提供收集或者投稿人的网盘链接。

标签: #postgres入门 #MySQL