Snova运维篇(二):GP集群配置和高可用特性

本节主要探讨greenplum集群配置和高可用特性。

目录:

  1. 配置greenplum数据库
  2. 启用高可用特性

基本概念:

raid

独立硬盘冗余阵列(RAID, Redundant Array of Independent Disks)


1.配置greenplum

Greenplum数据的配置文件postgresql.conf位于数据库实例的数据目录之下。

代码语言:javascript
复制
[gpadmin@gp-master etc]$ cd /data/master/gpseg-1/
[gpadmin@gp-master gpseg-1]$ ls
base     gpmetrics   pg_changetracking  pg_distributedxidmap  pg_log        pg_subtrans  pg_utilitymodedtmredo  postgresql.conf
global   gpperfmon   pg_clog            pg_hba.conf           pg_multixact  pg_tblspc    PG_VERSION             postmaster.opts
gp_dbid  gpssh.conf  pg_distributedlog  pg_ident.conf         pg_stat_tmp   pg_twophase  pg_xlog                postmaster.pid
  • 设置本地参数

gpconfig,通过主节点就可以对所有节点的参数文件进行批量修改

代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ gpconfig -s max_connections
Values on all segments are consistent
GUC          : max_connections
Master  value: 250
Segment value: 750
代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ gpconfig -c max_connections -v 750 -m 150
20191224:16:40:01:031503 gpconfig:gp-master:gpadmin-[INFO]:-completed successfully with parameters '-c max_connections -v 750 -m 150'
代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ gpstop -r
20191224:16:43:33:032280 gpstop:gp-master:gpadmin-[INFO]:-Starting gpstop with args: -r
20191224:16:43:33:032280 gpstop:gp-master:gpadmin-[INFO]:-Gathering information and validating the environment...
20191224:16:43:33:032280 gpstop:gp-master:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20191224:16:43:33:032280 gpstop:gp-master:gpadmin-[INFO]:-Obtaining Segment details from master...
20191224:16:43:33:032280 gpstop:gp-master:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 5.21.1 build commit:ca0b8106b893028d18f241dcb858d85f12af90b6'
代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ gpconfig -s max_connections
Values on all segments are consistent
GUC          : max_connections
Master  value: 150
Segment value: 750

案例:

gpconfig只能在系统启动的情况下调用,所以如果参数修改不合适,导致系统无法启动时,我们可以用下列方法处理:

1、先把master的参数修改成正常的值

2、gpstart -m 仅启动master进入管理模式

3、gpconfig -r  <参数>  -- 把参数重置成默认值

4、gpstop -a -r -M fast

  • 设置master参数

编辑$MASTER_DATA_DIRECTORY/postgresql.conf文件,找到要设置的参数,取消它的注释(移除前面的#字符),并且输入想要的值。保存并且关闭该文件。

代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ pwd
/data/master/gpseg-1
[gpadmin@gp-master gpseg-1]$ ls
base       gpperfmon          pg_distributedlog     pg_log        pg_tblspc              pg_xlog              postmaster.pid
global     gpssh.conf         pg_distributedxidmap  pg_multixact  pg_twophase            postgresql.conf
gp_dbid    pg_changetracking  pg_hba.conf           pg_stat_tmp   pg_utilitymodedtmredo  postgresql.conf.bak
gpmetrics  pg_clog            pg_ident.conf         pg_subtrans   PG_VERSION             postmaster.opts

需要重启生效的参数:

代码语言:javascript
复制
 gpstop -r

不需要重启的参数:

代码语言:javascript
复制
gpstop -u 
  • 在数据库级别设置参数

每一个连接到该数据库的会话都使用该参数设置。数据库级别的设置覆盖系统级别的设置。

代码语言:javascript
复制
=# ALTER DATABASE mydatabase SET search_path TO myschema;
  • 在角色级别设置参数

每一个由该角色启动的会话都使用该参数设置。角色级别的设置覆盖数据库级别的设置。

代码语言:javascript
复制
=# ALTER ROLE bob SET search_path TO bobschema;
  • 会话级别设置参数
代码语言:javascript
复制
=# SET statement_mem TO '200MB';
代码语言:javascript
复制
=# RESET statement_mem;
  • 查看服务器配置参数
代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ psql -c 'show all' -d  postgres  //查看master实例参数
                         name                         |                                     setting
       |                                                                                   description

------------------------------------------------------+---------------------------------------------------------------------------
-------+--------------------------------------------------------------------------------------------------------------------------

add_missing_from | off
| Automatically adds missing table references to FROM clauses.
application_name | psql
| Sets the application name to be reported in statistics and logs.
archive_mode | off
| Allows archiving of WAL files using archive_command.
array_nulls | on

代码语言:javascript
复制
[gpadmin@gp-master gpseg-1]$ gpconfig --show max_connections  //查看全局参数
Values on all segments are consistent
GUC : max_connections
Master value: 150
Segment value: 750
  • 配置参数种类

连接和认证参数:

这些参数控制着客户端如何连接到Greenplum数据库以及如何被认证。

gp_connection_send_timeout

gp_vmem_idle_resource_timeout

listen_addresses

max_connections

max_prepared_transactions

superuser_reserved_connections

tcp_keepalives_count

tcp_keepalives_idle

tcp_keepalives_interval

unix_socket_directory

unix_socket_group

unix_socket_permissions

安全性和认证参数:

authentication_timeout

db_user_namespace

krb_caseins_users

krb_server_keyfile

krb_srvname

password_encryption

password_hash_algorithm

ssl

ssl_ciphers

2.启用高可用特性

  • 硬件raid

从硬件存储级别实现的数据冗余。常用raid:

RAID0,RAID1,RAID5,RAID10

常见raid比对
  • segment镜像

Segment 分为主备,称为 Primary 和 Mirror,Mirror 是 Primary 的备。Primary与Mirror之间强同步保证数据一致性和可靠性,其间的监控与切换则由Master的FTS模块负责。当FTS发现Primary宕机、Mirror健康后会激活Mirror,并标记Primary为’d’,Mirror进入 ChangeTracking 状态。

  • master镜像概述

由于Master不保存用户数据,只有系统目录表被在主Master和后备Master之间同步。当这些表被更新时,更改会被自动地复制到后备Master来让它保持与主Master的同步。

  • 故障检测与修复概述

Greenplum数据库服务器名为ftsprobe的(postgres)子进程处理故障检测。ftsprobe会监控Greenplum数据库阵列,它连接并且扫描所有的Segment,并且数据库会按照配置的间隔进行这种处理。

gprecoverseg恢复工具。这个工具定位失效的Segment、验证它们是否有效并且与当前活动的Segment比较事务状态来确定该Segment离线期间所作的更改

代码语言:javascript
复制
[gpadmin@gp-master ~]$ gprecoverseg --help
COMMAND NAME: gprecoverseg

Recovers a primary or mirror segment instance that has //修复被标记为down的segment
been marked as down (if mirroring is enabled).


-i

主要参数,用于指定一个配置文件,该配置文件描述了需要修复的Segment和修复后的目的位置。

-F

可选项,指定后,gprecoverseg会将”-i”中指定的或标记”d”的实例删除,并从活着的Mirror复制一个完整一份到目标位置。

-r

当FTS发现有Primary宕机并进行主备切换,在gprecoverseg修复后,担当Primary的Mirror角色并不会立即切换回来,就会导致部分主机上活跃的Segment过多从而引起性能瓶颈。因此需要恢复Segment原先的角色,称为re-balance。

  • 启用segment镜像 (此工具内容较多后续补充)
代码语言:javascript
复制
[gpadmin@gp-master ~]$ gpaddmirrors --help
COMMAND NAME: gpaddmirrors

Adds mirror segments to a Greenplum Database system that was
initially configured without mirroring.


SYNOPSIS


gpaddmirrors [-p <port_offset>] [-m <datadir_config_file> [-a]] [-s]
[-d <master_data_directory>] [-B <parallel_processes>]
[-l <logfile_directory>] [-v]

gpaddmirrors -i <mirror_config_file> [-s] [-a]
[-d <master_data_directory>] [-B <parallel_processes>]
[-l <logfile_directory>] [-v]

gpaddmirrors -o <output_sample_mirror_config> [-m <datadir_config_file>]

gpaddmirrors -?

  • 启用备节点master镜像
代码语言:javascript
复制
 gpinitstandby -s smdw</code></pre></div></div><div class="rno-markdown-code"><div class="rno-markdown-code-toolbar"><div class="rno-markdown-code-toolbar-info"><div class="rno-markdown-code-toolbar-item is-type"><span class="is-m-hidden">代码语言:</span>javascript</div></div><div class="rno-markdown-code-toolbar-opt"><div class="rno-markdown-code-toolbar-copy"><i class="icon-copy"></i><span class="is-m-hidden">复制</span></div></div></div><div class="developer-code-block"><pre class="prism-token token line-numbers language-javascript"><code class="language-javascript" style="margin-left:0">[gpadmin@gp-master ~] psql komablog -c 'SELECT procpid, state FROM pg_stat_replication;'
procpid | state
---------+-----------
32593 | streaming
(1 row)

列出有关walsender进程的信息,该进程被用于镜像Greenplum数据库的Master。

  • 检测失效的segment

在Greenplum数据库的Master主机上,Postgres的postmaster进程会派生一个故障探测进程ftsprobe。它有时也被称作FTS(容错服务器)进程。如果FTS失败,postmaster进程会重启它。

gpstate 工具

代码语言:javascript
复制
[gpadmin@gp-master ~]$ gpstate -m   //显示镜像segment 实例信息
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[INFO]:-Starting gpstate with args: -m
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.21.1 build commit:ca0b8106b893028d18f241dcb858d85f12af90b6'
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.21.1 build commit:ca0b8106b893028d18f241dcb858d85f12af90b6) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jul 24 2019 22:51:37'
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[INFO]:-Obtaining Segment details from master...
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[WARNING]:--------------------------------------------------------------
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[WARNING]:-physical mirroring not used
20191224:18:47:57:022244 gpstate:gp-master:gpadmin-[WARNING]:--------------------------------------------------------------
代码语言:javascript
复制
[gpadmin@gp-master ~]$ gpstate -e  //显示segment 镜像状态
20191224:18:48:25:022371 gpstate:gp-master:gpadmin-[INFO]:-Starting gpstate with args: -e
20191224:18:48:25:022371 gpstate:gp-master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.21.1 build commit:ca0b8106b893028d18f241dcb858d85f12af90b6'
20191224:18:48:25:022371 gpstate:gp-master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.21.1 build commit:ca0b8106b893028d18f241dcb858d85f12af90b6) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jul 24 2019 22:51:37'
20191224:18:48:25:022371 gpstate:gp-master:gpadmin-[INFO]:-Obtaining Segment details from master...
20191224:18:48:25:022371 gpstate:gp-master:gpadmin-[INFO]:-Physical mirroring is not configured
  • 检查日志文件
代码语言:javascript
复制
$ gplogfilter -t
[gpadmin@gp-master ~]$ gplogfilter -t
requested timestamp range from beginning of data to end of data
---------- /data/master/gpseg-1/pg_log/gpdb-2019-12-24_153623.csv ----------
in: 57 lines, 57 log entries; timestamps from 2019-12-24 15:36:23.027330 to 2019-12-24 15:36:24.464534
match: 0 lines
out: 0 lines, 0 log entries
---------- /data/master/gpseg-1/pg_log/gpdb-2019-12-23_000000.csv ----------
in: 0 lines, 0 log entries; no timestamps found
match: 0 lines
out: 0 lines, 0 log entries
---------- /data/master/gpseg-1/pg_log/gpdb-2019-12-24_150758.csv ----------
in: 57 lines, 57 log entries; timestamps from 2019-12-24 15:07:58.102589 to 2019-12-24 15:07:59.531255
match: 0 lines
out: 0 lines, 0 log entries
---------- /data/master/gpseg-1/pg_log/gpdb-2019-12-24_164403.csv ----------
in: 57 lines, 57 log entries; timestamps from 2019-12-24 16:44:03.362180 to 2019-12-24 16:44:04.715757
match: 0 lines
out: 0 lines, 0 log entries
代码语言:javascript
复制
//批量检查segment日志
gpssh -f seg_hosts -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t /data/primary/gpseg*/pg_log/gpdb*.csv' > seglog.out
代码语言:javascript
复制
[gp-node1] source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t /data/primary/gpseg*/pg_log/gpdb*.csv
[gp-node1] requested timestamp range from beginning of data to end of data
[gp-node1] ---------- /data/primary/gpseg0/pg_log/gpdb-2019-12-19_164202.csv ----------
[gp-node1] in: 15 lines, 15 log entries; timestamps from 2019-12-19 16:42:02.777733 to 2019-12-19 16:42:03.762678
[gp-node1] match: 0 lines
[gp-node1] out: 0 lines, 0 log entries
[gp-node1] ---------- /data/primary/gpseg0/pg_log/gpdb-2019-12-19_164211.csv ----------
[gp-node1] in: 13 lines, 13 log entries; timestamps from 2019-12-19 16:42:11.012501 to 2019-12-19 16:42:11.791297
[gp-node1] match: 0 lines
[gp-node1] out: 0 lines, 0 log entries
[gp-node1] ---------- /data/primary/gpseg0/pg_log/gpdb-2019-12-20_000000.csv ----------
[gp-node1] in: 3 lines, 3 log entries; timestamps from 2019-12-20 15:03:05.230915 to 2019-12-20 15:03:05.296381
[gp-node1] match: 0 lines
[gp-node1] out: 0 lines, 0 log entries
[gp-node1] ---------- /data/primary/gpseg0/pg_log/gpdb-2019-12-20_151848.csv ----------
[gp-node1] in: 16 lines, 16 log entries; timestamps from 2019-12-20 15:18:48.992510 to 2019-12-20 15:32:28.987426
[gp-node1] match: 0 lines
[gp-node1] out: 0 lines, 0 log entries
[gp-node1] ---------- /data/primary/gpseg0/pg_log/gpdb-2019-12-20_153742.csv ----------