Single-machine disk replacement
Check the disk devices that make up the storage pool:
$ zpool status
pool: storage
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: resilvered 235K in 0h0m with 0 errors on Fri Jan 8 01:49:23 2021
config:
  NAME        STATE     READ WRITE CKSUM
  storage     DEGRADED     0     0     0
    raidz1-0  DEGRADED     0     0     0
      sdb     ONLINE       0     0     0
      sdc     ONLINE       0     0     0
      sdd     REMOVED      0     0     0
As shown above, sdd is currently unavailable. Insert a new disk and verify that the device shows up under /dev/:
$ ls /dev/sde
/dev/sde
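If several new disks were added at once, lsblk can also help confirm that sde is the new, still-empty device (an optional check, not part of the original procedure):
$ lsblk /dev/sde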
Replace the failed disk sdd with the newly inserted disk sde:
$ zpool replace storage sdd /dev/sde
Check the storage pool status:
$ zpool status
pool: storage
state: ONLINE
scan: resilvered 1.07M in 0h0m with 0 errors on Fri Jan 8 02:04:29 2021
config:
  NAME        STATE     READ WRITE CKSUM
  storage     ONLINE       0     0     0
    raidz1-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0
      sdc     ONLINE       0     0     0
      sde     ONLINE       0     0     0
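As an optional final check (not part of the original output above), zpool status -x prints a one-line summary when every pool is healthy:
$ zpool status -x
all pools are healthy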
Whole-machine failure
A whole-machine failure falls into one of two cases:
- Case 1: the machine itself is fine, but two or more disks in the ZFS storage pool have failed, so the node is effectively unusable. Replace the failed disks with new ones, reformat the original disks, and rebuild the ZFS storage pool.
- Case 2: the machine itself is down. Bring in a new machine, add new disks to it and build a storage pool, then add the new machine as a Gluster node to replace the failed one.
Recovery by replacing disks on the original machine
My Gluster cluster here consists of 10.0.1.111, 10.0.1.112, and 10.0.1.113. Below I simulate two disks failing on 10.0.1.113, then replace those two bad disks on 10.0.1.113 to recover.
Checking the ZFS storage pool shows that two disks have failed:
$ zpool status
pool: storage
state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: none requested
config:
  NAME        STATE     READ WRITE CKSUM
  storage     UNAVAIL      0     0     0  insufficient replicas
    raidz1-0  UNAVAIL      0     0     0  insufficient replicas
      sdb     FAULTED     12     0     0  too many errors
      sdc     FAULTED      9     0     0  too many errors
      sdd     ONLINE       0     0     0
      sde     ONLINE       0     0     0
      sdf     ONLINE       0     0     0
errors: List of errors unavailable: pool I/O is currently suspended
At this point the pool built from these five disks is unavailable, and the storage pool is in a state where it cannot even be destroyed.
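Following the action hint in the output you could run zpool clear, but raidz1 has only one parity disk, so with two members gone there are not enough replicas and the pool cannot be revived this way; the command is shown here only for illustration:
$ zpool clear storage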
Pull out the two failed disks, sdb and sdc, and list the block devices:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 29G 0 part
└─ubuntu--vg-ubuntu--lv 253:0 0 20G 0 lvm /
sdd 8:48 0 20G 0 disk
├─sdd1 8:49 0 20G 0 part
└─sdd9 8:57 0 8M 0 part
sde 8:64 0 20G 0 disk
├─sde1 8:65 0 20G 0 part
└─sde9 8:73 0 8M 0 part
sdf 8:80 0 20G 0 disk
├─sdf1 8:81 0 20G 0 part
└─sdf9 8:89 0 8M 0 part
sr0 11:0 1 1024M 0 rom
Insert the new disks, reboot, and list the block devices again:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 29G 0 part
└─ubuntu--vg-ubuntu--lv 253:0 0 20G 0 lvm /
sdb 8:16 0 20G 0 disk
├─sdb1 8:17 0 20G 0 part
└─sdb9 8:25 0 8M 0 part
sdc 8:32 0 20G 0 disk
├─sdc1 8:33 0 20G 0 part
└─sdc9 8:41 0 8M 0 part
sdd 8:48 0 20G 0 disk
├─sdd1 8:49 0 20G 0 part
└─sdd9 8:57 0 8M 0 part
sde 8:64 0 20G 0 disk
├─sde1 8:65 0 20G 0 part
└─sde9 8:73 0 8M 0 part
sdf 8:80 0 20G 0 disk
├─sdf1 8:81 0 20G 0 part
└─sdf9 8:89 0 8M 0 part
sr0 11:0 1 1024M 0 rom
After the reboot, the original storage pool is gone:
$ zpool status
no pools available
Recreate the storage pool:
$ zpool create storage raidz1 sdb sdc sdd sde sdf -f
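A hedged variant of the command above: if the sdX letters on this host tend to shuffle across reboots, the pool can instead be built from /dev/disk/by-id paths. The ata-DISK* names below are placeholders, not identifiers from this environment:
# List stable identifiers for the data disks
$ ls -l /dev/disk/by-id/
# Build the pool from by-id paths instead of sdX letters (placeholder names)
$ zpool create -f storage raidz1 /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4 /dev/disk/by-id/ata-DISK5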
Check the storage pool status:
$ zpool status
pool: storage
state: ONLINE
scan: none requested
config:
  NAME        STATE     READ WRITE CKSUM
  storage     ONLINE       0     0     0
    raidz1-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0
      sdc     ONLINE       0     0     0
      sdd     ONLINE       0     0     0
      sde     ONLINE       0     0     0
      sdf     ONLINE       0     0     0
errors: No known data errors
Mount the storage pool at a new directory:
$ mkdir -p /data/gluster_brick_new
$ zfs set mountpoint=/data/gluster_brick_new storage
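To confirm the dataset really moved to the new path (an optional check, not part of the original procedure), query the mountpoint property; it should print /data/gluster_brick_new:
$ zfs get -H -o value mountpoint storage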
Restart the Gluster service:
$ systemctl restart glusterd
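Optionally, gluster volume status shows which brick processes are running and on which ports; at this stage the 10.0.1.113 brick is expected to still be offline, since the volume still points at the old brick path:
$ gluster volume status gv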
Check the connectivity of each brick in the volume:
$ gluster volume heal gv info
Brick 10.0.1.111:/data/gluster_brick
/8
/
Status: Connected
Number of entries: 2
Brick 10.0.1.112:/data/gluster_brick
/
/8
Status: Connected
Number of entries: 2
Brick 10.0.1.113:/data/gluster_brick
Status: Transport endpoint is not connected
Number of entries: -
As shown, 10.0.1.113 is not connected at this point, and the volume is still built from the now-broken brick on 10.0.1.113:
$ gluster volume info gv
Volume Name: gv
Type: Disperse
Volume ID: 3efefbf1-d29b-4708-8ca2-f0241b030152
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.1.111:/data/gluster_brick
Brick2: 10.0.1.112:/data/gluster_brick
Brick3: 10.0.1.113:/data/gluster_brick
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
Replace the original brick with the new brick on 10.0.1.113:
$ gluster volume replace-brick gv 10.0.1.113:/data/gluster_brick 10.0.1.113:/data/gluster_brick_new commit force
volume replace-brick: success: replace-brick commit force operation successful
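Self-heal normally starts repopulating the new brick on its own after replace-brick; if you want to push it along, a full heal can be triggered manually (an optional step, not part of the original procedure):
$ gluster volume heal gv full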
Check the connectivity of each brick in the volume again:
$ gluster volume heal gv info
Brick 10.0.1.111:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.112:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.113:/data/gluster_brick_new
Status: Connected
Number of entries: 0
Recovery with a new machine
My Gluster cluster here consists of 10.0.1.111, 10.0.1.112, and 10.0.1.113. Below I simulate 10.0.1.113 going down entirely, then add a new machine, 10.0.1.114, to the Gluster cluster to replace 10.0.1.113.
Checking the peer pool shows that 10.0.1.113 is down:
$ gluster pool list
UUID Hostname State
363e49d5-c2d6-4d76-8eb5-2b1bc5e8c3c9 10.0.1.112 Connected
196374b7-6c4b-4dfc-86ad-35166eccafb5 10.0.1.113 Disconnected
892d9c99-79af-4cfb-99a2-c79aeefc5d36 localhost Connected
Prepare a new machine, 10.0.1.114, and install the Gluster service and the ZFS package on it:
# For installing the Gluster service, see the GlusterFS deployment documentation
# Install the ZFS package
$ apt-get install zfsutils-linux -y
Create the ZFS storage pool on 10.0.1.114:
$ zpool create storage raidz1 sdb sdc sdd sde sdf
Start the Gluster service on 10.0.1.114:
$ systemctl start glusterd
On 10.0.1.114, mount the ZFS storage pool to serve as the Gluster brick:
$ mkdir -p /data/gluster_brick
$ zfs set mountpoint=/data/gluster_brick storage
On 10.0.1.111 or 10.0.1.112, join 10.0.1.114 to the Gluster cluster:
$ gluster peer probe 10.0.1.114
peer probe: success.
Check the Gluster peer list:
$ gluster pool list
UUID Hostname State
363e49d5-c2d6-4d76-8eb5-2b1bc5e8c3c9 10.0.1.112 Connected
196374b7-6c4b-4dfc-86ad-35166eccafb5 10.0.1.113 Disconnected
9b4a2dc2-86e7-4346-b536-21fbdb6b371c 10.0.1.114 Connected
892d9c99-79af-4cfb-99a2-c79aeefc5d36 localhost Connected
View the Gluster volume information:
$ gluster volume info gv
Volume Name: gv
Type: Disperse
Volume ID: 3efefbf1-d29b-4708-8ca2-f0241b030152
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.1.111:/data/gluster_brick
Brick2: 10.0.1.112:/data/gluster_brick
Brick3: 10.0.1.113:/data/gluster_brick_new
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
$ gluster volume heal gv info
Brick 10.0.1.111:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.112:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.113:/data/gluster_brick_new
Status: Transport endpoint is not connected
Number of entries: -
As shown, volume gv still uses the brick on 10.0.1.113 even though 10.0.1.113 is no longer reachable. Replace the 10.0.1.113 brick with the 10.0.1.114 brick:
$ gluster volume replace-brick gv 10.0.1.113:/data/gluster_brick_new 10.0.1.114:/data/gluster_brick commit force
volume replace-brick: success: replace-brick commit force operation successful
Check the connectivity of each brick in the volume:
$ gluster volume heal gv info summary
Brick 10.0.1.111:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.112:/data/gluster_brick
Status: Connected
Number of entries: 0
Brick 10.0.1.114:/data/gluster_brick
Status: Connected
Number of entries: 0
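To double-check that Brick3 in the volume definition now points at 10.0.1.114, the volume info can be queried again (output omitted here):
$ gluster volume info gv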
Remove 10.0.1.113 from the trusted storage pool:
$ gluster peer detach 10.0.1.113
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
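Afterwards, the peer list can be re-checked; it should now list only 10.0.1.112, 10.0.1.114, and localhost:
$ gluster pool list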