Deploy Rook Ceph inside a Kubernetes cluster and provide persistent storage for the services running in that cluster.
Motivation
A Kubernetes cluster in a cloud environment has many backend storage options, such as OSS, NAS, or cloud disks. Sometimes, though, for one reason or another we need to run our own storage service to provide volumes for Kubernetes, and Ceph is the usual choice. Deploying Ceph on physical machines or ECS instances takes a lot of effort and is awkward to maintain separately from the cluster. Fortunately, Rook Ceph integrates well with cloud-native environments and lets Ceph run directly on the Kubernetes cluster, which makes both maintenance and management much easier.
Prerequisites
- A production deployment needs at least 3 nodes to act as OSD nodes for storing data.
- Each OSD node needs at least one raw disk for Rook Ceph to use when initializing the cluster (a quick check is sketched after this list).
- The Rook Ceph version used here is fairly recent and requires Kubernetes 1.22 or later.
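A minimal way to check that a device is still a raw disk, assuming the device is /dev/vdc on each OSD node (substitute your own device name):

```bash
# On each OSD node: the device must have no filesystem, partitions, or LVM state.
# An empty FSTYPE column for vdc means the disk is clean enough for Rook to pick it up.
lsblk -f /dev/vdc

# If an old filesystem signature is still present, it can be wiped (destructive!):
# wipefs --all /dev/vdc
```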
Deployment
The environment for this walkthrough has 3 master nodes and 3 worker nodes, and each worker node has one raw disk.
Clone the repository
First, clone the configuration files from the GitHub repository.
- Pick the branch yourself; v1.12.9 was a fairly recent release when I deployed, so choose according to your situation.
```bash
$ git clone --single-branch --branch v1.12.9 https://github.com/rook/rook.git
```
Prepare the images (optional)
The images have to be pulled from registries hosted overseas, which makes deployment painful if your network connection to them is poor, so here we download them in advance and push them to our own registry.
Modify the image settings in operator.yaml, replacing the quay.io/cephcsi and registry.k8s.io/sig-storage prefixes with the prefix of your own registry.
```yaml
ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.9.0"
ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0"
ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.8.0"
ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v3.5.0"
ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2"
ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.3.0"
```
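A minimal sketch of mirroring one of these images to a private registry; the prefix registry.example.com is a placeholder, substitute your own registry address:

```bash
# Pull the upstream image, retag it with the private registry prefix, and push it.
# Repeat for each image listed above (and for the quay.io/ceph/ceph image used below).
docker pull quay.io/cephcsi/cephcsi:v3.9.0
docker tag  quay.io/cephcsi/cephcsi:v3.9.0 registry.example.com/cephcsi/cephcsi:v3.9.0
docker push registry.example.com/cephcsi/cephcsi:v3.9.0
```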
Modify the image setting in cluster.yaml, replacing the quay.io/ceph prefix with the prefix of your own registry.
```yaml
image: quay.io/ceph/ceph:v17.2.6
```
Modify the storage device section of cluster.yaml as follows, setting the node names and device names to match your environment.
```yaml
storage:
  useAllNodes: false
  useAllDevices: false
  config:
  nodes:
    - name: "172.20.7.120"
      devices:
        - name: "vdc"
    - name: "172.20.7.175"
      devices:
        - name: "vdc"
    - name: "172.20.7.237"
      devices:
        - name: "vdc"
```
Deploy Rook
- Go into the rook/deploy/examples directory and apply crds.yaml, common.yaml, and operator.yaml to the cluster first, then apply cluster.yaml.
- At this point, if nothing unexpected happens, the Rook Ceph cluster is deployed.
```bash
$ cd rook/deploy/examples
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml
$ kubectl create -f cluster.yaml
$ kubectl -n rook-ceph get pod
NAME                                                 READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-provisioner-d77bb49c6-n5tgs         5/5     Running     0          140s
csi-cephfsplugin-provisioner-d77bb49c6-v9rvn         5/5     Running     0          140s
csi-cephfsplugin-rthrp                               3/3     Running     0          140s
csi-rbdplugin-hbsm7                                  3/3     Running     0          140s
csi-rbdplugin-provisioner-5b5cd64fd-nvk6c            6/6     Running     0          140s
csi-rbdplugin-provisioner-5b5cd64fd-q7bxl            6/6     Running     0          140s
rook-ceph-crashcollector-minikube-5b57b7c5d4-hfldl   1/1     Running     0          105s
rook-ceph-mgr-a-64cd7cdf54-j8b5p                     2/2     Running     0          77s
rook-ceph-mgr-b-657d54fc89-2xxw7                     2/2     Running     0          56s
rook-ceph-mon-a-694bb7987d-fp9w7                     1/1     Running     0          105s
rook-ceph-mon-b-856fdd5cb9-5h2qk                     1/1     Running     0          94s
rook-ceph-mon-c-57545897fc-j576h                     1/1     Running     0          85s
rook-ceph-operator-85f5b946bd-s8grz                  1/1     Running     0          92m
rook-ceph-osd-0-6bb747b6c5-lnvb6                     1/1     Running     0          23s
rook-ceph-osd-1-7f67f9646d-44p7v                     1/1     Running     0          24s
rook-ceph-osd-2-6cd4b776ff-v4d68                     1/1     Running     0          25s
rook-ceph-osd-prepare-node1-vx2rz                    0/2     Completed   0          60s
rook-ceph-osd-prepare-node2-ab3fd                    0/2     Completed   0          60s
rook-ceph-osd-prepare-node3-w4xyz                    0/2     Completed   0          60s
```
To inspect the cluster state, use the official toolbox: deploy it, then exec into the toolbox pod and run ceph commands there.
```bash
$ kubectl create -f toolbox.yaml
# exec into the toolbox pod, then run ceph commands inside it
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
$ ceph status
  cluster:
    id:     a0452c76-30d9-4c1a-a948-5d8405f19a7c
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum a,b,c (age 3m)
    mgr: a(active, since 2m), standbys: b
    osd: 3 osds: 3 up (since 1m), 3 in (since 1m)
  [...]
```
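A couple of other standard Ceph CLI commands are worth running from inside the toolbox to confirm capacity and OSD health:

```bash
# Overall capacity and per-pool usage
ceph df
# Per-OSD state, usage, and host placement
ceph osd status
```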
Dashboard (optional)
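If the dashboard is left enabled in cluster.yaml (it is in the example manifest), one way to reach it is to port-forward the mgr dashboard service and decode the generated admin password; a minimal sketch, assuming the default service and secret names created by the operator:

```bash
# Forward the dashboard service (HTTPS on 8443 by default) to localhost
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8443:8443

# The login user is "admin"; the password is generated into this secret
kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
  -o jsonpath='{.data.password}' | base64 --decode
```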
Persistent storage for services
With the cluster deployed, we can now create RBD block storage. If you also need CephFS, read the RBD part first and then continue below.
rbd
First create the CephBlockPool and the StorageClass.
```yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  clusterID: rook-ceph
  # Ceph pool into which the RBD image shall be created
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
```
Create a MySQL instance, declaring a PVC and mounting it.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  ports:
    - port: 3306
  selector:
    app: wordpress
    tier: mysql
  clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
    tier: mysql
spec:
  selector:
    matchLabels:
      app: wordpress
      tier: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: changeme
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
```
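After applying the manifest, a quick way to confirm that the CSI driver provisioned the volume is to check that the PVC is Bound and that a corresponding PV was created:

```bash
kubectl get pvc mysql-pv-claim   # STATUS should be Bound
kubectl get pv                   # a dynamically provisioned PV backed by rook-ceph-block
```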
cephfs
An RBD volume only supports read-write access from a single pod (ReadWriteOnce). When several pods need to read and write the same PVC, we need CephFS-backed storage instead.
First create a CephFilesystem instance; the operator will then spin up the CephFS MDS pods.
```yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - name: replicated
      replicated:
        size: 3
  preserveFilesystemOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  # If you change this namespace, also change the namespace below where the secret namespaces are defined
  clusterID: rook-ceph
  # CephFS filesystem name into which the volume shall be created
  fsName: myfs
  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-replicated
  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
```
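Once the CephFilesystem is applied, the operator should create the MDS pods; a quick check, assuming the default app label used by Rook:

```bash
# Two MDS pods are expected for myfs: one active and one standby
kubectl -n rook-ceph get pod -l app=rook-ceph-mds
```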
Create a busybox deployment and mount the CephFS PVC.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-busybox-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  labels:
    k8s-app: busybox
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 3
  selector:
    matchLabels:
      k8s-app: busybox
  template:
    metadata:
      labels:
        k8s-app: busybox
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
        - name: busybox
          image: busybox
          imagePullPolicy: Always
          # keep the container running so the shared mount can be exercised;
          # a bare busybox image exits immediately otherwise
          command: ["sh", "-c", "sleep 3600"]
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: store
              mountPath: /app
          ports:
            - containerPort: 80
              name: busybox
              protocol: TCP
      volumes:
        - name: store
          persistentVolumeClaim:
            claimName: cephfs-busybox-pvc
            readOnly: false
```
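To confirm the volume really is shared read-write across replicas, write a file from one busybox pod and read it back from another; a minimal sketch:

```bash
# Grab the names of the first two replicas
POD1=$(kubectl get pod -l k8s-app=busybox -o jsonpath='{.items[0].metadata.name}')
POD2=$(kubectl get pod -l k8s-app=busybox -o jsonpath='{.items[1].metadata.name}')

# Write from one pod, read from the other via the shared /app mount
kubectl exec "$POD1" -- sh -c 'echo hello-from-pod1 > /app/shared.txt'
kubectl exec "$POD2" -- cat /app/shared.txt
```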
Summary
Overall, for light use, deploying with the default configuration from the official documentation should be fine; for more complex scenarios you will need to tune the deployment using the parameters described in the official docs.