Kubernetes Applications: Deploying Rook Ceph Inside the Cluster

Deploy Rook Ceph into a Kubernetes cluster to provide persistent storage for services running in the cluster.

Motivation

There are many options for backend storage when running Kubernetes in the cloud, such as OSS, NAS, or cloud disks. Sometimes, though, for various reasons we need to build our own storage service to provide volumes for Kubernetes, and Ceph is usually the choice. Deploying it on physical machines or ECS instances not only takes a lot of effort, it also makes maintenance feel disconnected from the cluster. Fortunately there is Rook Ceph, which integrates nicely with the cloud-native environment and lets Ceph run directly on the Kubernetes cluster, making maintenance and management much easier.

Prerequisites

  1. A production environment needs at least 3 nodes to serve as OSD nodes for storing data.

  2. Each OSD node must have at least one raw (unformatted) disk, which Rook Ceph consumes when initializing the OSDs (see the quick check sketched after this list).

  3. The Rook Ceph version used here is fairly new and requires Kubernetes v1.22 or later.
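
A minimal way to confirm a disk is still raw (the device name vdc matches the environment used later in this post): run lsblk -f on each OSD node and make sure the FSTYPE column for the target disk is empty, i.e. no filesystem exists on it.

$ lsblk -f /dev/vdc   # FSTYPE must be empty for Rook to pick the disk up as an OSD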

Deployment

The environment for this deployment has 3 master nodes and 3 worker nodes; each worker node has one raw disk.

Clone the repository

First, clone the configuration files from the GitHub repository.

  1. Pick whichever branch suits you; when I deployed, a fairly recent release was v1.12.9. Choose according to your situation.
$ git clone --single-branch --branch v1.12.9 https://github.com/rook/rook.git

Prepare the images (optional)

The images have to be pulled from registries hosted overseas, which makes deployment painful on a poor network connection, so we download them ahead of time and push them to our own registry.

Edit the image settings in operator.yaml, changing the quay.io/cephcsi and registry.k8s.io/sig-storage prefixes to your own registry prefix.

  ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.9.0"
  ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0"
  ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.8.0"
  ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v3.5.0"
  ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2"
  ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.3.0"

Edit the image setting in cluster.yaml, changing the quay.io/ceph prefix to your own registry prefix.

image: quay.io/ceph/ceph:v17.2.6
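
A rough sketch of mirroring all of the images above to a private registry (registry.example.com/rook is a placeholder; substitute your own prefix):

#!/usr/bin/env bash
# Pull each upstream image, retag it with our own registry prefix, and push it.
REGISTRY=registry.example.com/rook
for img in \
    quay.io/cephcsi/cephcsi:v3.9.0 \
    registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0 \
    registry.k8s.io/sig-storage/csi-resizer:v1.8.0 \
    registry.k8s.io/sig-storage/csi-provisioner:v3.5.0 \
    registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2 \
    registry.k8s.io/sig-storage/csi-attacher:v4.3.0 \
    quay.io/ceph/ceph:v17.2.6; do
  docker pull "${img}"
  docker tag  "${img}" "${REGISTRY}/${img##*/}"   # keep only name:tag, e.g. cephcsi:v3.9.0
  docker push "${REGISTRY}/${img##*/}"
done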

Edit the storage device section of cluster.yaml as follows; set the node names and device names to match your environment.

  storage:
    useAllNodes: false
    useAllDevices: false
    config:
    nodes:
      - name: "172.20.7.120"
        devices:
          - name: "vdc"
      - name: "172.20.7.175"
        devices:
          - name: "vdc"
      - name: "172.20.7.237"
        devices:
          - name: "vdc"
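
Note that each name under nodes must exactly match the Kubernetes node name (the kubernetes.io/hostname label); in this environment the nodes are registered by their IP addresses. A quick check:

$ kubectl get nodes -o wide   # the NAME column must match the name values above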

Deploy Rook

  1. Enter the rook/deploy/examples directory and apply crds.yaml, common.yaml, and operator.yaml to the cluster first.

  2. Then apply cluster.yaml to create the Ceph cluster itself. If nothing goes wrong, the Rook Ceph cluster is now deployed, and the pods should look like the listing below.

cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml


$ kubectl -n rook-ceph get pod
NAME                                                 READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-provisioner-d77bb49c6-n5tgs         5/5     Running     0          140s
csi-cephfsplugin-provisioner-d77bb49c6-v9rvn         5/5     Running     0          140s
csi-cephfsplugin-rthrp                               3/3     Running     0          140s
csi-rbdplugin-hbsm7                                  3/3     Running     0          140s
csi-rbdplugin-provisioner-5b5cd64fd-nvk6c            6/6     Running     0          140s
csi-rbdplugin-provisioner-5b5cd64fd-q7bxl            6/6     Running     0          140s
rook-ceph-crashcollector-minikube-5b57b7c5d4-hfldl   1/1     Running     0          105s
rook-ceph-mgr-a-64cd7cdf54-j8b5p                     2/2     Running     0          77s
rook-ceph-mgr-b-657d54fc89-2xxw7                     2/2     Running     0          56s
rook-ceph-mon-a-694bb7987d-fp9w7                     1/1     Running     0          105s
rook-ceph-mon-b-856fdd5cb9-5h2qk                     1/1     Running     0          94s
rook-ceph-mon-c-57545897fc-j576h                     1/1     Running     0          85s
rook-ceph-operator-85f5b946bd-s8grz                  1/1     Running     0          92m
rook-ceph-osd-0-6bb747b6c5-lnvb6                     1/1     Running     0          23s
rook-ceph-osd-1-7f67f9646d-44p7v                     1/1     Running     0          24s
rook-ceph-osd-2-6cd4b776ff-v4d68                     1/1     Running     0          25s
rook-ceph-osd-prepare-node1-vx2rz                    0/2     Completed   0          60s
rook-ceph-osd-prepare-node2-ab3fd                    0/2     Completed   0          60s
rook-ceph-osd-prepare-node3-w4xyz                    0/2     Completed   0          60s
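
Besides the pods, the CephCluster resource itself reports the cluster phase and health; a quick check:

# PHASE should reach Ready and HEALTH should settle at HEALTH_OK
$ kubectl -n rook-ceph get cephcluster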

To check the cluster status, use the toolbox provided by the project: deploy the toolbox, then exec into its pod and run the Ceph commands.

kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash

$ ceph status
  cluster:
    id:     a0452c76-30d9-4c1a-a948-5d8405f19a7c
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 3m)
    mgr: a(active, since 2m), standbys: b
    osd: 3 osds: 3 up (since 1m), 3 in (since 1m)
[...]
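
A few other commands that are handy inside the toolbox:

ceph osd status     # per-OSD state and usage
ceph df             # raw and per-pool capacity usage
rados df            # per-pool object statistics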

Dashboard (optional)
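
The Ceph dashboard is enabled by default in cluster.yaml (spec.dashboard.enabled: true). A minimal way to reach it from outside the cluster, assuming you use the NodePort example shipped with Rook, is to expose it and read the generated admin password from its secret:

kubectl create -f dashboard-external-https.yaml
kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
  -o jsonpath="{['data']['password']}" | base64 --decode

Log in as the admin user with that password on the node port reported by the rook-ceph-mgr-dashboard-external-https service.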

Persistent storage for services

With the cluster deployed, we can now create RBD block storage. If you need CephFS, read the RBD part first and then continue below.

rbd

First create the CephBlockPool and the StorageClass.

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  clusterID: rook-ceph
  # Ceph pool into which the RBD image shall be created
  pool: replicapool

  imageFormat: "2"
  imageFeatures: layering

  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
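
Save the manifest above to a file and apply it (the file name storageclass-rbd.yaml is arbitrary; the Rook repository also ships an equivalent example in its examples directory), then confirm the pool and StorageClass exist:

kubectl create -f storageclass-rbd.yaml
kubectl -n rook-ceph get cephblockpool replicapool
kubectl get storageclass rook-ceph-block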

Create a MySQL instance that declares a PVC and mounts it.

apiVersion: v1
kind: Service
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  ports:
    - port: 3306
  selector:
    app: wordpress
    tier: mysql
  clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
    tier: mysql
spec:
  selector:
    matchLabels:
      app: wordpress
      tier: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: changeme
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
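
After applying the manifest (mysql.yaml below is just a local file name for the content above), the PVC should bind as soon as the RBD image has been provisioned:

kubectl apply -f mysql.yaml
kubectl get pvc mysql-pv-claim                 # STATUS should become Bound
kubectl get pod -l app=wordpress,tier=mysql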

cephfs

RBD volumes only allow read-write access from a single pod (ReadWriteOnce). When multiple pods need to read and write the same PVC at the same time, we need CephFS-based storage instead.

First create a CephFilesystem resource; the operator will then spin up the CephFS MDS pods.

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - name: replicated
      replicated:
        size: 3
  preserveFilesystemOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  # If you change this namespace, also change the namespace below where the secret namespaces are defined
  clusterID: rook-ceph

  # CephFS filesystem name into which the volume shall be created
  fsName: myfs

  # Ceph pool into which the volume shall be created
  # Required for provisionVolume: "true"
  pool: myfs-replicated

  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

reclaimPolicy: Delete
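
After the CephFilesystem is created you should see the MDS pods come up (the label below is the one Rook applies to MDS daemons):

# one active and one standby MDS for "myfs" (activeCount: 1, activeStandby: true)
$ kubectl -n rook-ceph get pod -l app=rook-ceph-mds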

Create busybox pods and mount the CephFS PVC.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-busybox-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  labels:
    k8s-app: busybox
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 3
  selector:
    matchLabels:
      k8s-app: busybox
  template:
    metadata:
      labels:
        k8s-app: busybox
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: Always
        # without a long-running command the busybox container would exit immediately
        command: ["sh", "-c", "tail -f /dev/null"]
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: store
          mountPath:  /app
        ports:
        - containerPort: 80
          name: busybox
          protocol: TCP
      volumes:
      - name: store
        persistentVolumeClaim:
          claimName: cephfs-busybox-pvc
          readOnly: false
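
To confirm the volume really is shared read-write across replicas (a rough sketch; cephfs-busybox.yaml is just a local file name for the manifest above, and pod names will differ in your cluster), write a file from one pod and read it back from another:

kubectl apply -f cephfs-busybox.yaml
kubectl get pvc cephfs-busybox-pvc             # should become Bound

POD1=$(kubectl get pod -l k8s-app=busybox -o jsonpath='{.items[0].metadata.name}')
POD2=$(kubectl get pod -l k8s-app=busybox -o jsonpath='{.items[1].metadata.name}')
kubectl exec "$POD1" -- sh -c 'echo hello-cephfs > /app/test.txt'
kubectl exec "$POD2" -- cat /app/test.txt      # the same file is visible from another replica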

Summary

Overall, for light use, deploying with the default configuration from the official documentation should be fine. For more demanding scenarios, you will need to tune the parameters described in the official docs.