Add Ceph OSD Nodes to Ceph Cluster<\/a><\/li>\n<\/ul>\n\n\n\nHere are all my nodes, with custom labels, already added to the cluster:<\/p>\n\n\n\n
sudo ceph orch host ls<\/code><\/pre>\n\n\n\nHOST ADDR LABELS STATUS \nnode01 192.168.122.78 _admin,admin01 \nnode02 192.168.122.79 admin02 \nnode03 192.168.122.80 osd1 \nnode04 192.168.122.90 osd2 \nnode05 192.168.122.91 osd3 \n5 hosts in cluster\n<\/code><\/pre>\n\n\n\nAlso, OSDs have been created.<\/p>\n\n\n\n
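In case you are yet to create OSDs in your own setup, one common approach is to let cephadm consume all available, unused disks on the cluster hosts, as sketched below (this assumes every eligible device should become an OSD; adjust if you need finer control over which disks are used);<\/p>\n\n\n\n
sudo ceph orch apply osd --all-available-devices<\/code><\/pre>\n\n\n\n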
Services currently running on these nodes;<\/p>\n\n\n\n
sudo ceph orch ps<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \ncrash.node01 node01 running (80m) 7m ago 80m 7088k - 18.2.2 1c40e0e88d74 17e2319f10dc \ncrash.node02 node02 running (19m) 7m ago 19m 7107k - 18.2.2 1c40e0e88d74 aa78609c1bf7 \ncrash.node03 node03 running (18m) 7m ago 18m 21.5M - 18.2.2 1c40e0e88d74 b957dc6b2dec \ncrash.node04 node04 running (19m) 7m ago 19m 7151k - 18.2.2 1c40e0e88d74 549db30e0271 \ncrash.node05 node05 running (18m) 7m ago 18m 22.5M - 18.2.2 1c40e0e88d74 3b7f1d77461a \nmgr.node01.upxxuf node01 *:9283,8765 running (81m) 7m ago 81m 479M - 18.2.2 1c40e0e88d74 d1a0585d07b1 \nmgr.node02.bqzghl node02 *:8443,8765 running (19m) 7m ago 19m 425M - 18.2.2 1c40e0e88d74 a473f17a57e8 \nmon.node01 node01 running (81m) 7m ago 81m 51.9M 2048M 18.2.2 1c40e0e88d74 85890ac8c45f \nmon.node02 node02 running (19m) 7m ago 19m 35.0M 2048M 18.2.2 1c40e0e88d74 7dd8886151ec \nmon.node03 node03 running (18m) 7m ago 18m 32.0M 2048M 18.2.2 1c40e0e88d74 203db5ded05e \nmon.node04 node04 running (18m) 7m ago 18m 29.3M 2048M 18.2.2 1c40e0e88d74 17bfaae81be9 \nmon.node05 node05 running (18m) 7m ago 18m 31.6M 2048M 18.2.2 1c40e0e88d74 cf38ec8aa5fc\nosd.0 node05 running (15m) 7m ago 15m 48.8M 4096M 18.2.2 1c40e0e88d74 309e5582f1d9 \nosd.1 node03 running (15m) 7m ago 15m 54.4M 4096M 18.2.2 1c40e0e88d74 82da63b2386e \nosd.2 node04 running (15m) 7m ago 15m 53.8M 4096M 18.2.2 1c40e0e88d74 bb605719cb20\n<\/code><\/pre>\n\n\n\nSetup Second Ceph Admin Node<\/h4>\n\n\n\n We will use node02 as our second admin node.<\/p>\n\n\n\n
Therefore, install cephadm as well as the other Ceph tools on node02;<\/p>\n\n\n\n
wget -q -O- 'https:\/\/download.ceph.com\/keys\/release.asc' | \\\nsudo gpg --dearmor -o \/etc\/apt\/trusted.gpg.d\/cephadm.gpg<\/code><\/pre>\n\n\n\necho \"deb https:\/\/download.ceph.com\/debian-reef\/ $(lsb_release -sc) main\" | \\\nsudo tee \/etc\/apt\/sources.list.d\/ceph.list<\/code><\/pre>\n\n\n\nsudo apt update<\/code><\/pre>\n\n\n\nsudo apt install cephadm ceph-common<\/code><\/pre>\n\n\n\nDesignate node02 as the second admin node by adding the admin label, _admin<\/strong>. Run this command, of course, on the first admin node.<\/p>\n\n\n\nsudo ceph orch host label add node02 _admin<\/code><\/pre>\n\n\n\n\nNote:<\/strong> By default, when you add the _admin<\/code> label to additional nodes in the cluster, cephadm<\/code> copies the ceph.conf<\/code> and client.admin<\/code> keyring files to those nodes.<\/p>\n<\/blockquote>\n\n\n\nYou can confirm that the Ceph config and keyrings have been copied;<\/p>\n\n\n\n
sudo ls -1 \/etc\/ceph\/*<\/code><\/pre>\n\n\n\n\/etc\/ceph\/ceph.client.admin.keyring\n\/etc\/ceph\/ceph.conf\n\/etc\/ceph\/rbdmap\n<\/code><\/pre>\n\n\n\nYou should now be able to administer the cluster from the second node.<\/p>\n\n\n\n
See the example command below, run on node02;<\/p>\n\n\n\n
sudo ceph orch ps<\/code><\/pre>\n\n\n\ncephadmin@node02:~$ sudo ceph orch ps\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \ncrash.node01 node01 running (3h) 6m ago 3h 7088k - 18.2.2 1c40e0e88d74 17e2319f10dc \ncrash.node02 node02 running (2h) 6m ago 2h 8643k - 18.2.2 1c40e0e88d74 aa78609c1bf7 \ncrash.node03 node03 running (2h) 6m ago 2h 21.4M - 18.2.2 1c40e0e88d74 b957dc6b2dec \ncrash.node04 node04 running (2h) 6m ago 2h 7151k - 18.2.2 1c40e0e88d74 549db30e0271 \ncrash.node05 node05 running (2h) 6m ago 2h 22.5M - 18.2.2 1c40e0e88d74 3b7f1d77461a \nmgr.node01.upxxuf node01 *:9283,8765 running (3h) 6m ago 3h 483M - 18.2.2 1c40e0e88d74 d1a0585d07b1 \nmgr.node02.bqzghl node02 *:8443,8765 running (2h) 6m ago 2h 432M - 18.2.2 1c40e0e88d74 a473f17a57e8 \nmon.node01 node01 running (3h) 6m ago 3h 83.5M 2048M 18.2.2 1c40e0e88d74 85890ac8c45f \nmon.node02 node02 running (2h) 6m ago 2h 69.0M 2048M 18.2.2 1c40e0e88d74 7dd8886151ec \nmon.node03 node03 running (2h) 6m ago 2h 67.4M 2048M 18.2.2 1c40e0e88d74 203db5ded05e \nmon.node04 node04 running (2h) 6m ago 2h 66.0M 2048M 18.2.2 1c40e0e88d74 17bfaae81be9 \nmon.node05 node05 running (2h) 6m ago 2h 66.0M 2048M 18.2.2 1c40e0e88d74 cf38ec8aa5fc\nosd.0 node05 running (2h) 6m ago 2h 48.8M 4096M 18.2.2 1c40e0e88d74 309e5582f1d9 \nosd.1 node03 running (2h) 6m ago 2h 54.4M 4096M 18.2.2 1c40e0e88d74 82da63b2386e \nosd.2 node04 running (2h) 6m ago 2h 53.8M 4096M 18.2.2 1c40e0e88d74 bb605719cb20 \n<\/code><\/pre>\n\n\n\nDeploy Monitoring Stack on Ceph Cluster<\/h4>\n\n\n\n Remember we skipped the default monitoring stack when we bootstrapped the cluster. If you need the monitoring stack to be highly available as well, then proceed as follows.<\/p>\n\n\n\n
So, we will set Grafana and Prometheus to run on both admin nodes, and let the other components, such as node-exporter, ceph-exporter and Alertmanager, run across the nodes with their default placement.<\/p>\n\n\n\n
You can deploy the services using a service specification file or simply from the command line. We will use the command line here.<\/p>\n\n\n\n
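For reference, the service specification approach could look like the sketch below; grafana-spec.yml is a hypothetical file name, and the placement mirrors the node01\/node02 layout used in this guide. The spec is then applied with ceph orch apply -i;<\/p>\n\n\n\n
cat > grafana-spec.yml <<EOF\nservice_type: grafana\nplacement:\n  hosts:\n    - node01\n    - node02\nEOF<\/code><\/pre>\n\n\n\n
sudo ceph orch apply -i grafana-spec.yml<\/code><\/pre>\n\n\n\n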
Deploy Grafana and Prometheus on the two admin nodes;<\/p>\n\n\n\n
sudo ceph orch apply grafana --placement=\"node01 node02\"<\/code><\/pre>\n\n\n\nsudo ceph orch apply prometheus --placement=\"node01 node02\"<\/code><\/pre>\n\n\n\nConfirm the same;<\/p>\n\n\n\n
sudo ceph orch ps --service_name grafana<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \ngrafana.node01 node01 *:3000 running (12s) 5s ago 45s 58.6M - 9.4.7 954c08fa6188 aea3b826052e \ngrafana.node02 node02 *:3000 running (11s) 5s ago 43s 58.7M - 9.4.7 954c08fa6188 5ad7ee7d8e61\n<\/code><\/pre>\n\n\n\nsudo ceph orch ps --service_name prometheus<\/code><\/pre>\n\n\n\nNext, deploy the other monitoring services across the nodes;<\/p>\n\n\n\n
for i in ceph-exporter node-exporter alertmanager; do sudo ceph orch apply $i; done<\/code><\/pre>\n\n\n\nIf you check the services again;<\/p>\n\n\n\n
sudo ceph orch ps<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \nalertmanager.node05 node05 *:9093,9094 running (3m) 2m ago 4m 14.8M - 0.25.0 c8568f914cd2 6b1d5898c410 \nceph-exporter.node01 node01 running (4m) 3m ago 4m 7847k - 18.2.2 1c40e0e88d74 3e8f1d4ccbbb \nceph-exporter.node02 node02 running (4m) 3m ago 4m 6836k - 18.2.2 1c40e0e88d74 6303034fea00 \nceph-exporter.node03 node03 running (4m) 3m ago 4m 6536k - 18.2.2 1c40e0e88d74 a77a95ccdc46 \nceph-exporter.node04 node04 running (4m) 3m ago 4m 6431k - 18.2.2 1c40e0e88d74 d258e24b65c0 \nceph-exporter.node05 node05 running (4m) 2m ago 4m 6471k - 18.2.2 1c40e0e88d74 6c9d33d111ed \ncrash.node01 node01 running (4h) 3m ago 4h 7088k - 18.2.2 1c40e0e88d74 17e2319f10dc \ncrash.node02 node02 running (3h) 3m ago 3h 8939k - 18.2.2 1c40e0e88d74 aa78609c1bf7 \ncrash.node03 node03 running (3h) 3m ago 3h 21.4M - 18.2.2 1c40e0e88d74 b957dc6b2dec \ncrash.node04 node04 running (3h) 3m ago 3h 7151k - 18.2.2 1c40e0e88d74 549db30e0271 \ncrash.node05 node05 running (3h) 2m ago 3h 22.5M - 18.2.2 1c40e0e88d74 3b7f1d77461a \ngrafana.node01 node01 *:3000 running (11m) 3m ago 11m 75.3M - 9.4.7 954c08fa6188 aea3b826052e \ngrafana.node02 node02 *:3000 running (11m) 3m ago 11m 74.2M - 9.4.7 954c08fa6188 5ad7ee7d8e61 \nmgr.node01.upxxuf node01 *:9283,8765 running (4h) 3m ago 4h 472M - 18.2.2 1c40e0e88d74 d1a0585d07b1 \nmgr.node02.bqzghl node02 *:8443,8765 running (3h) 3m ago 3h 429M - 18.2.2 1c40e0e88d74 a473f17a57e8 \nmon.node01 node01 running (4h) 3m ago 4h 93.0M 2048M 18.2.2 1c40e0e88d74 85890ac8c45f \nmon.node02 node02 running (3h) 3m ago 3h 77.7M 2048M 18.2.2 1c40e0e88d74 7dd8886151ec \nmon.node03 node03 running (3h) 3m ago 3h 73.0M 2048M 18.2.2 1c40e0e88d74 203db5ded05e \nmon.node04 node04 running (3h) 3m ago 3h 71.6M 2048M 18.2.2 1c40e0e88d74 17bfaae81be9 \nmon.node05 node05 running (3h) 2m ago 3h 72.8M 2048M 18.2.2 1c40e0e88d74 cf38ec8aa5fc \nnode-exporter.node01 node01 *:9100 running (4m) 3m ago 4m 2836k - 1.5.0 0da6a335fe13 e48cacd54ca1 \nnode-exporter.node02 node02 *:9100 running (4m) 3m ago 4m 3235k - 1.5.0 0da6a335fe13 fd0597727467 \nnode-exporter.node03 node03 *:9100 running (4m) 3m ago 4m 4263k - 1.5.0 0da6a335fe13 ce5710d50c54 \nnode-exporter.node04 node04 *:9100 running (4m) 3m ago 4m 3447k - 1.5.0 0da6a335fe13 e74a6bde46a3 \nnode-exporter.node05 node05 *:9100 running (4m) 2m ago 4m 8640k - 1.5.0 0da6a335fe13 d2223d92dcca \nosd.0 node05 running (3h) 3m ago 3h 48.8M 4096M 18.2.2 1c40e0e88d74 309e5582f1d9 \nosd.1 node03 running (3h) 3m ago 3h 54.4M 4096M 18.2.2 1c40e0e88d74 82da63b2386e \nosd.2 node04 running (3h) 3m ago 3h 53.8M 4096M 18.2.2 1c40e0e88d74 bb605719cb20\nprometheus.node01 node01 *:9095 running (3m) 3m ago 11m 20.7M - 2.43.0 a07b618ecd1d ea4cb99f01da \nprometheus.node02 node02 *:9095 running (3m) 3m ago 11m 22.3M - 2.43.0 a07b618ecd1d fa3968b4803b\n<\/code><\/pre>\n\n\n\nYou should now be able to administer your cluster using either of the admin nodes with VIP.<\/p>\n\n\n\n
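Besides ceph orch ps, you can also get a more compact, service-level summary of what is deployed and how each service is placed with the standard command below;<\/p>\n\n\n\n
sudo ceph orch ls<\/code><\/pre>\n\n\n\n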
See access to the dashboard via the second admin node.<\/p>\n\n\n\n <\/figure>\n\n\n\nSimilarly, you should be able to access the cluster dashboard using the VIP address.<\/p>\n\n\n\n
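You can also check which URLs the active manager is currently advertising for the dashboard, Prometheus and Grafana. The command below is standard Ceph; the URLs it returns will point at whichever nodes currently host the active mgr and the monitoring services;<\/p>\n\n\n\n
sudo ceph mgr services<\/code><\/pre>\n\n\n\n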
Update the Grafana Prometheus data source address on each admin node so that it uses the VIP to connect to the Prometheus service. That way, if either of the admin nodes goes down, Grafana can still pull metrics through the VIP from whichever admin node is currently serving the cluster.<\/p>\n\n\n\n
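The datasource file path used below includes the cluster fsid (17ef548c-f68b-11ee-9a19-4d1575fdfd98 in this guide). If you need to confirm the fsid of your own cluster before adjusting the path, you can get it with;<\/p>\n\n\n\n
sudo ceph fsid<\/code><\/pre>\n\n\n\n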
On Node01<\/p>\n\n\n\n
sudo sed -i 's\/\\bnode[0-9]\\+\\b\/192.168.122.200\/g' \/var\/lib\/ceph\/17ef548c-f68b-11ee-9a19-4d1575fdfd98\/grafana.node01\/etc\/grafana\/provisioning\/datasources\/ceph-dashboard.yml<\/code><\/pre>\n\n\n\nsudo systemctl restart ceph-17ef548c-f68b-11ee-9a19-4d1575fdfd98@grafana.node01.service<\/code><\/pre>\n\n\n\nOn Node02<\/p>\n\n\n\n
sudo sed -i 's\/\\bnode[0-9]\\+\\b\/192.168.122.200\/g' \/var\/lib\/ceph\/17ef548c-f68b-11ee-9a19-4d1575fdfd98\/grafana.node02\/etc\/grafana\/provisioning\/datasources\/ceph-dashboard.yml<\/code><\/pre>\n\n\n\nsudo systemctl restart ceph-17ef548c-f68b-11ee-9a19-4d1575fdfd98@grafana.node02.service<\/code><\/pre>\n\n\n\nSee sample Grafana dashboards via the VIP;<\/p>\n\n\n\n <\/figure>\n\n\n\nAnd that is it!<\/p>\n\n\n\n
Updating an Existing Ceph Cluster with Admin Node HA Setup<\/h3>\n\n\n\n If you have already deployed a Ceph cluster with a single admin node and you want to update the setup such that you have two admin nodes for high availability, then proceed as follows.<\/p>\n\n\n\n
Let's check the current Ceph cluster status;<\/p>\n\n\n\n
sudo ceph -s<\/code><\/pre>\n\n\n\n cluster:\n id: 17ef548c-f68b-11ee-9a19-4d1575fdfd98\n health: HEALTH_OK\n \n services:\n mon: 5 daemons, quorum node01,node02,node05,node03,node04 (age 19m)\n mgr: node02.bqzghl(active, since 44h), standbys: node01.upxxuf\n osd: 3 osds: 3 up (since 44h), 3 in (since 44h)\n \n data:\n pools: 4 pools, 97 pgs\n objects: 296 objects, 389 MiB\n usage: 576 MiB used, 299 GiB \/ 300 GiB avail\n pgs: 97 active+clean\n \n io:\n client: 170 B\/s rd, 170 B\/s wr, 0 op\/s rd, 0 op\/s wr\n<\/code><\/pre>\n\n\n\nAlso see current services\/daemon distribution across the cluster nodes.<\/p>\n\n\n\n
sudo ceph orch ps<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \nalertmanager.node05 node05 *:9093,9094 running (45h) 4m ago 45h 14.3M - 0.25.0 c8568f914cd2 6b1d5898c410 \nceph-exporter.node01 node01 running (45h) 100s ago 45h 16.7M - 18.2.2 1c40e0e88d74 3e8f1d4ccbbb \nceph-exporter.node02 node02 running (45h) 99s ago 45h 16.3M - 18.2.2 1c40e0e88d74 6303034fea00 \nceph-exporter.node03 node03 running (45h) 4m ago 45h 17.7M - 18.2.2 1c40e0e88d74 a77a95ccdc46 \nceph-exporter.node04 node04 running (45h) 4m ago 45h 17.6M - 18.2.2 1c40e0e88d74 d258e24b65c0 \nceph-exporter.node05 node05 running (45h) 4m ago 45h 17.6M - 18.2.2 1c40e0e88d74 6c9d33d111ed \ncrash.node01 node01 running (2d) 100s ago 2d 6891k - 18.2.2 1c40e0e88d74 17e2319f10dc \ncrash.node02 node02 running (2d) 99s ago 2d 7659k - 18.2.2 1c40e0e88d74 aa78609c1bf7 \ncrash.node03 node03 running (2d) 4m ago 2d 21.1M - 18.2.2 1c40e0e88d74 b957dc6b2dec \ncrash.node04 node04 running (2d) 4m ago 2d 7959k - 18.2.2 1c40e0e88d74 549db30e0271 \ncrash.node05 node05 running (2d) 4m ago 2d 22.8M - 18.2.2 1c40e0e88d74 3b7f1d77461a \ngrafana.node02 node02 *:3000 running (11m) 99s ago 45h 77.7M - 9.4.7 954c08fa6188 fbba2415751e \nmgr.node01.upxxuf node01 *:9283,8765 running (2d) 100s ago 2d 448M - 18.2.2 1c40e0e88d74 d1a0585d07b1 \nmgr.node02.bqzghl node02 *:8443,8765 running (2d) 99s ago 2d 581M - 18.2.2 1c40e0e88d74 a473f17a57e8 \nmon.node01 node01 running (2d) 100s ago 2d 423M 2048M 18.2.2 1c40e0e88d74 85890ac8c45f \nmon.node02 node02 running (2d) 99s ago 2d 420M 2048M 18.2.2 1c40e0e88d74 7dd8886151ec \nmon.node03 node03 running (2d) 4m ago 2d 422M 2048M 18.2.2 1c40e0e88d74 203db5ded05e \nmon.node04 node04 running (2d) 4m ago 2d 428M 2048M 18.2.2 1c40e0e88d74 17bfaae81be9 \nmon.node05 node05 running (2d) 4m ago 2d 423M 2048M 18.2.2 1c40e0e88d74 cf38ec8aa5fc \nnode-exporter.node01 node01 *:9100 running (45h) 100s ago 45h 9484k - 1.5.0 0da6a335fe13 e48cacd54ca1 \nnode-exporter.node02 node02 *:9100 running (45h) 99s ago 45h 9883k - 1.5.0 0da6a335fe13 fd0597727467 \nnode-exporter.node03 node03 *:9100 running (45h) 4m ago 45h 12.6M - 1.5.0 0da6a335fe13 ce5710d50c54 \nnode-exporter.node04 node04 *:9100 running (45h) 4m ago 45h 10.3M - 1.5.0 0da6a335fe13 e74a6bde46a3 \nnode-exporter.node05 node05 *:9100 running (45h) 4m ago 45h 10.4M - 1.5.0 0da6a335fe13 d2223d92dcca \nosd.0 node05 running (44h) 4m ago 44h 146M 4096M 18.2.2 1c40e0e88d74 309e5582f1d9 \nosd.1 node03 running (44h) 4m ago 44h 158M 4096M 18.2.2 1c40e0e88d74 82da63b2386e \nosd.2 node04 running (44h) 4m ago 44h 151M 4096M 18.2.2 1c40e0e88d74 bb605719cb20 \nprometheus.node01 node01 *:9095 running (43h) 100s ago 45h 83.6M - 2.43.0 a07b618ecd1d 3f489e6e48ed\n<\/code><\/pre>\n\n\n\nServices\/daemons are running on random nodes! In this setup, if the Ceph admin node goes down, you won't be able to administer your Ceph cluster.<\/p>\n\n\n\n
Therefore:<\/p>\n\n\n\n
Designate Second Node as Admin Node<\/h4>\n\n\n\n To set up another node as an admin node on a Ceph cluster, you have to add the _admin<\/strong> label to that node.<\/p>\n\n\n\nRefer to the Setup Second Ceph Admin Node section<\/a> above.<\/p>\n\n\n\nNow, we have designated node02 as an admin node as well;<\/p>\n\n\n\n
sudo ceph orch host ls<\/code><\/pre>\n\n\n\nHOST ADDR LABELS STATUS \nnode01 192.168.122.78 _admin,admin01 \nnode02 192.168.122.79 admin02,_admin \nnode03 192.168.122.80 osd1 \nnode04 192.168.122.90 osd2 \nnode05 192.168.122.91 osd3 \n5 hosts in cluster\n<\/code><\/pre>\n\n\n\nUpdate Monitoring Stack<\/h4>\n\n\n\n If you enabled the monitoring stack in your Ceph cluster, you need to update it to ensure that at least Grafana and Prometheus are reachable via the admin nodes' VIP.<\/p>\n\n\n\n
Grafana and Prometheus are scheduled to run on random nodes.<\/p>\n\n\n\n
sudo ceph orch ps --service_name grafana<\/code><\/pre>\n\n\n\nsudo ceph orch ps --service_name prometheus<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \ngrafana.node02 node02 *:3000 running (58m) 6m ago 46h 78.3M - 9.4.7 954c08fa6188 fbba2415751e\nprometheus.node01 node01 *:9095 running (44h) 6m ago 46h 85.1M - 2.43.0 a07b618ecd1d 3f489e6e48e\n<\/code><\/pre>\n\n\n\nSee, we only have one instance of each running. So, if one of those nodes goes down, the service running on it becomes unreachable and you lose visibility into the cluster.<\/p>\n\n\n\n
Therefore, set them to run on both admin nodes.<\/p>\n\n\n\n
sudo ceph orch apply grafana --placement=\"node01 node02\"<\/code><\/pre>\n\n\n\nsudo ceph orch apply prometheus --placement=\"node01 node02\"<\/code><\/pre>\n\n\n\nOnce the update is complete, verify the same;<\/p>\n\n\n\n
sudo ceph orch ps --service_name prometheus<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \nprometheus.node01 node01 *:9095 running (44h) 3m ago 46h 86.5M - 2.43.0 a07b618ecd1d 3f489e6e48ed \nprometheus.node02 node02 *:9095 running (4m) 3m ago 4m 27.6M - 2.43.0 a07b618ecd1d cd686cf8041c\n<\/code><\/pre>\n\n\n\nsudo ceph orch ps --service_name grafana<\/code><\/pre>\n\n\n\nNAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID \ngrafana.node01 node01 *:3000 running (4m) 4m ago 4m 59.1M - 9.4.7 954c08fa6188 673eab1114dc \ngrafana.node02 node02 *:3000 running (4m) 4m ago 46h 57.1M - 9.4.7 954c08fa6188 017761188c91\n<\/code><\/pre>\n\n\n\nNext, on both admin nodes, update the Grafana dashboard Prometheus datasource address to use the VIP address.<\/p>\n\n\n\n
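If you would like to preview what the sed command will change before editing the datasource file in place, you can do a dry run that only prints the lines it would rewrite. The example below targets the node01 file; adjust the path accordingly for node02;<\/p>\n\n\n\n
sudo sed -n 's\/\\bnode[0-9]\\+\\b\/192.168.122.200\/gp' \/var\/lib\/ceph\/17ef548c-f68b-11ee-9a19-4d1575fdfd98\/grafana.node01\/etc\/grafana\/provisioning\/datasources\/ceph-dashboard.yml<\/code><\/pre>\n\n\n\n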
On Node01<\/p>\n\n\n\n
sudo sed -i 's\/\\bnode[0-9]\\+\\b\/192.168.122.200\/g' \/var\/lib\/ceph\/17ef548c-f68b-11ee-9a19-4d1575fdfd98\/grafana.node01\/etc\/grafana\/provisioning\/datasources\/ceph-dashboard.yml<\/code><\/pre>\n\n\n\nsudo systemctl restart ceph-17ef548c-f68b-11ee-9a19-4d1575fdfd98@grafana.node01.service<\/code><\/pre>\n\n\n\nOn Node02<\/p>\n\n\n\n
sudo sed -i 's\/\\bnode[0-9]\\+\\b\/192.168.122.200\/g' \/var\/lib\/ceph\/17ef548c-f68b-11ee-9a19-4d1575fdfd98\/grafana.node02\/etc\/grafana\/provisioning\/datasources\/ceph-dashboard.yml<\/code><\/pre>\n\n\n\nsudo systemctl restart ceph-17ef548c-f68b-11ee-9a19-4d1575fdfd98@grafana.node02.service<\/code><\/pre>\n\n\n\nVerify Access to Ceph Cluster Dashboard and Monitoring Dashboard via VIP<\/h4>\n\n\n\n At this point, you should now be able to access the Ceph cluster dashboard as well as Grafana\/Prometheus via the VIP. Note that when both admin nodes (manager nodes) are available and you use the VIP, you will simply be redirected to one of the admin nodes.<\/p>\n\n\n\n
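Before opening a browser, you can optionally run a quick reachability check against the VIP. This is just a sketch, assuming the ports shown in the earlier ceph orch ps output (8443 for the dashboard, 3000 for Grafana, 9095 for Prometheus) and that the dashboard and Grafana are served over HTTPS, which is the cephadm default;<\/p>\n\n\n\n
curl -k -I https:\/\/192.168.122.200:8443\ncurl -k -I https:\/\/192.168.122.200:3000\ncurl -I http:\/\/192.168.122.200:9095<\/code><\/pre>\n\n\n\n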
Grafana Dashboard via our VIP, 192.168.122.200;<\/p>\n\n\n\n <\/figure>\n\n\n\nPrometheus Dashboard via our VIP, 192.168.122.200;<\/p>\n\n\n\n <\/figure>\n\n\n\nTo test Ceph functionality with one of the admin nodes down, let me take down admin node01 by bringing its network interface down.<\/p>\n\n\n\n
sudo ip link set enp1s0 down<\/code><\/pre>\n\n\n\nNode02 then takes over the VIP;<\/p>\n\n\n\n
root@node02:~# ip -br a\nlo UNKNOWN 127.0.0.1\/8 ::1\/128 \nenp1s0 UP 192.168.122.79\/24 192.168.122.200\/24 fe80::5054:ff:fe35:3981\/64 \ndocker0 DOWN 172.17.0.1\/16 \n<\/code><\/pre>\n\n\n\nCeph Status from node02;<\/p>\n\n\n\n
root@node02:~# ceph -s\n cluster:\n id: 17ef548c-f68b-11ee-9a19-4d1575fdfd98\n health: HEALTH_WARN\n 1 hosts fail cephadm check\n 1\/5 mons down, quorum node02,node05,node03,node04\n \n services:\n mon: 5 daemons, quorum node02,node05,node03,node04 (age 34s), out of quorum: node01\n mgr: node02.bqzghl(active, since 46h), standbys: node01.upxxuf\n osd: 3 osds: 3 up (since 45h), 3 in (since 45h)\n \n data:\n pools: 4 pools, 97 pgs\n objects: 296 objects, 389 MiB\n usage: 576 MiB used, 299 GiB \/ 300 GiB avail\n pgs: 97 active+clean\n \n io:\n client: 170 B\/s rd, 170 B\/s wr, 0 op\/s rd, 0 op\/s wr\n<\/code><\/pre>\n\n\n\nAccess to Dashboard;<\/p>\n\n\n\n <\/figure>\n\n\n\nI have also integrated my OpenStack with the Ceph cluster<\/a>. So I tested creating an instance with a volume and an image on OpenStack with one admin node down and it worked as usual!<\/p>\n\n\n\nConclusion<\/h3>\n\n\n\n That concludes our guide on configuring a highly available Ceph storage cluster admin node.<\/p>\n\n\n\n
Read more on Ceph host management<\/a>.<\/p>\n