{"id":19289,"date":"2023-11-18T15:11:36","date_gmt":"2023-11-18T12:11:36","guid":{"rendered":"https:\/\/kifarunix.com\/?p=19289"},"modified":"2024-03-10T14:51:44","modified_gmt":"2024-03-10T11:51:44","slug":"how-to-start-stop-or-restart-ceph-services","status":"publish","type":"post","link":"https:\/\/kifarunix.com\/how-to-start-stop-or-restart-ceph-services\/","title":{"rendered":"How to Start Stop or Restart Ceph Services"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1054\" height=\"593\" src=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2023\/11\/managing-ceph-services.png?v=1700309246\" alt=\"How to Start Stop or Restart Ceph Services\" class=\"wp-image-19319\" title=\"\" srcset=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2023\/11\/managing-ceph-services.png?v=1700309246 1054w, https:\/\/kifarunix.com\/wp-content\/uploads\/2023\/11\/managing-ceph-services-768x432.png?v=1700309246 768w\" sizes=\"(max-width: 1054px) 100vw, 1054px\" \/><\/figure>\n\n\n\n<p>In this tutorial, you will learn how to start stop or restart Ceph Services. <a href=\"https:\/\/ceph.io\/en\/discover\/\" target=\"_blank\" rel=\"noreferrer noopener\">Ceph<\/a> is a distributed storage system that provides object storage, block storage, and file storage capabilities. It comprises several services that work together to manage and store data across a cluster of nodes. 
The key Ceph services include:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>MON Service (Monitor Service):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Monitors the health and status of the Ceph cluster.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>MGR Service (Manager Service):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Provides a management interface for the Ceph cluster.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>OSD Service (Object Storage Daemon):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Manages storage devices for storing and retrieving data as objects.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>RGW Service (RADOS Gateway Service &#8211; Object Gateway):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Offers a RESTful API gateway interface for Ceph&#8217;s object storage.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>MDS Service (Metadata Server):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Manages metadata for the Ceph File System (CephFS).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>NFS Service:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Provides Network File System (NFS) access to Ceph storage.<\/li>\n\n\n\n<li>Utilizes the <code>nfs-ganesha<\/code> daemon.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>RBD Service (RADOS Block Device):<\/strong>\n<ul class=\"wp-block-list\">\n<li>Manages block storage devices within the Ceph cluster.<\/li>\n\n\n\n<li>Utilizes the <code>rbd<\/code> component and interacts with the <code>rados<\/code> and <code>ceph-osd<\/code> daemons.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p>Read more on <a href=\"https:\/\/docs.ceph.com\/en\/latest\/cephadm\/services\/\" target=\"_blank\" rel=\"noreferrer noopener\">Ceph service management<\/a>.<\/p>\n\n\n\n<p>To start, stop or restart Ceph services, proceed as follows.<\/p>\n\n\n\n<div class=\"wp-block-rank-math-toc-block\" id=\"rank-math-toc\"><h2>Table of Contents<\/h2><nav><ul><li><a href=\"#how-to-start-stop-or-restart-ceph-services\">How to Start, Stop or 
Restart Ceph Services<\/a><ul><li><a href=\"#list-ceph-services\">List Ceph Services<\/a><\/li><li><a href=\"#manage-ceph-services-at-a-cluster-level\">Manage Ceph Services at a Cluster Level<\/a><\/li><li><a href=\"#manage-ceph-services-at-a-node-level\">Manage Ceph Services at a Node Level<\/a><\/li><li><a href=\"#daemon-level\">Daemon Level<\/a><\/li><li><a href=\"#how-to-gracefully-stop-and-start-whole-ceph-cluster-for-maintenance\">How to Gracefully Stop and Start Whole Ceph Cluster for Maintenance<\/a><ul><li><a href=\"#verify-healthy-cluster-state\">Verify healthy cluster state<\/a><\/li><li><a href=\"#backup-your-data\">Backup your Data<\/a><\/li><li><a href=\"#stop-data-writes-on-ceph-cluster\">Stop data writes on Ceph Cluster<\/a><\/li><li><a href=\"#prepare-the-object-storage-devices-os-ds-for-shutdown\">Prepare the Object Storage Devices (OSDs) for Shutdown<\/a><\/li><li><a href=\"#shut-down-the-ceph-cluster-nodes\">Shut down the Ceph cluster Nodes.<\/a><\/li><li><a href=\"#bring-backup-ceph-cluster\">Bring Backup Ceph Cluster<\/a><\/li><\/ul><\/li><\/ul><\/li><\/ul><\/nav><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"how-to-start-stop-or-restart-ceph-services\">How to Start, Stop or Restart Ceph Services<\/h2>\n\n\n\n<p>In a Ceph cluster, services are organized and managed at different levels:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>cluster<\/li>\n\n\n\n<li>node<\/li>\n\n\n\n<li>daemon<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"list-ceph-services\">List Ceph Services<\/h3>\n\n\n\n<p>You can get a list of Ceph services using <strong><code>ceph orch<\/code><\/strong> command.<\/p>\n\n\n\n<p>To get a general overview of the ceph services, run the command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo ceph orch ls<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME           PORTS        RUNNING  REFRESHED  AGE  PLACEMENT    \nalertmanager   ?:9093,9094      1\/1  6m ago     2d   count:1      \nceph-exporter            
       4\/4  10m ago    2d   *            \ncrash                           4\/4  10m ago    2d   *            \ngrafana        ?:3000           1\/1  6m ago     2d   count:1      \nmgr                             2\/2  6m ago     2d   count:2      \nmon                             4\/5  10m ago    2d   count:5      \nnode-exporter  ?:9100           4\/4  10m ago    2d   *            \nosd                               3  10m ago    -    <unmanaged>  \nprometheus     ?:9095           1\/1  6m ago     2d   count:1 \n<\/code><\/pre>\n\n\n\n<p>To get a detailed listing of the Ceph services, run the command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo ceph orch ps<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME                      HOST        PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  \nalertmanager.ceph-admin   ceph-admin  *:9093,9094       running (2d)      9m ago   2d    15.1M        -  0.25.0   c8568f914cd2  8c12c81552e5  \nceph-exporter.ceph-admin  ceph-admin                    running (2d)      9m ago   2d    22.5M        -  18.2.0   10237bca3285  1ca71e41cd22  \nceph-exporter.ceph-mon    ceph-mon                      running (8h)      9m ago   2d    17.5M        -  18.2.0   10237bca3285  939f1001e611  \nceph-exporter.ceph-osd1   ceph-osd1                     running (2d)      3m ago   2d    18.0M        -  18.2.0   10237bca3285  a8bb422e2a79  \nceph-exporter.ceph-osd2   ceph-osd2                     running (2d)      2m ago   2d    17.7M        -  18.2.0   10237bca3285  deaaa5c586d1  \ncrash.ceph-admin          ceph-admin                    running (2d)      9m ago   2d    7432k        -  18.2.0   10237bca3285  fac0c03abfa2  \ncrash.ceph-mon            ceph-mon                      running (8h)      9m ago   2d    7084k        -  18.2.0   10237bca3285  c6ad83687a9d  \ncrash.ceph-osd1           ceph-osd1                     running (2d)      3m ago   2d    7119k        -  18.2.0   
10237bca3285  f2c57cbaaf3d  \ncrash.ceph-osd2           ceph-osd2                     running (2d)      2m ago   2d    7107k        -  18.2.0   10237bca3285  bf23fe62a3a6  \ngrafana.ceph-admin        ceph-admin  *:3000            running (32h)     9m ago   2d    88.1M        -  9.4.7    2c41d148cca3  d3f2f3edc8e8  \nmgr.ceph-admin.ykkdly     ceph-admin  *:9283,8765,8443  running (2d)      9m ago   2d     646M        -  18.2.0   10237bca3285  9b395d873cf5  \nmgr.ceph-mon.grwzmv       ceph-mon    *:8443,9283,8765  running (8h)      9m ago   2d     431M        -  18.2.0   10237bca3285  a40257127c4f  \nmon.ceph-admin            ceph-admin                    running (2d)      9m ago   2d     497M    2048M  18.2.0   10237bca3285  39a5c79ebe49  \nmon.ceph-mon              ceph-mon                      running (8h)      9m ago   2d     226M    2048M  18.2.0   10237bca3285  69af76467894  \nmon.ceph-osd1             ceph-osd1                     running (2d)      3m ago   2d     442M    2048M  18.2.0   10237bca3285  48e379303841  \nmon.ceph-osd2             ceph-osd2                     running (2d)      2m ago   2d     446M    2048M  18.2.0   10237bca3285  1a5ac19d09c2  \nnode-exporter.ceph-admin  ceph-admin  *:9100            running (2d)      9m ago   2d    9940k        -  1.5.0    0da6a335fe13  f8a22cdbc222  \nnode-exporter.ceph-mon    ceph-mon    *:9100            running (8h)      9m ago   2d    8991k        -  1.5.0    0da6a335fe13  bc7bd68616a8  \nnode-exporter.ceph-osd1   ceph-osd1   *:9100            running (2d)      3m ago   2d    9564k        -  1.5.0    0da6a335fe13  0e26f9a5cd1e  \nnode-exporter.ceph-osd2   ceph-osd2   *:9100            running (2d)      2m ago   2d    9075k        -  1.5.0    0da6a335fe13  b557f82a9e1d  \nosd.0                     ceph-mon                      running (8h)      9m ago   2d    77.4M    4096M  18.2.0   10237bca3285  fbbb2be86316  \nosd.1                     ceph-osd1                     running (2d)      3m ago   2d    99.0M    
2356M  18.2.0   10237bca3285  4c930eb2c71e  \nosd.2                     ceph-osd2                     running (2d)      2m ago   2d    96.3M    2356M  18.2.0   10237bca3285  94551a6b5b94  \nprometheus.ceph-admin     ceph-admin  *:9095            running (2d)      9m ago   2d     256M        -  2.43.0   a07b618ecd1d  3b63ed00c55e\n<\/code><\/pre>\n\n\n\n<p>You can list only the services running on a specific node by specifying the node name.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch ps HOST<\/code><\/pre>\n\n\n\n<p>For example;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch ps ceph-osd1<\/code><\/pre>\n\n\n\n<pre class=\"scroll-sz\"><code>NAME                     HOST       PORTS   STATUS        REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  \nceph-exporter.ceph-osd1  ceph-osd1          running (2d)     6m ago   2d    18.0M        -  18.2.0   10237bca3285  a8bb422e2a79  \ncrash.ceph-osd1          ceph-osd1          running (2d)     6m ago   2d    7119k        -  18.2.0   10237bca3285  f2c57cbaaf3d  \nmon.ceph-osd1            ceph-osd1          running (2d)     6m ago   2d     442M    2048M  18.2.0   10237bca3285  48e379303841  \nnode-exporter.ceph-osd1  ceph-osd1  *:9100  running (2d)     6m ago   2d    9564k        -  1.5.0    0da6a335fe13  0e26f9a5cd1e  \nosd.1                    ceph-osd1          running (2d)     6m ago   2d    99.0M    2356M  18.2.0   10237bca3285  4c930eb2c71e\n<\/code><\/pre>\n\n\n\n<p>To check a specific service on a specific node;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch ps ceph-osd1 --service_name mon<\/code><\/pre>\n\n\n\n<pre class=\"scroll-sz\"><code>NAME           HOST       PORTS  STATUS        REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  \nmon.ceph-osd1  ceph-osd1         running (3m)     3m ago   2d    19.6M    2048M  18.2.0   10237bca3285  b9bf54ad48d9\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" 
id=\"manage-ceph-services-at-a-cluster-level\">Manage Ceph Services at a Cluster Level<\/h3>\n\n\n\n<p>The cluster level represents the overall orchestration and coordination of all nodes and services to create a unified storage infrastructure. At the cluster level, Ceph services collaborate to ensure data redundancy, fault tolerance, and efficient storage management. Key activities at the cluster level include maintaining cluster maps, distributing data across OSDs, handling failover scenarios, and managing the overall health of the Ceph storage system.<\/p>\n\n\n\n<p>To start, stop or restart Ceph services at a cluster level, you use the <strong><code>ceph orch<\/code><\/strong> command.<\/p>\n\n\n\n<p>The command syntax to start, stop, or restart a cluster service is;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch &lt;start|stop|restart&gt; &lt;service_name&gt;<\/code><\/pre>\n\n\n\n<p>To stop, start, or restart a given service across the cluster, replace <strong><code>&lt;service_name&gt;<\/code><\/strong> with a name from the NAME column of the <strong><code>ceph orch ls<\/code><\/strong> output;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch stop &lt;service_name&gt;<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch start &lt;service_name&gt;<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch restart &lt;service_name&gt;<\/code><\/pre>\n\n\n\n<p>Note that you cannot stop the <strong>mgr<\/strong> or <strong>mon<\/strong> services for the entire cluster. Stopping these services cluster-wide would make the cluster inaccessible. You can instead issue the restart command to schedule a node-by-node restart.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"manage-ceph-services-at-a-node-level\">Manage Ceph Services at a Node Level<\/h3>\n\n\n\n<p>At the node level, Ceph services are associated with individual servers or nodes in the cluster. Each node typically runs multiple daemons, which collaborate to provide the necessary storage services. 
Nodes can host OSDs, MONs, MGRs, RGWs, or MDSs, depending on the specific role assigned to them in the Ceph cluster.<\/p>\n\n\n\n<p>You can use <code>systemctl<\/code>\u00a0command to start, stop or restart ceph services at a node level.<\/p>\n\n\n\n<p>List Ceph SystemD services running on a specific node;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo systemctl list-units \"*ceph*\"<\/code><\/pre>\n\n\n\n<p>Sample output on my ceph-osd1 node;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>  UNIT                                                                      LOAD   ACTIVE SUB     DESCRIPTION                                                          \n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415@ceph-exporter.ceph-osd1.service loaded active running Ceph ceph-exporter.ceph-osd1 for 70d227de-83e3-11ee-9dda-ff8b7941e415\n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415@crash.ceph-osd1.service         loaded active running Ceph crash.ceph-osd1 for 70d227de-83e3-11ee-9dda-ff8b7941e415\n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415@mon.ceph-osd1.service           loaded active running Ceph mon.ceph-osd1 for 70d227de-83e3-11ee-9dda-ff8b7941e415\n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415@node-exporter.ceph-osd1.service loaded active running Ceph node-exporter.ceph-osd1 for 70d227de-83e3-11ee-9dda-ff8b7941e415\n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415@osd.1.service                   loaded active running Ceph osd.1 for 70d227de-83e3-11ee-9dda-ff8b7941e415                  \n  system-ceph\\x2d70d227de\\x2d83e3\\x2d11ee\\x2d9dda\\x2dff8b7941e415.slice     loaded active active  Slice \/system\/ceph-70d227de-83e3-11ee-9dda-ff8b7941e415              \n  ceph-70d227de-83e3-11ee-9dda-ff8b7941e415.target                          loaded active active  Ceph cluster 70d227de-83e3-11ee-9dda-ff8b7941e415\n  ceph.target                                                               loaded active active  All Ceph clusters and services\n\nLOAD   = Reflects whether the unit definition was 
properly loaded.\nACTIVE = The high-level unit activation state, i.e. generalization of SUB.\nSUB    = The low-level unit activation state, values depend on unit type.\n8 loaded units listed. Pass --all to see loaded but inactive units, too.\nTo show all installed unit files use 'systemctl list-unit-files'.\n\n<\/code><\/pre>\n\n\n\n<p>From the output above, the UNIT field shows the service names in the format;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph-<em>FSID<\/em>@<em>SERVICE_TYPE<\/em>.<em>ID<\/em>.service<\/code><\/pre>\n\n\n\n<p>Where;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><code>FSID<\/code><\/strong> is the unique identifier of the Ceph cluster (the <strong>id<\/strong> field in the <code>ceph -s<\/code> output).<\/li>\n\n\n\n<li><strong>SERVICE_TYPE.ID<\/strong> is the Ceph systemd service name, which corresponds to the NAME field of the <strong><code>ceph orch ps<\/code><\/strong> command output.<\/li>\n<\/ul>\n\n\n\n<p>So, you can control and manage each individual Ceph service on a node using the <code>systemctl<\/code> command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl stop <strong>ceph-<em>FSID<\/em>@<em>SERVICE_TYPE<\/em>.<em>ID<\/em><\/strong><\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl start <strong>ceph-<em>FSID<\/em>@<em>SERVICE_TYPE<\/em>.<em>ID<\/em><\/strong><\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl restart <strong>ceph-<em>FSID<\/em>@<em>SERVICE_TYPE<\/em>.<em>ID<\/em><\/strong><\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl status <strong>ceph-<em>FSID<\/em>@<em>SERVICE_TYPE<\/em>.<em>ID<\/em><\/strong><\/code><\/pre>\n\n\n\n<p>If you want to manage all the Ceph services on a node, across every cluster it belongs to, use the <strong><code>ceph.target<\/code><\/strong> unit.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl stop <strong>ceph.target<\/strong><\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl start <strong>ceph.target<\/strong><\/code><\/pre>\n\n\n\n<pre 
class=\"wp-block-code\"><code>systemctl restart <strong>ceph<strong>.target<\/strong><\/strong><\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl status <strong>ceph<strong>.target<\/strong><\/strong><\/code><\/pre>\n\n\n\n<p>If you are running multiple clusters, then services associated with the cluster will have their respective cluster IDs. So, if you want to manage all services for a specific cluster, then;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>systemctl &lt;start|stop|restart|status&gt; <strong>ceph<strong>-FSID.target<\/strong><\/strong><\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"daemon-level\">Daemon Level<\/h3>\n\n\n\n<p>Ceph employs a decentralized architecture where various components, called daemons, work together to provide different storage services. These daemons are responsible for specific tasks within the Ceph cluster. Here are some key Ceph daemons:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>OSD (Object Storage Daemon):<\/strong> Manages the storage devices and is responsible for storing and retrieving data as objects.<\/li>\n\n\n\n<li><strong>MON (Monitor Daemon):<\/strong> Maintains maps of the cluster state, including OSD maps and monitor maps. Monitors communicate with each other to reach a consensus on the state of the cluster.<\/li>\n\n\n\n<li><strong>MGR (Manager Daemon):<\/strong> Provides a management interface for the Ceph cluster, offering RESTful APIs and a web-based dashboard for monitoring and managing the cluster.<\/li>\n\n\n\n<li><strong>RGW (RADOS Gateway Daemon):<\/strong> Facilitates access to Ceph object storage through S3 and Swift-compatible APIs.<\/li>\n\n\n\n<li><strong>MDS (Metadata Server Daemon):<\/strong> Manages metadata for Ceph File System (CephFS), facilitating file access and directory operations.<\/li>\n<\/ol>\n\n\n\n<p>The <code>ceph orch daemon<\/code> command in Ceph orchestrator is used to start, stop or restart ceph services at a daemon level. 
It allows you to interact with and perform various operations on Ceph daemon services deployed in the cluster. The <code>ceph orch daemon<\/code> command provides subcommands for tasks such as starting, stopping, restarting, and reconfiguring daemons.<\/p>\n\n\n\n<p>Thus, the command syntax is;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch daemon &lt;start|stop|restart&gt; SERVICE_NAME<\/code><\/pre>\n\n\n\n<p>You can get the SERVICE_NAME from the NAME column of the <strong><code>ceph orch ps<\/code><\/strong> command output. For example, to restart the Grafana daemon on the ceph-admin node;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch daemon restart grafana.ceph-admin<\/code><\/pre>\n\n\n\n<p>For more options, check the command help;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph orch daemon -h<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"how-to-gracefully-stop-and-start-whole-ceph-cluster-for-maintenance\">How to Gracefully Stop and Start Whole Ceph Cluster for Maintenance<\/h3>\n\n\n\n<p><strong>HEADS UP! POTENTIAL DATA LOSS\/CORRUPTION! PROCEED AT YOUR OWN RISK!<\/strong><\/p>\n\n\n\n<p>Stopping the entire Ceph cluster involves stopping all Ceph daemon services across the MONs (Monitors), OSDs (Object Storage Daemons), MGRs (Managers), and other components.<\/p>\n\n\n\n<p>Be cautious when stopping a Ceph cluster, especially in production environments, to avoid potential data loss or corruption. Ensure that you have proper backups and that the cluster is not serving critical workloads.<\/p>\n\n\n\n<p>The specific steps might depend on how Ceph was deployed in your environment (e.g. using cephadm, manual deployment, or other methods). 
You can check our <a href=\"https:\/\/kifarunix.com\/?s=setup+ceph+storage+cluster\" target=\"_blank\" rel=\"noreferrer noopener\">Ceph cluster deployment guides<\/a>.<\/p>\n\n\n\n<p>If you are sure you want to proceed, continue as follows.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"verify-healthy-cluster-state\">Verify healthy cluster state<\/h4>\n\n\n\n<p>Before initiating the shutdown process, ensure that the Ceph cluster is in a healthy state. Check for any ongoing maintenance tasks, data replication issues, or OSD failures.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph -s<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>  cluster:\n    id:     70d227de-83e3-11ee-9dda-ff8b7941e415\n    health: HEALTH_OK\n \n  services:\n    mon: 4 daemons, quorum ceph-admin,ceph-mon,ceph-osd1,ceph-osd2 (age 102m)\n    mgr: ceph-admin.ykkdly(active, since 2d), standbys: ceph-mon.grwzmv\n    osd: 3 osds: 3 up (since 38m), 3 in (since 2d)\n \n  data:\n    pools:   2 pools, 33 pgs\n    objects: 45 objects, 14 MiB\n    usage:   191 MiB used, 300 GiB \/ 300 GiB avail\n    pgs:     33 active+clean\n<\/code><\/pre>\n\n\n\n<p>Or simply;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph health<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>HEALTH_OK<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"backup-your-data\">Backup your Data<\/h4>\n\n\n\n<p>Ensure that you have a backup of your data in case of any unexpected issues during the shutdown process.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"stop-data-writes-on-ceph-cluster\">Stop data writes on Ceph Cluster<\/h4>\n\n\n\n<p>Stop any applications or processes that are writing data to the Ceph cluster. 
This prevents new data from being written while the cluster is shutting down, reducing the risk of data loss or corruption.<\/p>\n\n\n\n<p>If you have any clients using the cluster, stop or power them off before you proceed.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"prepare-the-object-storage-devices-os-ds-for-shutdown\">Prepare the Object Storage Devices (OSDs) for Shutdown<\/h4>\n\n\n\n<p>Set the following cluster-wide OSD flags in preparation for the cluster shutdown.<\/p>\n\n\n\n<p>Prevent OSDs from being treated as&nbsp;<code>out<\/code>&nbsp;of the cluster (useful during maintenance);<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set noout<\/code><\/pre>\n\n\n\n<p>This means that OSDs will not be marked as &#8220;out&#8221; during the shutdown, even if they stop responding.<\/p>\n\n\n\n<p>Disable backfill operations in the cluster;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set nobackfill<\/code><\/pre>\n\n\n\n<p>This command sets the cluster-wide <code>nobackfill<\/code> flag, which prevents OSDs from backfilling data from other OSDs.<\/p>\n\n\n\n<p>Disable cluster OSD recovery operations;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set norecover<\/code><\/pre>\n\n\n\n<p>This command sets the cluster-wide <code>norecover<\/code> flag, which prevents OSDs from starting recovery operations.<\/p>\n\n\n\n<p>Disable Ceph cluster rebalance operations;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set norebalance<\/code><\/pre>\n\n\n\n<p>This command sets the cluster-wide <code>norebalance<\/code> flag, which prevents OSDs from rebalancing data across the cluster.<\/p>\n\n\n\n<p>Prevent OSDs from being marked as &#8220;down&#8221;, to avoid unnecessary cluster adjustments;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set nodown<\/code><\/pre>\n\n\n\n<p>Stop Ceph cluster read and write operations;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd set 
pause<\/code><\/pre>\n\n\n\n<p>You can verify that these flags have been set by checking the Ceph cluster status;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph -s<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>  cluster:\n    id:     70d227de-83e3-11ee-9dda-ff8b7941e415\n<strong>    health: HEALTH_WARN\n            pauserd,pausewr,nodown,noout,nobackfill,norebalance flag(s) set<\/strong>\n \n  services:\n    mon: 4 daemons, quorum ceph-admin,ceph-mon,ceph-osd1,ceph-osd2 (age 2h)\n    mgr: ceph-admin.ykkdly(active, since 2d), standbys: ceph-mon.grwzmv\n    osd: 3 osds: 3 up (since 109m), 3 in (since 2d)\n         <strong>flags pauserd,pausewr,nodown,noout,nobackfill,norebalance<\/strong>\n \n  data:\n    pools:   2 pools, 33 pgs\n    objects: 45 objects, 14 MiB\n    usage:   191 MiB used, 300 GiB \/ 300 GiB avail\n    pgs:     33 active+clean\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"shut-down-the-ceph-cluster-nodes\">Shut down the Ceph cluster Nodes.<\/h4>\n\n\n\n<p>Log in to each Ceph cluster node and shut them down in the following order;<\/p>\n\n\n\n<p>(<strong><em>Ensure the IP addresses are assigned permanently to the nodes<\/em><\/strong>)<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Ceph Service nodes<\/strong>: If you are running separate nodes for services such as RGW or other special services, shut them down first: <strong><code>systemctl poweroff<\/code><\/strong><\/li>\n\n\n\n<li><strong>Ceph OSD nodes<\/strong>: Log in to each OSD node and gracefully shut it down: <strong><code>systemctl poweroff<\/code><\/strong><\/li>\n\n\n\n<li><strong>Ceph MON nodes<\/strong>: Log in to each MON node and gracefully shut it down: <strong><code>systemctl poweroff<\/code><\/strong><\/li>\n\n\n\n<li><strong>Ceph MGR Nodes<\/strong>: Log in to each MGR node and gracefully shut it down: <strong><code>systemctl poweroff<\/code><\/strong><\/li>\n<\/ol>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"bring-backup-ceph-cluster\">Bring Backup Ceph 
Cluster<\/h4>\n\n\n\n<p>After the maintenance, it is now time to bring the cluster back up.<\/p>\n\n\n\n<p>To begin with, power up the Ceph cluster nodes in the <strong>reverse<\/strong> order in which you shut them down above.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Power on Ceph <strong>MGR<\/strong> nodes.<\/li>\n\n\n\n<li>Power on Ceph <strong>MON<\/strong> nodes.<\/li>\n\n\n\n<li>Power on <strong>OSD<\/strong> nodes.<\/li>\n\n\n\n<li>Power on Service nodes.<\/li>\n<\/ol>\n\n\n\n<p>Once the nodes are up, ensure that the time is in sync across all the nodes (NTP can be used).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>date<\/code><\/pre>\n\n\n\n<p>Unset all the OSD flags set above, in the reverse order;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph osd unset pause\nceph osd unset nodown\nceph osd unset norebalance\nceph osd unset norecover\nceph osd unset nobackfill\nceph osd unset noout<\/code><\/pre>\n\n\n\n<p>Once all is done, confirm the health of your cluster.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ceph -s<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>  cluster:\n    id:     70d227de-83e3-11ee-9dda-ff8b7941e415\n    health: HEALTH_OK\n \n  services:\n    mon: 4 daemons, quorum ceph-admin,ceph-mon,ceph-osd1,ceph-osd2 (age 60s)\n    mgr: ceph-admin.ykkdly(active, since 2d), standbys: ceph-mon.grwzmv\n    osd: 3 osds: 3 up (since 56s), 3 in (since 2d)\n \n  data:\n    pools:   2 pools, 33 pgs\n    objects: 45 objects, 14 MiB\n    usage:   125 MiB used, 300 GiB \/ 300 GiB avail\n    pgs:     33 active+clean\n\n<\/code><\/pre>\n\n\n\n<p>Verify and validate everything to ensure your cluster is up and running as expected.<\/p>\n\n\n\n<p>That concludes our guide on how to start, stop, and restart Ceph cluster services.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial, you will learn how to start stop or restart Ceph Services. 
Ceph is a distributed storage system that provides object storage, block<\/p>\n","protected":false},"author":10,"featured_media":19319,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[121,1338,39],"tags":[7313,7312,7311,7308,7310,7309],"class_list":["post-19289","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-howtos","category-ceph","category-storage","tag-ceph-orch-daemon","tag-ceph-orch-ps","tag-ceph-services-systemd","tag-restart-ceph-services","tag-start-ceph-services","tag-stop-ceph-services","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/19289"}],"collection":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/comments?post=19289"}],"version-history":[{"count":14,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/19289\/revisions"}],"predecessor-version":[{"id":20876,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/19289\/revisions\/20876"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media\/19319"}],"wp:attachment":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media?parent=19289"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/categories?post=19289"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/tags?post=19289"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}