New setting to avoid deleting pools on <fs>/<os> resource deletion
A new CRD property `PreservePoolsOnDelete` has been added to the Filesystem (fs) and
Object Store (os) resources in order to increase protection against data loss.
If it is set to `true`, the associated pools won't be deleted when the main
resource (fs/os) is deleted. Creating the deleted fs/os again with the same name
will reuse the preserved pools.

Signed-off-by: Juan Miguel Olmo Martínez <[email protected]>
jmolmo committed Sep 30, 2019
1 parent 91b8a96 commit 2de0787
Showing 20 changed files with 99 additions and 30 deletions.
2 changes: 2 additions & 0 deletions Documentation/ceph-filesystem-crd.md
@@ -39,6 +39,7 @@ spec:
- failureDomain: host
replicated:
size: 3
preservePoolsOnDelete: true
metadataServer:
activeCount: 1
activeStandby: true
@@ -113,6 +114,7 @@ The pools allow all of the settings defined in the Pool CRD spec. For more detai

- `metadataPool`: The settings used to create the file system metadata pool. Must use replication.
- `dataPools`: The settings to create the file system data pools. If multiple pools are specified, Rook will add the pools to the file system. Assigning users or files to a pool is left as an exercise for the reader with the [CephFS documentation](http://docs.ceph.com/docs/master/cephfs/file-layouts/). The data pools can use replication or erasure coding. If erasure coding pools are specified, the cluster must be running with bluestore enabled on the OSDs.
- `preservePoolsOnDelete`: If it is set to `true`, the pools used to support the filesystem will remain when the filesystem is deleted. This is a security measure to avoid accidental loss of data. It defaults to `false`: if the field is not specified, it is treated as `false`.

## Metadata Server Settings

5 changes: 4 additions & 1 deletion Documentation/ceph-filesystem.md
@@ -46,6 +46,7 @@ spec:
dataPools:
- replicated:
size: 3
preservePoolsOnDelete: true
metadataServer:
activeCount: 1
activeStandby: true
@@ -219,11 +220,13 @@ To clean up all the artifacts created by the file system demo:
kubectl delete -f kube-registry.yaml
```

To delete the filesystem components and backing data, delete the Filesystem CRD. **Warning: Data will be deleted**
To delete the filesystem components and backing data, delete the Filesystem CRD. **Warning: Data will be deleted if preservePoolsOnDelete=false**
```
kubectl -n rook-ceph delete cephfilesystem myfs
```

Note: If the `preservePoolsOnDelete` filesystem attribute is set to true, the above command won't delete the pools. Creating the filesystem again with the same CRD will reuse the previous pools.

## Flex Driver

To create a volume based on the flex driver instead of the CSI driver, see the [kube-registry.yaml](https://github.com/rook/rook/blob/{{ branchName }}/cluster/examples/kubernetes/ceph/flex/kube-registry.yaml) example manifest or refer to the complete flow in the Rook v1.0 [Shared File System](https://rook.io/docs/rook/v1.0/ceph-filesystem.html) documentation.
3 changes: 3 additions & 0 deletions Documentation/ceph-object-store-crd.md
@@ -34,6 +34,7 @@ spec:
erasureCoded:
dataChunks: 2
codingChunks: 1
preservePoolsOnDelete: true
gateway:
type: s3
sslCertificateRef:
@@ -79,6 +80,8 @@ The pools allow all of the settings defined in the Pool CRD spec. For more detai

- `metadataPool`: The settings used to create all of the object store metadata pools. Must use replication.
- `dataPool`: The settings to create the object store data pool. Can use replication or erasure coding.
- `preservePoolsOnDelete`: If it is set to `true`, the pools used to support the object store will remain when the object store is deleted. This is a security measure to avoid accidental loss of data. It defaults to `false`: if the field is not specified, it is treated as `false`.


## Gateway Settings

3 changes: 2 additions & 1 deletion Documentation/ceph-object.md
@@ -20,7 +20,7 @@ The below sample will create a `CephObjectStore` that starts the RGW service in

The OSDs must be located on different nodes, because the [`failureDomain`](ceph-pool-crd.md#spec) is set to `host` and the `erasureCoded` chunk settings require at least 3 different OSDs (2 `dataChunks` + 1 `codingChunks`).

See the [Object Store CRD](ceph-object-store-crd.md#object-store-settings), for more detail on the settings availabe for a `CephObjectStore`.
See the [Object Store CRD](ceph-object-store-crd.md#object-store-settings), for more detail on the settings available for a `CephObjectStore`.

```yaml
apiVersion: ceph.rook.io/v1
@@ -38,6 +38,7 @@ spec:
erasureCoded:
dataChunks: 2
codingChunks: 1
preservePoolsOnDelete: true
gateway:
type: s3
sslCertificateRef:
1 change: 1 addition & 0 deletions PendingReleaseNotes.md
@@ -11,6 +11,7 @@
- A new CR property `skipUpgradeChecks` has been added, which allows you force an upgrade by skipping daemon checks. Use this at **YOUR OWN RISK**, only if you know what you're doing. To understand Rook's upgrade process of Ceph, read the [upgrade doc](Documentation/ceph-upgrade.html#ceph-version-upgrades).
- Ceph OSD's admin socket is now placed under Ceph's default system location `/run/ceph`.
- Mon Quorum Disaster Recovery guide has been updated to work with the latest Rook and Ceph release.
- A new CRD property `PreservePoolsOnDelete` has been added to the Filesystem (fs) and Object Store (os) resources in order to increase protection against data loss. If it is set to `true`, the associated pools won't be deleted when the main resource (fs/os) is deleted. Creating the deleted fs/os again with the same name will reuse the preserved pools.

### EdgeFS

2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/filesystem-ec.yaml
@@ -21,6 +21,8 @@ spec:
- erasureCoded:
dataChunks: 2
codingChunks: 1
# Whether to preserve metadata and data pools on filesystem deletion
preservePoolsOnDelete: true
# The metadata service (mds) configuration
metadataServer:
# The number of active MDS instances
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/filesystem-test.yaml
@@ -16,6 +16,7 @@ spec:
- failureDomain: osd
replicated:
size: 1
preservePoolsOnDelete: false
metadataServer:
activeCount: 1
activeStandby: true
2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/filesystem.yaml
@@ -19,6 +19,8 @@ spec:
- failureDomain: host
replicated:
size: 3
# Whether to preserve metadata and data pools on filesystem deletion
preservePoolsOnDelete: true
# The metadata service (mds) configuration
metadataServer:
# The number of active MDS instances
2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/object-ec.yaml
@@ -21,6 +21,8 @@ spec:
erasureCoded:
dataChunks: 2
codingChunks: 1
# Whether to preserve metadata and data pools on object store deletion
preservePoolsOnDelete: true
# The gateway service configuration
gateway:
# type of the gateway (s3)
2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/object-openshift.yaml
@@ -20,6 +20,8 @@ spec:
failureDomain: host
replicated:
size: 3
# Whether to preserve metadata and data pools on object store deletion
preservePoolsOnDelete: true
# The gateway service configuration
gateway:
# type of the gateway (s3)
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/object-test.yaml
@@ -15,6 +15,7 @@ spec:
dataPool:
replicated:
size: 1
preservePoolsOnDelete: false
gateway:
type: s3
port: 80
2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/object.yaml
@@ -20,6 +20,8 @@ spec:
failureDomain: host
replicated:
size: 3
# Whether to preserve metadata and data pools on object store deletion
preservePoolsOnDelete: false
# The gateway service configuration
gateway:
# type of the gateway (s3)
6 changes: 6 additions & 0 deletions pkg/apis/ceph.rook.io/v1/types.go
@@ -267,6 +267,9 @@ type FilesystemSpec struct {
// The data pool settings
DataPools []PoolSpec `json:"dataPools,omitempty"`

// Preserve pools on filesystem deletion
PreservePoolsOnDelete bool `json:"preservePoolsOnDelete"`

// The mds pod info
MetadataServer MetadataServerSpec `json:"metadataServer"`
}
@@ -315,6 +318,9 @@ type ObjectStoreSpec struct {
// The data pool settings
DataPool PoolSpec `json:"dataPool"`

// Preserve pools on object store deletion
PreservePoolsOnDelete bool `json:"preservePoolsOnDelete"`

// The rgw pod info
Gateway GatewaySpec `json:"gateway"`
}
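For reference, the new fields reach the operator through their plain `json` tags, so `preservePoolsOnDelete: true` in a CR manifest maps directly onto `PreservePoolsOnDelete`, and an omitted key falls back to the bool zero value (`false`). Below is a minimal, self-contained sketch of that mapping; the `filesystemSpec` struct is a local stand-in for illustration, not the real `cephv1` type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-in that mirrors the field added to cephv1.FilesystemSpec above;
// the real type lives in pkg/apis/ceph.rook.io/v1/types.go.
type filesystemSpec struct {
	PreservePoolsOnDelete bool `json:"preservePoolsOnDelete"`
}

func main() {
	// `preservePoolsOnDelete: true` from the CR manifest arrives through this json tag.
	var spec filesystemSpec
	if err := json.Unmarshal([]byte(`{"preservePoolsOnDelete": true}`), &spec); err != nil {
		panic(err)
	}
	fmt.Println("preserve:", spec.PreservePoolsOnDelete) // preserve: true

	// When the key is omitted, the bool zero value applies and pools are deleted.
	var defaulted filesystemSpec
	_ = json.Unmarshal([]byte(`{}`), &defaulted)
	fmt.Println("default:", defaulted.PreservePoolsOnDelete) // default: false
}
```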
24 changes: 18 additions & 6 deletions pkg/daemon/ceph/client/filesystem.go
@@ -121,7 +121,7 @@ func AllowStandbyReplay(context *clusterd.Context, clusterName string, fsName st
}

// CreateFilesystem performs software configuration steps for Ceph to provide a new filesystem.
func CreateFilesystem(context *clusterd.Context, clusterName, name, metadataPool string, dataPools []string) error {
func CreateFilesystem(context *clusterd.Context, clusterName, name, metadataPool string, dataPools []string, force bool) error {
if len(dataPools) == 0 {
return fmt.Errorf("at least one data pool is required")
}
@@ -141,6 +141,11 @@ func CreateFilesystem(context *clusterd.Context, clusterName, name, metadataPool

// create the filesystem
args = []string{"fs", "new", name, metadataPool, dataPools[0]}
// Force to use pre-existing pools
if force {
args = append(args, "--force")
logger.Infof("Filesystem %s will reuse pre-existing pools", name)
}
_, err = NewCephCommand(context, clusterName, args).Run()
if err != nil {
return fmt.Errorf("failed enabling ceph fs %s. %+v", name, err)
@@ -323,7 +328,7 @@ func FailFilesystem(context *clusterd.Context, clusterName, fsName string) error

// RemoveFilesystem performs software configuration steps to remove a Ceph filesystem and its
// backing pools.
func RemoveFilesystem(context *clusterd.Context, clusterName, fsName string) error {
func RemoveFilesystem(context *clusterd.Context, clusterName, fsName string, preservePoolsOnDelete bool) error {
fs, err := GetFilesystem(context, clusterName, fsName)
if err != nil {
return fmt.Errorf("filesystem %s not found. %+v", fsName, err)
@@ -335,10 +340,15 @@ func RemoveFilesystem(context *clusterd.Context, clusterName, fsName string) err
return fmt.Errorf("Failed to delete ceph fs %s. err=%+v", fsName, err)
}

err = deleteFSPools(context, clusterName, fs)
if err != nil {
return fmt.Errorf("failed to delete fs %s pools. %+v", fsName, err)
if !preservePoolsOnDelete {
err = deleteFSPools(context, clusterName, fs)
if err != nil {
return fmt.Errorf("failed to delete fs %s pools. %+v", fsName, err)
}
} else {
logger.Infof("PreservePoolsOnDelete is set in filesystem %s. Pools not deleted", fsName)
}

return nil
}

@@ -348,8 +358,9 @@ func deleteFSPools(context *clusterd.Context, clusterName string, fs *CephFilesy
return fmt.Errorf("failed to get pool names. %+v", err)
}

var lastErr error = nil

// delete the metadata pool
var lastErr error
if err := deleteFSPool(context, clusterName, poolNames, fs.MDSMap.MetadataPool); err != nil {
lastErr = err
}
Expand All @@ -360,6 +371,7 @@ func deleteFSPools(context *clusterd.Context, clusterName string, fs *CephFilesy
lastErr = err
}
}

return lastErr
}

2 changes: 1 addition & 1 deletion pkg/daemon/ceph/client/filesystem_test.go
@@ -160,7 +160,7 @@ func TestFilesystemRemove(t *testing.T) {
return "", fmt.Errorf("unexpected rbd command '%v'", args)
}

err := RemoveFilesystem(context, "ns", fs.MDSMap.FilesystemName)
err := RemoveFilesystem(context, "ns", fs.MDSMap.FilesystemName, false)
assert.Nil(t, err)
assert.True(t, metadataDeleted)
assert.True(t, dataDeleted)
5 changes: 5 additions & 0 deletions pkg/operator/ceph/file/controller.go
@@ -218,6 +218,11 @@ func filesystemChanged(oldFS, newFS cephv1.FilesystemSpec) bool {
logger.Infof("mds active standby changed from %t to %t", oldFS.MetadataServer.ActiveStandby, newFS.MetadataServer.ActiveStandby)
return true
}
if oldFS.PreservePoolsOnDelete != newFS.PreservePoolsOnDelete {
logger.Infof("value of Preserve pools setting changed from %t to %t", oldFS.PreservePoolsOnDelete, newFS.PreservePoolsOnDelete)
// This setting only will be used when the filesystem will be deleted
return false
}
return false
}

50 changes: 35 additions & 15 deletions pkg/operator/ceph/file/filesystem.go
@@ -110,7 +110,7 @@ func deleteFilesystem(context *clusterd.Context, cephVersion cephver.CephVersion

// Permanently remove the filesystem if it was created by rook
if len(fs.Spec.DataPools) != 0 {
if err := client.RemoveFilesystem(context, fs.Namespace, fs.Name); err != nil {
if err := client.RemoveFilesystem(context, fs.Namespace, fs.Name, fs.Spec.PreservePoolsOnDelete); err != nil {
return fmt.Errorf("failed to remove filesystem %s: %+v", fs.Name, err)
}
}
@@ -189,33 +189,53 @@ func (f *Filesystem) doFilesystemCreate(context *clusterd.Context, cephVersion c
return fmt.Errorf("Cannot create multiple filesystems. Enable %s env variable to create more than one", client.MultiFsEnv)
}

logger.Infof("Creating filesystem %s", f.Name)
err = client.CreatePoolWithProfile(context, clusterName, *f.metadataPool, appName)
poolNames, err := client.GetPoolNamesByID(context, clusterName)
if err != nil {
return fmt.Errorf("failed to create metadata pool '%s': %+v", f.metadataPool.Name, err)
return fmt.Errorf("failed to get pool names. %+v", err)
}

logger.Infof("Creating filesystem %s", f.Name)

// Make it easy to locate a pool by name and avoid repeated searches
reversedPoolMap := make(map[string]int)
for key, value := range poolNames {
reversedPoolMap[value] = key
}

poolsCreated := false
if _, poolFound := reversedPoolMap[f.metadataPool.Name]; !poolFound {
poolsCreated = true
err = client.CreatePoolWithProfile(context, clusterName, *f.metadataPool, appName)
if err != nil {
return fmt.Errorf("failed to create metadata pool '%s': %+v", f.metadataPool.Name, err)
}
}

var dataPoolNames []string
for _, pool := range f.dataPools {
dataPoolNames = append(dataPoolNames, pool.Name)
err = client.CreatePoolWithProfile(context, clusterName, *pool, appName)
if err != nil {
return fmt.Errorf("failed to create data pool %s: %+v", pool.Name, err)
}
if pool.Type == model.ErasureCoded {
// An erasure coded data pool used for a filesystem must allow overwrites
if err := client.SetPoolProperty(context, clusterName, pool.Name, "allow_ec_overwrites", "true"); err != nil {
logger.Warningf("failed to set ec pool property: %+v", err)
if _, poolFound := reversedPoolMap[pool.Name]; !poolFound {
poolsCreated = true
err = client.CreatePoolWithProfile(context, clusterName, *pool, appName)
if err != nil {
return fmt.Errorf("failed to create data pool %s: %+v", pool.Name, err)
}
if pool.Type == model.ErasureCoded {
// An erasure coded data pool used for a filesystem must allow overwrites
if err := client.SetPoolProperty(context, clusterName, pool.Name, "allow_ec_overwrites", "true"); err != nil {
logger.Warningf("failed to set ec pool property: %+v", err)
}
}
}
}

// create the filesystem
if err := client.CreateFilesystem(context, clusterName, f.Name, f.metadataPool.Name, dataPoolNames); err != nil {
// create the filesystem ('fs new' needs to be forced in order to reuse pre-existing pools)
// If any pool had to be created, the force flag is not used (to avoid inconsistencies).
if err := client.CreateFilesystem(context, clusterName, f.Name, f.metadataPool.Name, dataPoolNames, !poolsCreated); err != nil {
return err
}

logger.Infof("created filesystem%s on %d data pool(s) and metadata pool %s", f.Name, len(f.dataPools), f.metadataPool.Name)
logger.Infof("created filesystem %s on %d data pool(s) and metadata pool %s", f.Name, len(f.dataPools), f.metadataPool.Name)
return nil
}
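To make the reuse path above concrete: the filesystem is only force-attached to existing pools when every pool named in the spec already exists; if even one pool had to be created, `fs new` runs without `--force`. The sketch below is a minimal, self-contained illustration of that decision; the helper name and pool names are illustrative only, not part of Rook's API.

```go
package main

import "fmt"

// allPoolsExist reports whether every required pool is already present.
// It mirrors the reversedPoolMap check in doFilesystemCreate: the
// filesystem is created with --force only when no pool had to be created.
func allPoolsExist(existing map[string]struct{}, required []string) bool {
	for _, name := range required {
		if _, ok := existing[name]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// Pools left behind by a filesystem deleted with preservePoolsOnDelete: true.
	existing := map[string]struct{}{
		"myfs-metadata": {},
		"myfs-data0":    {},
	}
	required := []string{"myfs-metadata", "myfs-data0"}

	// All pools pre-exist, so 'ceph fs new' would be forced to reuse them.
	fmt.Println("force:", allPoolsExist(existing, required)) // force: true

	// If even one pool is missing, it gets created and --force is not used,
	// to avoid mixing newly created and preserved pools inconsistently.
	required = append(required, "myfs-data1")
	fmt.Println("force:", allPoolsExist(existing, required)) // force: false
}
```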

12 changes: 8 additions & 4 deletions pkg/operator/ceph/object/objectstore.go
@@ -66,7 +66,7 @@ func createObjectStore(context *Context, metadataSpec, dataSpec model.Pool, serv
return nil
}

func deleteRealmAndPools(context *Context) error {
func deleteRealmAndPools(context *Context, preservePoolsOnDelete bool) error {
stores, err := getObjectStores(context)
if err != nil {
return fmt.Errorf("failed to detect object stores during deletion. %+v", err)
@@ -83,9 +83,13 @@ func deleteRealmAndPools(context *Context) error {
lastStore = true
}

err = deletePools(context, lastStore)
if err != nil {
return fmt.Errorf("failed to delete object store pools. %+v", err)
if !preservePoolsOnDelete {
err = deletePools(context, lastStore)
if err != nil {
return fmt.Errorf("failed to delete object store pools. %+v", err)
}
} else {
logger.Infof("PreservePoolsOnDelete is set in object store %s. Pools not deleted", context.Name)
}
return nil
}
2 changes: 1 addition & 1 deletion pkg/operator/ceph/object/objectstore_test.go
@@ -152,7 +152,7 @@ func deleteStore(t *testing.T, name string, existingStores string, expectedDelet
context := &Context{Context: &clusterd.Context{Executor: executor}, Name: "myobj", ClusterName: "ns"}

// Delete an object store
err := deleteRealmAndPools(context)
err := deleteRealmAndPools(context, false)
assert.Nil(t, err)
expectedPoolsDeleted := 5
if expectedDeleteRootPool {
2 changes: 1 addition & 1 deletion pkg/operator/ceph/object/rgw.go
@@ -296,7 +296,7 @@ func (c *clusterConfig) deleteStore() error {

// Delete the realm and pools
objContext := NewContext(c.context, c.store.Name, c.store.Namespace)
err = deleteRealmAndPools(objContext)
err = deleteRealmAndPools(objContext, c.store.Spec.PreservePoolsOnDelete)
if err != nil {
return fmt.Errorf("failed to delete the realm and pools. %+v", err)
}