This repository has been archived by the owner on Oct 19, 2022. It is now read-only.

total data loss :) again #81

Open
rafipiccolo opened this issue Mar 5, 2021 · 11 comments

Comments

@rafipiccolo

rafipiccolo commented Mar 5, 2021

I guess I can't use this module in production.
I can't understand what caused this; it must be unstable in some situations.

This morning:
docker volume rm xxx
caused all data to be deleted on the remote server...

What I did, exactly, was:

#> docker-compose -p A up -d
volume created a_xxx
#> docker-compose -p B up -d
volume created b_xxx
#> docker-compose -p A down
#> docker-compose -p B down
#> docker volume rm a_xxx b_xxx
total data loss :)

I removed other volumes later without any data loss,
so I'm not sure this command is what caused the mess.
But I have no data left.

@freqmand

freqmand commented Aug 18, 2021

Me too today: all my data was lost after running docker-compose down -v.

@rafipiccolo did you find a way to figure out this issue?

Is there any way to bind the volume on the remote server so that it persists forever? Could a read-only volume fix the issue? I'm just mounting existing data into the container, so I don't have any newly generated data to store on my remote server.

@rafipiccolo
Author

rafipiccolo commented Aug 19, 2021

Sadly, no. I'm still considering another storage driver, but studying the options takes time.
In the meantime, I don't rm sshfs volumes without first disconnecting the network or shutting down the remote host :)

@daviddavo

Same problem here.

I'm not able to provide any logs, sorry

@Dr-Shadow

I would be happy to help but I don't know how to reproduce this problem

@rafipiccolo
Author

Thanks, it would be nice to have some insights / solutions :)

The source code is only 300 lines long. So from reading it through, with my nonexistent Go knowledge, I guess the destructive part is the Remove function. Is this code ever needed?

func (d *sshfsDriver) Remove(r *volume.RemoveRequest) error {

@daviddavo

Maybe there is a problem in docker codebase, that it tries to delete every file before deleting the volume?

@dmolesUC

What's the rationale for calling os.RemoveAll from Remove?

https://github.com/vieux/docker-volume-sshfs/blob/v1.4/main.go#L128
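To make the concern concrete, here is a minimal sketch of the difference between os.RemoveAll and os.Remove when the target directory still has contents. It is not the plugin's code: a throwaway temporary directory stands in for a mountpoint that is still exposing remote files, and the helper names (removeRecursive, removeEmptyDirOnly, fileSurvives) are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeRecursive mirrors the os.RemoveAll call discussed above: it deletes
// the directory and everything reachable beneath it.
func removeRecursive(path string) error { return os.RemoveAll(path) }

// removeEmptyDirOnly deletes the directory only if it has no entries;
// otherwise os.Remove returns an error and leaves the contents alone.
func removeEmptyDirOnly(path string) error { return os.Remove(path) }

// fileSurvives creates a throwaway directory holding one file, applies the
// given remove function to the directory, and reports whether the file is
// still there afterwards. The directory is only a stand-in for a mountpoint.
func fileSurvives(remove func(string) error) (bool, error) {
	dir, err := os.MkdirTemp("", "mountpoint-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir) // always clean up the throwaway directory

	file := filepath.Join(dir, "file1")
	if err := os.WriteFile(file, []byte("data"), 0o600); err != nil {
		return false, err
	}

	_ = remove(dir) // only the effect on the file matters here
	_, statErr := os.Stat(file)
	return statErr == nil, nil
}

func main() {
	checks := []struct {
		name   string
		remove func(string) error
	}{
		{"os.RemoveAll", removeRecursive},
		{"os.Remove", removeEmptyDirOnly},
	}
	for _, c := range checks {
		survived, err := fileSurvives(c.remove)
		if err != nil {
			fmt.Println("setup failed:", err)
			return
		}
		// Prints "os.RemoveAll: file survives = false",
		// then "os.Remove: file survives = true".
		fmt.Printf("%s: file survives = %v\n", c.name, survived)
	}
}
```

If the mountpoint were still backed by sshfs at the time of the call, the recursive variant would walk into the remote files, which would be consistent with the reports in this thread; whether that is what actually happens here has not been confirmed.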

@danschmidt5189

danschmidt5189 commented Mar 10, 2022

I just tested this out and did not lose any data. The setup:

  • Red Hat Enterprise Linux Server release 7.9 (Maipo)
  • Kernel 3.10.0-1160.53.1
  • Docker Server 20.10.12

And here's the test (essentially, had to redact/simplify for obvious reasons):

# Install the plugin with debug mode enabled
$ docker plugin install vieux/sshfs DEBUG=1

# Create a volume with the new driver
$ docker volume create \
    -d vieux/sshfs \
    -o sshcmd=<user>@<host>:/ \
    -o password='<SNIP>' \
    sftptestaccount_volume

# Verify that the volume can be mounted, listed, has the correct structure, etc.
$ docker run --rm -v sftptestaccount_volume:/data testing-image tree -L 2 /data
/data/
├── testdir1
│   └── file1
└── testdir2
    └── file2

# Remove the volume. Separately, verify that contents were not deleted (FTP server-side).
$ docker volume rm sftptestaccount_volume

# Re-create the volume, re-run, and verify contents are still visible
$ docker volume create \
    -d vieux/sshfs \
    -o sshcmd=<user>@<host>:/ \
    -o password='<SNIP>' \
    sftptestaccount_volume
$ docker run --rm -v sftptestaccount_volume:/data testing-image tree -L 2 /data
/data/
├── testdir1
│   └── file1
└── testdir2
    └── file2

# Finally, delete the test volume for good…
$ docker volume rm sftptestaccount_volume

The debug logs don't show anything worth mentioning.

The full call chain from os.RemoveAll is long and has quite a few breakpoints, notably on what is essentially Remove(rootpath). My guess—just a guess at this point—is that if you traced all the calls through fuse-sshfs unlink and what it calls, we'd find that it's swallowing some error and returning 0/OK instead of 24/SSH_FX_FILE_IS_A_DIRECTORY.

The question still stands, though — is that call actually necessary? Why not just unmount and not worry about deleting data? (Or, perhaps, making that configurable on the volume, something like DeleteContentsOnRm = false). It's worth noting that Trident, NetApp's high-profile Docker/Kubernetes volume driver, does not actually delete your data when removing a volume.

> The source code is only 300 lines long. So from reading it through, with my nonexistent Go knowledge, I guess the destructive part is the Remove function. Is this code ever needed?

@rafipiccolo this implements part of Docker's Volume plugin protocol (docs), so it can't be omitted. It can be modified, however, e.g. in the way I described above (making data deletion optional / configurable per volume).
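A rough sketch of how such a per-volume switch could be read from the options passed at volume creation. Everything here is an assumption for illustration: the option name DeleteContentsOnRm is only the one floated above, the plugin has no such option today, and deleteContentsOnRm is a hypothetical helper. The default is to not delete anything.

```go
package main

import (
	"fmt"
	"strconv"
)

// optDeleteContents is a hypothetical option name taken from the suggestion
// above; the plugin does not define it.
const optDeleteContents = "DeleteContentsOnRm"

// deleteContentsOnRm reads the hypothetical option from the key/value
// options given at volume creation. A missing option means "do not delete",
// so data removal has to be requested explicitly.
func deleteContentsOnRm(opts map[string]string) (bool, error) {
	raw, ok := opts[optDeleteContents]
	if !ok {
		return false, nil
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("invalid %s value %q: %w", optDeleteContents, raw, err)
	}
	return enabled, nil
}

func main() {
	for _, opts := range []map[string]string{
		{},                            // option absent: keep data
		{optDeleteContents: "true"},  // explicit opt-in to deletion
		{optDeleteContents: "maybe"}, // rejected as invalid
	} {
		enabled, err := deleteContentsOnRm(opts)
		fmt.Println(enabled, err)
	}
}
```

Rejecting unparseable values rather than silently treating them as false keeps a typo from turning into either surprise behavior.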

@dmolesUC

Some more discussion and investigation on a similar issue in davfs (forked from sshfs): fentas#6

@dmolesUC

FWIW, it looks like Amazon's ECS volume driver unmounts before removing which seems like the right semantics here and wouldn't do any harm — the volume protocol plugin docs for Remove say “delete the specified volume from disk”, but I take that to mean cleaning up any resources the volume's using on the host, not unconditionally deleting data on external systems. (I could be wrong.)
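The "unmount before removing" ordering can be sketched roughly as follows. This is not the ECS driver's code or this plugin's code: mountOps, removeVolume, recordingOps and the example path are all hypothetical, and the real mount check, unmount and rmdir calls are replaced by fakes so that only the ordering is shown. The key property is that nothing is deleted when the unmount fails.

```go
package main

import (
	"errors"
	"fmt"
)

// mountOps bundles the host operations a driver would need. They are
// injected so the ordering logic can be exercised without a real mount.
type mountOps struct {
	isMounted func(path string) (bool, error)
	unmount   func(path string) error
	rmdir     func(path string) error // non-recursive removal of the mountpoint
}

// removeVolume unmounts first and only then removes the mountpoint
// directory. If the unmount fails it returns early and deletes nothing.
func removeVolume(path string, ops mountOps) error {
	mounted, err := ops.isMounted(path)
	if err != nil {
		return fmt.Errorf("check mount state of %s: %w", path, err)
	}
	if mounted {
		if err := ops.unmount(path); err != nil {
			return fmt.Errorf("unmount %s: %w", path, err)
		}
	}
	return ops.rmdir(path)
}

// errBusy simulates an unmount that fails because the mount is in use.
var errBusy = errors.New("device busy")

// recordingOps returns fake operations that append each call to calls.
func recordingOps(calls *[]string, mounted bool, unmountErr error) mountOps {
	return mountOps{
		isMounted: func(string) (bool, error) {
			*calls = append(*calls, "isMounted")
			return mounted, nil
		},
		unmount: func(string) error {
			*calls = append(*calls, "unmount")
			return unmountErr
		},
		rmdir: func(string) error {
			*calls = append(*calls, "rmdir")
			return nil
		},
	}
}

func main() {
	const path = "/mnt/volumes/example" // hypothetical mountpoint path

	var calls []string
	err := removeVolume(path, recordingOps(&calls, true, nil))
	fmt.Println(calls, err) // prints "[isMounted unmount rmdir] <nil>"

	calls = nil
	err = removeVolume(path, recordingOps(&calls, true, errBusy))
	fmt.Println(calls, err) // prints "[isMounted unmount] unmount /mnt/volumes/example: device busy"
}
```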

@yyihuan

yyihuan commented Apr 28, 2022

I have the same problem.

7 participants