Error: create backup target: location is not accessible #4207

Open
timtimb0t opened this issue Jan 13, 2025 · 5 comments

Packages

Scylla version: 6.3.0~dev-20250108.e51b2075dacc with build-id 1ffc83e51d7f78126ce77667ff1140f5f4913518

Kernel Version: 6.8.0-1021-aws

Issue description

During the disrupt_mgmt_backup_specific_keyspaces execution the following error occurred:

Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 5486, in wrapper
    result = method(*args[1:], **kwargs)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 3015, in disrupt_mgmt_backup_specific_keyspaces
    self._mgmt_backup(backup_specific_tables=True)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/sct_events/group_common_events.py", line 534, in wrapper
    return decorated_func(*args, **kwargs)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/sct_events/group_common_events.py", line 519, in inner_func
    return func(*args, **kwargs)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 3173, in _mgmt_backup
    mgr_task = mgr_cluster.create_backup_task(location_list=[location, ], keyspace_list=non_test_keyspaces)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/mgmt/cli.py", line 671, in create_backup_task
    res = self.sctool.run(cmd=cmd, parse_table_res=False)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/mgmt/cli.py", line 1208, in run
    raise ScyllaManagerError(f"Encountered an error on sctool command: {cmd}: {ex}") from ex
sdcm.mgmt.common.ScyllaManagerError: Encountered an error on sctool command: backup -c e4d8d68d-804e-418a-b109-86a08a23ebc7 --keyspace keyspace1,ks_truncate,mview  --location s3:manager-backup-tests-us-east-1 : Encountered a bad command exit code!

Command: 'sudo sctool backup -c e4d8d68d-804e-418a-b109-86a08a23ebc7 --keyspace keyspace1,ks_truncate,mview  --location s3:manager-backup-tests-us-east-1 '

Exit code: 1

Stdout:



Stderr:

Error: create backup target: location is not accessible
 10.4.23.157: agent [HTTP 400] operation put: s3 upload: 301 Moved Permanently: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. (code:PermanentRedirect) - make sure the location is correct and credentials are set, to debug SSH to 10.4.23.157 and run "scylla-manager-agent check-location -L s3:manager-backup-tests-us-east-1 --debug"
Trace ID: Y6YlRSxsT52kKzvH4R0Hpw (grep in scylla-manager logs)

The error occurred during creation of the backup task; the root cause of this behaviour is not clear.
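For reference, a possible way to start debugging from the affected node (10.4.23.157) is to run the check-location command suggested in the agent error, plus a bucket-region check; the aws CLI call is my addition and assumes credentials are available on the node:

# debug command suggested by the agent error, run on the affected node
scylla-manager-agent check-location -L s3:manager-backup-tests-us-east-1 --debug

# check which region the bucket actually lives in (assumes AWS credentials on the node)
aws s3api get-bucket-location --bucket manager-backup-tests-us-east-1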

Impact

Backup task failure

How frequently does it reproduce?


Installation details

Cluster size: 6 nodes (i4i.4xlarge)

Scylla Nodes used in this run:

  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-9 (34.251.4.141 | 10.4.22.20) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-8 (34.241.26.106 | 10.4.20.131) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-7 (3.254.126.153 | 10.4.22.124) (shards: -1)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-6 (52.208.146.125 | 10.4.22.186) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-5 (54.247.14.114 | 10.4.22.170) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-4 (54.155.14.93 | 10.4.22.97) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-3 (54.194.32.152 | 10.4.20.122) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-2 (54.220.6.171 | 10.4.20.202) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-16 (54.76.98.223 | 10.4.23.157) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-15 (52.19.27.230 | 10.4.22.242) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-14 (54.76.156.247 | 10.4.21.108) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-13 (52.210.61.57 | 10.4.20.224) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-12 (52.212.247.74 | 10.4.21.143) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-11 (52.213.210.215 | 10.4.23.96) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-10 (54.171.113.110 | 10.4.22.53) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-2dcd4a4a-1 (54.247.50.94 | 10.4.23.234) (shards: 14)

OS / Image: ami-0419ef0a7ad763693 (NO RUNNER: NO RUNNER)

Test: longevity-50gb-3days-test
Test id: 2dcd4a4a-e69e-492e-b62c-eb5f73fd311d
Test name: scylla-master/tier1/longevity-50gb-3days-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 2dcd4a4a-e69e-492e-b62c-eb5f73fd311d
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 2dcd4a4a-e69e-492e-b62c-eb5f73fd311d

Logs:

Jenkins job URL
Argus

@timtimb0t (Author) commented:

@VAveryanov8, @Michal-Leszczynski, could you please take a look at this issue? Let me know if any additional information is needed.

@timtimb0t (Author) commented:

Packages

Scylla version: 6.3.0~dev-20250108.e51b2075dacc with build-id 1ffc83e51d7f78126ce77667ff1140f5f4913518

Kernel Version: 6.8.0-1021-aws

Installation details

Cluster size: 4 nodes (i4i.4xlarge)

Scylla Nodes used in this run:

  • longevity-50gb-12h-master-db-node-bf0c7579-7 (63.34.7.22 | 10.4.8.238) (shards: 13)
  • longevity-50gb-12h-master-db-node-bf0c7579-6 (34.245.207.49 | 10.4.11.63) (shards: 12)
  • longevity-50gb-12h-master-db-node-bf0c7579-5 (18.201.25.47 | 10.4.10.152) (shards: 12)
  • longevity-50gb-12h-master-db-node-bf0c7579-4 (3.255.173.221 | 10.4.11.204) (shards: 13)
  • longevity-50gb-12h-master-db-node-bf0c7579-3 (54.76.108.105 | 10.4.11.166) (shards: 12)
  • longevity-50gb-12h-master-db-node-bf0c7579-2 (3.250.88.220 | 10.4.11.229) (shards: 12)
  • longevity-50gb-12h-master-db-node-bf0c7579-1 (54.217.164.238 | 10.4.10.126) (shards: 8)

OS / Image: ami-0419ef0a7ad763693 (NO RUNNER: NO RUNNER)

Test: longevity-150gb-asymmetric-cluster-12h-test
Test id: bf0c7579-e48d-4814-9625-8a3f1193c8d7
Test name: scylla-master/tier1/longevity-150gb-asymmetric-cluster-12h-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor bf0c7579-e48d-4814-9625-8a3f1193c8d7
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs bf0c7579-e48d-4814-9625-8a3f1193c8d7

Logs:

Jenkins job URL
Argus


@VAveryanov8 (Collaborator) commented:

It looks to me like:

  1. The Scylla Manager agent was running in the eu-west region.
  2. Judging by the bucket name manager-backup-tests-us-east-1, I assume it was created in the us-east-1 region (correct me if I'm wrong).
  3. The s3 section of scylla-manager-agent.yaml looks empty:
auth_token: ***
prometheus: :5090
s3: {}
tls_cert_file: null
tls_key_file: null

According to https://manager.docs.scylladb.com/stable/backup/setup-amazon-s3.html#config-file:

If the S3 bucket is not running in the same region as the AWS EC2 instance uncomment and set the region to the S3 bucket’s region.

I think changing scylla-manager-agent.yaml to something like:

auth_token: ***
prometheus: :5090
s3: 
  region: us-east-1
tls_cert_file: null
tls_key_file: null

should fix the issue.
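A minimal sketch of how one could apply and verify this, assuming the default agent config path (/etc/scylla-manager-agent/scylla-manager-agent.yaml) and systemd unit name on the DB nodes:

# after adding "region: us-east-1" under the s3 section of the agent config,
# restart the agent and re-run the location check from the agent error
sudo systemctl restart scylla-manager-agent
sudo scylla-manager-agent check-location -L s3:manager-backup-tests-us-east-1 --debug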

In the meantime, I think we need to discuss with @Michal-Leszczynski and @karol-kokoszka whether it's possible to handle this redirect automatically and whether we want to do it.
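For illustration, one possible way to resolve the bucket's real region without extra configuration is to rely on S3 returning it in the x-amz-bucket-region response header; this is standard S3 behaviour, not something the agent does today as far as I know:

# S3 reports the bucket's region in the x-amz-bucket-region header,
# even on the 301/403 responses returned from the wrong regional endpoint
curl -sI https://manager-backup-tests-us-east-1.s3.amazonaws.com | grep -i x-amz-bucket-region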

@Michal-Leszczynski (Collaborator) commented:

> In the meantime, I think we need to discuss with @Michal-Leszczynski and @karol-kokoszka whether it's possible to handle this redirect automatically and whether we want to do it.

I think it's possible, but I would give it a low priority.
