WARN Utils: An error occurred while trying to read the S3 bucket lifecycle configuration java.lang.NullPointerException #346
Comments
The same here:
I thought it had to do with not setting a bucket prefix when configuring the lifecycle policy, but even after setting it, the warning keeps showing (although the operation succeeds) |
+1 |
+1 |
+1 |
+1 |
The `getPrefix` method on `Rule` [was deprecated](https://github.com/aws/aws-sdk-java/blob/355424771b951ef0066b19c3eab4b4356e270cf4/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/BucketLifecycleConfiguration.java#L145-L153). It seems that the response on the wire was also changed, so this method no longer returns the prefix even on older versions of the AWS SDK (such as the one used by this project). I've bumped the AWS SDK dependency versions and implemented the check using the new visitor pattern. I am not sure it is the nicest Scala code, but I think it works. Tests still pass. I believe this fixes databricks#346.
I think I have a PR that fixes this (you need to upgrade AWS SDK dependencies). See: #357 |
Disclaimer: New to Scala |
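To illustrate the visitor-based check mentioned in the PR description above, here is a minimal, self-contained sketch. It is not the code from the PR and does not use the AWS SDK: `Predicate`, `PrefixPredicate`, `TagPredicate`, `PredicateVisitor`, and `keyMatches` are hypothetical stand-ins that only mimic the general shape of a predicate visitor, and the choice to ignore non-prefix predicates is an assumption.

```scala
object LifecycleVisitorSketch {
  // Hypothetical stand-ins for lifecycle filter predicates; NOT the real SDK classes.
  sealed trait Predicate { def accept(v: PredicateVisitor): Unit }

  final case class PrefixPredicate(prefix: String) extends Predicate {
    def accept(v: PredicateVisitor): Unit = v.visit(this)
  }

  final case class TagPredicate(key: String, value: String) extends Predicate {
    def accept(v: PredicateVisitor): Unit = v.visit(this)
  }

  // One visit method per predicate type, as in a classic visitor.
  trait PredicateVisitor {
    def visit(p: PrefixPredicate): Unit
    def visit(p: TagPredicate): Unit
  }

  // True only when the predicate is a prefix predicate whose prefix starts the key.
  def keyMatches(key: String, predicate: Predicate): Boolean = {
    var matched = false
    predicate.accept(new PredicateVisitor {
      def visit(p: PrefixPredicate): Unit = { matched = key.startsWith(p.prefix) }
      def visit(p: TagPredicate): Unit = () // assumption: tags say nothing about key prefixes
    })
    matched
  }
}
```

With these stand-ins, the prefix is read only from the predicate type that actually carries one, so the check never touches a nullable `getPrefix`.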
+1 seeing this in pyspark |
+1 |
It seems to me that this issue has something to do with the fact that
|
I agree that this is a super annoying error, since the stack trace is so long. This solution worked for me:
I got the suggestion from here. |
+1 |
For us it turned out that the file being read "is not there", hence "An error occurred while trying to read the S3 bucket lifecycle configuration" and a subsequent "S3ServiceException:Access Denied,Status 403,Error AccessDenied". It would seem we are reading before the file is available. Parallel processing woes?
Object not found results in 403 (access denied) rather than 404 (not found) because different return codes would provide an attacker with useful information: they would leak whether an object of a given name actually exists. A simple dictionary-style attack could then enumerate all of the objects in someone's bucket. For a similar reason, a login page should never emit "Invalid user" and "Invalid password" for the two authentication failure scenarios; it should always emit "Invalid credentials".
A fix would then be:
Check the regions. For example, in one case the region was set to "us-west-2", as shown in the AWS console link, but the contents were hosted in ap-southeast-1.
Check permissions. By default, permissions are given to the AWS user only. If you use IAM authentication with access keys, you must add permissions to "authenticated users" in S3.
"...If the object you request does not exist, the error Amazon S3 returns depends on whether you also have the s3:ListBucket permission. If you have the s3:ListBucket permission on the bucket, Amazon S3 will return an HTTP status code 404 ("no such key") error. If you don’t have the s3:ListBucket permission, Amazon S3 will return an HTTP status code 403 ("access denied") error."
Keep your role policy as in the helloV post. Your architecture determines which solution is right; hope this helps |
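The quoted rule about missing objects can be written down as a tiny decision function. This is only a sketch of the rule as quoted in the comment above, not a call into any AWS API; the names `MissingObjectStatus` and `forMissingKey` are made up for illustration.

```scala
object MissingObjectStatus {
  // HTTP status for a request on a key that does NOT exist, per the quoted rule:
  // with s3:ListBucket the caller sees 404, without it the caller sees 403.
  def forMissingKey(hasListBucket: Boolean): Int =
    if (hasListBucket) 404 else 403
}
```

So a 403 on a read does not by itself tell you whether the object is missing or the permissions are wrong, which is why the comment suggests checking both.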
By now, I have implemented multiple Spark applications with this library and the issue does not affect anything. |
I solved the problem by inverting these params
after:
|
+1 |
+1 Just saw this happening using databricks runs (using Spark 3.2.1). |
I was able to silence this by setting this piece of code's logger to ERROR
|
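The snippet from the comment above was not preserved, so the following is only a guess at what such a workaround might look like. It assumes log4j 1.x is on the classpath (as bundled with older Spark releases) and, more importantly, it assumes the logger name: `com.databricks.spark.redshift.Utils$` is not stated anywhere in this thread, which only shows `Utils` in the log line. Check the logger name printed in your own logs before relying on it.

```scala
import org.apache.log4j.{Level, Logger}

object SilenceLifecycleWarning {
  // ASSUMPTION: the logger name is not given in the thread; adjust it to match your log output.
  val loggerName: String = "com.databricks.spark.redshift.Utils$"

  // Raise the threshold so WARN messages from this one logger are dropped.
  def apply(): Unit =
    Logger.getLogger(loggerName).setLevel(Level.ERROR)
}
```

Calling `SilenceLifecycleWarning()` once at application start-up would hide the warning and its stack trace; it does not fix the underlying `NullPointerException`.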
Hello guys, I am getting this warning:
I have seen this issue here before, but it still occurs for me.
I do have a lifecycle configuration for my bucket. I've traced this warning to this piece of code
I believe the exception is thrown because of this:
`key.startsWith(rule.getPrefix)`
I checked the Amazon SDK documentation: the method `getPrefix` returns null if the prefix wasn't set using the `setPrefix` method, so it will always return null in this case.
I have very limited knowledge of the Amazon SDK and Scala, so I'm not really sure about this.
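A minimal sketch of why that expression can throw, and of one null-safe alternative. `RuleStub` is a hypothetical stand-in that models only the nullable prefix, not the SDK's real `Rule` class, and treating a missing prefix as "no match" is an assumption of this sketch rather than the library's behaviour.

```scala
// Hypothetical stand-in for a lifecycle rule; only the nullable prefix is modelled.
final case class RuleStub(prefix: String) {
  def getPrefix: String = prefix // may be null when no prefix was set
}

object PrefixCheck {
  // Form from the report above: String.startsWith(null) throws NullPointerException.
  def unsafeMatches(key: String, rule: RuleStub): Boolean =
    key.startsWith(rule.getPrefix)

  // Null-safe form: Option(null) is None, so a rule without a prefix never matches.
  def safeMatches(key: String, rule: RuleStub): Boolean =
    Option(rule.getPrefix).exists(p => key.startsWith(p))
}
```

For example, `PrefixCheck.safeMatches("tmp/part-0", RuleStub(null))` evaluates to `false`, whereas the unsafe form throws for the same input.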