Problem Description
When running the sample Python Spark code to write a DataFrame into Vertica using S3, I got the following errors:
ERROR VerticaBatchReader: Error when trying to create table. Check 'target_table_sql' option for issues.
py4j.protocol.Py4JJavaError: An error occurred while calling o59.save.
: com.vertica.spark.util.error.ConnectorException: Error when trying to create table. Check 'target_table_sql' option for issues.
at com.vertica.spark.util.error.ErrorHandling$.logAndThrowError(ErrorHandling.scala:78)
at com.vertica.spark.datasource.v2.VerticaBatchWrite.<init>(VerticaDatasourceV2Write.scala:71)
at com.vertica.spark.datasource.v2.VerticaWriteBuilder.buildForBatch(VerticaDatasourceV2Write.scala:51)
at org.apache.spark.sql.connector.write.WriteBuilder$1.toBatch(WriteBuilder.java:44)
at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run(WriteToDataSourceV2Exec.scala:332)
Note: I have tested the S3 connection from this Spark instance, as well as the Vertica connection. The Vertica table was created, but the data failed to load.
Source Code:
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("local[1]")
         .appName("Vertica Connector Pyspark Example")
         .getOrCreate())

hadoop_conf = spark._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoop_conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoop_conf.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
hadoop_conf.set("fs.s3a.path.style.access", "true")
hadoop_conf.set("fs.s3a.access.key", "XXX")
hadoop_conf.set("fs.s3a.secret.key", "XXX")
hadoop_conf.set("fs.s3a.endpoint", "https://host:4443")

cols = ["language", "users_count"]
data = [("Java", 20000), ("Python", 100000), ("Scala", 3000)]
df = spark.createDataFrame(data).toDF(*cols)

df.write.mode('overwrite').save(
    format="com.vertica.spark.datasource.VerticaSource",
    host=host_vertica,
    user=userid,
    password=password,
    db=db_name,
    staging_fs_url='s3a://bucket/path',
    table=table_name
)
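Since the error message points at the 'target_table_sql' option, one thing I may try is passing an explicit CREATE TABLE statement through that option so the connector does not have to derive the DDL itself. The sketch below only assembles the connector options as a plain dictionary; the table name, column types, and helper function are illustrative assumptions, not taken from the failing run:

```python
def build_vertica_options(host, db, user, password, table, staging_fs_url):
    """Assemble write options for the Vertica Spark connector, including an
    explicit 'target_table_sql' so the CREATE TABLE statement is spelled out
    rather than inferred from the DataFrame schema.

    NOTE: the column names/types below match the sample DataFrame
    (language, users_count) but are otherwise an assumption.
    """
    ddl = (
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(language VARCHAR(64), users_count INTEGER);"
    )
    return {
        "host": host,
        "db": db,
        "user": user,
        "password": password,
        "table": table,
        "staging_fs_url": staging_fs_url,
        "target_table_sql": ddl,
    }

# Hypothetical values for illustration only.
opts = build_vertica_options("vertica-host", "mydb", "dbadmin", "secret",
                             "spark_demo", "s3a://bucket/path")

# The options would then be passed through the writer, e.g.:
# df.write.mode("overwrite").options(**opts) \
#     .format("com.vertica.spark.datasource.VerticaSource").save()
```

If the explicit DDL loads cleanly, the original failure is likely in the connector's inferred CREATE TABLE statement rather than in the S3 staging path.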