[Bug] [Hive source connector] FileConnectorException: ErrorCode:[FILE-08], ErrorDescription:[File read failed] #8427

Open
2 of 3 tasks
Light-Towers opened this issue Jan 2, 2025 · 0 comments

@Light-Towers

Search before asking

  • I have searched the issues and found no similar issues.

What happened

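The Hive source connector throws a file read exception, even though the file does exist (see screenshot). The business table holds nearly 400 million rows spread across 200 files. Why does this happen?
[Screenshot attached to the original issue]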

SeaTunnel Version

2.3.8

SeaTunnel Config

env {
  execution.parallelism = 4
  job.mode = "BATCH"
  read_limit.rows_per_second = 50000
}

source {
 Hive {
   table_name = "dw.app_enterprise_info"
   metastore_uri = "thrift://master02:9083"
   hive_site_path = "/etc/hive/conf/hive-site.xml"
   hdfs_site_path = "/etc/hadoop/conf/hdfs-site.xml"
   read_partitions = ["dt=2024-12-31"]
   compress_codec = "orc"
 }
}

sink {
  Console { }
}

Running Command

/opt/seatunnel/apache-seatunnel-2.3.8/bin/seatunnel.sh --config ./hive2mongo -m local

Error Exception

2025-01-02 16:47:04,888 INFO  [.a.s.c.s.c.s.ConsoleSinkWriter] [st-multi-table-sink-writer-1] - subtaskIndex=0  rowIndex=1806171:  SeaTunnelRow#tableId=dw.app_enterprise_info SeaTunnelRow#kind=INSERT : d783b7bc34975a, 银川市建材销售部, , , , , , xxx, 一般项目:, 银川市商住楼105, 银川市商住楼105, 2022-02-24, 926xxxxNW4Y, 001, 注销, [106.30129, 38.494615], 640000, 宁夏回族自治区, 640100, 银川市, , , , , , , , , , , , , , , , , , , [1], 18100000066, , , [{"category":"状态","tagName":"续"}], 6, 2022-02-24, 9999-12-31, , , 人民币, , MAxxxxx4, 2024-04-29, 1, 0, 2024-07-18, 2024-12-31
2025-01-02 16:47:04,890 INFO  [c.h.i.s.t.TcpServerConnection ] [hz.main.IO.thread-in-1] - [localhost]:5801 [seatunnel-930048] [5.1] Connection[id=1, /127.0.0.1:5801->/127.0.0.1:55002, qualifier=null, endpoint=[127.0.0.1]:55002, remoteUuid=412e8da3-b9de-4db3-b558-bf654a01e6af, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Connection closed by the other side
2025-01-02 16:47:04,890 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel-930048] [5.1] Removed connection to endpoint: [localhost]:5801:a3575d69-8d06-49d7-b52e-c16135fdeb48, connection: ClientConnection{alive=false, connectionId=1, channel=NioChannel{/127.0.0.1:55002->localhost/127.0.0.1:5801}, remoteAddress=[localhost]:5801, lastReadTime=2025-01-02 16:47:04.622, lastWriteTime=2025-01-02 16:47:01.020, closedTime=2025-01-02 16:47:04.886, connected server version=5.1}
2025-01-02 16:47:04,890 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-930048] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_DISCONNECTED
2025-01-02 16:47:04,892 INFO  [c.h.c.i.ClientEndpointManager ] [hz.main.event-3] - [localhost]:5801 [seatunnel-930048] [5.1] Destroying ClientEndpoint{connection=Connection[id=1, /127.0.0.1:5801->/127.0.0.1:55002, qualifier=null, endpoint=[127.0.0.1]:55002, remoteUuid=412e8da3-b9de-4db3-b558-bf654a01e6af, alive=false, connectionType=JVM, planeIndex=-1], clientUuid=412e8da3-b9de-4db3-b558-bf654a01e6af, clientName=hz.client_1, authenticated=true, clientVersion=5.1, creationTime=1735805435983, latest clientAttributes=lastStatisticsCollectionTime=1735807621018,enterprise=false,clientType=JVM,clientVersion=5.1,clusterConnectionTimestamp=1735805435971,clientAddress=127.0.0.1,clientName=hz.client_1,credentials.principal=null,os.committedVirtualMemorySize=24853250048,os.freePhysicalMemorySize=113494515712,os.freeSwapSpaceSize=0,os.maxFileDescriptorCount=100000,os.openFileDescriptorCount=69,os.processCpuTime=1177350000000,os.systemLoadAverage=0.64,os.totalPhysicalMemorySize=270185119744,os.totalSwapSpaceSize=0,runtime.availableProcessors=88,runtime.freeMemory=186469144,runtime.maxMemory=513802240,runtime.totalMemory=513802240,runtime.uptime=2188449,runtime.usedMemory=327335416, labels=[]}
2025-01-02 16:47:04,893 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-930048] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTDOWN
2025-01-02 16:47:04,894 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2025-01-02 16:47:04,894 INFO  [c.h.c.LifecycleService        ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] [localhost]:5801 is SHUTTING_DOWN
2025-01-02 16:47:04,898 INFO  [c.h.i.p.i.MigrationManager    ] [hz.main.cached.thread-16] - [localhost]:5801 [seatunnel-930048] [5.1] Shutdown request of Member [localhost]:5801 - a3575d69-8d06-49d7-b52e-c16135fdeb48 this master is handled
2025-01-02 16:47:04,904 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] Shutting down connection manager...
2025-01-02 16:47:04,906 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] Shutting down node engine...
2025-01-02 16:47:04,919 INFO  [.c.c.DefaultClassLoaderService] [main] - close classloader service
2025-01-02 16:47:04,919 INFO  [o.a.s.c.u.RetryUtils          ] [event-forwarder-0] - Failed to execute due to java.lang.NullPointerException: Target cannot be null!
	at com.hazelcast.internal.util.Preconditions.checkNotNull(Preconditions.java:59)
	at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.createInvocationBuilder(OperationServiceImpl.java:300)
	at org.apache.seatunnel.engine.server.utils.NodeEngineUtil.sendOperationToMasterNode(NodeEngineUtil.java:37)
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$1(EventService.java:72)
	at org.apache.seatunnel.common.utils.RetryUtils.retryWithException(RetryUtils.java:48)
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$2(EventService.java:70)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
. Retrying attempt (1/2) after backoff of 0 ms
2025-01-02 16:47:04,920 INFO  [o.a.s.c.u.RetryUtils          ] [event-forwarder-0] - Failed to execute due to java.lang.NullPointerException: Target cannot be null!
	at com.hazelcast.internal.util.Preconditions.checkNotNull(Preconditions.java:59)
	at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.createInvocationBuilder(OperationServiceImpl.java:300)
	at org.apache.seatunnel.engine.server.utils.NodeEngineUtil.sendOperationToMasterNode(NodeEngineUtil.java:37)
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$1(EventService.java:72)
	at org.apache.seatunnel.common.utils.RetryUtils.retryWithException(RetryUtils.java:48)
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$2(EventService.java:70)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
. Retrying attempt (2/2) after backoff of 0 ms
2025-01-02 16:47:04,920 WARN  [o.a.s.e.s.EventService        ] [event-forwarder-0] - Event forward failed, discard events 1
java.lang.RuntimeException: Execute given execution failed after retry 2 times
	at org.apache.seatunnel.common.utils.RetryUtils.retryWithException(RetryUtils.java:75) ~[seatunnel-starter.jar:2.3.8]
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$2(EventService.java:70) ~[seatunnel-starter.jar:2.3.8]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.NullPointerException: Target cannot be null!
	at com.hazelcast.internal.util.Preconditions.checkNotNull(Preconditions.java:59) ~[seatunnel-starter.jar:2.3.8]
	at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.createInvocationBuilder(OperationServiceImpl.java:300) ~[seatunnel-starter.jar:2.3.8]
	at org.apache.seatunnel.engine.server.utils.NodeEngineUtil.sendOperationToMasterNode(NodeEngineUtil.java:37) ~[seatunnel-starter.jar:2.3.8]
	at org.apache.seatunnel.engine.server.EventService.lambda$initEventForwardService$1(EventService.java:72) ~[seatunnel-starter.jar:2.3.8]
	at org.apache.seatunnel.common.utils.RetryUtils.retryWithException(RetryUtils.java:48) ~[seatunnel-starter.jar:2.3.8]
	... 6 more
2025-01-02 16:47:04,931 INFO  [c.h.i.i.NodeExtension         ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] Destroying node NodeExtension.
2025-01-02 16:47:04,931 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] Hazelcast Shutdown is completed in 34 ms.
2025-01-02 16:47:04,932 INFO  [c.h.c.LifecycleService        ] [main] - [localhost]:5801 [seatunnel-930048] [5.1] [localhost]:5801 is SHUTDOWN
2025-01-02 16:47:04,932 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed HazelcastInstance ......
2025-01-02 16:47:04,932 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed metrics executor service ......
2025-01-02 16:47:04,932 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 

===============================================================================


2025-01-02 16:47:04,932 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Fatal Error, 

2025-01-02 16:47:04,932 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues

2025-01-02 16:47:04,932 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Reason:SeaTunnel job executed failed 

2025-01-02 16:47:04,933 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:213)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[FILE-08], ErrorDescription:[File read failed] - Read data from this file [dw.app_enterprise_info_hdfs://nameservice1/user/hive/warehouse/app/app_enterprise_info/dt=2024-12-31/part-00009-7f77f045-c85d-4e50-9f98-51d2874ecce2.c000] failed
	at org.apache.seatunnel.connectors.seatunnel.hive.source.reader.MultipleTableHiveSourceReader.pollNext(MultipleTableHiveSourceReader.java:87)
	at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:159)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:127)
	at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:132)
	at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:693)
	at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1018)
	at org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:39)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException

	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:205)
	... 2 more
 
2025-01-02 16:47:04,934 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 
===============================================================================



Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:213)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[FILE-08], ErrorDescription:[File read failed] - Read data from this file [dw.app_enterprise_info_hdfs://nameservice1/user/hive/warehouse/app/app_enterprise_info/dt=2024-12-31/part-00009-7f77f045-c85d-4e50-9f98-51d2874ecce2.c000] failed
	at org.apache.seatunnel.connectors.seatunnel.hive.source.reader.MultipleTableHiveSourceReader.pollNext(MultipleTableHiveSourceReader.java:87)
	at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:159)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:127)
	at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:132)
	at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:693)
	at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1018)
	at org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:39)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException

	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:205)
	... 2 more
2025-01-02 16:47:04,935 INFO  [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-100] - run shutdown hook because get close signal
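
For triage, the failing file can be pulled out of the FILE-08 message programmatically. The sketch below is only an illustration: the function name `failing_file` and both regular expressions are assumptions based on the message layout in the log above (the bracketed value looks like the table id, an underscore, then the file URI). None of it is part of SeaTunnel.

```python
import re
from typing import Optional

# Assumed layout, taken from the log above:
#   ErrorCode:[FILE-08], ... - Read data from this file [<tableId>_<uri>] failed
FILE_08 = re.compile(
    r"ErrorCode:\[FILE-08\].*?Read data from this file \[(?P<target>[^\]]+)\] failed"
)
# A scheme followed by "://" and the rest of the bracketed value.
URI = re.compile(r"[a-z][a-z0-9+.-]*://\S+")


def failing_file(log_text: str) -> Optional[str]:
    """Return the URI named in the first FILE-08 message, or None if absent."""
    match = FILE_08.search(log_text)
    if match is None:
        return None
    uri = URI.search(match.group("target"))
    return uri.group(0) if uri else None


if __name__ == "__main__":
    line = (
        "Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: "
        "org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: "
        "ErrorCode:[FILE-08], ErrorDescription:[File read failed] - Read data from this file "
        "[dw.app_enterprise_info_hdfs://nameservice1/user/hive/warehouse/app/"
        "app_enterprise_info/dt=2024-12-31/"
        "part-00009-7f77f045-c85d-4e50-9f98-51d2874ecce2.c000] failed"
    )
    # Prints the hdfs:// URI without the table-id prefix.
    print(failing_file(line))
```

The resulting URI can then be checked directly on the cluster, for example with `hdfs dfs -ls`, to confirm the file is present and readable by the user running the job. Note that the innermost `Caused by: java.lang.NullPointerException` in the log carries no stack frames of its own, so the log alone does not show where inside the reader the failure occurred.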

Zeta or Flink or Spark Version

Zeta

Java or Scala Version

java version "1.8.0_181"

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
