From my logs it seems that when the root zone file download fails during the prefill refresh, it sometimes causes the cache filesystem to fail for an unknown reason; see the logs below and the configuration sketch after them.
Aug 02 19:18:05 HOSTNAME kresd[584737]: [prefil] root zone file valid for 13 seconds, reusing data from disk
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME kresd[584747]: [prefil] downloading root zone to file root.zone ...
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr): SIGBUS received; this is most likely due to filling up the filesystem where cache resides.
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: exited: kresd1 (exit status 121; not expected)
Aug 02 19:18:05 HOSTNAME aiohttp.server[584733]: Error handling request
Traceback (most recent call last):
File "/usr/lib64/python3.9/site-packages/aiohttp/web_protocol.py", line 452, in _handle_request
resp = await request_handler(request)
File "/usr/lib64/python3.9/site-packages/aiohttp/web_app.py", line 543, in _handle
resp = await handler(request)
File "/usr/lib64/python3.9/site-packages/aiohttp/web_middlewares.py", line 114, in impl
return await handler(request)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/server.py", line 62, in error_handler
return await handler(request)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/server.py", line 243, in _handler_metrics
body=await statistics.report_stats(),
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/statistics.py", line 338, in report_stats
await _resolver_collector.collect_kresd_stats()
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/statistics.py", line 96, in collect_kresd_stats
stats_raw = await command_registered_workers(cmd)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/registered_workers.py", line 34, in command_registered_workers
pairs = await asyncio.gather(*(single_pair(inst) for inst in _REGISTERED_WORKERS.values()))
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/registered_workers.py", line 32, in single_pair
return sub.id, await sub.command(cmd)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/interface.py", line 173, in command
reader, writer = await asyncio.open_unix_connection(f"./control/{int(self.id)}")
File "/usr/lib64/python3.9/asyncio/streams.py", line 111, in open_unix_connection
transport, _ = await loop.create_unix_connection(
File "/usr/lib64/python3.9/asyncio/unix_events.py", line 244, in create_unix_connection
await self.sock_connect(sock, path)
File "/usr/lib64/python3.9/asyncio/selector_events.py", line 500, in sock_connect
return await fut
File "/usr/lib64/python3.9/asyncio/selector_events.py", line 505, in _sock_connect
sock.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
Aug 02 19:18:05 HOSTNAME aiohttp.access[584733]: 10.157.2.182 [02/Aug/2024:19:18:05 +0000] "GET /metrics HTTP/1.1" 500 244 "-" "Prometheus/2.53.1"
Aug 02 19:18:06 HOSTNAME supervisord[584725]: spawned: 'kresd1' with pid 596546
Aug 02 19:18:06 HOSTNAME kresd[596546]: [system] path = /run/knot-resolver/control/1
Aug 02 19:18:06 HOSTNAME kresd[596546]: [rules ] cleared 1 stale reader locks
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] root zone file valid for 09 minutes 55 seconds, reusing data from disk
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] empty zone file
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] error parsing zone file `root.zone`
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] root zone import failed (/usr/lib64/knot-resolver/kres_modules/prefill.lua:97: [prefil] zone import failed: cdata<const char *>: 0x7f11ac7c532d), retry in 09 minutes 52 seconds
Aug 02 19:18:08 HOSTNAME kresd[584746]: [rules ] cleared 1 stale reader locks
Aug 02 19:18:08 HOSTNAME kresd[584738]: [prefil] downloading root zone to file root.zone ...
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [cache ] LMDB refusing writes (ignored for 5-9s): Invalid argument
Aug 02 19:18:08 HOSTNAME kresd[596546]: [system] error while loading config: kresd1.conf:263: can't open cache path '/var/cache/knot-resolver'; working directory '/run/knot-resolver'; No space left on device (workdir '/run/knot-resolver')
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [tasign] signalling query triggered: _ta-4f66.
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [tasign] signalling query triggered: _ta-4f66.
Aug 02 19:18:08 HOSTNAME supervisord[584725]: exited: kresd1 (exit status 1; not expected)
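For reference, the prefill module that emits the [prefil] messages above is set up in kresd's Lua configuration roughly like this; the URL, interval and CA bundle path are the documented example values, only illustrative, not necessarily the ones in my config:

```lua
-- illustrative prefill setup (example values from the module docs, not this host's settings)
modules.load('prefill')
prefill.config({
  ['.'] = {
    url = 'https://www.internic.net/domain/root.zone',  -- where the root zone is downloaded from
    interval = 86400,                                   -- refresh interval in seconds (once a day)
    ca_file = '/etc/pki/tls/certs/ca-bundle.crt',       -- CA bundle used to verify the HTTPS download
  }
})
```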
Still, the 7G free should be a sufficient cushion. Could it be that you were running out of RAM? (I think that can cause tmpfs to fail to fill a hole in the file and thus SIGBUS in LMDB.)
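When the cache directory lives on a tmpfs, the sparse LMDB file is backed by RAM, so the configured cache size has to fit comfortably inside both the tmpfs and the memory actually available; otherwise a write into the mapped file can fail and the process gets SIGBUS. A rough sketch of the relevant Lua settings (the path and size here are only examples, not the values from your config):

```lua
-- sketch: keep the LMDB cache size well below the tmpfs size,
-- and keep the tmpfs itself well within available RAM
cache.open(100 * MB, 'lmdb:///var/cache/knot-resolver')
-- or, equivalently, resize an already-open cache:
cache.size = 100 * MB
```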
Here is my current config,
and the output showing my filesystem structure: