
IO-class setting for passing through big IO; data validation test with vdbench detected an error #1444

Open
phyorat opened this issue Mar 8, 2023 · 2 comments
Labels: bug, v24.9


phyorat commented Mar 8, 2023

Description

An IO class can be used to pass big IO (e.g. requests larger than 128KB) straight through to the HDD, skipping the cache; this gives higher performance and better cache efficiency. Example IO-class configuration file:

IO class id,IO class name,Eviction priority,Allocation
0,unclassified,22,0
1,request_size:le:131072,1,1
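
For reference, we load this configuration with casadm along these lines (the cache id and file path are just examples from our setup):

casadm --io-class --load-config --cache-id 1 --file /etc/opencas/ioclass-config.csv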

After loading this IO-class configuration, 128K IOs are written directly to the HDD; likewise, if the data is not in the cache, reads are served from the HDD instead of the cache.
On the other hand, if part of the requested data is already cached, for example the first 64K of a 128K request, then the read should take 64K from the cache and the remaining 64K from the HDD:

|------------ 64K cached ------------|---------- 64K from HDD ----------|

This is the expected data "splicing". The actual IO pattern we observed was:
1. A 64K write IO goes into the cache; this is the first write to the block, with key = 1.
2. The 64K is read back and verified OK.
3. A 64K write IO (with key = 2) is merged with another 64K write into one request, and the 128KB is written directly to the HDD.
4. A 64K read fails verification (it returns key = 1, i.e. stale data).
(We suspect the stale 64K was wrongly read from the cache.)

Steps 3-4 may also be:
3. A 64K write IO (with key = 2), plus a second 64K write IO.
4. The two 64K read IOs are merged into one, reading 128K directly from the HDD (because of IO-class rule [1]).
5. Verification fails (key = 1 is returned, indicating stale data).
(We suspect the stale 128K was read directly from the HDD.)
A rough dd sketch of this pattern is given below.
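
The sketch is only an approximation of the pattern above, not the exact vdbench workload; the device name (/dev/cas1-1, the exported CAS device), offsets and helper files are illustrative:

# steps 1-2: write the first 64K with pattern A and verify the read-back
dd if=/dev/urandom of=pattern_a bs=64k count=1
dd if=pattern_a of=/dev/cas1-1 bs=64k count=1 oflag=direct
dd if=/dev/cas1-1 of=readback_a bs=64k count=1 iflag=direct
cmp pattern_a readback_a    # verifies OK

# step 3: overwrite the whole 128K region in one request, so the IO-class
# rule sends it straight to the HDD (models the merged 64K+64K writes)
dd if=/dev/urandom of=pattern_b bs=128k count=1
dd if=pattern_b of=/dev/cas1-1 bs=128k count=1 oflag=direct

# step 4: read the full 128K back; with the bug, the first 64K may still be
# the old (key = 1) data
dd if=/dev/cas1-1 of=readback_b bs=128k count=1 iflag=direct
cmp pattern_b readback_b    # miscompares when the bug hits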

We ran data validation for this scenario with vdbench, and a data validation error was detected:

21:05:24.364 hd2-0: dvpost: /dev/vdb sd4 sd4 0x00000000 0x234520000 131072 0x0 0x5ecf4d1ed319c 0x11 0x2 0x70 0x0 0 36028797018963971
21:05:24.364 hd2-0:
21:05:24.364 hd2-0: Data Validation error for sd=sd4,lun=/dev/vdb
21:05:24.364 hd2-0: Block lba: 0x234520000; sector lba: 0x234520000; xfersize: 131072; relative sector in block: 0x00 ( 0)
21:05:24.364 hd2-0: ===> Data Validation Key miscompare.
21:05:24.364 hd2-0: ===> Data miscompare.
21:05:24.364 hd2-0: The sector below was written Tuesday, November 8, 2022 20:38:41.711 CST
21:05:24.364 hd2-0: 0x000 00000002 34520000 ........ ........ 00000002 34520000 0005ecf4 d1ed319c
21:05:24.364 hd2-0: 0x010 02..0000 73643420 20202020 00000000 01700000 20346473 20202020 00000000
21:05:24.364 hd2-0: Key miscompare always implies Data miscompare. Remainder of data suppressed.

This error shows that the tool wrote data with key "02xxxx", but the data read back from the core is "01xxxx".
The key point is that right after the error occurred we read the data directly from the core device, and it was correct ("02xxxx"). So there seems to be a data alignment/validation issue between the cache and the HDD, within a very small time window (several milliseconds)?

After removing this IO-class configuration and testing again, no data validation error occurred any more.

In addition, the configuration <Sequential cutoff policy: always; --threshold 128KB> can also trigger the data validation error.
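
For reference, that sequential cutoff setting corresponds to something like the following casadm call (cache/core ids are illustrative; the threshold is in KiB):

casadm --set-param --name seq-cutoff --cache-id 1 --core-id 1 --policy always --threshold 128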

Expected Behavior

No data alignment/validation issue between the cache and the HDD when an IO-class is set so that big IOs skip the cache.

Actual Behavior

There is a data alignment/validation issue between the cache and the HDD when the data of a read request is only partially cached.

Steps to Reproduce

  1. Set an IO-class to pass through block IOs larger than or equal to 128KB.
  2. Use vdbench to test write and read data validation; the data sits on a distributed block-storage system whose lower storage is an Open CAS NVMe-cached HDD.
  3. vdbench reports a data validation error.
  (A sketch of the Open CAS setup commands follows this list.)
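
The Open CAS side of the setup was roughly as follows (device names and ids are placeholders; the vdbench job itself is our internal config):

casadm --start-cache --cache-device /dev/nvme0n1 --cache-mode wb --cache-line-size 8
casadm --add-core --cache-id 1 --core-device /dev/sdb
casadm --io-class --load-config --cache-id 1 --file /etc/opencas/ioclass-config.csv
# point the distributed block-storage layer (and vdbench) at the exported
# /dev/cas1-1 device, then run the write/read validation workload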

Context

This is the base block storage for a distributed block system; we need to guarantee that data validation is correct.

Possible Fix

Perhaps the metadata is not kept strictly consistent, or expires, between the different IO stages (within milliseconds).

Logs

No direct evidence so far, but the reverse verification (removing that IO-class) can serve as a clue.

Your Environment

  • OpenCAS version (commit hash or tag):
    22.03.0.0666.release
  • Operating System:
    CentOS Linux release 7.6.1810 (Core)
  • Kernel version:
    5.10.38-21.hl02.el7.x86_64
  • Cache device type (NAND/Optane/other):
    NAND
  • Core device type (HDD/SSD/other):
    HDD
  • Cache configuration:
    • Cache mode: wb
    • Cache line size: 8
    • Promotion policy: always
    • Cleaning policy: alru
    • Sequential cutoff policy: never
  • Other (e.g. lsblk, casadm -P, casadm -L)
phyorat added the bug label on Mar 8, 2023
katlapinka self-assigned this on Jul 16, 2024
@mmichal10
Contributor

Hi @phyorat,

thank you for posting the issue. Do you happen to still have the vdbench config you used for your test?

@mmichal10
Contributor

I came up with a fio config to mimic vdbench's behaviour:

[dc_repro]
filename=/dev/cas1-1
ioengine=libaio
iodepth=1
direct=1
numjobs=1

# Generate new offset for every second write
rw=randwrite:2
rw_sequencer=identical

bssplit=64k/50:256k/50

# This ensures that every 64K write will be followed by 256K write
number_ios=2
loops=10000

verify=md5
# Verify after every write
verify_backlog=1
# Stop FIO if data corruption (DC) is detected
verify_fatal=1
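
Saved as e.g. dc_repro.fio (the filename is arbitrary), the job can be run with:

fio dc_repro.fio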
