Description
An IO class can pass big IO (for example, larger than 128KB) through to the HDD directly, skipping the cache; this gives higher performance and better cache efficiency. Example IO-class configuration file:
IO class id,IO class name,Eviction priority,Allocation
0,unclassified,22,0
1,request_size:le:131072,1,1
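For reference, this configuration can be loaded into a running cache with casadm roughly as follows; the cache id (1) and file path are placeholders, and exact flags may vary between versions (see casadm --help):
# load the IO-class configuration file into cache 1
casadm --io-class --load-config --cache-id 1 --file /etc/opencas/ioclass-config.csv
# confirm the loaded classes
casadm --io-class --list --cache-id 1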
After loading this IO class, 128K IO is written to the HDD directly; likewise, if the data is not present in the cache, reads are served from the HDD instead of the cache.
On the other hand, if part of the requested data is cached, for example the first 64K of a 128K request, then the read should take 64K from the cache and the remaining 64K from the HDD.
|<------------- 64K cached ------------->|<------------ 64K from HDD ------------>|
This is the expected data "splicing". The actual IO pattern is:
1. A 64K write IO is written into the cache, the first time with key = 1;
2. Read 64K and verify: OK;
3. A 64K write IO (with key = 2) is merged with another into a single 128KB write that goes to the HDD directly;
4. Read 64K but verification fails (got key = 1, which indicates old data).
(We guess the old 64K was mistakenly read from the cache.)
Steps 3-4 may also be:
3. A 64K write IO (with key = 2), plus two write IOs;
4. Two 64K read IOs are merged into one, reading 128K from the HDD directly (because of IO-class rule [1]);
5. Verification fails (got key = 1, which indicates old data). (We guess the old 128K was mistakenly read directly from the HDD.)
We ran data validation for this scenario with vdbench and found that a data validation error occurred.
21:05:24.364 hd2-0: dvpost: /dev/vdb sd4 sd4 0x00000000 0x234520000 131072 0x0 0x5ecf4d1ed319c 0x11 0x2 0x70 0x0 0 36028797018963971
21:05:24.364 hd2-0:
21:05:24.364 hd2-0: Data Validation error for sd=sd4,lun=/dev/vdb
21:05:24.364 hd2-0: Block lba: 0x234520000; sector lba: 0x234520000; xfersize: 131072; relative sector in block: 0x00 ( 0)
21:05:24.364 hd2-0: ===> Data Validation Key miscompare.
21:05:24.364 hd2-0: ===> Data miscompare.
21:05:24.364 hd2-0: The sector below was written Tuesday, November 8, 2022 20:38:41.711 CST
21:05:24.364 hd2-0: 0x000 00000002 34520000 ........ ........ 00000002 34520000 0005ecf4 d1ed319c
21:05:24.364 hd2-0: 0x010 02..0000 73643420 20202020 00000000 01700000 20346473 20202020 00000000
21:05:24.364 hd2-0: Key miscompare always implies Data miscompare. Remainder of data suppressed.
This error shows that the tool wrote data key "02xxxx", but the data read back from the core is "01xxxx".
The key point is that right after the error occurred, we read the data from the core device directly and it was correct ("02xxxx"). So there appears to be a data alignment/consistency issue between the cache and the HDD within a very small time window (several milliseconds)?
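A direct read of the core device can be done with something like the sketch below; /dev/sdX and OFFSET_BYTES are placeholders for the core device path and the failing LBA reported by vdbench:
# read the failing 128K region straight from the core device with O_DIRECT,
# bypassing the cas exported device
dd if=/dev/sdX bs=128K count=1 skip=$((OFFSET_BYTES / 131072)) iflag=direct | xxd | head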
After removing this IO class and testing again, no data validation errors occurred any more.
In addition, the configuration <Sequential cutoff policy: always; --threshold 128KB> can also trigger the data validation error.
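For reference, that sequential cutoff setting can be applied roughly like this (cache id 1 and core id 1 are placeholders; the threshold is given in KiB here, please double-check units and flags with casadm --help):
casadm --set-param --name seq-cutoff --cache-id 1 --core-id 1 --policy always --threshold 128
# verify the applied values
casadm --get-param --name seq-cutoff --cache-id 1 --core-id 1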
Expected Behavior
No data alignment/validation issues between cache and HDD when an IO class is set so that big IO skips the cache.
Actual Behavior
Data alignment/validation issues appear between cache and HDD when the data of a read IO is only partially cached.
Steps to Reproduce
1. Set an IO class so that block IO larger than or equal to 128KB passes through to the HDD.
2. Use vdbench to test data write/read validation; the data is built on a distributed block-storage system whose lower storage is an OpenCAS NVMe-cached HDD.
3. vdbench reports a data validation error (a rough command-line sketch of these steps follows below).
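A rough shell sketch of these steps, assuming cache id 1, the IO-class file shown above, /dev/vdb as the tested volume, and a vdbench parameter file named dv.parm (all names are placeholders):
# 1. load the passthrough IO-class rules into the running cache
casadm --io-class --load-config --cache-id 1 --file /etc/opencas/ioclass-config.csv
# 2. run vdbench with data validation enabled (-v); dv.parm defines the sd/wd/rd
#    workload against /dev/vdb with mixed 64K/128K transfer sizes
./vdbench -f dv.parm -v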
Context
This is the base block storage for a distributed block system; we need to guarantee that data validation passes.
Possible Fix
Maybe the metadata is not strongly consistent, or is stale, between different IO stages (within milliseconds).
Logs
No direct evidence so far, but the reverse verification (removing that IO class) can serve as a clue.
Your Environment
OpenCAS version (commit hash or tag):
22.03.0.0666.release
Operating System:
CentOS Linux release 7.6.1810 (Core)
Kernel version:
5.10.38-21.hl02.el7.x86_64
Cache device type (NAND/Optane/other):
NAND
Core device type (HDD/SSD/other):
HDD
Cache configuration:
Cache mode: wb
Cache line size: 8
Promotion policy: always
Cleaning policy: alru
Sequential cutoff policy: never
Other (e.g. lsblk, casadm -P, casadm -L)
I came up with a fio config to mimic vdbench's behaviour:
[dc_repro]
filename=/dev/cas1-1
ioengine=libaio
iodepth=1
direct=1
numjobs=1
# Generate new offset for every second write
rw=randwrite:2
rw_sequencer=identical
bssplit=64k/50:256k/50
# This ensures that every 64K write will be followed by 256K write
number_ios=2
loops=10000
verify=md5
# Verify after every write
verify_backlog=1
# Stop FIO if a data corruption (DC) is detected
verify_fatal=1
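If the job file is saved as, say, dc_repro.fio (the name is arbitrary), it can be run with:
fio dc_repro.fio
With verify_fatal=1 the run stops as soon as a verification mismatch is hit, which makes it convenient for reproducing and bisecting this issue.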