Snapshot storage on NFS dedicated area #68
After some preliminary tests, it turns out that if a smile file is further compressed into a zip, it consumes around ten times less space. In practical terms, the DAQAggregator for cdaq could then possibly write out around a GB per day (and the minidaqs would add another GB or so).
How about the following scheme: the snapshot is written uncompressed, i.e. no latency is added for the live view. After some time, e.g. 24 hours or a week, a cron job zips the snapshots. The deserializer would then need to look for both kinds of files: if no unzipped snapshot is available, it looks for the corresponding zipped one and decompresses it. This way you introduce no latency for newer snapshots, which are more likely to be requested, and save space in long-term storage.
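A minimal sketch of the fallback lookup described above, assuming a hypothetical `SnapshotResolver` class and a naming convention (not specified in the thread) where the cron job stores the zipped copy next to the original as `<base>.smile.zip`:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Illustrative sketch: resolve a snapshot to a readable stream, preferring
// the uncompressed .smile file (recent, no latency) and falling back to the
// .zip produced later by the cron job.
public class SnapshotResolver {

    // Returns an InputStream over the snapshot bytes, or null if neither
    // the plain nor the zipped form exists. Caller closes the stream.
    public static InputStream open(String basePath) throws IOException {
        File smile = new File(basePath + ".smile");
        if (smile.exists()) {
            return new FileInputStream(smile); // recent snapshot, served as-is
        }
        File zipped = new File(basePath + ".smile.zip");
        if (zipped.exists()) {
            // Older snapshot, compressed by the cron job; single-entry archive.
            ZipFile zip = new ZipFile(zipped);
            ZipEntry entry = zip.entries().nextElement();
            return zip.getInputStream(entry);
        }
        return null;
    }
}
```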
Yes, this sounds like a nice hybrid scheme, and it keeps compression at the individual-snapshot level, which is the important part for DAQView replays.
Even if we only implement the simple solution of zipping/unzipping every snapshot individually, there will not be a significant delay. I tested the timings on my PC for an hour last Saturday (~1000 files), during an ongoing run with around 3/4 of the partitions in and running. Based on this I assume there was a large variety of values within each snapshot, which is usually the case during normal runs, so the task's difficulty was realistic enough.

Overall, the time to read a smile file, zip it, write it, read the zipped file, unzip it and write it again as smile (4 I/Os, 1 compression, 1 decompression) was estimated at less than 20 ms. This should be fine for real-time monitoring. A further micro-optimized implementation could possibly save a few more milliseconds by pipelining smile to zip on the fly, without all the I/Os done during the test. There was not much deviation in time, because there was not much deviation in snapshot sizes either: snapshots in .smile were around 369 kB, while their .zip counterparts were 57 kB (so zipping saves ~85% in space).

The snapshot directories need not be changed at all; they could simply contain zip files once the implementation goes into production. For backwards compatibility, the deserializer should always check whether a file is actually a zip before applying the unzip function. The utility classes to implement this (java.util.zip) already ship with Java; no external library is needed.
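The per-snapshot round trip and the backwards-compatibility check can indeed be done with java.util.zip alone. A sketch, with a hypothetical `SnapshotZip` helper; checking the zip "PK" magic number is one way (an assumption here, not stated in the thread) to decide whether unzip should be applied:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustrative helper: compress snapshot bytes into a single-entry zip,
// detect zip content by its magic number, and decompress it back.
public class SnapshotZip {

    // Every zip file starts with the local-file-header signature "PK" (0x50 0x4B);
    // a smile file does not, so this distinguishes old plain snapshots from zipped ones.
    public static boolean looksLikeZip(byte[] data) {
        return data.length >= 2 && data[0] == 0x50 && data[1] == 0x4B;
    }

    public static byte[] zip(byte[] smileBytes, String entryName) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(smileBytes);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    public static byte[] unzip(byte[] zipBytes) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            zis.getNextEntry(); // single-entry archive, as written by zip() above
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = zis.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```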
DAQAggregator has been running almost without interruption since the beginning of this month. When a matching L0 static flashlist row exists, there is one new snapshot every three seconds, on average. Each snapshot file is slightly less than 300 kB.
For cdaq this means that DAQAggregator consumes almost 8 GB of space per day. Given that there are currently 147 GB left, we will run out of space in less than three weeks, unless we ask for an extension or delete redundant data (dev or prod-2016).
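A back-of-the-envelope check of these figures; the inputs (snapshot size, period, free space) are the numbers quoted above, and the class name is illustrative:

```java
// Illustrative arithmetic check: one ~300 kB snapshot every 3 s
// gives 28800 snapshots/day, i.e. a bit over 8 GB/day, so 147 GB
// lasts under three weeks.
public class StorageEstimate {

    public static double gbPerDay(double snapshotKB, double periodSeconds) {
        double snapshotsPerDay = 24 * 3600 / periodSeconds;
        return snapshotsPerDay * snapshotKB / (1024 * 1024); // kB -> GB
    }

    public static double daysLeft(double freeGB, double gbPerDay) {
        return freeGB / gbPerDay;
    }
}
```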