DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression

Title
DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression
Author(s)
Park, Jisung; Kim, Jeonggyun; Kim, Yeseong; Lee, Sungjin; Mutlu, Onur
Issued Date
2022-02-22
Citation
USENIX Conference on File and Storage Technologies, pp. 247-263
Type
Conference Paper
ISBN
9781939133267
Abstract
Data reduction in storage systems is becoming increasingly important as an effective solution to minimize the management cost of a data center. To maximize data-reduction efficiency, existing post-deduplication delta-compression techniques perform delta compression along with traditional data deduplication and lossless compression. Unfortunately, we observe that existing techniques achieve significantly lower data-reduction ratios than the optimal due to their limited accuracy in identifying similar data blocks. In this paper, we propose DeepSketch, a new reference search technique for post-deduplication delta compression that leverages the learning-to-hash method to achieve higher accuracy in reference search for delta compression, thereby improving data-reduction efficiency. DeepSketch uses a deep neural network to extract a data block's sketch, i.e., to create an approximate data signature of the block that can preserve similarity with other blocks. Our evaluation using eleven real-world workloads shows that DeepSketch improves the data-reduction ratio by up to 33% (21% on average) over a state-of-the-art post-deduplication delta-compression technique. © AST 2022. All rights reserved.
URI
http://hdl.handle.net/20.500.11750/46868
Publisher
USENIX Association
Related Researcher
  • 김예성 Kim, Yeseong
  • Research Interests: Embedded Systems for Edge Intelligence; Brain-Inspired HD Computing for AI; In-Memory Computing
Files in This Item:

There are no files associated with this item.

Appears in Collections:
Department of Electrical Engineering and Computer Science Computation Efficient Learning Lab. 2. Conference Papers
Department of Electrical Engineering and Computer Science Data-Intensive Computing Systems Laboratory 2. Conference Papers

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.