Abstract: Deduplication avoids storing data blocks with identical content, and has been shown to effectively reduce the disk space needed for storing multi-gigabyte virtual machine (VM) images. However, it remains challenging to deploy deduplication in a real system, such as a cloud platform, where VM images are regularly inserted and retrieved. We propose LiveDFS, a live deduplication file system that enables deduplication storage of VM images in an open-source cloud deployed on low-cost commodity hardware with a limited memory footprint. LiveDFS has several distinct features, including spatial locality, prefetching of metadata, and journaling. LiveDFS is POSIX-compliant and is implemented as a Linux kernel-space file system. We deploy our LiveDFS prototype as a storage layer in a cloud platform based on OpenStack, and conduct extensive experiments. Compared to an ordinary file system without deduplication, we show that LiveDFS can save at least 40% of the space for storing VM images, while achieving reasonable performance in importing and retrieving VM images. Our work demonstrates the feasibility of deploying LiveDFS in an open-source cloud.
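The block-level idea described in the abstract can be illustrated with a minimal sketch. It is not the LiveDFS implementation, which is a Linux kernel-space file system. The class name `DedupStore`, the 4096-byte block size, and the use of SHA-1 fingerprints are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes, for illustration only


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct block.

    Hypothetical sketch; names and parameters are not taken from LiveDFS.
    """

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block content (stored once)
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        """Split data into fixed-size blocks and store only unseen ones."""
        fingerprints = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            # The fingerprint identifies the block by its content (SHA-1 assumed).
            fp = hashlib.sha1(block).hexdigest()
            # setdefault keeps the existing copy if this content was seen before.
            self.blocks.setdefault(fp, block)
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name):
        """Rebuild a file by concatenating its blocks in order."""
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def stored_bytes(self):
        """Physical bytes held after deduplication."""
        return sum(len(block) for block in self.blocks.values())
```

Writing two images that differ in only one block to such a store keeps the shared blocks once, so `stored_bytes()` is smaller than the combined size of the two images, while `read()` still returns each image unchanged.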
https://hal.inria.fr/hal-01597754
Contributor: Hal Ifip
Submitted on: Thursday, September 28, 2017 - 5:11:00 PM
Last modification on: Thursday, September 28, 2017 - 5:16:56 PM
Long-term archiving on: Friday, December 29, 2017 - 3:18:19 PM
Chun-Ho Ng, Mingcao Ma, Tsz-Yeung Wong, Patrick Lee, John Lui. Live Deduplication Storage of Virtual Machine Images in an Open-Source Cloud. 12th International Middleware Conference (MIDDLEWARE), Dec 2011, Lisbon, Portugal. pp.81-100, ⟨10.1007/978-3-642-25821-3_5⟩. ⟨hal-01597754⟩