Sunday, April 2, 2017

De-duplication of backup storage

Backup storage is one of the top consumers of storage infrastructure, so storage optimization techniques such as compression and de-duplication have always been priorities for backup IT administrators. De-duplication involves locating duplicate blocks of storage and replacing them with references to a single stored instance of the block. Depending on the workload that is writing to the storage and the block sizes used to perform the de-duplication, storage savings can range anywhere from 50 to 90 percent.
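The idea of replacing duplicate blocks with references can be illustrated with a minimal sketch. It assumes fixed-size 4 KB blocks and SHA-256 digests as block identifiers; both are choices made for illustration, and the function names are hypothetical. It is not how Windows Server or DPM implements de-duplication.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into blocks; keep one copy of each distinct block."""
    store = {}  # digest -> the single stored instance of that block
    refs = []   # ordered list of digests that reconstructs the data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store only the first occurrence
        refs.append(digest)              # every occurrence becomes a reference
    return store, refs


def restore(store, refs) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[digest] for digest in refs)


def savings_percent(data: bytes, store) -> float:
    """Share of the original size that no longer needs to be stored."""
    if not data:
        return 0.0
    stored = sum(len(block) for block in store.values())
    return 100.0 * (1 - stored / len(data))
```

In this toy model the savings depend entirely on how often blocks repeat, which is why the workload and the block size matter so much in practice. The sketch also ignores the space taken by the reference list itself.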

In typical enterprise storage deployments, backup storage is provisioned on SAN devices, which have built-in block-level de-duplication capabilities, and Data Protection Manager (DPM) works seamlessly in this deployment. With Storage Spaces and Scale-Out File Server (SOFS) in Windows Server 2012 R2, customers can create commodity storage built natively on a Windows-based server and JBODs, which can be a viable alternative to traditional SANs. In this deployment, it is important for DPM to interoperate with the native Windows de-duplication in Windows Server 2012 R2. There are some deployment best practices that need to be followed to ensure maximum storage savings.

One more note about the different kinds of de-duplication used for backup storage: there are generally two approaches, inline and offline. Inline de-duplication is done as part of the backup process, where every backup set is de-duplicated as it is written to the backup storage. Offline de-duplication is done as a post-processing step after the backup has completed. De-duplication is an I/O- and performance-intensive operation, so the number of backups running and the size of the backup storage pool determine which approach works better. For large-scale deployments with a large number of backups, offline de-duplication gives better backup throughput (number of backup jobs per hour). It also affords better storage savings, since it can de-duplicate across a whole set of backup data sources, whereas inline de-duplication works on individual data sources. DPM uses offline de-duplication, which provides excellent throughput and storage savings of up to 70 percent in large-scale private cloud deployments.
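The storage-savings argument can be made concrete with a toy comparison. The sketch below follows the framing above: one function de-duplicates each data source on its own, the other de-duplicates across the whole set. The function names, the 4 KB block size, and the use of SHA-256 are assumptions made for illustration, and the code does not model DPM or Windows Server behavior.

```python
import hashlib


def unique_blocks(data: bytes, block_size: int = 4096) -> set:
    """Return the set of distinct block digests found in one data source."""
    return {
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def blocks_stored_per_source(sources, block_size: int = 4096) -> int:
    """De-duplicate each source independently; shared blocks are stored again."""
    return sum(len(unique_blocks(s, block_size)) for s in sources)


def blocks_stored_across_set(sources, block_size: int = 4096) -> int:
    """De-duplicate across all sources; a shared block is stored only once."""
    combined = set()
    for s in sources:
        combined |= unique_blocks(s, block_size)
    return len(combined)


# Two hypothetical backups that share one common block (for example, OS files).
vm1 = b"O" * 4096 + b"1" * 4096
vm2 = b"O" * 4096 + b"2" * 4096
per_source = blocks_stored_per_source([vm1, vm2])
across_set = blocks_stored_across_set([vm1, vm2])
```

Here the per-source approach stores the shared block twice, while the set-wide approach stores it once. The gap grows with the number of data sources that have content in common.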

Source of Information : Microsoft System Center
