It is a well-established industry observation that data loses value as it ages. Over time, reference data that is rarely used grows much faster than active, frequently accessed data. Managing this mix is a challenge, because it can lead to higher costs and poor utilization of your investment in storage infrastructure. Treating all of this data alike drives storage costs up sharply. The real need is to align your storage options with data growth and with how frequently the data is accessed. A single-tier, monolithic storage system yields diminishing returns as the volume of aging and stale data increases, because the associated storage costs become prohibitive.
A fair estimate is that in a typical enterprise environment, active data makes up only 2-4% of the total at any given point in time. Aging data contributes close to 10%, and the rest is rarely used. The same is true for a SharePoint-based Enterprise Content Management system: the active BLOB content is only a small proportion of the terabytes of content that it carries.
Figure 1: A Typical Taxonomy of BLOB Data in a SharePoint Environment
Given what it costs to keep aging or stale BLOBs on primary, high-end storage, where they take up valuable space, it makes little sense to store all BLOBs the same way. Simply offloading the SharePoint BLOB payload from expensive, high-end transactional SQL Server storage to less expensive external storage can yield visible cost savings. The main source of this saving is that aging and inactive BLOB data, which is part of the SharePoint content base but is hardly ever accessed, no longer occupies large tracts of expensive storage space. However, externalizing BLOBs is only one important aspect of BLOB storage management. How you manage the physical storage of the externalized BLOBs is the other.
If you keep all BLOBs, whether active, aging or stale, in a single primary tier of external storage, there is a fair chance you will get only a marginal benefit from externalization, because the cost of that relatively high-end tier will keep increasing. If you have offloaded 90-95% of your data to a supposedly less expensive tier of storage and still see only marginal cost savings, you need to seriously rethink your storage strategy.
An effective approach for the storage of externalized BLOBs is to structure the external storage as a hierarchy of multiple tiers in the form of a hierarchical storage management system (HSM), with one storage tier corresponding to one age-based category of BLOBs.
Figure 2: A Multi-tiered Hierarchical Storage System
Figure 2 above shows a multi-tiered hierarchical storage system. This structure is based on the principle of cost versus data activeness: the more active the data, the more you are willing to spend on its storage and access performance. The most active BLOBs reside on Tier-1, which is high-end, fast-access storage such as a file system or a SAN. Aging data is kept on Tier-2, which is typically NAS-based storage, and archived or seldom-accessed data goes to Tier-3, which can be cloud storage. With this strategy, you effectively push less active content to the cheaper tiers.
StorageEdge has been built with this BLOB storage concept in mind. It provides multi-tiered storage that lets you keep your active content on the most expensive storage and archive older content to less expensive storage. Its intelligent archiving facility ensures that the primary storage is not overburdened with millions of documents. It takes a fine-grained approach to BLOB archiving across multiple storage tiers, which lets you manage the movement and placement of BLOBs at an atomic level.
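The age-based placement described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how StorageEdge is implemented: the function name `choose_tier`, the tier labels, and the 30-day and 365-day thresholds are all hypothetical and would need to be tuned to your own access patterns.

```python
from datetime import datetime, timedelta

# Hypothetical age thresholds; tune these to your own access patterns.
ACTIVE_MAX_AGE = timedelta(days=30)
AGING_MAX_AGE = timedelta(days=365)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Return a tier label for a BLOB based on the time since its last access."""
    age = now - last_accessed
    if age <= ACTIVE_MAX_AGE:
        return "tier1"  # high-end, fast-access storage (file system or SAN)
    if age <= AGING_MAX_AGE:
        return "tier2"  # NAS-based storage for aging data
    return "tier3"      # archive tier, for example cloud storage


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days in (5, 90, 800):
        print(days, "days since last access ->", choose_tier(now - timedelta(days=days), now))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the rule easy to test against known dates.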
Figure 3: Configure a Storage Tier in StorageEdge
StorageEdge does SharePoint archiving of these documents based on two criteria:
For cloud storage, you may want to control bandwidth, and StorageEdge provides throttling for this purpose. StorageEdge also lets you edit these criteria from within SharePoint Central Administration (CA) to control the movement of BLOBs across tiers.
Figure 4: StorageEdge Tiers in a Storage Profile
With multi-tiered storage, you can keep your storage costs aligned with business needs, because you can grow your storage options incrementally. You can expect significantly larger cost savings if your storage system keeps moving offloaded BLOBs to progressively less expensive tiers as they age. It also improves SharePoint performance, because the most active storage (your Tier-1) no longer holds the huge number of documents that would otherwise overwhelm it.
StorageEdge combines everything discussed above and automatically improves SharePoint performance and scalability for you. Here is how it works:
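To make the cost argument concrete, here is a rough back-of-the-envelope comparison in Python. The data shares follow the estimate given earlier in this article (about 3% active, about 10% aging, the rest rarely used). The per-terabyte prices and the 100 TB total are purely hypothetical numbers chosen for illustration, not real pricing for any product or tier.

```python
# Share of total BLOB volume per tier, in percent. 3% sits inside the
# 2-4% range quoted for active data; 10% is the figure quoted for aging data.
SHARE_PCT = {"tier1": 3, "tier2": 10, "tier3": 87}

# Hypothetical prices per TB per month, chosen only for illustration.
PRICE_PER_TB = {"tier1": 100, "tier2": 40, "tier3": 10}


def single_tier_cost(total_tb, price_per_tb):
    """Monthly cost when every BLOB lives on one tier at one price."""
    return total_tb * price_per_tb


def tiered_cost(total_tb, share_pct, price_per_tb):
    """Monthly cost when each share of the data sits on its own tier."""
    return sum(
        total_tb * share_pct[tier] / 100 * price_per_tb[tier]
        for tier in share_pct
    )


if __name__ == "__main__":
    total_tb = 100  # hypothetical total volume of externalized BLOBs
    flat = single_tier_cost(total_tb, PRICE_PER_TB["tier1"])
    tiered = tiered_cost(total_tb, SHARE_PCT, PRICE_PER_TB)
    print("single tier:", flat)
    print("three tiers:", tiered)
    print("saving:", flat - tiered)
```

The exact saving depends entirely on the prices you plug in. The point of the sketch is that most of the volume is billed at the cheapest rate once rarely used data leaves the top tier.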
What to Do Next?