When you build cloud backups into your data protection strategy and pair them with best practices, you get the most from the scalability, flexibility, resilience and cost-effectiveness of the cloud. Using the right practices will help you meet your specific data protection needs while minimizing your storage costs.
This post on using the cloud for backup covers trends, techniques and best practices in optimizing cloud object storage for backup, disaster recovery and long-term data retention. It’s a methodical overview you can follow to use the cloud for backup to protect, preserve and secure your applications and data.
1. Identify the data and applications to include in your backups
The first best practice is to evaluate your data and applications in the broader context of your business. For example, which is more important to your company: an image of a print server or the Oracle database that holds your point-of-sale transactions? Obviously, it’s the latter. So, why spend time and effort on protecting files that don’t matter very much? This is where the term “business-critical” comes in. It refers to applications and data whose absence or loss keeps you from making money, and that’s your primary criterion.
Next is the people factor. Suppose you ask one of your company’s application owners, “Which parts are critical? Which parts don’t need to be backed up?” The usual answer is that they want everything backed up, but backing up everything is prohibitively expensive. That’s why it’s smarter to approach those application owners and say, “We’ve got budget constraints this year. What is the best way to protect the data you want?”
Then, there’s the age factor. Some data goes stale quickly. With something as transactional as databases, for instance, why keep a version that’s more than a week old? Depending on the amount of churn, the data will be so far out of date as to be pointless. Static data is at the other end of the spectrum. Consider files that nobody will look at for years. You can use the cloud for backup and put data in tiers that are cold, cheap and slow. By understanding the business use of the data, you can adjust your backup cycle for each particular data set.
Your best practice is to determine what data is in your IT landscape, how it affects the business and how you want to recover it.
2. Evaluate the cost and risk of backing up those identified items
Lower risk means higher cost. A classic example is your recovery point objective (RPO): The closer you get to the RPO of right now, the more it will cost you to achieve and maintain it. That’s because you’ll need more storage to save all the data changes. Conversely, the further into the past you set your RPO, the less it will cost you.
The same applies to cloud storage. The easier it becomes to use the cloud for backup, the more you'll store there and, consequently, the more it will cost you. And if you lower your risk by backing up more data, you also give yourself more work on the other end when the time comes to restore it.
Note also the shift from capital to operational expenditure in how you pay for storage. Traditionally, if you bought storage and tape libraries for on-premises backup, it was CapEx: you knew how much it was going to cost and you could estimate the useful life of the assets. Your only OpEx might have been the purchase of new tapes. The cloud, however, is all about OpEx: billing based on a subscription or on usage. You'll receive a bill every month, and your costs can easily fluctuate from month to month, which means you'll have to approach budgeting completely differently. You don't want to burn through your money at the beginning of the year and get caught short later on.
Determining your acceptable risk is a matter of working out your ideal recovery point, which is the answer to the question “How many minutes/hours/days of data can we afford to lose?” You’ll likely have different answers for different types of data — customer-facing, back-office, databases, correspondence — so you specify different recovery point objectives and backup technologies accordingly.
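To make that trade-off concrete, here is a back-of-the-envelope sketch in Python. Every figure in it (the size of the churning working set, how often it is overwritten, the retention window, the per-gigabyte price) is a placeholder for illustration, not any provider's actual rate; the point is simply that a tighter RPO keeps more intermediate versions of the same data.

```python
# Sketch of the RPO/cost trade-off. All numbers are illustrative assumptions;
# substitute your own change rates and your provider's current pricing.

HOT_DATA_GB = 200            # working set that changes throughout the day
REWRITE_FACTOR = 6           # how many times that working set is overwritten per day
RETENTION_DAYS = 30          # how long recovery points are kept
PRICE_PER_GB_MONTH = 0.023   # example warm-tier price, USD

def recovery_point_storage(rpo_hours: float) -> float:
    """GB retained per day of recovery points taken every rpo_hours.

    With a loose RPO (say 24 hours), intermediate overwrites collapse into one
    copy; with a tight RPO, each intermediate version is captured and kept.
    """
    points_per_day = 24 / rpo_hours
    versions_kept = min(points_per_day, REWRITE_FACTOR)  # can't keep more versions than exist
    return HOT_DATA_GB * versions_kept

for rpo in (24, 8, 1):
    gb = recovery_point_storage(rpo) * RETENTION_DAYS
    print(f"RPO {rpo:>2} h -> ~{gb:,.0f} GB retained, ~${gb * PRICE_PER_GB_MONTH:,.0f}/month")
```

Running the sketch with these assumptions shows the monthly bill roughly doubling, then doubling again, as the RPO tightens from a day to eight hours to one hour.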
3. Prioritize backups based on recovery plans
Based on cost and risk, you can set priorities on the data to recover first in the event of an outage or loss. Those priorities form the basis of your disaster recovery plan, in which you decide which data you can recover quickly and which data you need to recover first. For instance, you might recover Active Directory first because it’s relatively simple; your Oracle database, on the other hand, is not simple, so you might leave that for later.
Naturally, you’ll need to recover both of those and many more objects. But your disaster recovery plan represents the list of priorities you’ve set for recovery.
4. Figure out ways to optimize and secure data backups
Two other factors loom large in your disaster recovery: the size of your pipe and the speed of your connection. You’ll have to pull the data down to recover it, but remember that other operations in your business will need that pipe and connection too. Don’t assume that you can devote all of the bandwidth to backup and recovery.
You can avoid bottlenecks in the pipe and connection by optimizing the data before you send it to the cloud. Compression technology has been part of the backup landscape since the earliest tape drives, and it still plays a role in optimizing cloud backup.
Another dangerous assumption is that your backed-up data will be secure in the cloud. The reality is that if your cloud credentials are compromised, your backups are no longer secure. As in the early days of tape, the smart step is to encrypt the backups you store in the cloud. That way, even if your credentials are exposed, the data itself stays protected, and you can preserve it in a stable format for the long term.
Preserving data is a matter not only of how long it must last, but also of how secure it must remain. As we continue to see with ransomware attacks, preserving data is about both retention and encryption.
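As a rough illustration of compressing and encrypting a backup before it leaves your network, here is a minimal Python sketch. It assumes the third-party cryptography package and uses symmetric Fernet encryption as one example approach; the file names are hypothetical, and most backup products will handle this step for you.

```python
import gzip
from pathlib import Path

from cryptography.fernet import Fernet  # third-party: pip install cryptography

def prepare_backup(source: Path, dest: Path, key: bytes) -> None:
    """Compress, then encrypt, a backup file before it travels to the cloud."""
    raw = source.read_bytes()
    compressed = gzip.compress(raw)               # shrink what goes over the pipe
    encrypted = Fernet(key).encrypt(compressed)   # protect it even if credentials leak
    dest.write_bytes(encrypted)

# Generate the key once, store it outside the cloud account that holds the
# backups, and never lose it: without the key the backups are unrecoverable.
key = Fernet.generate_key()
prepare_backup(Path("pos_transactions.dump"), Path("pos_transactions.dump.enc"), key)
```

Note that compressing before encrypting matters: encrypted data looks random and compresses poorly, so reversing the order throws away most of the bandwidth savings.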
5. Estimate the cost of using the cloud for backup
It’s not easy, but at some point you’ll have to estimate what you’ll spend to use the cloud for backups. Even though it’s fractions of a cent per gigabyte for storage, that soon adds up, especially on a monthly basis. Also, depending on your plan, you’ll be charged when you move the data to a different tier, and when you touch, scan and pull down the data.
Public cloud providers usually offer a calculator, but calculation is not simple unless you really understand what your application or solution is doing to access the data. A more effective way to estimate is to store small amounts of different data types in the cloud as a proof of concept. With time, you’ll gradually develop a more accurate picture of your costs.
An important factor is whether you store data more often than you recover it, or vice versa. Providers’ pricing models tend to favor one or the other, and if you shop around, you can find a match for your data needs.
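One simple way to structure such an estimate is a short Python sketch like the one below. The unit prices are placeholders, not any provider's actual rates; pull the real numbers from your provider's price list or calculator and adjust the mix of storing versus restoring to match your own workload.

```python
# Rough monthly-cost estimator for a cloud backup proof of concept.
# All unit prices are placeholder assumptions; replace them with the figures
# from your provider's current price list.

STORAGE_PER_GB_MONTH = 0.023   # warm tier
ARCHIVE_PER_GB_MONTH = 0.004   # cold tier
EGRESS_PER_GB = 0.09           # data transferred out during restores
REQUESTS_PER_10K = 0.05        # PUT/GET-style API operations

def monthly_estimate(warm_gb: float, cold_gb: float, restored_gb: float, api_calls: int) -> float:
    return (warm_gb * STORAGE_PER_GB_MONTH
            + cold_gb * ARCHIVE_PER_GB_MONTH
            + restored_gb * EGRESS_PER_GB
            + (api_calls / 10_000) * REQUESTS_PER_10K)

# Example mix: 2 TB warm, 20 TB cold, one 500 GB restore, a million API calls
print(f"~${monthly_estimate(2_000, 20_000, 500, 1_000_000):,.2f} per month")
```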
6. Focus on data retention
To return to the issue of data types, given that everything you store takes space and costs money, how long do you plan to keep your data? Do you need to keep it for seven years, like email concerning a merger? Or will it be obsolete in a few weeks, like a sales transaction database?
A seven-year data retention period is usually based on a regulation or industry standard; it's unlikely you'll need to access that data except in an audit. That makes it an ideal candidate for cold storage tiers such as glacier, archive and deep archive. Generally, colder storage means lower cost to retain data, but higher cost and more time (hours or days) to retrieve it.
At the other end of the spectrum is business-critical data that you want to recover quickly in a crisis. That’s better suited to warm or hot storage with immediate access for faster recovery.
It’s natural and sensible to spend more to retain data that’s important to your daily business and to spend less on data you have to keep for seven years. The storage tiers offered by providers are set up with cloud backup best practices — both technical and financial — in mind.
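As an example of how those tiers are typically put to work, here is a sketch of a lifecycle policy that assumes Amazon S3 and the boto3 SDK; the bucket name and prefixes are hypothetical, and other providers expose similar lifecycle mechanisms for their own tiers.

```python
import boto3  # AWS SDK for Python; other clouds offer comparable lifecycle APIs

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",   # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                # Compliance data: rarely read, kept seven years, pushed to the coldest tier
                "ID": "seven-year-audit-retention",
                "Filter": {"Prefix": "audit/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "Expiration": {"Days": 2555},   # roughly seven years
            },
            {
                # Business-critical backups: stay warm for fast recovery, expire once superseded
                "ID": "critical-short-retention",
                "Filter": {"Prefix": "critical/"},
                "Status": "Enabled",
                "Expiration": {"Days": 35},
            },
        ]
    },
)
```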
7. Budget for data restoration
In fact, it’s also sensible to earmark part of your budget for data restoration.
Restoration is a matter of degrees. Every now and then, you’ll need to recover files because of problems like accidental deletion. That won’t put a big dent in your budget, but don’t assume it’s the only restoration scenario. If you need to recover from a disaster, you may incur big-time egress charges for retrieving gigabytes and terabytes of your data from the cloud. Not only that, but recovery will take a long time.
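A quick sanity check like the following Python sketch can keep a large restore from surprising you. The bandwidth and egress price are assumptions to replace with your own figures, and they ignore provider-side retrieval delays for cold tiers.

```python
# Before a large restore: roughly how long it will take on the wire and what
# the egress bill might look like. All figures are illustrative assumptions.

RESTORE_TB = 5           # size of the data set to pull back
LINK_GBPS = 1            # usable download bandwidth, gigabits per second
EGRESS_PER_GB = 0.09     # placeholder egress price, USD

restore_gb = RESTORE_TB * 1_000
hours = (restore_gb * 8) / (LINK_GBPS * 3_600)   # GB -> gigabits, divided by gigabits per hour
print(f"~{hours:.1f} hours on the wire, ~${restore_gb * EGRESS_PER_GB:,.0f} in egress charges")
```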
8. Evaluate intelligent recovery
Given how long data backup and recovery solutions have been around, it can feel as though you’re stuck in an IT infrastructure framework that’s been there forever. Nevertheless, innovative technology is evolving to make existing infrastructure more agile.
Agility and alternate approaches to storage are important, and those approaches needn’t be brand-new solutions. You can repurpose infrastructure that is still useful by coupling it to something new, leading to a more cost-effective, agile solution, which you’ll need in an environment of constant change.
In the context of cloud recovery best practices, intelligent recovery can help you accelerate recovery. Rather than simply copying data into the cloud, you extend your storage solution into it, backed by a system that intelligently assembles your data set for recovery. Instead of making you wait for all the data to come back from the cloud, the system starts with the data it holds locally and fills in the gaps from cloud storage while the recovery is still in progress.
That way, recovery starts right away, instead of you waiting hours for an entire data set to come back down from the cloud before the backup software can even begin extracting it. That first step is a lot like sitting at your desk waiting for somebody to deliver a tape. Eliminate it, and your recovery time improves and the new technology starts to deliver on what it promises.
9. Automate anywhere you can
IT infrastructure has become complex, with many interacting components. So, the simpler you can make using the cloud for backup and recovery, the easier it will be to protect your data. If you spend less time and effort on data protection, and if recovery is transparent, then you'll know you're on the right path.
The products you use should enable you to automate almost everything associated with backup and recovery. Why would you have any human interaction at all in an operation you can automate? Save the human effort for other things, not for backing up and restoring data.
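As a minimal illustration of what "automate everything" can look like at small scale, here is a sketch of a nightly job you could run from cron or any scheduler. The paths and bucket name are hypothetical, and a real backup product adds cataloguing, verification and alerting on top of this.

```python
"""Minimal nightly backup job intended to run unattended from cron.

A sketch only: paths, bucket name and retention are assumptions.
"""
import datetime
import logging
import tarfile
from pathlib import Path

import boto3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("backup-job")

SOURCE_DIR = Path("/var/exports/nightly")   # hypothetical dump directory
BUCKET = "example-backup-bucket"            # hypothetical bucket

def run_backup() -> None:
    stamp = datetime.date.today().isoformat()
    archive = Path(f"/tmp/backup-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:        # compress before it hits the pipe
        tar.add(str(SOURCE_DIR), arcname=stamp)
    boto3.client("s3").upload_file(str(archive), BUCKET, f"critical/{archive.name}")
    log.info("uploaded %s (%.1f MB)", archive.name, archive.stat().st_size / 1e6)
    archive.unlink()                                   # don't let local staging fill the disk

if __name__ == "__main__":
    run_backup()   # e.g. cron entry: 0 2 * * * /usr/bin/python3 backup_job.py
```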
Conclusion
Cloud storage offers scalability, flexibility and — if you follow best practices — low storage costs, strong data protection and cyber-resilience. Those best practices cover three operations: backup, storage and recovery.
Smart system administrators think in terms of the different types of data they must protect. They also stay mindful not only of the monetary cost of backup, storage and recovery, but also of the time element for all three operations.