The fire at data centre operator OVH this month highlighted both the importance and the complexity of effective backup strategies. OVH has responded by offering free backups to all its cloud customers in future.
“Is it worth making a backup of every device? This is almost a philosophical question: is it worth checking every car, even one that we use only once a month? I would answer that it depends on whether what we have stored is valuable to us,” said Robert Paszkiewicz, sales director at OVH Poland.
“The backup market is constantly growing. The global backup and data recovery market is expected to reach a dizzying $11.59 billion by 2022, of which the cloud backup market alone is expected to be worth $4.13 billion,” he said.
“Backup methods differ, and so can the time they take to run. The basic copy is a full copy (that is, a one-to-one copy), including all specified files. Files that have changed since the most recent full backup are archived in a differential backup. An incremental backup captures only what has been added or changed since the latest backup of any kind. An optimal backup should follow the 3-2-1 rule, where 3 refers to the original and two backups, 2 is the recommended number of independent storage media, and 1 is a copy stored in another location, e.g. in the cloud.”
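The 3-2-1 rule described in the quote can be expressed as a simple check. The following is a minimal sketch only: the `DataCopy` record, its field names and the `satisfies_3_2_1` function are hypothetical and are not taken from OVH or any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical record for illustration)."""
    name: str
    medium: str                # e.g. "ssd", "nas", "object-storage"
    offsite: bool              # True if held away from the primary site
    is_original: bool = False  # True for the production copy


def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule: 3 copies, 2 media, 1 offsite backup."""
    media = {c.medium for c in copies}
    offsite_backups = [c for c in copies if c.offsite and not c.is_original]
    return len(copies) >= 3 and len(media) >= 2 and len(offsite_backups) >= 1


plan = [
    DataCopy("production", "ssd", offsite=False, is_original=True),
    DataCopy("local backup", "nas", offsite=False),
    DataCopy("cloud backup", "object-storage", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Dropping the cloud copy from `plan` makes the check fail on two counts: only two copies remain and none is offsite.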
He also points to the lifetime of components. “Most data carriers, such as memory cards, disks or flash drives, have a fairly short life and are prone to failure, so saving important information only on them is simply unwise,” he said.
UK business continuity and IT disaster recovery provider Databarracks points to a range of backup techniques. The company set up the UK’s first managed online backup services over 15 years ago.
A full backup is an entire copy of data. The disadvantage of a full backup is the length of time it takes to complete within a backup window. This means it is very rarely possible to run full backups over a wide area network (WAN).
Differential backups record the changes between full backups. The first captures the data changed since the initial full backup. After that, they’re cumulative, so the second also contains the changes in the first, and so on. This means full restorations are quick because the recovery needs a maximum of two backups: the last full backup and the latest differential. However, differential backups require more storage than incremental backups.
Incremental backups are similar to differential backups except they’re not cumulative, so only the changes from the previous backup are captured. Incremental backups are therefore smaller and faster to complete than differential backups, although restores can take longer because the individual archives must be merged. As individual backups are smaller, they are the most efficient to run over the WAN. Online backup is sometimes referred to as ‘incremental-forever backup’ because after one initial backup only incremental backups are ever taken.
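The difference between full, differential and incremental backups comes down to the cutoff used to decide what counts as changed. The sketch below illustrates this under a simplifying assumption: changes are detected by comparing modification times, whereas real products may rely on catalogues or block-level change tracking instead. The function name and parameters are hypothetical.

```python
def select_files(mtimes, mode, last_full, last_backup):
    """Return the paths a backup run of the given mode would capture.

    mtimes      -- dict mapping path to last-modified time
    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if mode == "full":
        return sorted(mtimes)      # everything, regardless of age
    if mode == "differential":
        cutoff = last_full         # cumulative since the last full backup
    elif mode == "incremental":
        cutoff = last_backup       # only since the previous backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(path for path, t in mtimes.items() if t > cutoff)


# Full backup taken at t=10, an incremental at t=20, next run now.
mtimes = {"a.txt": 5, "b.txt": 15, "c.txt": 25}
for mode in ("full", "differential", "incremental"):
    print(mode, select_files(mtimes, mode, last_full=10, last_backup=20))
```

In this example the differential run captures both `b.txt` and `c.txt`, while the incremental run captures only `c.txt`, which is why incremental archives are smaller but more of them are needed for a restore.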
Synthetic full backups are not strictly a methodology but rather a technology that sits on top of the methods described above. A synthetic full backup is the server-side construction of a ‘full backup’ from smaller individual backups. This means when a full restore is needed, the reconstruction of files or file parts into a usable whole is already done.
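The server-side construction of a synthetic full backup can be pictured as replaying incrementals over the last full backup. This is a minimal sketch, assuming each backup is a mapping from path to content and that a value of `None` marks a deleted file; both conventions are assumptions made for illustration, not any vendor's format.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup with its incrementals, oldest first."""
    image = dict(full)                 # leave the original backup untouched
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                image.pop(path, None)  # file deleted since the last backup
            else:
                image[path] = content  # file added or changed
    return image


full = {"a.txt": b"v1", "b.txt": b"v1"}
incrementals = [
    {"a.txt": b"v2"},                  # first incremental: a.txt changed
    {"b.txt": None, "c.txt": b"v1"},   # second: b.txt deleted, c.txt added
]
synthetic = synthesize_full(full, incrementals)
print(sorted(synthetic))
```

Because the merge happens ahead of time, a restore reads one image rather than walking the whole chain of incrementals.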
A hot backup, also known as dynamic or active backup, is performed while a database is online and accessible to users.
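As a small illustration of a hot backup, Python's standard `sqlite3` module exposes SQLite's online backup facility through `Connection.backup`, which copies a database while the connection remains open. The sketch below uses in-memory databases and a hypothetical `orders` table purely for demonstration; it is not how any particular backup product works.

```python
import sqlite3


def hot_backup(source):
    """Copy a live SQLite database into a new in-memory database."""
    target = sqlite3.connect(":memory:")
    source.backup(target)  # the source connection stays open and usable
    return target


live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
live.execute("INSERT INTO orders VALUES (1)")
live.commit()

backup_copy = hot_backup(live)

# The live database keeps accepting writes after the backup is taken.
live.execute("INSERT INTO orders VALUES (2)")
live.commit()

print(backup_copy.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
print(live.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

The backup holds the state at the moment it was taken, while later writes appear only in the live database.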
Hierarchical storage management is a policy-based management layer placed over backup and archive operations. It invisibly moves files between backup and archive storage depending on their age and user demand. This ensures the most economical use of expensive higher-performance storage, whilst automating the migration of old and unused data to the archive. From the user’s perspective, no administration is required to restore from the archive: it’s simply one connected environment in which everything is available.
Backup solutions capture and transmit source data using either an agent-based or an agentless architecture. Agent-based backup systems install a software instance on every protected component in the network. The agent is normally loaded in the OS stack, which gives more control and visibility of the host system. Agent-based backup also uses local resources and won’t hamper bandwidth. Agentless backups use one centralised installation, usually onsite, that captures all the target infrastructure in one place.
Continuous data protection is a form of constant backup that continuously scans the environment for changes and sends them to the backup environment in near real-time.
Snapshotting is best likened to taking a photograph of the target environment. It’s a static image that represents how the live environment looked at a given point in time. However, the environment must pause operations whilst the snapshot is taken in order to ensure accuracy.
Deduplication removes superfluous duplicate data from backup sets at file or block level. By saving just one copy, storage and transmission become faster and cheaper.
To guarantee the integrity of backups, it’s advisable to perform them during non-peak hours. Network traffic can lead to inconsistencies between the source data and the backup while the operation is taking place. As such, system administrators tend to schedule backup windows overnight, outside of regular office hours.
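Block-level deduplication, as described above, can be sketched with content hashing: each block is stored once under its digest, and a backup becomes a list of digests. This is a minimal illustration that assumes fixed-size blocks and SHA-256; the function names and the 4,096-byte block size are hypothetical choices, and real products often use variable-size chunking.

```python
import hashlib


def dedup_store(data, store, block_size=4096):
    """Store each unique block of data once; return the list of digests."""
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)
    return recipe


def restore(recipe, store):
    """Rebuild the original data from its list of digests."""
    return b"".join(store[digest] for digest in recipe)


store = {}
data = b"A" * 8192 + b"B" * 4096   # two identical blocks, one distinct
recipe = dedup_store(data, store)
print(len(recipe), "blocks referenced,", len(store), "blocks stored")
```

Here three blocks are referenced but only two are stored, and `restore(recipe, store)` returns the original bytes.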
“Copy data” is the collective set of all data not currently being used in production (e.g., a snapshot, backup, vault, or replica of a version made for various IT or business functions—data recovery, Dev-Test, analytics or other business or operational functions). Copy Data Management (CDM) is designed to manage the creation, use, distribution, retention, and clean-up of copies of production data – or “copy data.”
“So let’s remember that the system is as secure as its weakest link,” added Paszkiewicz at OVH.