A recent survey by the magazine InformationWeek revealed that roughly half of the companies out there are managing between 1TB and 99TB of company data. That amount is not getting any smaller; in fact, it is growing rapidly. For IT managers in charge of projects that make use of this valuable company data, this brings up a very important question: do you have the IT manager skills to figure out exactly how you are going to back up that much data?
Say Hello To Tape
Before we can talk about how best to back up a Big Data data set that could be as large as a petabyte or more, we should first spend some time talking about how this data behaves. Do you have any IT manager training in this area? One of the most important things for an IT manager to understand is that not all data is created equal.
Instead, most of your data is probably going to be referenced rarely. That means that you are going to have to both store and back up text, numbers, video, audio, and images that will sit around doing nothing most of the time. You are facing a situation where backing up this much data is almost impossible, yet the data is so important that you have to.
Some of you younger IT managers may be thinking that backing up a Big Data store can’t be all that hard: just allocate more disk space. Hang on a minute. Disk storage is not free, and creating a humongous storage system for your growing data and then replicating it to handle its backup makes no economic sense: you’d be wasting a lot of your company’s money. There is a better solution: tape.
Don’t Say Goodbye To Disk
Here’s what an IT manager needs to do. When the Big Data set is being created, analyze the data. Some of the data will be accessed often. This data needs to be kept on disk. Some of the data will be used rarely. This data can be placed onto tape. All of the data can be copied to tape when the big data set is being created as a part of your backup creation process. Going forward, any changes that are made to the data set can then be copied to the backup tapes so that a nightly backup run does not need to occur.
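The idea of copying only the changes to the backup tapes, instead of rerunning a full backup every night, can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than how any particular backup product works: the manifest format, the function names, and the choice of SHA-256 fingerprints are all hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of an item's content."""
    return hashlib.sha256(content).hexdigest()


def changed_since_last_backup(items: dict[str, bytes],
                              manifest: dict[str, str]) -> list[str]:
    """List item names that are new or modified relative to the manifest.

    `manifest` maps item name -> digest recorded at the previous backup
    (a hypothetical format chosen for this sketch).
    """
    return sorted(name for name, content in items.items()
                  if manifest.get(name) != digest(content))


def record_backup(items: dict[str, bytes],
                  manifest: dict[str, str],
                  names: list[str]) -> None:
    """Update the manifest after the named items were copied to tape."""
    for name in names:
        manifest[name] = digest(items[name])


# The initial full copy: with an empty manifest, every item counts as changed.
data = {"sales.csv": b"q1,q2", "promo.mp4": b"\x00\x01"}
manifest: dict[str, str] = {}
first_run = changed_since_last_backup(data, manifest)
record_backup(data, manifest, first_run)

# Later, only one item is modified, so only that item needs to go to tape.
data["sales.csv"] = b"q1,q2,q3"
second_run = changed_since_last_backup(data, manifest)
print(first_run)
print(second_run)
```

The sketch ignores deleted items and the actual tape write; it only shows how a record of what was already backed up lets you skip everything that has not changed.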
As an IT manager you are going to have to be constantly reevaluating what parts of your data should be stored on disk and what parts should be stored on tape. Generally speaking, the more active parts of the data along with the most recent data should be moved to disk. However, things change and so you’ll need to keep an eye on what data is actually being referenced and make the appropriate storage changes as needed.
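This kind of periodic reevaluation could look something like the following Python sketch. Everything in it is an assumption made for illustration: the `Item` fields, the 30-day and 10-read thresholds, and the sample catalog are hypothetical placeholders, not recommendations.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Item:
    name: str
    tier: str             # current location: "disk" or "tape"
    created: date
    reads_last_90d: int   # how often the item was referenced recently


def target_tier(item: Item, today: date,
                recent_days: int = 30, busy_reads: int = 10) -> str:
    """Recent or actively read data belongs on disk; the rest on tape.

    The thresholds are placeholders for this sketch.
    """
    is_recent = (today - item.created) <= timedelta(days=recent_days)
    is_busy = item.reads_last_90d >= busy_reads
    return "disk" if (is_recent or is_busy) else "tape"


def plan_moves(items: list[Item], today: date) -> list[tuple[str, str, str]]:
    """Return (name, from_tier, to_tier) for every item in the wrong tier."""
    return [(i.name, i.tier, target_tier(i, today))
            for i in items if i.tier != target_tier(i, today)]


today = date(2024, 6, 1)
catalog = [
    Item("orders_2024", "disk", date(2024, 5, 20), 120),  # active: stays put
    Item("logs_2019", "disk", date(2019, 3, 1), 0),       # cold: goes to tape
    Item("audit_2021", "tape", date(2021, 7, 9), 25),     # in use again: to disk
]
moves = plan_moves(catalog, today)
print(moves)
```

Running a check like this on a schedule gives you a list of moves to review, which is one way to keep an eye on what is actually being referenced without re-sorting the whole data set by hand.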
Your goal needs to be to find a way to cost-effectively store and back up your Big Data store. Moving the entire data set at one time is not really possible even with today’s large bandwidth networks. Make use of tape systems to hold the data that you don’t need every day and to hold your backup and you’ll have solved your storage problems for both today and tomorrow.
What All Of This Means For You
In today’s era of “Big Data”, the amount of data that an IT manager and their team are dealing with as a part of almost every project has quickly grown to become very large. After you get done doing all of that IT team building, a plan for how to back up all of this immense data needs to be crafted.
Not all data is created the same. In fact, much of the data that the IT team is responsible for will probably be accessed very infrequently. This means that a cost-effective way of both storing it and creating a backup for it is required. The old standby of using tape storage is economically the best way to go about doing this. Disk storage still plays a role for the parts of the data that will be accessed more often, but the majority of the infrequently accessed data should be stored on tape.
All too often in today’s big data environments not enough time is being spent thinking about how to back the data up. This is a hard problem – just making the big data available for applications that need it is a challenge, let alone creating a separate backup.
Financial concerns need to play a key role in any backup solution. The simplest way to go about doing a backup is to buy more disk storage. However, when you take a look at how infrequently the data will be accessed, this quickly stops being cost-effective. Take the time to find the right balance between tape and disk and you will have solved the problem of backing up your big data once and for all.
- Dr. Jim Anderson
Blue Elephant Consulting –
Your Source For Real World IT Management Skills™
Question For You: Do you think one backup solution is the right way to go for big data or should you use multiple solutions?
What We’ll Be Talking About Next Time
I’d like you to think about a traffic light for a minute. When you approach a traffic light you automatically take a look at it to determine what color it is. If it’s green, then you’ll keep going. If it’s red or amber, then you’ll start to press on the brakes and prepare to stop. It turns out that IT managers are a little bit like a traffic light to the rest of their department. One of the IT manager skills that you need is the ability to be aware of your mood and the impact that it can have on your ability to manage your IT team.