Data Management, Backup Solution: For Photographers

First and foremost, I want to mention that this article/blogpost/guide is a long time in the making.  Not that I have been working and preparing my comments for a number of months, but that it has been on my mind for probably one year now.  A lot of friends and colleagues have asked me a lot about my current data management and backup system and I feel the desire to help others better understand the options that are available today.

Although the intended audience for this post is mainly photographers or anyone who wants a little more insight into digital asset management, it is a good read for anyone who wants to understand how important it is to have a photographer (or graphic designer, printer, etc.) who takes their digital assets very seriously. If you were a bride, how would you feel if your photographer had to explain to you that your images were lost, deleted or destroyed? You probably would not like it and you would probably wonder what happened. A photographer who does NOT take digital asset management seriously should not be toying with events as important as weddings.

I’m a self-professed computer nerd. I’ve grown up with computers my whole life, for which I’m grateful. I’m a DIY (do-it-yourself) kind of person. I’m mechanically minded. If I want to learn how something works, I take it apart and figure out the parts and how everything connects. With computers, I’ve always been fascinated by the technology, the software, all the little bits that make our lives so comfortable (and stressful) today. I will be the first one to admit that I do not know everything. In fact, there are probably quite a few things that I know incorrectly, and I will be the first one to find out why I’m wrong and correct my understanding. But what I do know is generally sound. I make this assumption with my approach to data management and backup strategies for photographers, and other creative (or non-creative) professionals who rely on digital information to earn a living.

Let’s face it.  As a photographer in the digital age, my money is made by creating and selling my images to clients.  I shoot 100% digital (at the time of this article) and I MUST protect my images or else I will be out of business for good.  If I photograph a wedding, which is generally a once-in-a-lifetime event that cannot be recreated, I need to not only make sure I’m capable and qualified to photograph such an event, but even more so with the post-event business practices.  Losing the digital photographs of such an event would be just as bad as never showing up to photograph the event; probably even worse.  This is why it is crucial that you are able to capture the important event, but also manage the digital assets so that you can deliver your final product, and in the end keep your job.

I’m sure that everyone has heard the adage: “Never keep your eggs in one basket”.  I hope that everyone follows this principle with their own digital assets, but this is only a small part of managing your cash crop as a photographer.  Other points of topic such as data redundancy, data integrity, system up-time, system compatibility and content management are just a few more things that make up the digital management mix.  I hope that some of this information is helpful to those looking for tips, answers and clarification on a subject that carries a wide variety of opinions and approaches. For an in-depth publication about digital assets, don’t forget The DAM Book. I recently purchased it but have yet to read it, looking forward to learning more.

BACKUP PHILOSOPHY

Never keep your eggs in one basket.  I’m sure we have all heard this since we were knee-high to a grass-hopper.  You keep hearing it because it is sound business advice. As a photographer, this can be applicable in a number of different ways.

First: If you rely on selling digital images to clients to make money, then you MUST have a backup.  Period.  Having a backup is not enough.  A backup of your backup is a very wise practice that will save your bacon some day.  How about backing-up your data before you leave the event (or location) where you took the photographs?  Modern technology allows you to do this with portable hard drives (from Epson and other manufacturers), or even using a camera that incorporates dual memory cards for redundant capture.  Finally, backups are not limited to data, but also to necessary equipment to perform the job.  Backup cameras, lenses, flashes, memory cards, stands, bags, etc etc . . . all necessary to perform the job and have peace of mind.

As for data backups, keeping your working files and backup files all at the same location is still keeping your eggs in one basket.  If you manage both at home, or an office, there is still a possibility that both your working files and backup files could be wiped out at the same time.  Fire, burglary, earthquake, etc . . . anything could happen!  Plan for the worst and hope for the best.  Maintaining a backup that is off-site is the only true way to make sure your data is safe at all times.  Managing that data and keeping it all up to date can be a struggle for a lot of people, and it is for this reason that I wish to share my approach.

GENERAL APPROACH

My computer and data setup consists of many different parts, and my decision to use this setup is very deliberate.  Here is a quick list of parts:

-Working Drives
-Backup Drives
-Redundant Backup Drives

All of these drives are non-operating system drives, and even non-personal data drives. My operating system (plus applications) is on its own drive, separate from everything else in my system. That in itself could be a discussion for another day. My own personal documents, music, personal photographs and other non-business related data are also stored on separate drives. Those are less important to me and I don’t need to be as vigilant and cautious with that information; that’s just my personal choice. Still, that data has its own backup because I don’t want to lose it.

Hard drives come in many different shapes, sizes, formats, speeds and costs.  Hard drives fail all the time and they are very unpredictable.  For the most part, there is never a way to predict how long a hard drive is going to last, or when it is about to die.  Hard drives are very intricate devices and should be well cared for if you want them to last.

Hard drives die for a couple of reasons. First reason: heat. If your hard drives operate at high temperatures for a long time, their life will be cut short. Shock is also a hard drive killer. If a hard drive is turned on and the discs are spinning, a sudden shock or jolt can easily kill the device for good. There are other reasons (static, power cycles, operating hours, etc.) that hard drives die, but for the most part these are the top two reasons.

The first part of my data management consists of “working drives”. These are the hard drives from which I work. When I photograph an event, the images go straight to the working drives. Any edits are done on working drives. It is my one point of contact when I need to work with my digital assets. My working drives are all internal 7200rpm SATA drives. I have always used Western Digital drives, and I have yet to be let down (knock on wood). These drives and this technology have been around a LONG time now, so they have been tried and tested. Drives that spin at 7200rpm are, for the most part, the fastest that you will really need to use. Drive density is also an important factor to look into, because at the same rotational speed a 500GB drive with 2 platters is not as fast as a 640GB drive with 2 platters. The latter has a denser array of “blocks” on the disc, which allows your computer to access the data on those drives faster than the former. Many manufacturers make different models of the same size drive. Some are “green”, others are middle-grade and others server-grade. I will generally buy drives that are a step down from the server-grade (top of their product line). For Western Digital, their top-line model is the “Black” line, and the “Blue” line is their step down. I’m sure companies like Seagate, Fujitsu and other hard drive manufacturers do the same thing.

My working drives are always on. This is both good and bad. Keeping the drives spinning isn’t a bad thing. Most hard drives are rated for thousands and thousands of hours before their expected failure, so don’t be worried about keeping your hard drives running. An occasional power cycle is fine for any electronic device, but letting your hard drives “spin down” and “spin up” again and again in short periods of time can be taxing, and eventually fatal, for a hard drive. This is why keeping your working drives running consistently is a good thing. If a new hard drive lasts past the first week of normal use, chances are that it will last a long, long time.

I have 3 working drives. I use two 750GB drives, and one 1TB drive. The two 750GB drives are split between my current work: weddings and non-weddings (portraits, commercial, etc.). My 1TB drive is an “archive” drive. It holds any and all client data that no longer fits on the first 2 drives. When one of my 750GB drives fills up and I no longer have room, my oldest client data (generally data that I no longer need to access) is moved to the final 1TB drive. This practice isn’t necessarily the best way to do it, but I wanted to keep my two 750GB drives in my current system, so I was forced to add additional storage space and this is just the route I chose to take. Of course you could use 2TB working drives; you will just be able to hold more data before moving on to different storage setups.
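
The "move the oldest client data to the archive drive" step can be sketched in a few lines of Python. This is only an illustration and not the tooling I actually use: it assumes one folder per client job directly under the working drive, it judges age by each folder's modification time, and the function names are made up for the example.

```python
import shutil
from pathlib import Path


def folders_to_archive(working_root, bytes_needed):
    """Pick the oldest job folders until their combined size reaches bytes_needed."""
    def folder_size(folder):
        return sum(f.stat().st_size for f in folder.rglob("*") if f.is_file())

    # Oldest first, judged by each folder's modification time (an assumption).
    candidates = sorted(
        (p for p in Path(working_root).iterdir() if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
    )
    chosen, freed = [], 0
    for folder in candidates:
        if freed >= bytes_needed:
            break
        chosen.append(folder)
        freed += folder_size(folder)
    return chosen


def archive(folders, archive_root):
    """Move each chosen folder onto the archive drive, keeping its name."""
    for folder in folders:
        shutil.move(str(folder), str(Path(archive_root) / folder.name))
```

Run the selection first and read the list before moving anything. The archive drive needs its own backup and redundant backup before the originals leave the working drive's backups.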

Each of these three working drives will receive its own backup, and its own redundant backup. These backups are EXACT clones of the working drives. A lot of people use RAID (which I strongly discourage), Drobo systems or other methods of backup, but my setup is very deliberate. Here is why. I prefer to use exact clones, because if one of my working drives were to die, I can take one of the backups, put it into my system and I am up and running in just a few minutes. I don’t have to copy any data or use special software to convert that data into usable form. I would experience virtually no system downtime. This is important if you are working on a time sensitive assignment and out of nowhere your drive dies. Using RAID setups, Time Machine, DVDs or other setups can slow down the recovery process quite a bit. This method is also simple. Every evening, my system performs an entire backup of each working drive without my having to think about it. There is no special hardware required to make this happen, just a simple $30 application that performs the backup and runs on its own schedule.

I always have an “on-site” backup (one that stays at the office with my computer).  The redundant backups are also exact clones of the working drive.  They are ALL identical.  My on-site backup and off-site backup are simply rotated.  Once a week, or usually after an important assignment, I will take my backup drives home and exchange them for the redundant backups.  The redundant backup becomes the primary backup, and this cycle repeats itself.  Because each backup is an exact clone of the working drive, I don’t have to worry about missing any data that needed to be backed-up.  ALL of my data is always in multiple locations (well, for the most part, until the redundant backups are swapped with the primary backups) in case of a real disaster.  And if a real disaster were to happen at my office, I don’t really have any system downtime: all I need to do is plug in my redundant backups and away I go, never skip a beat.
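
Since the whole rotation depends on each backup really being an exact clone, it is worth checking that now and then. Below is a minimal sketch of one way to do it in Python, comparing SHA-256 checksums of every file on the working drive against the backup. The function names are invented for this example and it is not a feature of any backup application mentioned here.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of one file, read in chunks so large RAW files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_clone(working_root, backup_root):
    """Return (missing, extra, different) file lists for the backup."""
    work, back = manifest(working_root), manifest(backup_root)
    missing = sorted(set(work) - set(back))
    extra = sorted(set(back) - set(work))
    different = sorted(k for k in work.keys() & back.keys() if work[k] != back[k])
    return missing, extra, different
```

Three empty lists mean the backup matches the working drive file for file. Reading every byte of both drives is slow, so this suits an occasional check rather than a nightly one.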

SOFTWARE / HARDWARE

SOFTWARE: To make my backups as seamless as possible, the application I use is called SuperDuper. It is a Mac-only application. It is very inexpensive, simple and easy to use. Other applications such as Carbon Copy Cloner will do the same thing, but I prefer SuperDuper for its speed and simplicity. I have programmed the application to back up my working drives at certain times in the evening. All that I need to worry about is making sure my working drives are powered on, and also my backup drives. SuperDuper takes care of the rest. For Windows, the applications are a bit different (at least from my experience), but before I switched over to Mac, I used a program called Second Copy. It worked similarly to SuperDuper and it did a great job. SuperDuper incorporates a method of backup called “Smart Backup”, which essentially is just an incremental backup. The first time I need to back up a working drive to a backup drive, 750GB of data can take a while. But once the first backup has been made, you really are only adding small amounts of data or changes that later need to be backed up. SuperDuper will only back up new or updated data every day, saving time and strain on your hard drive. If you were to re-image your drive every single night, it would significantly increase the time and also the wear and tear on your hard drive. It is generally not advised to do so, but incremental backups are great.
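
To show what "incremental" means in practice, here is a minimal Python sketch that copies only files that are new or have changed since the last run. It is an illustration of the general idea and not how SuperDuper is implemented: it assumes a file has changed when its size differs or its modification time is newer, the function names are hypothetical, and unlike a true clone it never deletes files from the backup that were removed from the source.

```python
import os
import shutil
from pathlib import Path


def needs_copy(src, dst):
    """True when dst is missing, a different size, or older than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def incremental_backup(source_root, backup_root):
    """Copy new or changed files; return the relative paths that were copied."""
    source_root, backup_root = Path(source_root), Path(backup_root)
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = Path(dirpath).relative_to(source_root)
        (backup_root / rel_dir).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            src = Path(dirpath) / name
            dst = backup_root / rel_dir / name
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also preserves the modification time
                copied.append(str(rel_dir / name))
    return copied
```

The first run copies everything, which is the slow 750GB pass described above. Later runs touch only what changed, which is where the savings in time and drive wear come from.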

HARDWARE: My Mac Pro computer can hold a total of four standard hard drives (if you really want to tweak it you can add more easily, but for simplicity’s sake, four is good enough for me). I have 6 drives in my system, but the four main drives are working drives. Three of them hold my client data files. To make continuous and daily backups of my data, I use an external hard drive enclosure for my backups (and the redundant backups are then put into the same enclosure when swapped with the primary backups). There are hundreds of different external enclosures, all varying in size, quantity of drives, transfer method (i.e. USB, FireWire, Ethernet, eSATA, etc.), price and features. USB is great because just about any computer manufactured in the last 10 years will probably have USB connectivity. USB can be slow though, and copying 1TB of data over it takes hours. FireWire and Ethernet are both faster than USB, but still are nowhere near the potential of eSATA (external SATA, which is the same speed as your internal data transfers). Adding an eSATA card to my Apple computer cost me $50, but I now get to use an external enclosure that can hold eight standard hard drives. Incorporating a Port Multiplier into the eSATA card and external enclosure also saves me cables and clutter.
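
To put rough numbers on the interface comparison, here is a small Python calculation of the best-case time to copy 1TB. The figures are the nominal signalling rates for USB 2.0, FireWire 800, Gigabit Ethernet and 3 Gbit/s eSATA; which generation of each interface applies to a given setup is an assumption, and real transfers are slower because of protocol overhead and the speed of the drive itself.

```python
# Nominal signalling rates in megabits per second (best case, not measured).
NOMINAL_MBIT_PER_S = {
    "USB 2.0": 480,
    "FireWire 800": 800,
    "Gigabit Ethernet": 1000,
    "eSATA (3 Gbit/s)": 3000,
}


def hours_to_copy(size_gb, mbit_per_s, efficiency=1.0):
    """Hours needed to move size_gb gigabytes at the given link rate.

    efficiency is the fraction of the nominal rate actually achieved
    (1.0 means the theoretical ceiling).
    """
    size_megabits = size_gb * 1000 * 8
    return size_megabits / (mbit_per_s * efficiency) / 3600


if __name__ == "__main__":
    for name, rate in NOMINAL_MBIT_PER_S.items():
        print(f"{name}: {hours_to_copy(1000, rate):.1f} hours for 1TB at best")
```

Even at its theoretical ceiling, USB 2.0 needs well over four hours for 1TB, while a 3 Gbit/s eSATA link could do it in under one if the drives kept up. Lower the `efficiency` argument (for example to 0.6) for a more realistic estimate.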

My external enclosure is made by SansDigital. It holds eight hard drives (plenty for most people) and transfers data very fast through an eSATA connection. I believe I spent $350 on the drive enclosure. To me this is a steal of a deal, as single external hard drive cases can run you $80 each (plus a power cable and a USB cable for every one of them). I now have one large case where my hard drives can rest. They are all attached to removable trays, which makes swapping between my primary backups and redundant backups very, very quick. Maybe a little too quick. No more hassle with individual external hard drives. Of course there are benefits to using individual external hard drive cases; I just prefer this setup for my needs.

What about the Drobo? The Drobo is quite a revolutionary product. It takes all the guesswork and hassle out of an effective RAID setup. If you are looking at the Drobo, I’m sure you know what you are getting into. But the main downside I see with the Drobo is that it is still somewhat “proprietary”. The Drobo creates one large disk. It allows for easy expansion when you need to increase your archive’s size, but if I wanted to take my data with me, I need to take the WHOLE thing. The other limitation of the Drobo is access speed. USB isn’t terribly fast (at least compared to internal SATA drive transfers), and the latest generation with FireWire doesn’t really change much on the speed side. Also, what about a redundant backup? To keep it simple, a clone of a Drobo would make the most sense. Since the Drobo is one large drive, a redundant backup that is one large drive would work; except if your Drobo is 5TB in size, obtaining a single drive backup is not possible. You can always get another Drobo and have it be a redundant backup of the primary Drobo, but you can probably imagine that the expense and hassle just add up.

There are always downsides or cons to any given setup. Using my current setup, potential pitfalls are the rare hours when certain data is not backed up. For instance, I photograph a client of mine and immediately load my data on to my working drives. I never erase or reuse my camera’s memory cards until I know those images have been transferred to the backup and redundant backup drives. If by chance I am working on my images from the working drive and doing custom manipulations inside Photoshop, my drive could fail and I could lose those already manipulated images. I still have my backups and generally also my memory cards, so the original is not lost, just the time that I have spent doing custom image manipulations. To me this is not a big deal. One way to remedy this is to use working drives that guarantee system uptime (to an extent). Some people incorporate a RAID5 setup (to learn more about RAID, there is a great Wikipedia article about it), which keeps the system up when a drive fails. Essentially, RAID5 (with a 5 drive array) would allow me to lose one drive and keep working while I replace it and rebuild the array; losing a second drive before the rebuild finishes would take the whole array with it (it is RAID6 that tolerates two failures). If you work in an environment where system access is critical at any given moment, it would be wise to invest in such a system. For me, I’m just a single photographer. If I lose two hours of my time having to recreate the manipulations or other tasks I had been working on, that’s not too big a deal for me. The cost and hassle of maintaining a proper RAID5 setup is not worth it to me. I prefer transparent modularity with my working drives: I don’t need special software (like Time Machine, or some setups of Acronis backup software), RAID cards or tricks to read my data; it can be read by any Mac or Windows machine. Without any real effort, if I need to, my working drives can be removed and installed in a new machine, and nothing on my setup has changed. No data copying or porting to a new system; it’s all there ready to go.

Incorporating RAID systems into your data management can be good, but for the general population, it’s a bad idea. I do not recommend RAID for the faint of heart, mainly because people don’t really know what they are getting into. RAID, Redundant Array of Independent Disks, is primarily used (in the server and advanced computing world) as a means to always have your system running. It allows your data to be dispersed redundantly across a number of hard drives, protecting you from being shut out of your data if a hard drive were to fail. Minor performance benefits are not worth the headache and hassle. If you really know what you are doing you can get a HUGE performance benefit, but for photographers, accessing data at incredibly fast speeds with high bandwidth is NOT our problem. For the video folks, a RAID5 or even RAID0 setup can significantly speed up rendering times, because a large part of that process is bottlenecked by drive speeds.
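
For the curious, the redundancy in a single-parity array such as RAID5 comes down to XOR. The Python sketch below is a simplified illustration, not a real RAID implementation: it treats one "stripe" as a list of equally sized blocks, one per drive, and puts the parity in the last position, whereas a real RAID5 array rotates the parity across all of its drives. The function names are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Data blocks for four drives plus one parity block for the fifth."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block of one failed drive from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def usable_capacity_gb(drive_count, drive_size_gb):
    """Single parity costs one drive's worth of space."""
    return (drive_count - 1) * drive_size_gb
```

Any one missing block can be recomputed from the other four, which is why the array survives a single drive failure. With two blocks missing there is not enough information left, so the stripe (and the array) is lost.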

As a photographer, incorporating important client information into a single large RAID unit is simply keeping your eggs in one basket. You now only have ONE data drive. If your single data drive (now a RAID array) were to fail, ALL of your images are gone. This is where segmenting your data is a good thing. Going back to my three working drives, if one were to fail and I did not properly back up my data (heaven forbid), then I would only lose a portion of my entire collection; not the whole kaboodle. I hope I have made my point here. The more ways your strategy protects you, the better you will be able to sleep at night. If you do decide to use RAID, use multiple arrays; don’t keep your golden eggs in one basket.

A relatively new method and approach to backup is software developed by Apple called Time Machine. Time Machine debuted with OS X 10.5 and came free with the operating system. Time Machine is cool, I will give it that. But at the same time it is not cool. Time Machine makes incremental backups of your data without you having to think. Select your drive, tell it where to back up, then an hour or so later you have a new backup. Here is the problem (or at least to my understanding): a Time Machine backup is not a plain clone of your drive. It keeps its own structure of dated snapshots (and, when backing up over a network, it saves everything into a large disk image, basically a disk inside a disk). If your working drive were to fail, you cannot simply plug in your Time Machine backup and start working from it. It requires you to first “restore” the data, and then you can work on it. This is just one additional step that adds time and inconvenience to your data restoration (which hopefully does not happen often). Time Machine is also Apple specific, so you must have an Apple computer running OS X 10.5 or later to access it. I’m not a fan of proprietary data backup, so this option is a no-go in my book. The cool thing about Time Machine is that it creates multiple backups of the same file, as the file “progresses” in age. For instance, if I copy a photograph titled PICTURE at 10am, it will create a backup of PICTURE at 10am and save it in Time Machine. If I then edit that file and save my changes one hour later, it will also save a copy of PICTURE at 11am to Time Machine. The same file, copied twice. This can be beneficial if you are not careful with your data: if you saved your image in a state that you do not want anymore, you can go back to the original PICTURE file, saved at 10am. The downside is that because it is saving multiple instances of every file you change, your backup can easily double, triple or quadruple in size versus your original working drive’s size. If you have the space for it then it’s usually not a big hassle. If you don’t, you will soon realize, even after working on your files for only a couple of days, that your Time Machine drive is no longer large enough to hold all of those versions. Again, I prefer simplicity for my management.
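
A back-of-the-envelope calculation shows how quickly a versioned backup grows. The Python sketch below assumes, for simplicity, that every changed file is stored again in full and that nothing is pruned. The numbers and function names are illustrative only and do not describe how Time Machine manages its own space.

```python
def versioned_backup_size_gb(initial_gb, changed_gb_per_day, days_kept):
    """Size of a backup that keeps a full copy of every changed file."""
    return initial_gb + changed_gb_per_day * days_kept


def days_until_full(capacity_gb, initial_gb, changed_gb_per_day):
    """Days of edits a backup drive can absorb before it runs out of room."""
    return (capacity_gb - initial_gb) / changed_gb_per_day


if __name__ == "__main__":
    # Hypothetical example: 500GB of images, 20GB of files re-saved per day.
    print(versioned_backup_size_gb(500, 20, 30))  # 1100
    print(days_until_full(750, 500, 20))          # 12.5
```

Under those assumptions a 500GB working set more than doubles after a month of editing, and a 750GB backup drive would fill up in under two weeks.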

In the end, three working drives means three backup drives and three redundant backups. Adding an additional set of redundant backups to be held at a third off-site location will only make the backup solution more complete.
I’ll admit.  My entire archive of client data is not very large and I’m sure I’ll need to make some changes down the road.  But for me, this solution works and it works great.  Not exactly bullet proof, but, neither is any system (unless you pay a lot of money for it).  It is very modular.  It is expandable.  It is easy (almost too easy) to maintain.  I don’t have to think much about it, just be sure I make consistent backups and manage the swaps between my backups and redundant backups.

I welcome any comments or even suggestions about my current system. I’m always trying to help other photographers in any way that I can, and if someone can pass along a system that would work better for me, then I’ll be happy to concede to a better and more practical approach. I have saved my shameless plug for the end.

A big thanks goes to FredMiranda.com.  In my opinion it is the best photography forum available to the public.  There are a lot of extremely talented and smart individuals who consistently contribute free knowledge and advice.  I can honestly say that I have become the photographer and business person that I am today because of this community.  Thanks family!

If you feel that this is a worthwhile read, I’d appreciate you leave a comment below, or use the links below to share on other social websites and forums. I’d prefer comments are left here so that the general population has one source to go to if I may be able to answer questions or add further comments. Thanks!

12 Comments

  1. September 14, 2009 at 11:32 pm ·

    Spencer, great write up! Very informative, in-depth explanation of technical details that many folks don’t have the time to learn on their own. One suggestion / request I have is for you to consider covering a few more bits and pieces that were left out.

    How do you organize the files on your personal / business drives as far as folder hierarchy goes (personal documents, contracts, invoices, RAW / edited jpg / blog-sized photo organization, music, etc.)?
    How long do you keep client images for, and, if longer than the life of your drives, what do you do with them afterward for archiving?
    Where do you designate your scratch disk for Photoshop / Lightroom catalog and cache? How do you organize / maintain / backup your LR catalogs?

    I’ve got a system that works well for me, this is just info that I think a lot of people could benefit from on top of this well-versed article. Also being a hands on do-it-yourselfer, I’m also curious to hear about other folks’ workflows and systems to find out if there’s a better or more efficient way to do something =). Thanks for the write up!

  2. Spencer
    September 14, 2009 at 11:52 pm ·

    Danny,

    Very valid points made! I’ll have to keep track of all the additional comments / questions that people have and respond to those in a future article. I think the PS scratch and also LR catalog are very good points to cover, things that most people are clueless about but that can make a pretty big difference.

  3. September 15, 2009 at 12:49 am ·

    I am but a mere mortal in the face of you, Spencer, in understanding all this stuff. Thanks for the dope write-up my friend. I will refer all that ask about storage here. I’ve copied this exact set-up.

    I want Spencer’s babies! I’m sure a firm hand shake will do however.

  4. September 15, 2009 at 2:45 am ·

    Spencer,

    This is way too complicated. I’m giving
    up photography forever.

    :) But you sure seem to know your stuff!

  5. September 15, 2009 at 7:45 am ·

    Spencer,
    Thank you for such a valuable write up. I am fairly new to the complexities of backing up as a photographer. It’s a very intense process because of the overwhelming amount of space that raw files, tiffs and full size jpegs take. I use my iMac hard drive as my operating and applications hard drive, I have a Drobo filled with 4 1TB drives that I use as a working drive (all images). I have a single 1tb drive that I run Time Machine on for my applications/system drive. I have another 1tb drive that I backup image files to from the Drobo. Once that is full, I move on to another 1tb drive as a backup.

    I know this system won’t work forever. I am really interested in the Sans Digital setup you mention.

    I have already bookmarked this and am sure I will return to your article many times.

  6. September 15, 2009 at 9:57 am ·

    Spencer, great post, very informative! I’d like to mention three things:

    1. On a Windows box the probably best software to do what you are doing is SyncbackSE. I’ve been using it for a few years and it’s great. It also supports versioning which you mention as one of the benefits of Time Machine

    2. I either didn’t read correctly or I see a hole in your strategy that would be RIDICULOUSLY EASY (and very inexpensive) to close up. Is it true that off-site backups only really happen about 1x a week when you swap one offsite for another and take it home? If so, wait for a sale on a Western Digital Passport drive (I got a 320gb for about $60 after rebates) and then set up your sync software to sync a few predetermined folders to the WDP … I keep all the recent work in a folder called NEW (subdirectories for the different clients inside) which gets sync’d to the WDP… then I hook the WDP to my offsite PC every day and the sync software makes a copy onto the offsite PC. This way new work is on the working drive, on my WDP (which is pretty much always on me) and on the offsite PC.

    3. You don’t mention how you keep older stuff safe. If you rotate backups every week, presumably overwriting the two week old backup with the latest and then wash-rinse-repeat in a week, you run the VERY REAL RISK of data corruption making it into your backups without realizing it. A file or folder you’re done working with goes bad on your Mac … you then back up and the old (good) version gets overwritten… once two weeks have gone by it’s bye bye data. I was dealing with this issue by burning onsite and offsite DVDs in addition to the backup strategy. Now I think I’m going to stop that and rely instead on Zenfolio’s data warehousing for that purpose.

  7. Pingback: Raw Workflow with Photo Mechanic, Photoshop, and Lightroom » Daniel Valente Photography | Omaha NE Wedding Photography

  8. August 8, 2011 at 2:00 pm ·

    great article.. lots of useful information. Thanks!

  9. August 9, 2011 at 2:38 pm ·

    Holy crap Spencer. Thanks for the post!

  10. August 21, 2011 at 11:10 am ·

    Hey Spencer,

    Great info, thanks for it. I just have one quick question. I do this pretty much the same way you do, except I use the tower of my Mac Pro for my HDs instead of an external enclosure. Obviously having the external enclosure would be much more convenient. But by using the enclosure, even though it says RAID, it doesn’t necessarily mean you have to be running a RAID setup. Or maybe that’s already what I am doing with SuperDuper. Basically, can I just use the enclosure the same way I use my Mac Pro tower? Thanks man!

    -Chris

  11. April 17, 2012 at 10:38 pm ·

    Hey Spencer, nice write up! Do you still use this same system or has time created a new data management setup?

  12. Pingback: Links van 30/10/2012 tot 04/11/2012 | Nunc est Bibendum!
