FORUMS General Gear Talk Data Storage, Memory Cards & Backup 
Thread started 10 Apr 2018 (Tuesday) 04:43

★ Best software to check data integrity of backups

 
the.forumer
Senior Member
412 posts
Likes: 1
Joined Oct 2011
     
Apr 10, 2018 04:43 |  #1

I have over 500,000 pictures/videos backed up on several drives. What is the best software that I can run to help me check whether none of them is corrupt on a quarterly basis?

Thanks!




  
John from PA
Cream of the Crop
8,062 posts
Likes: 502
Joined May 2003
Location: Southeast Pennsylvania
Post edited 4 months ago by John from PA.
     
Apr 10, 2018 06:34 |  #2

If you mean a software package that can run on a folder and report back damaged files, then that's a tall order and I'm not sure that such software exists. And one has to ask just how often a file becomes corrupted, and what your overall backup strategy is. Is this just a concern, or are you actually experiencing a need for such software?

Having said that, there are some good products in the marketplace that are more or less dedicated to image-type files. Unfortunately they usually can only handle JPEGs, not RAW. One that comes to mind is Hetman File Repair (see https://hetmanrecovery.com/file_repair/software-2.htm).

I have to say, in about 15 years of doing digital stuff, I don't think I've ever encountered a corrupt file. But I'm also not a pro so if I did I might not even know it (or care). But I do maintain backups.




  
Left Handed Brisket
That's my line!
8,578 posts
Gallery: 10 photos
Likes: 1747
Joined Jun 2011
Location: The Uwharrie Mts, NC
     
Apr 10, 2018 07:44 |  #3

I'm with John.

When copying files, the software can check that the copy is identical to the original. This is the best point (possibly the only reasonable point?) to check data integrity.

If a file is just sitting there on a regular (non-SSD) hard drive, the greatest risk by far is mechanical failure. I've been working with digital media since the early 1990s and I'm struggling to remember a time when I had files go bad that wasn't tied to a hard drive failing. This is the reason people keep 2-3 copies of important files.

I'm actually dealing with copying data off a failing HD now. Carbon Copy Cloner tells me which files are corrupt at the end of the copy process. Thankfully this drive is pretty much filled with junk ... no work files.


PSA: The above post may contain sarcasm, reply at your own risk | Not in gear database: Auto Sears 50mm 2.0 / 3x CL-360, Nikon SB-28, SunPak auto 322 D, Minolta 20

  
Tom Reichner
"I am a little creepy"
12,154 posts
Gallery: 140 photos
Best ofs: 1
Likes: 2923
Joined Dec 2008
Location: Omak, in north-central Washington state, USA
     
Apr 10, 2018 07:47 |  #4

the.forumer wrote in post #18603723
I have over 500,000 pictures/videos backed up on several drives. What is the best software that I can run to help me check whether none of them is corrupt on a quarterly basis?

.
I have never heard of such a thing.

Are you sure that what you are looking for actually exists?


.


"Your" and "you're" are different words with completely different meanings - please use the correct one.
"They're", "their", and "there" are different words with completely different meanings - please use the correct one.
"Fare" and "fair" are different words with completely different meanings - please use the correct one. The proper expression is "moot point", NOT "mute point".

  
mike_d
Cream of the Crop
5,141 posts
Likes: 402
Joined Aug 2009
     
Apr 10, 2018 14:57 |  #5

If you're dealing with an actual backup, not just drives full of files, then your backup program should be able to perform a verification. I use Cloudberry backup and it has such an option.




  
Bcaps
I was a little buzzed when I took this
782 posts
Gallery: 64 photos
Best ofs: 16
Likes: 1589
Joined Jun 2003
Location: Bay Area, CA
Post edited 4 months ago by Bcaps. (4 edits in all)
     
Apr 10, 2018 16:14 |  #6

If I understand what you are asking for, you can do this by comparing the checksums of the local and the remote file. If the remote file differs from the local file by even one byte, the file will be flagged. There are various checksum algorithms; one popular example is MD5. A Google search for MD5 software programs turns up some hits that may work for you.

Most good backup software will have an option to perform this check during the backup process (or afterwards, if you didn't do it when you first backed up the data). If your backups exist in the same state as the original files (i.e., not in a proprietary file type or encrypted), one that will likely work on the local and remote data that you have now is GoodSync. You tell it what folders to compare on the local and remote drives, and in the options there is a setting to "Compare checksums of all files (slow)". It will then list any files that have different checksums, which would suggest a corrupt file. They have a free evaluation period so you can check it out first.
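
For anyone curious what a checksum comparison amounts to, here is a minimal Python sketch. It is not how GoodSync works internally; the helper names `file_md5` and `compare_trees` are invented for illustration, and it assumes the backup is a plain mirror of the original folder (same relative paths, no compression or encryption).

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to keep memory low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original_root, backup_root):
    """Compare each file under original_root with the same relative path in backup_root.

    Returns (missing, different): relative paths absent from the backup, and
    relative paths whose checksums do not match.
    """
    original_root, backup_root = Path(original_root), Path(backup_root)
    missing, different = [], []
    for source in sorted(original_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_root)
        copy = backup_root / relative
        if not copy.is_file():
            missing.append(relative.as_posix())
        elif file_md5(source) != file_md5(copy):
            different.append(relative.as_posix())
    return missing, different
```

It would be called as something like `compare_trees("D:/Photos", "E:/Backup/Photos")` (hypothetical paths). Note that a mismatch only says the two copies differ; it does not say which of the two is the damaged one.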


- Dave | flickr (external link)
Nikon D810
14-24mm f/2.8 | 16-35mm F/4 | 24-70mm f/2.8 | 70-200mm f/4 | Sigma 150-600mm

  
John from PA
Cream of the Crop
8,062 posts
Likes: 502
Joined May 2003
Location: Southeast Pennsylvania
     
Apr 10, 2018 16:32 |  #7

Bcaps wrote in post #18604122
If I understand what you are asking for you can do this by comparing the checksum of the local and the remote file. If the remote file differs from the local file by even one byte the file will be flagged. There are various checksum algorithms. One popular example is MD5.

JPEG images are inherently difficult for a checksum comparison. Explore the process at https://www.controlledvocabulary.com/imagedatabases/de-dupe.html to see what is involved in checking for duplicates.




  
the.forumer
THREAD STARTER
Senior Member
412 posts
Likes: 1
Joined Oct 2011
     
Apr 10, 2018 17:44 as a reply to  @ Bcaps's post |  #8

I currently manage backups manually because there are so many drives and so many permutations (e.g. my 1TB 3.5" stores pics from 2007-2010, my 4TB 2.5" stores videos from 2009-2017 etc). Is there any backup software that is flexible enough to let me program to the fullest extent for automatic backups? (talking about offline backups in this scenario)

Separately, I have personally encountered corrupt image files before, although that was right after I transferred files from my A7Rii to my PC, and not after X years of storage (I rarely access my backups unless I need to), hence the question about data integrity checks.




  
CyberDyneSystems
Admin (type T-2000)
48,453 posts
Gallery: 84 photos
Likes: 4548
Joined Apr 2003
Location: Rhode Island USA
Post edited 4 months ago by CyberDyneSystems.
     
Apr 10, 2018 19:00 |  #9

So there are a few approaches here.

- Use the verification tool included in your backup software of choice. (For the enlightened, this means "/v".)
Most backup software will have this as an option. Strangely, most do not use it by default, favoring speed over integrity (which seems counterintuitive for a backup app).
There are different methods used by differing apps.

- Use a file/directory comparison tool after the fact. I've never done it this way, but the tools do exist, either standalone or as a feature of your backup app.

e.g.:
https://backbox.com …ions/backup-verification/
A wiki study:
https://en.wikipedia.org …_of_file_comparison_tools
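
The second approach above (a file/directory comparison after the fact) can also be sketched in a few lines of Python using the standard library's `filecmp` module. This is only an illustration and not one of the tools linked above; the function name `find_mismatches` is invented here. It passes `shallow=False` so that file contents are actually read and compared instead of trusting size and timestamp.

```python
import filecmp
from pathlib import Path


def find_mismatches(original_root, backup_root):
    """List relative paths that differ byte for byte, and paths that could not be compared."""
    original_root, backup_root = Path(original_root), Path(backup_root)
    # Collect relative paths of all regular files in the original tree.
    names = [
        p.relative_to(original_root).as_posix()
        for p in original_root.rglob("*")
        if p.is_file()
    ]
    # shallow=False makes cmpfiles compare contents, not just the
    # os.stat() signature (file type, size, modification time).
    match, mismatch, errors = filecmp.cmpfiles(
        original_root, backup_root, names, shallow=False
    )
    # "errors" holds files that could not be compared, e.g. missing from the backup.
    return sorted(mismatch), sorted(errors)
```

Compared with hashing, this needs both drives connected at the same time, but it does not require keeping a list of checksums anywhere.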


GEAR LIST
CDS' HOT LINKS
Jake Hegnauer Photography (external link)

  
Moonshiner
Senior Member
527 posts
Likes: 482
Joined Jul 2013
Location: Mil-yucky, Whiskonsin
Post edited 4 months ago by Moonshiner.
     
Apr 10, 2018 19:18 |  #10

You could probably just use PowerShell to do it if you have any coding skills... For each file, create a hash and store it in another file or database. Then, quarterly, run the same cmdlet again and compare the hashes. If the file is corrupted, it SHOULD present a different hash, especially when you read the file into a binary stream and then calculate it. Pretty simple to do and probably pretty quick. I can compare the hashes of around 15K objects in about 30 seconds when doing it in parallel.
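
Here is the same store-then-recheck idea sketched in Python rather than PowerShell (in PowerShell the cmdlet in question would presumably be `Get-FileHash`). Everything specific is an assumption made for illustration: the JSON manifest format, the choice of SHA-256, and the function names. It runs sequentially, not in parallel.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file's raw bytes in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's relative path to its current hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest, manifest_path):
    """Write the path-to-hash mapping to a JSON file."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def verify(root, manifest_path):
    """Re-hash everything under root and report differences from the saved manifest."""
    saved = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    current = build_manifest(root)
    changed = sorted(k for k in saved if k in current and saved[k] != current[k])
    missing = sorted(k for k in saved if k not in current)
    new = sorted(k for k in current if k not in saved)
    return {"changed": changed, "missing": missing, "new": new}
```

The manifest should be kept outside the folder being hashed (otherwise it shows up as a "new" file on the next run), and ideally on a different drive. A file listed under "changed" that you never edited is the candidate for corruption.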

Good luck with your search.




  
tim
Light Bringer
50,923 posts
Likes: 336
Joined Nov 2004
Location: Wellington, New Zealand
     
Apr 12, 2018 02:55 |  #11

What you have is copies of your files, not backups. Read my article on backups. I'll wait here until you're done.

.
.
.
.
.
.

Ok, now you know about incremental backups, the 3-2-1 rule, and the need for backup software, cloud vs local, etc. I suggest you choose backup / archive software and get everything into a good system. Have your backup software run integrity checks say every three months. If anything fails you have at least two other copies of any file.

I keep my data in my house on hard disk, nearby on hard disk, at work in my drawer on hard disk, archived in Amazon Glacier (medium res jpg / 720p video, not full res), and recent files go into a version controlled S3 bucket until they're in my standard backup system. Usually I leave them on S3; it costs little at my current volume of data. What I like about AWS S3 / Glacier is the durability, which AWS quotes as 99.999999999% (eleven nines), as the data is stored in three different data centers and they do regular integrity checks. Glacier isn't exactly cheap: 10TB of data would cost US$40 per month, which is why I store smaller images. 120 weddings plus 15 years of family photos costs me about $1 a month to keep in Glacier.
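
With three copies, a failed check also tells you which copy to distrust: the one that disagrees with the other two. A rough Python sketch of that vote follows; the function names are invented for illustration, and this is not a feature of any particular backup program.

```python
import hashlib
from collections import Counter


def sha256_of(path):
    """Hash a file's raw bytes in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def odd_copies_out(copy_paths):
    """Given paths to copies of the same file, return those that disagree with the majority.

    Returns an empty list when all copies match, and None when no hash has a
    strict majority (so no copy can be trusted automatically).
    """
    hashes = {path: sha256_of(path) for path in copy_paths}
    winner, votes = Counter(hashes.values()).most_common(1)[0]
    if votes * 2 <= len(copy_paths):
        return None
    return [path for path, value in hashes.items() if value != winner]
```

With only two copies the function can report a disagreement (it returns None) but cannot pick a winner, which is one practical argument for keeping a third copy.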


Professional wedding photographer, solution architect and general technical guy with multiple Amazon Web Services certifications.
Read all my FAQs (wedding, printing, lighting, books, etc)

  
John from PA
Cream of the Crop
8,062 posts
Likes: 502
Joined May 2003
Location: Southeast Pennsylvania
Post edited 4 months ago by John from PA.
     
Apr 12, 2018 04:19 |  #12

tim wrote in post #18605114
What you have is copies of your files, not backups.


If anything fails you have at least two other copies of any file.

Copies or backups, what's the difference?

If I use a copy >>> paste operation to another "destination" is it a copy?

If I use dedicated backup software, is it a backup?

Personally I use a batch file to accomplish it all, doing full and incremental backups and retaining the full file name to make life easier when I need something, which hasn't occurred in 15 years of digital. But then again I don't have 1/2 million images!




  
mike_d
Cream of the Crop
5,141 posts
Likes: 402
Joined Aug 2009
     
Apr 12, 2018 10:32 |  #13

John from PA wrote in post #18605146
Copies or backups, what's the difference?

If I use a copy >>> paste operation to another "destination" is it a copy?

If I use dedicated backup software, is it a backup?

Personally I use a batch file to accomplish it all, doing full and incremental backups and retaining the full file name to make life easier when I need something, which hasn't occurred in 15 years of digital. But then again I don't have 1/2 million images!

True backups give you a version history. What happens if you do a backup, then discover that the one you needed was a prior version of a file?




  
Nethawked
Senior Member
791 posts
Gallery: 24 photos
Best ofs: 1
Likes: 236
Joined Oct 2014
Location: Virginia, USA
     
Apr 12, 2018 16:38 |  #14

As others have stated, integrity checks in the purest form are difficult. Look for backup solutions that compare hash values, which in my experience have never failed.

I have several complex backup scenarios both from and to multiple devices. I started using SyncBackSE last year and it easily accomplishes everything I need.




  
Archibald
You must be quackers!
6,680 posts
Gallery: 235 photos
Best ofs: 1
Likes: 7754
Joined May 2008
Location: Calgary
     
Apr 12, 2018 17:11 |  #15

The issue here IMO is not so much that a backup will suffer corruption, but that your working files will. Errors happen over time. There are software crashes, power failures, MS can commandeer your box for a forced update, and so on, not even talking about user error or viruses. These things can corrupt files. Some of those corrupt files look fine to the operating system, but the content is bad. You won't know the file is corrupt unless you open the file. And you are not going to open 500,000 files, so you won't know if an old file has gone bad.

Then when you do your backup (or copy), the bad file is archived. Over time and after consecutive backups, you will eventually lose all copies of the good file.

It happened to me... opened two old *.JPG files and the images were scrambled. My backups all had the corrupted file too. In this case it didn't matter because I still had the CR2 files, but it is scary anyway to see that files can go like this, unnoticed. This was on my Win 10 box.

Another time I was backing up the contents of my wife's Android tablet. I copied the files over to removable media (a flash drive). About 1/3 of the files were corrupted. The only way I knew was by opening them. No warnings or error messages. JPGs, PDFs and other file types were involved. There was nothing wrong with the files on the tablet, and copying them over multiple times eventually gave me good copies of most. So the OS may not tell you that there is corruption going on.

You might have corrupt files on your system and not know it.

After my experience with the corrupt JPGs, I was worried that my HD was full of corrupt files. I wasn't able to find any software to automatically check the integrity of all my files. But there was one that could check JPGs. I ran it and it only found the files already known to be bad. From that I concluded that there probably weren't many bad non-JPG files in my system either.
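
A very rough version of such a JPG check can be written in Python with only the standard library. This is not the tool mentioned above (the post does not name it), and it is far weaker than actually decoding the image: it only checks that a file starts with the JPEG start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). That catches truncated or zero-filled files but not scrambled image data. Some cameras and editors append data after the end marker, so a failure here is a reason to open the file, not proof of damage. The function names are invented for illustration.

```python
from pathlib import Path

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def looks_like_complete_jpeg(path):
    """Cheap structural check: correct first two bytes and last two bytes."""
    if Path(path).stat().st_size < 4:
        return False
    with open(path, "rb") as handle:
        head = handle.read(2)
        handle.seek(-2, 2)  # move to two bytes before the end of the file
        tail = handle.read(2)
    return head == SOI and tail == EOI


def suspicious_jpegs(root):
    """Return relative paths of .jpg/.jpeg files under root that fail the check."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in {".jpg", ".jpeg"}
        and not looks_like_complete_jpeg(p)
    )
```

A stricter check would fully decode each image with an imaging library, at the cost of a much longer run over 500,000 files.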


Pentax Spotmatic F with 28/3.5, 50/1.4, 50/1.8, 135/3.5; Canon digital gear
C&C always welcome.
Picture editing OK
Donate to POTN here

  