• Verifying Integrity of Copied Files


    #2595511

    I am using the copy-and-paste function in File Explorer to ‘copy’ some large backup files (9GB+ in size) to another drive, but I want to ensure the integrity of the final copied files and am currently at a bit of a loss as to how to go about this. (A simple single file is one thing, but these backups contain multiple folders and files.)

    Any suggestions would be much appreciated – thank you.

    (Windows 10 Pro)

    Viewing 15 reply threads
    • #2595558

      First, use robocopy, as it does a better job. Barring that, you can have the system do a SHA-1 value comparison.

      https://www.nirsoft.net/utils/hash_my_files.html

      Note: this URL may trigger warnings in some browsers, but it is a safe site.
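      For anyone who prefers a script to a GUI tool, that kind of hash comparison can be sketched with Python's standard hashlib module. This is just an illustration (not HashMyFiles), and the function names here are made up for the example:

      ```python
      import hashlib

      def file_hash(path, algorithm="sha1", chunk_size=1 << 20):
          # Read in 1 MB chunks so a 9GB+ file never has to fit in memory.
          h = hashlib.new(algorithm)
          with open(path, "rb") as f:
              for chunk in iter(lambda: f.read(chunk_size), b""):
                  h.update(chunk)
          return h.hexdigest()

      def same_contents(path_a, path_b):
          # Files with equal digests are, for all practical purposes, identical.
          return file_hash(path_a) == file_hash(path_b)
      ```

      Run it on the source and destination copies; if the two digests match, the copy is good.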

      Susan Bradley Patch Lady/Prudent patcher

    • #2595610

      Hashing a 9GB file is not fast, but if the data is valuable….
      Don’t forget to regularly check that the file on disk is readable (i.e. that the disk is OK).
      https://www.askwoody.com/forums/topic/from-the-lounge-simple-and-cheap-data-backup-and-storage/#post-2318643

      cheers, Paul

    • #2595705

      I periodically use a piece of software called Hashdeep to check for any data degradation. Unfortunately, the software is no longer being updated, but it does still work on Windows 10 (I’ve not tried it on Windows 11). Hashdeep can create and audit checksums of every file in a directory you specify. It’s simple, but it does the job of telling you when a file has been corrupted so you can replace it with an uncorrupted copy.
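      The create step of that create-and-audit workflow can be sketched in a few lines of Python. This is only an illustration of the idea, not Hashdeep itself, and `build_manifest` is a made-up name:

      ```python
      import hashlib
      import os

      def build_manifest(root):
          # Walk the tree and record a SHA-256 digest for every file,
          # keyed by path relative to the root (much like a Hashdeep audit file).
          manifest = {}
          for dirpath, _dirnames, filenames in os.walk(root):
              for name in filenames:
                  full = os.path.join(dirpath, name)
                  h = hashlib.sha256()
                  with open(full, "rb") as f:
                      # Hash in 1 MB chunks to keep memory use flat.
                      for chunk in iter(lambda: f.read(1 << 20), b""):
                          h.update(chunk)
                  manifest[os.path.relpath(full, root)] = h.hexdigest()
          return manifest
      ```

      Save the manifest somewhere safe while the files are known-good; a later run over the same tree can then be compared against it.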

      • #2595823

        If you copy the file to an HDD/SSD and verify it you don’t need to verify it again. The disk will do that job for you – see the link I posted above.
        If you copy to USB stick, all bets are off.

        cheers, Paul

        • #2595934

          Unfortunately, simply reading files and/or monitoring SMART will not alert you to the loss of magnetic orientation in HDDs or the loss of electrical charge in SSDs, which is why checksums are useful. Files can be in a readable state but still have data loss, as these are silent failures.

          • #2596083

            Sky, read the newsletter article in the post linked above to find out why this is not true.

            cheers, Paul

            • #2596211

              That’s interesting, thanks for the information. I was relying on things I’d learnt 20 years ago, and it seems that technology has moved on since then!

              Having done some further research on the subject after reading the article, it does seem that, while Error Correction Codes are helpful, they’re not entirely foolproof: you can still get silent failures and checksum mismatches for various data-path and write-related reasons. I just read a paper describing a 41-month study of 1.5 million disks, which had a combined 400,000 checksum failures despite the use of Error Correction Codes, so it does seem that checksums are still helpful if you want to be sure of data integrity.

            • #2596292

              Care to share the paper / link?

              cheers, Paul

            • #2596455

              Sure, here you go:

              https://www.cs.toronto.edu/~bianca/papers/fast08.pdf

              (In case the link dies, the paper is ‘An Analysis of Data Corruption in the Storage Stack’)

            • #2596566

              The paper suggests that the most serious issue is “latent sector errors” (9.5% of disks), where the internal ECC fails and the error is reported to the OS, and that once errors appear they multiply. This is consistent with the premise that modern disks are very reliable until they start to fail.
              A backup is the cheapest protection, and regular data reads (of both the original and backup) are the easiest test for problems.
              Regularly reading SMART data is also an essential tool for detecting early signs of impending failure.

              For the other errors reported you need RAID and a better file system (XFS etc) to limit problems, so not for home users, except the seriously paranoid.  🙂

              cheers, Paul

            • #2596714

              The paper does mention latent sector errors in subsection 4.6 as a point of comparison, but that is not what the paper is about, and they are not included in the statistics of silent failure checksum mismatches mentioned above. As the paper says, “we do not describe techniques used [to detect] other errors, such as … latent sector errors.” These are, therefore, two different types of error, and I believe that both should be accounted for. Regular data reads and SMART data don’t catch checksum mismatches, as they’re not necessarily hardware failures.

              One could use RAID to overcome the silent failures, yes, as under the surface RAID controllers can use checksums to verify file integrity; but, as you say, this isn’t practical for most people.
              Running a simple program or script to audit checksums every so often is definitely practical for most people, however, and also solves the problem.

              The paper puts it better than I could:
              “Albeit not as common as latent sector errors, data corruption does happen; we observed more than 400,000 cases of checksum mismatches. For some drive models as many as 4% of drives develop checksum mismatches during the 17 months examined. Similarly, even though they are rare, identity discrepancies and parity inconsistencies do occur. Protection offered by checksums and block identity information is therefore well-worth the extra space needed to store them.”
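              To make the “simple program or script to audit checksums” idea concrete, here is a minimal, illustrative Python sketch (standard library only; `_sha256` and `audit_manifest` are made-up names, and it assumes you saved a manifest mapping relative paths to SHA-256 digests while the files were known-good):

              ```python
              import hashlib
              import os

              def _sha256(path):
                  # Chunked hashing so large backup files don't exhaust memory.
                  h = hashlib.sha256()
                  with open(path, "rb") as f:
                      for chunk in iter(lambda: f.read(1 << 20), b""):
                          h.update(chunk)
                  return h.hexdigest()

              def audit_manifest(root, manifest):
                  # Re-hash each file recorded in the manifest (a dict of
                  # relative path -> hex digest) and report silent corruption.
                  mismatched, missing = [], []
                  for relpath, expected in manifest.items():
                      full = os.path.join(root, relpath)
                      if not os.path.exists(full):
                          missing.append(relpath)
                      elif _sha256(full) != expected:
                          mismatched.append(relpath)
                  return mismatched, missing
              ```

              Any path in the mismatched list is readable but no longer matches its recorded digest, which is exactly the silent-failure case the paper describes.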

    • #2595711

      I use TeraCopy Pro (fast file copying; it can copy multiple files to multiple destinations).
      Both the free and paid versions have a verify-after-copy option:

      “TeraCopy verifies file integrity with support for 17 checksum algorithms (over 50 variations), including CRC32, MD5, SHA1, BLAKE3, xxHash3, and more.

      It can verify that destination files are identical to the source files and also generate or validate checksum files.”

      https://www.codesector.com/teracopy

    • #2595899

      There are a number of third-party file copy utilities available that can verify the integrity of copied files. One popular option is FreeFileSync.
      TeraCopy and GS RichCopy 360 are also worth considering.

    • #2595938

      Thank you to everyone for your replies – I’ll look into some of the different options available over the weekend.

      One update on TeraCopy, though: I did pick up on this software while searching for a suitable option and tried it; however, for some reason it always resulted in corruption at the destination. Initially I tried one of the large backup files, and then an individual data file from my system – both returned a message that the files could not be accessed due to corruption. One odd thing I noticed with the 9GB backup: it took only seconds to copy across, and then the TeraCopy window just disappeared with no confirmation. (On checking, the full 9GB was at its destination.)

      Fearing the worst, I first checked that I could access the data on my original backup – all okay.  I then copied the backup to the destination drive (the long way) using File Explorer and was able to access all the information just fine.

    • #2595946

      One update on Teracopy though

      I’ve used TeraCopy for years and have copied probably hundreds of terabytes of data.

      Recently I backed up 8TB.

      Never failed.

    • #2596085

      TeraCopy uses a fast copy method that can be flaky (anecdotal evidence). If it doesn’t work for you, don’t use it.

      Robocopy (built into Windows) is my favourite bulk copy software, and you can get GUI front ends for it.
      https://www.codeitbro.com/best-robocopy-gui-for-windows/

      cheers, Paul

      • #2596217

        Thanks for the link Paul T.

        I was looking over the command-line help and was getting a bit bewildered – glad to see there are GUIs that will make things much simpler!

    • #2596215

      One update on Teracopy though

      I’ve used TeraCopy for years and have copied probably hundreds of terabytes of data.

      Recently I backed up 8TB.

      Never failed.

      Thanks for the response – I did initially have high hopes of being able to add TeraCopy to my arsenal of software.

      For my own curiosity (and based on your own experience) I have a couple of questions:

      1. Correct me if I’m wrong, but a 9GB+ file would not copy in mere seconds?
      2. Using the interface, once the copying has completed, does it just close the window without any confirmation of what it has (or maybe has not) done?

      Look forward to your further response and thanks in advance.

    • #2596289

      Correct me if I’m wrong, but a 9GB+ file would not copy in mere seconds?
      Using the interface, once the copying has completed, does it just close the window without any confirmation of what it has (or maybe has not) done?

      The copy speed depends on your network and the type of storage destination. I copy GBs of data every day on a 1Gbit network. Copying 9GB to a USB 3.0 HDD or network drive would take ~2-3 min.

      (When I copied 400GB to a Samsung Portable T7 SSD it took ~20min)

      I never checked for a confirmation notice (maybe there is one) as it has never failed on me. The app closes at the end of the process unless there were errors, which are marked and notified.

    • #2596290

      Copying 9GB should take ~2min

      On Alex’s very fast system.
      Us mere mortals may have to wait a bit longer.  🙂

      cheers, Paul

    • #2596300

      Copying 9GB should take ~2min

      On Alex’s very fast system.
      Us mere mortals may have to wait a bit longer.  🙂

      cheers, Paul

      Copying 10.9GB from one folder to another on the same drive takes 1 min (USB-C Samsung 4TB portable SSD).

      Copying to an external drive like the Samsung T7 is ~3x faster (300MB/s+).

    • #2596418

      If you’re interested in checking hash codes on individual files or comparing a known hash code to another file, you can download my Check-FileHash.zip.

      Load the PS1 file into Notepad or equivalent and read the help file on how to have the program self install a shortcut into the right click menu of File Explorer. From there it’s easy to check files and generate hash codes.

      If you’re familiar with PowerShell you can also use the Get-Help [d:\path]\Check-FileHash.ps1 command since the comments are actually Comment Based Help.

      May the Forces of good computing be with you!

      RG

      PowerShell & VBA Rule!
      Computer Specs

    • #2596993

      Me, once again 😉

      Thanks to everyone for their input so far – much appreciated (although some of it went right over my head!)

      Just wondering if anyone has any views on FastCopy (https://fastcopy.jp/), which seems simple and ticks all the boxes for what I am after?

      Summary of my key requirements:

      1. Copy ‘everything as is’ from one external HDD to another – specifically Acronis True Image 2019 backups.
      2. Verify the integrity of the copied destination file.
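      For illustration only (this is not how FastCopy or any of the tools above work internally), those two requirements – copy a whole tree, then verify the destination – can be sketched with Python's standard library. `copy_tree_verified` is a made-up name for the example:

      ```python
      import hashlib
      import os
      import shutil

      def _sha256(path):
          # Chunked hashing so multi-GB backup files stay memory-friendly.
          h = hashlib.sha256()
          with open(path, "rb") as f:
              for chunk in iter(lambda: f.read(1 << 20), b""):
                  h.update(chunk)
          return h.hexdigest()

      def copy_tree_verified(src_root, dst_root):
          # Copy the whole tree, then re-read every destination file and
          # confirm its digest matches the source. Returns the paths that
          # failed verification; an empty list means the copy is good.
          shutil.copytree(src_root, dst_root)
          failures = []
          for dirpath, _dirnames, filenames in os.walk(src_root):
              for name in filenames:
                  src = os.path.join(dirpath, name)
                  dst = os.path.join(dst_root, os.path.relpath(src, src_root))
                  if _sha256(src) != _sha256(dst):
                      failures.append(dst)
          return failures
      ```

      Because the verify step re-reads the destination from disk, it catches a copy that arrived damaged rather than just trusting that the write succeeded.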
    • #2597079

      If you’re interested in checking hash codes on individual files or comparing a known hash code to another file, you can download my Check-FileHash.zip.

      Not seeing that, RG.
      You have a lot of good stuff; maybe some of it would benefit from a readme file for the directory.

      🍻

      Just because you don't know where you are going doesn't mean any road will get you there.
      • #2597087

        Wavy,

        It got deleted somehow????

        I’ve uploaded it again. Let me know if it works now.

        The File Hash Log.pdf has a list of all the .zip files and their MD5 file hashes.

        I’ve tried to make the names as self-explanatory as I can. Maybe I’ll get around to a file with explanations soon.

        May the Forces of good computing be with you!

        RG

        PowerShell & VBA Rule!
        Computer Specs

    • #2597133

      Running a simple program or script to audit checksums every so often is definitely practical for most people, however, and also solves the problem

      Only if you have a second copy of the data and you have run the check before you need the data to restore. Given that level of paranoia, I’d spend the money on a NAS with automated scrubbing so the data recovery is self-contained.

      The alternative is multiple backup sets. If you get an error on one you copy the other over the top – after verifying it is OK.

      cheers, Paul
