News, tips, advice, support for Windows, Office, PCs & more. Tech help. No bull. We're community supported by donations from our Plus Members, and proud of it
  • Compare Contents of Two Documents

    Posted by Nathan Parker


    Viewing 13 reply threads
    • Author
      Posts
      • #2299232 Reply
        Nathan Parker
        AskWoody_MVP

        I have two long lists of book titles I need to compare the differences between and see if one list has all of the titles from the second list or is missing any titles.

        In terms of word processors, I have Word, BBEdit, TextEdit, Nota Bene, Google Docs, Mellel, Nisus Writer Pro, and Pages.

        Is there a way to accomplish this, or would I have to simply comb through the lists manually?

        Thanks!

        Nathan Parker

      • #2299244 Reply
        access-mdb
        AskWoody MVP

        Personally I would use Access to import the list as two tables and then use the query wizard to list what’s in one and not in the other. I suspect that as you’re on a Mac this isn’t available, but do you have a relational database app?
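For anyone without Access, the "in one table and not in the other" query can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names `a` and `b`, the function name `unmatched`, and the sample titles are made up, and matching is exact (same spelling and case).

```python
import sqlite3

def unmatched(list_a, list_b):
    """Return titles in list_a that have no exact match in list_b."""
    con = sqlite3.connect(":memory:")  # throwaway in-memory database
    con.execute("CREATE TABLE a (title TEXT)")
    con.execute("CREATE TABLE b (title TEXT)")
    con.executemany("INSERT INTO a VALUES (?)", [(t,) for t in list_a])
    con.executemany("INSERT INTO b VALUES (?)", [(t,) for t in list_b])
    # LEFT JOIN keeps every row of a; rows with no partner in b get NULLs.
    rows = con.execute(
        "SELECT DISTINCT a.title FROM a "
        "LEFT JOIN b ON a.title = b.title "
        "WHERE b.title IS NULL ORDER BY a.title"
    ).fetchall()
    con.close()
    return [row[0] for row in rows]

# Hypothetical sample data standing in for the two pasted lists.
web_list = ["Dune", "Emma", "Ulysses"]
app_list = ["Emma", "Dune", "Walden"]
print(unmatched(web_list, app_list))  # titles only in the web list
print(unmatched(app_list, web_list))  # titles only in the app list
```

Running the query in both directions gives the titles missing from each side.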

        Otherwise use Excel (or similar). Copy the two lists into separate columns and sort them (separately). Go to the midpoint of the list (e.g. for 100 items, go to line 50). If the two columns match there, there’s no difference in lines 1-50, so delete those lines and repeat. If they don’t match, the extra items are in the first 50 lines (using the example), so go to the midpoint of those and compare again. Just rinse and repeat. You don’t say how long your lists are, so longer lists will need more passes, but the method is actually quite efficient. And it will find differences in both columns.
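If scripting is an option, the same sort-both-columns idea can be automated. The sketch below is not the halving procedure described above; it swaps in a single top-to-bottom walk over the two sorted lists, which reports extras on both sides in one pass. The function name `compare_sorted` and the sample titles are hypothetical, and matching is exact.

```python
def compare_sorted(col_a, col_b):
    """Walk two sorted lists in step and collect items unique to each."""
    a, b = sorted(set(col_a)), sorted(set(col_b))
    only_a, only_b = [], []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:        # same title in both columns: skip it
            i += 1
            j += 1
        elif a[i] < b[j]:       # a[i] cannot appear later in b
            only_a.append(a[i])
            i += 1
        else:                   # b[j] cannot appear later in a
            only_b.append(b[j])
            j += 1
    only_a.extend(a[i:])        # leftovers exist in one column only
    only_b.extend(b[j:])
    return only_a, only_b

only_first, only_second = compare_sorted(
    ["Dune", "Emma", "Ulysses"], ["Emma", "Walden", "Dune"]
)
print("Only in first column:", only_first)
print("Only in second column:", only_second)
```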

        2 users thanked author for this post.
      • #2299252 Reply
        anonymous
        Guest

        You have not said what type of files the long lists are in, but if they are .txt text files you could try “WinMerge” – see https://winmerge.org/downloads/?lang=en .

        There is a portable version of WinMerge (scroll down to the unofficial PortableApps link at the page above) so you don’t need to install it if this is just a temporary thing.

        I have a combination of 2 programs set up as described at https://dottech.org/167404/how-to-add-compare-to-and-compare-later-options-to-context-menu-in-windows-tip/  which achieves the same kind of thing i.e. a side by side comparison of files.

        I only continue with this method because I am familiar with it, but portable WinMerge might be better if you are starting from scratch.

         

        1 user thanked author for this post.
      • #2299254 Reply
        Kathy Stevens
        AskWoody Plus

        It would help to have a little more information regarding how the lists were created.

        Are both lists in the same format?

        Are they variations of the same MS Word document? Are they in Excel? Etc.

        If they are variations of the same MS Word document you can use Word’s compare tool.

        If not, and assuming that the lists are in the same format, you can paste the lists into two separate Word documents and format each list so that each title is on a separate line using the same order (title, author, etc.). Separate the components of each title using the tab key. Then cut and paste each document into an Excel sheet. Sort the columns of each Excel sheet by title. Then cut and paste the titles from each list into a third Excel sheet, sort the columns alphabetically, and compare.

        1 user thanked author for this post.
        • #2299279 Reply
          mn–
          AskWoody Lounger

          … are the lists ordered?

          A lot of the usual “diff” tools will be a lot less useful if ordering is also significant. Oh and numbered lists are a bit of a bother requiring some additional work.

          The traditional way for plaintext files on Unix/Linux is to use the sort command line tool (forcing a fixed order for entries in a file), and then diff.

          Also if you only need to know how many entries are in only one of the lists, then you can do a concatenation with “sort file1 file2 |uniq -u” to report any lines that occur only once across both files. This won’t tell you which file any given line was in, though.

          (“sort -df …” to ignore case differences and non-alphanumeric characters in sort, “diff -iw” to ignore case and whitespace on a line in diff, “uniq -i” will also ignore case.)
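For readers without a Unix shell, here is a rough Python counterpart to the `sort file1 file2 | uniq -u` idea, offered as a sketch rather than an exact port. Assumptions: the helper names `clean` and `lines_occurring_once` are invented, list numbering looks like `12.` or `3)`, case is ignored, and duplicates inside one file are collapsed first (which plain `uniq -u` does not do).

```python
import re
from collections import Counter

def clean(line):
    """Drop leading list numbering such as '12.' or '3)' and trim spaces."""
    return re.sub(r"^\s*\d+[.)]\s*", "", line).strip()

def lines_occurring_once(text_a, text_b):
    """Return case-folded entries that appear in only one of the two texts."""
    keys = []
    for text in (text_a, text_b):
        # A set per file, so a title repeated within one file counts once.
        entries = {clean(l).casefold() for l in text.splitlines()}
        entries.discard("")  # ignore blank lines
        keys.extend(entries)
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n == 1)

# Hypothetical pasted lists: one numbered, one not, with different casing.
list_one = "1. Dune\n2. Emma\n3. Ulysses\n"
list_two = "emma\nDUNE\nWalden\n"
print(lines_occurring_once(list_one, list_two))
```

As with the shell pipeline, the result does not say which file each leftover line came from.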

          2 users thanked author for this post.
      • #2299255 Reply
        anonymous
        Guest
      • #2299267 Reply
        Ascaris
        AskWoody_MVP

        I use the excellent Meld program for this. While it is primarily a Linux tool, it does have a Windows and an unofficial (so far… it sounds like they may be planning Mac support from the way they put it, but not yet) Mac build also.

         

        Group "L" (KDE Neon Linux, User Edition).

        2 users thanked author for this post.
        • #2299803 Reply
          geekdom
          AskWoody Plus

          Exactly what I’ve been looking for. This site provides nice, useful resource software information.

          G{ot backup} TestBeta
          offline▸ Win10Pro 1909.18363.959 x64 i3-3220 RAM8GB HDD Firefox79.0 WindowsDefender
          online▸ Win10Pro 1909.18363.1139 x64 i5-9400 RAM16GB HDD Firefox83.0b1 WindowsDefender
          TargetReleaseVersion=1909
          WUMgr
      • #2299276 Reply
        Kathy Stevens
        AskWoody Plus

        Then there is the no-brainer approach.

        Open both files.

        Block and copy the first title in the “base document” and do a search for it in the “second document”.

        If the search reveals that a title appears in both documents highlight it in both documents and move on to the second title in the “base document” and repeat the process.

        At the end of the process, the titles that are not highlighted are the ones that appear in only one of the files.

        1 user thanked author for this post.
      • #2299392 Reply
        Nathan Parker
        AskWoody_MVP

        One list is on a web page at the moment, but I can copy the entire list into any form of document type (so I could do Word, TXT, RTF, etc).

        The second is a list inside a software app, but it gives me an option to copy all where I could again paste it into any document type.

        They don’t seem to be in the same order, even though they should be since they’re a similar list of books.

        Nathan Parker

      • #2299448 Reply
        anonymous
        Guest

        Turn both versions into Word documents.  Save both.

        I have Word 2019 – open one document, then Review/Compare with the second?

        I recall that earlier versions of Word and Wordperfect had a similar tool.

        1 user thanked author for this post.
      • #2299525 Reply
        Vincenzo
        AskWoody Lounger

        I would turn them both into Word docs and use the built-in tools to alphabetize them (they may have to be in tables in order to alphabetize; I can’t quite remember).

        Then go to the Review tab, then Compare/Two Versions of a Document (legal blackline).

        I think you can do it in Excel too, but I’ve not done that.
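The alphabetize-then-compare idea can also be tried on plain text outside Word. This is a minimal sketch using Python's standard difflib module; the function name `sorted_diff` and the sample titles are assumptions, and it is not a description of how Word's Compare works internally.

```python
import difflib

def sorted_diff(list_a, list_b):
    """Sort both lists alphabetically (ignoring case), then diff them.

    Lines starting with '- ' exist only in the first list,
    lines starting with '+ ' exist only in the second.
    """
    a = sorted(list_a, key=str.casefold)
    b = sorted(list_b, key=str.casefold)
    return [
        line
        for line in difflib.ndiff(a, b)
        if line.startswith(("- ", "+ "))  # skip unchanged and hint lines
    ]

for line in sorted_diff(["Ulysses", "Dune", "Emma"], ["Emma", "Walden", "Dune"]):
    print(line)
```

Sorting first matters because a line-by-line diff would otherwise flag every title that merely sits in a different position.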

        • This reply was modified 3 weeks, 2 days ago by Vincenzo.
        1 user thanked author for this post.
      • #2299720 Reply
        Nathan Parker
        AskWoody_MVP

        About 1,683 titles.

        Nathan Parker

      • #2299787 Reply
        access-mdb
        AskWoody MVP

        A way to do this in Excel is on https://trumpexcel.com/compare-two-columns/, look for

        Example: Compare Two Columns and Highlight Mismatched Data

        In a small sample this worked, highlighting everything in column 1 that is not in column 2 and vice versa, using styles to show the differences. The lists don’t have to be sorted. I shall be using this in future when I have lists to compare.

        Google Sheets doesn’t appear to have this, but MS Excel Online might. I’ll have a look later.

        1 user thanked author for this post.
      • #2299793 Reply
        access-mdb
        AskWoody MVP

        Well I’ve found an even simpler way in Excel. Put your data into one column, select it. Now go to Data/remove duplicates. All done! Well it was in my simple test. Of course, you won’t find duplicates like ‘Lord of the Rings, The’ and ‘The Lord of the Rings’.
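The ‘Lord of the Rings, The’ case can be handled by normalizing titles before comparing them. The sketch below is one possible approach, not something Excel's Remove Duplicates does; the function names and the rule (move a trailing ", The", ", A" or ", An" to the front, ignore case and extra spaces) are assumptions chosen for illustration.

```python
import re

def normalize_title(title):
    """Build a comparison key: collapse spaces, ignore case,
    and move a trailing article ('Title, The') to the front."""
    key = " ".join(title.split()).casefold()
    match = re.fullmatch(r"(.+),\s*(the|an|a)", key)
    if match:
        key = f"{match.group(2)} {match.group(1)}"
    return key

def only_in_first(list_a, list_b):
    """Titles from list_a whose normalized form is absent from list_b."""
    seen = {normalize_title(t) for t in list_b}
    return [t for t in list_a if normalize_title(t) not in seen]

# Hypothetical sample lists.
first = ["Lord of the Rings, The", "Dune"]
second = ["The Lord of the Rings"]
print(only_in_first(first, second))
```

Other variations (subtitles, punctuation, different editions) would still need their own rules or a manual check.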

        • This reply was modified 3 weeks, 1 day ago by access-mdb.
        1 user thanked author for this post.
      • #2299808 Reply
        doriel
        AskWoody Lounger

        I use Notepad++.
        Download the “Compare” plugin – it takes 10 seconds.
        The output is crystal clear – in my test I added two lines.

        But this is not for DOCX documents, just for plain text, source codes, etc.

        Dell Latitude E6530, Intel Core i5 @ 2.6 GHz, 4GB RAM, W10 1809 Enterprise

        HAL3000, AMD Athlon 200GE @ 3,4 GHz, 8GB RAM, Fedora 29

        1 user thanked author for this post.
        • #2299825 Reply
          access-mdb
          AskWoody MVP

          Thanks Doriel, I had forgotten that Notepad++ had a compare plugin. The only thing I would say is that I had more differences than you and the output is a bit confusing (I’d better learn how to interpret it). Nathan has said he has over 1600 titles, so this might not be easy (unless there aren’t many differences). I think using Remove Duplicates in Excel is much easier, but it’s always good to have more than one way of doing things!

          [Attached screenshots: Notepad, Excel difference, Excel remove dups, Excel result]
          2 users thanked author for this post.
          • #2299831 Reply
            doriel
            AskWoody Lounger

            I agree, you are absolutely right. Notepad++ is just for basic tasks. When comparing so many titles, this is neither the best nor the easiest way.

            But it can help sometimes when performing basic tasks. Hope this could help somebody.
            Thank you for the Excel how-to!

            Dell Latitude E6530, Intel Core i5 @ 2.6 GHz, 4GB RAM, W10 1809 Enterprise

            HAL3000, AMD Athlon 200GE @ 3,4 GHz, 8GB RAM, Fedora 29

            1 user thanked author for this post.