• Need to find broken hyperlinks in txt files (Word


    #454849

    I need a macro, perhaps using a regular expression, to find hyperlinks that are broken or have improper syntax. Put another way, I want to find hyperlinks that do not match the following patterns. In the first two hyperlinks below, the only part that changes from one record to the next is the number: 39762 in the first and 39764 in the second. Everything else in these hyperlinks is constant from one record to the next.

    PREVIOUS

    NEXT

    In the third hyperlink below, the only part that changes from record to record is the reference Rev 21:6.

    Everything else in the hyperlink is constant from one record to the next.

    Rev 21:6

    If a hyperlink does not match the above patterns, I would like the macro to insert three left brackets ([[[) so I can find the improper hyperlink after the macro has run.

    Thank you in advance for your help.
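
The request above can be sketched with a full regular expression engine, shown here in Python rather than as a Word macro. Each `<a ...>...</a>` run is treated as a candidate link and is prefixed with [[[ unless it matches one of the allowed templates in full. The actual link markup did not survive in this thread, so the URL shapes (`notes.php?id=`, `bible.php?ref=`) and the names `VALID_LINK_PATTERNS`, `ANCHOR`, and `mark_bad_links` are hypothetical placeholders; the constant parts of the real links would go in their place.

```python
import re

# Hypothetical templates: the real URLs are not shown in the thread, so these
# shapes are placeholders. Only the digits or the verse reference may vary.
VALID_LINK_PATTERNS = [
    re.compile(r'<a href="notes\.php\?id=\d+">PREVIOUS</a>'),
    re.compile(r'<a href="notes\.php\?id=\d+">NEXT</a>'),
    # The displayed text must repeat the reference used in the URL.
    re.compile(
        r'<a href="bible\.php\?ref=(?P<ref>[1-3]?[A-Za-z]+ \d+:\d+)">(?P=ref)</a>'
    ),
]

# A candidate link runs from "<a" to the nearest "</a>".
ANCHOR = re.compile(r'<a\b.*?</a>', re.IGNORECASE | re.DOTALL)


def mark_bad_links(text, marker="[[["):
    """Prefix every link that matches none of the templates with the marker."""
    def check(match):
        link = match.group(0)
        if any(p.fullmatch(link) for p in VALID_LINK_PATTERNS):
            return link           # well-formed: leave untouched
        return marker + link      # anything else gets flagged
    return ANCHOR.sub(check, text)
```

Because the whole link must match a template, a stray tag, an extra space, or a doubled display text inside the link all cause it to be flagged. A link whose closing `</a>` is missing will run on into the next link, and the combined span is then flagged as one.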

    • #1130155

      Try the macro in the attached zip file (I zipped it because text with embedded HTML code could cause problems).

      • #1130157

        Thank you Hans. I will give this a try and post back in a few hours.

        • #1130163

          Thank you for the help.

          Hans, could you take a look at my comments below? I have prefixed them with >>>> so they stand out. The hyperlinks below all contain some kind of error, and I ran the macro on them. Please see the >>>> comment lines.

          Isa 55:1. Compare the Mat 5:6 The reader of the text should note; Joh 7:37″>
          note; Rev 21:6
          note.

          And whosoever will, let him take the water of life freely – Rev 21:6Rev 21:6″>

          >>>>missed out of place underline tags
          >>>>missed improper “> in a link
          >>>>finds extra spaces in a hyperlink
          >>>>missed doubling of the display part of the link

          [[[PREVIOUS NEXT‘,’NULL’);

          >>>>Picks up if extra tag in Next link

          • #1130165

            The macro I wrote performs only simple pattern checking; I don’t see an easy way of expanding it to meet your requirements.

            It would probably be better to use a regular expression tool for this; the Find feature in Word and the Like operator provide only a fraction of what is possible with regular expressions. However, complicated regular expressions always give me a headache, so I am not the best person to help you with that.

          • #1130232

            These sorts of errors are more of a syntax failure than a broken hyperlink (which is more commonly thought of as a valid hyperlink that points to a page that doesn’t exist).

            Perhaps an HTML or XML validator would be a better tool to check for these types of errors.
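
As a rough illustration of what a validator-style check adds over pattern matching, the sketch below uses Python's standard `html.parser` module to track only `<a>` and `<u>` tags and report ones that are nested or closed out of order. It is not a full HTML validator, and the names `TagBalanceChecker` and `find_tag_problems` are made up for this example.

```python
from html.parser import HTMLParser


class TagBalanceChecker(HTMLParser):
    """Track <a> and <u> tags and record ones that are out of place."""

    TRACKED = {"a", "u"}

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tracked tags as (tag, line)
        self.problems = []   # findings as (line, message)

    def handle_starttag(self, tag, attrs):
        if tag not in self.TRACKED:
            return
        line = self.getpos()[0]
        if tag == "a" and any(t == "a" for t, _ in self.stack):
            self.problems.append((line, "<a> opened inside another <a>"))
        self.stack.append((tag, line))

    def handle_endtag(self, tag):
        if tag not in self.TRACKED:
            return
        line = self.getpos()[0]
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        else:
            self.problems.append(
                (line, f"</{tag}> does not close the most recent open tag")
            )


def find_tag_problems(text):
    """Return a sorted list of (line, message) findings for the text."""
    checker = TagBalanceChecker()
    checker.feed(text)
    checker.close()
    for tag, line in checker.stack:
        checker.problems.append((line, f"<{tag}> never closed"))
    return sorted(checker.problems)
```

A check like this would catch the out-of-place underline tags mentioned earlier in the thread, but not errors in the URL text itself, so it complements a pattern check rather than replacing it.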

          • #1130249

            You might be able to use Tidy, also known as HTML Tidy. It is integrated into some HTML editors, or you can read more here: http://tidy.sourceforge.net/#binaries.

            • #1130254

              Thank you for the suggestions. The files I am working with are in the 10 to 20 MB range. I think if I could find a suitable regular expression, I could use it to mark all the URLs, extract them to another file, and then process them from there.

              Thank all of you again. Woody’s is a big help to me.
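
The mark-and-extract idea above could look something like the following Python sketch. It reads the source one line at a time, so file size is not a concern, and writes each link it finds to a second file together with its line number. It assumes a link never spans a line break; the names `ANCHOR` and `extract_links` are illustrative only.

```python
import re

# A link is taken to be "<a ...>" followed by the nearest "</a>" on the same line.
ANCHOR = re.compile(r'<a\b[^>]*>.*?</a>', re.IGNORECASE)


def extract_links(src, dst):
    """Write every link in src to dst as "line number<TAB>link".

    src and dst are open text files (or any line iterable and writable
    object). Returns the number of links written.
    """
    count = 0
    for line_no, line in enumerate(src, start=1):
        for match in ANCHOR.finditer(line):
            dst.write(f"{line_no}\t{match.group(0)}\n")
            count += 1
    return count
```

To run it on real files, open the source for reading and the destination for writing with `open(...)` and pass both to `extract_links`. The extracted file is then small enough to check against the expected patterns, and the line numbers point back into the original.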

    • #1206826

      How would one find text that has a hyperlink associated with it (blue underlined text), as in a Word document?

      • #1206842

        Hi Jim,

        I’ve already given you code to do that for URLs in Post 771669 in the Word forum. That code finds URLs regardless of whether they’re active hyperlinks. Are you looking for code to find only active hyperlinks? What about file, mailto and internal hyperlinks within a document?

        Cheers,
        Paul Edstein
        [Fmr MS MVP - Word]
