• Fast way to remove EOL paragraph mark in Word


    #473448

    I have 13 Meg txt files opened in Word. Each line in the file ends with a paragraph mark. I am trying to replace each mark with a space by using Word’s replace feature (in Word I am using caret p). However, this process is taking forever. Is there a faster way to remove end of line paragraph marks from txt files?

    Thank you in advance for your replies.

    • #1257756

      Is the problem that you can’t use Replace All with Word’s Find and Replace, because you need to keep some of them?
      Are there some of the paragraph marks you need to keep because they really represent paragraphs?

      • #1257758

        Thank you for the reply.

        No, I am using Find and Replace, Replace All. The problem is the enormous time and resources of processing the file. The sentences/phrases in the file are approximately five words long. Every time a paragraph mark is replaced, all the lines following get “pulled up.” I do not need to keep any paragraph marks because I embedded markers in the text before I started the paragraph mark deletion procedure. Is there some kind of application I can use to replace the paragraph marks outside of Word (if that would help any)?

    • #1257766

      Hi HowdeeDoodee,

      You might try something using WSH instead of Word. There’s some code here: http://www.ericphelps.com/scripting/samples/GlobalSearchAndReplace/GlobalSearchAndReplace.txt
      It’s coded to only work with files under 1MB, but deleting the lines:

      Code:
      If objScriptingFile.Size > 999999 Then
        Status "* TOO LARGE: " & objScriptingFile.Path
        Exit Sub
      End If

      might work for you. You could also change the line:
      strOldText = InputBox("Enter old text you want replaced", "Old Text", "/index.htm"">")
      to:
      strOldText = vbCr
      or:
      strOldText = vbCrLf
      No matter which program you use, though, processing files with 13MB of text is going to take some time. All the WSH will do is avoid Word’s overheads. A fast PC wouldn’t hurt, either.

      Cheers,
      Paul Edstein
      [Fmr MS MVP - Word]
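For anyone who would rather try this outside Word altogether, here is a minimal Python sketch of the same idea. It is not the script linked above: the names `join_lines` and `join_lines_in_file` are made up for illustration, and it assumes the file really is plain text in a known encoding (UTF-8 is only a guess here). It reads the file in chunks, so a 13MB file never has to be held as one string.

```python
import io


def join_lines(src, dst, chunk_size=1 << 20):
    """Copy text from src to dst, turning each CR, LF, or CRLF into one space."""
    pending_cr = False  # True when the previous chunk ended with a bare CR
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr and chunk.startswith("\n"):
            # This LF is the second half of a CRLF split across two chunks;
            # the CR half already produced a space, so drop it.
            chunk = chunk[1:]
        pending_cr = chunk.endswith("\r")
        dst.write(chunk.replace("\r\n", " ").replace("\r", " ").replace("\n", " "))


def join_lines_in_file(in_path, out_path, encoding="utf-8"):
    """File wrapper; newline="" stops Python from translating line endings itself."""
    with open(in_path, "r", encoding=encoding, newline="") as src, \
         open(out_path, "w", encoding=encoding, newline="") as dst:
        join_lines(src, dst)


# Small in-memory demonstration with mixed line endings.
demo = io.StringIO()
join_lines(io.StringIO("one two\r\nthree\rfour\nfive"), demo, chunk_size=8)
print(demo.getvalue())  # one two three four five
```

Something like `join_lines_in_file("input.txt", "output.txt")` would then write a joined copy and leave the original untouched (both file names are placeholders).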

    • #1257778

      I would start by putting the file into Normal (Draft) view and the cursor at the top of the file to avoid repagination slowdowns. Then try something like this

      ActiveDocument.Range.Text = Replace(ActiveDocument.Range.Text, vbCr, " ")

      I don’t know how big a string is allowed to be but a string function might work well enough.

    • #1257997

      Hello, HDDD.

      In your case, what I do is go to Find/Replace (Ctrl+H) and replace all double ¶ marks with a placeholder; I use %%. Then press Alt+A and they are all replaced. Then do Ctrl+H again, find all single ¶ and replace them with a space. After this is done, go back to the %% and replace them with a double ¶. To enter the ¶ in the dialog, type ^p.

      Look into the “More” button on the Find/Replace dialog, then click on “Special” to find the one to remove. The ^p is just that: Shift+6 (the caret) followed by P, or, possibly quicker, the Alt 20 keys.

      Have a good look at that dialog, after you have selected what to Find, then what to replace with, it will show at the bottom that the A in All is underlined, this tells you that an Alt A will flush them all. If it is not to your liking, a Ctrl Z will bring you back one step.

      Are we having fun yet ? ………………….Jean.

    • #1258150

      Hi Jean,

      The issue isn’t one of replacing single paragraph breaks only with spaces, but of replacing all paragraph breaks with spaces, which requires nothing more than a single Find/Replace.

      Cheers,
      Paul Edstein
      [Fmr MS MVP - Word]

    • #1258254

      macropod, greets.

      You are right, read my note and you will see that the procedure that I mentioned is just this, replace all ¶ with a space. Keep in mind that all true paragraphs are followed by an undesirable ¶ and the one to keep. So, there are cases where there are two in a row thus the necessity to F & R two ¶ ¶ and replace them with another two characters that will be later replaced in the text to actually keep the formatting of the text.

      The second procedure is to rid the text of the single ¶ by a space, if this is done first, the F & R would do this all over, disregarding that all ¶ ¶ should not be taken singly. At this moment, there are no more double ¶¶ and one can get rid of all the left over single ¶ and replace them with a space.

      If these are “manual line breaks” they are different: they are ^l. I am looking for the Alt code for them, but mechanically they are Shift 6 + lowercase L, ^l. You can try replacing them with a space and all will be happy. I think that Alt 28 could be it. I hope that you have toggled “show non-printing characters” to see what is what. In Word, it is the ¶ up on the bar and if you hover on it, it will read “Show/Hide ¶”.

      I hope that you are having fun, I am……………….Jean.

    • #1258258

      HDDD, hello.

      If this text is not too personal, upload it here and I will do the trick for you. See just below when you reply, the Attachments, hit the Browse, find your .doc and then Upload file.

      Kid’s stuff. ……………Jean.

    • #1258260

      Me again macropod.

      >>> The issue isn’t one of replacing single paragraph breaks only with spaces, but of replacing all paragraph breaks with spaces, which requires nothing more than a single Find/Replace.

      Does not HDDD want to keep the format of the whole text ? A .doc such as hers, is done by an uneducated typist doing a CR ( Enter ) at the end of all lines as on a typewriter, instead of letting wordwrap do its trick.

      A FWIW………..Jean.

      • #1258270

        No that is not the issue. He genuinely wants to replace all CRs, not just the double CRs. Read the full thread.
        The problem is that the document is enormous (13MB) and Replace All is taking far too long. He knows how to do a Replace All, but is looking for a faster alternative.

    • #1258280

      Hello John.

      >>>The problem is that the document is enormous (13MB) and Replace All is taking far too long. He knows how to do a Replace All, but is looking for a faster alternative.

      I do not see why, mine was done here in a wink. I had 33 ± full pages. But who am I to say ? I do not think that there is another Word alternative. I read the question over again and I have the impression, maybe wrong, that the replacement is done one ¶ at the time and the Replace All is not used. ( ??? )

      Be good………….. Jean.

      • #1258320

        I do not see why, mine was done here in a wink. I had 33 ± full pages.

        The why is because that’s what HowdeeDoodee says he wants to do.

        I have 13 Meg txt files opened in Word. Each line in the file ends with a paragraph mark. I am trying to replace each mark with a space

        Processing around 33 pages is trivial compared to processing 13MB of text with a paragraph break at the end of each line of about 5 words each. That makes for in excess of 450,000 paragraph breaks to replace! Your 33 page document would probably have much less than 1/3 that number of characters, relatively few of which (probably less than 1,000) would be paragraph breaks.

        Cheers,
        Paul Edstein
        [Fmr MS MVP - Word]
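The estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch: the 6 bytes per word (about five letters plus a space or line ending) is an assumed average, not something measured from the original file, and `estimate_breaks` is a made-up helper name.

```python
def estimate_breaks(file_bytes, words_per_line=5, bytes_per_word=6):
    """Rough count of line endings in a text file of the given size.

    bytes_per_word is an assumed average: roughly five letters plus
    one separator (a space, or the line ending after the last word).
    """
    return file_bytes // (words_per_line * bytes_per_word)


size = 13 * 1024 * 1024  # 13MB, taken as 13 * 2**20 bytes
print(estimate_breaks(size))  # 454382
```

With shorter or longer words the figure moves accordingly, but it stays in the hundreds of thousands, which is the point being made.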

    • #1258299

      mine was done here in a wink. I had 33 ± full pages

      13 MB is probably 15,000 pages!

      In the third post in this thread, he says he is using Replace All

    • #1258325

      We haven’t yet heard back from the original poster to see if any of the suggestions here have helped, but based on this:

      Every time a paragraph mark is replaced, all the lines following get “pulled up.”

      – it seems they are doing the Find/Replace while in Print Layout view, in which case Andrew’s suggestion to first switch to Normal view would probably speed things up significantly.

      Another thing that might help would be to run the Find/Replace via a macro, and start the macro with the line:

      Application.ScreenUpdating = False

      Gary

    • #1258942

      To replace EOL characters, which are really just soft returns if I am understanding correctly, use the character above the 6 on the keyboard (called a caret) and a lower case L as the search target. Then use a space as the replacement character. To replace paragraph marks, use a lower case P instead of L. Remember that all formatting for a paragraph is carried in the paragraph mark for that paragraph. Removing or replacing that character may change the paragraph formatting.

    • #1258973

      I did a file with 10,600 replacements in 15 seconds flat. Assuming the math stated above is correct, it should take 10 minutes to do 15,000 pages or so. That’s not forever, unless you have to do it 10 times a day. I’m not so sure that we’re understanding the problem. If it’s being done 10 times a day, get a new data entry clerk.
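The ten-minute figure follows from a straight-line extrapolation of that timing to the roughly 450,000 breaks estimated earlier in the thread. A quick sketch of the arithmetic, assuming the time per replacement stays constant (which Word may well not honour on a file this size); `extrapolate_seconds` is a hypothetical helper name.

```python
def extrapolate_seconds(sample_count, sample_seconds, target_count):
    """Linear extrapolation: assumes every replacement costs the same time."""
    return sample_seconds * target_count / sample_count


seconds = extrapolate_seconds(10_600, 15, 450_000)
print(f"{seconds / 60:.1f} minutes")  # 10.6 minutes
```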

    • #1259209

      Almost definitely, the slowdown is due to doing the job in a view which causes repagination and/or screen updating. As others said, use draft/normal, depending which version of Word, and make sure repaginate is switched off.

      I used to macro process ~30MB TXT files [logs] with ~50 operations to clean and arrange the data in Word 2003 on WinXP, most of them being search and replace all ops. It usually took longer to download the files than to process them, only minutes all round.

      Btw, if ^p doesn’t find all the para marks, ^13 should get the rest.

      Lugh.
      ~
      Alienware Aurora R6; Win10 Home x64 1803; Office 365 x32
      i7-7700; GeForce GTX 1060; 16GB DDR4 2400; 1TB SSD, 256GB SSD, 4TB HD

    • #1259291

      If the posts above are correct, what does one individual do with a report that’s 15,000 pages long. Is it a new “health care bill”?

    • #1259528

      I do this all the time – replace a paragraph marker ¶ with a space or even just remove the marker. But there can be some tricks to it.

      If it is a text file you imported, what appears at the end of each line – ¶ – may not be a paragraph marker (a two-character code of a carriage return [hex 0d] and line feed [hex 0a]) but just one of these, that is, only a carriage return or only a line feed. In that case the ^p replace will not work.

      The easiest way to test this is to use the Find Next button of the Replace panel, looking for a ¶ using ^p. If nothing is found, then try ^013 (for carriage return) or ^010 (for line feed) in the ‘find what’ line. Then use that code for the Replace All.
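If the file is still available as plain text, the same check can be made outside Word by counting the raw bytes. This is a small Python sketch, not anything Word itself provides; `count_line_endings` is a made-up name and it looks at the file before Word has touched it.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone carriage returns, and lone line feeds."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}


sample = b"first\r\nsecond\rthird\nfourth\r\n"
print(count_line_endings(sample))  # {'CRLF': 2, 'CR': 1, 'LF': 1}
```

For a real file, the bytes would come from something like `open(path, "rb").read()`, where `path` is whatever the file is called.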

      I often use Replace All to eliminate the blank lines by first replacing all double paragraph markers ¶¶ (‘find what:’ ^p^p) with just one ¶ (‘replace with:’ ^p).

      Or if every line in the document ends with a paragraph marker AND each paragraph has two paragraph markers (one at the end of the line and one at the left margin), and you want to put the paragraphs back together, then I replace the double paragraph markers (¶¶) with something that is not in the document like &&&&, then go back and replace all single line paragraph markers with a space, then go back and replace the &&&& with two paragraph markers again (to get the paragraphs back as they should be.)
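The three-pass placeholder trick just described can be sketched in Python for a plain text file. This is only an illustration of the sequence of replacements, not what Word does internally; `rejoin_paragraphs` is a made-up name, and the `&&&&` placeholder is assumed not to occur in the text (the function refuses to run if it does).

```python
def rejoin_paragraphs(text, placeholder="&&&&"):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # Normalize CRLF and lone CR to LF so one pattern covers all files.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.replace("\n\n", placeholder)   # pass 1: protect real paragraph breaks
    text = text.replace("\n", " ")             # pass 2: single breaks become spaces
    return text.replace(placeholder, "\n\n")   # pass 3: restore paragraph breaks


wrapped = "line one\nline two\n\nnext para\nmore"
print(repr(rejoin_paragraphs(wrapped)))  # 'line one line two\n\nnext para more'
```

The order matters: if the single breaks were replaced first, the double breaks would be gone before they could be protected.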

      There may be some variations here, such as the blank lines with a paragraph marker in the left margin have a space first, then you would use ^p ^p. But even with a 13 meg document, if the replace results are not what you wanted, undo it with Ctrl-Z.

      Microsoft Word (all versions) is really a relational database. The text is one database table and all the paragraph formatting (like margins or space before or space after, etc) is a second database table related to the text table by line number and offset into the line. And all the formatting for any one paragraph is carried in the paragraph marker.

      To show how neat this can be, put the cursor just in front of a paragraph marker, then highlight just the paragraph marker with a single Shift + right arrow move. Then copy (Ctrl-C) the marker. Then place the insertion point next to a paragraph marker in a different paragraph, one with significantly different formatting (margins, for instance), and press Ctrl-V to insert the copied paragraph marker. Poof, the paragraph now has the format of the earlier paragraph.

    • #1259531

      Kirk, hello and welcome to the thread.

      I totally agree with you. After all, this is what I posted a few lines up.

      The original poster never came back! We are wasting our spit on this subject.

      The best of the season to you…………..Jean.

    • #1259607

      Hi Kirk,

      Please re-read the thread, especially posts 1, 3, 7, 11 & 14.

      Cheers,
      Paul Edstein
      [Fmr MS MVP - Word]

    • #1259981

      Hey Howdee…
      Have you had success in resolving your replace problem?
      I saw it today and thought of another option, gave it a test on a 15 MB text file, replaced the paragraph breaks just like you mentioned, and it was done in seconds! What did I use? A free editor called EditPad Lite. If the file truly is a text file and not a Word doc with formatting to worry about, then this will do it fast.
