Topic: Import TXT file into Excel (formatting issues) @ AskWoody

Import TXT file into Excel (formatting issues)
Home » Forums » AskWoody support » Productivity software by function » MS Excel and spreadsheet help » Import TXT file into Excel (formatting issues)
- This topic has 14 replies, 4 voices, and was last updated 8 years, 7 months ago.
Author

Topic
New Reply

WSaapke
AskWoody Lounger

March 24, 2010 at 9:49 pm #467677

I am looking for the easiest way to import the attached file into Excel so that analysis can be performed. The file was created by another application and has all the pertinent data. What I am struggling with is how to get the two lines of data onto one line in order to easily import it into Excel using the Text to Data feature. If you look at the copy/paste below, you’ll notice the header and the subsequent data is coming across on two lines. I cannot do a clean import into Excel and this is quite a big file. All suggestions are welcome.

CHECK
BANK CHECK # CHECK AMT CHECK DATE VENDOR # ADDRS # VENDOR NAME STATUS
WO # AMOUNT G/L ACCT # DESCRIPTION INVOICE # INV VEND
————————————————————————————————————————————

1 1 $38742.79 05/21/09 INT01 0 INTERNAL REVENUE SERVICE OUTSTANDING
38742.79 01-00000-20600-00000-000 PAYROLL TAXES FOR PE 4/24/09 PPE 4/24/09

1 2 $552.60 05/21/09 CON06 0 CONEXIS OUTSTANDING
552.60 90-00000-23327-00000-000 FLEX 125-DEPENDENT CARE REIMBS 051509

Thanks.

Reply | Quote

Viewing 12 reply threads
Author

Replies
- WSsdckapr
  AskWoody Lounger
  
  March 25, 2010 at 6:58 am #1214942
  
  I think the easiest thing to do would be to have the program exporting this text file, export it in a more appropriate way. Some programs can export into an excel worksheet instead of using text.
  
  If it is only this file, I would manually work on it, trying to determine what “type” the line of text is. If you are only interested in the data and not the header, I would add first add a column numbering the items (to always be able to sort to the original list) then a column with a 0, 1, 2, 3 (0 for header, 1 for line1, 2 for line2, 3 for blank) then you can sort by type and then original number, delete the rows with 0 and 3, and then do a text to column with type 1 and type 2 and then move type 2 columns into the columns in the same rows as the type 1
  
  If you have to do this a lot a macro could be written to do it, but how complex will depend on how much it may change from time to time
  
  Steve
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 25, 2010 at 7:12 am #1214946
  
  Hi aapke,
  
  Before loading the file into Excel, load it into Word and run the following macro against it, then re-save the file and load into Excel. From there you can use the Excel text-to-columns function, with fixed-with splits, to parse the data.
  
  Code:
  
  Sub StatementReformat() Application.ScreenUpdating = False With ActiveDocument.Range.Find .ClearFormatting .Replacement.ClearFormatting .Forward = True .Wrap = wdFindContinue .Format = False .MatchCase = False .MatchWholeWord = False .MatchAllWordForms = False .MatchSoundsLike = False .MatchWildcards = True 'First Page .Text = "REPORT*EXTRACT FILE : [A-Z0-9]{1,}^13^12" .Replacement.Text = "" .Execute Replace:=wdReplaceOne 'Column Headers (except New First Page) .Text = "^12*VEND^13*[-]{1,}^13" .Execute Replace:=wdReplaceAll 'Header Underline on First Page .Text = "[-]{1,}^13" .Execute Replace:=wdReplaceAll 'Last Page .Text = " TOTAL #*^12REPORT*{1,255}PAGE*{1,255}FUND TOTALS*[=]{1,}*^13*^13" .Execute Replace:=wdReplaceAll 'Unwanted Header Lines - First Page .Text = "REPORT*CHECK^13" .Execute Replace:=wdReplaceAll 'Empty Paragraphs .Text = "[ ]{1,}^13" .Execute Replace:=wdReplaceAll 'Header Line Wrap - First Page .Text = "(STATUS)^13" .Replacement.Text = "1" 'Record Line Wraps .Execute Replace:=wdReplaceAll .Text = "(OUTSTANDING)^13" .Execute Replace:=wdReplaceAll .Text = "(CLEARED)^13" .Replacement.Text = "1 " .Execute Replace:=wdReplaceAll .Text = "(VOIDED)^13" .Replacement.Text = "1 " .Execute Replace:=wdReplaceAll 'Pad Records with Multiple Rows .Text = "(^13)([!^13]{1,120}^13)" .Replacement.Text = "1 2" .Execute Replace:=wdReplaceAll .Execute Replace:=wdReplaceAll End With Application.ScreenUpdating = True End Sub
  
  Update:
  I noticed that negative values in the data have the ‘-‘ sign trailing the values concerned. Excel doesn’t handle those too well. You can convert them to values with a preceding ‘-‘ sign by adding the following four lines immediately before the ‘End With’ statement in the macro:
  
  Code:
  
  'Reformat -ve values .Text = "( )([0-9]{1,}.[0-9]{2})(-)" .Replacement.Text = "321" .Execute Replace:=wdReplaceAll
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- WSaapke
  AskWoody Lounger
  
  March 25, 2010 at 8:53 pm #1215159
  
  Okay, that was pretty cool to just watch. Thank YOU! Do I need to run it more than once to fix the issue with the multiple lines for one record?
  
  1 3000022 $697.49 04/16/09 COA05 0 COASTAL BLUE, INC OUTSTANDING 24.84 50-00000-62250-07203-000 FILM/MAPS/BLUEPRINTS 423225
  
  25.91 25-00000-63608-80028-000 FILMS/MAPS/BLUEPRINTS 423272
  12.34 50-00000-62250-09207-000 FILMS/MAPS/BLUEPRINTS 423458
  
  20.10 25-00000-63608-80058-000 FILMS/MAPS/BLUEPRINTS 423527
  19.22 50-00000-62250-09207-000 FILMS/MAPS/BLUEPRINTS 423950
  
  59.86 25-00000-63608-80068-000 FILMS/MAPS/BLUEPRINTS 423980
  51.56 25-00000-63608-80064-000 FILMS/MAPS/BLUEPRINTS 424065
  
  77.85 01-62000-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424076
  30.44 50-00000-62250-07104-000 FILMS/MAPS/BLUEPRINTS 424187
  
  77.58 50-00000-62250-07107-000 FILMS/MAPS/BLUEPRINTS 424191
  37.34 50-00000-62250-07101-000 FILMS/MAPS/BLUEPRINTS 424320
  
  47.02 01-62000-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424439
  20.10 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424479
  
  29.56 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424480
  53.71 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424620
  
  34.53 50-00000-62250-07104-000 FILMS/MAPS/BLUEPRINTS 424695
  75.53 25-00000-63608-80065-000 FILMS/MAPS/BLUEPRINTS 425067
  
  Thanks.
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 26, 2010 at 1:13 am #1215179
  
  Hi aapke,
  
  You shouldn’t need to. The two consecutive ‘.Execute Replace:=wdReplaceAll’ statements near the end of the first code module I posted should take care of that – it did in my testing.
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- WSaapke
  AskWoody Lounger
  
  March 26, 2010 at 11:08 am #1215258
  
  I just knew I did something incorrectly. Basic steps I used are: Generate the report again in the finance system. Saved the generated file as .TXT. Opened it up in Word. Inserted your totally cool code into a module. Saved the file as a .docm. Ran the macro, sat back, and watched it work it’s magic. Saved the file as a .TXT again. Imported it into Excel using the Text to Data feature. And yes, I did incorporate the additional 4 lines.
  
  Did I miss a step along the way? Either way, the new file I have to work with is so much cleaner than the old one, it won’t take more than a few hours to clean it up manually if needed! Thank you so much!
  
  aapke
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 26, 2010 at 11:24 am #1215261
  
  Hi aapke,
  
  I suspect what you’re seeing is line-wrapping effects. Attached is a Word file containing the full macro, with the prcoessed data and the page format set up so that each record takes a single line in the document. As you can see, the multi-entry records are fully padded.
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- WSaapke
  AskWoody Lounger
  
  March 26, 2010 at 2:08 pm #1215280
  
  Hi macropod,
  
  I’ve copied the steps exactly (I believe) and am still getting the line wrapping. Can you tell from a quick glance what I am doing incorrectly?
  
  Thanks again,
  aapke
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 26, 2010 at 4:46 pm #1215299
  
  Hi aapke,
  
  I don’t understand why your system isn’t behaving the same as mine. Is your copy of Word of Word 2007 fully updated? There were some problems with the vba implementation in the earlier releases. I’ve run the code with both Word 2000 SP3 and Word 2007 SP2 and it works fine. Nevertheless, it seems what you’re ending up with is not so much a line wrapping issue as unwanted line breaks in every 2nd padded line. You should be able to fix that by adding the following extra lines to your code:
  
  Code:
  
  'Fix Broken Lines .Text = "( )(^13)( )" .Replacement.Text = "13" .Execute Replace:=wdReplaceAll
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- WSaapke
  AskWoody Lounger
  
  March 26, 2010 at 8:54 pm #1215373
  
  Ah. That may just be it. My copy of Word 2007 shows as “MS Word 2007 (12.0.6514.5000) SP2 MSO (12.0.6521.5000). Hopefully that’ll fix everything!
  
  Thank you again for ALL the assistance.
  aapke
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 26, 2010 at 10:01 pm #1215378
  
  Hi aapke,
  
  My copy of Word 2007 shows as “MS Word 2007 (12.0.6514.5000) SP2 MSO (12.0.6521.5000).
  
  In that case, you do indeed have Word 2007’s SP2 update installed. Strange – perhaps it’s just some little glitch lurking somewhere else in your system.
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- macropod
  AskWoody_MVP
  
  March 26, 2010 at 11:48 pm #1215381
  
  Hi aapke,
  
  Instead of adding the extra lines of code for repairing the broken data lines that I suggested earlier, try changing:
  
  Code:
  
  'Pad Records with Multiple Rows .Text = "(^13)([!^13]{1,120}^13)" .Replacement.Text = "1 2" .Execute Replace:=wdReplaceAll .Execute Replace:=wdReplaceAll
  
  to
  
  Code:
  
  'Pad Records with Multiple Rows .Text = "(^13)([ ]{20})([ ]{1,8}[0-9]{1,8}.[0-9]{2})" .Replacement.Text = "12222222 3" .Execute Replace:=wdReplaceAll
  
  You’ll notice that the new block has only one ‘.Execute Replace:=wdReplaceAll’ statement. That’s all I found necessary and it should make the overall execution just that little bit quicker – as should not needing those other extra lines of code for repairing the broken data lines .
  
  Cheers,
  Paul Edstein
  [Fmr MS MVP - Word]
  
  Reply | Quote
- WSSMI
  AskWoody Lounger
  
  October 14, 2016 at 3:35 am #1581397
  
  Hi aapke,
  
  Instead of adding the extra lines of code for repairing the broken data lines that I suggested earlier, try changing:
  
  Code:
  
  'Pad Records with Multiple Rows .Text = "(^13)([!^13]{1,120}^13)" .Replacement.Text = "1 2" .Execute Replace:=wdReplaceAll .Execute Replace:=wdReplaceAll
  
  to
  
  Code:
  
  'Pad Records with Multiple Rows .Text = "(^13)([ ]{20})([ ]{1,8}[0-9]{1,8}.[0-9]{2})" .Replacement.Text = "12222222 3" .Execute Replace:=wdReplaceAll
  
  You’ll notice that the new block has only one ‘.Execute Replace:=wdReplaceAll’ statement. That’s all I found necessary and it should make the overall execution just that little bit quicker – as should not needing those other extra lines of code for repairing the broken data lines .
  
  Hi,
  I also tried this code and got really impressed the way you have used regular expressions within the with…end with block. I am an absolute beginner as far as RegEx is concerned. Just want to know if you could allow me to share a similar file with you that I have from our database. That way looking at that code that you will assist me with, will give me a better understanding how RegEx could help me with kind of report I have to deal with on regular basis.
  
  I would be waiting for your feedback.
  
  Kind regards
  Syed
  
  Reply | Quote
- WSaapke
  AskWoody Lounger
  
  March 27, 2010 at 4:23 pm #1215434
  
  Will do!
  
  Thank you again.
  aapke
  
  Reply | Quote
- WSSMI
  AskWoody Lounger
  
  October 17, 2016 at 6:55 am #1581679
  
  Hi Paul,
  OK now I understand this is not RegEx but wildcards within MS Office that are similar to RegEx. However, even though I have looked up some reference material but still need your help. Only if you could explain the logic behind following few lines extracted from your above code. Note that to make my query more understandable and precise I have avoided With…End With block and have re-written it otherwise:
  
  Code chunk #1:
  ActiveDocument.Range.Find.Text = ” TOTAL #*^12REPORT*{1,255}PAGE*{1,255}FUND TOTALS*[=]{1,}*^13*^13″
  
  Code chunk #2:
  ActiveDocument.Range.Find.Text = “(CLEARED)^13”
  ActiveDocument.Range.Find.Replacement.Text = “1 ”
  ActiveDocument.Range.Find.Execute Replace:=wdReplaceAll
  
  Code chunk #3:
  ‘Pad Records with Multiple Rows
  ActiveDocument.Range.Find.Text = “(^13)([ ]{20})([ ]{1,8}[0-9]{1,8}.[0-9]{2})”
  ActiveDocument.Range.Find.Replacement.Text = “12222222 3”
  ActiveDocument.Range.Find.Execute Replace:=wdReplaceAll
  
  Question referring to chunk# 1: I need your help as to what all those curly braces and numbers mean? I am not sure about spaces before word “TOTAL” in your code. Also the CHECKREG.TXT does not have word “FUND TOTALS”, so I am not sure what does this part of code line mean?
  
  Question referring to chunk# 2 : Understand that code looks for word “CLEARED” followed by paragraph mark but I need some clarity on the next two lines. What these last two lines are doing?
  
  Question referring to chunk# 3: I am totally clueless to these last 3 lines. Can you please explain about their function?
  
  It is hoped you will appreciate my intent to learn some code from your already created set of code instead of digging into something from scratch.
  
  I would be waiting for your response.
  
  REGARDS,
  SMI
  
  Reply | Quote
Viewing 12 reply threads

Reply To: Import TXT file into Excel (formatting issues)
You can use BBCodes to format your content.
Your account can't use all available BBCodes, they will be stripped before saving.

Your information:
Name (required):

Mail (will not be published) (required):

Website:

Cancel

Plus Membership

Donations from Plus members keep this site going. You can identify the people who support AskWoody by the Plus badge on their avatars.

AskWoody Plus members not only get access to all of the contents of this site -- including Susan Bradley's frequently updated Patch Watch listing -- they also receive weekly AskWoody Plus Newsletters (formerly Windows Secrets Newsletter) and AskWoody Plus Alerts, emails when there are important breaking developments.

Welcome to our unique respite from the madness.

It's easy to post questions about Windows 11, Windows 10, Win8.1, Win7, Surface, Office, or browse through our Forums. Post anonymously or register for greater privileges. Keep it civil, please: Decorous Lounge rules strictly enforced. Questions? Contact Customer Support.

Import TXT file into Excel (formatting issues)

Plus Membership

Search Newsletters

Search Forums

View the Forum

Search for Topics

Recent Topics

Recent blog posts

My Profile

Key Links

Remembering Woody

Import TXT file into Excel (formatting issues)

Plus Membership

Search Newsletters

Search Forums

View the Forum

Search for Topics

Recent Topics

Recent blog posts

My Profile

Login and Registration

Key Links

Remembering Woody