• Identify problem characters

    Author
    Topic
    #2005519

    we are using MS Navision to upload order files. These files need to be in csv format. Today we .encountered a problem that the file was rejected without any clear indication as to the cause It just stated  Import Complete:  Created Orders : 0  Skipped Blank Lines: 0

    We looked at all cells to see if we can determine any problems with strange data but could not see anything.  We then copied the full data set into a new csv file with values only. The upload then worked as expected.

    I suspect that perhaps one or more cells contained invalid characters that may actually be hidden to the naked eye. Is there any formula or function to run against the original csv file to figure out the culprit?

    Viewing 6 reply threads
    Author
    Replies
    • #2006390

      Is there any formula or function to run against the original csv file to figure out the culprit?

      Not in the general case as csv just isn’t standardized very well… heh, Microsoft Office 365 web management often produces csv files that cannot be opened in Microsoft Office local applications…

      However, in this specific case you may already have a clue you could use…

      We then copied the full data set into a new csv file with values only. The upload then worked as expected.

      You could just compare the old and new files with an appropriate tool. You know, with a hexadecimal view, because that’ll show you any “non-character” content too.

      2 users thanked author for this post.
    • #2006438

      Hi MN-,

      thank you – I downloaded HxD and used a compare function and it confirmed that the files were different, though it’s not apparent to the naked eye. Looking at the attached screenshot,  I suspect that during the processing of the file, we somehow saved it incorrectly thus messing up the characters but I don’t know enough to draw any other conclusions. It seems to me it is not a specific field that was wrong. Does the additional ” appearing in all the fields shed any more light on this? The file on the left with the ” is the correct working one, the one on the right with the ” stripped is failing to upload. Just trying to understand and explain to others who do this process what to look out for or avoid doing.

      • This reply was modified 5 years, 6 months ago by beethoven.
      • #2006449

        I suspect that during the processing of the file, we somehow saved it incorrectly thus messing up the characters but I don’t know enough to draw any other conclusions. It seems to me it is not a specific field that was wrong. Does the additional ” appearing in all the fields shed any more light on this?

        The thing is, CSV is not defined precisely enough to determine this from the file itself. As per RFC 4180, either with the ” or without it is allowed, but so is having the fields in EBCDIC instead of ASCII.

        You need to determine the exact requirements from the application. And as to how to produce files that conform to that, depends on where those come from…

        Microsoft applications in particular are known to be picky about CSV files with no good reason, even to the extent that a CSV file produced in MS Excel set to a random European locale might not work in same version of MS Excel set to “US English”.

    • #2006452

      thanks again – at least we seem to have a way to check if a file has some hidden changes and the original copy / paste values option worked

    • #2006556

      Did some additional testing just to see if I can reproduce the issue (in order to avoid it in future) and to see how I can make use of HxD.

      Unfortunately, I cannot really interpret what I am seeing.

      In the attached example of a file that had “only” the name changed, one error was shown as demonstrated in the side by side screenshot with the highlights. I did not change any dot to comma.  However in a subsequent test I then changed data in one of the cells (the first cell after the dots) from 137WEB to 136WEB.  This was a change common to modify this sheet prior to upload and after making this change, it was not highlighted.

      Does that mean that the changes highlighted have nothing to do with “real” changes but only relate to invisible formatting?  I do get the warning each time when saving some formatting being incompatible and I click yes to save (as csv).  I am not sure if someone saved this as xls first and then resaved as csv this would lead to the original failure?

    • #2006597

      As you can’t easily work out what went wrong it may be easier to import problem files into a spreadsheet and then save the file as a CSV, without changing anything.

      cheers, Paul

    • #2006602

      Hi Paul,

      unfortunately that is not always possible as the data comes from a website where customers enter their addresses or details themselves, often in ways that are not correct or making sense.  So the issue is not so much data that we correct ourselves but data that is being entered by customers (sometimes invisible when they may use a cut and paste from a browser helper and this way not using proper csv characters or as MN- suggested perhaps using a different MS office language plugin. Ideally when a file is being rejected for import, I would like to have a feature/ application/ macro that would allow me to pinpoint the “offending” entry as this would also allow me to possibly amend this at the source (on the website) to prevent this from happening again by the same customer.  However if as it appears there is no straightforward solution, I guess the cut / paste value is the easiest way to deal with it.

       

    • #2006701

      However in a subsequent test I then changed data in one of the cells (the first cell after the dots) from 137WEB to 136WEB. This was a change common to modify this sheet prior to upload and after making this change, it was not highlighted.

      Does that mean that the changes highlighted have nothing to do with “real” changes but only relate to invisible formatting?

      Changing between a dot and a comma, as shown in the screenshot, tends to be extremely significant in a csv. It stands for “comma separated values”, after all. So,

      Price 20,.137WEB

      is 2 data cells so goes in two columns and

      Price 20..136WEB

      is a single data cell. As in single column.

      And everything else on that row after the change is then shifted one column…

      If your tool does this change silently, you need to fix or replace that tool.

      data that is being entered by customers (sometimes invisible when they may use a cut and paste from a browser helper and this way not using proper csv characters or as MN- suggested perhaps using a different MS office language plugin. Ideally when a file is being rejected for import, I would like to have a feature/ application/ macro that would allow me to pinpoint the “offending” entry

      Yeah, the problem is in part that there is no such thing as invalid character in a csv file in general. Any applications that I know of, accept only a small subset of the possible csv contents.

      You’d need to have the filter tool written for your application specifically, and this’ll need specific knowledge on what is allowed and what goes where, so that all the rows have the right number of columns and the like.

      This is possible, but to determine if it’ll be easy or difficult already requires more application-specific knowledge.

      I do get the warning each time when saving some formatting being incompatible and I click yes to save (as csv). I am not sure if someone saved this as xls first and then resaved as csv this would lead to the original failure?

      Oh, Excel does that if you do pretty much anything at all.

      See, Excel likes to save view state, as in which columns and rows were last visible if it doesn’t all fit on screen, and… And there’s just no place to save that information in a csv.

      Excel isn’t very good with csv files.

    Viewing 6 reply threads
    Reply To: Identify problem characters

    You can use BBCodes to format your content.
    Your account can't use all available BBCodes, they will be stripped before saving.

    Your information: