• Extracting text from the middle of the Cell

    #494158

    I have been the recipient of a lot of help over the years, and I thank everyone that has helped. I would like to give back with something.

    I was given a list of chemicals and the Chemical Abstract Service Number (CAS No.) for each item. Unfortunately, the list was given as a simple number. Two examples would be
    • Formaldehyde 50000
    • Ammonia 7664417

    The normal way of showing a CAS No. is with some dashes:
    • Formaldehyde 50-00-0
    • Ammonia 7664-41-7

    I was looking for a way to extract this data into the typical format. It is complicated by the fact that the first set of numbers can be from two to six digits. A friend of mine came up with a really elegant way of doing this.

    Note that the chemical name and the CAS No. are in two separate cells.

    I put the CAS numbers into column E. Then I used the following formula, using the LEFT, RIGHT, and LEN functions:
    =LEFT(E2,LEN(E2)-3)&"-"&LEFT(RIGHT(E2,3),2)&"-"&RIGHT(E2,1)

    As this is sort of a tutorial for those unfamiliar with ranges, I will detail what this does. It can be quite useful to extract information from inside the contents of a cell. It also shows how formulas can be nested.

    First, anything between two quotation marks is inserted into the cell as literal text (here, the dashes). An ampersand (&) must be used to join each part of the result to the next.

    Then, starting at the end of the formula:
    RIGHT(E2,1) starts at the right end of cell E2 and takes a single character, which is the last digit of the number.

    LEFT(RIGHT(E2,3),2) is really elegant. The innermost function in the nest is RIGHT(E2,3), which extracts the last three characters of E2. Then the LEFT function comes into play, keeping only the first two of those three characters. That gives the middle pair of digits.

    LEFT(E2,LEN(E2)-3) takes into account the varying lengths of the strings. LEN(E2)-3 counts the number of characters in E2 and subtracts 3 from that number, which excludes the final three characters described above. The LEFT function then takes that many characters from the beginning of the string, so the first group can be any length.
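
    For anyone who wants to check the logic outside Excel, the three pieces described above can be sketched in Python. This is only an illustration: the helper names left, right, and format_cas are made up here (they are not Excel or library functions), and the sketch assumes the input is the undashed CAS number as a string of at least three digits.

```python
def left(text: str, n: int) -> str:
    """Roughly mimic Excel's LEFT: the first n characters."""
    return text[:n] if n > 0 else ""


def right(text: str, n: int) -> str:
    """Roughly mimic Excel's RIGHT: the last n characters."""
    return text[-n:] if n > 0 else ""


def format_cas(raw: str) -> str:
    """Rebuild a dashed CAS number from its undashed digits."""
    last_digit = right(raw, 1)          # RIGHT(E2,1)
    middle = left(right(raw, 3), 2)     # LEFT(RIGHT(E2,3),2)
    first = left(raw, len(raw) - 3)     # LEFT(E2,LEN(E2)-3)
    # The "&" operator in the formula corresponds to "+" on strings here.
    return first + "-" + middle + "-" + last_digit


print(format_cas("50000"))    # 50-00-0
print(format_cas("7664417"))  # 7664-41-7
```

    Because the first group is whatever remains after the last three characters are set aside, the same code handles a two-digit or a six-digit first group without any change.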

    • #1447680

      If you want to keep it as a number, you could also just use a custom number format to display the number in the CAS format: 0-00-0

      Steve
      PS a shorter alternative formula for generating the text string is:
      =REPLACE(REPLACE(E2,LEN(E2),0,"-"),LEN(E2)-2,0,"-")

      PPS. An even shorter one skips the insertion and applies the custom format through the TEXT function:
      =TEXT(E2,"0-00-0")
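
      The nested REPLACE formula works by inserting rather than extracting: with a length of 0, REPLACE removes nothing and simply inserts the new text before the given position. A rough Python sketch of that idea follows. The names insert_at and format_cas_by_insertion are hypothetical, chosen only for this illustration, and the input is assumed to be the undashed digits as a string.

```python
def insert_at(text: str, position: int, piece: str) -> str:
    """Insert piece before the 1-based position, like REPLACE(text, position, 0, piece)."""
    index = position - 1
    return text[:index] + piece + text[index:]


def format_cas_by_insertion(raw: str) -> str:
    """Add the two dashes by insertion, mirroring the nested REPLACE formula."""
    n = len(raw)                                  # LEN(E2), always the original length
    with_last_dash = insert_at(raw, n, "-")       # inner REPLACE: dash before the last digit
    return insert_at(with_last_dash, n - 2, "-")  # outer REPLACE: dash before the middle pair


print(format_cas_by_insertion("50000"))    # 50-00-0
print(format_cas_by_insertion("7664417"))  # 7664-41-7
```

      Both positions are computed from the original length, which is why the dash nearest the end is inserted first: it does not shift the position of the earlier one.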

      • #1447901

        That last one is really nice. Thank you.

    • #1447705

      Very nice, JohnD1. You could turn your formula into a user-defined function, which would make it very easy to use:

      Enter the formula =CAS(B1) in cell D1, then copy it down,
      where A1 is the chemical name and B1 is the unformatted CAS number.

      Code:
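      Public Function CAS(rng As Range) As String
      ' Recalculate this function whenever the worksheet recalculates
      Application.Volatile
      ' Everything but the last three characters, then the middle pair, then the last digit
      CAS = Left(rng, Len(rng) - 3) & "-" & Left(Right(rng, 3), 2) & "-" & Right(rng, 1)
      End Function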
      