user2062207
user2062207

Reputation: 955

How to count up elements in excel

So I have a column called chemical formula for like 40,000 entries, and what I want to be able to do is count up how many elements are contained in the chemical formula. So for example:-

EXACT_MASS  FORMULA
626.491026  C40H66O5
275.173274  C13H25NO5

For this, I need some kind of formula that will return with the result of

C  H  O
40 66 5
13 25 5

all as separate columns for the different elements and in rows for the different entries. Is there a formula that can do this?

Upvotes: 1

Views: 1519

Answers (3)

Jamie Dunstan
Jamie Dunstan

Reputation: 3765

You could make your own formula.

Open the VBA editor with ALT and F11 and insert a new module.

Add a reference to Microsoft VBScript Regular Expressions 5.5 by clicking Tools, then references.

Now add the following code:

Public Function FormulaSplit(theFormula As String, theLetter As String) As String

    Dim RE As Object
    Set RE = CreateObject("VBScript.RegExp")
    With RE
        .Global = True
        .MultiLine = False
        .IgnoreCase = False
        .Pattern = "[A-Z]{1}[a-z]?"
    End With

    Dim Matches As Object
    Set Matches = RE.Execute(theFormula)

    Dim TheCollection As Collection
    Set TheCollection = New Collection

    Dim i As Integer
    Dim Match As Object
    For i = (Matches.Count - 1) To 0 Step -1
        Set Match = Matches.Item(i)
        TheCollection.Add Mid(theFormula, Match.FirstIndex + (Len(Match.Value) + 1)), UCase(Trim(Match.Value))
        theFormula = Left(theFormula, Match.FirstIndex)
    Next

    FormulaSplit = "Not found"
    On Error Resume Next
    FormulaSplit = TheCollection.Item(UCase(Trim(theLetter)))
    On Error GoTo 0

    If FormulaSplit = "" Then
        FormulaSplit = "1"
    End If

    Set RE = Nothing
    Set Matches = Nothing
    Set Match = Nothing
    Set TheCollection = Nothing

End Function

Usage:

  • FormulaSplit("C40H66O5", "H") would return 66.
  • FormulaSplit("C40H66O5", "O") would return 5.
  • FormulaSplit("C40H66O5", "blah") would return "Not found".

You can use this formula directly in your workbook. Example of FormulaSplit usage in Workbook

Upvotes: 4

Rick Hitchcock
Rick Hitchcock

Reputation: 35670

Even though this has an excellent (and accepted) VBA solution, I couldn't resist the challenge to do this without using VBA.

I posted a solution earlier, which wouldn't work in all cases. This new code should always work:

=MAX(
   IFERROR(IF(FIND(C$1&ROW($1:$99),$B2),ROW($1:$99),0),0),
   IFERROR(IF(FIND(C$1&CHAR(ROW($65:$90)),$B2&"Z"),1,0),0)
 )

Enter as an array formula: Ctrl + Shift + Enter

Output:

enter image description here

The formula outputs 0 when not found, and I simply used conditional formatting to turn zeroes gray.

How it works

This part of the formula looks for the element, followed by a number between 1 and 99. If found, the number of atoms is returned. Otherwise, 0 is returned. The results are stored in an array:

IFERROR(IF(FIND(C$1&ROW($1:$99),$B2),ROW($1:$99),0),0)

In the case of C13H25NO5, a search for "C" returns this array:

{1,0,0,0,0,0,0,0,0,0,0,0,13,0,0,0,...,0}

1 is the first array element, because C1 is a match. 13 is the thirteenth array element, and that's what we're interested in.

The next part of the formula looks for the element, followed by an uppercase letter, which indicates a new element. (The letters A through Z are characters 65 through 90.) If found, the number 1 is returned. Otherwise, 0 is returned. The results are stored in an array:

IFERROR(IF(FIND(C$1&CHAR(ROW($65:$90)),$B2&"Z"),1,0),0)

"Z" is appended to the chemical formula, so that a match will be found when its last element has no number. (For example, "H2O".) There is no element "Z" in the Periodic Table, so this won't cause a problem.

In the case of C13H25NO5, a search for "N" returns this array:

{0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0}

1 is the 15th element in the array. That's because it found the letters "NO", and O is the 15th letter of the alphabet.

Taking the maximum value from each array gives us the number of atoms as desired.

Upvotes: 1

Dave
Dave

Reputation: 1643

I've had a stab at doing this in a formula nad come up with the following:

=IFERROR((MID($C18,FIND(D17,$C18)+1,2))*1,IFERROR((MID($C18,FIND(D17,$C18)+1,1))*1,IFERROR(IF(FIND(D17,$C18)>0,1),0)))

It's not very neat and would have to be expanded further if any of your elements are going to appear more than 99 times - I also used a random placement on my worksheet so the titles H,C and O are in row 17. I would personally go with Jamie's answer but just wanted to try this to see if I could do it in a formula possible and figured it was worth sharing just as another perspective.

Upvotes: 3

Related Questions