jamesMcKey

Reputation: 491

Parsing XML - string captured is front truncated. Swift

My parsing of XML documents generally works well, but I have noticed that the text between certain tags ("AbstractText") occasionally comes back truncated, and I have no idea why. It happens only here and there, not every time this tag appears. My parsing code is below, followed by an example of a truncated text.

 var abstractBool = false
 var abstract = ""

 func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
      switch elementName {
       case "AbstractText":
             abstractBool = true
       default:
             break
      }
}


 func parser(_ parser: XMLParser, foundCharacters string: String) {
       if abstractBool{
          abstract = string
       }
 }


 func parser(_ parser: XMLParser, didEndElement elementName: String, 
 namespaceURI: String?, qualifiedName qName: String?) {
      switch elementName {
           case "AbstractText":
           abstractBool = false
           default:
           break
      }
 }

Raw XML from remote server:

 <Abstract>
 <AbstractText>
 Protein kinase C (PKC) has been shown to activate the mammalian target of 
 rapamycin complex 1 (mTORC1) signaling pathway, a central hub in the 
 regulation of cell metabolism, growth and proliferation. However, the 
 mechanisms by which PKCs activate mTORC1 are still ambiguous. Our previous 
 study revealed that activation of classical PKCs (cPKC) results in the 
 perinuclear accumulation of cPKC and phospholipase D2 (PLD2) in recycling 
 endosomes in a PLD2-dependent manner. Here, we report that mTORC1 activation 
 by phorbol 12,13-myristate acetate (PMA) requires both classic, cPKC, and 
 novel PKC (nPKC) isoforms, specifically PKCη, acting through distinct 
 pathways. The translocation of mTOR to perinuclear lysosomes was detected 
 after treatment of PKC activators, which was not colocalized with PKCα- or 
 RAB11-positive endosomes and was not inhibited by PLD inhibitors. We found 
 that PKCη inhibition by siRNA or bisindolylmaleimide I effectively decreased 
 mTOR accumulation in lysosomes and its activity. Also, we identified that 
 PKCη plays a role upstream of the v-ATPase/Ragulator/Rag pathway in response 
 to PMA. These data provides a spatial aspect to the regulation of mTORC1 by 
 sustained activation of PKC, requiring co-ordinated activation of two 
 distinct elements, the perinuclear accumulation of cPKC- and PLD-containing 
 endosomes and the nPKC-dependent translation of of mTOR in the perinuclear 
 lysosomes. The close proximity of these two distinct compartments shown in 
 this study suggests the possibility that transcompartment signaling may be a 
 factor in the regulation of mTORC1 activity and also underscores the 
 importance of PKCη as a potential therapeutic target of mTORC-related 
 disorders.
 </AbstractText>
 </Abstract>

What I am able to extract is only the following portion of the string, truncated at the front:

 scompartment signaling may be a factor in the regulation of mTORC1 activity 
 and also underscores the importance of PKCη as a potential therapeutic target 
 of mTORC-related disorders.

Upvotes: 1

Views: 142

Answers (1)

Ramy Al Zuhouri

Reputation: 21966

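That's the normal behavior. The parser may deliver an element's text in several pieces, so foundCharacters can be called more than once for the same element. You have to concatenate these partial strings: the full text is the concatenation of everything received between the calls to didStartElement and didEndElement. Because your code assigns instead of appending, you only keep the last piece. Be sure to keep track of the current element in a variable so that in foundCharacters you know which element a partial string belongs to.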

 var abstractBool = false
 var abstract = ""

 func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
     switch elementName {
     case "AbstractText":
         abstractBool = true
         abstract = "" // if you read an XML document with multiple
                       // AbstractText elements, you need to reset
                       // the variable
     default:
         break // a Swift switch must be exhaustive
     }
 }


 func parser(_ parser: XMLParser, foundCharacters string: String) {
     if abstractBool {
         // Append rather than assign: this may be called several times per element
         abstract.append(string)
     }
 }


 func parser(_ parser: XMLParser, didEndElement elementName: String,
             namespaceURI: String?, qualifiedName qName: String?) {
     switch elementName {
     case "AbstractText":
         abstractBool = false
         // Here the variable 'abstract' contains the full text
     default:
         break
     }
 }
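
For reference, here is a minimal self-contained sketch of the same idea that can be run as a script. The class name AbstractCollector, its property names, and the short sample XML are made up for illustration and are not from the question. It only assumes that the full text of an element is whatever has been appended between didStartElement and didEndElement.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML // XMLParser lives in this module on Linux
#endif

// Hypothetical delegate for illustration; names are assumptions, not from the question.
final class AbstractCollector: NSObject, XMLParserDelegate {
    private var insideAbstract = false
    private var buffer = ""
    private(set) var abstracts: [String] = []

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "AbstractText" {
            insideAbstract = true
            buffer = "" // start a fresh buffer for each AbstractText element
        }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // May be called several times for one element, so append each piece.
        if insideAbstract { buffer += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "AbstractText" {
            insideAbstract = false
            // Only at this point is the text of the element complete.
            abstracts.append(buffer.trimmingCharacters(in: .whitespacesAndNewlines))
        }
    }
}

// Made-up sample document with two AbstractText elements.
let xml = "<Abstract><AbstractText>PKCη acts upstream of mTORC1.</AbstractText>"
        + "<AbstractText>Second abstract.</AbstractText></Abstract>"

let collector = AbstractCollector()
let xmlParser = XMLParser(data: Data(xml.utf8))
xmlParser.delegate = collector // the delegate is not retained, so keep 'collector' alive
let ok = xmlParser.parse()
print(ok, collector.abstracts)
```

Storing each finished string in an array at didEndElement, rather than reading a single variable later, avoids losing earlier abstracts when the document contains more than one AbstractText element.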

Upvotes: 1
