Gnux

Reputation: 3

Count occurrences of character in string variable without SQL

I'm looking for a way to count the number of times a character occurs in a string without using SQL.

I'm relatively new to RPGLE and I've created a test program that takes user input in character format, goes through validation, and converts the successful data to numeric. One of these inputs can be a positive or negative integer. When going through validation, I test for '-' being in the first position and use %CHECK to make sure that input is 0-9 or '-'. (ex. '-10' passes, '1-0' fails)

However, if the input has multiple occurrences of the '-' symbol, such as '-1-1-1-1', it passes the validation and crashes when the program tries to convert to numeric.

I'm aware that I can use Edit codes in my DDS to have the system handle this but I'm trying to learn different ways to allow my programs to control validation. In my research, I've found that TestN and Module/Convert/%Error are methods that can be used to ensure the output is numeric, but I can't test for this specific instance so I can give meaningful feedback.

Is there a way to count the occurrences of '-' so I can test it?

Since there seems to be some confusion as to my intent, I'll add another example. If I wanted to find out how many occurrences of the letter 'L' are in the word 'HELLO', what would be the best way to go about it?

Upvotes: 0

Views: 3409

Answers (4)

CRPence

Reputation: 1259

The SCAN operation code (OpCode) [as contrasted with the %SCAN built-in] comes close to counting the occurrences of a character in a string. That effect is enabled by specifying an Array as the Result-Field [or, in MI parlance, the receiver; for reference, doc snippets from both the RPG reference and the near-equivalent MI instruction follow]. A second step is required, however, to turn the array contents into a count.

http://www.ibm.com/support/knowledgecenter/api/content/ssw_ibm_i_71/rzasd/sc092508999.htm#zzscan

SCAN (Scan String)
Free-Form Syntax (not allowed ...
...
The SCAN operation scans a string (base string) ...
...

http://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzatk/SCAN.htm

Scan (SCAN)

The following source can be compiled as a callable bound RPGLE program [back to v5r1 and likely several releases prior]. When called, it accepts a 32-byte input string and a one-byte character value as arguments, so it is easily invoked without issues from the command line, e.g. CALL CHARCOUNT PARM('string specified' 'i') /* for which the displayed result is: DSPLY 3 */. The first argument is the string to search; the second argument is the character whose occurrences are counted. The output is produced by the DSPLY opcode, reusing the storage of the input string, which is overwritten with the edited numeric count of occurrences. Offered as-is, without further comment, except to state that the %lookup depends on a sequential search of the array:

 H dftactgrp(*no) actgrp(*CALLER)                         
 D CHARCOUNT       PR                  ExtPgm('CHARCOUNT')
 D  inpstring                    32A                      
 D  findchar                      1A                      
 D CHARCOUNT       PI                                     
 D  inpstring                    32A                      
 D  findchar                      1A                      
 D clen            C                   const(32)          
 D decary          S              5S00 dim(clen)          
 D i               S              2P00                    
 c     findchar      scan      inpstring     decary       
  /free                                                   
      // locate first zero-value array element            
      i = %lookup(0:decary) - 1 ;                         
      inpstring = %editc(i:'3') ;                         
      DSPLY inpstring ;                                   
      *INLR = *ON ;                                       
  /end-free                                               

Upvotes: 1

John Y

Reputation: 14559

In light of comments and edits, let's ignore any specific use case and really just answer "how do I count occurrences of a particular character in a string using RPG without embedded SQL?".

To address the comment

I was just curious as to whether or not there was something similar to a BIF that would return a result much easier than using a SCAN

The answer is: Currently there isn't any BIF which just gives you the result directly. That is, there isn't something that works like

occurrences = %COUNT(needle:haystack);

If you are on version 7.1 or later, the most elegant way is probably to utilize %SCANRPL, which is analogous to SQL's REPLACE. Assuming needle is a single character and haystack is a varying-length string, it would be something like

occurrences = %LEN(haystack) - %LEN(%SCANRPL(needle:'':haystack));

That is, see how much shorter haystack becomes if you remove all occurrences of needle. (You can generalize this to needles longer than one character by dividing this result by the length of needle.)
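As a minimal sketch of that idea, the expression can be wrapped in a small subprocedure. The name countOccurrences, the parameter sizes, and the test values are hypothetical, and the fully free-form source (**FREE, DCL-PROC) assumes a compiler level that supports it; on an older 7.1 compiler the same logic would need fixed-form definitions. The division generalizes the count to needles longer than one character.

```rpgle
**free
// Sketch only: countOccurrences is a hypothetical helper name.
ctl-opt dftactgrp(*no) actgrp(*caller);

dcl-s msg char(52);

// 'HELLO' shrinks from 5 to 3 characters when every 'L' is removed.
msg = 'L in HELLO: ' + %char(countOccurrences('L' : 'HELLO'));
dsply msg;
*inlr = *on;
return;

dcl-proc countOccurrences;
  dcl-pi *n int(10);
    needle   varchar(100)  const;
    haystack varchar(1000) const;
  end-pi;

  // An empty needle never matches; this also avoids dividing by zero.
  if %len(needle) = 0;
    return 0;
  endif;

  // Removing every needle shortens haystack by count * %len(needle).
  return %div(%len(haystack)
              - %len(%scanrpl(needle : '' : haystack))
            : %len(needle));
end-proc;
```

Note that this counts non-overlapping occurrences, which only matters for needles longer than one character.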

If you're on an earlier version, then your decision is probably between repeated %SCANs as you've done, or looping over haystack character by character. The former is probably a bit more efficient, especially if the "needle density" is very low, but the latter is simpler to code and arguably easier to read and maintain.
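For the character-by-character option, a rough sketch might look like the following. The variable names and initial values are made up for illustration, and the fixed-form D-specs with a /free block are used on the assumption that the compiler predates fully free-form declarations.

```rpgle
 D haystack        s             32a   varying inz('HELLO')
 D needle          s              1a   inz('L')
 D count           s             10i 0 inz(0)
 D i               s             10i 0
 D msg             s             52a
  /free
      // compare each character of haystack with needle
      for i = 1 to %len(haystack);
        if %subst(haystack : i : 1) = needle;
          count = count + 1;
        endif;
      endfor;
      msg = 'Occurrences: ' + %char(count);
      dsply msg;
      *inlr = *on;
  /end-free
```

Because haystack is declared VARYING, %len gives the current length of the data, so trailing blanks in a fixed-length field are not scanned.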

I'd like to note at this point that the "no SQL" constraint is a very artificial one, like one you would encounter on a school assignment. In a real-world system, it's unlikely that you would have access to RPG but not to RPG-with-embedded-SQL, so if there is an elegant, readable SQL solution, then for real-world use, there is no reason to rule it out.

Upvotes: 3

Buck Calabro

Reputation: 7648

RPG is a strongly typed language, so generally speaking, if you need a number, use a numeric field. Don't use a character field, then test and convert to a number. Display files (using DDS) were intended to make this task (ask the user to enter a number) easy.

That said, sometimes you don't control that input. You might be dealing with an EDI transaction or other file transfer where the other party puts text into a field and it's up to you to extract out the numeric portion. For cases like that, where you receive something like '-$45,907.12' you need to do more than count the number of minus signs.

IBM's Barbara Morris has posted the following code, which is an example of extracting the numeric value from a character field. It understands minus symbols, digit separators, decimal points and currency symbols.

<-----* prototype for /COPY file start here ----->

  *---------------------------------------------------------
  * getNum - procedure to read a number from a string
  *          and return a 30p 9 value
  * Parameters:
  *   I:      string   - character value of number
  *   I:(opt) decComma - decimal point and digit separator
  *   I:(opt) currency - currency symbol for monetary amounts
  * Returns:  packed(30,9)
  *
  * Parameter details:
  *   string:   the string may have 
  *             - blanks anywhere
  *             - sign anywhere
  *               accepted signs are: + - cr CR ()
  *               (see examples below)
  *             - digit separators anywhere
  *             - currency symbol anywhere
  *   decComma: if not passed, this defaults to 
  *                 decimal point   = '.'
  *                 digit separator = ','
  *   currency: if not passed, defaults to ' '
  *
  * Examples of input and output (x means parm not passed):
  *
  *        string         | dec | sep | cursym |   result         
  *        ---------------+-----+-----+--------+------------
  *          123          | x   | x   | x      |   123
  *          +123         | x   | x   | x      |   123
  *          123+         | x   | x   | x      |   123
  *          -123         | x   | x   | x      |   -123
  *          123-         | x   | x   | x      |   -123
  *          (123)        | x   | x   | x      |   -123
  *          12,3         | ,   | .   | x      |   12.3
  *          12.3         | x   | x   | x      |   12.3
  *          1,234,567.3  | x   | x   | x      |   1234567.3
  *          $1,234,567.3 | .   | ,   | $      |   1234567.3
  *          $1.234.567,3 | ,   | .   | $      |   1234567.3
  *          123.45CR     | x   | x   | x      |   -123.45
  *
  * Author: Barbara Morris, IBM Toronto Lab
  * Date:   March, 2000
  *---------------------------------------------------------
 D getNum          pr            30p 9
 D  string                      100a   const varying
 D  decComma                      2a   const options(*nopass)
 D  currency                      1a   const options(*nopass)

<-----* prototype for /COPY file end here ----->

<-----* test program start here----->

  * Copy prototype for procedure getNum
 D/COPY GETNUM_P

 D res             s                   like(getNum)
 D msg             s             52a

 C     *entry        plist
 C                   parm                    p                32
 C                   parm                    dc                2
 C                   parm                    c                 1

 C                   select
 C                   when      %parms = 1
 C                   eval      res = getNum(p)
 C                   when      %parms = 2
 C                   eval      res = getNum(p : dc)
 C                   when      %parms = 3
 C                   eval      res = getNum(p : dc : c)
 C                   endsl
 C                   eval      msg = '<' + %char(res) + '>'
 C     msg           dsply

 C                   return

<-----* test program end here----->

<-----* module GETNUM start here ----->

 H NOMAIN

  * Copy prototype for procedure getNum
 D/COPY GETNUM_P     

 p getNum          b
 D getNum          pi            30p 9
 D  string                      100a   const varying
 D  decComma                      2a   const options(*nopass)
 D  currency                      1a   const options(*nopass)

  * defaults for optional parameters
 D decPoint        s              1a   inz('.')
 D comma           s              1a   inz(',')
 D cursym          s              1a   inz(' ')
  * structure for building result
 D                 ds
 D result                        30s 9 inz(0)
 D resChars                      30a   overlay(result)
  * variables for gathering digit information
  * pNumPart points to the area currently being gathered 
  * (the integer part or the decimal part)
 D pNumPart        s               *
 D numPart         s             30a   varying based(pNumPart)
 D intPart         s             30a   varying inz('')
 D decPart         s             30a   varying inz('')
  * other variables
 D intStart        s             10i 0
 D decStart        s             10i 0
 D sign            s              1a   inz('+')
 D i               s             10i 0
 D len             s             10i 0
 D c               s              1a

  * override defaults if optional parameters were passed
 C                   if        %parms > 1
 C                   eval      decPoint = %subst(decComma : 1 : 1)
 C                   eval      comma    = %subst(decComma : 2 :1)
 C                   endif

 C                   if        %parms > 2
 C                   eval      cursym = currency
 C                   endif

  * initialization
 C                   eval      len = %len(string)
  * begin reading the integer part
 C                   eval      pNumPart = %addr(intPart)

  * loop through characters
 C                   do        len           i
 C                   eval      c = %subst(string : i : 1)

 C                   select
  * ignore blanks, digit separator, currency symbol
 C                   when      c = comma or c = *blank or c = cursym
 C                   iter
  * decimal point: switch to reading the decimal part
 C                   when      c = decPoint
 C                   eval      pNumPart = %addr(decPart)
 C                   iter
  * sign: remember the most recent sign
 C                   when      c = '+' or c = '-'
 C                   eval      sign = c
 C                   iter
  * more signs: cr, CR, () are all negative signs
 C                   when      c = 'C' or c = 'R' or
 C                             c = 'c' or c = 'r' or
 C                             c = '(' or c = ')'
 C                   eval      sign = '-'
 C                   iter
  * a digit: add it to the current build area     
 C                   other
 C                   eval      numPart = numPart + c

 C                   endsl
 C                   enddo

  * copy the digit strings into the correct positions in the
  * zoned variable, using the character overlay
 C                   eval      decStart = %len(result) - %decPos(result)
 C                                      + 1
 C                   eval      intStart = decStart - %len(intPart)
 C                   eval      %subst(resChars
 C                                  : intStart
 C                                  : %len(intPart))
 C                               = intPart
 C                   eval      %subst(resChars
 C                                  : decStart
 C                                  : %len(decPart))
 C                               = decPart
  * if the sign is negative, return a negative value
 C                   if        sign = '-'
 C                   return    - result
  * otherwise, return the positive value
 C                   else
 C                   return    result
 C                   endif
 p                 e

<-----* module GETNUM end here ----->

Upvotes: 1

Charles

Reputation: 23823

The %scan() BIF accepts a third parameter: the starting position. So you can do multiple scans, each starting just after the location of the last hit.
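A minimal sketch of that loop, assuming a compiler level that supports fully free-form source. The procedure name countChar, the parameter sizes, and the test string are hypothetical:

```rpgle
**free
// Sketch only: countChar is a hypothetical helper name.
ctl-opt dftactgrp(*no) actgrp(*caller);

dcl-s msg char(52);

msg = 'Minus signs: ' + %char(countChar('-' : '-1-1-1-1'));
dsply msg;
*inlr = *on;
return;

dcl-proc countChar;
  dcl-pi *n int(10);
    needle   char(1)       const;
    haystack varchar(1000) const;
  end-pi;
  dcl-s count int(10) inz(0);
  dcl-s hit   int(10);

  // Nothing to scan in an empty string.
  if %len(haystack) = 0;
    return 0;
  endif;

  hit = %scan(needle : haystack);
  dow hit > 0;
    count += 1;
    // Keep the start position within haystack: stop once the
    // last hit was the final character.
    if hit = %len(haystack);
      leave;
    endif;
    hit = %scan(needle : haystack : hit + 1);
  enddo;

  return count;
end-proc;
```

For the '-1-1-1-1' input from the question, the hits land at positions 1, 3, 5 and 7, so a validation routine could reject any input where the count is greater than 1.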

However, I'm not fond of this type of manual validation. From a performance standpoint, assuming most data is good, you're wasting cycles. More importantly, the test you've laid out, requiring '-' to be in the first position, means that ' -10' would fail.

I prefer to simply do the conversion and catch the exception if need be.

monitor;
  // precision 9 and 0 decimal places are illustrative; size them to fit myValue
  myValue = %dec(myString : 9 : 0);
on-error;
  // let the user know
endmon;

Lastly, TESTN is obsolete and should be avoided. It probably doesn't work the way you'd want anyway. For example (IIRC), '5A' passes the TESTN test.

The RPG manual itself has this to say about TESTN:
Free-Form Syntax - (not allowed - rather than testing the variable before using it, code the usage of the variable in a MONITOR group and handle any errors with ON-ERROR. See Error-Handling Operations.)

Upvotes: 1
