Reputation: 3
I'm looking for a way to count the number of times a character occurs in a string without using SQL.
I'm relatively new to RPGLE and I've created a test program that takes user input in character format, goes through validation, and converts the successful data to numeric. One of these inputs can be a positive or negative integer. When going through validation, I test for '-' being in the first position and use %CHECK to make sure that input is 0-9 or '-'. (ex. '-10' passes, '1-0' fails)
However, if the input has multiple occurrences of the '-' symbol, such as '-1-1-1-1', it passes the validation and crashes when the program tries to convert to numeric.
I'm aware that I can use Edit codes in my DDS to have the system handle this but I'm trying to learn different ways to allow my programs to control validation. In my research, I've found that TestN and Module/Convert/%Error are methods that can be used to ensure the output is numeric, but I can't test for this specific instance so I can give meaningful feedback.
Is there a way to count the occurrences of '-' so I can test it?
Since there seems to be some confusion as to my intent, I'll add another example. If I wanted to find out how many occurrences of the letter 'L' are in the word 'HELLO', what would be the best way to go about it?
Upvotes: 0
Views: 3409
Reputation: 1259
The SCAN operation code (as opposed to the %SCAN built-in function) comes close to counting the occurrences of a character in a string. If you specify an array as the result field (the "receiver" in MI parlance), SCAN stores the position of each occurrence in successive array elements. A second step is still required to turn those positions into a count. For reference, here are doc snippets from both the RPG reference and the near-equivalent MI instruction.
http://www.ibm.com/support/knowledgecenter/api/content/ssw_ibm_i_71/rzasd/sc092508999.htm#zzscan
SCAN (Scan String)
Free-Form Syntax (not allowed ...
...
The SCAN operation scans a string (base string) ...
...
http://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzatk/SCAN.htm
Scan (SCAN)
The following source compiles as a callable bound RPGLE program (back to v5r1 and likely several releases prior). It accepts a 32-byte input string and a one-byte character value as arguments, so it is easily invoked from the command line, e.g. CALL CHARCOUNT PARM('string specified' 'i'), for which the displayed result is DSPLY 3. The first argument is the string to search; the second is the character whose occurrences are counted. The program overwrites the storage of the input string with the edited numeric count and shows it via the DSPLY opcode. Offered as-is, without further comment, except to state that the %lookup depends on a sequential search of the array:
H dftactgrp(*no) actgrp(*CALLER)
D CHARCOUNT PR ExtPgm('CHARCOUNT')
D inpstring 32A
D findchar 1A
D CHARCOUNT PI
D inpstring 32A
D findchar 1A
D clen C const(32)
D decary S 5S00 dim(clen)
D i S 2P00
c findchar scan inpstring decary
/free
// locate first zero-value array element
i = %lookup(0:decary) - 1 ;
inpstring = %editc(i:'3') ;
DSPLY inpstring ;
*INLR = *ON ;
/end-free
Upvotes: 1
Reputation: 14559
In light of comments and edits, let's ignore any specific use case and really just answer "how do I count occurrences of a particular character in a string using RPG without embedded SQL?".
To address the comment:
"I was just curious as to whether or not there was something similar to a BIF that would return a result much easier than using a SCAN"
The answer is: Currently there isn't any BIF which just gives you the result directly. That is, there isn't something that works like
occurrences = %COUNT(needle:haystack);
If you are on version 7.1 or later, the most elegant way is probably to utilize %SCANRPL, which is analogous to SQL's REPLACE. Assuming needle is a single character and haystack is a varying-length string, it would be something like
occurrences = %LEN(haystack) - %LEN(%SCANRPL(needle:'':haystack));
That is, see how much shorter haystack becomes if you remove all occurrences of needle. (You can generalize this to needles longer than one character by dividing this result by the length of needle.)
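To make the %SCANRPL idea concrete, here is a minimal sketch of it wrapped in a procedure. The name countOccurrences, the parameter sizes, and the test string are my own placeholders, and the source is written in fully free-form (`**free`) style, which assumes a compiler newer than plain 7.1. Treat it as an illustration rather than a tested implementation.

```rpgle
**free
// Sketch only: assumes a compiler that accepts fully free-form source.
ctl-opt dftactgrp(*no) actgrp(*caller);

dcl-s msg char(52);

// countOccurrences is a hypothetical helper name, not a built-in.
msg = 'L in HELLO: ' + %char(countOccurrences('HELLO' : 'L'));
dsply msg;
*inlr = *on;
return;

dcl-proc countOccurrences;
  dcl-pi *n int(10);
    haystack varchar(256) const;
    needle   varchar(16) const;
  end-pi;

  dcl-s removed int(10);

  // Nothing to count if either string is empty; this also avoids
  // dividing by a needle length of zero.
  if %len(haystack) = 0 or %len(needle) = 0;
    return 0;
  endif;

  // Number of characters that disappear when every occurrence
  // of needle is replaced by the empty string.
  removed = %len(haystack) - %len(%scanrpl(needle : '' : haystack));

  // Each occurrence accounts for %len(needle) removed characters.
  return %div(removed : %len(needle));
end-proc;
```

For 'HELLO' and 'L' the count is 2. Two things to keep in mind: occurrences are counted without overlap (so 'AA' is found once in 'AAA'), and if you pass a fixed-length field its trailing blanks are part of haystack, which matters when needle is a blank.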
If you're on an earlier version, then your decision is probably between repeated %SCANs as you've done, or looping over haystack character by character. The former is probably a bit more efficient, especially if the "needle density" is very low, but the latter is simpler to code and arguably easier to read and maintain.
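As a rough sketch of the character-by-character option, something like the following should do. The procedure name countCharLoop and the parameter sizes are placeholders of mine. I've used free-form declarations for readability, which need a recent compiler; on the older releases this paragraph is about, the same loop works inside /free with the declarations rewritten as fixed-form D-specs.

```rpgle
**free
// Sketch only: free-form declarations assume a recent compiler.
ctl-opt dftactgrp(*no) actgrp(*caller);

dcl-s msg char(52);

// countCharLoop is a hypothetical helper name.
msg = 'L in HELLO: ' + %char(countCharLoop('HELLO' : 'L'));
dsply msg;
*inlr = *on;
return;

dcl-proc countCharLoop;
  dcl-pi *n int(10);
    haystack varchar(256) const;
    needle   char(1) const;
  end-pi;

  dcl-s count int(10) inz(0);
  dcl-s i     int(10);

  // Visit every position once and compare it with needle.
  // The loop body never runs when haystack is empty.
  for i = 1 to %len(haystack);
    if %subst(haystack : i : 1) = needle;
      count += 1;
    endif;
  endfor;

  return count;
end-proc;
```

The loop always touches every character, which is the cost of its simplicity; for 'HELLO' and 'L' it returns 2.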
I'd like to note at this point that the "no SQL" constraint is a very artificial one, like one you would encounter on a school assignment. In a real-world system, it's unlikely that you would have access to RPG but not to RPG-with-embedded-SQL, so if there is an elegant, readable SQL solution, then for real-world use, there is no reason to rule it out.
Upvotes: 3
Reputation: 7648
RPG is a strongly typed language, so generally speaking, if you need a number, use a numeric field. Don't use a character field, then test and convert to a number. Display files (using DDS) were intended to make this task (asking the user to enter a number) easy.
That said, sometimes you don't control that input. You might be dealing with an EDI transaction or other file transfer where the other party puts text into a field and it's up to you to extract the numeric portion. For cases like that, where you receive something like '-$45,907.12', you need to do more than count the number of minus signs.
IBM's Barbara Morris has posted the following code as an example of extracting the numeric value from a character field. It understands signs, digit separators, decimal points and currency symbols.
<-----* prototype for /COPY file start here ----->
*---------------------------------------------------------
* getNum - procedure to read a number from a string
* and return a 30p 9 value
* Parameters:
* I: string - character value of number
* I:(opt) decComma - decimal point and digit separator
* I:(opt) currency - currency symbol for monetary amounts
* Returns: packed(30,9)
*
* Parameter details:
* string: the string may have
* - blanks anywhere
* - sign anywhere
* accepted signs are: + - cr CR ()
* (see examples below)
* - digit separators anywhere
* - currency symbol anywhere
* decComma: if not passed, this defaults to
* decimal point = '.'
* digit separator = ','
* currency: if not passed, defaults to ' '
*
* Examples of input and output (x means parm not passed):
*
* string | dec | sep | cursym | result
* ---------------+-----+-----+--------+------------
* 123 | x | x | x | 123
* +123 | x | x | x | 123
* 123+ | x | x | x | 123
* -123 | x | x | x | -123
* 123- | x | x | x | -123
* (123) | x | x | x | -123
* 12,3 | , | . | x | 12.3
* 12.3 | x | x | x | 12.3
* 1,234,567.3 | x | x | x | 1234567.3
* $1,234,567.3 | . | , | $ | 1234567.3
* $1.234.567,3 | , | . | $ | 1234567.3
* 123.45CR | x | x | x | -123.45
*
* Author: Barbara Morris, IBM Toronto Lab
* Date: March, 2000
*---------------------------------------------------------
D getNum pr 30p 9
D string 100a const varying
D decComma 2a const options(*nopass)
D currency 1a const options(*nopass)
<-----* prototype for /COPY file end here ----->
<-----* test program start here----->
* Copy prototype for procedure getNum
D/COPY GETNUM_P
D res s like(getNum)
D msg s 52a
C *entry plist
C parm p 32
C parm dc 2
C parm c 1
C select
C when %parms = 1
C eval res = getNum(p)
C when %parms = 2
C eval res = getNum(p : dc)
C when %parms = 3
C eval res = getNum(p : dc : c)
C endsl
C eval msg = '<' + %char(res) + '>'
C msg dsply
C return
<-----* test program end here----->
<-----* module GETNUM start here ----->
H NOMAIN
* Copy prototype for procedure getNum
D/COPY GETNUM_P
p getNum b
D getNum pi 30p 9
D string 100a const varying
D decComma 2a const options(*nopass)
D currency 1a const options(*nopass)
* defaults for optional parameters
D decPoint s 1a inz('.')
D comma s 1a inz(',')
D cursym s 1a inz(' ')
* structure for building result
D ds
D result 30s 9 inz(0)
D resChars 30a overlay(result)
* variables for gathering digit information
* pNumPart points to the area currently being gathered
* (the integer part or the decimal part)
D pNumPart s *
D numPart s 30a varying based(pNumPart)
D intPart s 30a varying inz('')
D decPart s 30a varying inz('')
* other variables
D intStart s 10i 0
D decStart s 10i 0
D sign s 1a inz('+')
D i s 10i 0
D len s 10i 0
D c s 1a
* override defaults if optional parameters were passed
C if %parms > 1
C eval decPoint = %subst(decComma : 1 : 1)
C eval comma = %subst(decComma : 2 :1)
C endif
C if %parms > 2
C eval cursym = currency
C endif
* initialization
C eval len = %len(string)
* begin reading the integer part
C eval pNumPart = %addr(intPart)
* loop through characters
C do len i
C eval c = %subst(string : i : 1)
C select
* ignore blanks, digit separator, currency symbol
C when c = comma or c = *blank or c = cursym
C iter
* decimal point: switch to reading the decimal part
C when c = decPoint
C eval pNumPart = %addr(decPart)
C iter
* sign: remember the most recent sign
C when c = '+' or c = '-'
C eval sign = c
C iter
* more signs: cr, CR, () are all negative signs
C when c = 'C' or c = 'R' or
C c = 'c' or c = 'r' or
C c = '(' or c = ')'
C eval sign = '-'
C iter
* a digit: add it to the current build area
C other
C eval numPart = numPart + c
C endsl
C enddo
* copy the digit strings into the correct positions in the
* zoned variable, using the character overlay
C eval decStart = %len(result) - %decPos(result)
C + 1
C eval intStart = decStart - %len(intPart)
C eval %subst(resChars
C : intStart
C : %len(intPart))
C = intPart
C eval %subst(resChars
C : decStart
C : %len(decPart))
C = decPart
* if the sign is negative, return a negative value
C if sign = '-'
C return - result
* otherwise, return the positive value
C else
C return result
C endif
p e
<-----* module GETNUM end here ----->
Upvotes: 1
Reputation: 23823
The %scan() BIF accepts a third parameter, the starting position. So you can do multiple scans, each one starting just past the location of the last hit.
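A minimal sketch of that repeated-scan loop might look like this. The procedure name countChar and the parameter sizes are placeholders of mine, and the `**free` source style assumes a recent compiler. The guards are there because the start position passed to %scan has to fall within the string.

```rpgle
**free
// Sketch only: assumes a compiler that accepts fully free-form source.
ctl-opt dftactgrp(*no) actgrp(*caller);

dcl-s msg char(52);

// countChar is a hypothetical helper name.
msg = 'Minus signs: ' + %char(countChar('-1-1-1-1' : '-'));
dsply msg;
*inlr = *on;
return;

dcl-proc countChar;
  dcl-pi *n int(10);
    haystack varchar(256) const;
    needle   char(1) const;
  end-pi;

  dcl-s count int(10) inz(0);
  dcl-s pos   int(10);

  // An empty string cannot contain anything.
  if %len(haystack) = 0;
    return 0;
  endif;

  pos = %scan(needle : haystack);
  dow pos > 0;
    count += 1;
    // Stop if the hit was the last character, so the next
    // start position never runs past the end of the string.
    if pos >= %len(haystack);
      leave;
    endif;
    // Resume the search one position after the last hit.
    pos = %scan(needle : haystack : pos + 1);
  enddo;

  return count;
end-proc;
```

For the '-1-1-1-1' input from the question this counts 4, so a validation routine could reject anything with a count greater than 1 and say why.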
However, I'm not fond of this type of manual validation. From a performance standpoint, assuming most data is good, you're wasting cycles. More importantly, the test you've laid out, requiring '-' to be in the first position, means that ' -10' would fail.
I prefer to simply do the conversion and catch the exception if need be.
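monitor;
  // %dec requires precision and decimal places for character input;
  // 15 and 0 are example values, size them to fit myValue
  myValue = %dec(myString : 15 : 0);
on-error;
  // let the user know
endmon;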
Lastly, TESTN is obsolete and should be avoided. It probably doesn't work the way you'd want anyway. For example (IIRC), '5A' passes the TESTN test.
The RPG manual itself has this to say about TESTN:
Free-Form Syntax - (not allowed - rather than testing the variable before using it, code the usage of the variable in a MONITOR group and handle any errors with ON-ERROR. See Error-Handling Operations.)
Upvotes: 1