Find a substring in a string in COBOL

daniegarcia254 · Oct 14, 2014 · Viewed 12.2k times

My problem is this: given a field that I read from a file, I need to check whether it contains or matches another string.

In other words, I want to find all the records in a file whose field

           BBRADD PIC X(30)

matches or contains a string entered from the keyboard.

I'm fairly sure this can be solved with the INSPECT statement, and I've tried something like this in my code:

           READ BRANCHFILE NEXT RECORD 
             AT END SET EndOfFile TO TRUE
           END-READ.
           PERFORM UNTIL EndOfFile
               INSPECT BBRADD 
                 TALLYING CONT for CHARACTERS
                   BEFORE INITIAL CITY
               IF CONT>1
                   DISPLAY " BRANCH CODE    :" BBRID
                   DISPLAY " BRANCH NAME    :" BBRNAME
                   DISPLAY " BRANCH ADDRESS :" BBRADD
                   DISPLAY " PHONE          :" BBRPH
                   DISPLAY " E-MAIL         :" BEMAIL
                   DISPLAY " MANAGER NAME   :" BMGRNAME
                   DISPLAY " ------------------"
                   DISPLAY " ------------------"
               END-IF
               READ BRANCHFILE NEXT RECORD 
                   AT END SET EndOfFile TO TRUE
               END-READ
               MOVE 0 TO CONT
           END-PERFORM.

Here CITY is the variable I enter from the keyboard.

Does anyone know how to find a "substring" in a "string"?

For example, if I enter "Zaragoza", my program has to print all the records in the file whose BBRADD field contains "Zaragoza".

       01  BRANCHREC.
           88  EndOfFile  VALUE HIGH-VALUE.
           02  BBRID      PIC X(6).
           02  BBRNAME    PIC X(15).
           02  BBRADD     PIC X(30).
           02  BBRPH      PIC X(10).
           02  BEMAIL     PIC X(20).
           02  BMGRNAME   PIC X(25).

Answer

Bill Woodger · Oct 14, 2014

You would need to set CONT to zero before the INSPECT, every time.

INSPECT ... TALLYING never resets CONT; it adds to whatever value CONT holds when the INSPECT starts. So after you find your first match, every subsequent record will look like it has CITY in it.

It may initially seem odd that it works that way, but if it didn't, you couldn't accumulate a count across several INSPECT statements on the occasions when that is what you want.

Ah, looking a little closer, you are setting CONT to an initial value; you are just doing it in an unexpected place. If it needs to be zero, set it to zero immediately before the point where it must be zero. That is much easier to find, and harder for someone changing the program in the future to make a mess of.
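
A minimal sketch of why the reset matters and where to put it. The program name, the two address fields and their contents are made up for the demonstration, and FOR ALL with a literal is used only to keep it short:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. TALLYDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CONT    PIC 9(4) BINARY VALUE ZERO.
       01  ADDR-1  PIC X(30) VALUE "CALLE MAYOR 1 SEVILLA".
       01  ADDR-2  PIC X(30) VALUE "GRAN VIA 2 MADRID".
       PROCEDURE DIVISION.
       MAIN-LOGIC.
      *> No reset between the two: the second INSPECT adds to the
      *> count left behind by the first, so ADDR-2 looks like a hit.
           INSPECT ADDR-1 TALLYING CONT FOR ALL "SEVILLA"
           INSPECT ADDR-2 TALLYING CONT FOR ALL "SEVILLA"
           DISPLAY "WITHOUT RESET: " CONT
      *> Reset immediately before the INSPECT: the count is right.
           MOVE ZERO TO CONT
           INSPECT ADDR-2 TALLYING CONT FOR ALL "SEVILLA"
           DISPLAY "WITH RESET   : " CONT
           STOP RUN.

The first DISPLAY reports a count of one even though ADDR-2 does not contain SEVILLA; the second reports zero.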

However, you have another problem. Let's say CITY is PIC X(20). The user enters SEVILLA and your INSPECT will now search for SEVILLA followed by 13 spaces. Ideally you'd want SEVILLA followed by one space.

You need to be able to test for a value that the user has entered, with a trailing blank, but not more.

The current popular way to do this is with reference-modification.

You need to take your user-input, find out how many trailing spaces it contains, calculate how long the data is, add one for the trailing blank, and hold that value in a field (preferably a BINARY field).
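
One possible way to get that value, as a sketch rather than the only approach: reverse the field so that its trailing spaces become leading spaces, and count those. The name trailing-spaces and the program name are invented here, and the cap for a completely full field and the check for empty input are additions that the description above does not cover:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CITYLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CITY                     PIC X(20).
       01  trailing-spaces          PIC 9(4) BINARY.
       01  length-of-data-plus-one  PIC 9(4) BINARY.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           ACCEPT CITY
      *> Leading spaces of the reversed field are the trailing
      *> spaces of the original field.
           MOVE ZERO TO trailing-spaces
           INSPECT FUNCTION REVERSE ( CITY )
             TALLYING trailing-spaces FOR LEADING SPACE
           COMPUTE length-of-data-plus-one =
               FUNCTION LENGTH ( CITY ) - trailing-spaces + 1
      *> A completely full field has no trailing blank to include.
           IF length-of-data-plus-one > FUNCTION LENGTH ( CITY )
               MOVE FUNCTION LENGTH ( CITY )
                 TO length-of-data-plus-one
           END-IF
      *> Empty input would leave a search for a lone space.
           IF CITY = SPACES
               DISPLAY "NO CITY ENTERED"
           ELSE
               DISPLAY "SEARCH KEY: ["
                       CITY ( 1 : length-of-data-plus-one ) "]"
           END-IF
           STOP RUN.

With SEVILLA entered into the 20-byte field, trailing-spaces is 13 and length-of-data-plus-one is 8, so the search key is SEVILLA followed by one space.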

Then your INSPECT can look like this:

           INSPECT BBRADD 
             TALLYING CONT for CHARACTERS
               BEFORE INITIAL CITY ( 1 : length-of-data-plus-one )

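However, then you have a problem if SEVILLA is actually at the start of the field: the count of characters before it is then zero, so the test fails even though the city is there. (And when CITY is not found at all, the BEFORE phrase is ignored and every character in the field is counted, so a non-zero count does not prove a match.)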

So you make a small change, not to count characters which appear before it, but to count occurrences of it.

           INSPECT BBRADD 
             TALLYING CONT for ALL
                           CITY ( 1 : length-of-data-plus-one )
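
Put together with the reset, the test on CONT changes as well, since any count above zero now means a match. A small self-contained sketch, with a hard-coded address and city standing in for the file record and the keyboard input (the literal values, the hard-coded length and the program name are assumptions):

       IDENTIFICATION DIVISION.
       PROGRAM-ID. FINDCITY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  BBRADD                   PIC X(30)
                                    VALUE "AVDA GOYA 12 ZARAGOZA".
       01  CITY                     PIC X(20) VALUE "ZARAGOZA".
       01  CONT                     PIC 9(4) BINARY.
      *> 8 characters of data plus one trailing blank.
       01  length-of-data-plus-one  PIC 9(4) BINARY VALUE 9.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           MOVE ZERO TO CONT
           INSPECT BBRADD
             TALLYING CONT FOR ALL
                           CITY ( 1 : length-of-data-plus-one )
      *> Any occurrence is a match, so test for > 0, not > 1.
           IF CONT > ZERO
               DISPLAY " BRANCH ADDRESS :" BBRADD
           ELSE
               DISPLAY " NO MATCH"
           END-IF
           STOP RUN.

Two limitations of this sketch: the comparison is case-sensitive, and because the key ends in a blank, a city that occupies the very last bytes of BBRADD with no space after it would not be counted.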

Many people will instead code a PERFORM loop with reference-modification and do the test that way, which means coding the termination logic yourself; the final version of the INSPECT above takes care of that for you. For learning purposes it would be good to do it both ways.
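
For comparison, a sketch of the PERFORM version. Every name other than BBRADD and CITY is invented for the example, city-length is hard-coded where a real program would calculate it, and the key is compared without the trailing blank to keep the loop easy to follow:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPFIND.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  BBRADD       PIC X(30) VALUE "AVDA GOYA 12 ZARAGOZA".
       01  CITY         PIC X(20) VALUE "ZARAGOZA".
       01  city-length  PIC 9(4) BINARY VALUE 8.
       01  start-pos    PIC 9(4) BINARY.
       01  last-start   PIC 9(4) BINARY.
       01  FILLER       PIC X VALUE "N".
           88  city-found      VALUE "Y".
           88  city-not-found  VALUE "N".
       PROCEDURE DIVISION.
       MAIN-LOGIC.
      *> Last position at which the key could still fit in BBRADD.
           COMPUTE last-start =
               FUNCTION LENGTH ( BBRADD ) - city-length + 1
           SET city-not-found TO TRUE
      *> The termination logic that INSPECT would otherwise supply:
      *> stop on the first hit, or when the key no longer fits.
           PERFORM VARYING start-pos FROM 1 BY 1
                   UNTIL start-pos > last-start
                      OR city-found
               IF BBRADD ( start-pos : city-length )
                = CITY ( 1 : city-length )
                   SET city-found TO TRUE
               END-IF
           END-PERFORM
           IF city-found
               DISPLAY "FOUND"
           ELSE
               DISPLAY "NOT FOUND"
           END-IF
           STOP RUN.

To mirror the INSPECT exactly, the comparison length would be the data length plus one, so that the trailing blank takes part in the match.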

When doing file I/O, always use and check the FILE STATUS. Put your READ into a paragraph and PERFORM it, so that you don't need two separate copies of the same code. If you use the FILE STATUS, you don't need the AT END (or the END-READ), because the field that receives the FILE STATUS value will be "10" at end-of-file. Just put your 88 on that field, with the value "10".
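
A sketch of that arrangement. The file name "BRANCH.DAT", the sequential organization, the status field name and the paragraph names are all assumptions, since the question does not show its SELECT or FD:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. READDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT BRANCHFILE ASSIGN TO "BRANCH.DAT"
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS BRANCHFILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  BRANCHFILE.
       01  BRANCHREC.
           02  BBRID     PIC X(6).
           02  BBRNAME   PIC X(15).
           02  BBRADD    PIC X(30).
           02  BBRPH     PIC X(10).
           02  BEMAIL    PIC X(20).
           02  BMGRNAME  PIC X(25).
       WORKING-STORAGE SECTION.
      *> The 88 for end-of-file sits on the status field.
       01  BRANCHFILE-STATUS  PIC XX.
           88  BRANCHFILE-OK  VALUE "00".
           88  EndOfFile      VALUE "10".
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           OPEN INPUT BRANCHFILE
           IF NOT BRANCHFILE-OK
               DISPLAY "OPEN FAILED, STATUS " BRANCHFILE-STATUS
               STOP RUN
           END-IF
           PERFORM READ-BRANCHFILE
           PERFORM UNTIL EndOfFile
               DISPLAY " BRANCH ADDRESS :" BBRADD
               PERFORM READ-BRANCHFILE
           END-PERFORM
           CLOSE BRANCHFILE
           STOP RUN.
      *> The only READ in the program; no AT END or END-READ needed.
       READ-BRANCHFILE.
           READ BRANCHFILE NEXT RECORD
           IF NOT BRANCHFILE-OK AND NOT EndOfFile
               DISPLAY "READ FAILED, STATUS " BRANCHFILE-STATUS
               STOP RUN
           END-IF.

The loop never touches the record area once EndOfFile is true, which is what keeps this form portable.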

The Edit on your question now indicates where your existing 88-level is.

On the one hand, this is a good idea, because the end-of-file is associated with the record, and there can be no valid accidental content.

On the other hand, this is not a "portable" solution: if you use other COBOLs you may find that once end-of-file is reached it is no longer valid to access data under the FD. In the standard what happens in this situation is not defined, so you get differences amongst compilers.

You can retain the 88 on the group-item and have it portable by using READ ... INTO ... and having your record-layout in WORKING-STORAGE. This takes slightly longer to execute, as the data has to be transferred from one location to another.
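
A sketch of the READ ... INTO ... variant. The file name, the organization and the name BRANCHFILE-RECORD are assumptions, and FILE STATUS checking is left out only to keep the focus on where the record layout lives:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. READINTO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT BRANCHFILE ASSIGN TO "BRANCH.DAT"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  BRANCHFILE.
      *> Only the record length (106 bytes) is described here.
       01  BRANCHFILE-RECORD  PIC X(106).
       WORKING-STORAGE SECTION.
      *> The layout, with its 88, lives in WORKING-STORAGE, so it
      *> can still be referenced after end-of-file.
       01  BRANCHREC.
           88  EndOfFile  VALUE HIGH-VALUE.
           02  BBRID      PIC X(6).
           02  BBRNAME    PIC X(15).
           02  BBRADD     PIC X(30).
           02  BBRPH      PIC X(10).
           02  BEMAIL     PIC X(20).
           02  BMGRNAME   PIC X(25).
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           OPEN INPUT BRANCHFILE
           PERFORM READ-BRANCHFILE
           PERFORM UNTIL EndOfFile
               DISPLAY " BRANCH ADDRESS :" BBRADD
               PERFORM READ-BRANCHFILE
           END-PERFORM
           CLOSE BRANCHFILE
           STOP RUN.
      *> Each record is copied from the FD area into BRANCHREC.
       READ-BRANCHFILE.
           READ BRANCHFILE NEXT RECORD INTO BRANCHREC
               AT END SET EndOfFile TO TRUE
           END-READ.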

I prefer to put the 88 on the FILE STATUS field and simplify the READ by removing the AT END and END-READ. Since I already don't access the record-area under the FD after end-of-file, I can't accidentally get wrong values which look good.