How does COBOL store and retrieve data?

Matt Phillips picture Matt Phillips · Mar 14, 2010 · Viewed 21.8k times · Source

I'm starting to learn about COBOL. I have some experience writing programs that deal with SQL databases and I guess I'm confused how COBOL stores and retrieves data that is stored in a mainframe for example. I know that it's not like relational databases but every example program I've seen takes data straight from the command line and I know that's not how real world COBOL programs process the data. Can someone explain or show me a good resource that can explain it?

Answer

NealB picture NealB · Mar 14, 2010

COBOL is just another third generation computer language. It is just a little older than most which doesn't mean it is somehow incomplete (actually it comes with quite a bit of baggage - but that is another story).

As with any other third generation language, COBOL manipulates data files in pretty much the same way that you would in a C program. Nothing odd, mysterious or magical about it. Files are opened, read, written and closed using the file I/O features of the language.

Various mechanisms are used to form a link between an actual file and the program. The details here are often specific to the operating system you are working under. Generally, COBOL implementations try to isolate themselves from the operating environment through a logical file name as opposed to an actual name. This added indirection is important when you are writing programs that will be ported to different platforms (e.g. write and test within an IDE on a Windows platform, and then run on a mainframe).

The following examples relate to an IBM Mainframe environment.

Within the IBM mainframe world, you will find that programs are run as either batch or on-line (e.g. CICS). I will not describe how to set up for file I/O under CICS (that's a long story). Programs that are used to manipulate files are usually batch. Here is a rough illustration of how a batch program works:

  1. Batch programs are run via JCL. JCL is used to identify the program to run ('EXEC' statement) and to identify which files your program will reference using 'DD' statements. The function of a DD Statement is to form a logical connection between an actual file and a name your COBOL program will reference when it wants to refer to the file (this is the isolation mechanism mentioned earlier). For example,

    JCLDDNAM DD DSN='HLQ.MY.FILE'...
    

    would associate the 'DD' name 'JCLDDNAM' to the file named 'HLQ.MY.FILE'. This part is platform dependant so the details are specific to the operating environment.

  2. In the 'FILE-CONTROL' section of your COBOL program, you connect the 'DD NAME' defined in your JCL with the name you will use on each I/O statement to reference that file. This connection is defined using the 'SELECT' statement.
    For example,

    SELECT MYFILE
    ASSIGN JCLDDNAM
    remainder of select
    

    makes a connection between whatever file you associated with 'JCLDDNAM' in your 'JCL' to 'MYFILE' that you will later reference in COBOL I/O statements. The SELECT statement itself is part of the ISO COBOL standard. However, many COBOL implementations define some non-standard extentions to facilitate various quirks to their file subsystems.

  3. Open, read, write, close files within the 'PROCEDURE DIVISION' of you program using the name 'MYFILE' as in:

    OPEN MYFILE  
    READ MYFILE  
    CLOSE MYFILE  
    

The above is highly simplified, and there are a multitude of ways to do this within COBOL. Understanding the complete picture will take some real effort, time and practice. The I/O statements illustrated above are part of the COBOL standard, but every vendor will have their own extentions.

IBM COBOL supports a wide range of file organizations and access methods. You can review the IBM Enterprise COBOL Language Reference manual here to get the syntax and rules for file manipulation, However, the User Guide provides a lot of good examples for reading/writing files (you will have to dig a bit—but it is all laid out for you).

The setup to reference an SQL database via a COBOL program is somewhat different but involves setting up a connection between your program and the database subsystem. Within the IBM world this is done throug JCL, other environments will use different mechanisms.

IBM COBOL uses a pre-processor or co-processor to integrate database access and data exchange. For example, the following code would retrieve some data from a DB2 database:

MOVE 1234 TO PERSON-ID
EXEC SQL
  SELECT  FIRST_NAME,  LAST_NAME
  INTO   :FIRST-NAME, :LAST-NAME
  FROM PERSON
  WHERE PERSON_ID = :PERSON-ID
END-EXEC
DISPLAY PERSON-ID FIRST-NAME LAST-NAME

The stuff between EXEC SQL and END-EXEC is a pretty simple SQL select statement. The names preceded by colons are COBOL host variables used to pass data to DB2 or receive it back. If you have ever coded database access routines before this should look very familiar to you. This link provides a simple introduction to incorporating SQL statements into an IBM Enterpirse COBOL program.

By the way, IBM Enterprise COBOL is capable of working with XML documents too. Sorry for the heavy IBM slant, but that is the environment I am most familiar with.

Hope this gets you started in the right direction.