ABIS Infor - 2012-06

Object-Oriented COBOL

Peter Vanroose (ABIS) - April 2012

Abstract

What programming environment would be more distant from objects than COBOL? And yet the most recent COBOL standard (2002) introduced support for object-oriented syntax. Here's a brief summary of the not so well-known OO-COBOL.

What is COBOL?

The COBOL programming language dates from around 1960 so it is one of the first 3rd generation languages. It is still actively used for application development, mainly on mainframes. Its main strength lies in performance benefits when it comes to massive data I/O.

COBOL is remarkably "verbose": all statements start with a verb (like ADD or COMPUTE or MOVE) and read like an English sentence. Programs are subdivided into paragraphs which can be "performed" (i.e., called, but without the overhead of building a call stack). COBOL is also remarkably versatile regarding datatypes for variables: one can e.g. declare integer and decimal types of any desired length, from 1 up to 18 decimal digits.
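As an illustration of these points, here is a minimal sketch of a classical (non-OO) COBOL program. The program name VERBDEMO, the variable names and the values are made up for this example; it merely shows a few verbs, a paragraph that is "performed" twice, and integer and decimal declarations of a chosen length.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VERBDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * A 4-digit integer and signed decimals with 2 decimal places.
       77  QUANTITY    PIC 9(4)      VALUE 3.
       77  UNIT-PRICE  PIC S9(5)V99  VALUE 12.50.
       77  TOTAL       PIC S9(7)V99  VALUE ZERO.
      * Edited field, only used to display TOTAL in readable form.
       77  TOTAL-OUT   PIC -(7)9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Every statement starts with a verb: PERFORM, ADD, STOP.
           PERFORM COMPUTE-TOTAL
           ADD 1 TO QUANTITY
           PERFORM COMPUTE-TOTAL
           STOP RUN.
      * A paragraph: performed from MAIN-PARA, no parameters passed.
       COMPUTE-TOTAL.
           MULTIPLY QUANTITY BY UNIT-PRICE GIVING TOTAL
           MOVE TOTAL TO TOTAL-OUT
           DISPLAY 'Total: ' TOTAL-OUT.
```

Note that the COMPUTE-TOTAL paragraph works directly on the variables of the program: there is no parameter passing, which is precisely why "performing" a paragraph is so cheap.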

What is an object-oriented program?

An object is an encapsulation of both data and functions (called "methods"). In summary, there are two important advantages of using objects in a computer program: (1) an object contains a bunch of (structured) data as a single variable, i.e., the object carries state, and (2) public interface (the methods) and private implementation (based on the data) of an object are cleanly separated: modifying the data representation of an object may require (some) method implementations to be rewritten, but never require users of the object to change their programs.

This makes modular programming suddenly much easier than with procedural programming, where it's the responsibility of the caller of a function to maintain state (and hence to decide on data structures). Now it's the implementor of a set of methods who decides on the layout of data structures. The caller doesn't even have to know about those internals.

One last term is important: a class is the abstraction of the combined implementation and interface for objects sharing the same data layout. In other words: an object is an instantiation of a class. Two different objects which instantiate the same class may have different states (i.e., different content of their internal data) but they have a common interface.

COBOL and objects

COBOL has had four major standardisation "waves" spelling out what language syntax a COBOL compiler should understand. COBOL-68, COBOL-74 and COBOL-85 are currently supported by essentially all compilers. COBOL-85 is still the de facto standard for the COBOL language, and for most modern programs. The COBOL 2002 standard added syntax to support objects. As with the other standards, COBOL 2002 is backward compatible, which means that traditional (procedural) programming can be mixed with the new object syntax.

COBOL programs can of course call subroutines written in other programming languages (e.g. PL/I or assembler). The caller only needs to know the name of the subroutine, and the byte layout of the parameters passed in and out.

Similarly, COBOL programs can now create and use objects whose class implementation is written in e.g. Java. This is currently probably the most important and most common use of OO-COBOL. But the 2002 standard does of course also provide for implementing classes in COBOL, to be called from other COBOL programs or (why not?) from Java programs.

Commercial compilers like Fujitsu COBOL, Micro Focus COBOL, and IBM Enterprise COBOL all support OO-COBOL. Micro Focus actually started support for OO back in 1994! IBM first had a "System Object Model" (SOM) based OO-COBOL dialect but switched to a more standardized OO support in their current Enterprise COBOL compiler (4.1 and 4.2).

A simple example

Suppose you need to develop an application which allows users to place an order. The application interacts with the user (so it may be a CICS program), asking him/her all sorts of questions and returning feedback before the order is confirmed. Upon completion the application may e.g. insert a single record in an "order" database table, or send a fixed-format e-mail to the administration.

The procedural variant of such a program could call subroutines for consulting e.g. the list of available products, the price of a product, its delivery time, etc. But it would be the main program which has to keep track of "state" information like remembering the answers given by the user or caching prices and product names.

The OO variant of such a program would design an "orders" class with methods like "add n times product x to the order", "return the list of available products", "give the price of product x", "enter delivery address", etc. Actually, it would design several classes, including a "products list" class, to serve as the return type for the "available products" method; a "product" class which can be interrogated to obtain product prices; a "customer" class to store e.g. address information of the user, etc. These classes would be able to "use" one another, all without having to know each other's internal data layout. Methods of one class may accept as input (or return as output) an object instantiating one of the other classes.

To keep it simple, let's just look at the COBOL 2002 syntax for declaring and using a single class which keeps track of the products in an order, and which we will call PRODUCTLIST. It will allow a user to add a product to the list, to remove a product from it, and to ask for the list length.

To create a new class, add the following two entries to your IDENTIFICATION DIVISION:

A paragraph named CLASS-ID, to replace the PROGRAM-ID. It defines the class name:

		CLASS-ID.	PRODUCTLIST
				INHERITS 'java.lang.Object'.

And a paragraph named OBJECT, which defines the interface (methods) and data structures of instantiations of the class:

		OBJECT.
			DATA DIVISION.
				...
			PROCEDURE DIVISION.
				...
		END OBJECT.
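To make the two entries above a bit more concrete, here is a hedged sketch of how a complete (but minimal) class definition might look in the style of IBM Enterprise COBOL. Several things are assumptions made for this illustration and are not taken from the ABIS sample: the word BASE, which names the parent class through a REPOSITORY entry; the instance variable LIST-LENGTH; the linkage item LEN-OUT; and the choice to write GetLength as an instance method without parameters. The worked-out sample may well organise its methods differently.

```cobol
       IDENTIFICATION DIVISION.
       CLASS-ID. PRODUCTLIST INHERITS BASE.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       REPOSITORY.
      * BASE is a hypothetical name for the Java root class.
           CLASS BASE        IS 'java.lang.Object'
           CLASS PRODUCTLIST IS 'ProductList'.
      *
       IDENTIFICATION DIVISION.
       OBJECT.
        DATA DIVISION.
        WORKING-STORAGE SECTION.
      * Instance data (assumed): each object has its own copy,
      * and it is invisible to the users of the class.
        01  LIST-LENGTH  PIC S9(9) COMP VALUE ZERO.
        PROCEDURE DIVISION.
      *
      * One method (assumed to be an instance method): it returns
      * the current list length to the caller.
         IDENTIFICATION DIVISION.
         METHOD-ID. 'GetLength'.
         DATA DIVISION.
         LINKAGE SECTION.
         01  LEN-OUT  PIC S9(9) COMP.
         PROCEDURE DIVISION RETURNING LEN-OUT.
             MOVE LIST-LENGTH TO LEN-OUT.
         END METHOD 'GetLength'.
      *
       END OBJECT.
       END CLASS PRODUCTLIST.
```

The point of the sketch is the separation discussed earlier: LIST-LENGTH could later be replaced by, say, a table of products plus a counter, and only the body of GetLength would have to change.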

A COBOL program wanting to use this class must add a REPOSITORY paragraph to the CONFIGURATION SECTION:

		CONFIGURATION SECTION.
		REPOSITORY.
			CLASS PRODUCTLIST IS 'ProductList'
			CLASS PRODUCT     IS 'Product'    .

The PROCEDURE DIVISION interacts with objects (be it for creating a new instance, or calling a method) through the new statement INVOKE. It takes the special method name NEW to create an instance, and the USING and RETURNING phrases to pass parameters in and out. Variables which refer to objects instantiating a class must be declared with the "OBJECT REFERENCE" usage clause:

		DATA DIVISION.
		WORKING-STORAGE SECTION.
		77  LEN    PIC S9(9) COMP.
		77  PRICE  PIC S9(9) COMP.
		77  PLIST OBJECT REFERENCE PRODUCTLIST.
		77  PRD   OBJECT REFERENCE PRODUCT.
		PROCEDURE DIVISION.
			INVOKE PRODUCTLIST NEW RETURNING PLIST
			INVOKE PRODUCTLIST 'GetLength' USING BY VALUE PLIST RETURNING LEN
			IF LEN = 0 THEN
				DISPLAY 'Your Product list is empty (as expected)'
			END-IF
			INVOKE PRODUCT NEW RETURNING PRD
			INVOKE PRODUCTLIST 'AddProduct' USING BY VALUE PLIST PRD
			INVOKE PRODUCTLIST 'GetLength' USING BY VALUE PLIST RETURNING LEN
			DISPLAY 'The length of your Product list should now be 1, is ' LEN
			INVOKE PRODUCTLIST 'GetLastProduct' USING BY VALUE PLIST RETURNING PRD
			INVOKE PRODUCT 'GetPrice' USING BY VALUE PRD RETURNING PRICE
			DISPLAY 'The price of the last product is ' PRICE
		(etc.)

A completely worked-out implementation of this simple example for the IBM Enterprise compiler can be found on the ABIS website (http://www.abis.be/resources/newsletter/OOCobol_sample1.pdf and http://www.abis.be/resources/newsletter/OOCobol_sample2.pdf).

For simplicity, the example only passes integer values (PIC S9(9) COMP) in and out of classes. Unfortunately, passing e.g. decimal numbers or text strings requires a relatively complicated syntax using pointers with this compiler. The reason for this (nonstandard) complication is maintaining binary compatibility with Java programs (since those are often used to implement classes). Even COBOL-defined classes must inherit from the class java.lang.Object!

Conclusion

One of the main advantages of OO-COBOL (as compared to e.g. Java) is that it allows a COBOL programmer to start using objects with minimal effort. Typical COBOL programs will then be a mix of classical procedural programming with some object-oriented aspects. At the same time this is also the main drawback of OO-COBOL (as compared to e.g. Java): object orientation goes much further than this. (Think of e.g. inheritance, overloading, polymorphism, templates, ...)

Programming the OO way requires a focus shift: from thinking in terms of data and algorithms, to thinking in terms of objects, responsibilities, and cooperation. Interestingly, with OO-COBOL, this is possible without losing performance.

Most importantly, OO-COBOL allows an application design to combine the strengths of COBOL (I/O performance) and Java (flexibility, modularity, manageable complexity) into a single program!

Be aware, though, that IBM Enterprise COBOL does not implement the full OO-COBOL standard. It also lacks some flexibility with respect to datatypes. Let's hope this will gradually be improved upon in future versions.