Object-oriented programming

Generic containers as described in the previous section improve on containers for fixed types, but they are still limited to storing a single type. In some circumstances it makes sense to design data structures or functions to work on any type that supplies the right operations. To make this work, we will attach the functions for manipulating an object to the object itself. This is the central idea behind object-oriented programming.

As with most sophisticated programming techniques, C doesn’t provide any direct support for object-oriented programming, but it is possible to make it work anyway by taking advantage of C’s flexibility. We will use two basic ideas: first, we’ll give each object a pointer to a method table in the form of a struct full of function pointers, and second we’ll take advantage of the fact that C lays out struct fields in the order we declare them to allow subtyping, where some objects are represented by structs that extend the base struct used for all objects. (We could apply a similar technique to the method tables to allow subtypes to add more methods, but for the example in this section we will keep things simple and provide the same methods for all objects.)

Here is the file that declares the base object type. Note that we expose the details of both struct object and struct methods, since subtypes will need to implement these.

#ifndef _OBJECT_H
#define _OBJECT_H

// truncated version of an object
// real object will have more fields after methods
// we expose this for implementers
struct object {
    const struct methods *methods;
};

typedef struct object Object;

struct methods {
    Object *(*clone)(const Object *self);
    void (*print)(const Object *self);
    void (*destroy)(Object *self);
};

#endif

examples/objectOriented/object.h

Objects in this system have three methods: clone, which makes a copy of the object, print, which sends a representation of the object to stdout, and destroy, which frees the object. Each of these methods takes the object itself as its first argument self, since C provides no other mechanism to tell these functions which object to work on. If we needed to pass additional arguments to a method, we would add these after self, but for these simple methods this is not needed.
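To illustrate the last point, here is a minimal sketch of a method that takes an argument after self. The Counter type, its method table, and the names counterCreate, add, and get are all hypothetical and are not part of the example files; the sketch uses its own method table instead of struct methods so that it stands alone.

```c
#include <assert.h>
#include <stdlib.h>

// hypothetical object type, not part of object.h
typedef struct counter Counter;

struct counterMethods {
    void (*add)(Counter *self, int delta);   // extra argument follows self
    int (*get)(const Counter *self);
    void (*destroy)(Counter *self);
};

struct counter {
    const struct counterMethods *methods;
    int value;
};

static void
counterAdd(Counter *self, int delta)
{
    self->value += delta;
}

static int
counterGet(const Counter *self)
{
    return self->value;
}

static void
counterDestroy(Counter *self)
{
    free(self);
}

// one table shared by every Counter
static const struct counterMethods counterMethodTable = {
    counterAdd,
    counterGet,
    counterDestroy
};

Counter *
counterCreate(int initial)
{
    Counter *self = malloc(sizeof(Counter));
    assert(self);

    self->methods = &counterMethodTable;
    self->value = initial;

    return self;
}
```

A call then looks like c->methods->add(c, 3), with self first and the extra argument after it.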

To implement a subtype of object, we extend struct object by defining a new struct type that has methods as its first field, but may have additional fields to store the state of the object. We also write functions implementing the new type’s methods, and build a constant global struct containing pointers to these functions, which all objects of the new type will point to. Finally, we build a constructor function that allocates and initializes an instance of the new object. This will be the only function that is exported from the module for our subtype, because anything else we want to do to an object, we can do by calling methods. This gives a very narrow interface to each subtype module, which is good, because the narrower the interface, the less likely we are to run into name collisions with other modules.
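The layout requirement can also be met by embedding a whole struct object as the first field of the subtype instead of repeating the methods pointer. The following is a sketch of that variant, not the approach used in the files in this section; struct pointObject and pointObjectCreate are hypothetical names, and the declarations from object.h are repeated so that the block stands alone. C guarantees that a pointer to a struct, suitably converted, points to its first member, so the conversions in both directions are well defined.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// declarations repeated from object.h so the sketch stands alone
struct object {
    const struct methods *methods;
};

typedef struct object Object;

struct methods {
    Object *(*clone)(const Object *self);
    void (*print)(const Object *self);
    void (*destroy)(Object *self);
};

// hypothetical subtype that embeds the base struct as its first field
struct pointObject {
    Object base;    // must come first
    int x;
    int y;
};

Object *pointObjectCreate(int x, int y);

static Object *
clonePoint(const Object *self)
{
    const struct pointObject *p = (const struct pointObject *) self;
    return pointObjectCreate(p->x, p->y);
}

static void
printPoint(const Object *self)
{
    const struct pointObject *p = (const struct pointObject *) self;
    printf("(%d,%d)", p->x, p->y);
}

static void
destroyPoint(Object *self)
{
    // no pointers inside, so freeing the block is enough
    free(self);
}

static const struct methods pointObjectMethods = {
    clonePoint,
    printPoint,
    destroyPoint
};

Object *
pointObjectCreate(int x, int y)
{
    struct pointObject *self = malloc(sizeof(struct pointObject));
    assert(self);

    self->base.methods = &pointObjectMethods;
    self->x = x;
    self->y = y;

    return &self->base;    // converting to the base type needs no cast
}
```

With this layout only the conversion back from Object * to the subtype needs a cast.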

Here is the very short header file for a subtype of Object that holds ints. Most of this is just the usual header file boilerplate.

#ifndef _INTOBJECT_H
#define _INTOBJECT_H

#include "object.h"

Object *
intObjectCreate(int value);

#endif

examples/objectOriented/intObject.h

And here is the actual implementation. Note that it is only in this file that functions can access the representation struct intObject of these objects. Everywhere else, they just look like Objects. This does come with a cost: each of the method implementations has to cast in and out of Object pointers to work with the underlying struct intObjects, and even though this corresponds to precisely zero instructions at run time, if we fail to be disciplined enough to only apply intObject methods to intObjects, the compiler will not catch the error for us.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#include "intObject.h"

// wrap ints up as objects
// this extends Object with extra field
struct intObject {
    const struct methods *methods;   // same type as the field in struct object
    int value;
};

static void printInt(const Object *s);
static Object *cloneInt(const Object *s);
static void destroyInt(Object *s);

// method table shared by all intObjects
// entries must follow the field order in struct methods
static const struct methods intObjectMethods = {
    cloneInt,
    printInt,
    destroyInt
};

static void
printInt(const Object *self)
{
    printf("%d", ((const struct intObject *) self)->value);
}

static Object *
cloneInt(const Object *self)
{
    return intObjectCreate(((const struct intObject *) self)->value);
}

static void
destroyInt(Object *self)
{
    // we don't have any pointers, so we can just free the block
    free(self);
}

Object *
intObjectCreate(int value)
{
    struct intObject *self = malloc(sizeof(struct intObject));
    assert(self);

    self->methods = &intObjectMethods;
    self->value = value;

    return (Object *) self;
}

examples/objectOriented/intObject.c
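One way to recover some of the checking that the compiler cannot do is to test an object's method table pointer at run time before casting, since every object built by intObjectCreate points to the same table. The following is a sketch of that idea and is not part of intObject.c as shown above; isIntObject and asIntObject are hypothetical helpers, and the declarations from object.h are repeated so that the block stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// declarations repeated from object.h so the sketch stands alone
struct object {
    const struct methods *methods;
};

typedef struct object Object;

struct methods {
    Object *(*clone)(const Object *self);
    void (*print)(const Object *self);
    void (*destroy)(Object *self);
};

struct intObject {
    const struct methods *methods;
    int value;
};

static Object *cloneInt(const Object *self);
static void printInt(const Object *self);
static void destroyInt(Object *self);

static const struct methods intObjectMethods = {
    cloneInt,
    printInt,
    destroyInt
};

// hypothetical helper: nonzero if o uses the intObject method table
int
isIntObject(const Object *o)
{
    return o->methods == &intObjectMethods;
}

// hypothetical checked downcast: aborts if o is not an intObject
const struct intObject *
asIntObject(const Object *o)
{
    assert(isIntObject(o));
    return (const struct intObject *) o;
}

Object *
intObjectCreate(int value)
{
    struct intObject *self = malloc(sizeof(struct intObject));
    assert(self);

    self->methods = &intObjectMethods;
    self->value = value;

    return (Object *) self;
}

static Object *
cloneInt(const Object *self)
{
    return intObjectCreate(asIntObject(self)->value);
}

static void
printInt(const Object *self)
{
    printf("%d", asIntObject(self)->value);
}

static void
destroyInt(Object *self)
{
    assert(isIntObject(self));
    free(self);
}
```

The check costs a pointer comparison on each call, so the cast is no longer free, and it disappears entirely if the code is compiled with NDEBUG defined.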

Having implemented these objects, we can use them in any context where the three provided methods are enough to work with them. For example, here is the interface to a stack that works on arbitrary Objects.

#ifndef _STACK_H
#define _STACK_H

#include "object.h"

// basic stack implementation
// stack is a pointer to its first element
// caller will keep a pointer to this
typedef struct elt *Stack;

// create and destroy stacks
Stack *stackCreate(void);
void stackDestroy(Stack *);

// usual functions
void stackPush(Stack *s, Object *);

// don't call this on an empty stack
Object *stackPop(Stack *s);

// returns true if not empty
int stackNotEmpty(const Stack *s);

// print the elements of a stack to stdout
// using function print
void stackPrint(const Stack *s);

#endif

examples/objectOriented/stack.h

Internally, this stack will use the clone method to ensure that it gets its own copy of anything pushed onto the stack in stackPush, to protect against the caller later destroying or modifying the object being pushed; the print method to print objects in stackPrint; and the destroy method to clean up in stackDestroy. The implementation looks like this:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#include "stack.h"

struct elt {
    struct elt *next;
    Object *value;
};

// create and destroy stacks
Stack *
stackCreate(void) {
    Stack *s;

    s = malloc(sizeof(Stack));
    assert(s);
    *s = 0;  // empty stack
    return s;
}

void
stackDestroy(Stack *s) {
    Object *o;
    while(stackNotEmpty(s)) {
        o = stackPop(s);
        o->methods->destroy(o);
    }
    free(s);
}

// usual functions
void 
stackPush(Stack *s, Object *value) {
    struct elt *e = malloc(sizeof(struct elt));
    assert(e);
    e->next = *s;
    e->value = value->methods->clone(value);
    *s = e;
}

// don't call this on an empty stack
Object *
stackPop(Stack *s) {
    assert(stackNotEmpty(s));

    struct elt *e = *s;
    Object *ret = e->value;

    *s = e->next;
    free(e);

    return ret;
}

// returns true if not empty
int
stackNotEmpty(const Stack *s) {
    return *s != 0;
}

// print the elements of a stack to stdout
void
stackPrint(const Stack *s) {
    for(struct elt *e = *s; e; e = e->next) {
        e->value->methods->print(e->value);
        putchar(' ');
    }
    putchar('\n');
}

examples/objectOriented/stack.c

Because we are working in C, method calls are a little verbose, since we have to follow the method table pointer and supply the self argument ourselves. Object-oriented programming languages generally provide syntactic sugar to simplify this task (and avoid possible errors). So a messy line in C like

    e->value->methods->print(e->value);

would look in C++ like

    e->value->print();

Something similar would happen in other object-oriented languages like Python or Java.
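In C we can hide some of this verbosity behind small wrapper functions. The sketch below repeats the declarations from object.h so that it stands alone; the wrapper names objectClone, objectPrint, and objectDestroy are hypothetical and do not appear in the example files, and the tiny int object at the end is there only to exercise them. Functions are used here instead of a macro so that the object expression is evaluated only once.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// declarations repeated from object.h so the sketch stands alone
struct object {
    const struct methods *methods;
};

typedef struct object Object;

struct methods {
    Object *(*clone)(const Object *self);
    void (*print)(const Object *self);
    void (*destroy)(Object *self);
};

// hypothetical wrappers: each follows the method table and passes self
Object *
objectClone(const Object *o)
{
    return o->methods->clone(o);
}

void
objectPrint(const Object *o)
{
    o->methods->print(o);
}

void
objectDestroy(Object *o)
{
    o->methods->destroy(o);
}

// minimal int object, modeled on intObject.c, used to try the wrappers
struct intObject {
    const struct methods *methods;
    int value;
};

Object *intObjectCreate(int value);

static Object *
cloneInt(const Object *self)
{
    return intObjectCreate(((const struct intObject *) self)->value);
}

static void
printInt(const Object *self)
{
    printf("%d", ((const struct intObject *) self)->value);
}

static void
destroyInt(Object *self)
{
    free(self);
}

static const struct methods intObjectMethods = {
    cloneInt,
    printInt,
    destroyInt
};

Object *
intObjectCreate(int value)
{
    struct intObject *self = malloc(sizeof(struct intObject));
    assert(self);

    self->methods = &intObjectMethods;
    self->value = value;

    return (Object *) self;
}
```

With these in place, the line from stackPrint could be written as objectPrint(e->value).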

Because stack.c accesses objects only through their methods, it will work on any objects, even objects of different types mixed together. Below is a program that mixes int objects as defined above with string objects defined in stringObject.h and stringObject.c:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <string.h>

#include "object.h"
#include "intObject.h"
#include "stringObject.h"

#include "stack.h"

#define N (3)

// do some stack stuff
int
main(int argc, char **argv)
{
    char str[] = "hi";
    Object *o;

    int n = N;
    if(argc >= 2) {
        n = atoi(argv[1]);
    }

    Stack *s = stackCreate();

    for(int i = 0; i < n; i++) {
        // push a string onto the stack
        str[0] = 'a' + i;
        o = stringObjectCreate(str);
        stackPush(s, o);
        o->methods->destroy(o);
        stackPrint(s);

        // push an int onto the stack
        o = intObjectCreate(i);
        stackPush(s, o);
        o->methods->destroy(o);
        stackPrint(s);
    }

    while(stackNotEmpty(s)) {
        o = stackPop(s);
        putchar('[');
        o->methods->print(o);
        o->methods->destroy(o);
        fputs("] ", stdout);
        stackPrint(s);
    }

    stackDestroy(s);

    return 0;
}

examples/objectOriented/testStack.c

This pushes an alternating pile of ints and strings onto the stack, printing the stack after each push, then pops and prints these objects, again printing the stack after each pop. Except for having to choose between intObjectCreate and stringObjectCreate at creation time, nothing in testStack.c depends on which of these two subtypes each object is.
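The source for the string objects is not reproduced in this section. The following is only a guess at what such a module might look like, written by analogy with intObject.c; the struct layout and the decision to store a private malloc'd copy of the string are assumptions, and the declarations from object.h are repeated so that the block stands alone. It shows why destroy needs to be a method: unlike an intObject, this object owns a second block that has to be freed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// declarations repeated from object.h so the sketch stands alone
struct object {
    const struct methods *methods;
};

typedef struct object Object;

struct methods {
    Object *(*clone)(const Object *self);
    void (*print)(const Object *self);
    void (*destroy)(Object *self);
};

// assumed representation: the object owns a malloc'd copy of the string
struct stringObject {
    const struct methods *methods;
    char *value;
};

Object *stringObjectCreate(const char *value);

static Object *
cloneString(const Object *self)
{
    // the constructor copies the string, so the clone shares nothing
    return stringObjectCreate(((const struct stringObject *) self)->value);
}

static void
printString(const Object *self)
{
    fputs(((const struct stringObject *) self)->value, stdout);
}

static void
destroyString(Object *self)
{
    // free the string first, then the object that points to it
    free(((struct stringObject *) self)->value);
    free(self);
}

static const struct methods stringObjectMethods = {
    cloneString,
    printString,
    destroyString
};

Object *
stringObjectCreate(const char *value)
{
    struct stringObject *self = malloc(sizeof(struct stringObject));
    assert(self);

    self->methods = &stringObjectMethods;
    self->value = malloc(strlen(value) + 1);
    assert(self->value);
    strcpy(self->value, value);

    return (Object *) self;
}
```

Because the constructor copies its argument, testStack.c can keep overwriting str[0] without disturbing strings already on the stack.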

Of course, to build testStack we need to link together a lot of files, which we can do with this Makefile. Running make test gives the following output, demonstrating that we are in fact successfully mixing ints with strings:

gcc -std=c99 -Wall -g3   -c -o testStack.o testStack.c
gcc -std=c99 -Wall -g3   -c -o stack.o stack.c
gcc -std=c99 -Wall -g3   -c -o intObject.o intObject.c
gcc -std=c99 -Wall -g3   -c -o stringObject.o stringObject.c
gcc -std=c99 -Wall -g3 -o testStack testStack.o stack.o intObject.o stringObject.o
for i in ; do ./$i; done
for i in testStack; do valgrind -q --leak-check=full ./$i; done
ai 
0 ai 
bi 0 ai 
1 bi 0 ai 
ci 1 bi 0 ai 
2 ci 1 bi 0 ai 
[2] ci 1 bi 0 ai 
[ci] 1 bi 0 ai 
[1] bi 0 ai 
[bi] 0 ai 
[0] ai 
[ai] 

As with generic containers, the nice thing about this approach is that if we want to add more subtypes of object, we can do so the same way we did with intObject and stringObject, without having to ask anybody’s permission to change any of the code in object.h, stack.h, stack.c, or testStack.c. This is very different from what would happen, for example, if an Object was implemented as a tagged union, where adding a new type would require rewriting the code for Object. The cost is that we have to follow function pointers and be disciplined in how we use them.
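For contrast, here is a rough sketch of the tagged-union alternative. The names enum tag, struct tagged, taggedInt, taggedString, and taggedFormat are hypothetical, and the sketch formats into a buffer instead of printing so that the result is easy to check. Every operation has to switch on the tag, so supporting a new type means adding an enum value, a union member, and a case to each such switch.

```c
#include <assert.h>
#include <stdio.h>

// every supported type must be listed here
enum tag { TAG_INT, TAG_STRING };

struct tagged {
    enum tag tag;      // says which union member is valid
    union {
        int i;
        const char *s; // not owned by the struct in this sketch
    } u;
};

struct tagged
taggedInt(int i)
{
    struct tagged t;
    t.tag = TAG_INT;
    t.u.i = i;
    return t;
}

struct tagged
taggedString(const char *s)
{
    struct tagged t;
    t.tag = TAG_STRING;
    t.u.s = s;
    return t;
}

// write a representation of *t into buf (at most n bytes, including the null)
// returns what snprintf returns
int
taggedFormat(const struct tagged *t, char *buf, size_t n)
{
    switch(t->tag) {
        case TAG_INT:
            return snprintf(buf, n, "%d", t->u.i);
        case TAG_STRING:
            return snprintf(buf, n, "%s", t->u.s);
        default:
            // a new tag that nobody added a case for ends up here
            assert(0);
            return -1;
    }
}
```

With method tables, the equivalent of each case lives in the subtype's own file, which is why stack.c never needs to change.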

