Cuite design (1/?): QObject in OCaml

2020-05-10 18:39:59+02:00

Two years ago, I worked on "Cuite", an OCaml binding to Qt5. It stalled when I got to the point where all core concepts were mapped to OCaml. The remaining work was very repetitive: go through the huge hierarchy of Qt classes and bind each method, accommodating the occasional ad hoc behavior.

There are also some shortcomings to revisit in my approach:

This post is the first of a series where I explain the thoughts that went into the design of the library and how these issues are addressed.

Exposing QObjects

QObject is the root of the main class hierarchy in Qt. It is used everywhere: all widgets are QObject instances.

The binding needs to expose QObject classes, instances and functions to OCaml programs. In this post we will take a look at memory management: how QObjects are allocated and released when manipulated from OCaml.

There are a few properties that I wanted the binding to preserve. This is subjective; another binding might look for other properties. Here is what I was looking for:

I ended up with a scheme that provides all these properties to the binding. The rest of the post focuses on memory management for QObjects.

QObject values

Each QObject instance visible from the OCaml program is mapped to a unique value. This graph shows all the infrastructure involved.

Exposing a QObject to OCaml world

An instance QObject *obj is made accessible from OCaml code via the mlproxy value. In other words, we want the functions:

value Val_QObject(QObject *obj);
QObject *QObject_val(value v);

QObject_val: from value to QObject

The OCaml block mlproxy contains a pointer to an object cproxy in the C++ heap. In turn cproxy has a pointer to obj, the QObject.

To get to the QObject from the OCaml value, we just need to follow two pointers.
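The double indirection can be sketched as follows. This is only an illustration, not Cuite's actual code: FakeQObject stands in for QObject so the example compiles without Qt, the MLProxy struct stands in for the OCaml block (in a real binding it would be an OCaml value, most likely a custom block), and fake_qobject_val is a hypothetical name.

```cpp
#include <cassert>

// Stand-in for QObject (assumption: lets the sketch compile without Qt).
struct FakeQObject {
    int id;
};

// Proxy living in the C++ heap; it points to the wrapped object.
struct CProxy {
    FakeQObject *obj;
};

// Stand-in for the OCaml block (assumption: a real binding would use an
// OCaml `value` here instead of a plain struct).
struct MLProxy {
    CProxy *cproxy;
};

// Follow the two pointers: mlproxy -> cproxy -> obj.
FakeQObject *fake_qobject_val(const MLProxy &v) {
    assert(v.cproxy != nullptr);
    return v.cproxy->obj;
}
```

The point of the intermediate CProxy is that the OCaml block never holds the object's address directly, so the C++ side keeps a single place where the link to the object is stored.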

Handling QObject destruction

We need to keep track of when the QObject is deleted: the OCaml value might still be reachable and we don't want to accidentally dereference the QObject past that point.

This is not too difficult; we can either:

From QObject to CProxy

The Val_QObject function will be invoked many times, and we don't want to create a new proxy each time. The ProxyTable remembers the CProxy associated with a QObject. It is a hash table indexed by object addresses. It is populated by the helper function:

static CProxy *QObject_proxy(QObject *obj);

QObject_proxy starts by looking up the object in the hash table. If a valid proxy is found, it is returned. Otherwise, the object has not yet been exported to the OCaml world. We allocate, initialize, and add a new CProxy to the table. The weakid field is initialized to -1.
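The lookup-or-create logic might look like the sketch below. It makes several assumptions: FakeQObject replaces QObject so the code builds without Qt, the table is a std::unordered_map (the post only says "hash table"), the function name fake_qobject_proxy is hypothetical, and the validity check on an existing proxy is left out because it depends on how destruction is tracked.

```cpp
#include <cassert>
#include <unordered_map>

// Stand-in for QObject (assumption: avoids a dependency on Qt).
struct FakeQObject {};

struct CProxy {
    FakeQObject *obj;  // the wrapped object
    int weakid;        // index into the weak table, -1 while unassigned
};

// Hash table indexed by object address (assumed container choice).
static std::unordered_map<const FakeQObject *, CProxy *> proxy_table;

// Return the proxy for obj, creating and registering it on first use.
// The sketch never frees proxies; a real binding would remove the entry
// when the object is destroyed.
static CProxy *fake_qobject_proxy(FakeQObject *obj) {
    assert(obj != nullptr);
    auto it = proxy_table.find(obj);
    if (it != proxy_table.end())
        return it->second;  // already exported: reuse the proxy
    CProxy *proxy = new CProxy{obj, -1};
    proxy_table.emplace(obj, proxy);
    return proxy;
}
```

Calling the function twice with the same address returns the same CProxy, which is what lets each object map to a unique OCaml value.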

From CProxy to value: the weakid field

We have a CProxy, but not yet an OCaml value. The weakid field is an index into the WeakTable, a global OCaml table that weakly references MLProxy values:

This is done from a primitive exported by OCaml code that also registers a finalizer to handle the cleanup of unreachable objects:

val finalize_and_index : ml_proxy -> int

Why go through the hoops of this weak table? Because C++ code needs to access the OCaml values, but normal roots are strong references. Registering the values as roots would prevent MLProxy values from ever being collected by the GC.
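The difference between a strong and a weak table can be shown with a C++ analogy. To be clear, this is not how the binding works: the real table lives on the OCaml side and is managed by the OCaml GC. Here std::weak_ptr plays the role of a weak reference, reference counting plays the role of the GC, and the names MLProxy, weak_index and weak_get are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the OCaml-side proxy value (assumption).
struct MLProxy {
    int tag;
};

// Analogue of the weak table: slots do not keep their values alive.
static std::vector<std::weak_ptr<MLProxy>> weak_table;

// Store a weak reference and return its index (the role of weakid).
static int weak_index(const std::shared_ptr<MLProxy> &v) {
    weak_table.push_back(v);
    return static_cast<int>(weak_table.size()) - 1;
}

// Recover the value from its index; the result is empty if the value
// has already been released.
static std::shared_ptr<MLProxy> weak_get(int weakid) {
    assert(weakid >= 0 &&
           static_cast<std::size_t>(weakid) < weak_table.size());
    return weak_table[static_cast<std::size_t>(weakid)].lock();
}
```

As long as the program holds the value, weak_get returns it. Once the last owner drops it, the slot comes back empty. Had the table stored std::shared_ptr instead, the table itself would keep every value alive forever, which is the problem with strong roots described above.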

QObject_val/Val_QObject: ✔️

We now have both functions:

value Val_QObject(QObject *obj);
QObject *QObject_val(value v);

They:
- can convert from value to QObject and from QObject to value
- safely handle explicit QObject deletion
- enable automatic deletion of unreachable objects