Polymorphism and Implicit Sharing



Recently I have been looking into ways to make the members of KoShape copy-on-write. At first glance, it seems enough to declare d-pointers as some subclass of QSharedDataPointer (see Qt's implicit sharing) and then replace pointers with instances. However, a number of problems remain to be solved, one of them being polymorphism.

polymorphism and value semantics

In the definition of KoShapePrivate class, the member fill is stored as a QSharedPointer:

QSharedPointer<KoShapeBackground> fill;

There are a number of subclasses of KoShapeBackground, including KoColorBackground and KoGradientBackground, to name just two. We cannot store an instance of KoShapeBackground directly since we want polymorphism. But, well, making KoShapeBackground copy-on-write seems to have nothing to do with whether we store it as a pointer or as an instance. So let's set this question aside for now; I will come back to it at the end of this post.
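To see why a plain instance will not do, here is a small Qt-free sketch. Background and ColorBackground are made-up stand-ins for KoShapeBackground and KoColorBackground, and std::shared_ptr plays the role of QSharedPointer; none of this is Krita code.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for KoShapeBackground and KoColorBackground.
struct Background {
    virtual ~Background() = default;
    virtual std::string name() const { return "base"; }
};

struct ColorBackground : Background {
    std::string name() const override { return "color"; }
};

// Holding the member by value keeps only the Background part of whatever is assigned.
struct ShapeByValue {
    Background fill;
};

// Holding a pointer keeps the dynamic type of the pointee.
struct ShapeByPointer {
    std::shared_ptr<Background> fill;
};

std::string nameStoredByValue() {
    ShapeByValue shape;
    shape.fill = ColorBackground();   // sliced: only the base subobject is assigned
    return shape.fill.name();         // dispatches to Background::name()
}

std::string nameStoredByPointer() {
    ShapeByPointer shape;
    shape.fill = std::make_shared<ColorBackground>();
    assert(shape.fill != nullptr);
    return shape.fill->name();        // dispatches to ColorBackground::name()
}
```

The by-value member silently loses the subclass, which is why the fill has to live behind some kind of pointer.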

d-pointers and QSharedData

The KoShapeBackground hierarchy (similar to the KoShape one) uses derived d-pointers for storing private data. To keep things simple, I will use a small example here to explain how they work.

derived d-pointer

class AbstractPrivate
{
public:
    AbstractPrivate() : var(0) {}
    virtual ~AbstractPrivate() = default;

    int var;
};

class Abstract
{
public:
    // it is not yet copy-constructible; we will come back to this later
    // Abstract(const Abstract &other) = default;
    virtual ~Abstract() = default; // virtual, since instances are deleted through Abstract pointers
protected:
    explicit Abstract(AbstractPrivate &dd) : d_ptr(&dd) {}
public:
    virtual void foo() const = 0;
    virtual void modifyVar() = 0;
protected:
    QScopedPointer<AbstractPrivate> d_ptr;
private:
    Q_DECLARE_PRIVATE(Abstract)
};

class DerivedPrivate : public AbstractPrivate
{
public:
    DerivedPrivate() : AbstractPrivate(), bar(0) {}
    virtual ~DerivedPrivate() = default;

    int bar;
};

class Derived : public Abstract
{
public:
    Derived() : Abstract(*(new DerivedPrivate)) {}
    // it is not yet copy-constructible
    // Derived(const Derived &other) = default;
    ~Derived() = default;
protected:
    explicit Derived(AbstractPrivate &dd) : Abstract(dd) {}
public:
    void foo() const override { Q_D(const Derived); cout << "foo " << d->var << " " << d->bar << endl; }
    void modifyVar() override { Q_D(Derived); d->var++; d->bar++; }
private:
    Q_DECLARE_PRIVATE(Derived)
};

The main goal of making DerivedPrivate a subclass of AbstractPrivate is to avoid multiple d-pointers in the structure. Note the constructors that take a reference to the private data object: they make it possible for a Derived object to use the same d-pointer as its Abstract parent. The Q_D() macro converts d_ptr, which is a pointer to AbstractPrivate, into another pointer, named d, of one of its descendant types; here, that is DerivedPrivate. It is used together with the Q_DECLARE_PRIVATE() macro in the class definition and has a rather complicated implementation in the Qt headers. For simplicity, it does not hurt for now to read it as the following:

#define Q_D(Class) Class##Private *const d = reinterpret_cast<Class##Private *>(d_ptr.data())

where Class##Private simply appends the token Private to the macro argument Class.
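For readers without the Qt headers at hand, here is a rough standard-library-only sketch of the same trick. MY_D is a made-up macro name (not part of Qt), std::unique_ptr stands in for QScopedPointer, and I use static_cast rather than reinterpret_cast since the two private classes are related by plain single inheritance.

```cpp
#include <cassert>
#include <memory>

// Simplified stand-in for Q_D; MY_D is a hypothetical name, not a Qt macro.
// Class##Private pastes the two tokens together, so MY_D(Derived) declares
// a local variable `DerivedPrivate *const d`.
#define MY_D(Class) Class##Private *const d = static_cast<Class##Private *>(d_ptr.get())

struct BasePrivate {
    virtual ~BasePrivate() = default;
    int var = 0;
};

struct DerivedPrivate : BasePrivate {
    int bar = 0;
};

class Base {
public:
    virtual ~Base() = default;
protected:
    // Takes ownership of the private object allocated by the most derived class.
    explicit Base(BasePrivate *dd) : d_ptr(dd) { assert(dd != nullptr); }
    std::unique_ptr<BasePrivate> d_ptr;   // the single d-pointer of the hierarchy
};

class Derived : public Base {
public:
    Derived() : Base(new DerivedPrivate) {}
    void bump() { MY_D(Derived); d->var += 1; d->bar += 2; }
    int sum() { MY_D(Derived); return d->var + d->bar; }
};
```

Calling bump() once on a fresh Derived and then sum() reads both the base and the derived field through the one shared d-pointer.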

Now let's test it by creating a pointer to Abstract and giving it a Derived object:

int main()
{
    QScopedPointer<Abstract> ins(new Derived());
    ins->foo();
    ins->modifyVar();
    ins->foo();
}

Output:

foo 0 0
foo 1 1

Looks pretty viable -- everything's working well! -- What if we use Qt's implicit sharing? Just make AbstractPrivate a subclass of QSharedData and replace QScopedPointer with QSharedDataPointer.

making d-pointer QSharedDataPointer

In the last section, we commented out the copy constructors since QScopedPointer is not copy-constructible. QSharedDataPointer is, so we add them back:

class AbstractPrivate : public QSharedData
{
public:
    AbstractPrivate() : var(0) {}
    virtual ~AbstractPrivate() = default;

    int var;
};

class Abstract
{
public:
    Abstract(const Abstract &other) = default;
    virtual ~Abstract() = default;
protected:
    explicit Abstract(AbstractPrivate &dd) : d_ptr(&dd) {}
public:
    virtual void foo() const = 0;
    virtual void modifyVar() = 0;
protected:
    QSharedDataPointer<AbstractPrivate> d_ptr;
private:
    Q_DECLARE_PRIVATE(Abstract)
};

class DerivedPrivate : public AbstractPrivate
{
public:
    DerivedPrivate() : AbstractPrivate(), bar(0) {}
    virtual ~DerivedPrivate() = default;

    int bar;
};

class Derived : public Abstract
{
public:
    Derived() : Abstract(*(new DerivedPrivate)) {}
    Derived(const Derived &other) = default;
    ~Derived() = default;
protected:
    explicit Derived(AbstractPrivate &dd) : Abstract(dd) {}
public:
    void foo() const override { Q_D(const Derived); cout << "foo " << d->var << " " << d->bar << endl; }
    void modifyVar() override { Q_D(Derived); d->var++; d->bar++; }
private:
    Q_DECLARE_PRIVATE(Derived)
};

And testing the copy-on-write mechanism:

int main()
{
    QScopedPointer<Derived> ins(new Derived());
    QScopedPointer<Derived> ins2(new Derived(*ins));
    ins->foo();
    ins->modifyVar();
    ins->foo();
    ins2->foo();
}

But, eh, it's a compile-time error.

error: reinterpret_cast from type 'const AbstractPrivate*' to type 'AbstractPrivate*' casts away qualifiers
 Q_DECLARE_PRIVATE(Abstract)

Q_D, revisited

So, where does the const removal come from? In qglobal.h, the code related to Q_D is as follows:

template <typename T> inline T *qGetPtrHelper(T *ptr) { return ptr; }
template <typename Ptr> inline auto qGetPtrHelper(const Ptr &ptr) -> decltype(ptr.operator->()) { return ptr.operator->(); }

// The body must be a statement:
#define Q_CAST_IGNORE_ALIGN(body) QT_WARNING_PUSH QT_WARNING_DISABLE_GCC("-Wcast-align") body QT_WARNING_POP
#define Q_DECLARE_PRIVATE(Class) \
    inline Class##Private* d_func() \
    { Q_CAST_IGNORE_ALIGN(return reinterpret_cast<Class##Private *>(qGetPtrHelper(d_ptr));) } \
    inline const Class##Private* d_func() const \
    { Q_CAST_IGNORE_ALIGN(return reinterpret_cast<const Class##Private *>(qGetPtrHelper(d_ptr));) } \
    friend class Class##Private;

#define Q_D(Class) Class##Private * const d = d_func()

It turns out that Q_D will call d_func() which then calls an overload of qGetPtrHelper() that takes const Ptr &ptr. What does ptr.operator->() return? What is the difference between QScopedPointer and QSharedDataPointer here?

QScopedPointer's operator->() is a const method that returns a non-const pointer to T; however, QSharedDataPointer has two operator->()s, one being const T* operator->() const, the other T* operator->(), and they have quite different behaviours -- the non-const variant calls detach() (where copy-on-write is implemented), but the other one does not.

qGetPtrHelper() here can only take d_ptr as a const QSharedDataPointer, not a non-const one; so, no matter which d_func() we are calling, we can only get a const AbstractPrivate *. That is exactly the problem here.
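The overload selection can be reproduced without Qt. In the sketch below, TrackingPtr and getPtrHelper are made-up names: the first mimics a pointer with two operator->() overloads (the non-const one only counts calls where a real copy-on-write pointer would detach), the second has the same shape as the qGetPtrHelper() overload quoted above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical pointer wrapper that mimics the two operator->() overloads.
template <typename T>
class TrackingPtr {
public:
    explicit TrackingPtr(T *p) : m_ptr(p) {}
    const T *operator->() const { return m_ptr; }        // read-only access, no detach
    T *operator->() { ++m_detachCalls; return m_ptr; }   // a real COW pointer would detach here
    int detachCalls() const { return m_detachCalls; }
private:
    T *m_ptr;
    int m_detachCalls = 0;
};

// Same shape as the second qGetPtrHelper() overload: the parameter is a const
// reference, so ptr.operator->() always resolves to the const overload.
template <typename Ptr>
auto getPtrHelper(const Ptr &ptr) -> decltype(ptr.operator->()) { return ptr.operator->(); }

struct Data { int value = 0; };

bool helperReturnsConstPointer() {
    Data data;
    TrackingPtr<Data> p(&data);   // p itself is non-const
    (void)p;
    return std::is_same<decltype(getPtrHelper(p)), const Data *>::value;
}

int detachCallsAfterHelper() {
    Data data;
    TrackingPtr<Data> p(&data);
    getPtrHelper(p);              // goes through the const overload
    return p.detachCalls();
}

int detachCallsAfterDirectWrite() {
    Data data;
    TrackingPtr<Data> p(&data);
    p->value = 1;                 // non-const overload is chosen on a non-const object
    assert(data.value == 1);
    return p.detachCalls();
}
```

Even though p is a non-const object, routing it through a const reference hides the non-const operator->(), so the helper can never hand back a mutable pointer.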

To resolve this problem, let's replace the Q_D macros with the ones we define ourselves:

#define CONST_SHARED_D(Class) const Class##Private *const d = reinterpret_cast<const Class##Private *>(d_ptr.constData())
#define SHARED_D(Class) Class##Private *const d = reinterpret_cast<Class##Private *>(d_ptr.data())

We will then use SHARED_D(Class) in place of Q_D(Class) and CONST_SHARED_D(Class) in place of Q_D(const Class). Since the const and non-const variants really behave differently, it helps to differentiate these two uses. Also, delete Q_DECLARE_PRIVATE since we do not need it any more:

class AbstractPrivate : public QSharedData
{
public:
    AbstractPrivate() : var(0) {}
    virtual ~AbstractPrivate() = default;

    int var;
};

class Abstract
{
public:
    Abstract(const Abstract &other) = default;
    virtual ~Abstract() = default;
protected:
    explicit Abstract(AbstractPrivate &dd) : d_ptr(&dd) {}
public:
    virtual void foo() const = 0;
    virtual void modifyVar() = 0;
protected:
    QSharedDataPointer<AbstractPrivate> d_ptr;
};

class DerivedPrivate : public AbstractPrivate
{
public:
    DerivedPrivate() : AbstractPrivate(), bar(0) {}
    virtual ~DerivedPrivate() = default;

    int bar;
};

class Derived : public Abstract
{
public:
    Derived() : Abstract(*(new DerivedPrivate)) {}
    Derived(const Derived &other) = default;
    ~Derived() = default;
protected:
    explicit Derived(AbstractPrivate &dd) : Abstract(dd) {}
public:
    void foo() const override { CONST_SHARED_D(Derived); cout << "foo " << d->var << " " << d->bar << endl; }
    void modifyVar() override { SHARED_D(Derived); d->var++; d->bar++; }
};

With the same main() code, what's the result?

foo 0 0
foo 1 16606417
foo 0 0

... big whoops, what is that random thing there? Well, if we use dynamic_cast in place of reinterpret_cast, the program simply crashes after ins->modifyVar();, indicating that ins's d_ptr.data() is not at all a DerivedPrivate.

virtual clones

The detach() method of QSharedDataPointer calls clone(), which by default does new AbstractPrivate(*d): it creates an instance of AbstractPrivate regardless of what the instance really is, so the DerivedPrivate part is sliced off. Fortunately, it is possible to change that behaviour by specializing the clone() method.

First, we need to make a virtual function in AbstractPrivate class:

virtual AbstractPrivate *clone() const = 0;

(make it pure virtual just to force all subclasses to re-implement it; if your base class is not abstract, you probably want to implement clone() there as well) and then override it in DerivedPrivate:

DerivedPrivate *clone() const override { return new DerivedPrivate(*this); }

Then, specialize the template member function QSharedDataPointer::clone(). As we will re-use it multiple times (for different base classes), it is better to define a macro:

#define DATA_CLONE_VIRTUAL(Class) template<>                      \
    Class##Private *QSharedDataPointer<Class##Private>::clone()   \
    {                                                             \
        return d->clone();                                        \
    }
// after the definition of Abstract
DATA_CLONE_VIRTUAL(Abstract)

It is not necessary to write DATA_CLONE_VIRTUAL(Derived) as we are never storing a QSharedDataPointer<DerivedPrivate> throughout the hierarchy.
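The effect of routing the copy through a virtual clone() can also be sketched without Qt. CowPtr below is a made-up, deliberately naive copy-on-write pointer on top of std::shared_ptr (it is not how QSharedDataPointer is implemented, and the use_count() check is not meant to be thread-safe); slicedCopy shows what a plain new T(*d) does with T fixed to the base class.

```cpp
#include <cassert>
#include <memory>

struct BasePrivate {
    virtual ~BasePrivate() = default;
    virtual BasePrivate *clone() const { return new BasePrivate(*this); }
    int var = 0;
};

struct DerivedPrivate : BasePrivate {
    DerivedPrivate *clone() const override { return new DerivedPrivate(*this); }
    int bar = 0;
};

// Copying through the static type: the DerivedPrivate part is sliced off.
std::unique_ptr<BasePrivate> slicedCopy(const BasePrivate &d) {
    return std::unique_ptr<BasePrivate>(new BasePrivate(d));
}

// Hypothetical copy-on-write pointer; a sketch, not Qt's implementation.
template <typename T>
class CowPtr {
public:
    explicit CowPtr(T *p) : m_d(p) { assert(p != nullptr); }
    const T *constData() const { return m_d.get(); }
    // Mutable access: copy first if the data is shared with another CowPtr.
    T *data() {
        if (m_d.use_count() > 1)
            m_d.reset(m_d->clone());   // virtual call keeps the dynamic type
        return m_d.get();
    }
private:
    std::shared_ptr<T> m_d;
};
```

With a CowPtr<BasePrivate> that really holds a DerivedPrivate, writing through data() on one copy leaves the other copy untouched, and the detached object is still a DerivedPrivate.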

Then test the code again:

foo 0 0
foo 1 1
foo 0 0

-- Just as expected! It continues to work if we replace Derived with Abstract in QScopedPointer:

QScopedPointer<Abstract> ins(new Derived());
QScopedPointer<Abstract> ins2(new Derived(* dynamic_cast<const Derived *>(ins.data())));

Well, now another problem appears: the constructor call for ins2 is ugly and messy. We could, as with the private classes, implement a virtual clone() function for this kind of thing, but that is still not elegant enough, and we cannot use a default copy constructor for any class that contains such QScopedPointers.

What about QSharedPointer, which is copy-constructible? Well, then the copies actually point to the same data structures and no copy-on-write is performed at all. This is still not what we want.
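A tiny sketch of that aliasing, using std::shared_ptr as a stand-in for QSharedPointer (Counter and Holder are made-up names for illustration):

```cpp
#include <cassert>
#include <memory>

struct Counter { int value = 0; };

// The default copy constructor copies the shared_ptr, not the Counter it points to.
struct Holder {
    std::shared_ptr<Counter> d = std::make_shared<Counter>();
};

int valueSeenByCopyAfterWrite() {
    Holder a;
    Holder b = a;        // both holders now point at the same Counter
    assert(a.d == b.d);
    a.d->value = 42;     // nothing detaches, so b observes the write too
    return b.d->value;
}
```

The copy is cheap, but a write through one holder leaks into the other, which is sharing without the copy-on-write part.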

the Descendents of ...

Inspired by Sean Parent's video, I finally came up with the following implementation:

template<typename T>
class Descendent
{
    // note: `concept` is a keyword since C++20; rename this struct when compiling as C++20 or later
    struct concept
    {
        virtual ~concept() = default;
        virtual const T *ptr() const = 0;
        virtual T *ptr() = 0;
        virtual unique_ptr<concept> clone() const = 0;
    };
    template<typename U>
    struct model : public concept
    {
        model(U x) : instance(move(x)) {}
        const T *ptr() const { return &instance; }
        T *ptr() { return &instance; }
        // or unique_ptr<model<U> >(new model<U>(U(instance))) if you do not have C++14
        unique_ptr<concept> clone() const { return make_unique<model<U> >(U(instance)); }
        U instance;
    };

    unique_ptr<concept> m_d;
public:
    template<typename U>
    Descendent(U x) : m_d(make_unique<model<U> >(move(x))) {}

    Descendent(const Descendent & that) : m_d(move(that.m_d->clone())) {}
    Descendent(Descendent && that) : m_d(move(that.m_d)) {}

    Descendent & operator=(const Descendent &that) { Descendent t(that); *this = move(t); return *this; }
    Descendent & operator=(Descendent && that) { m_d = move(that.m_d); return *this; }

    const T *data() const { return m_d->ptr(); }
    const T *constData() const { return m_d->ptr(); }
    T *data() { return m_d->ptr(); }
    const T *operator->() const { return m_d->ptr(); }
    T *operator->() { return m_d->ptr(); }
};

This class allows you to use Descendent<T> (read as "descendent of T") to represent any instance of any subclass of T. It is copy-constructible, move-constructible, copy-assignable, and move-assignable.
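One consequence worth spelling out: a class holding such a wrapper as a member gets a working compiler-generated copy constructor. The sketch below uses a condensed variant of the wrapper so that it compiles on its own; Poly, Holder, Impl, Shape, Polygon and Canvas are all made-up names, and Holder is used instead of concept only to stay clear of the C++20 keyword.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Condensed, hypothetical variant of the wrapper above.
template <typename T>
class Poly {
    struct Holder {
        virtual ~Holder() = default;
        virtual T *ptr() = 0;
        virtual std::unique_ptr<Holder> clone() const = 0;
    };
    template <typename U>
    struct Impl : Holder {
        explicit Impl(U x) : instance(std::move(x)) {}
        T *ptr() override { return &instance; }
        std::unique_ptr<Holder> clone() const override {
            return std::unique_ptr<Holder>(new Impl<U>(instance));   // copies the real U
        }
        U instance;
    };
    std::unique_ptr<Holder> m_d;
public:
    template <typename U>
    Poly(U x) : m_d(new Impl<U>(std::move(x))) {}
    Poly(const Poly &that) : m_d(that.m_d->clone()) {}
    Poly(Poly &&) = default;
    Poly &operator=(Poly that) { m_d = std::move(that.m_d); return *this; }
    T *operator->() { return m_d->ptr(); }
    const T *operator->() const { return m_d->ptr(); }
};

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
    virtual void grow() = 0;
};

struct Polygon : Shape {
    int n = 3;
    int sides() const override { return n; }
    void grow() override { ++n; }
};

// No hand-written copy constructor: the implicit one copies `shape` via clone().
struct Canvas {
    Poly<Shape> shape;
};

int sidesOfCopyAfterGrowingOriginal() {
    Canvas a{Polygon()};
    Canvas b = a;              // implicit copy constructor, deep copy of the Polygon
    a.shape->grow();
    assert(a.shape->sides() == 4);
    return b.shape->sides();   // the copy is unaffected by the change to a
}
```

Note that this variant copies eagerly on every copy of the wrapper; any copy-on-write behaviour still has to come from the wrapped class itself, as with Derived and its QSharedDataPointer.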

Test code:

int main()
{
    Descendent<Abstract> ins = Derived();
    Descendent<Abstract> ins2 = ins;
    ins->foo();
    ins->modifyVar();
    ins->foo();
    ins2->foo();
}

It gives just the same results as before, but the code is much neater and nicer. How does it work?

First we define a class concept, in which we put what we want our instance to satisfy: we would like to access it as const and non-const, and to clone it as-is. Then we define a class template model<U>, where U is a subclass of T, that implements this functionality.

Next, we store a unique_ptr<concept>. The reason for not using QScopedPointer is that it is not movable, and movability is a feature we actually want (for sink arguments and return values).
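As a quick illustration of those two uses, here is a small sketch with made-up names (Node, makeNode, store); it only shows the ownership hand-over that std::unique_ptr supports.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node { int value; };

// Return value: ownership moves out of the function to the caller.
std::unique_ptr<Node> makeNode(int v) {
    return std::unique_ptr<Node>(new Node{v});
}

// Sink argument: the caller hands ownership over to the function.
void store(std::vector<std::unique_ptr<Node>> &out, std::unique_ptr<Node> node) {
    assert(node != nullptr);
    out.push_back(std::move(node));
}

int storedValue() {
    std::vector<std::unique_ptr<Node>> nodes;
    store(nodes, makeNode(7));   // the temporary is moved straight into the sink
    return nodes.front()->value;
}
```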

Finally it's just the constructor, moving and copying operations, and ways to access the wrapped object.

When Descendent<Abstract> ins2 = ins; is called, we will go through the copy constructor of Descendent:

Descendent(const Descendent & that) : m_d(move(that.m_d->clone())) {}

which will then call ins.m_d->clone(). But remember that ins.m_d actually contains a pointer to model<Derived>, whose clone() is return make_unique<model<Derived> >(Derived(instance));. This expression will call the copy constructor of Derived, then make a unique_ptr<model<Derived> >, which calls the constructor of model<Derived>:

model(Derived x) : instance(move(x)) {}

which move-constructs instance. Finally, the unique_ptr<model<Derived> > is implicitly converted to unique_ptr<concept>, as per the conversion rule: "If T is a derived class of some base B, then std::unique_ptr<T> is implicitly convertible to std::unique_ptr<B>."
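That last conversion is easy to try in isolation. In this sketch Base and Derived are generic placeholder classes, not the ones from the examples above, and make_unique needs C++14:

```cpp
#include <cassert>
#include <memory>

struct Base {
    virtual ~Base() = default;   // virtual, so deleting through Base* is fine
    virtual int id() const { return 0; }
};

struct Derived : Base {
    int id() const override { return 1; }
};

// The returned unique_ptr<Derived> temporary converts implicitly to unique_ptr<Base>,
// just like the return statement in model<U>::clone().
std::unique_ptr<Base> makeBase() {
    return std::make_unique<Derived>();
}

int idThroughBasePointer() {
    std::unique_ptr<Base> p = makeBase();
    assert(p != nullptr);
    return p->id();              // still dispatches to Derived::id()
}
```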

And from now on, happy hacking --- (.>w<.)