Lately I have been working on a C++ binary serialization framework that lets you serialize simple data structures with a few lines of code. First, you add the AWL_REFLECT macro to all your structures as follows:
#include "Awl/Reflection.h"
#include <string>
#include <vector>
#include <set>
struct A
{
int a;
bool b;
std::string c;
double d;
AWL_REFLECT(a, b, c, d)
};
struct C
{
int x;
A a;
AWL_REFLECT(x, a)
};
struct B
{
A a;
A b;
int x;
bool y;
std::vector<A> v;
std::set<C> v1;
AWL_REFLECT(a, b, x, y, v, v1)
};
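To give a feeling for what a macro like AWL_REFLECT might do, here is a minimal sketch. The macro name MY_REFLECT, the member_names function and the field_count helper are hypothetical and are not the actual AWL implementation; the sketch only assumes that reflection boils down to an as_tuple member function plus a list of field names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical stand-in for AWL_REFLECT (not the real AWL macro).
// It exposes the listed fields as a tuple of references and keeps
// their names as one comma-separated string.
#define MY_REFLECT(...)                                         \
    auto as_tuple() { return std::tie(__VA_ARGS__); }           \
    auto as_tuple() const { return std::tie(__VA_ARGS__); }     \
    static constexpr std::string_view member_names() { return #__VA_ARGS__; }

struct Point
{
    int x;
    double y;
    std::string label;

    MY_REFLECT(x, y, label)
};

// Generic code can count (and visit) the fields without knowing the struct.
template <class T>
constexpr std::size_t field_count()
{
    return std::tuple_size_v<decltype(std::declval<T&>().as_tuple())>;
}

// The tuple holds references, so writing through it changes the object.
inline void reflect_demo()
{
    static_assert(field_count<Point>() == 3);

    Point p{1, 2.5, "origin"};
    std::get<0>(p.as_tuple()) += 10; // writes to p.x
    assert(p.x == 11);
}
```

A serializer built on such a macro could walk the tuple with std::apply and call Write for each element, which is why no per-structure code is needed.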
Then you define Reader and Writer:
#include "Awl/Io/Vts.h"
// Generate std::variant containing all the data types participating in the serialization.
using V = awl::io::helpers::variant_from_structs<A, B>;
// Define the reader.
template <class IStream>
using Reader = awl::io::Reader<V, IStream>;
// Define the writer.
template <class OStream>
using Writer = awl::io::Writer<V, OStream>;
and serialize structures A and B with the following code:
#include <cassert>
#include <cstdint>
#include <iostream>
#include "Awl/Io/VectorStream.h"
int main()
{
// std::vector that will contain serialized data.
std::vector<uint8_t> v;
const A a_expected = { 1, true, "abc", 2.0 };
const C c_expected = { 7, a_expected };
const B b_expected = { a_expected, a_expected, 1, true, std::vector<A>{ a_expected, a_expected, a_expected }, { c_expected } };
try
{
// Write A and B.
{
// A stream that writes into std::vector.
awl::io::VectorOutputStream out(v);
// Serialization context.
Writer<decltype(out)> ctx;
// Write automatically generated meta information first.
ctx.WriteNewPrototypes(out);
// Write A and B.
ctx.WriteV(out, a_expected);
ctx.WriteV(out, b_expected);
}
// Read A and B.
{
// A stream that reads from std::vector.
awl::io::VectorInputStream in(v);
// Serialization context.
Reader<decltype(in)> ctx;
// Read the meta information first.
ctx.ReadOldPrototypes(in);
// Read A.
A a;
ctx.ReadV(in, a);
assert(a == a_expected);
// Read B.
B b;
ctx.ReadV(in, b);
assert(b == b_expected);
// Ensure we have read the entire stream.
assert(in.End());
}
}
catch (const awl::io::IoException& e)
{
std::cout << "IO error: " << e.What() << std::endl;
return 1;
}
return 0;
}
The serialization is version tolerant. This means that if you wrote structures A and B to a file and then, in a new version of your software, added new fields to your structures or deleted some fields, you are still able to read the new A and B from that file. The only inconvenience with deletion is that you must include the type of the deleted field in the std::variant. For example, if you delete field c from structure A, you define the std::variant in the new version of your software as follows:
// Add std::string to the variant to make the serialization engine aware of the type of the deleted field.
using V = awl::io::helpers::variant_from_structs<A, B, std::string>;
But if you delete field b from structure A, you do not need to add bool to the std::variant, because other fields of type bool still exist in structure B, so bool is included in the std::variant automatically. Likewise, if you delete v1 from B, you do not need to add std::set<C> to the std::variant, because std::vector<C> and std::set<C> are identical at the metadata level: both are sequence<C>.
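The matching of old and new fields described above can be sketched as follows. This is only an illustration under assumptions: FieldInfo, Prototype and match_fields are hypothetical names, the type name "int32_t" is made up for the example, and the real AWL prototype format may differ. The sketch also shows why the type of a deleted field must stay known: the reader has to skip a value of that type in the old data.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical metadata for one field: its name and a type name such as
// "bool" or "sequence<int8_t>". This is not AWL's actual prototype format.
struct FieldInfo
{
    std::string name;
    std::string type;
};

using Prototype = std::vector<FieldInfo>;

// For each field stored in the old data, find the index of the matching
// field in the new structure, or std::nullopt if the field was deleted
// and its value has to be skipped while reading.
inline std::vector<std::optional<std::size_t>> match_fields(
    const Prototype& old_proto, const Prototype& new_proto)
{
    std::vector<std::optional<std::size_t>> map;
    map.reserve(old_proto.size());

    for (const FieldInfo& old_field : old_proto)
    {
        std::optional<std::size_t> target;

        for (std::size_t i = 0; i < new_proto.size(); ++i)
        {
            if (new_proto[i].name == old_field.name &&
                new_proto[i].type == old_field.type)
            {
                target = i;
                break;
            }
        }

        map.push_back(target);
    }

    return map;
}

// The old A had fields a, b, c, d; the new A dropped c and added e.
inline void matching_demo()
{
    const Prototype old_a = {
        {"a", "int32_t"}, {"b", "bool"}, {"c", "sequence<int8_t>"}, {"d", "double"}};
    const Prototype new_a = {
        {"a", "int32_t"}, {"b", "bool"}, {"d", "double"}, {"e", "int32_t"}};

    const auto map = match_fields(old_a, new_a);

    assert(map.size() == 4);
    assert(map[0].value() == 0); // a stays in place
    assert(!map[2].has_value()); // c was deleted: skip a sequence<int8_t>
    assert(map[3].value() == 2); // d moved to index 2
}
```

The new field e has no counterpart in the old data, so in such a scheme it would simply keep its default value.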
You can even rename a field by specializing the FieldMap class template. The code below renames B::v1 to B::v2:
namespace awl::io
{
template <>
class FieldMap<B>
{
public:
static std::string_view GetNewName(std::string_view old_name)
{
using namespace std::literals;
if (old_name == "v1"sv)
{
return "v2"sv;
}
return old_name;
}
};
}
If you add a new field, no action is required as long as the type of the new field is known to the framework. If it is not, you specialize its type descriptor and overload the Read and Write functions, as the code below does for std::optional:
namespace awl::io
{
template <class T>
struct type_descriptor<std::optional<T>>
{
static constexpr auto name()
{
return fixed_string("optional<") + make_type_name<T>() + fixed_string(">");
}
};
static_assert(make_type_name<std::optional<std::string>>() == fixed_string("optional<sequence<int8_t>>"));
template <class Stream, typename T, class Context = FakeContext>
requires sequential_input_stream<Stream>
void Read(Stream & s, std::optional<T>& opt_val, const Context & ctx = {})
{
bool has_value;
Read(s, has_value, ctx);
if (has_value)
{
T val;
Read(s, val, ctx);
opt_val = std::move(val);
}
else
{
// Clear any value the optional held before the read.
opt_val.reset();
}
}
template <class Stream, typename T, class Context = FakeContext>
requires sequential_output_stream<Stream>
void Write(Stream & s, const std::optional<T>& opt_val, const Context & ctx = {})
{
const bool has_value = opt_val.has_value();
Write(s, has_value, ctx);
if (has_value)
{
Write(s, opt_val.value(), ctx);
}
}
}
Advantages
- This serialization technique is simple and intuitive, has close to zero overhead, and its performance is comparable with std::memmove.
- It works directly with C++ structures and does not require additional wrappers or generators, as Protobuf does, for example, and thus allows serializing template classes.
Limitations
- It is neither cross-language nor cross-platform. For example, the representation of arithmetic types depends on the platform, because the framework simply casts them to uint8_t* with reinterpret_cast.
- It can serialize only a tree of objects, not a general graph and not even a directed acyclic graph, because there is no mechanism that prevents an object from being serialized twice (object references are not compared, as the .NET or Java serialization engines do). As a result, serializing a type like std::shared_ptr can be problematic.
- It requires all the types participating in the serialization to be default constructible, which can be problematic in certain scenarios, for example, when we need to initialize std::vector with an instance of an allocator or std::set with an instance of a comparer.
Future Improvements
Non-default constructible types
We need to invent a mechanism to handle non default-constructible types. Assume we have two sets of the same type with different comparers:
#include <set>
#include <vector>
class Compare
{
public:
Compare(bool less) : m_less(less) {}
bool operator () (int a, int b) const
{
if (m_less)
{
return a < b;
}
return b < a;
}
private:
const bool m_less;
};
using Set = std::set<int, Compare>;
struct A
{
Set s;
std::vector<Set> v;
AWL_REFLECT(s, v)
};
int main()
{
A a { Set(Compare(true)), std::vector{Set(Compare(false))}};
// ...
awl::io::Write(<some-stream>, a);
// ..
// How to read it?
return 0;
}
How should structure A be read from a stream? Should the comparers be serialized or not? If they should, the next question is what to do about the allocators.
Further usage of C++20 concepts
Another improvement is to turn is_tuplizable_v and is_reflectable_v from boolean variables into concepts, as in the sample code below:
#include <concepts>
#include <iostream>
class A
{
public:
int foo() { return 25; }
};
template <class T>
concept self_fooable = requires(T& t)
{
t.foo();
};
static_assert(self_fooable<A>);
template <class T> requires self_fooable<T>
constexpr auto object_as_foo(T& val)
{
return val.foo();
}
template <class T>
concept fooable = requires(T & t)
{
object_as_foo(t);
};
static_assert(fooable<A>);
class B {};
constexpr auto object_as_foo(B&)
{
return 1;
}
static_assert(fooable<B>);
int main()
{
A a;
std::cout << object_as_foo(a) << std::endl;
return 0;
}
We need a separate self_fooable concept because as_tuple has to remain a member function: making it a friend function instead would require passing the structure name to the AWL_REFLECT macro as a parameter.
It is not yet clear how to define a serializable concept. For example, the code below is not quite correct, because Read and Write accept different streams:
template <class Stream, class T>
concept serializable = requires(Stream& s, T& val)
{
Read(s, val);
Write(s, std::as_const(val));
};
Reflection for C++26
In the distant future, when C++ supports reflection, we will probably get rid of the AWL_REFLECT macro.
Source Code
The framework is part of the AWL library, available on GitHub. Feel free to clone it and test it.
JSON and other formats
https://github.com/getml/reflect-cpp