How and Why to Serialize Lambdas

Overview

Serializing lambdas can be useful in a number of use cases, such as persisting configuration or acting as a visitor on remote resources.

Remote Visitors

For example, say I want to access a resource in a remote Map. I can use get/put, but if I just want to return one field from a value in the Map, I can pass a lambda as a visitor to extract only the information I want.

MapView<String, UserInfo> userMap =
     Chassis.acquireMap("users", String.class, UserInfo.class);
userMap.put("userid", new UserInfo("User's Name"));
// print out changes
userMap.registerSubscriber(System.out::println);

// obtain just the fullName without downloading the whole object
String name = userMap.applyToKey("userid", u -> u.fullName);

// increment a counter atomically and trigger
// an updated event printed with the subscriber.
userMap.asyncUpdateKey("userid", ui -> {
     ui.usageCounter++;
     return ui;
});

// increment a counter and return the new count
int count = userMap.syncUpdateKey("userid",
      ui -> { ui.usageCounter++; return ui;},
      ui -> ui.usageCounter);

As you can see, it is easy to add various simple functions, or call a method to perform the action you need. The only problem is that lambdas are not serializable by default.

Serializable Lambdas

A simple way to make a lambda serializable is to cast it to an intersection type, adding & Serializable to the functional interface in the cast.

Function<UserInfo, String> fullNameFunc = (Function<UserInfo,String> & Serializable) ui -> ui.fullName;
String fullName = userMap.applyToKey("userid", fullNameFunc);
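
To check the effect of the cast in isolation, the following is a minimal, self-contained sketch. The class and method names (SerializableLambdaDemo, roundTrip, serializableLength) are made up for illustration and are not part of any API mentioned here. It uses plain Java Serialization to write a lambda to a byte array and read a copy back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class SerializableLambdaDemo {

    /** Writes the object with Java Serialization and reads a copy back. */
    @SuppressWarnings("unchecked")
    public static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    /** The intersection cast asks the compiler for a serializable lambda. */
    public static Function<String, Integer> serializableLength() {
        return (Function<String, Integer> & Serializable) s -> s.length();
    }

    public static void main(String[] args) {
        // Without the cast, the lambda's class does not implement Serializable.
        Function<String, Integer> plain = s -> s.length();
        System.out.println(plain instanceof Serializable);                // false
        System.out.println(serializableLength() instanceof Serializable); // true

        // The deserialized copy behaves like the original.
        Function<String, Integer> copy = roundTrip(serializableLength());
        System.out.println(copy.apply("userid"));                         // 6
    }
}
```

Note that deserialization only works when the class that declared the lambda is available on the reading side, because the serialized form refers back to it.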

As you can see, this introduces a lot of boilerplate. A key reason to use lambdas is to avoid boilerplate code, so what is the alternative?

Making Lambdas Serializable in Your API

Unfortunately, the standard APIs cannot be changed or sub-classed to add this, but if you have your own API, you can use a Serializable interface.

@FunctionalInterface
public interface SerializableFunction<I, O> extends Function<I, O>, Serializable {
}

This interface can be used as a parameter type.

default <R> R applyToKey(K key, @NotNull SerializableFunction<E, R> function) {
    return function.apply(get(key));
}

The user of your API doesn't have to explicitly say that the lambda is serializable.

// obtain just the fullName without downloading the whole object
String name = userMap.applyToKey("userid", u -> u.fullName);

The remote implementation serializes the lambda, executes it on the server and returns the result.
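
As a rough illustration of that round trip, here is a self-contained sketch that simulates the "remote" side in-process. The class SerializableApiDemo, the method applyViaSerialization and the cut-down UserInfo are assumptions made for this example, not the real remote implementation. The function is written with Java Serialization, read back as if on a server, and applied to a value that never leaves the "server".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class SerializableApiDemo {

    @FunctionalInterface
    public interface SerializableFunction<I, O> extends Function<I, O>, Serializable {
    }

    /** Cut-down stand-in for the UserInfo value type (hypothetical). */
    public static class UserInfo {
        public final String fullName;

        public UserInfo(String fullName) {
            this.fullName = fullName;
        }
    }

    /** Simulates a remote call: serialize the function, deserialize it, then apply it. */
    public static <E, R> R applyViaSerialization(E value, SerializableFunction<E, R> function) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(function); // only the lambda crosses the "wire"
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                SerializableFunction<E, R> copy = (SerializableFunction<E, R>) in.readObject();
                return copy.apply(value);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not ship the function", e);
        }
    }

    /** The caller writes a plain lambda; no cast to Serializable is needed. */
    public static String remoteFullName(UserInfo user) {
        return applyViaSerialization(user, u -> u.fullName);
    }

    public static void main(String[] args) {
        System.out.println(remoteFullName(new UserInfo("User's Name")));
    }
}
```

The parameter type does the work here: because the target type of the lambda already extends Serializable, the compiler generates a serializable lambda without any hint from the caller.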

Similarly, there are methods for applying a lambda to the map as a whole.

Query and subscription

To support queries, you can't use the built-in stream() API if you want to implicitly add Serializable. However, you can create one which is as similar as possible.

Map<Integer, List<UserInfo>> collect = userMap.entrySet().query()
    .filter(e -> e.getKey().matches("u.*d"))
    .map(e -> e.getValue())
    .collect(Collectors.groupingBy(u -> u.usageCounter));

or as a filtered subscription.

// print userid which have a usageCounter > 10 each time it is incremented.
userMap.entrySet().query()
        .filter(e -> e.getValue().usageCounter > 10)
        .map(e -> e.getKey())
        .subscribe(System.out::println);

What makes this different from the regular stream API is that the data could be distributed across many servers, and you get a callback when it changes on any server. Only the data you are interested in is sent across the network, as the filter and map are applied on the server.
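
To give a feel for how such a stream-like API could be declared, here is a minimal local sketch. All names in it (QueryDemo, Query, SerializablePredicate, longNames) are hypothetical, and it simply delegates to an in-memory Stream. A distributed implementation would instead serialize the predicate and mapper and run them on the servers holding the data.

```java
import java.io.Serializable;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class QueryDemo {

    public interface SerializablePredicate<T> extends Predicate<T>, Serializable {
    }

    public interface SerializableFunction<I, O> extends Function<I, O>, Serializable {
    }

    /** Stream look-alike whose lambda parameters are implicitly serializable. */
    public static final class Query<T> {
        private final Stream<T> stream;

        private Query(Stream<T> stream) {
            this.stream = stream;
        }

        public static <T> Query<T> of(List<T> source) {
            return new Query<>(source.stream());
        }

        public Query<T> filter(SerializablePredicate<? super T> predicate) {
            // A remote implementation would ship the predicate instead of running it here.
            return new Query<>(stream.filter(predicate));
        }

        public <R> Query<R> map(SerializableFunction<? super T, ? extends R> mapper) {
            return new Query<>(stream.map(mapper));
        }

        public <R, A> R collect(Collector<? super T, A, R> collector) {
            return stream.collect(collector);
        }
    }

    /** Usage reads just like the standard stream API. */
    public static List<String> longNames(List<String> names) {
        return Query.of(names)
                .filter(n -> n.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(longNames(List.of("al", "alice", "bob", "carol"))); // [ALICE, CAROL]
    }
}
```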

Java Serialization

Java Serialization is a good generalised, backward-compatible serialization library. Two of the most common problems that alternatives try to solve are performance and cross-platform serialization.

In the example above, fullNameFunc serializes to over 700 bytes, and there are very limited options to optimise this to reduce either the size of the message or the amount of garbage it produces. By comparison, a straightforward binary YAML serialization uses 348 bytes, with more options to optimise the serialization.
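
If you want to measure the Java Serialization size of one of your own lambdas, a small helper along these lines should do. The names LambdaSizeDemo and serializedSize are made up for this sketch. The exact number you see depends on the class, package and method names involved, so it will not match the figures above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

public class LambdaSizeDemo {

    /** Returns the number of bytes Java Serialization writes for the object. */
    public static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Function<String, Integer> length =
                (Function<String, Integer> & Serializable) s -> s.length();

        // Compare a one-line lambda with a short String payload.
        System.out.println("lambda: " + serializedSize(length) + " bytes");
        System.out.println("string: " + serializedSize("userid") + " bytes");
    }
}
```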

This raises the problem of how to serialize a lambda using an alternative, cross-platform or faster serialization format.

Alternative Serialization

You can hook into the current serialization mechanism. This is not supported and could change at any time, but there is no supported way to do this.

Nevertheless, you can do this:

Method writeReplace = lambda.getClass()
                                  .getDeclaredMethod("writeReplace");
writeReplace.setAccessible(true);
SerializedLambda sl = (SerializedLambda) writeReplace.invoke(lambda);

This gives you an object you can inspect to extract the contents of the lambda, either to see what method it calls or to serialize it. On the deserialization side, you can recreate this object and call readResolve on it.
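
To show what the extracted object contains, here is a self-contained sketch built on the same unsupported hook. The class and helper names (LambdaIntrospectionDemo, serializedForm, greeter) are invented for this example. The accessors used are the public ones on java.lang.invoke.SerializedLambda. It only covers the extraction side, not recreating the lambda.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class LambdaIntrospectionDemo {

    /** Extracts the serialized form through the unsupported writeReplace hook. */
    public static SerializedLambda serializedForm(Object lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Not a serializable lambda: " + lambda, e);
        }
    }

    /** A serializable lambda that captures one argument, the prefix. */
    public static Function<String, String> greeter(String prefix) {
        return (Function<String, String> & Serializable) name -> prefix + name;
    }

    public static void main(String[] args) {
        SerializedLambda sl = serializedForm(greeter("Hello "));

        System.out.println(sl.getFunctionalInterfaceClass()); // java/util/function/Function
        System.out.println(sl.getImplClass());                // class holding the lambda body
        System.out.println(sl.getImplMethodName());           // synthetic name chosen by the compiler
        System.out.println(sl.getImplMethodSignature());      // JVM descriptor of the lambda body
        System.out.println(sl.getCapturedArgCount());         // 1
        System.out.println(sl.getCapturedArg(0));             // the captured prefix
    }
}
```

These fields, together with the captured arguments, are what an alternative serialization format would need to write out.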

Standard API

Currently, there is no standard API for introspection of a lambda. This is done deliberately so that the implementation can be changed in future, although there is no public JEP to do so. However, as with Unsafe, which is an internal API, I look forward to the day when we can use a standard API rather than having to dig into the internals of the JVM to implement solutions.

Conclusions

With some changes to your API, you can make serializing lambdas largely transparent to the developer. This makes simple distributed systems easier to implement and use, while giving you options to optimise how this is done.



