GemPBA Now Speaks Java

GemPBA just learned a new language. 🎉

As of the v4.1 line, you can drive GemPBA’s parallel branch-and-bound engine straight from the JVM. Add one Maven dependency, write plain Java, and let the same battle-tested C++ scheduler do the heavy lifting underneath. No JNI to hand-roll, no native libraries to chase down, no LD_LIBRARY_PATH archaeology.

That’s a big door to open. Whole worlds of work live on the JVM (Spark pipelines, Airflow jobs, enterprise services), and until now, using a serious parallel optimizer from any of them meant writing C++ and bridging it yourself. Now you just import it.

The same engine, now in Java

The Java API mirrors the C++ one almost beat for beat. If you’ve used GemPBA before, this will look immediately familiar:

LoadBalancer lb = GemPBA.createLoadBalancer(BalancingPolicy.QUASI_HORIZONTAL);
NodeManager  nm = GemPBA.createNodeManager(lb);
nm.setGoal(Goal.MAXIMISE, ScoreType.I32);
nm.setThreadPoolSize(8);

Node seed = GemPBA.createSeedNode(lb, myTask, 0, intToBytes, bytesToInt);
nm.tryLocalSubmit(seed);
nm.waitForCompletion();

System.out.println(nm.getScore());
GemPBA.shutdown();

Set a goal, seed the recursion, wait, read the result. It’s the exact shape GemPBA users already know, now from Java.

How it stays fast

A binding is only worth having if it doesn’t tax you for the convenience, so this one was built to get out of the way:

One JAR, every platform. Each published artifact bundles native binaries for Linux, Windows, and macOS, and a loader picks the right one at runtime. You add a dependency, not a build system.
Bytes at the boundary. Java has no C++ templates, so task arguments and results cross over as bytes: you hand GemPBA a serializer/deserializer pair when you seed. The neat part is that on a single machine only the seed is serialized (children capture their arguments from the Java closure), so you pay that cost exactly where work actually moves. And the same serializer code scales from threads on one box to a full MPI cluster by switching a single Maven classifier. No rewrite.
The numbers hold up. On a depth-25 search tree (33M+ leaves) on a 24-core machine, the multithreaded Java run stayed within a fraction of a percent of native C++, and the multiprocess run paid only about 6% overhead. Convenience, basically for free.
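
To make the "bytes at the boundary" idea concrete, here is a minimal sketch of what a serializer/deserializer pair for an int argument could look like. The names intToBytes and bytesToInt are borrowed from the seed call earlier in this post, but their types here (plain java.util.function.Function values) and the IntCodec wrapper class are assumptions for illustration only. The Java guide has the signatures GemPBA actually expects.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

public class IntCodec {
    // Assumption: the serializer is modelled as a plain Function<Integer, byte[]>.
    // It writes the int into a 4-byte buffer (big-endian, ByteBuffer's default).
    public static final Function<Integer, byte[]> intToBytes =
            value -> ByteBuffer.allocate(Integer.BYTES).putInt(value).array();

    // Assumption: the deserializer is the inverse, a Function<byte[], Integer>.
    // It reads the first four bytes back as a big-endian int.
    public static final Function<byte[], Integer> bytesToInt =
            bytes -> ByteBuffer.wrap(bytes).getInt();

    public static void main(String[] args) {
        byte[] wire = intToBytes.apply(42);   // what would cross the boundary
        int back = bytesToInt.apply(wire);    // what the other side would rebuild
        System.out.println(wire.length + " bytes, round trip = " + back);
        // prints "4 bytes, round trip = 42"
    }
}
```

The only real requirement in a pair like this is that the two functions are exact inverses of each other. Byte order does not matter as long as both sides agree on it, which is automatic when the same Java code runs on every thread or rank.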

Under all of it sits a new stable C ABI, the foundation a future Python, Rust, or .NET binding will sit on too. Java just happens to be the first guest.

The full Java guide, requirements (JDK 25, Maven 3.9+), and a pile of runnable examples are in the Java documentation and the gempba-java-examples repo.

Install it, don’t clone it

The other half of growing up: you don’t clone GemPBA to use it anymore. You install it, like any real library.

GemPBA now ships as native packages for apt, Homebrew, and MSYS2/pacman (and as a Maven dependency for Java). It comes in two flavors, plain multithreading and MPI-powered multiprocessing, that install side by side, and you pick the one you want at find_package time, not in your code. Your source stays identical either way. A bonus that quietly removes a lot of friction: multithreading builds no longer require MPI to be installed at all.

sudo apt install libgempba-dev          # multithreading
brew install rapastranac/gempba/gempba  # macOS

Add the -mpi package when you want to go distributed.

See what your run is actually doing

Long parallel runs used to be black boxes. You kicked off a job across 24 cores or six machines and just… waited. Not anymore.

Since v3.3.0, and on by default in every v4 build, GemPBA ships a runtime telemetry hub: live worker and node activity over TCP and MPI (on its own private communicator, so it never fights your application’s traffic), plus hardware topology probed through hwloc. Point a viewer at it and watch load balance, idle workers, and lopsided ranks in real time. Those are exactly the things that decide whether a big run finishes in minutes or hours. Don’t want it? One runtime call turns it off.

The telemetry docs cover how to turn it on, connect a viewer, and read the stream.

Moving up from v3

If you’re coming from the v3.0.0 redesign (announced here), the upgrade is gentle: the call site got simpler (an unqualified gempba::create_*), the multithreading/multiprocessing namespaces were renamed for clarity, the examples moved to their own repo, and apt’s base package is now multithreading-only (add -mpi for distributed). The release notes walk through every step.

Go build something big

GemPBA started life as a fast C++ research framework. It’s now an installable, JVM-speaking, self-observing distributed library, without giving up the performance that was the entire point.

Grab it from the docs, browse the internals on DeepWiki, or dive into the source. Then go point it at a problem that used to be too big.

Happy solving!
