2018-04-04

GitHub project: mikenakis-agentclaire

A Java Agent to end all Java Agents.

The mikenakis-agentclaire logo, based on a piece of clip art found on the interwebz.

The Problem


Java Agents are a feature of the Java Virtual Machine used for transforming the bytecode of classes as they are loaded, before they are used. We register a Java Agent with the JVM at startup, and each time a class is loaded, the JVM invokes our Java Agent to give it a chance to transform the class.

The JVM supplies the Java Agent with each class to be transformed in the form of an array of bytes, and expects the Java Agent to return the transformed class also as an array of bytes.
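For reference, here is a minimal sketch of what a conventional Java Agent looks like when written directly against the standard `java.lang.instrument` API. The class names `PassThroughAgent` and `CountingTransformer` are made up for illustration; the agent does nothing except count the classes it is shown.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

/** A do-nothing agent; the class names are hypothetical. */
final class PassThroughAgent {
    /** Invoked by the JVM before main() when this agent's jar is named in -javaagent. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new CountingTransformer());
    }

    /** Receives every loaded class as raw bytes and leaves it unchanged. */
    static final class CountingTransformer implements ClassFileTransformer {
        final AtomicInteger classesSeen = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            classesSeen.incrementAndGet();
            // Returning null tells the JVM that no transformation was performed.
            return null;
        }
    }
}
```

To actually run as an agent, the jar containing this class would also need a `Premain-Class` entry in its manifest.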

This means that the Java Agent has to parse the array of bytes in order to build some kind of object graph representing the class, manipulate the object graph to apply the intended transformation, and then re-pack the object graph into another array of bytes before returning it to the JVM.
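To give a feel for what "parsing the array of bytes" involves, the following sketch reads only the first few fields of a class file: the magic number, the version, and the constant pool count. A real parser must go on to decode the entire constant pool, fields, methods, and attributes. The class `ClassFileHeader` is a hypothetical helper, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper that decodes just the fixed-size start of a class file. */
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            // Every class file starts with the 4-byte magic number 0xCAFEBABE.
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort();   // u2 minor_version
            int major = in.readUnsignedShort();   // u2 major_version
            int poolCount = in.readUnsignedShort(); // u2 constant_pool_count
            return new ClassFileHeader(minor, major, poolCount);
        } catch (IOException e) {
            // Truncated input ends up here as an EOFException.
            throw new UncheckedIOException(e);
        }
    }
}
```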

Furthermore, the class being transformed may reference other classes, which the Java Agent may need to examine in order to perform the intended transformation. So, for each class being transformed, the Java Agent may need to load multiple arrays of bytes from the filesystem, and do more parsing of bytes and building of object graphs in order to make sense of them.

Now, a program may be started with multiple Java Agents attached to it, so all this parsing and repacking will be repeated by each Java Agent. 

This represents an insane amount of overhead.

Most people do not mind this overhead, because it is only incurred during application startup, and for some reason it has come to be that the entire computing industry today is resigned to slow-as-molasses startup times that are simply to be endured as a fact of life.

I beg to differ. I like things to be snappy, especially at startup, because as a programmer, waiting for the application that I am developing to start up tends to represent a considerable portion of my working day.  

Also, the same bytecode transformations are usually performed when running tests, and I want my tests to be running as quickly as possible, without unnecessary overhead.


The Solution


AgentClaire is a Java Agent that can be used to simplify and optimize the process of examining and transforming bytecode during application startup.

With AgentClaire, instead of writing Java Agents, you write AgentClaire Interceptors.

The difference between a Java Agent and an AgentClaire Interceptor is that instead of receiving an array of bytes and returning an array of bytes, the interceptor is given an instance of `ByteCodeType` (see the `mikenakis-bytecode` project) which has been constructed by parsing the original array of bytes just once.

Then, once each AgentClaire Interceptor has had a chance to modify the `ByteCodeType`, AgentClaire takes care of repacking it just once into an array of bytes before returning it to the JVM.

This way, a considerable amount of time is saved during startup by parsing and repacking bytes only once per class, instead of once per class per Java Agent.
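The parse-once, repack-once idea can be sketched generically as follows. This is not AgentClaire's actual code: `ParseOncePipeline` is a hypothetical class, and the type parameter `T` merely stands in for a parsed class model such as `ByteCodeType`.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical illustration: one parse and one repack, regardless of interceptor count. */
final class ParseOncePipeline<T> {
    private final Function<byte[], T> parser;
    private final Function<T, byte[]> packer;
    private final List<Consumer<T>> interceptors;

    ParseOncePipeline(Function<byte[], T> parser, Function<T, byte[]> packer,
            List<Consumer<T>> interceptors) {
        this.parser = parser;
        this.packer = packer;
        this.interceptors = interceptors;
    }

    byte[] transform(byte[] classBytes) {
        T model = parser.apply(classBytes);      // parse exactly once
        for (Consumer<T> interceptor : interceptors) {
            interceptor.accept(model);           // each interceptor mutates the shared model
        }
        return packer.apply(model);              // repack exactly once
    }
}
```

With N separate Java Agents the parser and packer would each run N times per class; here they each run once, no matter how many interceptors are registered.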

Furthermore, the `Umbilical` interface that is provided by AgentClaire contains a `ByteCodeService` service which can be used to obtain an instance of `ByteCodeType` given a class name and a `ClassLoader`. The `ByteCodeService` achieves this either by looking up the `ByteCodeType` from a cache of all those that have already been loaded, or by loading the class bytes and parsing them to construct a new instance of `ByteCodeType`.

This way, an interceptor can easily and efficiently obtain classes that are referenced by the class being transformed. Furthermore, the `ByteCodeService` is also available to the main program, so it can have read-only access to bytecode information about its own classes. As a result, no class ever needs to be parsed from bytes more than once throughout the lifetime of the JVM.
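The lookup-or-load behaviour described above amounts to a cache keyed by class name and `ClassLoader`. Here is a minimal sketch of such a cache; `ParsedClassCache` and its `loadAndParse` callback are hypothetical names and say nothing about how `ByteCodeService` is really implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical cache: each (class name, class loader) pair is parsed at most once. */
final class ParsedClassCache<T> {
    private static final class Key {
        final String className;
        final ClassLoader loader; // may be null for the bootstrap loader

        Key(String className, ClassLoader loader) {
            this.className = className;
            this.loader = loader;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) return false;
            Key that = (Key) other;
            // Class loaders are compared by identity.
            return className.equals(that.className) && loader == that.loader;
        }

        @Override
        public int hashCode() {
            return 31 * className.hashCode() + System.identityHashCode(loader);
        }
    }

    private final Map<Key, T> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, ClassLoader, T> loadAndParse;

    ParsedClassCache(BiFunction<String, ClassLoader, T> loadAndParse) {
        this.loadAndParse = loadAndParse;
    }

    T get(String className, ClassLoader loader) {
        // The expensive load-and-parse step runs only on a cache miss.
        return cache.computeIfAbsent(new Key(className, loader),
                key -> loadAndParse.apply(key.className, key.loader));
    }
}
```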


Usage


AgentClaire is started via the `-javaagent` option of the JVM.  The option looks like this:

    -javaagent:<AgentClaire-jar>=<interceptor-jar>[;<another-interceptor-jar>...]

This will cause the JVM to load AgentClaire before the main program runs, and AgentClaire will then load each of the specified interceptor jars. Each interceptor jar must contain a `MANIFEST.MF` with an `AgentClaire-Interceptor-Class` entry that gives the fully qualified name of a class to be instantiated by AgentClaire.  (No stinkin' static entry points.)
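The JVM hands everything after the `=` sign to the agent as a single string, so the agent has to split the list of jars itself and then look inside each jar's manifest. The sketch below shows one way this could be done; `InterceptorJarOptions` is a hypothetical helper, and it assumes that the `AgentClaire-Interceptor-Class` entry lives in the main section of the manifest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

/** Hypothetical helper; not taken from the AgentClaire sources. */
final class InterceptorJarOptions {
    static final String INTERCEPTOR_CLASS_ATTRIBUTE = "AgentClaire-Interceptor-Class";

    /** Splits the agent argument string on ';' and drops empty items. */
    static List<String> splitJarPaths(String agentArgs) {
        List<String> paths = new ArrayList<>();
        if (agentArgs == null) {
            return paths; // -javaagent was given without "=..."
        }
        for (String part : agentArgs.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    /** Reads the interceptor class name (assumed to be a main manifest attribute). */
    static String interceptorClassName(Manifest manifest) {
        String name = manifest.getMainAttributes().getValue(INTERCEPTOR_CLASS_ATTRIBUTE);
        if (name == null) {
            throw new IllegalArgumentException("missing " + INTERCEPTOR_CLASS_ATTRIBUTE);
        }
        return name.trim();
    }
}
```

Note that on Unix-like shells the `;` separator has to be quoted or escaped, otherwise the shell treats it as the end of the command.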

This class must have a constructor which accepts two parameters:
 - an `Umbilical` interface (see the `mikenakis-umbilical` project)
 - an `AgentClaire` interface (see the `mikenakis.agentclaire-main` project)

The constructed class can use the `AgentClaire` interface to register itself as an `Interceptor`. Once this is done, the interceptor's entry point will be invoked for each class that is loaded by the JVM, giving the interceptor a chance to modify it.
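To show the shape of such a class, here is a rough sketch of an interceptor. The four interfaces declared at the top are placeholders so that the example compiles on its own: the type names come from the text above, but the method names `addInterceptor`, `intercept`, and `name` are invented for this sketch and may well differ from the real ones in the AgentClaire projects.

```java
import java.util.ArrayList;
import java.util.List;

// Placeholder declarations: type names as in the text, method names invented.
interface ByteCodeType { String name(); }
interface Umbilical { }
interface Interceptor { void intercept(ByteCodeType type); }
interface AgentClaire { void addInterceptor(Interceptor interceptor); }

/** Hypothetical interceptor that only records the names of the classes it is shown. */
final class LoggingInterceptor implements Interceptor {
    final List<String> seenClassNames = new ArrayList<>();

    /** The two-parameter constructor that AgentClaire is described as calling. */
    public LoggingInterceptor(Umbilical umbilical, AgentClaire agentClaire) {
        // Register ourselves so that we get called for every loaded class.
        agentClaire.addInterceptor(this);
    }

    @Override
    public void intercept(ByteCodeType type) {
        // A real interceptor would examine or modify the parsed class here.
        seenClassNames.add(type.name());
    }
}
```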


