How to create new image operations

From Endrov

This section describes the specifics of a new image operation. If you want to know how to put it in a plugin, see How to make plugins.


Overview

Endrov has a clear separation between the image processing algorithms and the user interface.

  • Image operations are accessed using Flows. If the algorithm is to be available to the user, an implementation of FlowUnit is required.
  • The algorithm itself is an implementation of EvOpGeneral. It is possible to put the algorithm directly in a flow unit only, but this is strongly discouraged.

The algorithm

Algorithms work on one of the following levels (reads and writes)

  • Image planes
  • Stacks
  • Channels

with increasing complexity. Users expect to be able to apply an operation on any level, since e.g. an image plane is just a trivial stack or channel. EvOpGeneral requires that an algorithm handles all levels, which would be a lot of work for the programmer. Instead, most programmers want to extend one of

  • EvOpSlice
  • EvOpStack
  • EvOpChannel

which require an implementation for a single level only and then make it available on all other levels as well.

To learn the details, read the code of a simple algorithm, e.g. EvOpImageAddImage.
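To illustrate the single-level pattern, here is a minimal, self-contained sketch. It is not Endrov's actual API: the class and method names are hypothetical stand-ins, and planes are modeled as plain double arrays. The point is that the algorithm is written once for a single plane, and the framework can then lift it to stacks (and channels) mechanically:

```java
// Hypothetical stand-in for the EvOpSlice idea: write the algorithm
// once for a single 2D plane, lift it to higher levels plane-wise.
public class SliceOpSketch {
    /** Add two planes of equal size, element by element. */
    public static double[] addPlanes(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++)
            out[i] = a[i] + b[i];
        return out;
    }

    /** A stack is just an array of planes; the plane-level operation
        is applied to each plane in turn. */
    public static double[][] addStacks(double[][] a, double[][] b) {
        double[][] out = new double[a.length][];
        for (int z = 0; z < a.length; z++)
            out[z] = addPlanes(a[z], b[z]);
        return out;
    }
}
```

In Endrov the lifting is done by the EvOp* base classes, so only the plane-level method needs to be written.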

Operation classes should be named EvOpAbcDef. This signals that the class is meant to be used by others (it is not a helper class), and it can be listed together with other operations using auto-completion (Ctrl+Space in Eclipse).

Accessing pixel data

Each image plane is stored in an EvPixels object. Pixels can be of the following types:

  • Unsigned 8-bit integer
  • Signed 16-bit integer
  • Signed 32-bit integer
  • 32-bit float
  • 64-bit float

The workflow philosophy requires that all algorithms support all formats to the greatest extent possible. On the other hand, this should not create additional work for developers. Endrov therefore employs implicit conversion:

8-bit int -> 16-bit int -> 32-bit int -> 32-bit float -> 64-bit float

A conversion is just a cast, with no rescaling. The data will always be in range when it follows the arrows, hence the conversion is safe. Only if an algorithm supports only types to the left of the current one will the data be converted in the left direction; in that case it is up to the user to make sure the data fits the type.
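The following self-contained sketch (plain Java primitive casts, not Endrov code) shows both directions: widening along the arrows preserves the value, while narrowing against the arrows can mangle data that does not fit the target type:

```java
public class CastChain {
    /** Follow the arrows: 16-bit int -> 32-bit int -> 64-bit float.
        Each step is a plain cast and the value is preserved. */
    public static double widen(short s) {
        int asInt = s;           // 16-bit -> 32-bit: always in range
        double asDouble = asInt; // 32-bit int -> 64-bit float: always exact
        return asDouble;
    }

    /** Against the arrows: the caller must ensure the value fits,
        exactly as the text requires of the user. */
    public static short narrow(int v) {
        return (short) v; // silently wraps if v is out of range
    }
}
```

For example, narrowing 70000 to a 16-bit value wraps around, which is why left-direction conversion puts the responsibility on the user.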

More details on how this is implemented will follow. As of this writing, only some types are supported, mainly 32-bit int and 64-bit float.

Conversion is done like this:

a=a.convertTo(EvPixels.TYPE_INT, true);

Here true means that the data will be treated as read-only. If the data is already of the right type, this operation will not create a copy, hence it is free. Given the policy of never modifying data, this is how it will normally be used.
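The "no copy when already the right type" behavior can be sketched as follows. This is an illustration of the policy, not the actual EvPixels implementation; the class and method are hypothetical:

```java
public class ConvertSketch {
    /** Return the data as int[]. If it already is an int[], return it
        as-is (no copy); otherwise widen it by plain casts. */
    public static int[] asReadOnlyInt(Object data) {
        if (data instanceof int[])
            return (int[]) data;        // right type already: free
        if (data instanceof short[]) {
            short[] src = (short[]) data;
            int[] out = new int[src.length];
            for (int i = 0; i < src.length; i++)
                out[i] = src[i];        // widening cast, value preserved
            return out;
        }
        throw new IllegalArgumentException("unsupported pixel type");
    }
}
```

Because callers promise not to modify the returned array, returning the original array instead of a copy is safe.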

After conversion, get the array using the method corresponding to the type (if you are lucky, you will get an error when you use the wrong method):

int[] aPixels=a.getArrayInt();

Although the array is one-dimensional, it contains the entire image plane. This is an optimization (memory locality and alignment). The index corresponding to (x,y) is calculated as index=y*width+x, or using the methods in EvPixels. Calculating it manually results in faster algorithms than invoking the method. Further optimization is possible by avoiding multiplications, e.g.:

int index=y*width+startx; //Point to the start of the current line
for(int x=startx;x<xend;x++)
 {
 f(array[index]); //Do something with the pixel; index corresponds to (x,y)
 index++;         //Move to the pixel to the right
 }

Only one multiplication is used for the entire line which is a rather big saving. Also:

int i=y*width+x;     //Reference pixel
int left=i-1;        //To the left
int right=i+1;       //To the right
int up=i-width;      //Above
int down=i+width;    //Below
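Both tricks combine naturally in a full pass over an image. Here is a self-contained sketch (the filter choice is just an example, not an Endrov operation) of a 4-neighbor mean over the interior pixels, using one multiplication per line and the neighbor offsets above:

```java
public class NeighborSketch {
    /** Replace each interior pixel with the mean of its four
        neighbours; border pixels are copied unchanged. */
    public static int[] mean4(int[] src, int width, int height) {
        int[] out = src.clone();
        for (int y = 1; y < height - 1; y++) {
            int i = y * width + 1; // one multiplication per line
            for (int x = 1; x < width - 1; x++, i++)
                out[i] = (src[i - 1] + src[i + 1]       // left, right
                        + src[i - width] + src[i + width]) / 4; // up, down
        }
        return out;
    }
}
```

Note that the result is written to a fresh array: following the policy above, the source pixels are never modified.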

Internals: Laziness

The Endrov workflow philosophy is that data should never be modified and the user should never have to worry about keeping computation time down. For example, when adjusting the contrast/brightness of a channel, it should be possible to do so on all the data. If this were implemented naively, it could trigger 40 GB of data to be loaded from disk and processed. To avoid this, Endrov uses lazy semantics: the result is only calculated when it is required. Contrast/brightness would only be evaluated for the image plane the user is looking at.

As an analogy, 1234*5678 is easy to remember and carry around: you carry the expression, you do not first evaluate it to produce 7006652. Only if someone asks for the actual number do you do so. The operation here is "*"; in Endrov it could have been EvOpImageMulImage.
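The analogy above can be sketched in plain Java using java.util.function.Supplier; this is only an illustration of lazy semantics, not Endrov's actual machinery:

```java
import java.util.function.Supplier;

public class LazySketch {
    /** How many times the multiplication was actually evaluated. */
    static int evaluations = 0;

    /** Build the expression a*b without computing it. The work is
        deferred until someone calls get() on the returned supplier. */
    public static Supplier<Integer> lazyMul(int a, int b) {
        return () -> {
            evaluations++;
            return a * b;
        };
    }
}
```

Constructing the expression costs nothing; only calling get() triggers the computation. In Endrov the same deferral happens per plane or per stack, inside the EvOp* base classes.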

The lazy machinery is all hidden in EvOp*. EvOpSlice will lazily load one image plane at a time, EvOpStack one stack at a time, and so on. This makes EvOpSlice operations particularly cheap to work with. The only reason they cannot always be used is that they force the operation to be 2D. Some operations can be made almost-2D, for example convolving with a 3x3x3 kernel, which only requires 3 planes to be loaded to evaluate a single plane. If you wish to implement this optimization, you have to step back and use EvOpGeneral. Study EvOpSlice to see how the internals work.

Flows

Every unit that can be part of a flow inherits FlowUnit. This class is complicated; most programmers should instead extend FlowUnitBasic. A unit consists of the following:

  • Rendering of the component (done by FlowUnitBasic)
  • Optional Swing component (FlowUnitBasic has none)
  • Handling of sub-units (FlowUnitBasic has none)
  • Definition of input and output argument names and types (functions: getTypes*)
  • Loading and saving of unit data to file (functions: to/fromXML)
  • The algorithm (function: evaluate)

It is most constructive to study the code of existing simple units, such as FlowUnitAdd.

Future development

Operations should be able to run on OpenCL once it becomes mainstream. Because many users will not have access to it for years, both pure-Java and OpenCL implementations will be necessary. The memory bus is a big potential bottleneck; EvOp* has to take care of data transport so that consecutive OpenCL algorithms can work on the data without additional transfers.

Algorithms will be templated for all data formats. Java generics cannot do this because they are based on type erasure. Details on how this is worked around will follow.

Design decision: Stacks vs image planes

Stacks are split into multiple arrays (one EvPixels per plane) to speed up loading and conserve memory. This makes the memory non-contiguous, which rules out some optimizations. I judged that a contiguous layout was not worth the problems it would bring, and it would have required further measures anyway; for example, the memory would have to be aligned on 4- or 8-byte boundaries to speed up FFTW. With the arrival of OpenCL it is even harder to predict what else would be needed, and all of these measures would make life harder for general users while helping only a few extreme users.

Design decision: Pixel data

Choosing the trade-offs was difficult. The design is based on the following observations:

  • Both integer and floating-point formats are needed, for different applications
  • Many integer image operations can produce negative data
  • Microscopes internally produce unsigned 8-, 12- and 16-bit data
  • Java cannot easily cope with unsigned data
  • It is frustrating for a user to manually cope with algorithms that support different formats
  • The pixel intensity range is a limit of the computer representation and has nothing to do with the physics, but it matters when data is displayed