Computer Graphics and Interactive Systems Laboratory

ESIP

Satellite images can reveal information about the Earth's surface, weather, geographic areas, pollution, and natural phenomena. The Environment oriented Satellite Data Processing Platform (ESIP) is based on the gProcess platform developed by the MedioGrid national research project. ESIP layers on gProcess a set of Web and Grid services and basic image-oriented operators: arithmetic (e.g. addition, division, exponent, complement), radiometric transformations (e.g. histogram equalization, mean, histogram scaling), spatial domain transformations (e.g. blurring, convolution), edge and line detection (e.g. gradient transformation), pseudo-colouring, geometric transformations (e.g. rotation, scaling), statistics (e.g. histogram computation, standard deviation of pixel values), and non-spatial domain transforms (e.g. forward and inverse Fourier transformation). A few multispectral imagery operators have been experimented with as well: vegetation indices (DVI, NDVI, IPVI, RVI, OSAVI, GEMI), water indices (NDWI, VWC, WI), nonlinear interpolation, and change detection. ESIP lets the user explore optimal solutions for Grid processing and information searching in the multispectral bands of satellite images.

Operators

ESIP adds to gProcess a set of basic operators and algorithms for satellite image processing. These operators can be used to define vegetation indices, water indices, and other satellite image processing algorithms. ESIP currently supports TIFF images and operates on 512 x 512 tiles; larger images are split into tiles and processed tile by tile.
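
The tile-based processing described above can be illustrated with a minimal Java sketch. The class and method names (`TileGrid`, `split`) are hypothetical and are not part of ESIP; the sketch also assumes that tiles on the right and bottom edges are simply cropped, which the source does not specify.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: covers an image with tiles of at most 512 x 512. */
final class TileGrid {
    static final int TILE_SIZE = 512; // tile dimension stated in the text

    /** Position and size of one tile inside the full image. */
    static final class Tile {
        final int x, y, width, height;
        Tile(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Returns the tiles in row-major order; edge tiles are cropped (assumption). */
    static List<Tile> split(int imageWidth, int imageHeight) {
        List<Tile> tiles = new ArrayList<>();
        for (int y = 0; y < imageHeight; y += TILE_SIZE) {
            for (int x = 0; x < imageWidth; x += TILE_SIZE) {
                int w = Math.min(TILE_SIZE, imageWidth - x);
                int h = Math.min(TILE_SIZE, imageHeight - y);
                tiles.add(new Tile(x, y, w, h));
            }
        }
        return tiles;
    }
}
```

For example, `TileGrid.split(1024, 600)` yields four tiles: two full-height tiles in the first row and two tiles of height 88 in the second.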

  • Basic operators
    • Addition, Subtraction, Multiplication and Division: binary operators; the two inputs are TIFF images, and the output is also a TIFF image
    • AddConst, SubtractConst, MultiplyConst, DivideConst: binary operators; one input is a TIFF image and the other is a float value; the output is also a TIFF image
  • Logical operators
    • LogicAnd, LogicOr, LogicXor: binary operators that apply the bitwise logical operation to the two input TIFF images; the result is also a TIFF image
    • LogicNor: unary operator that applies the bitwise logical NOT operation to the input TIFF image
  • Spatial domain transformation operators
    • Blur, Sharpen, EdgeDetection, ThresholdFilter, MeanValue, HistogramEq, HistogramScale, Blend: binary or unary operators; both the inputs and the output are TIFF images
  • Interpolation operators
    • CoarseToFine Interpolation: interpolates from a coarse grid to a finer grid; the coarse grid input is a NetCDF file, the finer grid is an HDF file, and the output is a NetCDF file
    • FineToCoarse Interpolation: interpolates from a finer grid to a coarse grid; the coarse grid input is a NetCDF file, the finer grid is an HDF file, and the output is a NetCDF file
  • Model calibration
    • Calibration: performs the BiomeBGC model calibration
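
As an illustration of how the basic operators can be composed into a vegetation index, the following Java sketch computes NDVI = (NIR - RED) / (NIR + RED) pixel by pixel. It works on plain float arrays rather than TIFF images, the class and method names are hypothetical, and returning 0 where the denominator is 0 is an assumption, not documented ESIP behaviour.

```java
/** Hypothetical pixel-wise operators on two bands of equal length. */
final class IndexSketch {

    static float[] add(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    static float[] subtract(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] - b[i];
        }
        return out;
    }

    /** Division; yields 0 where the denominator is 0 (assumption). */
    static float[] divide(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (b[i] == 0f) ? 0f : a[i] / b[i];
        }
        return out;
    }

    /** NDVI composed from Subtraction, Addition and Division. */
    static float[] ndvi(float[] nir, float[] red) {
        return divide(subtract(nir, red), add(nir, red));
    }
}
```

The same composition pattern would apply to other indices that are defined as arithmetic combinations of bands.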


ESIP based Grid application development methodology

ESIP offers a Grid application development methodology based on six steps:

  • Algorithm identification and analysis – selects the algorithms according to criteria such as parallelization capability, performance, and resources;
  • Data model definition – identifies the data and defines how it is managed;
  • Atomic parts identification – the gridification phase uses atomic parts to define workflows of execution;
  • Algorithm implementation – the atomic parts are implemented in different programming languages;
  • ESIP based process description – the algorithm is described by a Process Description Graph (PDG). The PDG is then instantiated with concrete datasets (iPDG – instantiated PDG). The set of services exposed by the ESIP platform is used to execute and monitor the selected iPDG;
  • User interface development – the GUI and user interaction techniques are created.
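
The relationship between a PDG and an iPDG can be sketched in Java as follows. This is only an in-memory illustration under stated assumptions: the `Node` structure, the placeholder names `nir` and `red`, and the `instantiate` method are hypothetical and do not reflect the actual ESIP description format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of a PDG and its instantiation. */
final class PdgSketch {

    /** One operator node: output name, operator name, input names. */
    static final class Node {
        final String output;
        final String operator;
        final List<String> inputs;
        Node(String output, String operator, List<String> inputs) {
            this.output = output;
            this.operator = operator;
            this.inputs = inputs;
        }
    }

    /** Abstract PDG for NDVI; "nir" and "red" are dataset placeholders. */
    static List<Node> ndviPdg() {
        List<Node> pdg = new ArrayList<>();
        pdg.add(new Node("diff", "Subtraction", Arrays.asList("nir", "red")));
        pdg.add(new Node("sum", "Addition", Arrays.asList("nir", "red")));
        pdg.add(new Node("ndvi", "Division", Arrays.asList("diff", "sum")));
        return pdg;
    }

    /** Builds an iPDG by binding placeholders to concrete dataset names. */
    static List<Node> instantiate(List<Node> pdg, Map<String, String> datasets) {
        List<Node> ipdg = new ArrayList<>();
        for (Node n : pdg) {
            List<String> bound = new ArrayList<>();
            for (String in : n.inputs) {
                // Intermediate results (e.g. "diff") keep their names.
                bound.add(datasets.getOrDefault(in, in));
            }
            ipdg.add(new Node(n.output, n.operator, bound));
        }
        return ipdg;
    }
}
```

The same abstract graph can be instantiated several times with different dataset bindings, which is the point of separating the PDG from the iPDG.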


Monitoring execution

The monitoring solution is based on MonALISA (Monitoring Agents using a Large Integrated Services Architecture). It collects various types of parameters about computational nodes and networks, as well as custom application parameters. For visualization, MonALISA offers a Java application for real-time data analysis. The ApMon component sends the monitoring parameters to the MonALISA repository. ApMon is a Java library that collects information through a set of default sensors, such as:

  • run_time: elapsed time from the job start
  • cpu_time: processor time spent running the job
  • mem_usage: percentage of the memory occupied by the job
  • cpu_usr: percentage of the time spent by the CPU in user mode
  • load1: average system load over the last minute
  • mem_used: amount of currently used memory
  • net_in: incoming network transfer in kBps
  • hostname: the machine's hostname
  • cpu_MHz: CPU frequency
  • no_CPUs: number of CPUs
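
To give a feel for the kind of values such sensors report, the sketch below gathers a few comparable parameters using only the standard JDK. It does not use ApMon or MonALISA; the class name `SensorSketch` is hypothetical, and `mem_used` here covers only the JVM heap, which is an approximation rather than the ApMon definition.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical JDK-only collector of a few sensor-like parameters. */
final class SensorSketch {

    static Map<String, Object> collect(long jobStartMillis) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        Map<String, Object> params = new LinkedHashMap<>();
        // Elapsed seconds since the job started.
        params.put("run_time", (System.currentTimeMillis() - jobStartMillis) / 1000.0);
        // One-minute load average; negative when the platform cannot report it.
        params.put("load1", os.getSystemLoadAverage());
        params.put("no_CPUs", os.getAvailableProcessors());
        // JVM heap in use, in bytes (approximation, not system-wide memory).
        params.put("mem_used", rt.totalMemory() - rt.freeMemory());
        params.put("hostname", hostname());
        return params;
    }

    static String hostname() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }
}
```

In a real deployment these values would be handed to the monitoring library instead of being returned in a map.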

ESIP features

User management

A valid certificate is required to access and use the ESIP platform. It can either be stored in the ESIP database or retrieved from a MyProxy server.

Data management

ESIP uses the JDL InputSandbox to specify small input files. For larger files, ESIP uses a Storage Element to store and then retrieve the input and output data.

iPDG execution steps

  • Parse the XML structure and create an internal structure that consists of resources, services, subgraphs and operator objects
  • Expand the subgraphs and create a graph representation using nodes and link objects
  • Identify the inputs required by the processing nodes (i.e. operators and services)
  • If the required inputs are available, start a thread responsible for executing the node. The system creates a thread only for operators that already have their required inputs
  • Start the threads through a Java ExecutorService that uses a fixed thread pool
  • Execute the lifecycle of the operator
  • Consume the Web Services
  • After all the operator nodes have been submitted, the system periodically checks the execution status. Once a job output is stored on the server, the result can be accessed and visualized
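
The input-driven scheduling in the steps above can be sketched with a Java ExecutorService and a fixed thread pool. This is a simplified illustration, not the ESIP implementation: the `Node` class and `run` method are hypothetical, each node's work is replaced by a placeholder task that just returns its output name, and completion is detected with an ExecutorCompletionService instead of periodic status polling.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical scheduler: a node runs once all of its inputs exist. */
final class IpdgExecutorSketch {

    static final class Node {
        final String output;
        final List<String> inputs;
        Node(String output, List<String> inputs) {
            this.output = output;
            this.inputs = inputs;
        }
    }

    /** Executes all nodes and returns their outputs in completion order. */
    static List<String> run(List<Node> nodes, Set<String> initialInputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<String> finished = new ExecutorCompletionService<>(pool);
        Set<String> available = new HashSet<>(initialInputs);
        List<Node> pending = new ArrayList<>(nodes);
        List<String> order = new ArrayList<>();
        int running = 0;
        try {
            while (!pending.isEmpty() || running > 0) {
                // Submit every pending node whose inputs are all available.
                Iterator<Node> it = pending.iterator();
                while (it.hasNext()) {
                    Node n = it.next();
                    if (available.containsAll(n.inputs)) {
                        it.remove();
                        finished.submit(() -> n.output); // placeholder for the operator
                        running++;
                    }
                }
                if (running == 0) {
                    throw new IllegalStateException("Some nodes can never get their inputs");
                }
                // Wait for one node to finish; its output unlocks further nodes.
                String produced = finished.take().get();
                running--;
                available.add(produced);
                order.add(produced);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return order;
    }
}
```

With nodes `diff` and `sum` depending on the initial inputs and `ndvi` depending on both, `diff` and `sum` are submitted together and `ndvi` is submitted only after both have completed.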