Analytics Layer

Overview

The Analytics Layer is a core component of the IUDX AI Sandbox and provides batch processing capabilities for large-scale data analysis and computation.

The sandbox offers a notebook-based interface in which users develop analytical code. This code runs on the Batch Processing Engine, which underpins the Analytics Layer.

The Analytics Layer enables users to process both structured and unstructured data in a scalable, controlled, and observable execution environment.

Batch Processing Capabilities

The batch processing framework supports end-to-end execution and management of analytical workloads and provides the following features:

Job Execution and Scheduling

  • Execution of batch jobs on the analytics compute infrastructure

  • Scheduling of jobs for deferred or periodic execution
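As a rough illustration of the difference between deferred and periodic execution, the sketch below computes the run times of a job that first starts at a given moment and then repeats at a fixed interval. The function name `next_runs` and its parameters are hypothetical and are not part of the sandbox API; a deferred one-off job is simply the case `count=1`.

```python
from datetime import datetime, timedelta


def next_runs(first_run: datetime, every: timedelta, count: int) -> list[datetime]:
    """Return `count` run times, starting at `first_run` and spaced `every` apart.

    Hypothetical helper for illustration only; the sandbox scheduler
    may represent schedules differently.
    """
    return [first_run + i * every for i in range(count)]


# A job deferred to 02:00 and then repeated every six hours.
runs = next_runs(datetime(2024, 1, 1, 2, 0), timedelta(hours=6), 3)
for run in runs:
    print(run.isoformat())
```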

Compute Resource Management

  • Submission of jobs across available compute resources

  • Allocation of resources with defined limits to ensure fair usage

  • Configuration of job-specific parameters
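One way to picture resource limits and job-specific parameters is a job description that is checked against fixed ceilings before it is accepted. The sketch below is illustrative only: `JobSpec`, `validate`, and the limit values are assumptions made for this example, not the sandbox's actual job format or quotas.

```python
from dataclasses import dataclass, field

# Hypothetical per-job ceilings; real limits are defined by the sandbox operators.
MAX_CPU_CORES = 8
MAX_MEMORY_GB = 32


@dataclass(frozen=True)
class JobSpec:
    """Illustrative job description: a name, resource requests, and parameters."""
    name: str
    cpu_cores: int
    memory_gb: int
    params: dict = field(default_factory=dict)  # job-specific parameters


def validate(spec: JobSpec) -> list[str]:
    """Return a list of limit violations; an empty list means the spec is acceptable."""
    problems = []
    if not 1 <= spec.cpu_cores <= MAX_CPU_CORES:
        problems.append(f"cpu_cores must be between 1 and {MAX_CPU_CORES}")
    if not 1 <= spec.memory_gb <= MAX_MEMORY_GB:
        problems.append(f"memory_gb must be between 1 and {MAX_MEMORY_GB}")
    return problems


ok_spec = JobSpec("daily-aggregate", cpu_cores=4, memory_gb=16, params={"date": "2024-01-01"})
big_spec = JobSpec("too-big", cpu_cores=64, memory_gb=16)
print(validate(ok_spec))   # no violations
print(validate(big_spec))  # one violation, for cpu_cores
```

Rejecting an oversized request at submission time, rather than at run time, is one simple way to keep usage fair across users.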

Monitoring and Observability

Users can:

  • Query the status of batch jobs

  • View execution progress and completion state

  • Access input and output logs generated during processing

  • Receive notifications related to job execution and failures

These capabilities enable transparent and debuggable batch workloads.
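Querying job status from a notebook typically amounts to polling until the job reaches a terminal state. The sketch below shows that pattern under stated assumptions: the state names in `JobState` and the `get_status` callable are hypothetical stand-ins for whatever status call the sandbox exposes.

```python
import time
from enum import Enum
from typing import Callable


class JobState(Enum):
    """Illustrative job states; the sandbox may use different names."""
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


TERMINAL_STATES = {JobState.SUCCEEDED, JobState.FAILED}


def wait_for_job(
    get_status: Callable[[], JobState],
    interval_s: float = 5.0,
    max_polls: int = 120,
) -> JobState:
    """Poll `get_status` until a terminal state is seen or `max_polls` calls are used.

    Returns the last observed state, which may be non-terminal if the
    polling budget ran out first.
    """
    state = get_status()
    for _ in range(max_polls - 1):
        if state in TERMINAL_STATES:
            break
        time.sleep(interval_s)
        state = get_status()
    return state


# Simulated status source standing in for a real status query.
simulated = iter([JobState.PENDING, JobState.RUNNING, JobState.SUCCEEDED])
final_state = wait_for_job(lambda: next(simulated), interval_s=0)
print(final_state.value)
```

Bounding the number of polls keeps a notebook cell from blocking indefinitely if a job stalls.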

Data Workflow Capabilities

The Analytics Layer also supports data workflow orchestration, enabling the construction of complex processing pipelines.

Key workflow features include:

  • Creation of ETL pipelines

  • Storage and management of intermediate artefacts

  • Process automation using:

    • Workflow triggers

    • Step-based execution

    • Pipeline chaining

    • Cron-based scheduling

    • Related orchestration mechanisms

These features enable reproducible, automated, and scalable data processing workflows.
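Step-based execution, pipeline chaining, and intermediate artefacts can be sketched as a list of named steps in which each step consumes the previous step's output and every output is retained. This is a minimal in-memory illustration, not the sandbox's orchestration API: `run_pipeline`, the step functions, and the sample data are all hypothetical.

```python
from typing import Any, Callable

# A step is a (name, function) pair; each function takes the previous output.
Step = tuple[str, Callable[[Any], Any]]


def run_pipeline(steps: list[Step], data: Any) -> tuple[Any, dict[str, Any]]:
    """Run steps in order and keep each step's output as an intermediate artefact."""
    artefacts: dict[str, Any] = {}
    for name, fn in steps:
        data = fn(data)
        artefacts[name] = data
    return data, artefacts


def extract(raw: str) -> list[dict]:
    """Parse 'sensor,value' lines into records."""
    rows = [line.split(",") for line in raw.strip().splitlines()]
    return [{"sensor": sensor, "value": float(value)} for sensor, value in rows]


def transform(rows: list[dict]) -> list[dict]:
    """Drop records with negative readings."""
    return [row for row in rows if row["value"] >= 0]


def load(rows: list[dict]) -> dict[str, float]:
    """Aggregate readings per sensor (stands in for writing to a store)."""
    totals: dict[str, float] = {}
    for row in rows:
        totals[row["sensor"]] = totals.get(row["sensor"], 0.0) + row["value"]
    return totals


raw_input = "a,1.5\nb,-2\na,2.5"
result, artefacts = run_pipeline(
    [("extract", extract), ("transform", transform), ("load", load)],
    raw_input,
)
print(result)
```

Keeping each step's output under its step name makes it possible to inspect or reuse an intermediate result without rerunning earlier steps.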

