Sandbox Layer
Overview
The Sandbox Layer is a central component of the IUDX-Novo platform, providing users with a controlled, scalable, and secure environment for data science and machine learning activities.
It serves as the primary interaction layer for analysts, data scientists, and researchers, enabling experimentation, model development, and analytical workflows through a notebook-driven interface.
Core Capabilities
The Sandbox Layer enables users to:
Spawn data analytics notebooks as the primary mechanism for interactive analysis
Load datasets catalogued in the Data Exchange using simple UI utilities within the notebook environment
Load models catalogued in the Data Exchange using similar discovery and access mechanisms
Interact with the Analytics Layer to perform data processing, experimentation, and model development
Create visualizations and dashboards that can be published or shared
Use MLOps utilities to:
Run training jobs, including long-running workloads
Host models for inference applications
Architectural Foundation

Figure 6: Internal Architecture of the MLOps Platform – Kubeflow ⚠️ Diagram to be updated / added for IUDX-Novo
The Sandbox Layer is built on the open-source MLOps framework Kubeflow, with additional platform-specific integrations.
Core Kubeflow Components Used
The following Kubeflow components are utilized directly within the Sandbox Layer:
Kubeflow Pipelines
Enables the definition of ETL pipelines using declarative specifications
Pipelines are compiled into Argo Workflows for execution
Katib
Provides built-in hyperparameter tuning capabilities
Supports automated experimentation and optimization workflows
Knative
Offers Kubernetes-based model hosting and serving
Supports scalable and production-ready inference endpoints
Notebooks
Provides Jupyter-based notebook environments
Enables interactive analysis and experimentation
Dex
Provides internal authentication and authorization for notebook spawning
Integrates with platform identity services
Istio
Acts as the service and network management layer
Supplies a service mesh to manage communication between notebooks and controller components
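
To make the Kubeflow Pipelines entry above more concrete, the following is a minimal Python sketch of how a declarative pipeline specification might be turned into an Argo Workflow manifest. It is illustrative only. It does not use the Kubeflow Pipelines SDK, whose compiler output is considerably richer. The pipeline name, step names, container images, and the helper to_argo_workflow are all hypothetical and do not come from IUDX-Novo.

```python
import json

# Hypothetical declarative pipeline: each step names a container image,
# a command, and the steps it must run after. All values are placeholders.
PIPELINE_SPEC = {
    "name": "example-etl",
    "steps": [
        {"name": "extract", "image": "example.org/etl:latest",
         "command": ["python", "extract.py"], "after": []},
        {"name": "transform", "image": "example.org/etl:latest",
         "command": ["python", "transform.py"], "after": ["extract"]},
        {"name": "load", "image": "example.org/etl:latest",
         "command": ["python", "load.py"], "after": ["transform"]},
    ],
}

ENTRYPOINT = "main"  # name of the DAG template that ties the steps together


def to_argo_workflow(spec):
    """Build an Argo Workflow manifest (as a dict) with one DAG task per step."""
    step_names = {step["name"] for step in spec["steps"]}
    if ENTRYPOINT in step_names:
        raise ValueError(f"step name {ENTRYPOINT!r} is reserved for the entrypoint")

    dag_tasks = []
    step_templates = []
    for step in spec["steps"]:
        unknown = [dep for dep in step["after"] if dep not in step_names]
        if unknown:
            raise ValueError(f"step {step['name']!r} depends on unknown steps {unknown}")

        # Each DAG task points at a container template of the same name.
        task = {"name": step["name"], "template": step["name"]}
        if step["after"]:
            task["dependencies"] = list(step["after"])
        dag_tasks.append(task)

        step_templates.append({
            "name": step["name"],
            "container": {"image": step["image"], "command": list(step["command"])},
        })

    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": spec["name"] + "-"},
        "spec": {
            "entrypoint": ENTRYPOINT,
            "templates": [{"name": ENTRYPOINT, "dag": {"tasks": dag_tasks}}]
            + step_templates,
        },
    }


if __name__ == "__main__":
    print(json.dumps(to_argo_workflow(PIPELINE_SPEC), indent=2))
```

Running the script prints the manifest as JSON. The dependencies field on each DAG task is what lets the workflow engine order the extract, transform, and load steps.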
Platform-Specific Integrations
In addition to native Kubeflow components, the Sandbox Layer includes several integrations to meet IUDX-Novo platform requirements:
Identity, authorization, and permissioning integrations between the Data Exchange and Dex
Model and dataset discovery and download integrations with the Data Exchange and MinIO
Downstream connectivity to data pipelines and analytical workflows
Job submission and management integrations with the Analytics Layer
Model hosting and inference integrations with the Analytics Layer
Compute credit management integrations with the Data Exchange
Scalability and Resource Management
The Sandbox Layer leverages Kubernetes to dynamically provision and manage compute resources, providing the following advantages:
On-demand spawning of notebook environments for users
Dynamic provisioning of executor clusters such as Ray or Dask
Enforcement of user-specific compute restrictions through Kubernetes namespaces
Isolation of resource usage and cost across users and workloads
Automatic scaling of compute resources based on workload demand and peak usage
High availability for all deployed system components
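
As an illustration of per-user compute restrictions enforced through namespaces, the sketch below builds a Kubernetes ResourceQuota manifest for a single user's namespace. ResourceQuota is a standard Kubernetes object, but the namespace name, the quota name, the limit values, and the helper namespace_quota are assumptions made for this example. The source does not state how IUDX-Novo actually configures quotas.

```python
import json


def namespace_quota(namespace, cpu, memory, gpus=0):
    """Return a ResourceQuota manifest (as a dict) capping a namespace's compute.

    cpu is a core count, memory is a Kubernetes quantity such as "16Gi",
    and gpus is an optional count of NVIDIA GPUs.
    """
    hard = {
        "requests.cpu": str(cpu),
        "requests.memory": memory,
        "limits.cpu": str(cpu),
        "limits.memory": memory,
    }
    if gpus:
        # Extended resources such as GPUs are quota'd via the requests. prefix.
        hard["requests.nvidia.com/gpu"] = str(gpus)

    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "compute-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


if __name__ == "__main__":
    # Hypothetical user namespace and limits, chosen only for illustration.
    print(json.dumps(namespace_quota("user-alice", cpu=4, memory="16Gi", gpus=1), indent=2))
```

Because each user's notebooks and executor clusters would run in that user's namespace, a quota of this kind bounds their combined resource usage without affecting other users.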
Role in the Platform
Within the platform, the Sandbox Layer:
Acts as the primary user-facing execution environment for analytics and ML
Bridges the Data Exchange and the Analytics Layer
Enables secure, scalable, and governed experimentation and model deployment