Standalone Mode

The Wallaroo Engine can be used in a single-container standalone mode, without the full Wallaroo Kubernetes stack. This page covers the intended workflow:

Install Prerequisites

  • Docker CLI

  • Python 3, or Jupyter if you prefer to run this page as a notebook. Python 3.8.6 has been tested; other versions may work. Installing the .whl file will pull in transitive dependencies.

pip install standalone-engine/wallaroo-0.0.24-py3-none-any.whl

Install the Engine image

The installation directory contains:

  • engine-image.tgz - A compressed TAR file of the x86-64 Wallaroo standalone engine Docker image. The image is based on the busybox:glibc image, which is derived from Debian GLIBC.

  • wallaroo-*.whl - The Wallaroo Python SDK

A machine with a working Docker CLI is required. First load the image into the local system, then list images to confirm it was loaded:

$ docker load --input standalone-engine/engine-image.tgz

$ docker images | head -2
REPOSITORY                               TAG         IMAGE ID       CREATED         SIZE
ghcr.io/wallaroolabs/standalone-mini     latest      9f393d8dd074   18 hours ago    975MB

Use the SDK to generate models and configuration files

The Wallaroo standalone engine requires configuration and model files to be placed in the appropriate directories. The engine config must exist before startup; the others can be provided at any time before inference. The Wallaroo SDK supports this flow by generating the configuration files to place in this directory structure.

The engine configuration file location can be set with the ENGINE_CONFIG_FILE environment variable. It defaults to /engine/config.yaml.

The model and pipeline configuration directories can be set using the EngineConfig. They default to /modelconfig and /pipelineconfig.

The directory structure to which models are expected to adhere is:

/models/<model_class>/<model_name>/<file>

The top-level models directory can also be configured through the EngineConfig.

Configurations in each directory must be named uniquely.
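Putting the defaults together, the engine's view of the filesystem might look like the following sketch (illustrative; the model file name is an example):

```text
/engine/config.yaml                        # engine config (ENGINE_CONFIG_FILE)
/modelconfig/my_model_config.yaml          # model configs
/pipelineconfig/my_pipeline_config.yaml    # pipeline configs
/models/<model_class>/<model_name>/<file>  # model files
```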

An example follows below, where we will write the files to an “engine” directory:

from wallaroo.pipeline_config import *
from wallaroo.model import *
from wallaroo.model_config import *
from wallaroo.engine_config import *
from wallaroo.standalone_client import *
from os import makedirs

builder = PipelineConfigBuilder.as_standalone(
    pipeline_name="pipeline-1", variant_name="v1"
)

model = Model.as_standalone("model_name", "model_version", "hello.onnx")

model_config = ModelConfig.as_standalone(model=model, runtime="onnx")

engine_config = EngineConfig.as_standalone(
    cpus=1,
    model_directory="/engine/model",
    pipeline_config_directory="/engine/pipelineconfig",
    model_config_directory="/engine/modelconfig",
)

builder.add_model_step(model_config)

makedirs("engine", exist_ok=True)
makedirs("engine/modelconfig", exist_ok=True)
makedirs("engine/pipelineconfig", exist_ok=True)
makedirs("engine/model", exist_ok=True)

with open("engine/modelconfig/my_model_config.yaml", "w") as model_config_file:
    model_config_file.write(model_config.to_yaml())

with open("engine/pipelineconfig/my_pipeline_config.yaml", "w") as pipeline_config_file:
    pipeline_config_file.write(builder.config().to_yaml())

with open("engine/engine_config.yaml", "w") as engine_config_file:
    engine_config_file.write(engine_config.to_yaml())

Launch standalone engine

As described above, the container requires an environment variable pointing to the config file, a volume mount for the generated files, and a published local port for inference. This command runs the container in the background; Docker responds with the container ID.

$ docker run --detach --rm \
    --env ENGINE_CONFIG_FILE=/engine/config.yaml \
    --publish 29502:29502 \
    --volume `pwd`/engine:/engine \
    ghcr.io/wallaroolabs/standalone-mini:latest
585dc0b1f8e638241886b4a9f459b8f1bfb506029bbf0eb3940ca0bf69cd61c8

After the engine starts, it continually monitors the configured directories for models and config files. In the example above, the configs were all provided but not the model; providing the model causes the engine to begin listening for inference requests.

$ cp hello.onnx engine/model
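Because the engine picks up the model asynchronously, a script may want to wait for it before sending inference requests. Here is a minimal sketch that polls the engine's /models status endpoint (documented below); models_running and wait_for_engine are hypothetical helpers, not part of the SDK:

```python
import json
import time
import urllib.request


def models_running(status: dict) -> bool:
    """True when the engine reports at least one model and all are Running."""
    models = status.get("models", [])
    return bool(models) and all(m.get("status") == "Running" for m in models)


def wait_for_engine(host: str = "localhost", port: int = 29502, timeout: float = 60.0) -> dict:
    """Poll the /models status endpoint until every model reports Running."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"http://{host}:{port}/models") as resp:
                status = json.load(resp)
            if models_running(status):
                return status
        except OSError:
            pass  # engine not accepting connections yet
        time.sleep(1)
    raise TimeoutError("engine models never reached the Running state")
```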

Perform inference

The SDK supports running inference by passing a Python dict as the input tensor to a standalone client’s infer method. The examples below assume the container was launched locally as above and its port is available on localhost:29502.

To run an inference, make a client for your model or your pipeline as below.

# To run inference on a model
client = StandaloneClient("localhost", 29502, model=model)
client.infer({"tensor": [[1,2,3,4,5]]})

# To run inference on a pipeline
client = StandaloneClient("localhost", 29502, pipeline_config=builder.config())
client.infer({"tensor": [[1,2,3,4,5]]})

Standalone engine HTTP API

There is a simple HTTP REST API for status and inference. The examples again assume the engine is running on localhost:29502 as launched above.

Both status calls are GET methods. Anything other than a Running status indicates an error; examine the container logs for messages.

Model Status

$ curl -s localhost:29502/models | jq .
{
  "models": [
    {
      "class": "id2",
      "name": "ver2",
      "status": "Running"
    }
  ]
}

Pipeline Status

$ curl -s localhost:29502/pipelines | jq .
{
  "pipelines": [
    {
      "id": "pipeline-1",
      "status": "Running"
    }
  ]
}

Inference

The inference call is a POST to the endpoints listed above, passing a JSON tensor as input.

$ curl -XPOST localhost:29502/pipelines/pipeline-1 \
    --data '{"text_input":[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,16,32,23,29,32,30,19,26,17]]}'

[{"check_failures":[],"elapsed":12453700,"model_id":"id2","model_version":"version","original_data":{"text_input":[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,16,32,23,29,32,30,19,26,17]]},"outputs":[{"Float":{"data":[0.001519620418548584],"dim":[1,1],"v":1}},{"Float":{"data":[0.9829147458076477],"dim":[1,1],"v":1}},{"Float":{"data":[0.01209956407546997],"dim":[1,1],"v":1}},{"Float":{"data":[0.000047593468480044976],"dim":[1,1],"v":1}},{"Float":{"data":[0.000020289742678869516],"dim":[1,1],"v":1}},{"Float":{"data":[0.0003197789192199707],"dim":[1,1],"v":1}},{"Float":{"data":[0.011029303073883057],"dim":[1,1],"v":1}},{"Float":{"data":[0.9975639581680298],"dim":[1,1],"v":1}},{"Float":{"data":[0.010341644287109375],"dim":[1,1],"v":1}},{"Float":{"data":[0.008038878440856934],"dim":[1,1],"v":1}},{"Float":{"data":[0.016155093908309937],"dim":[1,1],"v":1}},{"Float":{"data":[0.006236225366592407],"dim":[1,1],"v":1}},{"Float":{"data":[0.0009985864162445068],"dim":[1,1],"v":1}},{"Float":{"data":[1.7933298217905702e-26],"dim":[1,1],"v":1}},{"Float":{"data":[1.388984431455466e-27],"dim":[1,1],"v":1}}],"pipeline_id":"pipeline-1","time":1639425783513}]

Run integration tests

A simple Python test fixture is provided which creates and destroys engine containers, feeds them models, and performs inference. It also serves as a source of examples and demonstration of SDK use.

  1. Install the SDK and the test resources.

$ cd standalone-engine
$ pip install wallaroo-0.0.24-py3-none-any.whl
$ pip install -r test_requirements.txt
  2. Run the tests


$ PERF_RESOURCES=./test_resources TEST_RESOURCES=./test_resources pytest
================================================================================= test session starts ==================================================================================
platform darwin -- Python 3.9.8, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/mnp/prj/platform/standalone_test/standalone-engine
plugins: benchmark-3.4.1, snapshot-0.6.1
collected 6 items

tests/test_standalone_engine.py ......                                                                                                                                           [100%]

================================================================================== 6 passed in 27.37s ==================================================================================