Standalone Mode¶
The Wallaroo Engine can be used in a single-container standalone mode, without the full Wallaroo Kubernetes stack. This page will cover the intended workflow:
Install prerequisites for local use
Install the engine image for local use
Use the SDK to generate model, pipeline, and engine configuration files
Call the HTTP inference API directly
Install Prerequisites¶
Docker CLI
Python 3 or Jupyter. This notebook can also be used if Jupyter is preferred. Python 3.8.6 has been tested, but other versions may work. Installing the .whl file will bring in transitive dependencies.
pip install standalone-engine/wallaroo-0.0.24-py3-none-any.whl
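As a quick, optional sanity check that the SDK installed correctly, the modules used in the examples below should import cleanly (a minimal sketch):

# Optional sanity check: the SDK modules used later on this page should be importable.
from wallaroo.pipeline_config import PipelineConfigBuilder
from wallaroo.standalone_client import StandaloneClient

print("Wallaroo SDK import OK")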
Install the Engine image¶
The installation directory contains:
* engine-image.tgz - A compressed TAR file of the x86-64 Wallaroo standalone engine Docker image. The image is based off the busybox:glibc image, which is derived from Debian GLIBC.
* wallaroo-*.whl - The Wallaroo Python SDK
A desktop with a working Docker CLI is required. First load the image into the local system, then list images to confirm it was loaded:
$ docker load --input standalone-engine/engine-image.tgz
$ docker images | head -2
REPOSITORY TAG IMAGE ID CREATED SIZE
ghcr.io/wallaroolabs/standalone-mini latest 9f393d8dd074 18 hours ago 975MB
Use the SDK to generate configuration files¶
The Wallaroo standalone engine requires configuration and model files placed into the appropriate directory. The engine config is required before startup and the others can be provided any time before inference. The Wallaroo SDK enables this flow by generating the configuration files to place in this directory structure.
The engine configuration file location can be set using the ENGINE_CONFIG_FILE environment variable. It defaults to /engine/config.yaml.
The model and pipeline configuration directories can be set using the EngineConfig. They default to /modelconfig and /pipelineconfig.
The directory structure to which models are expected to adhere is /models/<model_class>/<model_name>/<file>. The top-level models directory can also be configured through the EngineConfig. Configurations in each directory must be named uniquely.
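Put together with the default locations, the engine expects a container filesystem along these lines (the concrete file names are only illustrative):

/engine/config.yaml                            engine configuration (ENGINE_CONFIG_FILE)
/modelconfig/<model_config>.yaml               model configuration files
/pipelineconfig/<pipeline_config>.yaml         pipeline configuration files
/models/<model_class>/<model_name>/<file>      model files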
The example below writes the files to a local “engine” directory:
[ ]:
from wallaroo.pipeline_config import *
from wallaroo.model import *
from wallaroo.model_config import *
from wallaroo.engine_config import *
from wallaroo.standalone_client import *
from os import makedirs
# Pipeline configuration: a single pipeline variant running in standalone mode
builder = PipelineConfigBuilder.as_standalone(
    pipeline_name="pipeline-1", variant_name="v1"
)

# Model and model configuration for the ONNX file that will be copied in later
model = Model.as_standalone("model_name", "model_version", "hello.onnx")
model_config = ModelConfig.as_standalone(model=model, runtime="onnx")

# Engine configuration: CPU count and the directories the engine will watch
engine_config = EngineConfig.as_standalone(
    cpus=1,
    model_directory="/engine/model",
    pipeline_config_directory="/engine/pipelineconfig",
    model_config_directory="/engine/modelconfig",
)

# Run the model as a step of the pipeline
builder.add_model_step(model_config)
makedirs("engine", exist_ok=True)
makedirs("engine/modelconfig", exist_ok=True)
makedirs("engine/pipelineconfig", exist_ok=True)
makedirs("engine/model", exist_ok=True)
with open("engine/modelconfig/my_model_config.yaml", "w") as model_config_file:
model_config_file.write(model_config.to_yaml())
with open("engine/pipelineconfig/my_pipeline_config.yaml", "w") as pipeline_config_file:
pipeline_config_file.write(builder.config().to_yaml())
with open("engine/engine_config.yaml", "w") as engine_config_file:
engine_config_file.write(engine_config.to_yaml())
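After running the cell, the local engine directory (mounted into the container as /engine below) should contain the following; the model file itself is copied in later:

engine/config.yaml
engine/model/                                  (empty until hello.onnx is copied in)
engine/modelconfig/my_model_config.yaml
engine/pipelineconfig/my_pipeline_config.yaml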
Launch standalone engine¶
As described above, the container requires an environment variable pointing to the config file, a volume mount for the generated files, and a published local port for inference. The following command runs the container in the background; Docker responds with the container ID.
$ docker run --detach --rm \
--env ENGINE_CONFIG_FILE=/engine/config.yaml \
--publish 29502:29502 \
--volume `pwd`/engine:/engine \
ghcr.io/wallaroolabs/standalone-mini:latest
585dc0b1f8e638241886b4a9f459b8f1bfb506029bbf0eb3940ca0bf69cd61c8
After the engine starts, it will continually monitor the configured directories for models and config files. In the above example, the configs were all provided but not the model. Providing the model will cause the engine to begin listening for inference.
$ cp hello.onnx engine/model
Perform inference¶
The SDK supports running inference by passing a Python dict as the input tensor to a standalone client’s infer method. The example assumes the container was launched locally as above and its port is available on localhost:29502.
To run an inference, make a client for your model or your pipeline as below.
[ ]:
# To run inference on a model
client = StandaloneClient("localhost", 29502, model=model)
client.infer({"tensor": [[1,2,3,4,5]]})

# To run inference on a pipeline
client = StandaloneClient("localhost", 29502, pipeline_config=builder.config())
client.infer({"tensor": [[1,2,3,4,5]]})
Standalone engine HTTP API¶
There is a simple HTTP REST API for status and inference. The examples again assume the engine is running on localhost:29502 as launched above.
The status calls are both GET methods: anything other than a Running status is an error, and the container logs should be examined for messages.
Model Status¶
$ curl -s localhost:29502/models | jq .
{
"models": [
{
"class": "id2",
"name": "ver2",
"status": "Running"
}
]
}
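The engine picks up models asynchronously after they are copied into the watched directory, so it can be convenient to poll this endpoint until everything reports Running. A minimal sketch in Python, assuming the requests library is installed and using the response shape shown above:

import time
import requests  # any HTTP client works; requests is used here for brevity

def wait_for_models(host="localhost", port=29502, timeout=60):
    """Poll /models until every loaded model reports a Running status."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        models = requests.get(f"http://{host}:{port}/models").json()["models"]
        if models and all(m["status"] == "Running" for m in models):
            return models
        time.sleep(1)
    raise TimeoutError("models did not reach Running status in time")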
Pipeline Status¶
$ curl -s localhost:29502/pipelines | jq .
{
"pipelines": [
{
"id": "pipeline-1",
"status": "Running"
}
]
}
Inference¶
The inference call is a POST to the endpoints listed above, passing a JSON tensor as input.
$ curl -XPOST localhost:29502/pipelines/pipeline-1 \
    --data '{"text_input":[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,16,32,23,29,32,30,19,26,17]]}'
[{"check_failures":[],"elapsed":12453700,"model_id":"id2","model_version":"version","original_data":{"text_input":[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,16,32,23,29,32,30,19,26,17]]},"outputs":[{"Float":{"data":[0.001519620418548584],"dim":[1,1],"v":1}},{"Float":{"data":[0.9829147458076477],"dim":[1,1],"v":1}},{"Float":{"data":[0.01209956407546997],"dim":[1,1],"v":1}},{"Float":{"data":[0.000047593468480044976],"dim":[1,1],"v":1}},{"Float":{"data":[0.000020289742678869516],"dim":[1,1],"v":1}},{"Float":{"data":[0.0003197789192199707],"dim":[1,1],"v":1}},{"Float":{"data":[0.011029303073883057],"dim":[1,1],"v":1}},{"Float":{"data":[0.9975639581680298],"dim":[1,1],"v":1}},{"Float":{"data":[0.010341644287109375],"dim":[1,1],"v":1}},{"Float":{"data":[0.008038878440856934],"dim":[1,1],"v":1}},{"Float":{"data":[0.016155093908309937],"dim":[1,1],"v":1}},{"Float":{"data":[0.006236225366592407],"dim":[1,1],"v":1}},{"Float":{"data":[0.0009985864162445068],"dim":[1,1],"v":1}},{"Float":{"data":[1.7933298217905702e-26],"dim":[1,1],"v":1}},{"Float":{"data":[1.388984431455466e-27],"dim":[1,1],"v":1}}],"pipeline_id":"pipeline-1","time":1639425783513}]
Run integration tests¶
A simple Python test fixture is provided that creates and destroys engine containers, feeds them models, and performs inference. It also serves as a source of examples and a demonstration of SDK use.
Install the SDK and the test resources:
$ cd standalone-engine
$ pip install wallaroo-0.0.24-py3-none-any.whl
$ pip install -r test_requirements.txt
Run the tests:
$ PERF_RESOURCES=./test_resources TEST_RESOURCES=./test_resources pytest
================================================================================= test session starts ==================================================================================
platform darwin -- Python 3.9.8, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/mnp/prj/platform/standalone_test/standalone-engine
plugins: benchmark-3.4.1, snapshot-0.6.1
collected 6 items
tests/test_standalone_engine.py ...... [100%]
================================================================================== 6 passed in 27.37s ==================================================================================