Python Serialization Cost Benchmark
Prerequisites
An authenticated K8S cluster with istio and Seldon Core installed
You can use the ansible seldon-core playbook at https://github.com/SeldonIO/ansible-k8s-collection
vegeta and ghz benchmarking tools
Port forward to istio
kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8003:8080
Tests
Large Batch Size
predict method with:
REST
ndarray
tensor
tftensor
gRPC
ndarray
tensor
tftensor
predict_raw method with:
REST
ndarray
tensor
tftensor
gRPC
ndarray
tensor
tftensor
Small Batch Size
predict method with:
REST
ndarray
tensor
tftensor
gRPC
ndarray
tensor
tftensor
TL;DR
gRPC is faster than REST
tftensor is best for large batch size
ndarray with gRPC is bad for large batch size
simpler tensor/ndarray is better for small batch size
Test with Predict method on Large Batch Size
The seldontest_predict model has a single predict method that runs a loop with a configurable number of iterations (default 1) to simulate work. The iteration count can be set as a Seldon parameter, but since we want to benchmark serialization/deserialization cost here, we keep the work minimal.
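The model described above can be sketched as follows. This is illustrative, not the actual seldontest_predict source: the class name, the `iterations` parameter handling, and the echo-style return value are assumptions.

```python
class SeldonTestPredict:
    """Sketch of a minimal Seldon model whose predict does configurable busy-work."""

    def __init__(self, iterations: int = 1):
        # "iterations" would be supplied as a Seldon parameter; the default of 1
        # keeps compute negligible so serialization cost dominates the timings.
        self.iterations = int(iterations)

    def predict(self, X, feature_names=None):
        # Simulated work: an empty loop with the configured iteration count.
        for _ in range(self.iterations):
            pass
        # Echo the input back so the response payload matches the request size.
        return X
```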
Create payloads and associated vegeta configurations for
ndarray
tensor
tftensor
We will create an array of 100,000 consecutive integers.
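Building the three payload variants for that array might look like the following sketch, following the Seldon prediction protocol's `ndarray` and `tensor` data fields. The output file names are illustrative, and the tftensor variant (which requires the tensorflow package to build a TensorProto) is only outlined in comments.

```python
import json

N = 100_000
data = list(range(N))  # 100,000 consecutive integers

# ndarray: nested lists, here a single row of N values.
ndarray_payload = {"data": {"ndarray": [data]}}

# tensor: flat values plus an explicit shape.
tensor_payload = {"data": {"tensor": {"shape": [1, N], "values": data}}}

# tftensor: a serialized TensorFlow TensorProto; needs tensorflow, e.g.
#   import tensorflow as tf
#   from google.protobuf import json_format
#   proto = tf.make_tensor_proto(data)
#   tftensor_payload = {"data": {"tftensor": json_format.MessageToDict(proto)}}

# Write the REST bodies that the vegeta targets would reference
# (file names are hypothetical).
with open("payload_ndarray.json", "w") as f:
    json.dump(ndarray_payload, f)
with open("payload_tensor.json", "w") as f:
    json.dump(tensor_payload, f)
```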
Smoke test port-forward to check everything is working
Test REST
ndarray
tensor
tftensor
This can be run locally, as the results are intended to show relative differences rather than precise timings.
Example results
19.8ms
19.7ms
16.2ms
Test gRPC
ndarray
tensor
tftensor
Example results
253ms
8.4ms
5.5ms
Conclusions
gRPC is generally faster than REST, except for ndarray, which is much slower and should not be used with gRPC
tftensor is fastest
Test Predict Raw
Smoke test port-forward to check everything is working
Test REST
ndarray
tensor
tftensor
This can be run locally, as the results are intended to show relative differences rather than precise timings.
Example results
13.3ms
13.3ms
11.1ms
Test gRPC
ndarray
tensor
tftensor
Example results
46ms
7.9ms
5.0ms
Conclusions
predict_raw is faster than predict, but you will need to handle the serialization/deserialization yourself, which may make the two equivalent unless specific techniques can be applied for your use case.
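A minimal sketch of what a predict_raw implementation looks like: the Seldon Python wrapper passes predict_raw the raw request (a dict for REST), bypassing the automatic payload handling, so parsing the input and building the response dict is up to you. The class name and the echo behaviour here are illustrative assumptions.

```python
class SeldonTestPredictRaw:
    """Sketch of a Seldon model using predict_raw instead of predict."""

    def predict_raw(self, request):
        # "request" is the raw REST payload as a dict; no automatic
        # ndarray/tensor decoding has been applied by the wrapper.
        data = request.get("data", {})
        # Build the response dict ourselves, echoing the data field back.
        return {"data": data}
```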
Test with Predict method on Small Batch Size
The seldontest_predict model has a single predict method that runs a loop with a configurable number of iterations (default 1) to simulate work. The iteration count can be set as a Seldon parameter, but since we want to benchmark serialization/deserialization cost here, we keep the work minimal.
Create payloads and associated vegeta configurations for
ndarray
tensor
tftensor
We will create an array of 100,000 consecutive integers.
Smoke test port-forward to check everything is working
Test REST
ndarray
tensor
tftensor
This can be run locally, as the results are intended to show relative differences rather than precise timings.
Example results
1.8ms
1.8ms
2.1ms
Test gRPC
ndarray
tensor
tftensor
Example results
1.46ms
1.49ms
1.57ms
Conclusions
gRPC is generally faster than REST
There is very little difference between payload types, with the simpler tensor/ndarray formats probably being slightly faster