Minimal Tensor Serving Python Client

A lightweight Python client for communicating with TensorFlow Serving.

Description

Communicating with TensorFlow models via TensorFlow Serving requires gRPC and TensorFlow-specific protobufs. The tensorflow-serving-api package on PyPI provides these interfaces, but it requires tensorflow as a dependency. The TensorFlow Python package currently stands at about 700 MB, much of it libraries and executables for training, saving, and visualising models; none of these are needed at inference time when communicating with TensorFlow Serving.

This package exposes a minimal TensorFlow Serving client that does not include TensorFlow as a dependency, which reduces the overall package size to under 1 MB. This is particularly useful when deploying web services on AWS Lambda that need to communicate with TensorFlow Serving, as Lambda limits the size of deployment packages.
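
To check how much room a set of dependencies takes up before deploying, one option is to sum the file sizes under the directory you intend to package. The sketch below is illustrative only: directory_size_bytes and fits_within are hypothetical helpers, not part of min-tfs-client, and the limit is a parameter you supply from your platform's current documentation.

```python
import os


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_within(root, limit_bytes):
    """Return True if the directory's contents do not exceed limit_bytes."""
    return directory_size_bytes(root) <= limit_bytes
```

For example, after installing dependencies into a build directory with pip install --target, calling fits_within on that directory with your deployment limit gives a quick pass or fail before uploading.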

Install from PyPI

This is the quickest way to get started! Just run:

pip install min-tfs-client

Installing from source

git clone https://github.com/zendesk/min-tfs-client.git
cd min-tfs-client
pip install .

Development Installation

For a development installation, run pip install -e . (editable mode) instead of pip install . when installing from source. You will also need tensorflow-model-server and tensorflow installed to run and modify the integration tests. Specifically:

  1. tensorflow is required to run the model generation script (tests/integration/fixtures) that creates a test model for integration testing. It is not needed if you only run the tests.
  2. tensorflow-model-server is required to serve the model during the integration tests. The commands used to run these tests in Travis are contained in .travis.yml.

Usage

Basic Usage

import numpy as np

from min_tfs_client.requests import TensorServingClient
from min_tfs_client.tensors import tensor_proto_to_ndarray

client = TensorServingClient(host="127.0.0.1", port=4080, credentials=None)
response = client.predict_request(
    model_name="default",
    model_version=1,
    input_dict={
        # These input keys are model-specific
        "string_input": np.array(["hello world"]),
        "float_input": np.array([0.1], dtype=np.float32),
        "int_input": np.array([2], dtype=np.int64),
    },
)
float_output = tensor_proto_to_ndarray(
    # This output key is model-specific
    response.outputs["float_output"]
)
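
The explicit dtypes in the example above matter: NumPy turns Python floats into float64 and Python ints into a platform-dependent integer type, and a dtype that does not match the model's signature is typically rejected by the server. A minimal sketch of a helper that packs scalar values into batch-of-one arrays follows. build_inputs is a hypothetical name, and the input keys and dtypes are the same model-specific assumptions as in the example above.

```python
import numpy as np


def build_inputs(text, score, count):
    """Pack scalar Python values into batch-of-one arrays with explicit dtypes.

    The keys and dtypes here are assumptions that mirror the example model;
    replace them with the names and types from your own model's signature.
    """
    return {
        "string_input": np.array([text]),
        "float_input": np.array([score], dtype=np.float32),
        "int_input": np.array([count], dtype=np.int64),
    }


inputs = build_inputs("hello world", 0.1, 2)
```

The resulting dictionary can be passed as input_dict to client.predict_request in place of the inline literal.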

Running tests

Run all tests with

pytest -v tests/

Run a single test file with

pytest <path_to_test_file>

Run only the unit or only the integration tests with

pytest tests/<unit or integration>

Updating upstream changes

See this README for instructions on how to update the protobuf definitions from tensorflow/tensorflow and/or tensorflow/serving.

Contribution Guidelines

Improvements are always welcome. Please follow these steps to contribute:

  1. Submit a Pull Request with a detailed explanation of changes
  2. Receive approval from maintainers
  3. Maintainers will merge your changes

Licence Information

Use of this software is subject to important terms and conditions as set forth in the LICENSE file.

The code contained within protobuf_srcs/tensorflow is forked from Tensorflow, and the code contained within protobuf_srcs/tensorflow_serving is forked from Tensorflow Serving. Please refer to the individual source files within protobuf_srcs for individual file licence information.