The PySOA API
=============

Services in PySOA are built around a client-server model with a simple service API. The core abstractions of this API are Jobs and Actions. An Action is a unit of work performed by a Server, and a Job is a container of one or more Actions. Each PySOA request and response encompasses exactly one whole Job.

Note that, in PySOA, all "strings" are Unicode strings (``str`` in Python 3 and ``unicode`` in Python 2) and all "bytes" are byte strings (``bytes`` in Python 3 and ``str`` in Python 2). It is an error to interchange these; doing so will result in Job or Action errors, or possibly even client-side exceptions.

.. contents:: Contents
   :local:
   :depth: 3
   :backlinks: none


Basic request flow
++++++++++++++++++

A Client may make a request to a Server by sending it a ``JobRequest`` (in ``dict`` form) containing one or more ``ActionRequest`` objects (also in ``dict`` form), along with some header information. The Server processes the Job by executing each Action in the Job, in order, using the appropriate Action handler method for each Action. The results returned or exceptions raised by each Action handler are collected into a ``JobResponse``. When the Job execution terminates, either because all of the Actions have been processed or due to an exception, the Server returns the ``JobResponse`` (in ``dict`` form) to the Client.


Message data structures
+++++++++++++++++++++++

PySOA has several key data structures that you should understand before using it.

``JobRequest``
**************

The ``JobRequest`` object contains a list ``actions`` of ``ActionRequest`` objects along with two headers: ``control`` and ``context``. Both of these headers are provided to the action handler when processing each ``ActionRequest``. In general, the ``control`` header is used for flags that affect the execution of the Job itself, such as how to handle Action errors. The ``context`` header is for everything else, including switches (see `Versioning using switches`_), locale, and metadata. All fields in the ``JobRequest`` are required.

- ``control`` stores values that affect the execution of the Job itself. It's free-form, so technically any key is permitted (your middleware could take advantage of additional keys), but these are the keys currently supported natively in PySOA:

  + ``continue_on_error``: Tells the Server to continue processing the Job if any Action results in an error (``bool``)

- ``context`` stores other information about the request useful to the Job or Action code or middleware. Like ``control``, it is free-form, but these are the keys currently supported natively in PySOA:

  + ``switches``: A list of switch values (see `Versioning using switches`_)
  + ``correlation_id``: A unique ID, generated by the Client, that follows the Job and must be passed to any other service calls made while processing the Job and its Actions; it is used to facilitate logging, metrics, and more

- ``actions`` is a list of ``ActionRequest`` objects.

``ActionRequest``
*****************

- ``action`` is the required name of the Action to invoke and must be a string.
- ``body`` is a dict containing arguments for the Action and must match the request schema (if any) the Action has defined. It is required if and only if the Action code has a required request schema. It must be ``None`` or an empty ``dict`` if and only if the Action code has an empty request schema.

``JobResponse``
***************

- ``actions`` is a required list of ``ActionResponse`` objects.
- ``errors`` is a list of ``Error`` objects, which will be an empty list (not ``None``) if no Job errors occurred (even if some Action errors occurred).

``ActionResponse``
******************

- ``action`` is the required name of the action that was invoked and will be a string.
- ``body`` is a dict containing the response from the Action. It will always be present, though it may be merely an empty ``dict`` if the Action has no response.
- ``errors`` is a list of ``Error`` objects, which will be an empty list if no errors occurred for this ``ActionRequest``.

``Error``
*********

- ``code`` is a required, machine-readable string error code.
- ``message`` is a required, human-readable string error message, which your services may (optionally) localize using a locale field included in the ``JobRequest``'s ``context`` header.
- ``field`` is an optional identifier of the form ``'field[.subfield[.sub_subfield[....]]]'`` providing the full path of the field in the ``ActionRequest`` that caused the error, if applicable.
- ``traceback`` is an optional string containing the formatted exception stacktrace, if any, that applies to the error.
- ``variables`` is an optional ``dict`` of variable names and their values, if any, that apply to the error.
- ``denied_permissions`` is an optional ``list`` of unicode string permission codes, names, or other symbols. If a lack of permissions is the proximal cause of the error, you might find it useful to return with the error a list of the missing permissions so that the client can adjust, if possible, or give the user more useful information. This is fully optional and organization-specific. PySOA itself has no permissions features.
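For illustration, here is a sketch of a complete ``JobRequest`` and a matching ``JobResponse`` in ``dict`` form. Only the top-level keys come from the structures described above; the action name, body fields, and correlation ID are hypothetical.

.. code-block:: python

    job_request = {
        "control": {"continue_on_error": False},
        "context": {
            "switches": [],
            "correlation_id": "b1946ac9-2ea5-4e66-a1a1-6c0b6d3f9f10",  # hypothetical
        },
        "actions": [
            {"action": "get_user", "body": {"user_id": 42}},  # hypothetical action
        ],
    }

    job_response = {
        "errors": [],  # no Job-level errors
        "actions": [
            {
                "action": "get_user",
                "errors": [],  # no Action-level errors
                "body": {"user": {"id": 42, "name": "Example User"}},
            },
        ],
    }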
Servers
+++++++

The ``pysoa.server`` module contains everything necessary to write a PySOA service. The ``Action`` class provides the parent class for your service's Actions, which are the main units of business logic, while the ``Server`` class provides the framework for running a request-response loop.

The ``Server`` is the heart of any PySOA service. It provides a standard Job processing workflow and an interface for subclasses to implement Action handlers. It also provides a simple command line interface. In general, ``Server`` subclasses will need to define two things: the service name and a mapping of Action names to Action handlers. Subclasses may also perform additional setup by overriding the ``setup`` method of the base class. Other override possibilities are ``perform_pre_request_actions``, ``perform_post_request_actions``, and ``perform_idle_actions``, though, in most cases, you should simply use middleware for such special needs. Subclasses should not need to override any other methods on the base class.

The ``Action`` class provides an interface allowing subclasses to easily validate input, execute business logic, and validate output. Validation is performed by `Conformity `_ schemas, allowing for simple, declarative input and output checking. It automatically handles validating the ``dict`` returned by the ``run`` method and placing it into an ``ActionResponse`` object.

``Server``
**********

All services provide a class that extends ``Server``. For full documentation of all of its properties and methods, see the `Server reference documentation `_.

Class Attributes

- ``service_name``: The string name that the service will use to identify itself, and that Clients will use to call it
- ``action_class_map``: A mapping of Action names to handlers, which are ``Action`` subclasses (in most cases it will be a ``dict``, but it may be any object that implements ``__getitem__`` and ``__contains__``)
- ``use_django``: If this is ``True``, ``Server.main`` will import settings from Django. If it is ``False`` (the default), it will not import or use Django in any way
- ``settings_class``: In many cases, you can simply rely on the default settings class (``ServerSettings``), but you may provide some other class that extends ``ServerSettings`` if you want to use the settings framework to bootstrap special settings for your service instead of using some other settings framework (such as Django)

Methods

- ``setup``: Performs service-specific setup and takes no arguments
- ``main``: Class method that allows the ``Server`` to be run from the command line

``Action``
**********

Your Actions do not have to extend ``Action``. An Action may be any callable object that, when invoked with a single argument (the Server settings), returns a new callable object that accepts a single ``ActionRequest`` argument and returns an ``ActionResponse`` object. In practical terms, ``Action`` takes care of much of this heavy lifting for you, so it is advisable that your Actions extend ``Action``. For full documentation of all of its properties and methods, see the `Action reference documentation `_.

Class Attributes

- ``request_schema``: A Conformity schema defining the structure of the request body.
- ``response_schema``: A Conformity schema defining the structure of the response body.

Instance Attributes

- ``self.settings``: The Server's full settings object (which can be accessed like a ``dict``)

Methods

- ``validate``: An optionally-overridden method that performs custom validation, takes an ``ActionRequest`` object as input, and raises an ``ActionError`` to signal validation failure (in which case ``run`` will not be invoked)
- ``run``: The main business logic method that must be implemented; it takes an ``ActionRequest`` as input and returns a ``dict`` matching the schema defined in ``response_schema`` or raises an ``ActionError``, and it will only be invoked if ``validate`` is not overridden or completes without raising any exceptions
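The following is a minimal sketch of an ``Action`` subclass and the ``Server`` that exposes it, assuming a hypothetical ``user`` service. The import paths follow PySOA's module layout, but the service name, action name, schemas, and business logic are illustrative only.

.. code-block:: python

    from conformity import fields

    from pysoa.server.action import Action
    from pysoa.server.server import Server


    class GetUserAction(Action):
        request_schema = fields.Dictionary({
            "user_id": fields.Integer(description="The ID of the user to fetch"),
        })

        response_schema = fields.Dictionary({
            "user": fields.Dictionary({}, allow_extra_keys=True),
        })

        def run(self, request):
            # request.body has already been validated against request_schema
            user_id = request.body["user_id"]
            return {"user": {"id": user_id, "username": "example"}}  # hypothetical lookup


    class ExampleServer(Server):
        service_name = "user"
        action_class_map = {
            "get_user": GetUserAction,
        }


    if __name__ == "__main__":
        ExampleServer.main()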
Server configuration
********************

The ``Server`` base class takes configuration in the form of a dict with the following basic structure, plus any additional settings that you may have defined if you overrode the ``settings_class`` class attribute:

.. code-block:: python

    {
        "transport": <transport settings dict>,
        "middleware": [<middleware settings dict>, ...],
        "client_routing": <client settings dict>,
        "logging": <logging settings dict>,
        "harakiri": {
            "timeout": <int>,
            "shutdown_grace": <int>,
        },
    }

Key

- ``transport``: See `Transport configuration`_ for more details; the base ``Server`` defaults to using the `Redis Gateway Transport`_
- ``middleware``: See `Middleware configuration`_ for more details
- ``client_routing``: Configuration for a ``Client`` that can be used to make further service calls during Action processing. See `Client configuration`_.
- ``logging``: A dictionary that will be used to configure the Python ``logging`` module at Server startup (`logging config schema `_).
- ``harakiri.timeout``: The server will shut down if it is inactive for this many seconds, which may happen because the transport receive malfunctioned or because a Job or Action is taking too long to process
- ``harakiri.shutdown_grace``: When shutting down after ``timeout`` is exceeded, the server will wait this many seconds for any existing Job to finish before aborting the Job and forcing shutdown

For full details, view the sections linked above and the `ServerSettings reference documentation `_.

Django integration
******************

The ``Server`` class is able to get configuration from Django settings automatically. If the ``use_django`` property on the ``Server`` subclass is ``True``, the ``main`` method will automatically import the Django settings module and look for configuration under the name ``SOA_SERVER_SETTINGS``.

The ``Server`` will also perform strategic resource cleanup before and after each request when Django integration is enabled, mimicking the behavior of the following Django features:

* The database query log will be reset before each received request is handled.
* Old database connections will be closed after each response is sent and also when the server has been idle for some time without handling any requests.
* The ``close`` method will be called on all configured Django cache engines.

To make your caches work ideally in a PySOA server environment, we recommend you use one or more of the following cache engines in your services, or similarly override the ``close`` method in your own cache engines:

* ``pysoa.server.django.cache.PySOARequestScopedMemoryCache`` - The ``close`` method clears the cache completely, so that the cache gets cleared at the end of every job request.
* ``pysoa.server.django.cache.PySOAProcessScopedMemoryCache`` - The ``close`` method does nothing, so that the cache lasts for the entire server process.
* ``pysoa.server.django.cache.PySOAMemcachedCache`` - The ``close`` method closes connections to Memcached only when the server is shutting down (not at the end of every request).
* ``pysoa.server.django.cache.PySOAPyLibMCCache`` - The ``close`` method closes connections to Memcached only when the server is shutting down (not at the end of every request), and only on Django versions older than 1.11.0. (`As of Django 1.11.0 `_, the ``PyLibMCCache`` implementation does not close connections. Instead, it lets the library connection management take care of this.)

Settings without Django
***********************

If ``use_django`` is ``False`` (the default), the ``main`` method will require a command line ``-s`` or ``--settings`` argument. This must be the absolute name of a module, which PySOA will import. PySOA will then look for an attribute on that module named ``SOA_SERVER_SETTINGS`` or ``settings``, in that order of preference.

.. _api-versioning-using-switches:

Versioning using switches
*************************

Switches are like a special argument that every Action in a Job gets. In terms of code, switches are simply integers passed by the Client in the ``context`` header of every ``JobRequest``, and then by the Server into every Action in that Job.

To provide more flexibility for your switch definitions, switch objects and constants used in PySOA can be *any* object that provides the method ``__int__``, or *any* object that provides the attribute ``value`` whose value provides the method ``__int__``. (Switches must, however, be sent over the wire as simple integers within the PySOA protocol.)
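As a sketch of that rule, any of the following would be acceptable switch constants (the names and values here are hypothetical):

.. code-block:: python

    from enum import IntEnum


    USER_VERSION_2_ENABLED = 5  # a plain integer works


    class UserSwitches(IntEnum):
        # IntEnum members provide __int__, so they can be used directly
        USER_VERSION_3_ENABLED = 6


    class LegacySwitch(object):
        """Any object with a ``value`` attribute whose value is int-like also works."""

        def __init__(self, value):
            self.value = value


    USER_VERSION_4_ENABLED = LegacySwitch(7)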
Switches are a type of feature flag and came from a need to version individual service Actions, rather than versioning the whole service at once. There are several ways to use switches. Here are just two examples:

.. code-block:: python

    ...

    from pysoa.server.action.switched import SwitchedAction

    from my_user_service.switches import USER_VERSION_2_ENABLED

    ...


    class UserActionV1(Action):
        ...  # version 1 schema and business logic


    class UserActionV2(Action):
        ...  # version 2 schema and business logic


    class UserTransitionAction(SwitchedAction):
        switch_to_action_map = (
            (USER_VERSION_2_ENABLED, UserActionV2),
            (SwitchedAction.DEFAULT_ACTION, UserActionV1),
        )

.. code-block:: python

    ...

    from my_user_service.constants import USER_VERSION_2_ENABLED

    ...


    class UserAction(Action):
        ...  # schema that applies to all versions

        def run(self, request):
            if request.switches.is_active(USER_VERSION_2_ENABLED):
                ...  # version 2 business logic
            else:
                ...  # version 1 business logic

The first example uses the helpful ``SwitchedAction``. This is a specialized wrapper Action that defers to other Actions based on enabled switches (or to the last or default Action if no matches are found). Using this technique, you can have different request and/or response schemas depending on a switch, effectively applying transitional versioning to the entire service Action. In your ``Server`` class, you just need to map a single action name to your ``UserTransitionAction`` (instead of mapping anything directly to ``UserActionV1`` or ``UserActionV2``), and the code in ``SwitchedAction`` takes care of the rest. For more detailed information about this approach, see the `SwitchedAction reference documentation `_.

In the second, simpler example, you only have one Action class (so your request schema and response schema remain unchanged regardless of the switch supplied), but you can still alter the behavior (perhaps with different permissions, validation rules, or storage logic, etc.) by checking whether a switch is active directly within your Action's ``run`` code.

Clients
+++++++

Code that needs to call one or more services will do so using a ``Client``. A single ``Client`` instance can be configured to call one or more services—you do not need to create a different client for each service. The ``client`` submodule provides the ``Client`` class as well as base classes for settings and middleware. Unlike the ``Server``, ``Client`` will generally not be subclassed unless there is a need to add nonstandard behavior on top of the base ``Client``.

The ``Client`` provides both blocking and non-blocking methods, and you should exercise caution when using them together. If you call the non-blocking method to send a request, followed by a blocking method to send-and-receive, you could encounter errors. Be sure you have completed all non-blocking operations before switching to blocking operations.
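The sketch below illustrates that caution using the methods described in the next section: all outstanding non-blocking requests are drained with ``get_all_responses`` before any blocking call is made. It assumes a configured ``client`` object, a hypothetical ``user`` service, and that ``get_all_responses`` yields ``(request ID, JobResponse)`` pairs.

.. code-block:: python

    # Send several requests without blocking
    request_ids = [
        client.send_request("user", [{"action": "get_user", "body": {"user_id": i}}])
        for i in (1, 2, 3)
    ]

    # Drain all outstanding responses before making any blocking calls
    responses = dict(client.get_all_responses("user"))
    for request_id in request_ids:
        job_response = responses[request_id]
        ...  # process each JobResponse

    # Only now is it safe to switch to the blocking methods
    action_response = client.call_action("user", "get_user", body={"user_id": 4})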
``Client``
**********

For full documentation of all of these methods, see the `Client reference documentation `_.

Methods

- ``send_request``: Build and send a Job request and return an integer request ID, which you can then use later to retrieve the request response (this method does not block waiting on a response)
- ``get_all_responses``: Return a generator with all outstanding ``JobResponse`` objects for the given service (this method will block or timeout until all requests sent to this service with ``send_request`` have received responses)
- ``call_action``: Build and send a Job request with a single Action and return an ``ActionResponse``, blocking until the response is received
- ``call_actions``: Build and send a Job request with one or more Actions and return a ``JobResponse``, blocking until the response is received
- ``call_actions_parallel``: Build and send multiple Job requests (to a single service), each with exactly one Action, to be handled in any order by multiple service processes, and return the corresponding ``ActionResponse`` objects in the same order the Action requests were submitted, blocking until all responses are received
- ``call_jobs_parallel``: Build and send multiple Job requests (to one or more services), each with one or more Actions, to be handled in any order by multiple service processes, and return the corresponding ``JobResponse`` objects in the same order the Job requests were submitted, blocking until all responses are received
- ``call_action_future``, ``call_actions_future``, ``call_actions_parallel_future``, ``call_jobs_parallel_future``: Variants of the above methods that return a ``Client.FutureResponse`` object instead of a completed response or responses, allowing you to send requests asynchronously, perform other work, and then use the future object to retrieve the expected responses.

+------------------------------------------------------------------+
| Warning: Chunking and parallel action calls                      |
+==================================================================+
| If you know that the server response is potentially going to be  |
| chunked, do not use ``call_actions_parallel`` or                 |
| ``call_jobs_parallel`` (or any ``future`` variant), as there is  |
| a bug that will make the request fail due to a race condition.   |
+------------------------------------------------------------------+

Client configuration
********************

The ``Client`` class takes configuration in the form of a dict with the following format:

.. code-block:: python

    {
        <service name>: {
            "transport": <transport settings dict>,
            "middleware": [<middleware settings dict>, ...],
        },
        ...
    }

Key

- ``<service name>``: The ``Client`` needs settings for each service that it will call, keyed by service name
- ``transport``: See `Transport configuration`_ for more details; the base ``Client`` defaults to using the `Redis Gateway Transport`_.
- ``transport_cache_time_in_seconds``: How long transport objects should be cached, in seconds; defaults to 0 (no cache, slightly lower performance, but required to be 0 in a multi-threaded application)
- ``middleware``: See `Middleware configuration`_ for more details

For full details, view the sections linked above and the `ClientSettings reference documentation `_.
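Putting that format into practice, the following sketch constructs a ``Client`` that can call a hypothetical ``user`` service and makes a single blocking call. The transport ``path`` follows the Redis Gateway Transport's module layout, but treat the exact path and kwargs as assumptions to verify against your installed version.

.. code-block:: python

    from pysoa.client.client import Client

    client = Client(
        {
            "user": {
                "transport": {
                    "path": "pysoa.common.transport.redis_gateway.client:RedisClientTransport",
                    "kwargs": {
                        "backend_type": "redis.standard",
                        "backend_layer_kwargs": {"hosts": [("localhost", 6379)]},
                    },
                },
            },
        },
    )

    response = client.call_action("user", "get_user", body={"user_id": 42})
    print(response.body)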
Expansions
**********

Expansions allow the ``Client.call_***`` methods to automatically "expand" fields in a service response by making further service calls and adding those responses to the original response. (Note: ``send_request`` and ``get_all_responses`` do not perform any expansions; only the blocking methods perform expansions.) Expansions are based on a type system, which is optional and requires extra effort on the part of services.

To support expansions, services must include a ``_type`` field in each object in each Action response body. The indicated type must map to an expansion type in the ``Client`` expansion configuration in order to be considered for expansions.

The ``Client.call_***`` methods take a keyword argument ``expansions``, which is a dictionary mapping types to expansions. For each ``type: expansion list`` pair in the argument, the ``Client`` will automatically perform each expansion in the expansion ``list`` for each object of that type (a string) in the response. Expanded objects can, themselves, be further expanded recursively with the correct arguments, though you should always consider the performance implications of this behavior before using it.

Configuring expansions
----------------------

Expansions are configured on the ``Client`` instance by using the ``expansions`` argument on initialization. This argument accepts a dict with the following format:

.. code-block:: python

    {
        "type_routes": {
            <route name>: {
                "service": <service name>,
                "action": <action name>,
                "request_field": <request field name>,
                "response_field": <response field name>,
            },
            ...
        },
        "type_expansions": {
            <type>: {
                <expansion name>: {
                    "type": <expansion type>,
                    "route": <expansion route name>,
                    "source_field": <source field name>,
                    "destination_field": <destination field name>,
                    "raise_action_errors": <bool>,
                },
                ...
            },
            ...
        },
    }

Key

- ``type_routes``

  + ``<route name>``: The name/key for the expansion route
  + ``service``: The name of the service to call in order to expand using this route
  + ``action``: The name of the action to call in order to expand using this route
  + ``request_field``: The name of the field to place in the Action request body when making an expansion request through this route (the field value will be a bulk list of expansion identifiers extracted from the objects being expanded)
  + ``response_field``: The name of the field returned in the Action response body that contains the expansion objects (the field value should be a dictionary whose keys are the identifiers from the request field and whose values are the individual objects corresponding to those identifiers)

- ``type_expansions``

  + ``<type>``: A type for which you are defining expansions
  + ``type``: The expected type of the objects returned by this expansion, which is used to look up the type in this same ``type_expansions`` dictionary to support nested/recursive expansions
  + ``route``: The route to the expansion, which must match a key found in ``type_routes``
  + ``source_field``: The name of the field on an object of type ``<type>`` that contains the identifier that should be passed to the expansion route to perform the expansion
  + ``destination_field``: The name of the field (which should not yet exist) on an object of type ``<type>`` that will be filled with the expanded value retrieved from the expansion route

To satisfy an expansion, the expansion processing code needs to know which service action to call and how to call it. Type routes solve this problem by giving the expansion processing code all the information it needs to properly call a service action to satisfy an expansion.

Type expansions detail the expansions that are supported for each type. If a ``Client`` needs to support expansions for a type, that type must have a corresponding entry in the ``type_expansions`` dictionary, and that expansion's route must have a corresponding entry in the ``type_routes`` dictionary.

For full details, view the `ExpansionSettings reference documentation `_.

Expansions example
------------------

Consider a ``Client`` with the following expansions config:
.. code-block:: python

    {
        "type_routes": {
            "bar_route": {
                "service": "bar_example",
                "action": "get_bars",
                "request_field": "ids",
                "response_field": "bars",
            },
        },
        "type_expansions": {
            "foo": {
                "bar": {
                    "type": "bar",
                    "route": "bar_route",
                    "source_field": "bar_id",
                    "destination_field": "bar",
                },
            },
        },
    }

You could then make a call to the ``foo_example`` service using the ``expansions`` argument:

.. code-block:: python

    result = client.call_actions(
        service_name="foo_example",
        actions=[
            {"action": "get_foos"},
        ],
        expansions={"foo": ["bar"]},
    )

The argument ``expansions={"foo": ["bar"]}`` tells the ``Client``: "for each object of type ``foo`` in the response, perform an expansion of type ``bar``."

The ``foo_example`` service returns the following response to our ``get_foos`` request:

.. code-block:: python

    {
        "action": "get_foos",
        "errors": [],
        "body": {
            "foos": [
                {"_type": "foo", "id": 1, "name": "One Foo", "bar_id": 2},
                {"_type": "foo", "id": 2, "name": "Two Foo", "bar_id": 6},
                {"_type": "foo", "id": 3, "name": "Red Foo", "bar_id": 6},
            ],
        },
    }

Note that the ``foo`` objects contain the field ``bar_id``, which corresponds to the ``source_field`` in the ``bar`` expansion. Using this response, the ``Client`` automatically makes a call to the ``bar_example`` service using the ``bar_id`` values from the ``foo`` response. The call is equivalent to the following (but this is not code you would have to write):

.. code-block:: python

    client.call_action(
        service_name="bar_example",
        action="get_bars",
        body={"ids": [2, 6]},
    )

Notice that the ``bar`` IDs have been de-duplicated, so as to avoid unnecessary work done by the route target service (``bar_example``). The ``bar_example`` service returns the following response:

.. code-block:: python

    {
        "action": "get_bars",
        "errors": [],
        "body": {
            "bars": {
                2: {"_type": "bar", "id": 2, "stuff": "baz"},
                6: {"_type": "bar", "id": 6, "stuff": "qux"},
            },
        },
    }

The ``bar_example`` response is added to the original response from the ``foo_example`` service, adding the ``bar`` field (``destination_field``) to each object that has a source field (``bar_id``). The final response body looks like:

.. code-block:: python

    {
        "action": "get_foos",
        "errors": [],
        "body": {
            "foos": [
                {
                    "_type": "foo",
                    "id": 1,
                    "name": "One Foo",
                    "bar_id": 2,
                    "bar": {"_type": "bar", "id": 2, "stuff": "baz"},
                },
                {
                    "_type": "foo",
                    "id": 2,
                    "name": "Two Foo",
                    "bar_id": 6,
                    "bar": {"_type": "bar", "id": 6, "stuff": "qux"},
                },
                {
                    "_type": "foo",
                    "id": 3,
                    "name": "Red Foo",
                    "bar_id": 6,
                    "bar": {"_type": "bar", "id": 6, "stuff": "qux"},
                },
            ],
        },
    }

Client exceptions
*****************

- ``ImproperlyConfigured``: The ``Client`` tried to call a service for which it did not have configuration
- ``Client.JobError``: Raised by ``Client.call_***`` methods when the Job response contains job-level errors
- ``Client.CallActionError``: Raised by ``Client.call_***`` methods when one or more Actions in the response(s) contain action-level errors
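A sketch of handling those exceptions around a blocking call is shown below. It assumes a configured ``client`` as in the earlier sketches, a hypothetical ``user`` service, and that ``JobError`` exposes an ``errors`` list and ``CallActionError`` an ``actions`` list; treat those attribute names as assumptions to verify against the reference documentation.

.. code-block:: python

    from pysoa.client.client import Client

    try:
        response = client.call_action("user", "get_user", body={"user_id": 42})
    except Client.JobError as e:
        # Job-level errors: the whole Job failed
        for error in e.errors:
            print("job error:", error.code, error.message)
    except Client.CallActionError as e:
        # Action-level errors: inspect the ActionResponse objects carrying errors
        for action_response in e.actions:
            for error in action_response.errors:
                print("action error:", error.code, error.message)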
Serialization
+++++++++++++

The ``Serializer`` class allows Clients and Servers to communicate using a common format. This library provides serializer classes for `MessagePack `_ (the default and recommended) and JSON formats, and the base ``Serializer`` class can be extended to use any format that a developer may wish to use.

The ``Serializer`` interface is simple:

``Serializer``
**************

Class Attributes

- ``mime_type``: A unique string that identifies the type of serializer used to encode a message, generally of the form ``application/format``, where ``format`` is the lower-case alphanumeric name of the message format (currently, this is unused, but it may be used in the future to allow a server to support multiple serializers simultaneously and use the one matching a MIME type passed from the client)

Methods

- ``dict_to_blob``: Takes a Python dictionary and serializes it to a binary string
- ``blob_to_dict``: Takes a binary string and deserializes it to a Python dictionary

Provided serializers
********************

MessagePack Serializer
----------------------

- Backend: `msgpack-python `_
- Types supported: ``bool``, ``int``, ``str`` (``unicode``/2 or ``str``/3), ``dict``, ``list``, ``tuple``, ``bytes`` (``str``/2 or ``bytes``/3), ``date``, ``time``, ``datetime``, ``decimal.Decimal``, and ``currint.Amount``
- Other notes:

  - Makes no distinction between ``list`` and ``tuple`` types—both types will be deserialized as lists

JSON Serializer
---------------

- Backend: `json `_
- Types supported: ``bool``, ``int``, ``str`` (``unicode``/2 or ``str``/3), ``dict``, ``list``, ``tuple``
- Other notes:

  - Makes no distinction between ``list`` and ``tuple`` types—both types will be deserialized as lists
  - Fairly incomplete at the moment, relative to the MessagePack serializer, and may or may not be improved to support additional types in the future (doing so would require departing from the JSON specification)

Serializer configuration
************************

The config schema for ``Serializer`` objects is just the basic PySOA plugin schema:

.. code-block:: python

    {
        "path": <serializer class import path>,
        "kwargs": <constructor keyword arguments>,
    }

Serializer exceptions
*********************

- ``InvalidField``: Raised when the serializer fails to serialize a message; contains the arguments from the original exception raised by the serialization backend's encoding function
- ``InvalidMessage``: Raised when the serializer fails to deserialize a message; contains the arguments from the original exception raised by the serialization backend's decoding function
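A small usage sketch of the provided MessagePack serializer follows. The import path reflects PySOA's ``pysoa.common.serializer`` module, but verify it (and the class name) against your installed version.

.. code-block:: python

    from pysoa.common.serializer import MsgpackSerializer

    serializer = MsgpackSerializer()

    blob = serializer.dict_to_blob({"user_id": 42, "active": True})  # a binary string
    data = serializer.blob_to_dict(blob)                             # back to a dict

    assert data == {"user_id": 42, "active": True}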
Transport
+++++++++

The ``transport`` module provides an interface for sending messages between clients and servers. While the Client and Server concepts deal with the high-level functionality of sending, receiving, and handling requests and responses without any concern about their method of transmission, Transports are responsible for the low-level details of actually transmitting PySOA protocol messages via specific backends.

There are two base classes, from which all concrete Transports must inherit:

``ClientTransport``
*******************

Methods

- ``send_request_message``: Serialize and send a request message to a service server
- ``receive_response_message``: Receive the first available response that a service server has sent back to this client and return a tuple of the request ID and deserialized response message

For full details of these methods and their usage, view the `ClientTransport reference documentation `_.

``ServerTransport``
*******************

Methods

- ``receive_request_message``: Receive the first available request message that any client has sent to this service and return a tuple of the request ID, the request metadata, and the deserialized request message
- ``send_response_message``: Serialize and send a response to the client that sent the corresponding request

For full details of these methods and their usage, view the `ServerTransport reference documentation `_.

Transport configuration
***********************

The configuration schema for ``Transport`` classes is the same as for other PySOA plugins, though transports will generally provide an extended schema with more strict ``kwargs`` values.

.. code-block:: python

    {
        "path": <transport class import path>,
        "kwargs": <constructor keyword arguments>,
    }

Transport exceptions
********************

- ``ConnectionError``: The transport failed to connect to its message backend
- ``InvalidMessageError``: The transport tried to send or receive a message that was malformed
- ``MessageReceiveError``: The transport encountered any non-timeout error while trying to receive a message
- ``MessageReceiveTimeout``: The transport timed out while waiting to receive a message
- ``MessageSendError``: The transport encountered any non-timeout error while trying to send a message
- ``MessageSendTimeout``: The transport timed out while trying to send a message
- ``MessageTooLarge``: The message passed to the transport exceeded the maximum size allowed by that transport

Redis Gateway Transport
***********************

The ``transport.redis_gateway`` module provides a transport implementation that uses Redis (in standard or Sentinel mode) for sending and receiving messages. This is the recommended transport for use with PySOA, as it provides a convenient and performant backend for asynchronous service requests. A single Redis server running on a ``c5.xlarge`` EC2 instance has been tested to handle about 10,000 PySOA requests and responses per second at about 50% CPU usage and about 16,000 PySOA requests and responses per second at about 80% CPU usage. A cluster of three masters of that size can easily handle about 45,000 requests and responses per second.

The PySOA Redis Gateway Transport is tested and certified against Redis 5 and Redis 6 and is commonly known to work with Redis 3 and 4 (though those versions are no longer supported). PySOA 2.0 will require Redis 6. Currently, the transport supports the `Redis Python library `_ versions 2.10+ and 3.4+ and does not support versions older than 2.10 or any of the 3.0.x-3.3.x versions. PySOA 2.0 will require at least version 3.5 and may require an even newer version (this has not been decided yet).

Standard and Sentinel modes
---------------------------

The Redis Gateway transport has two primary modes of operation: in "standard" mode, the transport will connect to a specified list of Redis hosts (which must all be master servers that support both read and write operations), while in "Sentinel" mode, the transport will connect to a list of Sentinel hosts and use Sentinel to find one or more Redis masters.

In either mode, if there is just one master, all operations will happen against that one master. If there are multiple masters, operations will proceed as follows:

1. The client uses round-robin to pick a master to which to send a request.
2. The client uses a predictable hashing algorithm to pick a master from which to receive a response, based on the response-receiving queue name.
3. The server uses round-robin to pick a master from which to receive requests.
4. Once the server has processed a request and is ready to send a response, it uses the same hashing algorithm to pick a master to which to send the response, based on the queue name to which it is supposed to send that response, such that it will always send to the same master on which the client is "listening."

Configuration
-------------

The Redis Gateway transport takes the following extra keyword arguments for configuration (an example configuration follows the list):

- ``backend_type``: Either "redis.standard" or "redis.sentinel" to specify which Redis backend to use (required)
- ``backend_layer_kwargs``: A dictionary of arguments to pass to the backend layer

  + ``connection_kwargs``: A dictionary of arguments to pass to the underlying Redis client (see the documentation for the `Redis-Py library `_)
  + ``hosts``: A list of strings (host names / IP addresses) or tuples (host names / IP addresses and ports) for Redis hosts or Sentinels to which to connect (will use "localhost" by default)
  + ``redis_db``: The Redis database number to use (a shortcut for specifying ``connection_kwargs['db']``)
  + ``redis_port``: The connection port to use (a shortcut for providing this for every entry in ``hosts``)
  + ``sentinel_failover_retries``: How many times to retry (with an exponential-backoff delay) getting a connection from the Sentinel when a master cannot be found (cluster is in the middle of a failover) (only for type "redis.sentinel") (fails on the first error by default)
  + ``sentinel_services``: Which Sentinel services to use (only for type "redis.sentinel") (will be auto-discovered from the Sentinel by default, but that can slow down connection startup)

- ``message_expiry_in_seconds``: How long a message may remain in the queue before it is considered expired and discarded (defaults to 60 seconds, and Client code can pass a custom timeout to ``Client`` methods)
- ``queue_capacity``: The maximum number of messages a given Redis queue may hold before the transport should stop pushing messages to it (defaults to 10,000)
- ``queue_full_retries``: The number of times the transport should retry (with an exponential-backoff delay) sending to a Redis queue that is at capacity before it raises an error and stops trying (defaults to 10)
- ``receive_timeout_in_seconds``: How long the transport should block waiting to receive a message before giving up (on the Server, this controls how often the server request-process loops; on the Client, this controls how long before it raises an error for waiting too long for a response, and Client code can pass a custom timeout to ``Client`` methods) (defaults to 5 seconds)
- ``default_serializer_config``: A standard serializer configuration as described in `Serializer configuration`_ (defaults to MessagePack), used to determine how requests are serialized (responses are always serialized according to the MIME content type of the request)
- ``log_messages_larger_than_bytes``: Defaults to 102,400 bytes; a warning will be logged whenever the transport sends messages larger than this (set this to 0 to disable the warning)
- ``maximum_message_size_in_bytes``: Defaults to 102,400 bytes on the client and 256,000 bytes on the server; defines the threshold at which ``MessageTooLarge`` will be raised
- ``chunk_messages_larger_than_bytes``: This option exists only for the Server transport (not the Client transport) and controls the threshold at which responses will be chunked. Chunked responses allow your servers to return very large responses to clients without blocking single-threaded Redis for long periods of time with the I/O from a single very large response. With chunking, each small chunk competes for Redis resources as if it were its own response, resulting in an infrastructure more tolerant of large responses. By default, this is -1 (disabled). If you configure this value, it must be at least 102,400 bytes, and ``maximum_message_size_in_bytes`` must also be configured to be at least 5 times larger (because the maximum message size is still enforced, above which not even chunking is allowed). You will probably also want to increase ``log_messages_larger_than_bytes`` to avoid verbose response logging.
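Tying those keyword arguments together, here is a hedged example of a server-side transport configuration using Sentinel mode. The transport ``path`` follows the Redis Gateway module layout, and all host names and values are illustrative; verify the path and defaults against your installed version.

.. code-block:: python

    SOA_SERVER_SETTINGS = {
        # ... other server settings as described under "Server configuration" ...
        "transport": {
            "path": "pysoa.common.transport.redis_gateway.server:RedisServerTransport",
            "kwargs": {
                "backend_type": "redis.sentinel",
                "backend_layer_kwargs": {
                    "hosts": [
                        ("sentinel-1.example.com", 26379),
                        ("sentinel-2.example.com", 26379),
                    ],
                    "redis_db": 0,
                    "sentinel_failover_retries": 3,
                },
                "message_expiry_in_seconds": 60,
                "queue_capacity": 10000,
                "queue_full_retries": 10,
                "receive_timeout_in_seconds": 5,
                "maximum_message_size_in_bytes": 256000,
            },
        },
    }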
Redis Authentication Support
----------------------------

Both the Standard and Sentinel modes of the Redis Gateway transport support Redis authentication, and both traditional password-only authentication and the new ACL user-password authentication added to Redis 6 are supported. ACL user-password authentication requires at least Redis 6 and at least version 3.4.1 of the `Redis Python library `_. Password-only authentication works with Redis 5 and at least version 2.10.0 of the Redis Python library.

Before proceeding with password-only authentication, be sure you have read and understand the `Redis "Authentication feature" documentation `_. For ACLs, be sure you have read and understand the `Redis "ACL" documentation `_.

To configure password-only authentication, specify it in a ``connection_kwargs`` item within the ``backend_layer_kwargs`` argument (above) of either the standard or Sentinel transports:

.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'password': 'the_super_secret_redis_password',
            ...
        },
        ...
    },
    ...

Configuring ACL user-password authentication is nearly identical, supplementing ``password`` with ``username``:

.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'username': 'service_transport_user',
            'password': 'the_service_transport_user_password',
            ...
        },
        ...
    },
    ...

Redis TLS Support
-----------------

The Standard mode of the Redis Gateway Transport supports TLS when communicating with the Redis server, and the Sentinel mode supports TLS when communicating with the Redis server only, with the Sentinel server only, or both, depending on your setup. Before proceeding with TLS, be sure you have read and understand the `Redis "TLS Support" documentation `_.

Using TLS in Standard mode is straightforward and can be demonstrated with the following example ``connection_kwargs``:

.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'ssl': True,
            'ssl_ca_certs': '/path/to/ca.crt',
            'ssl_certfile': '/path/to/redis.crt',
            'ssl_keyfile': '/path/to/redis.key',
            ...
        },
        ...
    },
    ...

In this example, the client is using TLS to talk to the server and also providing a client certificate, which is the default Redis TLS configuration. ``ssl_ca_certs`` is the PEM-encoded file containing the Certificate Authority certificates used to sign both the server certificate and the client certificate. ``ssl_certfile`` is the PEM-encoded file containing the client certificate, and ``ssl_keyfile`` is the TLS key used to generate the client certificate. If you have configured Redis with ``tls-auth-clients no`` to disable client certificates, you do not need the ``ssl_certfile`` and ``ssl_keyfile`` arguments, but you still need the ``ssl_ca_certs`` argument.
TLS authentication when Sentinel is in the mix is considerably more involved:

* If TLS is enabled on your master and replicas but Sentinel does not talk to those servers using TLS, your service and clients will also not be able to talk to Redis using TLS. The Redis Gateway Transport asks Sentinel to give it the address for the current Redis master. If Sentinel is talking to the Redis master on a non-TLS address, it will give the transport that non-TLS address.
* If TLS is enabled on your master and replicas and Sentinel talks to those servers using TLS (``sentinel.conf`` contains ``tls-replication yes`` and ``sentinel monitor`` specifies the TLS port), but Sentinel itself is not listening on a TLS port (``tls-port`` is not in ``sentinel.conf``), your service and clients will be able to talk to Redis over TLS but will not be able to talk to Sentinel over TLS. This configuration is actually fine—sensitive data that your services transact is not transmitted over the Sentinel connection; it is transmitted only over the Redis connection.
* If TLS is enabled on your master and replicas and Sentinel talks to those servers using TLS (``sentinel.conf`` contains ``tls-replication yes`` and ``sentinel monitor`` specifies the TLS port) and Sentinel itself is also configured to listen on a TLS port (``sentinel.conf`` contains ``tls-port`` and related config options), your service and clients can talk to both Redis and Sentinel over TLS. This is the most secure configuration (it prevents a MitM attack from redirecting your services and clients to a malicious Redis server).

The example below demonstrates the third condition, where the transport can use TLS to talk to both Redis and Sentinel:

.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'ssl': True,
            'ssl_ca_certs': '/path/to/ca.crt',
            'ssl_certfile': '/path/to/redis.crt',
            'ssl_keyfile': '/path/to/redis.key',
            ...
        },
        'sentinel_kwargs': {
            ...
            'ssl': True,
            'ssl_ca_certs': '/path/to/ca.crt',
            'ssl_certfile': '/path/to/redis.crt',
            'ssl_keyfile': '/path/to/redis.key',
            ...
        },
        ...
    },
    ...

As above, this example demonstrates both TLS and client authentication certificates. You can omit ``ssl_certfile`` and ``ssl_keyfile`` if Redis is configured with ``tls-auth-clients no``. If you want to talk only to Redis over TLS and not Sentinel, you can omit all of the ``ssl*`` arguments from ``sentinel_kwargs``.

Note that TLSv1.2 is supported basically universally, but TLSv1.3 requires your Python services and clients to be running on a system with at least OpenSSL 1.1.1 installed. Python running on systems with older versions of OpenSSL will not be able to connect to Redis servers with TLSv1.3. As such, take care when configuring the ``tls-protocols`` option in your Redis configuration.

Timeouts and Keep-Alive
-----------------------

The Standard and Sentinel modes of the Redis Gateway Transport automatically configure sane defaults for timeouts to ensure proper operation of your transports. However, you may wish to override those. The following demonstrates the *default* values for the Standard mode:

.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'socket_connect_timeout': 5.0,  # seconds
            'socket_keepalive': True,
            ...
        },
        ...
    },
    ...

Specify those keys with your own values in your configuration to override those defaults. The following demonstrates the *default* values for the Sentinel mode:
.. code-block:: python

    ...
    'backend_layer_kwargs': {
        ...
        'connection_kwargs': {
            ...
            'socket_connect_timeout': 5.0,  # seconds
            'socket_keepalive': True,
            ...
        },
        'sentinel_kwargs': {
            ...
            'socket_connect_timeout': 5.0,  # seconds
            'socket_timeout': 5.0,  # seconds
            'socket_keepalive': True,
            ...
        },
        ...
    },
    ...

Specify those keys with your own values in your configuration to override those defaults.

Middleware
++++++++++

Middleware for both the Server and the Client uses an onion calling pattern, where each middleware accepts a callable and returns a callable. Each middleware in the stack is called with the middleware below it, and the base-level middleware is called with a base processing method from the ``Server`` or ``Client`` class.

``ServerMiddleware``
********************

The ``ServerMiddleware`` class has an interface that allows it to act at the Job level or at the Action level, or both, depending on which part(s) of the interface it implements. It has two methods, ``job`` and ``action``, each of which wraps a callable that does the work of processing a Job or Action. See the `ServerMiddleware reference documentation `_ for more information about how to implement Server middleware.

``ClientMiddleware``
********************

Client middleware works similarly to Server middleware, using an onion calling pattern. Client middleware is built around the client request/response workflow. The ``ClientMiddleware`` class has two methods, ``request`` and ``response``, each of which wraps a callable that does the work of sending or receiving, respectively. See the `ClientMiddleware reference documentation `_ for more information about how to implement Client middleware.

Middleware configuration
************************

``Middleware`` classes are configured using the standard PySOA plugin schema, though specific middleware will generally provide an extended schema with more strict ``kwargs`` values.

.. code-block:: python

    {
        "path": <middleware class import path>,
        "kwargs": <constructor keyword arguments>,
    }
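As a sketch of the onion calling pattern described above, here is a simple Server middleware that times each Job. The base-class import path and the shape of the wrapped callable reflect common PySOA usage but should be treated as assumptions, and the module path in the configuration note below is hypothetical.

.. code-block:: python

    import time

    from pysoa.server.middleware import ServerMiddleware


    class TimingMiddleware(ServerMiddleware):
        """Wraps Job processing and reports how long each Job took."""

        def job(self, process_job):
            def handler(job_request):
                start = time.time()
                try:
                    return process_job(job_request)
                finally:
                    print("job took {:.3f}s".format(time.time() - start))
            return handler

Such a middleware would then be configured with the plugin schema above, for example ``{"path": "my_service.middleware:TimingMiddleware", "kwargs": {}}``.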
Metrics
+++++++

PySOA is capable of recording detailed metrics about the use and performance of its Client and Server transports and its sending and receiving processes. If you wish to gather metrics about the performance of PySOA, you will need to enable metrics recording in your server settings and/or your client settings and provide an object that PySOA can use to record these metrics.

PyMetrics
*********

PySOA uses `PyMetrics `_ to record metrics about the behavior and performance of servers, clients, and transports. You can read more about how to use and configure PyMetrics in its documentation.

PySOA does not automatically include any sort of library- or service-distinguishing prefix or tags in the metrics it records (see `Which metrics are recorded`_ below). We recommend that your configuration append some type of prefix to all metric names passed to it (or use tagging if your metrics backend understands that) so that you can group all PySOA metrics together.

When a PySOA server starts up, however, it will look through your metrics settings for the replacement strings ``{{fid}}``, ``[[fid]]``, or ``%%fid%%`` and, if those exist and PySOA forking is in use, those replacement strings will be replaced with the server instance's deterministic fork ID. This is helpful for including the fork ID in tags for distributed gauges. For more information about tags and distributed gauges, see the linked PyMetrics documentation.

Which metrics are recorded
**************************

These are all the metrics recorded in PySOA:

- ``server.transport.redis_gateway.backend.initialize``: A timer indicating how long it took the Redis Gateway server transport to initialize a backend Redis client
- ``server.transport.redis_gateway.backend.sentinel.populate_master_client``: A counter incremented each time the Redis Gateway server transport Sentinel backend has to get a new master client for any given service (shard)
- ``server.transport.redis_gateway.backend.sentinel.master_not_found_retry``: A counter incremented each time the Redis Gateway server transport Sentinel backend retries getting master info due to master failover (only happens if ``sentinel_failover_retries`` is enabled)
- ``server.transport.redis_gateway.send``: A timer indicating how long it takes the Redis Gateway server transport to send a response
- ``server.transport.redis_gateway.send.message_size``: A histogram indicating the total size of the response sent back to the client
- ``server.transport.redis_gateway.send.error.missing_reply_queue``: A counter incremented each time the Redis Gateway server transport is unable to send a response because the message metadata is missing the required ``reply_to`` attribute
- ``server.transport.redis_gateway.send.serialize``: A timer indicating how long it takes the Redis Gateway transport to serialize a message
- ``server.transport.redis_gateway.send.error.message_too_large``: A counter incremented each time the Redis Gateway transport fails to send because the message exceeds the maximum configured message size (which defaults to 100KB on the client and 250KB on the server)
- ``server.transport.redis_gateway.send.queue_full_retry``: A counter incremented each time the Redis Gateway transport re-tries sending a message because the message queue was temporarily full
- ``server.transport.redis_gateway.send.queue_full_retry.retry_{1...n}``: A counter incremented on each queue-full retry for a particular retry number
- ``server.transport.redis_gateway.send.get_redis_connection``: A timer indicating how long it takes the Redis Gateway transport to get a connection to the Redis cluster or sentinel
- ``server.transport.redis_gateway.send.send_message_to_redis_queue``: A timer indicating how long it takes the Redis Gateway transport to push a message onto the queue
- ``server.transport.redis_gateway.send.error.connection``: A counter incremented each time the Redis Gateway transport encounters an error retrieving a connection while sending a message
- ``server.transport.redis_gateway.send.error.redis_queue_full``: A counter incremented each time the Redis Gateway transport fails to push a message onto a full queue after the maximum configured retries
- ``server.transport.redis_gateway.send.error.response``: A counter incremented each time the Redis Gateway transport encounters an error from Redis (logged) while sending a message
- ``server.transport.redis_gateway.send.error.unknown``: A counter incremented each time the Redis Gateway transport encounters an unknown error (logged) sending a message
- ``server.transport.redis_gateway.receive``: A timer indicating how long it takes the Redis Gateway server transport to receive a request (however, this includes time waiting for an incoming request, so it may not be meaningful)
- ``server.transport.redis_gateway.receive.get_redis_connection``: A timer indicating how long it takes the Redis Gateway transport to get a connection to the Redis cluster or sentinel
- ``server.transport.redis_gateway.receive.pop_from_redis_queue``: A timer indicating how long it takes the Redis Gateway transport to pop a message from the Redis queue (however, this includes time waiting for an incoming message, so it may not be meaningful)
- ``server.transport.redis_gateway.receive.error.connection``: A counter incremented each time the Redis Gateway transport encounters an error retrieving a connection while receiving a message
- ``server.transport.redis_gateway.receive.error.unknown``: A counter incremented each time the Redis Gateway transport encounters an unknown error (logged) receiving a message
- ``server.transport.redis_gateway.receive.deserialize``: A timer indicating how long it takes the Redis Gateway transport to deserialize a message
- ``server.transport.redis_gateway.receive.error.message_expired``: A counter incremented each time the Redis Gateway transport receives an expired message
- ``server.transport.redis_gateway.receive.error.no_request_id``: A counter incremented each time the Redis Gateway transport receives a message missing the required request ID
- ``server.error.response_conversion_failure``: A counter incremented each time a response object fails to convert to a dict in the server
- ``server.error.job_error``: A counter incremented each time a handled error occurs processing a job
- ``server.error.unhandled_error``: A counter incremented each time an unhandled error occurs processing a job
- ``server.error.error_formatting_failure``: A counter incremented each time an error occurs while formatting an error for the response
- ``server.error.variable_formatting_failure``: A counter incremented each time an error occurs while formatting the variables attached to an error
- ``server.error.unknown``: A counter incremented each time some unknown error occurs that escaped all other error detection
- ``server.idle_time``: A timer indicating how long the server idled between when it sent one response and received the next request (this is a good gauge of how burdened your servers are: a high number means your servers are idling a lot and not receiving many requests, while a very low number means your servers are doing a lot of work and you might need to add more servers)
- ``server.worker.running``: A distributed gauge whose value is set to 1 when the server starts up, renewed to 1 at least every 5 seconds, and set to 0 when the server shuts down
- ``server.worker.busy``: A distributed gauge whose value is set to 1 when the server receives and begins processing a request and set to 0 when the server completes sending a response back to a client. Combined with the ``server.worker.running`` metric, this is particularly useful for calculating the busyness of your service: how many of the workers running at a given moment are actually busy
- ``client.middleware.initialize``: A timer indicating how long it took to initialize all middleware when creating a new client handler
- ``client.transport.initialize``: A timer indicating how long it took to initialize the transport when creating a new client handler
- ``client.transport.redis_gateway.backend.initialize``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.backend.sentinel.populate_master_client``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.backend.sentinel.master_not_found_retry``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send``: A timer indicating how long it took the Redis Gateway client transport to send a request
- ``client.transport.redis_gateway.send.message_size``: A histogram indicating the total size of the request sent to the server
- ``client.transport.redis_gateway.send.serialize``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.error.message_too_large``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.queue_full_retry``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.queue_full_retry.retry_{1...n}``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.get_redis_connection``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.send_message_to_redis_queue``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.error.connection``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.error.redis_queue_full``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.error.response``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.send.error.unknown``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive``: A timer indicating how long it took the Redis Gateway client transport to receive a response (however, this includes time blocking for a response, so it may not be meaningful)
- ``client.transport.redis_gateway.receive.get_redis_connection``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.pop_from_redis_queue``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.error.connection``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.error.timeout``: A counter incremented each time a client times out waiting on a response from the server
- ``client.transport.redis_gateway.receive.error.unknown``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.deserialize``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.error.message_expired``: Client metric with the same meaning as the server metric
- ``client.transport.redis_gateway.receive.error.no_request_id``: Client metric with the same meaning as the server metric
- ``client.send.excluding_middleware``: A timer indicating how long it took to send a request through the configured transport, excluding any time spent in middleware
- ``client.send.including_middleware``: A timer indicating how long it took to send a request through the configured transport, including any time spent in middleware
- ``client.receive.excluding_middleware``: A timer indicating how long it took to receive a response through the configured transport, excluding any time spent in middleware (however, this includes time blocking for a response, so it may not be meaningful)
- ``client.receive.including_middleware``: A timer indicating how long it took to receive a response through the configured transport, including any time spent in middleware (however, this includes time blocking for a response, so it may not be meaningful)

Customizing configuration
+++++++++++++++++++++++++

PySOA is configured using `Conformity's Settings feature `_, which helps define complex application settings schemas that are carefully and strictly validated. You can read more about settings and how they work there. PySOA includes subclasses of Conformity's base ``Settings`` class to validate general, client, and server settings.

Included ``Settings`` subclasses
********************************

There are several ``Settings`` subclasses provided throughout PySOA, and you can view more about them in the `reference documentation `_. This is a summary of them:

``pysoa.common.settings.SOASettings``
    Provides a schema that is shared by both Servers and Clients. Its schema:

    - ``transport``: Import path and keyword args for a ``Transport`` class.
    - ``metrics``: Import path and keyword args for a `PyMetrics `_ ``MetricsRecorder`` class (defaults to a no-op/null recorder).
    - ``middleware``: List of dicts containing import path and keyword args for a ``ClientMiddleware`` or ``ServerMiddleware`` class.

Both the ``client`` and ``server`` modules implement their own subclasses that inherit from ``SOASettings``. Developers implementing ``Client`` or ``Server`` subclasses may wish to subclass the respective settings class in order to alter or extend the settings.

Client Settings
    ``pysoa.client.settings.ClientSettings`` extends ``SOASettings`` to provide a client-specific schema. It enforces that ``middleware`` contains only Client middleware and that the ``transport`` is a Client transport.
Server Settings
    ``pysoa.server.settings.ServerSettings`` extends ``SOASettings`` to provide a server-specific schema. It enforces that ``middleware`` contains only Server middleware and that the ``transport`` is a Server transport, and it also adds:

    - ``client_routing``: Client settings for any PySOA clients that the server or its middleware will need to create to call other services; if provided, the server adds a ``Client`` instance as key ``client`` to the ``action_request`` dict before passing it to the middleware and as attribute ``client`` to the ``action_request`` object before passing it to the action; each key must be a unicode string service name and each value the corresponding ``ClientSettings``-enforced client settings dict
    - ``logging``: Settings for configuring Python logging in the standard Python logging configuration format:

      - ``version``: Must be the value 1 until Python supports something different
      - ``formatters``: A dict of formatter IDs to dicts of formatter configs
      - ``filters``: A dict of filter IDs to dicts of filter configs
      - ``handlers``: A dict of handler IDs to dicts of handler configs
      - ``loggers``: A dict of logger names to dicts of logger configs
      - ``root``: The root logger config dict
      - ``incremental``: A Boolean for whether the configuration is to be interpreted as incremental to the existing configuration (Python defaults this to ``False``, and so does PySOA)
      - ``disable_existing_loggers``: A Boolean for whether existing loggers are to be disabled (Python defaults this to ``True`` for legacy reasons and ignores its value if ``incremental`` is ``True``; PySOA defaults this value to ``False`` to allow module-level ``getLogger`` calls, and you almost never want to change it to ``True``)

    - ``harakiri``: Settings for killing long-running jobs that may have run away or frozen, or blocked transport processes that may be in a bind, unable to become unblocked; a dict with the following format:

      - ``timeout``: After this many seconds without finishing processing a request or receiving a transport timeout, the server will attempt to gracefully shut down (the value 0 disables this feature; defaults to 300 seconds)
      - ``shutdown_grace``: If a graceful shutdown does not succeed, the server will forcefully shut down after this many additional seconds (must be greater than 0; defaults to 30 seconds)
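Finally, here is a hedged sketch of extending ``ServerSettings`` as described under `Customizing configuration`_ above. The ``schema`` and ``defaults`` class attributes follow Conformity's ``Settings`` conventions, and the extra setting shown is purely hypothetical.

.. code-block:: python

    from conformity import fields

    from pysoa.server.server import Server
    from pysoa.server.settings import ServerSettings


    class MyServiceSettings(ServerSettings):
        schema = {
            "feature_flag_file": fields.UnicodeString(
                description="Path to a hypothetical file of feature flags for this service",
            ),
        }

        defaults = {
            "feature_flag_file": "/etc/my_service/flags.json",
        }


    class MyServer(Server):
        service_name = "my_service"
        settings_class = MyServiceSettings
        action_class_map = {}  # actions omitted from this sketch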