Releases: openziti/ziti
v0.19.3
Release 0.19.3
What's New
- Metric events formatting has changed
Metric Events Changes
Each metric now gets its own event. Here are two example events:
{
  "metric": "xgress.tx_write_time",
  "metrics": {
    "xgress.tx_write_time.count": 0,
    "xgress.tx_write_time.m1_rate": 0,
    "xgress.tx_write_time.mean": 0,
    "xgress.tx_write_time.p99": 0
  },
  "namespace": "metrics",
  "source_event_id": "62c31ab9-e0ed-48f5-9907-2d2e8c76f393",
  "source_id": "pTF3hzUQI",
  "timestamp": "2021-02-23T19:33:39.017329033Z"
}
{
  "metric": "link.rx.msgsize",
  "metrics": {
    "link.rx.msgsize.count": 3,
    "link.rx.msgsize.mean": 0,
    "link.rx.msgsize.p99": 0
  },
  "namespace": "metrics",
  "source_entity_id": "8VEJ",
  "source_event_id": "62c31ab9-e0ed-48f5-9907-2d2e8c76f393",
  "source_id": "pTF3hzUQI",
  "timestamp": "2021-02-23T19:33:39.017329033Z"
}
Changes of note:
- The metric name is now listed
- There's a new source_event_id which can be used to link together all the metrics that were
  reported at a given time
- The timestamp format has been changed to match the other event times. Format is: RFC3339Nano
- Metrics which formerly had an id in them, such as link and control channel metrics, now have the
  id extracted. The id is stored in the source_entity_id field.
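A consumer can use source_event_id to reassemble everything reported in one pass. The following is a minimal Python sketch, not part of Ziti: it assumes events arrive as one JSON object per line, and the function name group_by_report is hypothetical.

```python
import json
from collections import defaultdict

# Two events in the new format, trimmed from the examples above.
# Assumption: one JSON object per line; the real delivery format may differ.
RAW_EVENTS = """
{"metric": "xgress.tx_write_time", "metrics": {"xgress.tx_write_time.count": 0, "xgress.tx_write_time.p99": 0}, "namespace": "metrics", "source_event_id": "62c31ab9-e0ed-48f5-9907-2d2e8c76f393", "source_id": "pTF3hzUQI", "timestamp": "2021-02-23T19:33:39.017329033Z"}
{"metric": "link.rx.msgsize", "metrics": {"link.rx.msgsize.count": 3, "link.rx.msgsize.p99": 0}, "namespace": "metrics", "source_entity_id": "8VEJ", "source_event_id": "62c31ab9-e0ed-48f5-9907-2d2e8c76f393", "source_id": "pTF3hzUQI", "timestamp": "2021-02-23T19:33:39.017329033Z"}
"""


def group_by_report(lines):
    """Group metric events by source_event_id (one group per reporting pass)."""
    reports = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("namespace") != "metrics":
            continue
        prefix = event["metric"] + "."
        # Strip the repeated metric name from each value key ("count", "p99", ...).
        values = {
            (key[len(prefix):] if key.startswith(prefix) else key): value
            for key, value in event["metrics"].items()
        }
        # Key by (metric, entity id) so per-link metrics do not overwrite each other.
        entity = event.get("source_entity_id")
        reports[event["source_event_id"]][(event["metric"], entity)] = values
    return dict(reports)


reports = group_by_report(RAW_EVENTS.splitlines())
```

Keying on the pair of metric name and source_entity_id keeps metrics for different links apart, while metrics without an entity id use None in that slot.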
v0.19.2
Release 0.19.2
Bug fixes
- Fix edge router synchronization from stopping after workers exit
- Session validation intervals in the edge router were calculated incorrectly
- Notes on the configuration values related to session validation were missing from the 0.19 release
  notes. They have been added in the section called Edge Session Validation
v0.19.1
Release 0.19.1
Bug fixes
- Fix v0.18.x - v0.19.x API Session id incompatibility, all API Sessions and Sessions are deleted during this upgrade
- Fix Edge Router double connect leading to panics during Edge Router REST API rendering
What's New
- Ziti CLI now has 'Let's Encrypt' PKI support to facilitate TLS connections to the Controller from
  BrowZer-based apps that use the ziti-sdk-js.
  - New command to Register a Let's Encrypt account, then create and install a certificate
    Usage:
      ziti pki le create -d domain -p path-to-where-data-is-saved [flags]
    Flags:
      -a, --acmeserver string   ACME CA hostname (default "https://acme-v02.api.letsencrypt.org/directory")
      -d, --domain string   Domain for which Cert is being generated (e.g. me.example.com)
      -e, --email string   Email used for registration and recovery contact (default "[email protected]")
      -h, --help   help for create
      -k, --keytype EC256|EC384|RSA2048|RSA4096|RSA8192   Key type to use for private keys (default RSA4096)
      -p, --path string   Directory to use for storing the data
      -o, --port string   Port to listen on for HTTP based ACME challenges (default "80")
      -s, --staging   Enable creation of 'staging' Certs (instead of production Certs)
  - New command to Display Let's Encrypt certificates and accounts information
    Usage:
      ziti pki le list -p path-to-where-data-is-saved [flags]
    Flags:
      -a, --accounts   Display Account info
      -h, --help   help for list
      -n, --names   Display Names info
      -p, --path string   Directory where data is stored
  - New command to Renew a Let's Encrypt certificate
    Usage:
      ziti pki le renew -d domain -p path-to-where-data-is-saved [flags]
    Flags:
      -a, --acmeserver string   ACME CA hostname (default "https://acme-v02.api.letsencrypt.org/directory")
          --days int   The number of days left on a certificate to renew it (default 14)
      -d, --domain string   Domain for which Cert is being generated (e.g. me.example.com)
      -e, --email string   Email used for registration and recovery contact (default "[email protected]")
      -h, --help   help for renew
      -k, --keytype EC256|EC384|RSA2048|RSA4096|RSA8192   Key type to use for private keys (default RSA4096)
      -p, --path string   Directory where data is stored
      -r, --reuse-key   Used to indicate you want to reuse your current private key for the renewed certificate (default true)
      -s, --staging   Enable creation of 'staging' Certs (instead of production Certs)
  - New command to Revoke a Let's Encrypt certificate
    Usage:
      ziti pki le revoke -d domain -p path-to-where-data-is-saved [flags]
    Flags:
      -a, --acmeserver string   ACME CA hostname (default "https://acme-v02.api.letsencrypt.org/directory")
      -d, --domain string   Domain for which Cert is being generated (e.g. me.example.com)
      -e, --email string   Email used for registration and recovery contact (default "[email protected]")
      -h, --help   help for revoke
      -p, --path string   Directory where data is stored
      -s, --staging   Enable creation of 'staging' Certs (instead of production Certs)
v0.19.0
Release 0.19.0
Breaking Changes
- Edge session validation is now handled at the controller, not the edge router
- Routing across the overlay is now handled in parallel, rather than serially. This changes the
syntax and semantics of a couple of control plane messages between the controller and the
connected routers. See the section below on Parallel Routing for additional details.
- API Session synchronization improvements and pluggability
Bug fixes
- ziti ps now supports router-disconnect and router-reconnect, which disconnects/reconnects the
  router from the controller. This allows easier testing of various failure states. Requires that
  --debug-ops is passed to ziti-router on startup.
- Golang SDK hosted service listeners are now properly closed when they receive close notifications
- Golang SDK now recovers if the session is gone
- Golang SDK now stops some go-routines that were previously left running after the SDK context was
  closed
- Fix session leak caused by using half close when tunneling UDP connections
- Fix connection leak caused by not closing the UDP connection when its activity timer expires
API Changes
- Fabric Xctrl instances are now notified when the control channel reconnects
- Fabric Xctrl instances may now provide message decoders for the trace infrastructure so that
custom messages will be properly displayed in trace logs
Edge Session Validation
Before 0.19, edge sessions (note: network sessions, not API sessions) would be sent to edge routers
after they were created. When the edge router received a dial or bind request it would verify that
the session was valid, then request the controller to create a fabric session.
This approach has two downsides.
- There is a race condition where the edge router may receive a dial/bind request before it has
  received the session from the controller. It thus has to wait a while before declaring the
  session invalid.
- Sessions need to be managed across multiple edge routers, since we don't know where the client
  will connect. This adds a lot of control channel traffic.
Since the edge router makes a request to the controller anyway, we can pass the session token and
fingerprints up to the controller and do the verification there. This allows us to minimize the
amount of state the edge router needs to keep synchronized with the controller and removes the race
condition.
Parallel Routing
Prior to 0.19, the Ziti controller would send a Route message to the terminating router first, to establish terminator endpoint connectivity. If the destination endpoint was unreachable, the entire session setup would be abandoned. If the terminator responded successfully, the controller would then proceed to work through the chain of routers sending Route messages and creating the appropriate forwarding table entries. This all happened sequentially.
In 0.19, route setup for session creation happens in parallel. The controller sends Route commands to all of the routers in the chain (including the terminating router), and waits for responses and/or times out those responses. If all of the participating routers respond affirmatively within the timeout period, the entire session creation succeeds. If any participating router responds negatively, or the timeout period elapses, the session creation attempt fails and the configured termination weights are updated. Session creation will retry up to a configured number of attempts. Each attempt will perform a fresh path selection to ensure that failed terminators can be excluded from subsequent attempts.
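The fan-out, all-or-nothing wait, and retry with fresh path selection can be sketched as follows. This is an illustrative Python model under stated assumptions, not the controller's Go implementation: route_in_parallel, create_session, select_path, and send_route are hypothetical names, and the simple exclusion set stands in for the terminator weight updates described above.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def route_in_parallel(path, send_route, timeout_seconds):
    """Send a route command to every router in the path at once.

    Returns True only if every router answers affirmatively before the timeout.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(path)))
    try:
        futures = [pool.submit(send_route, router) for router in path]
        done, pending = wait(futures, timeout=timeout_seconds)
        if pending:
            return False  # at least one router did not answer in time
        return all(f.exception() is None and f.result() for f in done)
    finally:
        # Do not block on stragglers; a real implementation would cancel them.
        pool.shutdown(wait=False)


def create_session(select_path, send_route, timeout_seconds=10, attempts=3):
    """Retry route setup, selecting a fresh path for every attempt."""
    failed_terminators = set()
    for _ in range(attempts):
        path = select_path(exclude=failed_terminators)
        if not path:
            break
        if route_in_parallel(path, send_route, timeout_seconds):
            return path
        # Simplification: remember the terminating router of the failed path.
        failed_terminators.add(path[-1])
    return None


# Demo with two candidate paths; terminator "t1" is pretended to be unreachable.
PATHS = [["r1", "r2", "t1"], ["r1", "r3", "t2"]]


def select_path(exclude):
    for path in PATHS:
        if path[-1] not in exclude:
            return path
    return None


def send_route(router):
    return router != "t1"


chosen = create_session(select_path, send_route, timeout_seconds=1)
```

In the demo the first attempt fails because t1 rejects the route, so the second attempt selects the path ending in t2.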
Configuration of Parallel Routing
The terminationTimeoutSeconds timeout parameter has been removed and will be ignored. The routeTimeoutSeconds parameter controls the timeout for each route attempt.
#network:
  #
  # routeTimeoutSeconds controls the number of seconds the controller will wait for a route attempt to succeed.
  #
  #routeTimeoutSeconds: 10
You'll want to ensure that your participating routers' getSessionTimeout in the Xgress options is large enough to support the configured number of routing attempts, at the configured routing attempt timeout. In the router configuration, the getSessionTimeout value is configured for your Xgress listeners like this:
listeners:
  # basic ssh proxy
  - binding: proxy
    address: tcp:0.0.0.0:1122
    service: ssh
    options:
      getSessionTimeout: 120s
The new parallel routing implementation also supports a configurable number of session creation attempts. Prior to 0.19, the number of attempts was hard-coded at 3. In 0.19, the number of retries is controlled by the createSessionRetries parameter, which defaults to 3.
network:
  #
  # createSessionRetries controls the number of retries that will be attempted to create a circuit (and terminate it)
  # for new sessions.
  #
  createSessionRetries: 5
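As a rough sizing check, the worst-case routing time is the per-attempt timeout multiplied by the number of attempts, and getSessionTimeout should exceed it. The Python sketch below assumes createSessionRetries counts total attempts (the notes use "attempts" and "retries" interchangeably); the helper names are hypothetical.

```python
def parse_seconds(value):
    """Parse a simple duration such as '120s' or '3m' into seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


def worst_case_routing_seconds(route_timeout_seconds, create_session_retries):
    """Upper bound on time spent routing if every attempt times out."""
    return route_timeout_seconds * create_session_retries


# Values taken from the configuration examples above.
budget = worst_case_routing_seconds(route_timeout_seconds=10, create_session_retries=5)
fits = budget <= parse_seconds("120s")
```

With routeTimeoutSeconds at 10 and createSessionRetries at 5, the bound is 50 seconds, which fits inside a getSessionTimeout of 120s.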
API Session Synchronization
Prior to 0.19, API Sessions could be synchronized with connecting/reconnecting edge routers in only
a single manner. From 0.19 onward, improvements allow multiple strategies to be defined
within the same code base. Future releases will be able to introduce configurable and negotiable
strategies.
The default strategy from prior releases, now named 'instant', has been improved to
fix issues that could arise during edge router reconnects where API Sessions would become invalid
on the reconnecting edge router. In addition, the instant strategy now allows for invalid
synchronization detection, resync requests, enhanced logging, and synchronization statuses for edge routers.
Edge Router Synchronization Status
The GET /edge-routers list and GET /edge-routers/<id> detail responses now include a syncStatus
field. This value is updated during the lifetime of the edge router's connection to the controller
and will provide insight on its status.
The possible syncStatus values are as follows:
- "SYNC_NEW" - connection accepted but no strategy actions have been taken
- "SYNC_QUEUED" - connection handed to a strategy and waiting for processing
- "SYNC_HELLO_TIMEOUT" - sync failed due to a hello timeout, requeued for hello
- "SYNC_HELLO" - controller edge hello being sent
- "SYNC_HELLO_WAIT" - hello received from router and queued for processing
- "SYNC_RESYNC_WAIT" - router requested a resync and queued for processing
- "SYNC_IN_PROGRESS" - synchronization processing
- "SYNC_DONE" - synchronization completed, router is now in maintenance updates
- "SYNC_UNKNOWN" - state is unknown, edge router misbehaved, error state
- "SYNC_DISCONNECTED" - strategy was disconnected before finishing, error state
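A monitoring script might collapse these values into a few buckets. The Python sketch below is one possible grouping, not an official one: the bucket names and the classify and summarize helpers are hypothetical, and the input records are assumed to carry name and syncStatus fields.

```python
DONE_STATES = {"SYNC_DONE"}
ERROR_STATES = {"SYNC_UNKNOWN", "SYNC_DISCONNECTED"}
# SYNC_HELLO_TIMEOUT is treated as pending because the router is requeued for hello.
PENDING_STATES = {
    "SYNC_NEW",
    "SYNC_QUEUED",
    "SYNC_HELLO_TIMEOUT",
    "SYNC_HELLO",
    "SYNC_HELLO_WAIT",
    "SYNC_RESYNC_WAIT",
    "SYNC_IN_PROGRESS",
}


def classify(sync_status):
    """Map a syncStatus value to 'done', 'pending', or 'error'."""
    if sync_status in DONE_STATES:
        return "done"
    if sync_status in PENDING_STATES:
        return "pending"
    return "error"  # documented error states and anything unrecognized


def summarize(edge_routers):
    """Bucket edge router names by their classified sync status."""
    summary = {"done": [], "pending": [], "error": []}
    for router in edge_routers:
        status = router.get("syncStatus", "SYNC_UNKNOWN")
        summary[classify(status)].append(router["name"])
    return summary


# Hypothetical records, shaped like entries from GET /edge-routers.
routers = [
    {"name": "er1", "syncStatus": "SYNC_DONE"},
    {"name": "er2", "syncStatus": "SYNC_HELLO_WAIT"},
    {"name": "er3", "syncStatus": "SYNC_DISCONNECTED"},
]
summary = summarize(routers)
```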
v0.18.10
Release 0.18.10
What's New
- Close terminating xgress instances if no start is received within a configurable timeout
  - Timeout is set in the router config under listener/dialer options: sessionStartTimeout
  - Default value: 3m
- Don't add a second shutdown timer if one is already set
- Allow list/updating router forwarding tables if --debug-ops is passed
  - new command ziti ps route <optional target> <session> <src-address> <dest-address>
  - new command ziti ps dump-routes <optional target>
- If an xgress session fails in retransmit, sends fault notification to controller so
  controller can fix path or remove session, depending on session state
v0.18.9
Release 0.18.9
What's New
- Fix PATCH OS Posture Checks clearing data
- Fix ziti-tunnel panic when removing services. edge#517
- ziti-tunnel tproxy now supports --lanIf option to automatically add rules to accept incoming
  connections for service connections. edge#519
- Fix orphaned ottca enrollments after DELETE /cas/
- Add build info output when starting router/controller
- Remove histograms from xgress dataflow path as they were causing bottlenecks
v0.18.8
Release 0.18.8
What's New
- Websocket binding in Edge Router to support connections from BrowZer-based apps that use
  the ziti-sdk-js. Edge Routers support various configurations including a single tls binding, a
  single ws binding, or having both tls and ws bindings simultaneously. If both binding types
  are used, they must be specified with different ports.
- Edge Router list for current identity
- Fix terminator PATCH, don't update/clear peer data unless requested
- Fix concurrency related crash in router
- Fix resource leak in fabric: forwarder tables weren't always updated after unroute
Websocket Binding
# Example Edge Router config snippet (note new `ws` address type):
listeners:
  - binding: edge
    address: ws:0.0.0.0:3021
    options:
      advertise: curt-edge-ws-router:3021
  - binding: edge
    address: tls:0.0.0.0:3022
    options:
      advertise: curt-edge-ws-router:3022
- Fix router memory leak: purge session from forwardTable during unroutTimeout
Edge Router List For Current Identity
A new endpoint has been added which will display the list of Edge Routers an authenticated session
has access to via any policy. The records will indicate whether the router is online, its hostname,
and its supported protocols. This endpoint will not return Edge Routers that have not completed
enrollment. Edge Routers that are offline will not have hostname and supported protocol information.
Endpoint: GET /current-identity/edge-routers
Example Output:
{
  "data": [
    {
      "createdAt": "2021-01-27T20:13:18.599Z",
      "id": "LolSlAQMq",
      "tags": {},
      "updatedAt": "2021-01-27T20:13:19.762Z",
      "hostname": "",
      "isOnline": false,
      "name": "er1",
      "supportedProtocols": {}
    },
    {
      "createdAt": "2021-01-27T20:13:19.308Z",
      "id": "oVzRl6kCq",
      "tags": {},
      "updatedAt": "2021-01-27T20:13:19.901Z",
      "hostname": "127.0.0.1:5002",
      "isOnline": true,
      "name": "er2",
      "supportedProtocols": {
        "tls": "tls://127.0.0.1:5002",
        "wss": "wss://127.0.0.1:5002"
      }
    }
  ],
  "meta": {
    "filterableFields": [
      "id",
      "createdAt",
      "updatedAt",
      "name"
    ],
    "pagination": {
      "limit": 10,
      "offset": 0,
      "totalCount": 2
    }
  }
}
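A client could use this response to pick routers it can actually reach. The Python sketch below works on a trimmed copy of the example output; the online_endpoints helper is a hypothetical name, and fetching the response (authentication, HTTP transport) is left out.

```python
import json

# Trimmed copy of the example output above.
RESPONSE = """
{
  "data": [
    {"id": "LolSlAQMq", "hostname": "", "isOnline": false, "name": "er1",
     "supportedProtocols": {}},
    {"id": "oVzRl6kCq", "hostname": "127.0.0.1:5002", "isOnline": true, "name": "er2",
     "supportedProtocols": {"tls": "tls://127.0.0.1:5002", "wss": "wss://127.0.0.1:5002"}}
  ],
  "meta": {"pagination": {"limit": 10, "offset": 0, "totalCount": 2}}
}
"""


def online_endpoints(response_body, protocol="tls"):
    """Return {router name: address} for online routers offering the protocol."""
    body = json.loads(response_body)
    return {
        router["name"]: router["supportedProtocols"][protocol]
        for router in body["data"]
        if router["isOnline"] and protocol in router["supportedProtocols"]
    }


tls_routers = online_endpoints(RESPONSE)
wss_routers = online_endpoints(RESPONSE, protocol="wss")
```

Offline routers such as er1 drop out automatically, since they report neither a hostname nor supported protocols.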
v0.18.7
Release 0.18.7
What's New
- Update ziti-tunnel service polling
  - Now uses new api (when available) to skip refreshing services if no services have been changed
  - Polling rate is passed through to sdk, so actual poll rate is now controlled
- Don't panic on router startup, just show error message and exit
- Fix resource leak: go-routine on terminators using the edge_transport binding
- Fix resource leak: retransmission timers for xgress instances not being shutdown when xgress
  closed
- Control channel metrics now available
- Fix potential deadlock on xgress close
- When closing due to fault notification, wait for data coming from remote to stop, not data from
  both local and remote, since local writes may never stop, due to retransmission attempts
- Add new config option to router xgressCloseCheckInterval, which dictates for how long data flow
  from the remote should have stopped before closing xgress after receiving fault
- limit none for Edge API Rest requests is now properly limited to 500 elements on list endpoints
- The HTTP header server is now populated on all responses with ziti-controller/vX.Y.Z
Control Channel Metrics
Note: This feature is only available if both controller and router are on 0.18.7 or higher.
The control channels between the controller and routers now generate metrics, including:
ctrl.<router id>.latency
ctrl.<router id>.tx.bytesrate
ctrl.<router id>.tx.msgrate
ctrl.<router id>.tx.msgsize
ctrl.<router id>.rx.bytesrate
ctrl.<router id>.rx.msgrate
ctrl.<router id>.rx.msgsize
There is a new controller config file setting: ctrlChanLatencyIntervalSeconds, which controls how
often the latency probe is sent. Default value: 10
v0.18.6
Release 0.18.6
What's New
- Fix ziti install terraform-provider-edgecontroller
v0.18.5
Release 0.18.5
What's New
- Update go-openapi libraries
- xgress_edge refactor, should fix 'failed to dispatch to fabric' errors
- Update ziti use command to work with main branch
- MFA Support
- Fix deadlock on session close in router when connection is at capacity
- Fix issue where end of session didn't get sent in some scenarios
MFA Support
Endpoint MFA is now available. It is based on RFC 4226 (HOTP: An HMAC-Based One-Time Password
Algorithm) and RFC 6238 (TOTP: Time-Based One-Time Password Algorithm). These standards are
compatible with standard "Authenticator" apps such as Google Authenticator and Authy. MFA is
recommended in situations where human operators are involved and additional security is
desired.
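For readers unfamiliar with the two RFCs, the code computation can be sketched in a few lines of Python. This illustrates the standards only and is not Ziti's implementation; the 30-second step and 6-digit length are the common RFC defaults and are assumptions here, and the secret is the published RFC test key.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA-1 over the 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP with the counter derived from the current time."""
    return hotp(secret, unix_time // step, digits)


# Shared secret used by the test vectors in both RFCs.
SECRET = b"12345678901234567890"

first_hotp = hotp(SECRET, 0)               # RFC 4226 Appendix D, count 0
rfc_totp = totp(SECRET, 59, digits=8)      # RFC 6238 Appendix B, time 59, SHA-1
```

Because both sides derive the counter from the clock, the client and server only need to share the secret exchanged during enrollment.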
Restricting Service Access
Services can now have a Posture Check of type MFA that can be created and associated with a Service
Policy. Service Policies that are associated with an MFA Posture Check will restrict access to
services if a client has not enrolled in MFA and passed an MFA check on each login.
MFA Posture Checks support only the basic Posture Check fields:
- name - a name for the posture check
- typeId - should be "MFA"
- tags - any tags desired for this object
- roleAttributes - role attributes used to select this object from Service Policies
Example:
POST /posture-checks
{
  "name": "Any MFA",
  "typeId": "MFA",
  "roleAttributes": ["mfa"]
}
Admin MFA Management
Admins of the Ziti Edge API can remove MFA from any user. However, they cannot enroll on behalf of
the client. The client will have to initiate MFA enrollment via their client.
Endpoints:
- DELETE /identities/<id>/mfa - remove MFA from an identity
- GET /identities - has a new field isMfaEnabled that is true/false based on the identity's MFA
  enrollment
- GET /identities/<id>/posture-data - now includes a sessionPostureData field which is a map of
  sessionId -> session posture data (including MFA status)
Example Posture Data:
{
  "mac": ["03092ac3bc69", "2b6df1dc52d9"],
  "domain": "mycorp.com",
  "os": {
    ...
  },
  "processes": [
    ...
  ],
  "sessionPostureData": {
    "xV1442s": {
      "mfa": {
        "passedMfa": true
      }
    }
  }
}
Client MFA Enrollment
Clients must individually enroll in MFA as the enrollment process includes exchanging a symmetric
key. During MFA enrollment the related MFA endpoints will return different data and HTTP status
codes based upon the state of MFA enrollment (enrollment not started, enrollment started, enrolled).
The general MFA enrollment flow is:
- Authenticate as the identity via POST /authenticate
- Start MFA enrollment via POST /current-identity/mfa
- Retrieve the MFA provisioning URL or QR code
  - GET /current-identity/mfa
  - GET /current-identity/mfa/qr-code
- Use the provisioning URL or QR code with an authentication app such as Google Authenticator,
  Authy, etc.
- Use a current code from the authenticator to POST /current-identity/mfa/verify with the code in
  the code field {"code": "someCode"}
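The flow above can be sketched as a small client routine. This Python example is an illustration under assumptions, not SDK code: send is a hypothetical callable that performs an already-authenticated request and returns (status, body), current_code is a hypothetical callback standing in for the authenticator app, and FakeApi is an in-memory stand-in whose 400 response for a wrong code is invented for the demo.

```python
def enroll_mfa(send, current_code):
    """Run the enrollment steps after POST /authenticate has succeeded."""
    status, _ = send("POST", "/current-identity/mfa", None)
    if status == 409:
        raise RuntimeError("MFA enrollment already started or completed")
    status, enrollment = send("GET", "/current-identity/mfa", None)
    if status != 200:
        raise RuntimeError("could not read MFA enrollment state")
    # The enrollment data would be handed to an authenticator app here.
    code = current_code(enrollment)
    status, _ = send("POST", "/current-identity/mfa/verify", {"code": code})
    return 200 <= status < 300


class FakeApi:
    """In-memory stand-in for the Edge API, for demonstration only."""

    def __init__(self, valid_code):
        self.state = "not_started"
        self.valid_code = valid_code
        self.calls = []

    def send(self, method, path, body):
        self.calls.append((method, path))
        if (method, path) == ("POST", "/current-identity/mfa"):
            if self.state != "not_started":
                return 409, {}
            self.state = "started"
            return 200, {}
        if (method, path) == ("GET", "/current-identity/mfa"):
            if self.state == "not_started":
                return 404, {}
            return 200, {"state": self.state}
        if (method, path) == ("POST", "/current-identity/mfa/verify"):
            if self.state == "not_started":
                return 404, {}
            if self.state == "enrolled":
                return 409, {}
            if body == {"code": self.valid_code}:
                self.state = "enrolled"
                return 200, {}
            return 400, {}  # invented status for a wrong code
        return 404, {}


api = FakeApi(valid_code="123456")
enrolled = enroll_mfa(api.send, lambda enrollment: "123456")
```

Injecting send keeps the sketch independent of any HTTP library and of how the API session is attached to requests.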
MFA Endpoints Overview:
This section is an overview of the endpoints. Each endpoint may return errors depending on input
and MFA status.
- GET /current-identity/mfa - returns the current state of MFA enrollment or 404 Not Found
- POST /current-identity/mfa - initiates MFA enrollment or 409 Conflict
- DELETE /current-identity/mfa - remove MFA enrollment, requires a valid TOTP or recovery code
- GET /current-identity/mfa/recovery-codes - returns the current recovery codes, requires a valid
  TOTP code
- POST /current-identity/mfa/recovery-codes - regenerates recovery codes, requires a valid TOTP
  code
- POST /current-identity/mfa/verify - allows MFA enrollment to be completed, requires a valid TOTP
  code
- GET /current-identity/mfa/qr-code - returns a QR code for use with QR code scanner, MFA
  enrollment must be started
- POST /authenticate/mfa - allows MFA authentication checks to be completed, requires a valid TOTP
  or recovery code
MFA Enrollment Not Started:
- GET /current-identity/mfa - returns HTTP status 404
- POST /current-identity/mfa - start MFA enrollment, 200 Ok
- DELETE /current-identity/mfa - returns 404 Not Found
- GET /current-identity/mfa/recovery-codes - returns 404 Not Found
- POST /current-identity/mfa - returns 404 Not Found
- POST /current-identity/mfa/verify - returns 404 Not Found
- GET /current-identity/mfa/qr-code - returns 404 Not Found
MFA Enrollment Started:
- GET /current-identity/mfa - returns the current MFA enrollment and recovery codes
- POST /current-identity/mfa - returns 409 Conflict
- DELETE /current-identity/mfa - aborts the current enrollment, a blank code may be supplied
- GET /current-identity/mfa/recovery-codes - returns 404 Not Found
- POST /current-identity/mfa - returns HTTP status 409 Conflict
- POST /current-identity/mfa/verify - validates the supplied code
- GET /current-identity/mfa/qr-code - returns a QR code for use with QR code scanner in PNG format
MFA Completed:
- GET /current-identity/mfa - returns the current MFA enrollment, but not recovery codes
- POST /current-identity/mfa - returns 409 Conflict
- DELETE /current-identity/mfa - removes MFA, a valid TOTP or recovery code must be supplied
- GET /current-identity/mfa/recovery-codes - shows the current recovery codes, a valid TOTP code
  must be supplied
- POST /current-identity/mfa - returns HTTP status 409 Conflict
- POST /current-identity/mfa/verify - returns HTTP status 409 Conflict
- GET /current-identity/mfa/qr-code - returns 404 Not Found
Client MFA Recovery Codes
Client MFA recovery codes are generated during enrollment and can be regenerated at any time with a
valid TOTP code. Twenty codes are generated and are one time use only. Generating new codes replaces
all existing recovery codes.
To view:
GET /current-identity/mfa/recovery-codes
{
  "code": "123456"
}
To generate new codes:
POST /current-identity/mfa/recovery-codes
{
  "code": "123456"
}
Authentication
During API Session authentication a new authQueries field is returned. This field indicates whether
there are any outstanding authentication Posture Queries that need to be fulfilled before
authentication is considered complete.
When MFA authentication is required, an entry will appear in authQueries with the following format:
{
  ...
  "token": "c68a187a-f4af-490c-a9dd-a09076511419",
  "authQueries": [
    ...,
    {
      "typeId": "MFA",
      "provider": "ZITI",
      "httpMethod": "POST",
      "httpUrl": "./authenticate/mfa",
      "minLength": 4,
      "maxLength": 6,
      "format": "alphaNumeric"
    },
    ...
  ]
}
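A client can inspect this structure to decide whether it must prompt for a code before continuing. The Python sketch below works on an already-parsed authentication response; pending_mfa_query and code_fits are hypothetical helper names, and the length and format check is only a client-side sanity check, not a substitute for the server's validation.

```python
def pending_mfa_query(auth_response):
    """Return the first MFA auth query in the response, or None."""
    for query in auth_response.get("authQueries", []):
        if query.get("typeId") == "MFA":
            return query
    return None


def code_fits(query, code):
    """Check a candidate code against the advertised length and format."""
    if not query["minLength"] <= len(code) <= query["maxLength"]:
        return False
    if query.get("format") == "alphaNumeric":
        return code.isascii() and code.isalnum()
    return True


# Parsed response, reduced to the fields shown in the example above.
response = {
    "token": "c68a187a-f4af-490c-a9dd-a09076511419",
    "authQueries": [
        {
            "typeId": "MFA",
            "provider": "ZITI",
            "httpMethod": "POST",
            "httpUrl": "./authenticate/mfa",
            "minLength": 4,
            "maxLength": 6,
            "format": "alphaNumeric",
        }
    ],
}

query = pending_mfa_query(response)
ok = code_fits(query, "123456")
too_short = code_fits(query, "12")
```

When a query is present, its httpMethod and httpUrl indicate where the code should be submitted.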