diff --git a/Component-ExternalSystemGateway.md b/Component-ExternalSystemGateway.md
index 80d989e..bb91f7f 100644
--- a/Component-ExternalSystemGateway.md
+++ b/Component-ExternalSystemGateway.md
@@ -31,10 +31,16 @@ Site clusters (executes calls directly to external systems). Central cluster (st
 Each external system definition includes:
 
 - **Name**: Unique identifier (e.g., "MES", "RecipeManager").
-- **Connection Details**: Endpoint URL, authentication, protocol.
-- **Retry Settings**: Max retry count, fixed time between retries (used by Store-and-Forward Engine).
+- **Base URL**: The root endpoint URL for the external system (e.g., `https://mes.example.com/api`).
+- **Authentication**: One of:
+  - **API Key**: Header name (e.g., `X-API-Key`) and key value.
+  - **Basic Auth**: Username and password.
+- **Timeout**: Per-system timeout for all method calls (e.g., 30 seconds). Applies to the HTTP request round-trip.
+- **Retry Settings**: Max retry count, fixed time between retries (used by Store-and-Forward Engine for transient failures only).
 - **Method Definitions**: List of available API methods, each with:
   - Method name.
+  - **HTTP method**: GET, POST, PUT, or DELETE.
+  - **Path**: Relative path appended to the base URL (e.g., `/recipes/{id}`).
   - Parameter definitions (name, type).
   - Return type definition.
 
@@ -58,6 +64,29 @@ Each database connection definition includes:
 - Payload includes: connection name, SQL statement, serialized parameter values.
 - If the database is unavailable, the write is buffered and retried per the connection's retry settings.
 
+## Invocation Protocol
+
+All external system calls are **HTTP/REST** with **JSON** serialization:
+
+- The ESG acts as an HTTP client. The external system definition provides the base URL; each method definition specifies the HTTP method and relative path.
+- Request parameters are serialized as JSON in the request body (POST/PUT) or as query parameters (GET/DELETE).
+- Response bodies are deserialized from JSON into the method's defined return type.
+- Credentials (API key header or Basic Auth header) are attached to every request per the system's authentication configuration.
+
+## Call Timeout & Error Handling
+
+- Each external system definition specifies a **timeout** that applies to all method calls on that system.
+- Error classification determines whether the Store-and-Forward Engine retries the call:
+  - **Transient failures** (connection refused, timeout, HTTP 5xx): The call is routed to the Store-and-Forward Engine for retry per the system's retry settings. The script does **not** block waiting for eventual delivery — the call is buffered and the script continues.
+  - **Permanent failures** (HTTP 4xx): No retry. The error is returned **synchronously** to the calling script for handling (log, notify, try different parameters, etc.). The failure is logged to Site Event Logging.
+- This classification ensures the S&F buffer is not polluted with requests that will never succeed.
+
+## Database Connection Management
+
+- Database connections use **standard ADO.NET connection pooling** per named connection. No custom pool management.
+- Pool behavior (max pool size, connection lifetime, etc.) can be tuned via connection string parameters in the database connection definition if needed.
+- Synchronous failures on `Database.Connection()` (e.g., unreachable server) return an error to the calling script, consistent with external system permanent failure handling.
+
 ## Dependencies
 
 - **Configuration Database (MS SQL)**: Stores external system and database connection definitions.
diff --git a/Component-StoreAndForward.md b/Component-StoreAndForward.md
index cfeb1a6..2fb84db 100644
--- a/Component-StoreAndForward.md
+++ b/Component-StoreAndForward.md
@@ -51,6 +51,8 @@ Retry settings are defined on the **source entity** (not per-message):
 
 The retry interval is **fixed** (not exponential backoff). Fixed interval is sufficient for the expected use cases.
 
+**Note**: Only **transient failures** are eligible for store-and-forward buffering. For external system calls, transient failures are connection errors, timeouts, and HTTP 5xx responses. Permanent failures (HTTP 4xx) are returned directly to the calling script and are **not** queued for retry. This prevents the buffer from accumulating requests that will never succeed.
+
 ## Buffer Size
 
 There is **no maximum buffer size**. Messages accumulate in the buffer until delivery succeeds or retries are exhausted and the message is parked. Storage is bounded only by available disk space on the site node.
diff --git a/docs/plans/2026-03-16-external-system-gateway-refinement-design.md b/docs/plans/2026-03-16-external-system-gateway-refinement-design.md
new file mode 100644
index 0000000..ab9981b
--- /dev/null
+++ b/docs/plans/2026-03-16-external-system-gateway-refinement-design.md
@@ -0,0 +1,56 @@
+# External System Gateway Refinement — Design
+
+**Date**: 2026-03-16
+**Component**: External System Gateway (`Component-ExternalSystemGateway.md`)
+**Status**: Approved
+
+## Problem
+
+The External System Gateway doc lacked specification for the invocation protocol, authentication methods, call timeouts, error classification for store-and-forward decisions, and database connection management.
+
+## Decisions
+
+### Invocation Protocol
+- **HTTP/REST only** with **JSON** serialization. The ESG is an HTTP client with predefined endpoints.
+- Method definitions include HTTP method (GET/POST/PUT/DELETE) and relative path.
+- Parameters serialized as JSON body (POST/PUT) or query parameters (GET/DELETE).
+
+### Outbound Authentication
+- Two modes per external system definition:
+  - **API Key**: Configurable header name and key value.
+  - **Basic Auth**: Username and password, sent as standard HTTP Authorization header.
+
+### Call Timeouts
+- **Per-system timeout** — one timeout value applies to all methods on a given external system.
+- Defined in the external system definition.
+
+### Error Classification
+- **Transient failures** (connection errors, timeouts, HTTP 5xx): Routed to Store-and-Forward for retry. Script does not block.
+- **Permanent failures** (HTTP 4xx): No retry. Error returned synchronously to the calling script. Logged to Site Event Logging.
+- S&F buffer only accepts transient failures to avoid accumulating unrecoverable requests.
+
+### Permanent Failure Behavior
+- Synchronous error back to script, consistent with DCL write failure handling.
+
+### Database Connection Pooling
+- Standard ADO.NET connection pooling per named connection. No custom pool logic.
+- Pool tuning via connection string parameters if needed.
+
+### Serialization
+- JSON only, consistent with REST-only decision.
+
+## Affected Documents
+
+| Document | Change |
+|----------|--------|
+| `Component-ExternalSystemGateway.md` | Updated External System Definition fields (base URL, auth modes, timeout, HTTP method/path per method). Added 3 new sections: Invocation Protocol, Call Timeout & Error Handling, Database Connection Management. |
+| `Component-StoreAndForward.md` | Clarified that only transient failures are buffered; 4xx errors are not queued. |
+
+## Alternatives Considered
+
+- **SOAP support**: Rejected — REST covers modern integrations; SOAP systems can be fronted by a thin REST wrapper.
+- **OAuth2 client credentials**: Rejected — adds token lifecycle complexity at every site for marginal benefit; can be handled by a gateway/proxy.
+- **Per-method timeouts**: Rejected — external systems tend to have consistent latency; per-system is the right granularity.
+- **All failures retryable**: Rejected — retrying 4xx errors pollutes the S&F buffer with requests that will never succeed.
+- **Custom connection pooling**: Rejected — ADO.NET pooling is battle-tested and handles this scenario natively.
+- **XML serialization option**: Rejected — JSON-only is consistent with REST-only; XML systems can use a wrapper.
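
The error classification rule in this change can be sketched as follows. This is a minimal Python illustration, not the gateway's implementation: `FailureKind`, `classify_outcome`, and `route` are hypothetical names, and treating any status outside 4xx/5xx as a non-failure is an assumption, since the design only specifies how 4xx and 5xx responses are handled.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    """Outcome categories from the design's Error Classification section."""
    NONE = "none"            # no failure (assumed for statuses outside 4xx/5xx)
    TRANSIENT = "transient"  # connection error, timeout, or HTTP 5xx
    PERMANENT = "permanent"  # HTTP 4xx


def classify_outcome(status_code: Optional[int],
                     transport_error: bool = False) -> FailureKind:
    """Classify a single call attempt.

    status_code is None when no HTTP response arrived at all
    (connection refused, timeout). transport_error lets the caller
    flag such a failure explicitly.
    """
    if transport_error or status_code is None:
        return FailureKind.TRANSIENT
    if 500 <= status_code <= 599:
        return FailureKind.TRANSIENT
    if 400 <= status_code <= 499:
        return FailureKind.PERMANENT
    # Assumption: the design does not cover 1xx/3xx, so they are
    # treated as non-failures in this sketch.
    return FailureKind.NONE


def route(kind: FailureKind) -> str:
    """Map a classification to the handling path described in the design.

    The returned strings are illustrative labels only.
    """
    if kind is FailureKind.TRANSIENT:
        # Buffered for retry; the calling script does not block.
        return "store_and_forward"
    if kind is FailureKind.PERMANENT:
        # No retry; the error goes back synchronously and is logged.
        return "return_error_to_script"
    return "return_result"


if __name__ == "__main__":
    for status in (200, 404, 503, None):
        kind = classify_outcome(status)
        print(status, kind.value, route(kind))
```

Keeping classification separate from routing means the Store-and-Forward buffer only ever receives calls already known to be transient, which matches the stated goal of keeping unrecoverable requests out of the buffer.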