docs: former-api-specs (MES + DNC/Delmia) + inbound compile-error known issue

- former-api-specs/mes: Alarm-API, MoveIn-MoveOut-API, API-key authgaps (from ~/Desktop/mesapi)
- former-api-specs/dnc: Delmia-Integration-API — Delmia document service + WW recipe-download notify (from ~/Desktop/delmiaintegration)
- known-issues: inbound API compile error not client-visible; no api-method validate
Joseph Doherty
2026-06-26 04:13:19 -04:00
parent 33da8c797c
commit 8a78e759c0
6 changed files with 1009 additions and 0 deletions
@@ -0,0 +1,18 @@
# Former API Specs
Reference specifications for the **predecessor / legacy APIs** that ScadaBridge replaces or
interoperates with. These are *not* ScadaBridge's own contracts — they are kept here as
historical reference when porting integrations onto ScadaBridge's Inbound API (#14) and
External System Gateway (#7), and for parity-checking behavior during cutover.
ScadaBridge's own component design docs live under [`docs/requirements/`](../requirements);
this folder holds only the **outgoing/legacy** systems' specs.
## Subfolders
- [`mes/`](mes) — the legacy MES API (ServiceStack "WWSupport / APIServer", a.k.a. `mesapi`)
that bridges MES ↔ AVEVA Wonderware via MXAccess. Source repo: `~/Desktop/mesapi`.
- [`dnc/`](dnc) — the legacy DNC integration: DELMIA Apriso document/recipe download +
Wonderware recipe-download notification (`~/Desktop/delmiaintegration`).
Drop the relevant spec documents into the matching subfolder.
@@ -0,0 +1,206 @@
# Delmia / DNC Integration API — document download & WW recipe-download notification
Reference for the **`delmiaintegration`** solution (`~/Desktop/delmiaintegration`), the legacy
bridge that pulls "proven" manufacturing documents (NC programs / recipes) out of **DELMIA Apriso**
and notifies **AVEVA Wonderware** that a recipe was downloaded for a machine. The recipe/NC-program
push to machines is classic **DNC** (Distributed Numerical Control), hence this lives under
`former-api-specs/dnc/`.
There are **two distinct API surfaces** in this solution:
| # | Surface | Direction | Transport |
|---|---------|-----------|-----------|
| A | **Delmia document web service** | integration → Delmia (consumed) | form-urlencoded POST, **XML** response |
| B | **WW recipe-download notification** | `WWNotifier` CLI → Wonderware receiver | **JSON** POST (`/notify`) |
> Surface B is the **legacy predecessor of the ScadaBridge `DelmiaRecipeDownload` inbound method** —
> same field set and result shape (see *ScadaBridge equivalent* below).
Source files (under `~/Desktop/delmiaintegration`):
- `DelmiaIntegration/DelmiaClient.cs` — Delmia HTTP client (Surface A)
- `DelmiaContracts/*.cs` — XML contracts (`DownloadResult`, `SearchResults`/`SearchResult`, + unused `MachineInfo`/`MachineSearchResults`/`UserInfo`)
- `WWNotifier/Program.cs`, `WWNotifier/CommandLineOptions.cs`, `WWNotifier/Models/RecipeDownload*.cs` — the CLI notifier (Surface B)
- `WWNotifier/App.config`: `NotifyURL` / `NotifyTimeout`
### End-to-end flow
```
DELMIA Apriso ──(A) RequestProvenDocument / RequestDocument / Search──► DelmiaClient
(intercim "ruleset" web service, XML) │ downloads proven doc/recipe to a path
WWNotifier.exe ──(B) POST /notify {RecipeDownload}──► Wonderware "WW receiver"
(CLI, exit code + YES/NO) ◄── {RecipeDownloadResult} ──
```
---
## Surface A — Delmia document web service (consumed by `DelmiaClient`)
`DelmiaClient` (`DelmiaClient.cs`) is a thin `HttpClient` wrapper. The **base URL is supplied by
the caller** (constructor / `URL` property — there is no default in `DelmiaIntegration/App.config`);
`Timeout` defaults to **30 s**. Every call is an `application/x-www-form-urlencoded` POST to
`{URL}/<Operation>`, and every response is **XML** deserialized with `XmlSerializer` (root
namespace `http://intercim.com/ruleset` — the Apriso/InterCIM heritage).
| Verb | Path | Form fields | Response (XML) | Client method |
|------|------|-------------|----------------|---------------|
| `POST` | `{URL}/RequestProvenDocument` | `username`, `machineID`, `partNumber`, `operationNumber`, `workOrderNumber` | `DownloadResult` | `RequestProvenDocument[Async]` |
| `POST` | `{URL}/RequestDocument` | + `documentKey` | `DownloadResult` | `RequestDocument[Async]` |
| `POST` | `{URL}/Search` | `username`, `machineID`, `partNumber`, `operationNumber` | `SearchResults` | `Search[Async]` |
- **`RequestProvenDocument`** — fetch the single *proven* (released/approved) document for the
part + operation, logging it against `username` / `workOrderNumber`.
- **`Search`** — list candidate documents matching the part/operation (each carries a
`DocumentKey` + `DocumentURL`).
- **`RequestDocument`** — fetch one specific document chosen from a search by `documentKey`.
`username` here is **identity/audit only** — it is a form field, not an authentication credential.
There is **no API key, no `Authorization` header, no TLS requirement** on this surface (see gotchas).
### Response — `DownloadResult` (metadata about the download)
`DownloadResult` describes *who/what/which order+document* was downloaded — it does **not** carry
the file bytes (the file lands at a download path on the Delmia side). Fields:
| Group | Fields |
|-------|--------|
| User | `UserKey` (int), `UserName`, `UserSite` |
| Machine | `MachineKey` (int), `MachineID`, `MachineSite` |
| Order | `WorkOrderNumber`, `ShopOrderKey` (int), `ShopOrderID`, `ShopOrderStatus`, `ShopOrderOperKey` (int), `ShopOrderOperID`, `ShopOrderOperStatus` |
| Document | `DocumentKey` (int), `DocumentName`, `DocumentRev`, `DocumentStatus` |
| Part | `PartID`, `PartRev` |
| Outcome | **`TransferSuccessful`** (bool), `ErrorMessage` (string) |
> **Success is `TransferSuccessful`, not the HTTP status.** Treat `TransferSuccessful == true` as the
> only success signal; `ErrorMessage` carries the reason otherwise.
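
As a rough illustration of that rule, the sketch below parses a `DownloadResult` body with Python's standard library and derives success only from `TransferSuccessful`. The sample XML is invented, and its element layout (one child element per property, in the `http://intercim.com/ruleset` namespace) assumes `XmlSerializer` defaults rather than a captured response; `parse_download_result` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the contract description above.
NS = {"r": "http://intercim.com/ruleset"}

# Invented sample body; the element layout is an assumption (XmlSerializer defaults).
SAMPLE = """<DownloadResult xmlns="http://intercim.com/ruleset">
  <MachineID>000005</MachineID>
  <DocumentName>EXAMPLE.NC</DocumentName>
  <TransferSuccessful>false</TransferSuccessful>
  <ErrorMessage>No proven document found</ErrorMessage>
</DownloadResult>"""


def parse_download_result(xml_text):
    """Return (ok, error_message); the HTTP status is deliberately not consulted."""
    root = ET.fromstring(xml_text)
    flag = root.findtext("r:TransferSuccessful", default="false", namespaces=NS)
    ok = flag.strip().lower() == "true"
    error = root.findtext("r:ErrorMessage", default="", namespaces=NS)
    return ok, error
```

A caller would branch on the first element of the returned tuple and surface the second one as the failure reason.
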
### Response — `SearchResults` / `SearchResult`
`SearchResults` = `{ List<SearchResult> Results; string ErrorMessage }`. Each `SearchResult`:
`ShopOrderKey`/`ShopOrderID`/`ShopOrderStatus`, `ShopOrderOperKey`/`ShopOrderOperID`/`ShopOrderOperStatus`,
`DocumentKey` (int), `DocumentObjectID` (int, with an `XmlIgnore` `…Specified` flag),
`DocumentName`, `DocumentRev`, `DocumentStatus`, **`DocumentURL`**, `PartID`, `PartRev`.
### Quick reference (curl, Surface A)
```bash
# Search for candidate documents (values from DownloadTestUtil's example)
curl -X POST "{URL}/Search" \
-H "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "username=dohertj2" \
--data-urlencode "machineID=000005" \
--data-urlencode "partNumber=00444455599" \
--data-urlencode "operationNumber=0100"
# -> <SearchResults xmlns="http://intercim.com/ruleset"> … </SearchResults>
# Request the proven document for that part/operation
curl -X POST "{URL}/RequestProvenDocument" \
-H "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "username=dohertj2" \
--data-urlencode "machineID=000005" \
--data-urlencode "partNumber=00444455599" \
--data-urlencode "operationNumber=0100" \
--data-urlencode "workOrderNumber=W111111"
```
---
## Surface B — WW recipe-download notification (`WWNotifier` → WW receiver)
`WWNotifier.exe` (`WWNotifier/Program.cs`) is a **command-line tool** invoked by the download
process to tell Wonderware that a recipe/NC file was placed at a path for a machine. It POSTs a
JSON `RecipeDownload` to the configured receiver and interprets the JSON `RecipeDownloadResult`.
### CLI options (`CommandLineOptions`)
| Short | Long | Required | Meaning |
|-------|------|----------|---------|
| `-d` | `--downloadpath` | **yes** | File download path |
| `-m` | `--machine` | **yes** | Machine code |
| `-w` | `--workorder` | **yes** | Work order number |
| `-p` | `--partnumber` | **yes** | Part / item number |
| `-s` | `--seqop` | no | Job step / sequence number |
| `-u` | `--username` | no | Operator username |
### Endpoint & payload
`POST {NotifyURL}` with a JSON body, expecting a JSON reply. From `WWNotifier/App.config`:
`NotifyURL = http://wonder-app-vd01.zmr.zimmer.com:9001/notify`, `NotifyTimeout = 30` (seconds).
`NotifyURL` may be a **comma-separated list** — each is tried in order and the **first success
wins** (failover).
**Request — `RecipeDownload` (JSON)**
| Field | Type | Source |
|-------|------|--------|
| `MachineCode` | string | `--machine` |
| `DownloadPath` | string | `--downloadpath` |
| `WorkOrderNumber` | string | `--workorder` |
| `PartNumber` | string | `--partnumber` |
| `JobStepNumber` | string | `--seqop` |
| `Username` | string | `--username` |
**Response — `RecipeDownloadResult` (JSON)**: `{ "Result": bool, "ResultText": string }`.
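
A minimal sketch of the failover described above, under stated assumptions: "success" is taken to mean that a receiver answered with parseable JSON (the source does not say whether a `Result` of `false` also moves on to the next URL), and `notify`, `split_notify_urls` and `http_post_json` are illustrative names, not `WWNotifier` code.

```python
import json
import urllib.request


def split_notify_urls(raw):
    """Split a comma-separated NotifyURL setting into a clean list."""
    return [u.strip() for u in raw.split(",") if u.strip()]


def http_post_json(url, payload, timeout):
    """POST a JSON body and decode the JSON reply (standard library only)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def notify(raw_urls, payload, timeout=30, post=http_post_json):
    """Try each receiver in list order; the first one that answers wins."""
    last_error = "no NotifyURL configured"
    for url in split_notify_urls(raw_urls):
        try:
            reply = post(url, payload, timeout)
        except Exception as exc:  # refused connection, timeout, unparseable reply
            last_error = f"{url}: {exc}"
            continue
        # Assumption: any decoded reply ends the failover loop.
        return bool(reply.get("Result")), reply.get("ResultText") or ""
    return False, last_error
```

The `post` parameter exists only so the loop can be exercised without a live receiver.
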
### Process contract (stdout + exit code)
The caller (Delmia) reads `WWNotifier`'s console output and exit code:
- On success → prints **`YES`**; exit code `0`.
- On any failure (arg parse, missing config, all receivers failed, `Result == false`) → prints
**`NO`** plus a reason line; `Environment.ExitCode = -1`.
```bash
WWNotifier.exe -m Z28061 -d "C:\recipes\wo111111.nc" -w W111111 -p P111111 -s 0100 -u chamalas
# stdout: YES (or: NO\n<reason>)
# POSTs {"MachineCode":"Z28061","DownloadPath":"C:\\recipes\\wo111111.nc", … } to /notify
```
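
For a caller written in another language, the process contract could be read roughly as follows. This is a sketch only: `interpret_notifier_output` is a hypothetical helper, and treating any non-zero exit code as failure is an assumption that sidesteps how `-1` is surfaced on a given platform.

```python
def interpret_notifier_output(exit_code, stdout):
    """Map WWNotifier's exit code and console output to (ok, reason)."""
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    verdict = lines[0] if lines else ""
    if exit_code == 0 and verdict == "YES":
        return True, ""
    if verdict == "NO":
        # The reason line(s) follow the NO verdict.
        return False, " ".join(lines[1:])
    # Assumption: anything else is treated as a failure.
    return False, "unexpected output"
```
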
---
## Other contracts (defined but unused)
`MachineInfo`, `MachineSearchResults` (`{ List<MachineInfo> Results; string ErrorMessage }`), and
`UserInfo` exist in `DelmiaContracts` (same `intercim.com/ruleset` namespace) but are **not wired to
any `DelmiaClient` operation** — scaffolding for machine-lookup / user-lookup endpoints that were
never implemented. Shapes, for reference: `MachineInfo` = `MachineKey`/`MachineID`/`MachineName`/
`DownloadPath`/`MachineDescription`/`MachineSite`/`MachineStatus`; `UserInfo` = `UserKey`/`UserName`/
`UserSite`/`IsActive`.
---
## Behavior notes & gotchas
- **No authentication on either surface.** Surface A's `username` is just a logged form field;
Surface B sends no credential at all. Both are plaintext HTTP. (Contrast the ScadaBridge inbound
API, which requires `X-API-Key`/`Bearer` — see `../mes/authgaps.md`.)
- **Surface A ignores HTTP status.** `DelmiaClient` never calls `EnsureSuccessStatusCode`; a non-2xx
body is handed straight to `XmlSerializer`, which typically throws → the `catch` returns a generic
`TransferSuccessful = false` / `"Failed to call Delmia web service at '<URL>'."`. The real HTTP
error is lost.
- **Sync methods block on `.Result`.** `RequestProvenDocument`/`RequestDocument`/`Search` call
`.Result` on the async POST (deadlock-prone in some contexts); async variants exist.
- **`WWNotifier` has a latent `NullReferenceException` in its error path.** On a notify exception it logs
`error.InnerException.Message` (`Program.cs:129`); if `InnerException` is null, that access throws inside
the `catch` and masks the original error.
- **`NotifyURL` failover is first-success-wins**, in list order; a slow first endpoint costs up to
`NotifyTimeout` before the next is tried.
- **Surface A base URL is caller-supplied** (no config default), so the effective Delmia endpoint
depends on whoever constructs `DelmiaClient` (e.g. `TestUI`/`DelmiaClientUI`).
---
## ScadaBridge equivalent (porting note)
- **Surface B → ScadaBridge Inbound API `DelmiaRecipeDownload`.** The legacy `WWNotifier.exe` + the
`/notify` WW receiver are replaced by `POST /api/DelmiaRecipeDownload` (authenticated with
`X-API-Key`/`Bearer`). The contract is identical: request `{ MachineCode, DownloadPath,
WorkOrderNumber, PartNumber, JobStepNumber, Username }` → response `{ Result, ResultText }`. The
inbound script routes to the site via `Route.To(MachineCode).Call("ProcessRecipeDownload", …)`.
The CLI's comma-list failover is superseded by Traefik active-node routing; the `YES`/`NO` + exit
code contract becomes the HTTP status + JSON body.
- **Surface A (Delmia document service) has no direct ScadaBridge equivalent** — retrieving proven
documents from DELMIA Apriso remains an external concern (it would be an External System Gateway
call if pulled into ScadaBridge).
This file documents the **legacy** `delmiaintegration` contracts for reference/parity during that
migration.
@@ -0,0 +1,266 @@
# WWSupport MES API — Alarm Status API
Reference for the alarm-status endpoints exposed by the **WWSupport / APIServer** ServiceStack
service. The API reads live machine-alarm state out of AVEVA Wonderware (System Platform /
Galaxy) via MXAccess and returns it to MES callers.
- **Service host:** `AppHost` (`AppSelfHostBase`, name `"APIServer"`), in `APIServer/APIServer/AppHost.cs`
- **Service implementation:** `MesServices`, in `APIServer/APIServer.ServiceInterface/MesServices.cs`
- **Business logic:** `MesNotifier`, in `APIServer/APIServer.ServiceInterface/MesNotifier.cs`
- **Framework:** ServiceStack 6.0.2, self-hosted; SQL Server (OrmLite, `SqlServer2016Dialect`)
> Serialization note: `JsConfig.IncludeNullValues = true`, so null fields **are** emitted in JSON
> responses (e.g. `"AckDT": null`).
---
## Endpoints
| Verb | Route | Request DTO | Response DTO |
|------|-------|-------------|--------------|
| `POST` | `/mes/alarmstatus` | `AlarmStatusRequest` | `AlarmStatusResponse` |
| `POST` | `/mes/simplealarmstatus` | `SimpleAlarmStatusRequest` | `AlarmStatusResponse` |
Both endpoints return the **same** `AlarmStatusResponse` shape. Both are dispatched through
ServiceStack `Any(...)` handlers in `MesServices`, which resolve the singleton `MesNotifier` and
call `AlarmStatus(...)` / `SimpleAlarmStatus(...)`.
The service also enables `PostmanFeature` and `OpenApiFeature` (Swagger), so a running instance
exposes a browsable contract and a Postman collection.
---
## Authentication & authorization
All MES services are decorated with:
```csharp
[Authenticate]
[RequiredRole("MESAPI")]
public class MesServices : Service { ... }
```
The caller must be authenticated **and** hold the `MESAPI` role. Auth is configured in
`AppHost.Configure`:
- **API key** — `ApiKeyAuthProvider`
- `SessionCacheDuration = 30 minutes`
- `RequireSecureConnection = false` (HTTP is accepted; TLS not enforced)
- An API key whose `Environment == "test"` is routed to the `TestDb` connection instead of the
production DB (`AppHost.GetDbConnection`).
- **LDAP** — `LdapAuthProvider` (directory credentials).
- `AllowGetAuthenticateRequests = true`.
User/role data is persisted with `OrmLiteAuthRepository` (`UseDistinctRoleTables = true`) in the
same SQL Server database.
---
## `POST /mes/simplealarmstatus`
The convenience endpoint: identify a machine by SAPID, get back its MES-relevant alarms.
### Request — `SimpleAlarmStatusRequest`
| Field | Type | Notes |
|-------|------|-------|
| `SAPID` | string | SAP identifier of the machine. Required. |
```json
{ "SAPID": "100012345" }
```
### Behavior
1. Look up the `Machine` by `SAPID`. If not found → `WasSuccessful = false`,
`ErrorText = "Failed to find machine with SAPID '<id>'"`.
2. Select that machine's alarms **filtered to `FlaggedForMES = true` only**.
3. Read live tag state and return every alarm that is **currently triggered** (`InAlarm == true`).
Notes specific to this endpoint:
- Always **flagged-only** (cannot return non-MES alarms).
- Does **not** honor an "include acked" toggle — acked alarms are always included (with
`StatusCode = "Triggered.Acked"`).
---
## `POST /mes/alarmstatus`
The full endpoint: pick the machine by any one of several keys, and filter the alarm set.
### Request — `AlarmStatusRequest`
| Field | Type | Notes |
|-------|------|-------|
| `MachineFilter` | `MachineFilter` | Selects the machine (see below). Defaults to empty. |
| `AlarmFilter` | `AlarmFilter` | Filters the alarms (see below). Defaults to "all flagged + unflagged, triggered + acked". |
#### `MachineFilter`
| Field | Type | Notes |
|-------|------|-------|
| `MachineID` | int? | DB primary key. |
| `SAPID` | string | SAP identifier. |
| `ZTag` | string | Z-tag identifier. |
| `Code` | string | Machine code (also the MXAccess tag prefix). |
**Resolution precedence** (first non-empty wins): `SAPID` → `Code` → `ZTag` → `MachineID`.
At least one selector must be supplied; if all are empty/unmatched the call fails with
`"Failed to find machine with given machine filter '<dump>'"`. A supplied-but-unmatched selector
fails with a selector-specific message, e.g. `"Failed to find machine with Code '<code>'"`.
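
The precedence rule can be sketched as below. This is an illustration, not the `MesNotifier` code: machines are plain dicts, `resolve_machine` is a hypothetical name, and the message formats for `ZTag` and `MachineID` are assumed to follow the `SAPID` / `Code` pattern quoted in this document.

```python
# Selector order documented above: the first non-empty selector wins.
PRECEDENCE = ("SAPID", "Code", "ZTag", "MachineID")


def resolve_machine(machine_filter, machines):
    """Return (machine, error_text); exactly one of the two is None."""
    for key in PRECEDENCE:
        value = machine_filter.get(key)
        if value in (None, ""):
            continue  # selector not supplied, try the next one
        for machine in machines:
            if machine.get(key) == value:
                return machine, None
        # A supplied selector that matches nothing fails immediately;
        # lower-precedence selectors are not consulted.
        return None, f"Failed to find machine with {key} '{value}'"
    return None, f"Failed to find machine with given machine filter '{machine_filter}'"
```
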
#### `AlarmFilter`
| Field | Type | Default | Effect |
|-------|------|---------|--------|
| `NameFilter` | string | `null` | Case-insensitive **substring** match on alarm `Name`. |
| `MinSeverity` | int? | `null` | Keep alarms with `Severity >= MinSeverity`. |
| `MaxSeverity` | int? | `null` | Keep alarms with `Severity <= MaxSeverity`. |
| `IncludeTriggered` | bool | `true` | **Currently unused** — see Behavior notes. |
| `IncludeAcked` | bool | `true` | When `false`, acked alarms are excluded from the result. |
| `FlaggedOnly` | bool | `false` | When `true`, restrict to `FlaggedForMES = true` alarms. |
```json
{
"MachineFilter": { "SAPID": "100012345" },
"AlarmFilter": {
"MinSeverity": 500,
"FlaggedOnly": true,
"IncludeAcked": false
}
}
```
### Behavior
1. Resolve the machine via `MachineFilter` precedence (above).
2. Load all `MachineAlarm` rows for that machine, then apply the in-process `AlarmFilter` in this
order: `FlaggedOnly` → `MinSeverity` → `MaxSeverity` → `NameFilter`.
3. Read live tag state; return every remaining alarm that is **triggered** (`InAlarm == true`),
skipping acked alarms when `IncludeAcked == false`.
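
The filter steps above can be sketched as a single pass over the configured alarms. This is a simplified model under assumptions: alarm rows and live state are plain dicts, `filter_alarms` is a hypothetical name, and the quality gate and detail-tag reads are left out.

```python
def filter_alarms(alarms, live, flt):
    """alarms: config rows; live: alarm name -> {"InAlarm": bool, "Acked": bool}."""
    name_filter = (flt.get("NameFilter") or "").lower()
    result = []
    for alarm in alarms:
        # Configuration filters, in the documented order.
        if flt.get("FlaggedOnly", False) and not alarm["FlaggedForMES"]:
            continue
        if flt.get("MinSeverity") is not None and alarm["Severity"] < flt["MinSeverity"]:
            continue
        if flt.get("MaxSeverity") is not None and alarm["Severity"] > flt["MaxSeverity"]:
            continue
        if name_filter and name_filter not in alarm["Name"].lower():
            continue
        # Live state: only triggered alarms are ever returned.
        # IncludeTriggered is intentionally not read, mirroring the legacy behavior.
        state = live[alarm["Name"]]
        if not state["InAlarm"]:
            continue
        if state["Acked"] and not flt.get("IncludeAcked", True):
            continue
        status = "Triggered.Acked" if state["Acked"] else "Triggered"
        result.append({"Name": alarm["Name"], "StatusCode": status})
    return result
```
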
---
## Response — `AlarmStatusResponse`
| Field | Type | Notes |
|-------|------|-------|
| `WasSuccessful` | bool | `false` on any lookup or tag-read failure. |
| `ErrorText` | string | Populated when `WasSuccessful == false`. |
| `Alarms` | `AlarmInfo[]` | Triggered alarms matching the request. **Cleared if `WasSuccessful == false`.** |
### `AlarmInfo`
| Field | Type | Source |
|-------|------|--------|
| `Name` | string | `MachineAlarm.Name`. |
| `HierarchicalName` | string | `"{Machine.Code}.{Name}"`. |
| `Description` | string | Live `….DescAttrName` tag value. |
| `IsFlaggedForMES` | bool | `MachineAlarm.FlaggedForMES`. |
| `Severity` | int | `MachineAlarm.Severity` (0–999). |
| `StatusCode` | string | `"Triggered"`, or `"Triggered.Acked"` when acked. |
| `TriggeredDT` | DateTime | Live `….TimeAlarmOn` tag value. |
| `AckDT` | DateTime? | Live `….TimeAlarmAcked` tag value (null if unacked). |
| `AckComment` | string | Live `….AckMsg` tag value. |
### Example response
```json
{
"WasSuccessful": true,
"ErrorText": null,
"Alarms": [
{
"Name": "HighVacuumFault",
"HierarchicalName": "Z28061.HighVacuumFault",
"Description": "High vacuum sensor out of range",
"IsFlaggedForMES": true,
"Severity": 800,
"StatusCode": "Triggered.Acked",
"TriggeredDT": "2026-06-25T01:14:22",
"AckDT": "2026-06-25T01:16:05",
"AckComment": "Investigating - day shift"
}
]
}
```
---
## How alarm state is read (MXAccess)
Alarm configuration (name, severity, MES flag) lives in SQL; **live state** is read from
Wonderware at request time:
- For each `MachineAlarm`, an `AlarmTagset` (`AlarmTagset.cs`) builds MXAccess tag paths of the
form `{Machine.Code}.{Alarm.Name}.<suffix>`, where `<suffix>` is one of:
`Quality`, `InAlarm`, `TimeAlarmOn`, `DescAttrName`, `Acked`, `TimeAlarmAcked`, `AckMsg`.
- `MesNotifier` subscribes via `LMXProxyServerClass.AdviseSupervisory` (handle registered as
`"MesNotifier"`), waits for the first data change per tag, then unsubscribes.
- **Quality gate:** every alarm's `Quality` must be OPC-Good (`192`); otherwise the whole request
fails with `ErrorText = "Failed to read machine alarm status"` and `Alarms` is cleared.
- Detail tags (`TimeAlarmOn`, `DescAttrName`, `Acked`, `TimeAlarmAcked`, `AckMsg`) are only read
for alarms whose `InAlarm` is true.
- **Per-request timeout:** 30 seconds (`CancellationTokenSource(30000)`); on timeout the pending
reads resolve as failed.
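
A small sketch of the tag-path construction and the all-or-nothing quality gate described above. The `read_tag` callable is a hypothetical stand-in for an MXAccess read, and the function names are illustrative; subscription handling and the 30 s timeout are not modeled.

```python
ALARM_SUFFIXES = (
    "Quality", "InAlarm", "TimeAlarmOn", "DescAttrName",
    "Acked", "TimeAlarmAcked", "AckMsg",
)
OPC_GOOD = 192


def alarm_tag_paths(machine_code, alarm_name):
    """Build {suffix: '{Machine.Code}.{Alarm.Name}.<suffix>'} for one alarm."""
    base = f"{machine_code}.{alarm_name}"
    return {suffix: f"{base}.{suffix}" for suffix in ALARM_SUFFIXES}


def read_alarm_states(machine_code, alarm_names, read_tag):
    """Return a response-shaped dict; one bad-quality alarm fails the whole call."""
    triggered = []
    for name in alarm_names:
        paths = alarm_tag_paths(machine_code, name)
        if read_tag(paths["Quality"]) != OPC_GOOD:
            return {
                "WasSuccessful": False,
                "ErrorText": "Failed to read machine alarm status",
                "Alarms": [],  # cleared, even if earlier alarms were readable
            }
        if read_tag(paths["InAlarm"]):
            triggered.append(name)
    return {"WasSuccessful": True, "ErrorText": None, "Alarms": triggered}
```
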
---
## Underlying data model — `MachineAlarm`
SQL table backing alarm configuration (`APIServer.ServiceModel/DTO/MachineAlarm.cs`):
| Column | Type | Constraints |
|--------|------|-------------|
| `MachineAlarmID` | int | PK, auto-increment |
| `MachineID` | int | FK → `Machine` |
| `Machine` | `Machine` | OrmLite `[Reference]` |
| `Name` | string | Required, ≤ 256 |
| `Severity` | int | 0–999 |
| `FlaggedForMES` | bool | Marks an alarm as MES-relevant |
| `LastUpdate` | DateTime | |
| `LastUpdateBy` | string | Required, ≤ 128 |
| `OtherData` | string | Max-length text |
---
## Behavior notes & gotchas
- **`AlarmFilter.IncludeTriggered` is declared but never used.** Both endpoints only ever return
alarms whose `InAlarm` tag is `true`; setting `IncludeTriggered = false` has no effect.
- **Only triggered alarms are returned.** There is no "all configured alarms" mode — an alarm not
currently in alarm state never appears, regardless of filters.
- **`/simplealarmstatus` is flagged-only and ignores ack filtering**; use `/mes/alarmstatus` with
`FlaggedOnly` / `IncludeAcked` for control over those.
- **`MachineFilter` precedence matters** — supplying both `SAPID` and `Code` uses `SAPID`; the
others are ignored.
- **All-or-nothing on read errors.** A single bad-quality tag fails the whole response (success
flag flips, alarm list is emptied) rather than returning a partial set.
- **Severity bounds are inclusive** on both `MinSeverity` and `MaxSeverity`.
- The route attributes carry placeholder Swagger text (`Summary = "POST Summary"`,
`Notes = "Notes"`); these are cosmetic.
---
## Quick reference (curl)
```bash
# Simple — flagged alarms for one machine by SAPID
curl -X POST http://<host>:<port>/mes/simplealarmstatus \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-d '{"SAPID":"100012345"}'
# Full — filtered alarms, machine chosen by Code
curl -X POST http://<host>:<port>/mes/alarmstatus \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-d '{
"MachineFilter": { "Code": "Z28061" },
"AlarmFilter": { "MinSeverity": 500, "IncludeAcked": false }
}'
```
> Exact auth header format depends on how `ApiKeyAuthProvider` is configured for the deployment
> (bearer token vs. HTTP Basic with the key as username). Confirm against the live Swagger/Postman
> metadata for the target server.
@@ -0,0 +1,339 @@
# WWSupport MES API — Move In / Move Out API
Reference for the batch **move-in / move-out** endpoints exposed by the **WWSupport / APIServer**
ServiceStack service. These endpoints hand a container/work-order payload to a machine's
`MesReceiver` object in AVEVA Wonderware (System Platform / Galaxy) over MXAccess, using a
flag-based request/response handshake, and read back the resulting batch id (and, for move-out,
recorded cycle data from SQL).
- **Service host:** `AppHost` (`AppSelfHostBase`, name `"APIServer"`), in `APIServer/APIServer/AppHost.cs`
- **Service implementation:** `MesServices`, in `APIServer/APIServer.ServiceInterface/MesServices.cs`
- **Business logic:** `MesNotifier`, in `APIServer/APIServer.ServiceInterface/MesNotifier.cs`
- **Framework:** ServiceStack 6.0.2, self-hosted; SQL Server (OrmLite, `SqlServer2016Dialect`)
> Serialization note: `JsConfig.IncludeNullValues = true`, so null fields **are** emitted in JSON
> responses (e.g. `"BatchID": null`).
This document covers `/mes/movein` and `/mes/moveout`. The two alarm endpoints
(`/mes/alarmstatus`, `/mes/simplealarmstatus`) are documented in [`Alarm-API.md`](Alarm-API.md).
---
## Endpoints
| Verb | Route | Request DTO | Response DTO |
|------|-------|-------------|--------------|
| `POST` | `/mes/movein` | `MoveInRequest` | `MoveInResponse` |
| `POST` | `/mes/moveout` | `MoveOutRequest` | `MoveOutResponse` |
Both are dispatched through ServiceStack `Any(...)` handlers in `MesServices`, which resolve the
singleton `MesNotifier` and call `MoveIn(...)` / `MoveOut(...)`.
> The handlers call the async business methods with a blocking `.Result`
> (`mesNotifier.MoveIn(request).Result`), so each request occupies its thread until the
> handshake completes or the 30 s timeout fires.
The service also enables `PostmanFeature` and `OpenApiFeature` (Swagger), so a running instance
exposes a browsable contract and a Postman collection.
---
## Authentication & authorization
Identical to the alarm endpoints — all MES services are decorated with:
```csharp
[Authenticate]
[RequiredRole("MESAPI")]
public class MesServices : Service { ... }
```
The caller must be authenticated **and** hold the `MESAPI` role. Configured in `AppHost.Configure`:
- **API key** — `ApiKeyAuthProvider` (`SessionCacheDuration = 30 min`, `RequireSecureConnection = false`).
An API key whose `Environment == "test"` is routed to the `TestDb` connection instead of the
production DB (`AppHost.GetDbConnection`). **Caveat:** that redirect only applies to OrmLite
lookups — the move-out cycle-data read uses a raw `ConnectionStrings["BatchDB"]` connection and
is *not* affected by the test-key redirect (see gotchas).
- **LDAP** — `LdapAuthProvider`.
- `AllowGetAuthenticateRequests = true`.
Default API-key transport is the standard `Authorization` header (`Bearer <key>`, or HTTP Basic
with the key as username); `?apikey=<key>` also works since `AllowInHttpParams` defaults on.
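
The three transports mentioned above could be produced by a client roughly like this. It is a sketch with hypothetical helper names; the empty password in the Basic variant is an assumption, since the text only says the key is sent as the username.

```python
import base64
from urllib.parse import urlencode


def api_key_headers(api_key, scheme="bearer"):
    """Build the Authorization header for the Bearer or Basic transport."""
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "basic":
        # Assumption: key as username, empty password.
        token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
        return {"Authorization": f"Basic {token}"}
    raise ValueError(f"unsupported scheme: {scheme}")


def api_key_query(url, api_key):
    """Append ?apikey=<key>; assumes the URL has no query string yet."""
    return f"{url}?{urlencode({'apikey': api_key})}"
```

Because `RequireSecureConnection` is `false`, any of these variants sends the key in cleartext over plain HTTP.
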
---
## `POST /mes/movein`
Hands a container + its work orders to a machine's `MesReceiver` to start/stage a batch.
### Request — `MoveInRequest`
| Field | Type | Notes |
|-------|------|-------|
| `SAPID` | string | SAP identifier of the target machine. Used to resolve the `Machine` row. Required. |
| `OperatorName` | string | Operator initiating the move-in. |
| `JobSequenceNumber` | string | Job sequence number. |
| `ContainerNumber` | string | MES container number. |
| `WorkOrders` | `WorkOrderInfo[]` | Work orders being moved in (default empty). Each item: `WorkOrderNumber` (string), `PartNumber` (string). |
```json
{
"SAPID": "100012345",
"OperatorName": "chamalas",
"JobSequenceNumber": "50",
"ContainerNumber": "cont-012",
"WorkOrders": [
{ "WorkOrderNumber": "W111111", "PartNumber": "P111111" }
]
}
```
### Response — `MoveInResponse`
| Field | Type | Notes |
|-------|------|-------|
| `WasSuccessful` | bool | `true` only if the machine reported `MoveInSuccessfulFlag = true`. |
| `ErrorText` | string | Failure reason, or the machine's `MoveInErrorText` on a completed-but-failed move. |
| `BatchID` | int? | The machine's `MoveInBatchID`, **only when non-zero**; otherwise null. |
### Behavior
1. **30 s budget** (`CancellationTokenSource(30000)`).
2. **Resolve machine:** `db.Single<Machine>(x => x.SAPID == request.SAPID)`. If not found →
`WasSuccessful = false`, `ErrorText = "Failed to find machine with SAPID '<id>'"` (early return).
3. **Subscribe** to the machine's move-in tagset (`MesMoveInTagset`, all under
`{Machine.Code}.MesReceiver.*`) via `AdviseSupervisory`; wait for the first value of each.
If any subscription fails → `"Failed to connect to machine"`.
4. **Ready gate:** require `MoveInReadyFlag == true`, else `"Machine move in ready flag not set to true"`.
5. If still successful, perform the **handshake**:
- Arm a watcher for `MoveInCompleteFlag → true` (`OnValue`).
- Write the payload tags (see table) **and** set `MoveInFlag = true`. If any write is not
acknowledged → `"Failed to write move in information to machine"`.
- Wait for `MoveInCompleteFlag`:
- **Completed:** `WasSuccessful = MoveInSuccessfulFlag`; `ErrorText = MoveInErrorText`;
`BatchID = MoveInBatchID` (only if ≠ 0).
- **Timed out:** `"Timeout waiting for move in information to be processed"`.
6. **Unsubscribe** all tags.
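
The steps above can be sketched against an in-memory stand-in for the machine. Everything here is illustrative: `FakeMesReceiver` simulates the machine side synchronously, so subscription, unsubscription and the 30 s budget are not modeled, and the timeout branch is reached only if the fake never raises `MoveInCompleteFlag`.

```python
class FakeMesReceiver:
    """Hypothetical in-memory model of the {Machine.Code}.MesReceiver.* tags."""

    def __init__(self, ready=True, batch_id=4711, error_text=""):
        self.tags = {
            "MoveInReadyFlag": ready, "MoveInCompleteFlag": False,
            "MoveInSuccessfulFlag": False, "MoveInErrorText": "", "MoveInBatchID": 0,
        }
        self._batch_id = batch_id
        self._error_text = error_text

    def write(self, values):
        self.tags.update(values)
        if values.get("MoveInFlag"):
            # Simulated machine logic: process the request, then raise the complete flag.
            ok = not self._error_text
            self.tags.update({
                "MoveInSuccessfulFlag": ok,
                "MoveInErrorText": self._error_text,
                "MoveInBatchID": self._batch_id if ok else 0,
                "MoveInCompleteFlag": True,
            })
        return True  # write acknowledged


def move_in(receiver, request):
    resp = {"WasSuccessful": False, "ErrorText": None, "BatchID": None}
    # Ready gate.
    if receiver.tags.get("MoveInReadyFlag") is not True:
        resp["ErrorText"] = "Machine move in ready flag not set to true"
        return resp
    # Payload plus trigger flag.
    work_orders = request.get("WorkOrders", [])
    acked = receiver.write({
        "MoveInMesContainerNum": request["ContainerNumber"],
        "MoveInOperatorName": request["OperatorName"],
        "MoveInJobSequenceNumber": request["JobSequenceNumber"],
        "MoveInNumberWorkOrders": len(work_orders),
        "MoveInPartNumbers": [w["PartNumber"] for w in work_orders],
        "MoveInWorkOrderNumbers": [w["WorkOrderNumber"] for w in work_orders],
        "MoveInFlag": True,
    })
    if not acked:
        resp["ErrorText"] = "Failed to write move in information to machine"
        return resp
    # Completion, then result tags.
    if not receiver.tags["MoveInCompleteFlag"]:
        resp["ErrorText"] = "Timeout waiting for move in information to be processed"
        return resp
    resp["WasSuccessful"] = bool(receiver.tags["MoveInSuccessfulFlag"])
    resp["ErrorText"] = receiver.tags["MoveInErrorText"]
    if receiver.tags["MoveInBatchID"] != 0:
        resp["BatchID"] = receiver.tags["MoveInBatchID"]
    return resp
```
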
### Tags written / read (`{Machine.Code}.MesReceiver.*`)
| Property | MXAccess tag suffix | Type | Direction | Source / meaning |
|----------|---------------------|------|-----------|------------------|
| `MoveInReadyFlag` | `MoveInReadyFlag` | bool | read (gate) | Machine must be ready to accept a move-in. |
| `MoveInMesContainerNumber` | `MoveInMesContainerNum` | string | write | `request.ContainerNumber`. |
| `MoveInOperatorName` | `MoveInOperatorName` | string | write | `request.OperatorName`. |
| `MoveInJobSequenceNumber` | `MoveInJobSequenceNumber` | string | write | `request.JobSequenceNumber`. |
| `MoveInNumberWorkOrders` | `MoveInNumberWorkOrders` | int | write | `request.WorkOrders.Count`. |
| `MoveInPartNumbers` | `MoveInPartNumbers[]` | string[] | write | `WorkOrders[*].PartNumber`, **fixed length 50**. |
| `MoveInWorkOrderNumbers` | `MoveInWorkOrderNumbers[]` | string[] | write | `WorkOrders[*].WorkOrderNumber`, **fixed length 50**. |
| `MoveInFlag` | `MoveInFlag` | bool | write (trigger) | Set `true` to start processing. |
| `MoveInCompleteFlag` | `MoveInCompleteFlag` | bool | watch | Machine sets `true` when done. |
| `MoveInSuccessfulFlag` | `MoveInSuccessfulFlag` | bool | read (result) | Machine's success verdict. |
| `MoveInErrorText` | `MoveInErrorText` | string | read (result) | Machine's error message. |
| `MoveInBatchID` | `MoveInBatchID` | int | read (result) | Created batch id (0 = none). |
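
Since the two array tags have a fixed length of 50, a client-side model has to pad the work-order lists. The sketch below shows one way to do that; the empty-string filler and the rejection of more than 50 entries are assumptions, because the source states only the fixed length.

```python
ARRAY_LENGTH = 50  # fixed length of MoveInPartNumbers[] / MoveInWorkOrderNumbers[]


def to_fixed_array(values, length=ARRAY_LENGTH, filler=""):
    """Pad a list to the fixed tag-array length (filler and overflow rule are assumed)."""
    if len(values) > length:
        raise ValueError(f"at most {length} work orders fit into the tag array")
    return list(values) + [filler] * (length - len(values))
```
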
---
## `POST /mes/moveout`
Closes out a container's work orders on a machine's `MesReceiver`, and (when the machine is
configured for it) returns the recorded cycle data for the resulting batch.
### Request — `MoveOutRequest`
Same as `MoveInRequest` **minus `JobSequenceNumber`**:
| Field | Type | Notes |
|-------|------|-------|
| `SAPID` | string | SAP identifier of the target machine. Required. |
| `OperatorName` | string | Operator initiating the move-out. |
| `ContainerNumber` | string | MES container number. |
| `WorkOrders` | `WorkOrderInfo[]` | Work orders being moved out. Each: `WorkOrderNumber`, `PartNumber`. |
```json
{
"SAPID": "100012345",
"OperatorName": "chamalas",
"ContainerNumber": "cont-012",
"WorkOrders": [
{ "WorkOrderNumber": "W111111", "PartNumber": "P111111" }
]
}
```
### Response — `MoveOutResponse`
| Field | Type | Notes |
|-------|------|-------|
| `WasSuccessful` | bool | `true` only if the machine reported `MoveOutSuccessfulFlag = true`. |
| `ErrorText` | string | Failure reason, or the machine's `MoveOutErrorText`. |
| `BatchID` | int? | The machine's `MoveOutBatchID`, populated only when it is non-zero **and** cycle storage is enabled (below). |
| `Data` | `MoveOutData[]` | Recorded cycle values (empty unless cycle storage is enabled). Defaults to `[]`. |
#### `MoveOutData`
| Field | Type | Source |
|-------|------|--------|
| `BatchId` | int | `MachineCycle.MachineBatchId`. |
| `CycleId` | int | `MachineCycle.MachineCycleId`. |
| `ValueName` | string | One of the cycle-data keys (below). |
| `Value` | object | The value for that key (`null` becomes empty string in SQL via `COALESCE`). |
### Behavior
Steps 1–5 mirror move-in (resolve `Machine` by `SAPID`; subscribe `MesMoveOutTagset`; require
`MoveOutReadyFlag == true`; write payload + `MoveOutFlag = true`; wait for `MoveOutCompleteFlag`).
On completion: `WasSuccessful = MoveOutSuccessfulFlag`, `ErrorText = MoveOutErrorText`.
**Cycle-data read (move-out only).** If `MoveOutBatchID != 0` **and** the machine's `OtherData`
contains the literal `"StoreCycleDataForMES"`:
- `BatchID = MoveOutBatchID`.
- Open a **raw** `SqlConnection` on `ConnectionStrings["BatchDB"]` and run a parameterized query
(`@machineBatchID = MoveOutBatchID`) against `BT.dbo.MachineCycle`, expanding each cycle's
`OtherData` JSON (via `OPENJSON … CROSS APPLY (VALUES …)`) into one `MoveOutData` row per
key/value pair. Only rows where `ISJSON(mc.OtherData) = 1` are processed.
```sql
-- Shape of the cycle-data query (BT.dbo.MachineCycle, WHERE MachineBatchId = @machineBatchID)
SELECT mc.MachineBatchId, mc.MachineCycleId, v.ValueName, COALESCE(v.Value, N'') AS Value
FROM BT.dbo.MachineCycle mc
CROSS APPLY OPENJSON(mc.OtherData) WITH ( /* one column per key below */ ) od
CROSS APPLY (VALUES (N'ProgramNum', od.ProgramNum), /* … one row per key … */ ) AS v(ValueName, Value)
WHERE mc.MachineBatchId = @machineBatchID AND ISJSON(mc.OtherData) = 1;
```
**Cycle-data keys extracted** (19 per cycle): `ProgramNum`, `DewPointStart`, `SegmentStart2`,
`HighVacEndSeg1`, `SegmentStart3`, `SegmentStart4`, `SoakStartTime`, `SegmentStart5`,
`SoakEndTime`, `DurationFinalSoak`, `MaxSoakTemp`, `MinSoakTemp`, `MaxSoakPressure`,
`MinSoakPressure`, `SegmentStart6`, `QuenchTemp`, `DewPointMax`, `StartTimestamp`, `EndTimestamp`.
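
The unpivot performed by the query can be sketched outside SQL. The following is a minimal Python illustration, not the service's code: `expand_cycles` and `is_json_object` are hypothetical names, the input rows are plain dicts standing in for `MachineCycle` records, and the JSON check only approximates `ISJSON` (it accepts objects only).

```python
import json

# The 19 cycle-data keys listed in this document.
CYCLE_KEYS = [
    "ProgramNum", "DewPointStart", "SegmentStart2", "HighVacEndSeg1",
    "SegmentStart3", "SegmentStart4", "SoakStartTime", "SegmentStart5",
    "SoakEndTime", "DurationFinalSoak", "MaxSoakTemp", "MinSoakTemp",
    "MaxSoakPressure", "MinSoakPressure", "SegmentStart6", "QuenchTemp",
    "DewPointMax", "StartTimestamp", "EndTimestamp",
]


def is_json_object(text):
    """Approximation of the ISJSON(...) = 1 filter (objects only, by assumption)."""
    try:
        return isinstance(json.loads(text), dict)
    except (TypeError, ValueError):
        return False


def expand_cycles(cycles, machine_batch_id):
    """Return one MoveOutData-shaped dict per (cycle, key) pair."""
    rows = []
    for cycle in cycles:
        if cycle["MachineBatchId"] != machine_batch_id:
            continue  # WHERE mc.MachineBatchId = @machineBatchID
        if not is_json_object(cycle["OtherData"]):
            continue  # rows with non-JSON OtherData are skipped
        other = json.loads(cycle["OtherData"])
        for key in CYCLE_KEYS:
            value = other.get(key)
            rows.append({
                "BatchId": cycle["MachineBatchId"],
                "CycleId": cycle["MachineCycleId"],
                "ValueName": key,
                # Mirrors COALESCE(v.Value, N''): missing/null becomes "".
                "Value": "" if value is None else value,
            })
    return rows
```

Under these assumptions, each valid cycle contributes exactly 19 rows regardless of which keys are present in its `OtherData`.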
### Tags written / read (`{Machine.Code}.MesReceiver.*`)
Same set as move-in with the `MoveOut` prefix, **minus `JobSequenceNumber`**:
`MoveOutReadyFlag` (gate), `MoveOutMesContainerNum` / `MoveOutOperatorName` /
`MoveOutNumberWorkOrders` / `MoveOutPartNumbers[]` / `MoveOutWorkOrderNumbers[]` (write),
`MoveOutFlag` (trigger), `MoveOutCompleteFlag` (watch), `MoveOutSuccessfulFlag` /
`MoveOutErrorText` / `MoveOutBatchID` (result).
---
## How the handshake works (MXAccess)
`MesNotifier` holds a single process-wide `LMXProxyServerClass` (handle registered as
`"MesNotifier"`, wired to `OnDataChange` + `OnWriteComplete`). For each request it:
1. **Advise**: `AddItem(path)` then `AdviseSupervisory`; the first `OnDataChange` per tag
resolves that tag's read task (success = quality `192` / OPC-Good). Values are coerced to the
tag's CLR type via `Convert.ChangeType`.
2. **Write**: `LMXProxyServerClass.Write`; `OnWriteComplete` resolves the write task with the
driver's success flag.
3. **Watch**: `Tag.OnValue(target)` completes when a subsequent `OnDataChange` reports a value
equal to the target (used for the `…CompleteFlag → true` step).
4. **Unadvise**: every tag is unsubscribed and removed at the end of the request.
Everything is bounded by the per-request **30 s** `CancellationTokenSource`; on expiry, all
pending reads/writes/watches resolve as `false`.
The move-in/move-out contract is therefore a classic flag protocol on the machine's `MesReceiver`:
**read ready → write payload + set request flag → wait for complete flag → read success/error/batch**.
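
The flag protocol can be sketched against an in-memory stand-in for the receiver. This is an illustrative Python simulation, not the MXAccess code: `FakeReceiver` and `move_in` are hypothetical, the `MoveIn*` tag names are inferred from the `MoveOut*` list above, the error strings are placeholders, and polling stands in for the `OnDataChange` subscription.

```python
import time


class FakeReceiver:
    """Hypothetical in-memory stand-in for a machine's MesReceiver tags."""

    def __init__(self, ready=True, succeed=True):
        self.tags = {
            "MoveInReadyFlag": ready,
            "MoveInCompleteFlag": False,
            "MoveInSuccessfulFlag": False,
            "MoveInErrorText": "",
            "MoveInBatchID": 0,
        }
        self._succeed = succeed

    def read(self, name):
        return self.tags[name]

    def write(self, name, value):
        self.tags[name] = value
        if name == "MoveInFlag" and value is True:
            # The simulated machine processes the request immediately.
            self.tags["MoveInSuccessfulFlag"] = self._succeed
            self.tags["MoveInErrorText"] = "" if self._succeed else "rejected by machine"
            self.tags["MoveInBatchID"] = 42 if self._succeed else 0
            self.tags["MoveInCompleteFlag"] = True


def move_in(receiver, payload, timeout_s=30.0, poll_s=0.01):
    """read ready -> write payload + flag -> wait complete -> read result."""
    deadline = time.monotonic() + timeout_s
    if receiver.read("MoveInReadyFlag") is not True:
        return {"WasSuccessful": False, "ErrorText": "ready flag not set to true", "BatchID": None}
    for name, value in payload.items():
        receiver.write(name, value)
    receiver.write("MoveInFlag", True)  # trigger
    while receiver.read("MoveInCompleteFlag") is not True:
        if time.monotonic() >= deadline:
            return {"WasSuccessful": False, "ErrorText": "timed out", "BatchID": None}
        time.sleep(poll_s)
    batch_id = receiver.read("MoveInBatchID")
    return {
        "WasSuccessful": receiver.read("MoveInSuccessfulFlag"),
        "ErrorText": receiver.read("MoveInErrorText"),
        "BatchID": batch_id if batch_id != 0 else None,
    }
```

Note that the result fields come from what the machine wrote, so a clean handshake can still report `WasSuccessful = False`.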
---
## Underlying data model — `Machine`
SQL table backing machine lookup (`APIServer.ServiceModel/DTO/Machine.cs`); `Code` is also the
MXAccess tag prefix and `SAPID` is the move-in/move-out selector:
| Column | Type | Constraints |
|--------|------|-------------|
| `MachineID` | int | PK, auto-increment |
| `Code` | string | Required, ≤ 50 — **MXAccess tag prefix** (`{Code}.MesReceiver.*`) |
| `Name` | string | Required, ≤ 50 |
| `ZTag` | string | ≤ 10 |
| `SAPID` | string | ≤ 10 — **move-in/out selector** |
| `Description` | string | ≤ 256 |
| `TimeZone` | string | Required, ≤ 128 |
| `MultipleBatch` / `MultipleCycle` / `Active` | bool | |
| `LastUpdate` | DateTime | |
| `LastUpdateBy` | string | Required, ≤ 128 |
| `OtherData` | string | Max-length text — **gates cycle storage** when it contains `"StoreCycleDataForMES"` |
---
## Behavior notes & gotchas
- **Work-order arrays are fixed length 50.** `ToFixedLength(50)` always writes exactly 50-element
string arrays (padded with `null`); a 51st+ work order is **silently dropped**. `PartNumbers`
and `WorkOrderNumbers` are written as two parallel arrays — index *i* of one corresponds to
index *i* of the other (positional pairing from the same `WorkOrders` list). `NumberWorkOrders`
carries the true count.
- **Connect-failure message gets clobbered.** The "connect" check and the "ready flag" check are
sequential `if`s with no early return; if the subscribe fails, `MoveInReadyFlag.Value` is its
default (`false`), so the ready-flag check also fires and **overwrites** `ErrorText` with
`"…ready flag not set to true"`. A connection problem can surface as a ready-flag error.
- **`BatchID` is null unless the machine reports a non-zero batch id.** Move-out additionally
requires cycle storage to be enabled before it sets `BatchID`.
- **Cycle data is opt-in per machine.** `Data` is empty unless `Machine.OtherData` contains
`"StoreCycleDataForMES"` *and* `MoveOutBatchID != 0`.
- **Cycle-data read bypasses the test-DB redirect.** It uses a raw `ConnectionStrings["BatchDB"]`
connection against hard-coded `BT.dbo.MachineCycle`, so a `test`-environment API key (which
redirects *OrmLite* to `TestDb`) still reads cycle data from `BatchDB`.
- **`WasSuccessful` reflects the machine, not just the transport.** Even with a clean handshake,
`WasSuccessful` is whatever the machine wrote to `…SuccessfulFlag`, and `ErrorText` is the
machine's `…ErrorText`.
- **Synchronous blocking.** `MesServices` calls `.Result` on the async methods; combined with the
single shared MXAccess proxy, throughput is effectively serialized per request thread.
- **Machine lookup uses `db.Single`** on `SAPID`; returns null when unmatched (handled), and the
first match if `SAPID` is non-unique.
- The route attributes carry placeholder Swagger text (`Summary = "POST Summary"`,
`Notes = "Notes"`); cosmetic only.
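
The fixed-length array behavior from the first note can be illustrated in a few lines. This is a Python sketch of the described semantics only; `to_fixed_length` and `build_work_order_tags` are hypothetical stand-ins for the C# `ToFixedLength(50)` helper and the tag-building code.

```python
def to_fixed_length(items, length=50):
    """Truncate or pad with None so the result has exactly `length` elements."""
    head = list(items)[:length]  # anything past `length` is silently dropped
    return head + [None] * (length - len(head))


def build_work_order_tags(work_orders, length=50):
    """Build the two parallel arrays plus the true count."""
    return {
        # The true count is kept even when the arrays are truncated.
        "NumberWorkOrders": len(work_orders),
        # Index i of one array pairs with index i of the other.
        "PartNumbers": to_fixed_length([w["PartNumber"] for w in work_orders], length),
        "WorkOrderNumbers": to_fixed_length([w["WorkOrderNumber"] for w in work_orders], length),
    }
```

With 51 work orders, both arrays still hold 50 elements while `NumberWorkOrders` reports 51, which is how a receiver could detect the truncation.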
---
## Quick reference (curl)
```bash
# Move in a container + work order
curl -X POST http://<host>:<port>/mes/movein \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-d '{
"SAPID":"100012345",
"OperatorName":"chamalas",
"JobSequenceNumber":"50",
"ContainerNumber":"cont-012",
"WorkOrders":[{"WorkOrderNumber":"W111111","PartNumber":"P111111"}]
}'
# Move out the same container (returns cycle Data when the machine stores it for MES)
curl -X POST http://<host>:<port>/mes/moveout \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-d '{
"SAPID":"100012345",
"OperatorName":"chamalas",
"ContainerNumber":"cont-012",
"WorkOrders":[{"WorkOrderNumber":"W111111","PartNumber":"P111111"}]
}'
```
> Exact auth header format depends on how `ApiKeyAuthProvider` is configured for the deployment
> (bearer token vs. HTTP Basic with the key as username). Confirm against the live Swagger/Postman
> metadata for the target server.
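
For clients that build the header programmatically, the two header forms mentioned above differ only in encoding. A minimal Python sketch, assuming the key is an opaque string; `bearer_header` and `basic_header` are hypothetical helper names:

```python
import base64


def bearer_header(api_key):
    """Key sent as a bearer token."""
    return {"Authorization": "Bearer " + api_key}


def basic_header(api_key):
    """Key sent as the HTTP Basic username with an empty password."""
    raw = (api_key + ":").encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

Either dict can be passed as request headers by an HTTP client; which one the server accepts still has to be confirmed per deployment.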
---
## ScadaBridge equivalent (porting note)
ScadaBridge re-implements these flows as **Inbound API methods** (`POST /api/{method}`,
`X-API-Key` header — *not* the ServiceStack `Authorization`/`apikey` scheme) that route to a
site's `MesReceiver` instance script via `Route.To(<instanceCode>).Call("MesMoveIn"/"MesMoveOut", …)`:
- `IpsenMESMoveIn` / `MesMoveIn` replace `/mes/movein`; `MesMoveOut` replaces `/mes/moveout`.
- The ready/trigger/complete **flag handshake moves into the site instance script** (Site Runtime),
rather than the central API driving MXAccess tags directly.
- Machine resolution by SAP id is a `Database.QuerySingleAsync<string>("BTDB", "SELECT … Machine WHERE SAPID=@s")`
inside the inbound script (see `docs/plans/2026-06-16-ipsen-mes-movein-design.md`).
This file documents the **legacy** WWSupport contract for reference/parity during that migration.
+121
@@ -0,0 +1,121 @@
# WWSupport MES API — API key authentication: support & gaps
How the legacy **WWSupport / APIServer** ServiceStack service authenticates API-key callers, and
the API-key-specific gaps in it **as of 2026-06-25** (the state of the `~/Desktop/mesapi` copy).
Captured as reference for the ScadaBridge migration so the replacement Inbound API does not inherit
these weaknesses.
> Scope: defensive review of an internally-owned legacy service — weaknesses and remediations, no
> exploit steps. Limited to API-key auth (other auth/config concerns are out of scope here).
Source files: `APIServer/AppHost.cs`, `APIServer.ServiceInterface/MesServices.cs`,
`APIServer/App.config`.
---
## What's supported (API key)
- **`ApiKeyAuthProvider`** is registered in the host `AuthFeature` (`AppHost.cs:63-70`) with
`SessionCacheDuration = 30 min` and `RequireSecureConnection = false`.
- **Keys are DB-backed.** Users, roles, and keys live in SQL via `OrmLiteAuthRepository`
(`UseDistinctRoleTables = true`); `InitSchema()` creates the tables at startup
(`AppHost.cs:53-59`). There is no API-key config in `App.config` — keys are issued/stored by the
ServiceStack auth repository.
- **Authorization.** A valid key authenticates the request; the key's user must also hold the
`MESAPI` role to reach any operation (`MesServices.cs:6-8`):
```csharp
[Authenticate]
[RequiredRole("MESAPI")]
public class MesServices : Service { ... } // movein, moveout, alarmstatus, simplealarmstatus
```
- **Transports for the key** (ServiceStack defaults, nothing overridden):
`Authorization: Bearer <key>`, HTTP Basic with the key as the username, or — since
`AllowInHttpParams` defaults on — `?apikey=<key>` in the query string/form.
- **Per-key environment tag.** A key whose `Environment == "test"` is routed to a `TestDb`
connection instead of production (`AppHost.GetDbConnection`, `AppHost.cs:102-108`).
- **Transport.** The listener is plain HTTP — `http://*:9501/` (DEV) / `http://*:9500/` (QA/PROD)
(`App.config:57,68,80`); no HTTPS listener is configured (relevant to gap #1).
---
## Gaps & risks (worst first)
### 1. 🟠 High — the key is exposed in transit and in logs
Three settings compound:
- **No TLS** — plain-HTTP listeners (`App.config:57`) and `RequireSecureConnection = false`
(`AppHost.cs:69`), so the key crosses the network in cleartext to any on-path observer.
- **Key accepted in the URL** — `AllowInHttpParams` (default on) makes `?apikey=<key>` valid, so
keys land in proxy logs and browser history.
- **Request logging persists it** — the enabled `RequestLogsFeature` writes request data to a CSV
on disk (`AppHost.cs:79-86`), so a query-string key can be written to the log file.
**Fix:** terminate TLS and set `RequireSecureConnection = true`; set
`ApiKeyAuthProvider { AllowInHttpParams = false }` so keys must travel in the `Authorization`
header; redact auth fields from the request log and restrict the log file's ACLs.
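
The redaction part of the fix can be sketched independently of ServiceStack. The following Python illustration shows one way to scrub a key from a URL and from headers before they reach a log; `redact_url` and `redact_headers` are hypothetical helpers, and hooking them into `RequestLogsFeature` is not shown.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SENSITIVE_PARAMS = {"apikey"}
SENSITIVE_HEADERS = {"authorization"}


def redact_url(url):
    """Replace the value of any sensitive query parameter."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def redact_headers(headers):
    """Return a copy of the headers with credential values masked."""
    return {
        name: ("REDACTED" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }
```

Redaction is a second line of defense; disabling `AllowInHttpParams` removes the query-string case at the source.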
### 2. 🟡 Medium — keys are not scoped to methods
A valid key + the `MESAPI` role grants access to **every** `[RequiredRole("MESAPI")]` endpoint
(move-in, move-out, both alarm reads). There is no per-key allow-list of methods, so one leaked
integration key exposes the entire MES surface.
**Fix:** scope keys per integration (separate roles/keys for read vs. move-in vs. move-out), the
way the ScadaBridge Inbound API does (keys carry an explicit method allow-list).
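
A per-key allow-list amounts to a set lookup after authentication. A minimal Python sketch of the idea; the key ids and the `KEY_SCOPES` table are invented for illustration and do not reflect either system's storage:

```python
# Hypothetical scope table: key id -> methods that key may call.
KEY_SCOPES = {
    "key-alarms": frozenset({"alarmstatus", "simplealarmstatus"}),
    "key-movein": frozenset({"movein"}),
    "key-moveout": frozenset({"moveout"}),
}


def is_allowed(key_id, method):
    """True only if the key is known and the method is on its allow-list."""
    return method in KEY_SCOPES.get(key_id, frozenset())
```

With this shape, a leaked `key-alarms` key can read alarm status but cannot trigger a move-in or move-out.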
### 3. 🟢 Low-Medium — the per-key `test` redirect is half-wired
`GetDbConnection` opens a `"TestDb"` connection for `Environment == "test"` keys
(`AppHost.cs:104-106`), but **no `TestDb` connection string is defined** in `App.config` — such
requests would throw. Separately, the move-out cycle-data read uses a raw
`ConnectionStrings["BatchDB"]` connection (see `MoveIn-MoveOut-API.md`), so it ignores the
redirect and reads production data even with a `test` key.
**Fix:** define `TestDb` (or remove the redirect), and route the raw cycle-data read through the
same environment-aware connection so a `test` key never touches production.
---
## Remediation checklist (priority order)
- [ ] **Enforce TLS** + `RequireSecureConnection = true`; stop accepting the key in the URL
(`AllowInHttpParams = false`); redact/secure the request logs (#1).
- [ ] **Scope API keys per method/integration** instead of one coarse `MESAPI` role (#2).
- [ ] **Fix or remove the `test` → `TestDb` redirect** and make the raw cycle-data read
environment-aware (#3).
---
## Contrast: ScadaBridge Inbound API
ScadaBridge uses an **`X-API-Key` header** on the data plane (`POST /api/{method}`), validated
server-side, with each key **scoped to an explicit method allow-list** — versus one coarse
`MESAPI` role granting all endpoints here. This file documents the legacy API-key posture so that
difference is intentional and verifiable during cutover.
### Accepted auth transports
| Transport | mesapi (ServiceStack `ApiKeyAuthProvider`) | ScadaBridge Inbound API |
|-----------|--------------------------------------------|-------------------------|
| `Authorization: Bearer <key>` | ✅ | ✅ — `Bearer sbk_<keyId>_<secret>` (the `Bearer ` prefix is optional; a bare token in `Authorization` also works) |
| `Authorization: Basic <base64("<key>:")>` (key as username) | ✅ | ❌ |
| `X-API-Key: <key>` | ❌ (not supported by stock ServiceStack) | ✅ — raw token `sbk_<keyId>_<secret>` |
| `?apikey=<key>` (query string / form param) | ✅ (`AllowInHttpParams` defaults on) | ❌ — headers only |
| Session cookie after first auth (`ss-id`/`ss-pid`) | ✅ (30-min `SessionCacheDuration`) | ❌ — stateless; every request re-presents the key |
Evidence: mesapi is stock `new ApiKeyAuthProvider(AppSettings)` with no header customization (so
ServiceStack defaults apply); ScadaBridge logic is `EndpointExtensions.cs:83-95`.
Notes:
- **The only common header is `Authorization: Bearer`** — the portable choice for a client that
must talk to both.
- **`X-API-Key` is ScadaBridge-only**; **`Basic` and `?apikey=` are mesapi-only.** A `curl -H
"X-API-Key: …"` authenticates ScadaBridge but is *rejected* by mesapi.
- **Precedence when two are sent:** ScadaBridge — `Authorization` wins over `X-API-Key`; mesapi —
ServiceStack checks `Authorization` (Bearer/Basic) before the `apikey` param.
- **mesapi is looser, ScadaBridge is tighter:** mesapi accepts the key in the URL and over a
long-lived session cookie (more leak surface — see gap #1); ScadaBridge restricts to two
headers, no URL param, no session.
- **Token shape & verification:** mesapi keys are opaque ServiceStack keys checked against the auth
repo; ScadaBridge keys are structured `sbk_<keyId>_<secret>` verified by a peppered-HMAC
constant-time compare and **scoped to specific method names** (case-sensitive).
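
The ScadaBridge-side behavior described in these notes (header precedence, structured token, constant-time compare, method scope) can be sketched as follows. This is a hedged Python illustration, not the code in `EndpointExtensions.cs`: the pepper, the SHA-256 HMAC, the key-store layout, and the assumption that `keyId` contains no underscore are all invented for the example.

```python
import hashlib
import hmac

PEPPER = b"hypothetical-server-pepper"  # assumption: server-side secret


def digest(secret):
    """Peppered HMAC of the secret part of a token (algorithm is assumed)."""
    return hmac.new(PEPPER, secret.encode("utf-8"), hashlib.sha256).hexdigest()


def extract_token(headers):
    """Authorization wins over X-API-Key; the 'Bearer ' prefix is optional."""
    auth = headers.get("Authorization")
    if auth:
        return auth[len("Bearer "):] if auth.startswith("Bearer ") else auth
    return headers.get("X-API-Key")


def parse_token(token):
    """Split sbk_<keyId>_<secret>; returns (key_id, secret) or None."""
    parts = token.split("_", 2) if token else []
    if len(parts) != 3 or parts[0] != "sbk" or not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]


def authorize(headers, method, key_store):
    """Verify the token, then check the case-sensitive method allow-list."""
    parsed = parse_token(extract_token(headers))
    if parsed is None:
        return False
    key_id, secret = parsed
    record = key_store.get(key_id)
    if record is None:
        return False
    if not hmac.compare_digest(digest(secret), record["digest"]):
        return False
    return method in record["methods"]
```

`hmac.compare_digest` is used so the comparison time does not depend on how many leading characters match.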
@@ -0,0 +1,59 @@
# Known issue — inbound API method compile errors are not client-visible; no on-demand validation (2026-06-25 session)
**Status:** OPEN · **Found:** 2026-06-25 · **Context:** live ops session on `wonder-app-vd03` — three deployed inbound methods (`IpsenMESMoveIn`, `MesMoveIn`, `MesMoveOut`) returned `Script compilation failed for this method` after being authored from a design doc that used the wrong DB-helper name (`Database.QuerySingle<T>` instead of the shipped async `Database.QuerySingleAsync<T>`). Diagnosing the *actual* Roslyn error required an SSH dive into the central log; nothing in the CLI or the Management/data-plane API surfaces it.
**Components:** Inbound API (#14), CLI (#19), Management Service (#18)
Issues are listed worst-first. Severities are author estimates. Neither item caused data loss — once the scripts were corrected via `UpdateApiMethod` they compiled and ran (verified with a live `MesMoveIn` test against the `Z28061Sim` instance: `{"WasSuccessful":true,"ErrorText":"","BatchID":0}`).
Related: the runtime mechanics behind both items are captured in the recall notes `inbound-known-bad-method-cache` and `scadabridge-inbound-db-helper-querysingleasync`. The root-cause doc fix shipped in `66bbbb7a` / `33da8c79`.
---
## 1. The real inbound-script compile error is server-log-only; there is no `api-method validate`
**Severity:** Medium · **Components:** Inbound API (#14), CLI (#19), Management Service (#18)
**Symptom:** When an inbound method's script fails to compile, every caller of `POST /api/{method}` gets the same generic body — `Script compilation failed for this method` — with no diagnostic. The actual Roslyn error (e.g. `'InboundDatabaseHelper' does not contain a definition for 'QuerySingle'`) is written **only** to the central server log. There is no CLI command and no API verb to (a) retrieve the last compile error for a method, or (b) compile/validate a method's script on demand the way templates can be validated.
**Reproduction (this session):**
```bash
curl -s -X POST http://wonder-app-vd03.zmr.zimmer.com:8085/api/IpsenMESMoveIn \
-H "X-API-Key: <key>" -H "Content-Type: application/json" -d '{ ...MoveIn... }'
# -> {"WasSuccessful":false,"ErrorText":null,"...":"Script compilation failed for this method"} (no detail)
```
The only way to get the real cause was:
```bash
ssh -tt -p 2222 -i ~/.ssh/servecli_wonder dohertj2@wonder-app-vd03.zmr.zimmer.com
# grep E:\ApiInstall\ScadaBridge\central\logs\scadabridge-central-<date>.log for "script compilation failed"
```
**Root cause:** `InboundScriptExecutor` deliberately returns a non-leaky generic message to the data plane (`InboundScriptExecutor.cs:299` and `:311`: `"Script compilation failed for this method"`), while the genuine diagnostic only ever reaches the logger:
- `InboundScriptExecutor.cs:182-183`: `LogWarning("API method {Method} script compilation failed: {Errors}", …)`
- `InboundScriptExecutor.cs:197`: `LogError(ex, "Failed to compile API method {Method} script", …)`
Returning the raw Roslyn text to an *external, API-key* caller is the right default (it can leak code/internal type names), but it means an **operator/admin** has no first-class channel to that text either. Contrast `template validate` (CLI `template validate --id`, README §Template) which runs a real compile and returns the diagnostics — there is no `api-method validate` equivalent (`grep` for `ValidateApiMethod`/`CompileApiMethod`/`RecompileApiMethod` across `src/` returns nothing; `ApiMethodCommands.cs` registers only `list`/`get`/`create`/`update`/`delete`).
**Impact:** turns a one-line fix into a host-access investigation. Authoring/repairing an inbound script becomes "update → fire a request → if it fails, SSH into the host and read the log → repeat," instead of "validate → read the error → fix."
**Suggested fix (pick one or both):**
1. **`api-method validate --id <int>`** (CLI #19) backed by a management `ValidateApiMethodCommand` (#18) that runs `CompileAndRegister`'s compile path and returns the structured diagnostics to the *authenticated, role-gated* management caller (never the data plane). Mirror `template validate`.
2. Surface the **last compile state** on `api-method get` — e.g. `LastCompileError` (string, null when clean) + `IsKnownBad` (bool) — so an operator can see why a method is failing without re-firing it. Keep the data-plane `/api/{method}` body generic as-is.
---
## 2. The `_knownBadMethods` cache is neither observable per-method nor resettable without a full-replace update
**Severity:** Low-Medium · **Components:** Inbound API (#14)
**Symptom:** Once a method's script fails to compile, its name is recorded in an in-memory bad-methods set and every later request **short-circuits** to the generic message *without recompiling*. Consequences observed/known:
- Fixing the stored script **directly in the config DB does not take effect** — the running process keeps serving the compile-failed message until a management `UpdateApiMethod` (which calls `CompileAndRegister`) or a service restart. (This is exactly why one of the three methods this session, `MesMoveIn`, had to be re-saved via the management API even though its stored script was already fine.)
- There is no way to *see* whether a given method is currently in the bad set (only an aggregate `internal int KnownBadMethodCount`, not exposed to any client), and no lightweight way to force a recompile short of a full-entity replace.
**Root cause:** `InboundScriptExecutor` (`InboundScriptExecutor.cs`):
- declares the cache at `:39` (`ConcurrentDictionary<string, byte> _knownBadMethods`),
- short-circuits on it at `:58` / `:298`, adds to it at `:62`,
- and clears an entry in **exactly one place**: `CompileAndRegister` (`:103`, removal at `:118`). A direct DB row edit never runs `CompileAndRegister`, so the stale entry persists.
The cache itself is correct and desirable (it stops every request re-running a doomed Roslyn compile). The gap is purely **observability + a targeted reset**.
**Suggested fix:**
- Expose per-method state via the item-1 fix (`IsKnownBad` / `LastCompileError` on `api-method get`).
- Add a lightweight **`api-method recompile --id`** (management `RecompileApiMethodCommand`) that re-runs `CompileAndRegister` for one method without requiring the caller to round-trip the whole entity (script + timeout + parameterDefinitions + returnDefinition) — today `UpdateApiMethod` is full-replace, so an operator must re-send every field just to bust the cache. This is the smaller, lower-risk sibling of item 1's validate verb.
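
The cache behavior and the proposed targeted reset can be modelled in a few lines. This Python sketch is a simplified analogy, not `InboundScriptExecutor`: the built-in `compile()` stands in for Roslyn, `MethodRegistry` is a hypothetical class, and `IsKnownBad` / `LastCompileError` are the field names suggested above rather than shipped ones.

```python
class MethodRegistry:
    GENERIC = "Script compilation failed for this method"

    def __init__(self):
        self.scripts = {}      # stored script text (stand-in for the config DB)
        self.compiled = {}     # name -> compiled code object
        self.last_error = {}   # name -> diagnostic; presence means "known bad"
        self.compile_count = 0

    def _compile(self, name):
        self.compile_count += 1
        try:
            self.compiled[name] = compile(self.scripts[name], name, "exec")
            self.last_error.pop(name, None)  # a clean compile clears the entry
            return True
        except SyntaxError as exc:
            self.compiled.pop(name, None)
            self.last_error[name] = str(exc)
            return False

    def invoke(self, name):
        """Data-plane path: generic message only, no recompile when known bad."""
        if name in self.last_error:
            return self.GENERIC  # short-circuit on the bad-methods set
        if name not in self.compiled and not self._compile(name):
            return self.GENERIC
        return "ok"

    def recompile(self, name):
        """Management path: recompile one method and return its state."""
        ok = self._compile(name)
        return {"IsKnownBad": not ok, "LastCompileError": self.last_error.get(name)}
```

In this model, editing `scripts[name]` directly leaves the stale entry in place, and only `recompile` (the analogue of re-running `CompileAndRegister`) clears it.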