@@ -21,6 +21,9 @@ dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli -- --help
 | `-P` / `--plc-type` | `Slc500` | Slc500 / MicroLogix / Plc5 / LogixPccc |
 | `--timeout-ms` | `5000` | Per-operation timeout — see precedence note below |
 | `--retries` | `0` | Retry count on transient `BadCommunicationError` (PR 9 / #252) |
+| `--demote-failure-threshold` | `3` | **PR ablegacy-12 / #255** — consecutive comm failures before the device is auto-demoted |
+| `--demote-for-ms` | `30000` | **PR ablegacy-12 / #255** — auto-demote cool-down window in ms |
+| `--no-demote` | off | **PR ablegacy-12 / #255** — disable auto-demote entirely (counters still tick) |
 | `--verbose` | off | Serilog debug output |

 Family ↔ CIP-path cheat sheet:
@@ -84,6 +87,37 @@ otopcua-ablegacy-cli probe -g ab://192.168.1.20/1,0
 otopcua-ablegacy-cli probe -g ab://192.168.1.30/ -P MicroLogix -a S:0
 ```

+`probe` output (PR ablegacy-12 / #255) reports both `Health` (driver health
+state) and `Host state`. The latter is sourced from `IHostConnectivityProbe`
+and surfaces `Demoted` when the auto-demote threshold has tripped — a fast
+visual signal that the CLI is short-circuiting future reads against this
+device until the cool-down expires:
+
+```text
+Gateway: ab://192.168.1.20/1,0
+PLC type: Slc500
+Health: Degraded
+Host state: Demoted
+Last error: libplctag status -33 reading N7:0
+```
+
+### Auto-demote knobs
+
+```powershell
+# Trip after just one comm failure, hold for 60s.
+otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a N7:0 -t Int `
+  --demote-failure-threshold 1 --demote-for-ms 60000
+
+# Opt out of auto-demote — stresses the link without short-circuiting.
+otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a N7:0 -t Int --no-demote
+```
+
+The CLI is a one-shot test client — auto-demote primarily matters in the
+server-side multi-device deployment, where a single demoted PLC can no
+longer block reads against its healthy peers. Use the CLI flags to
+reproduce a flapping-link scenario locally before tuning the server-side
+`appsettings.json` `Demote` block.
+
 ### `read`

 ```powershell

@@ -7,10 +7,12 @@ directly without going through a separate diagnostics RPC. Mirrors the AB CIP

 Closes #253 (PR ablegacy-10).

-## The seven counters
+## The nine counters

-Each device managed by the `AbLegacyDriver` exposes seven read-only nodes under
-`AbLegacy/<host>/_Diagnostics/<name>`:
+Each device managed by the `AbLegacyDriver` exposes nine read-only nodes under
+`AbLegacy/<host>/_Diagnostics/<name>`. The first seven shipped in PR ablegacy-10;
+`DemoteCount` + `LastDemotedUtc` arrived with PR ablegacy-12 / #255 (auto-demote
+on comm failure).

 | Name | Type | Semantics |
 |---|---|---|
@@ -21,6 +23,8 @@ Each device managed by the `AbLegacyDriver` exposes seven read-only nodes under
 | `LastErrorCode` | Int32 | Most recent libplctag status code on a failed read; `0` when no error has been seen since the last reset. |
 | `LastErrorMessage` | String | Most recent libplctag error message on a failed read; empty when no error has been seen since the last reset. |
 | `CommFailures` | Int64 | Count of read failures mapped to `BadCommunicationError`. Spans transient libplctag throws + retried-out chains so operators see a single "wire fell off" counter. |
+| `DemoteCount` | Int64 | **PR ablegacy-12** — cumulative auto-demote events for this device. Bumps every time the driver crosses the consecutive-failure threshold and arms a fresh cool-down window. Cumulative across `ReinitializeAsync` (preserved through redeploys) so a flapping link surfaces as a steadily climbing counter. |
+| `LastDemotedUtc` | String | **PR ablegacy-12** — ISO-8601 UTC timestamp of the most recent auto-demotion. Empty string when this device has never been demoted. |

 **Address shape**: `_Diagnostics/<deviceHostAddress>/<name>` —
 e.g. `_Diagnostics/ab://10.0.0.5/1,0/RequestCount`.
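
Reviewer sketch (not part of the patch): the address shape above embeds a host address that itself contains `/`, so a consumer has to split on the last separator rather than the first. The Python below is a minimal illustration of that; `build_diagnostic_address` and `split_diagnostic_address` are hypothetical helper names, and the driver's own C# parsing may differ.

```python
# Hypothetical helpers illustrating the `_Diagnostics/<deviceHostAddress>/<name>` shape.
PREFIX = "_Diagnostics/"


def build_diagnostic_address(host_address: str, name: str) -> str:
    """Join a device host address and a counter name into a diagnostics address."""
    return f"{PREFIX}{host_address}/{name}"


def split_diagnostic_address(address: str) -> tuple[str, str]:
    """Split a diagnostics address into (host_address, counter_name).

    The host address may contain '/' (e.g. 'ab://10.0.0.5/1,0'), so the
    counter name is taken as everything after the LAST '/'.
    """
    if not address.startswith(PREFIX):
        raise ValueError(f"not a diagnostics address: {address!r}")
    rest = address[len(PREFIX):]
    host, sep, name = rest.rpartition("/")
    if not sep or not host or not name:
        raise ValueError(f"malformed diagnostics address: {address!r}")
    return host, name
```

Splitting from the right keeps the helper independent of how many path segments the CIP path carries, including the empty path used for MicroLogix hosts.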
@@ -34,10 +38,11 @@ user-config tag node, just under a reserved sibling folder.

 | Trigger | Effect |
 |---|---|
-| `ReinitializeAsync` | Every counter for every device resets to zero, plus `LastErrorMessage` clears to empty. |
-| `ShutdownAsync` | Same as Reinitialize — counters drop with the device map. |
+| `ReinitializeAsync` | Every counter for every device resets to zero, plus `LastErrorMessage` clears to empty. **PR ablegacy-12 exception:** `DemoteCount` + `LastDemotedUtc` survive the reinit so an operator redeploying mid-incident doesn't lose the flapping-link history. |
+| `ShutdownAsync` | All counters drop with the device map (including `DemoteCount`). |
 | Driver process restart | Counters start at zero. |
 | Probe transition Stopped→Running | **No automatic reset** — counters are cumulative across reconnect events so operators can spot intermittent links by watching `CommFailures` keep climbing. |
+| Probe transition Demoted→Running | **PR ablegacy-12** — early-clear of the active demote window, but the cumulative `DemoteCount` stays put. |

 There is no in-process "reset" RPC at the time of writing. If you need to
 clear counters without a redeploy, kick a `ReinitializeAsync` from the Admin
@@ -99,14 +104,85 @@ overview dashboard, plus a faster rate (1 s) on `LastErrorMessage` /
 short-circuit makes every read O(1) — there's no penalty for fast polling
 of the counter itself, only the OPC UA subscription bookkeeping.

+## Auto-demote on comm failure (PR ablegacy-12 / #255)
+
+When a device fails N consecutive reads or probes the driver marks it
+**Demoted** for a configurable cool-down window. Reads against a demoted
+device short-circuit with `BadCommunicationError` *without invoking
+libplctag* — that's the whole point of the feature: one slow PLC sharing
+the driver thread can't starve faster peers reading from healthy hosts on
+the same `AbLegacyDriver` instance.
+
+### Configuration
+
+Per-device, optional. `null` keeps the documented defaults (auto-demote
+**enabled** with 3 failures / 30 s).
+
+```jsonc
+{
+  "Devices": [
+    {
+      "HostAddress": "ab://10.0.0.5/1,0",
+      "PlcFamily": "Slc500",
+      "Demote": {
+        "FailureThreshold": 3,  // default 3
+        "DemoteForMs": 30000,   // default 30s
+        "Enabled": true         // default true
+      }
+    }
+  ]
+}
+```
+
+| Knob | Default | Notes |
+|---|---|---|
+| `FailureThreshold` | `3` | Consecutive comm failures before the device is demoted. A successful read or probe resets the tally. Terminal failures (`BadNodeIdUnknown`, `BadTypeMismatch`, …) **do not count** — they're config / decoder mismatches, not field outages. |
+| `DemoteForMs` | `30000` (30s) | Cool-down window. Reads while this is active short-circuit; a successful probe clears it early. |
+| `Enabled` | `true` | Set to `false` to keep the diagnostic counters but skip the auto-throttle. The failure tally still ticks but never arms the cool-down. |
+
+### Recovery
+
+Three ways out of Demoted, in order of likelihood:
+
+1. **Probe success** — the per-device probe loop (`Probe.Enabled = true`,
+   default address `S:0`) is the fast path. The next probe iteration after
+   demotion will exercise the wire; on success it clears
+   `DemotedUntilUtc` immediately and transitions the host to `Running`.
+2. **Window expiry** — once `DemoteForMs` elapses the demote marker
+   clears on the next read attempt. The read goes through; if it fails,
+   the failure tally keeps counting from where it left off (so a
+   permanently-down device re-arms the window after one more consecutive
+   failure rather than having to repeat the full threshold).
+3. **`ReinitializeAsync`** — clears `ConsecutiveFailures` +
+   `DemotedUntilUtc` outright. Cumulative `DemoteCount` survives.
+
+### Observability
+
+`DemoteCount` is the headline counter — it bumps once per demotion event,
+not per short-circuited read. A device that flaps every hour for a week
+shows `DemoteCount = ~168` on Friday afternoon, which is the operator
+signal you actually want.
+
+`LastDemotedUtc` is the ISO-8601 UTC timestamp of the most recent
+demotion. Bind it on a per-device tile alongside `DemoteCount` for
+"flapping link" alerting.
+
+### Host-state surface
+
+A demoted device reports `HostState.Demoted` (new in PR ablegacy-12
+on `Core.Abstractions/IHostConnectivityProbe.cs`). Consumers that
+predate the new value (the central `HostStatusPublisher`) safely treat
+it as `Stopped` — no schema migration needed.
+
 ## Cross-references

 - [`AbLegacyDiagnosticTags.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDiagnosticTags.cs)
   — counter store + read short-circuit
 - [`AbLegacyDriver.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs)
-  — increment sites in `ReadAsync`, discovery emission in `DiscoverAsync`
+  — increment sites in `ReadAsync`, discovery emission in `DiscoverAsync`,
+  auto-demote bookkeeping in `RecordFailureAndMaybeDemote` + `ProbeLoopAsync`
 - [`AbLegacy-Test-Fixture.md`](AbLegacy-Test-Fixture.md) — `AbLegacyDiagnosticsTests`
-  + collision-rejection contract
+  + `AbLegacyAutoDemoteTests` + collision-rejection contract
 - [AB CIP `_System/` parallel](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipSystemTagSource.cs)
   — same pattern with the CIP-specific six entries (incl. writeable
   `_RefreshTagDb` trigger)
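
Reviewer sketch (not part of the patch): the threshold, cool-down, and recovery rules described in this file can be modelled as a small state machine. The Python below is only an illustration under stated assumptions; `DemoteTracker` and `read` are hypothetical names, the real bookkeeping lives in the C# driver, and two behaviours are guesses: short-circuited reads do not touch the failure tally, and terminal failures would simply never call `record_comm_failure`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class DemoteTracker:
    """Hypothetical per-device model of the auto-demote bookkeeping."""
    failure_threshold: int = 3
    demote_for: timedelta = timedelta(milliseconds=30000)
    enabled: bool = True
    consecutive_failures: int = 0
    demote_count: int = 0                    # cumulative, survives reinitialize()
    demoted_until: Optional[datetime] = None
    last_demoted_utc: str = ""               # empty until the first demotion

    def is_demoted(self, now: datetime) -> bool:
        # Window expiry: the marker clears on the next attempt, the tally stays.
        if self.demoted_until is not None and now >= self.demoted_until:
            self.demoted_until = None
        return self.demoted_until is not None

    def record_success(self) -> None:
        # A successful read or probe resets the tally and clears the window early.
        self.consecutive_failures = 0
        self.demoted_until = None

    def record_comm_failure(self, now: datetime) -> None:
        # The tally always ticks; the cool-down only arms when enabled.
        self.consecutive_failures += 1
        if self.enabled and self.consecutive_failures >= self.failure_threshold:
            self.demoted_until = now + self.demote_for
            self.demote_count += 1
            self.last_demoted_utc = now.isoformat()

    def reinitialize(self) -> None:
        # Clears the active window and the tally; demote_count is preserved.
        self.consecutive_failures = 0
        self.demoted_until = None


def read(tracker: DemoteTracker, now: datetime, transport: Callable[[], bool]) -> str:
    """Return a status string; `transport` stands in for the libplctag call."""
    if tracker.is_demoted(now):
        return "BadCommunicationError"       # short-circuit, transport not invoked
    if transport():
        tracker.record_success()
        return "Good"
    tracker.record_comm_failure(now)
    return "BadCommunicationError"
```

Because the tally is not reset when the window expires, the first failed read after expiry already sits at or above the threshold and re-arms the cool-down, which matches the "one more consecutive failure" behaviour described under Recovery. Injecting `now` rather than reading the clock inside the tracker keeps the model deterministic for tests.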
@@ -53,12 +53,31 @@ supplies a `FakeAbLegacyTag`.
   counters: 5 reads (3 ok / 2 fail) → `RequestCount=5`, `ResponseCount=3`,
   `ErrorCount=2`; `LastErrorCode` reflects the most recent libplctag status;
   `RetryCount` increments per retry attempt beyond the first; counters reset
-  on `ReinitializeAsync`; discovery emits exactly 7 diagnostic variables per
-  device under `_Diagnostics/`; collision rejection at `InitializeAsync` for
-  user tags shadowing reserved names or `_Diagnostics/` addresses; the
-  `_Diagnostics/<host>/<name>` short-circuit returns the live snapshot through
-  `ReadAsync` without bumping `RequestCount`; two devices keep counters
-  independent.
+  on `ReinitializeAsync`; discovery emits the canonical diagnostic variables
+  per device under `_Diagnostics/` (now 9 with PR ablegacy-12); collision
+  rejection at `InitializeAsync` for user tags shadowing reserved names or
+  `_Diagnostics/` addresses; the `_Diagnostics/<host>/<name>` short-circuit
+  returns the live snapshot through `ReadAsync` without bumping
+  `RequestCount`; two devices keep counters independent.
+- `AbLegacyAutoDemoteTests` — **PR ablegacy-12 / #255** auto-demote on comm
+  failure: 3 consecutive failures arm the demote window and surface
+  `HostState.Demoted`; subsequent reads short-circuit with
+  `BadCommunicationError` *without invoking libplctag* (verified via
+  `factory.Tags["N7:0"].ReadCount` not advancing); successful read resets
+  the consecutive-failure counter; failure-success-failure pattern doesn't
+  cross the threshold; `DemoteCount` + `LastDemotedUtc` surface via
+  `_Diagnostics/`; `Enabled=false` opts out (failures still count, demotion
+  never fires); `ReinitializeAsync` clears the active window but preserves
+  cumulative `DemoteCount`; cool-down expiry allows the next read through;
+  two devices in one driver — one faulty, one healthy — proves the faulty
+  side's demotion doesn't starve the healthy side; `BadNodeIdUnknown`
+  (terminal) does not count toward the comm-failure tally; DTO JSON
+  round-trip preserves `FailureThreshold` / `DemoteForMs` / `Enabled` at
+  the per-device level; `HostState.Demoted` enum value is wired through
+  `Core.Abstractions`. Companion integration test in
+  `tests/.../IntegrationTests/AbLegacyAutoDemoteTests.cs` runs the
+  two-device-one-unreachable scenario against a live ab_server fixture
+  using `127.0.0.1:1` as the unreachable peer.
 - `RsLogixSymbolImportTests` — ablegacy-11 / #254 RSLogix CSV symbol-import parser:
   canonical 8-row CSV (one row per N/F/B/L/ST/T/C/R) → 8 typed
   `AbLegacyTagDefinition`s with the right `DataType`; header + comment-line