docs: clarify FeatherlessClient.embed() rationale (verified 500 + empty embedding catalog)

Updates the docstring and the test docstring for the
NotImplementedError stub shipped in T112 (Phase 4.5). The original
wording said Featherless 'does not expose /v1/embeddings'. In fact the
endpoint is routed, but it returned HTTP 500 with
type='completions_error' for every model tried
(text-embedding-3-small, BAAI/bge-small-en-v1.5,
sentence-transformers/all-MiniLM-L6-v2, etc.), and /v1/models lists no
embedding-class entries. Stub behavior is unchanged.
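
For readers unfamiliar with the stub-and-fallback pattern the docstrings describe, here is a minimal sketch under stated assumptions. `StubClient`, `EMBEDDING_DIM`, and the `generate_embedding` signature shown here are hypothetical stand-ins for illustration; they are not the project's actual code.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Hypothetical dimension, chosen for illustration only.
EMBEDDING_DIM = 384


class StubClient:
    """Hypothetical stand-in for a client whose provider has no working embeddings."""

    async def embed(self, text: str, model: str) -> list[float]:
        # Fail loudly instead of sending a request that is known to error.
        raise NotImplementedError("embeddings are not available from this provider")


async def generate_embedding(
    client, text: str, model: str, dim: int = EMBEDDING_DIM
) -> list[float]:
    """Assumed wrapper shape: try the client, fall back to a zero vector."""
    try:
        return await client.embed(text, model=model)
    except NotImplementedError as exc:
        # Warn and degrade rather than propagate the error to callers.
        logger.warning("embed() unavailable (%s); using zero vector", exc)
        return [0.0] * dim


vector = asyncio.run(
    generate_embedding(StubClient(), "hello world", model="bge-small-en-v1.5")
)
```

Raising `NotImplementedError` in the client keeps the gap visible at the call site, while the wrapper decides how to degrade.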
Joseph Doherty
2026-04-27 11:39:53 -04:00
parent a03f664407
commit b3d78c1603
2 changed files with 30 additions and 19 deletions
+11 -8
@@ -1,10 +1,12 @@
 """Tests for FeatherlessClient (Phase 4.5+).
 Phase 4.5 adds an ``embed()`` method to the LLMClient Protocol (T112).
-Featherless does not expose an OpenAI-compatible ``/v1/embeddings``
-endpoint, so its implementation deliberately raises
-``NotImplementedError`` to surface the gap clearly. The
-``generate_embedding`` wrapper catches this and degrades to the
+Featherless's OpenAI-compatible surface routes ``/v1/embeddings`` but
+every request returns HTTP 500 ``{"type": "completions_error"}`` (the
+router accepts the URL but the backend has no embedding handler), and
+``/v1/models`` lists no embedding-class models. The implementation
+raises ``NotImplementedError`` rather than ship a request that always
+errors; ``generate_embedding`` catches it and degrades to the
 zero-vector fallback (the existing T107 warning path).
 If/when Featherless ships embeddings, swap the body for a real call to
@@ -20,10 +22,11 @@ from chat.llm.featherless import FeatherlessClient
 @pytest.mark.asyncio
 async def test_featherless_embed_raises_not_implemented():
-    """Featherless does not expose ``/v1/embeddings`` — embed() must
-    raise ``NotImplementedError`` so callers (``generate_embedding``)
-    can degrade to the fallback zero vector + warning rather than
-    silently producing useless output."""
+    """Featherless's ``/v1/embeddings`` always 500s with
+    ``"completions_error"`` and its model catalog has no embedding
+    class — embed() must raise ``NotImplementedError`` so callers
+    (``generate_embedding``) can degrade to the fallback zero vector
+    + warning rather than silently producing useless output."""
     client = FeatherlessClient(api_key="test-key")
     with pytest.raises(NotImplementedError) as excinfo:
         await client.embed("hello world", model="bge-small-en-v1.5")