[CAPSULE:0.1.13] ⎯ 2025-03-20
Feature release adding a new embeddings model, flash attention, and improved ranking capabilities.
Added
- New embeddings model.
- Flash attention for faster processing.
- max_tokens support for ranking requests.
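As a rough sketch of how the new max_tokens option could be used, the snippet below builds a ranking request payload and applies the token cap client-side. The endpoint shape, field names, and model identifier are illustrative assumptions, not the documented CAPSULE API:

```python
# Hypothetical ranking request payload; every field name here is an
# assumption for illustration, not the documented CAPSULE API.
ranking_request = {
    "model": "capsule-rank",          # assumed model identifier
    "query": "best sci-fi novels",
    "documents": [
        "Dune by Frank Herbert",
        "A cookbook of pasta recipes",
    ],
    "max_tokens": 512,  # new in 0.1.13: cap on tokens considered per document
}

def truncate_tokens(text: str, max_tokens: int) -> str:
    """Naive whitespace tokenizer standing in for the real one."""
    return " ".join(text.split()[:max_tokens])

# Apply the cap to each candidate document before ranking.
capped_documents = [
    truncate_tokens(doc, ranking_request["max_tokens"])
    for doc in ranking_request["documents"]
]
```

Capping document length keeps ranking latency predictable when candidate documents vary widely in size.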
Changed
- Expanded context window for embeddings.
- Improved ranking logic.
Fixed
- LLM context window bug.
- General resource optimization.