How to Read This Changelog
Each release is documented with:
- Added: New features and capabilities
- Changed: Modifications to existing functionality
- Deprecated: Features that will be removed in future versions
- Removed: Features that have been removed
- Fixed: Bug fixes and corrections
- Security: Security-related improvements
Version Format
Honcho follows Semantic Versioning:
- MAJOR version for incompatible API changes
- MINOR version for backwards-compatible functionality additions
- PATCH version for backwards-compatible bug fixes
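As a quick illustration of these precedence rules, plain MAJOR.MINOR.PATCH versions compare like integer tuples (a hypothetical helper for illustration, not part of any Honcho SDK):

```python
def parse_version(v: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into an integer triple."""
    major, minor, patch = (int(part) for part in v.split("."))
    return (major, minor, patch)

# Tuples compare element-by-element, which matches semver precedence
# for plain MAJOR.MINOR.PATCH versions (no pre-release tags).
assert parse_version("3.0.3") > parse_version("3.0.2")
assert parse_version("2.5.1") < parse_version("3.0.0")
```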
Honcho API and SDK Changelogs
- Honcho API
- Python SDK
- TypeScript SDK
v3.0.3 (Current)
Added
- Consolidated session context into a single DB session with 40/60 token budget allocation between summary and messages
- Observation validation via `ObservationInput` Pydantic schema with partial-success support and batch embedding with per-observation fallback
- Peer card hard cap of 40 facts with case-insensitive deduplication and whitespace normalization
- Safe integer coercion (`_safe_int`) for all LLM tool inputs to handle non-integer values like `"Infinity"`
- Embedding pre-computation and reuse across multiple search calls in dialectic and representation flows
- Peer existence validation in dialectic chat endpoints — raises ResourceNotFoundException instead of silently failing
- Logging filter to suppress noisy `GET /metrics` access logs
- Oolong long-context aggregation benchmark (synth and real variants, 1K–4M token context windows)
- MolecularBench fact quality evaluation (ambiguity, decontextuality, minimality scoring)
- CoverageBench information recall evaluation (gold fact extraction, coverage matching, QA verification)
- LoCoMo summary-as-context baseline evaluation
- Webhook delivery tests, dependency lifecycle tests, queue cleanup tests, summarizer fallback tests
- Parallel test execution via pytest-xdist with worker-specific databases
- `test_reasoning_levels.py` script for LoCoMo dataset testing across reasoning levels
Changed
- Workspace deletion is now async — returns 202 Accepted, validates no active sessions (409 Conflict), cascade-deletes in background
- Redis caching layer now uses plain-dict storage instead of ORM objects, with v2-prefixed keys, resilient `safe_cache_set`/`safe_cache_delete` helpers, and deferred post-commit cache invalidation
- All `get_or_create_*` CRUD operations now use savepoints (`db.begin_nested()`) instead of commit/rollback for race condition prevention
- Reconciler vector sync uses direct ORM mutation instead of batch parameterized UPDATE statements
- Summarizer enforces hard word limit in prompt and creates fallback text for empty summaries with `summary_tokens = 0`
- Blocked Gemini responses (SAFETY, RECITATION, PROHIBITED_CONTENT, BLOCKLIST) now raise `LLMError` to trigger retry/backup-provider logic
- Gemini client explicitly sets `max_output_tokens` from the `max_tokens` parameter
- All deriver and metrics collector logging replaced with structured `logging.getLogger(__name__)` calls
- Dreamer specialist prompts updated to enforce durable-facts-only peer cards with max 40 entries and deduplication
- `GetOrCreateResult` changed from `NamedTuple` to `dataclass` with async `post_commit()` method
- FastAPI upgraded from 0.111.0 to 0.131.0; added pyarrow dependency
- Queue status filtering to only show user-facing tasks (representation, summary, dream); excludes internal infrastructure tasks
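Honcho uses SQLAlchemy's `db.begin_nested()` for the savepoint-based get-or-create above; the same idea can be sketched with stdlib `sqlite3` (the table and helper names here are illustrative, not Honcho's):

```python
import sqlite3

def get_or_create_peer(conn: sqlite3.Connection, name: str) -> int:
    """Get-or-create guarded by a SAVEPOINT: if a concurrent writer wins
    the insert race, only the savepoint is rolled back, not the whole
    transaction -- the motivation for preferring nested savepoints over
    commit/rollback."""
    row = conn.execute("SELECT id FROM peers WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    conn.execute("SAVEPOINT get_or_create")
    try:
        cur = conn.execute("INSERT INTO peers (name) VALUES (?)", (name,))
        conn.execute("RELEASE SAVEPOINT get_or_create")
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Lost the race: undo only the savepoint, then re-read the winner's row.
        conn.execute("ROLLBACK TO SAVEPOINT get_or_create")
        conn.execute("RELEASE SAVEPOINT get_or_create")
        return conn.execute(
            "SELECT id FROM peers WHERE name = ?", (name,)
        ).fetchone()[0]

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit; savepoints managed by hand
conn.execute("CREATE TABLE peers (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
first = get_or_create_peer(conn, "alice")
second = get_or_create_peer(conn, "alice")  # returns the existing row's id
```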
Fixed
- JWT timestamp bug — `JWTParams.t` was evaluated once at class definition time instead of per-instance
- Session cache invalidation on deletion was missing
- `get_peer_card()` now properly propagates `ResourceNotFoundException` instead of swallowing it
- `set_peer_card()` ensures peer exists via `get_or_create_peers()` before updating
- Backup provider failover with proper tool input type safety
- Removed `setup_admin_jwt()` from server startup
- Sentry coroutine detection switched from `asyncio.iscoroutinefunction` to `inspect.iscoroutinefunction`
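The JWT timestamp bug above is an instance of a common Python pitfall: a default computed in the class body is evaluated once, at class definition time. A hypothetical reconstruction with dataclasses (Honcho's actual `JWTParams` may be defined differently, but the pitfall is the same):

```python
import time
from dataclasses import dataclass, field

@dataclass
class JWTParamsBuggy:
    # BUG: time.time() runs once, when the class is defined;
    # every instance shares the same frozen timestamp.
    t: float = time.time()

@dataclass
class JWTParamsFixed:
    # default_factory re-evaluates time.time() per instance.
    t: float = field(default_factory=time.time)

a = JWTParamsBuggy()
time.sleep(0.01)
b = JWTParamsBuggy()
buggy_same = a.t == b.t  # both carry the class-definition timestamp

c = JWTParamsFixed()
time.sleep(0.01)
d = JWTParamsFixed()
fixed_differ = c.t < d.t  # each instance gets its own timestamp
```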
Removed
- `explicit.py` and `obex.py` benchmarks replaced by `coverage.py` and `molecular.py`
- Claude Code review automation workflow (`.github/workflows/claude.yml`)
- Coverage reporting from default pytest configuration
v3.0.2
Added
- Documentation for reasoning_level and Claude Code plugin
Changed
- Gave dreaming sub-agents better prompting around peer card creation, tweaked overall prompts
Fixed
- Added message-search fallback for memory search tool, necessary in fresh sessions
- Made FLUSH_ENABLED a config value
- Removed N+1 query in search_messages
v3.0.0
Added
- Agentic Dreamer for intelligent memory consolidation using LLM agents
- Agentic Dialectic for query answering using LLM agents with tool use
- Reasoning levels configuration for dialectic (`minimal`, `low`, `medium`, `high`, `max`)
- Prometheus token tracking for deriver and dialectic operations
- n8n integration
- Cloud Events for auditable telemetry
- External Vector Store support for turbopuffer and lancedb with reconciliation flow
Changed
- API route renaming for consistency
- Dreamer and dialectic now respect peer card configuration settings
- Observations renamed to Conclusions across API and SDKs
- Deriver to buffer representation tasks to normalize workloads
- Local Representation tasks to create singular QueueItems
- getContext endpoint to use `search_query` rather than forcing `last_user_message`
Fixed
- Dream scheduling bugs
- Summary creation when start_message_id > end_message_id
- Cashews upgrade to prevent NoScriptError
- Memory leak in `accumulate_metric` call
Removed
- Peer card configuration from message configuration; peer cards no longer created/updated in deriver process
v2.5.1
Fixed
- Backwards compatibility for `message_ids` field in documents to handle legacy tuple format
v2.5.0
Added
- Message level configurations
- CRUD operations for observations
- Comprehensive test cases for harness
- Peer level get_context
- Set Peer Card Method
- Manual dreaming trigger endpoint
Changed
- Configurations to support more flags for fine-grained control of the deriver, peer cards, summaries, etc.
- Working Representations to support more fine-grained parameters
Fixed
- File uploads to match `MessageCreate` structure
- Cache invalidation strategy
v2.4.3
Added
- Redis caching to improve DB IO
- Backup LLM provider to avoid failures when a provider is down
Changed
- QueueItems to use standardized columns
- Improved Deduplication logic for Representation Tasks
- More fine-grained metrics for representation, summary, and peer card tasks
- DB constraint to follow standard naming conventions
v2.4.2
v2.4.1
v2.4.0
Added
- Unified `Representation` class
- vLLM client support
- Periodic queue cleanup logic
- WIP Dreaming Feature
- LongMemEval to Test Bench
- Prometheus Client for better Metrics
- Performance metrics instrumentation
- Error reporting to deriver
- Workspace Delete Method
- Multi-db option in test harness
Changed
- Working Representations are Queried on the fly rather than cached in metadata
- EmbeddingStore to RepresentationFactory
- Summary Response Model to use public_id of message for cutoff
- Semantics across codebase to reference resources based on `observer` and `observed`
- Prompts for Deriver & Dialectic to reference peer_id and add examples
- `Get Context` route returns peer card and representation in addition to messages and summaries
- Refactoring logger.info calls to logger.debug where applicable
Fixed
- Gemini client to use async methods
v2.3.3
v2.3.2
Added
- Get peer cards endpoint (`GET /v2/peers/{peer_id}/card`) for retrieving targeted peer context information
Changed
- Replaced Mirascope dependency with small client implementation for better control
- Optimized deriver performance by using joins on messages table instead of storing token count in queue payload
- Database scope optimization for various operations
- Batch representation task processing for ~10x speed improvement in practice
Fixed
- Separated clean and claim work units in queue manager to prevent race conditions
- Skip locked ActiveQueueSession rows on delete operations
- Langfuse SDK integration updates for compatibility
- Added configurable maximum message size to prevent token overflow in deriver
- Various minor bugfixes
v2.3.0
Added
- `getSummaries` endpoint to get all available summaries for a session directly
- Peer Card feature to improve context for deriver and dialectic
Changed
- Session Peer limit to be based on observers instead, renamed config value to `SESSION_OBSERVERS_LIMIT`
- `Messages` can take a custom timestamp for the `created_at` field, defaulting to the current time
- `get_context` endpoint returns detailed `Summary` object rather than just summary content
- Working representations use a FIFO queue structure to maintain facts rather than a full rewrite
- Optimized deriver enqueue by prefetching message sequence numbers (eliminates N+1 queries)
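The FIFO structure for working representations can be sketched with `collections.deque` (the class name and capacity here are hypothetical; Honcho's implementation is its own):

```python
from collections import deque

class WorkingRepresentation:
    """Sketch of a FIFO fact store: at capacity, the oldest fact is
    evicted, rather than rewriting the whole representation each time."""

    def __init__(self, max_facts: int = 5):
        self.facts: deque[str] = deque(maxlen=max_facts)

    def add(self, fact: str) -> None:
        self.facts.append(fact)  # oldest entry drops off automatically

rep = WorkingRepresentation(max_facts=3)
for fact in ["likes tea", "lives in Paris", "has a dog", "works remotely"]:
    rep.add(fact)
# "likes tea" has been evicted; the newest three remain in insertion order
```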
Fixed
- Deriver uses `get_context` internally to prevent context window limit errors
- Embedding store will truncate context when querying documents to prevent embedding token limit errors
- Queue manager to schedule work based on available work rather than total number of workers
- Queue manager to use atomic db transactions rather than long lived transaction for the worker lifecycle
- Timestamp formats unified to ISO 8601 across the codebase
- Internal get_context method’s cutoff value is exclusive now
v2.2.0
Added
- Arbitrary filters now available on all search endpoints
- Search combines full-text and semantic using reciprocal rank fusion
- Webhook support (currently only supports queue_empty and test events, more to come)
- Small test harness and custom test format for evaluating Honcho output quality
- Added MCP server and documentation for it
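The reciprocal rank fusion used to combine full-text and semantic results can be sketched with the standard RRF formula, scoring each document as the sum of 1/(k + rank) over the lists it appears in (k = 60 is the conventional constant; Honcho's exact parameters may differ):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked result lists into one, rewarding documents
    that rank highly in several lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fulltext = ["m3", "m1", "m7"]   # e.g. Postgres full-text ranking
semantic = ["m1", "m5", "m3"]   # e.g. embedding-similarity ranking
fused = reciprocal_rank_fusion([fulltext, semantic])
# "m1" wins: it places near the top of both lists
```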
Changed
- Search has 10 results by default, max 100 results
- Queue structure generalized to handle more event types
- Summarizer now exhaustive by default and tuned for performance
Fixed
- Resolve race condition for peers that leave a session while sending messages
- Added explicit rollback to solve integrity error in queue
- Re-introduced Sentry tracing to deriver
- Better integrity logic in get_or_create API methods
v2.1.2
Fixed
- Summarizer module to ignore empty summaries and pass appropriate one to get_context
- Structured Outputs calls with OpenAI provider to pass strict=True to Pydantic Schema
v2.1.1
Added
- Test harness for custom Honcho evaluations
- Better support for session and peer aware dialectic queries
- Langfuse settings
- Added recent history to dialectic prompt, dynamic based on new context window size setting
Fixed
- Summary queue logic
- Formatting of logs
- Filtering by session
- Peer targeting in queries
Changed
- Made query expansion in dialectic off by default
- Overhauled logging
- Refactor summarization for performance and code clarity
- Refactor queue payloads for clarity
v2.1.0
Added
- File uploads
- Brand new “ROTE” deriver system
- Updated dialectic system
- Local working representations
- Better logging for deriver/dialectic
- Deriver Queue Status no longer has redundant data
Fixed
- Document insertion
- Session-scoped and peer-targeted dialectic queries work now
- Minor bugs
Removed
- Peer-level messages
Changed
- Dialectic chat endpoint takes a single query
- Rearranged configuration values (LLM, Deriver, Dialectic, History->Summary)
v2.0.4
Fixed
- Migration/provision scripts did not have correct database connection arguments, causing timeouts
v2.0.2
Fixed
- Database initialization was misconfigured and led to provision_db script failing: switch to consistent working configuration with transaction pooler
v2.0.1
Added
- Ergonomic SDKs for Python and TypeScript (uses Stainless underneath)
- Deriver Queue Status endpoint
- Complex arbitrary filters on workspace/session/peer/message
- Message embedding table for full semantic search
Changed
- Overhauled documentation
- BasedPyright typing for entire project
- Resource filtering expanded to include logical operators
Fixed
- Various bugs
- Use new config arrangement everywhere
- Remove hardcoded responses
v2.0.0
Added
- Ability to get a peer’s working representation
- Metadata to all data primitives (Workspaces, Peers, Sessions, Messages)
- Internal metadata to store Honcho’s state no longer exposed in API
- Batch message operations and enhanced message querying with token and message count limits
- Search and summary functionalities scoped by workspace, peer, and session
- Session context retrieval with summaries and token allocation
- HNSW Index for Documents Table
- Centralized Configuration via Environment Variables or config.toml file
Changed
- New architecture centered around the concept of a “peer” replaces the former “app”/“user”/“session” paradigm
- Workspaces replace “apps” as top-level namespace
- Peers replace “users”
- Sessions no longer nested beneath peers and no longer limited to a single user-assistant model. A session exists independently of any one peer and peers can be added to and removed from sessions.
- Dialectic API is now part of the Peer, not the Session
- Dialectic API now allows queries to be scoped to a session or “targeted” to a fellow peer
- Database schema migrated to adopt workspace/peer/session naming and structure
- Authentication and JWT scopes updated to workspace/peer/session hierarchy
- Queue processing now works on ‘work units’ instead of sessions
- Message token counting updated with tiktoken integration and fallback heuristic
- Queue and message processing updated to handle sender/target and task types for multi-peer scenarios
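The tiktoken-plus-fallback-heuristic pattern for token counting can be sketched like this (a hypothetical helper; Honcho's exact encoding choice and heuristic may differ):

```python
def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when available, falling back to a
    rough characters-per-token heuristic otherwise."""
    try:
        import tiktoken  # optional dependency; may be absent at runtime
        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # Heuristic: roughly 4 characters per token for English text.
        return max(1, len(text) // 4)
```

The broad `except` is deliberate here: any failure in the optional dependency (missing package, unknown encoding) degrades to the heuristic rather than failing the request.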
Fixed
- Improved error handling and validation for batch message operations and metadata
- Database Sessions to be more atomic to reduce idle in transaction time
Removed
- Metamessages removed in favor of metadata
- Collections and Documents no longer exposed in the API, solely internal
- Obsolete tests for apps, users, collections, documents, and metamessages
v1.1.0
Added
- Normalize resources to remove joins and increase query performance
- Query tracing for debugging
Changed
- `/list` endpoints to not require a request body
- `metamessage_type` to `label` with backwards compatibility
- Database Provisioning to rely on alembic
- Database Session Manager to explicitly rollback transactions before closing the connection
Fixed
- Alembic Migrations to include initial database migrations
- Sentry Middleware to not report Honcho Exceptions
v1.0.0
Added
- JWT based API authentication
- Configurable logging
- Consolidated LLM Inference via `ModelClient` class
- Dynamic logging configurable via environment variables
Changed
- Deriver & Dialectic API to use Hybrid Memory Architecture
- Metamessages are not strictly tied to a message
- Database provisioning is a separate script instead of happening on startup
- Consolidated `session/chat` and `session/chat/stream` endpoints
Previous Releases
For a complete history of all releases, see our GitHub Releases page.
Getting Help
If you encounter issues using the Honcho API or its SDKs:
- Open an issue on GitHub
- Join our Discord community for support