KV cache management #7621

@iseeyuan

Description

Although the KV cache is enabled, it is not modularized.

  • The architecture should be unified, or at least configurable, across different backends.
  • APIs should be provided for higher-level use cases, such as different eviction schemes.
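
To make the second point concrete, here is a rough sketch of what a modular cache with a swappable eviction scheme could look like. It is plain Python for illustration only and is not the existing extensions/llm/ code. The names `EvictionPolicy`, `SlidingWindow`, `KeepSinks`, and `KVCache` are all hypothetical, and keys and values are stored as opaque objects rather than backend tensors.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class EvictionPolicy(ABC):
    """Hypothetical interface: picks which cached entry to drop when full."""

    @abstractmethod
    def victim(self, positions: List[int]) -> int:
        """Return the index into `positions` of the entry to evict."""


class SlidingWindow(EvictionPolicy):
    """Always evict the oldest cached entry."""

    def victim(self, positions: List[int]) -> int:
        return 0


class KeepSinks(EvictionPolicy):
    """Keep the first `num_sinks` entries, evict the oldest of the rest."""

    def __init__(self, num_sinks: int) -> None:
        self.num_sinks = num_sinks

    def victim(self, positions: List[int]) -> int:
        # Clamp so a victim always exists even if everything is a sink.
        return min(self.num_sinks, len(positions) - 1)


class KVCache:
    """Fixed-capacity cache; the eviction scheme is injected, not hard-coded."""

    def __init__(self, capacity: int, policy: EvictionPolicy) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.policy = policy
        self.positions: List[int] = []             # token positions, in insertion order
        self.entries: List[Tuple[Any, Any]] = []   # (key, value) per cached position

    def append(self, pos: int, key: Any, value: Any) -> None:
        # Evict one entry first if the cache is already full.
        if len(self.positions) >= self.capacity:
            i = self.policy.victim(self.positions)
            del self.positions[i]
            del self.entries[i]
        self.positions.append(pos)
        self.entries.append((key, value))
```

With this shape, a higher-level caller chooses the scheme at construction time, for example `KVCache(capacity=3, policy=KeepSinks(num_sinks=1))`, and a backend-specific cache could subclass `KVCache` or replace the list storage without touching the policies.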

Metadata

    Labels

    module: llm (Issues related to LLM examples and apps, and to the extensions/llm/ code)

    Projects

    Status

    Done
