Query Workflow

Detailed Query Workflow

  1. Query Reception:
     - Queries are broadcast via pub/sub channels to all relevant nodes.
     - Each node checks whether it holds the vector slice that matches the query and responds only if it does.

  2. Vector Search and Model Inference:
     - Nodes run vector similarity searches on Vulkan-accelerated GPUs using the ggml library.
     - MNN models are loaded dynamically based on the type of query.

  3. Response Generation and Aggregation:
     - Partial responses from each node are combined into a final response by a designated aggregator node or function.
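
The node-side steps above can be sketched in plain Python. This is a minimal in-process illustration, not the project's implementation: the `Node` class, the `slice_id` and `type` fields on the query, and the `load_model` helper are all hypothetical names, the pub/sub channel is replaced by a simple loop, and a CPU cosine similarity stands in for the Vulkan/ggml search. `load_model` only returns a placeholder string where a real node would load an MNN model.

```python
import math
from dataclasses import dataclass, field
from functools import lru_cache


@lru_cache(maxsize=None)
def load_model(query_type: str) -> str:
    """Hypothetical stand-in for loading an MNN model on first use (cached after that)."""
    return f"model-for-{query_type}"


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Node:
    name: str
    slice_id: str                                # assumed: one vector slice per node
    vectors: dict = field(default_factory=dict)  # document id -> embedding

    def handle(self, query: dict, k: int = 2):
        # Query reception: answer only if this node holds the requested slice.
        if query["slice_id"] != self.slice_id:
            return None
        # Model selection: pick the model by query type, loading it on first use.
        model = load_model(query["type"])
        # Vector search: score every stored vector and keep the k best.
        scored = [(cosine(query["embedding"], vec), doc)
                  for doc, vec in self.vectors.items()]
        scored.sort(reverse=True)
        return {"node": self.name, "model": model, "hits": scored[:k]}


def broadcast(query, nodes):
    """Deliver the query to every node and keep only the nodes that answered."""
    replies = (node.handle(query) for node in nodes)
    return [reply for reply in replies if reply is not None]


nodes = [
    Node("node1", "slice-a", {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0]}),
    Node("node2", "slice-b", {"doc3": [1.0, 1.0]}),
]
query = {"slice_id": "slice-a", "type": "text", "embedding": [1.0, 0.0]}
partials = broadcast(query, nodes)
```

Here only `node1` replies, because the query names `slice-a`. How a real node decides that it holds the matching slice is not specified in this document, so the equality check on `slice_id` is purely illustrative.
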
sequenceDiagram
    User ->> PubSub: Sends Query
    PubSub ->> Node1: Broadcast Query
    PubSub ->> Node2: Broadcast Query
    Node1 ->> Node1: Search Vectors (Vulkan + ggml)
    Node2 ->> Node2: Inference (MNN Model)
    Node1 -->> Aggregator: Partial Response 1
    Node2 -->> Aggregator: Partial Response 2
    Aggregator ->> User: Final Response
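
The last hop in the diagram, where the Aggregator turns partial responses into a final response, might look like the sketch below. It assumes each partial response is a dictionary with a `node` name and a list of `(score, document_id)` hits; that format, the `aggregate` function, and the rule "keep the best score per document, then take the global top k" are assumptions made for illustration, since the document does not define how partial responses are combined.

```python
import heapq


def aggregate(partials, k=3):
    """Merge per-node hit lists into one global top-k, keeping the best score per document."""
    best = {}
    for partial in partials:
        for score, doc in partial["hits"]:
            # The same document may be reported by several nodes; keep its highest score.
            if doc not in best or score > best[doc]:
                best[doc] = score
    top = heapq.nlargest(k, best.items(), key=lambda item: item[1])
    return {
        "hits": top,                                     # list of (document_id, score)
        "sources": sorted(p["node"] for p in partials),  # which nodes contributed
    }


partial_1 = {"node": "Node1", "hits": [(0.92, "doc1"), (0.40, "doc2")]}
partial_2 = {"node": "Node2", "hits": [(0.75, "doc3"), (0.95, "doc2")]}
final = aggregate([partial_1, partial_2], k=2)
print(final["hits"])  # [('doc2', 0.95), ('doc1', 0.92)]
```

Merging by best score keeps the aggregator stateless and independent of the order in which partial responses arrive, which suits a fan-out over pub/sub where that order is not guaranteed.
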