    Repo Status Progression

    What are the different statuses that a repo can be in, and how does it progress through them (e.g., NEW, IN_INDEX_QUEUE, etc.)?

    Answer

    A repository's indexing status is managed by the RepoIndexingStatus enum, defined in packages/db/prisma/schema.prisma schema.prisma:13-22. The status indicates the current state of a repository within the indexing and garbage collection lifecycle.

    The possible statuses are:

    • NEW
    • IN_INDEX_QUEUE
    • INDEXING
    • INDEXED
    • FAILED
    • IN_GC_QUEUE
    • GARBAGE_COLLECTING
    • GARBAGE_COLLECTION_FAILED
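
As a minimal sketch, the eight statuses listed above can be mirrored in TypeScript as a string-literal union. The authoritative definition is the Prisma enum in schema.prisma; the `REPO_INDEXING_STATUSES` array and the `isRepoIndexingStatus` helper below are hypothetical names introduced only for illustration.

```typescript
// Illustrative mirror of the statuses listed in the answer.
// The real definition is the Prisma enum RepoIndexingStatus in schema.prisma.
const REPO_INDEXING_STATUSES = [
  "NEW",
  "IN_INDEX_QUEUE",
  "INDEXING",
  "INDEXED",
  "FAILED",
  "IN_GC_QUEUE",
  "GARBAGE_COLLECTING",
  "GARBAGE_COLLECTION_FAILED",
] as const;

type RepoIndexingStatus = (typeof REPO_INDEXING_STATUSES)[number];

// Hypothetical helper: narrows an arbitrary string to a known status.
function isRepoIndexingStatus(value: string): value is RepoIndexingStatus {
  return (REPO_INDEXING_STATUSES as readonly string[]).indexOf(value) !== -1;
}
```

A check such as `isRepoIndexingStatus("DELETED")` returns `false`, since no such value appears in the list.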

    The progression through these statuses is handled by the RepoManager class in packages/backend/src/repoManager.ts repoManager.ts:35-622.

    Indexing Flow

    1. NEW: A repository starts with this status. The fetchAndScheduleRepoIndexing function repoManager.ts:131-170 periodically queries for repositories with this status.

    2. IN_INDEX_QUEUE: Repositories with a NEW status are moved to IN_INDEX_QUEUE by the scheduleRepoIndexingBulk function repoManager.ts:85-128, which also adds them to the indexing queue.

    3. INDEXING: The runIndexJob function repoManager.ts:292-345, executed by a worker, picks up a job from the queue and updates the repository's status to INDEXING repoManager.ts:312-319.

    4. INDEXED: Upon successful indexing, the onIndexJobCompleted function repoManager.ts:347-361 is triggered, setting the status to INDEXED repoManager.ts:352-360.

    5. FAILED: This is the alternative outcome to INDEXED. If indexing fails, the onIndexJobFailed function repoManager.ts:363-386 updates the status to FAILED repoManager.ts:377-384. A repository can also be marked as FAILED if it remains in the INDEXING state for too long, as determined by fetchAndScheduleRepoTimeouts repoManager.ts:591-605.
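
The indexing flow above can be sketched as a small transition table keyed by the RepoManager function that triggers each change. This is an illustration under stated assumptions, not the code in repoManager.ts: it models only the transitions described in this answer, it assumes failures are recorded from the INDEXING state, and `INDEX_TRANSITIONS` and `nextIndexStatus` are hypothetical names.

```typescript
type IndexStatus = "NEW" | "IN_INDEX_QUEUE" | "INDEXING" | "INDEXED" | "FAILED";

// Event names reuse the RepoManager function names mentioned in the answer.
type IndexEvent =
  | "scheduleRepoIndexingBulk"
  | "runIndexJob"
  | "onIndexJobCompleted"
  | "onIndexJobFailed"
  | "fetchAndScheduleRepoTimeouts";

// Only the transitions described in the answer; the real code may allow others.
// Assumption: both failure paths start from INDEXING.
const INDEX_TRANSITIONS: Record<IndexEvent, { from: IndexStatus; to: IndexStatus }> = {
  scheduleRepoIndexingBulk: { from: "NEW", to: "IN_INDEX_QUEUE" },
  runIndexJob: { from: "IN_INDEX_QUEUE", to: "INDEXING" },
  onIndexJobCompleted: { from: "INDEXING", to: "INDEXED" },
  onIndexJobFailed: { from: "INDEXING", to: "FAILED" },
  fetchAndScheduleRepoTimeouts: { from: "INDEXING", to: "FAILED" },
};

// Hypothetical helper: returns the next status, or null if the event
// does not apply to the current status in this simplified model.
function nextIndexStatus(current: IndexStatus, event: IndexEvent): IndexStatus | null {
  const transition = INDEX_TRANSITIONS[event];
  return transition.from === current ? transition.to : null;
}
```

In this model, `nextIndexStatus("NEW", "scheduleRepoIndexingBulk")` yields `"IN_INDEX_QUEUE"`, while an out-of-order event such as `nextIndexStatus("NEW", "runIndexJob")` yields `null`.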

    Garbage Collection Flow

    1. IN_GC_QUEUE: Repositories that are no longer needed (e.g., have no connections or belong to an inactive organization) are identified by fetchAndScheduleRepoGarbageCollection repoManager.ts:412-467. This function then calls scheduleRepoGarbageCollectionBulk repoManager.ts:392-410 to change their status to IN_GC_QUEUE.

    2. GARBAGE_COLLECTING: A garbage collection worker executes runGarbageCollectionJob repoManager.ts:469-499, which updates the status to GARBAGE_COLLECTING repoManager.ts:474-481.

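    3. DELETED (not an enum value): If garbage collection is successful, the onGarbageCollectionJobCompleted function repoManager.ts:500-510 is called, which then deletes the repository from the database repoManager.ts:505-509. There is no DELETED status in RepoIndexingStatus; the repository record is removed rather than updated.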

    4. GARBAGE_COLLECTION_FAILED: If garbage collection fails, onGarbageCollectionJobFailed repoManager.ts:512-535 is called, and the repository status is updated to GARBAGE_COLLECTION_FAILED repoManager.ts:526-533.
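
The garbage collection flow above differs from the indexing flow in that its successful end state is a deleted record rather than a status. A rough sketch of that distinction, assuming a simplified model in which each RepoManager function maps to exactly one outcome (`GcOutcome` and `applyGcEvent` are hypothetical names, not part of repoManager.ts):

```typescript
type GcStatus = "IN_GC_QUEUE" | "GARBAGE_COLLECTING" | "GARBAGE_COLLECTION_FAILED";

// Either the repository gets a new status, or its row is removed entirely.
type GcOutcome =
  | { kind: "status"; status: GcStatus }
  | { kind: "deleted" }; // no DELETED value exists in the status list

// Event names reuse the RepoManager function names mentioned in the answer.
type GcEvent =
  | "scheduleRepoGarbageCollectionBulk"
  | "runGarbageCollectionJob"
  | "onGarbageCollectionJobCompleted"
  | "onGarbageCollectionJobFailed";

// Hypothetical helper: maps each event to the outcome described in the answer.
function applyGcEvent(event: GcEvent): GcOutcome {
  switch (event) {
    case "scheduleRepoGarbageCollectionBulk":
      return { kind: "status", status: "IN_GC_QUEUE" };
    case "runGarbageCollectionJob":
      return { kind: "status", status: "GARBAGE_COLLECTING" };
    case "onGarbageCollectionJobCompleted":
      return { kind: "deleted" };
    case "onGarbageCollectionJobFailed":
      return { kind: "status", status: "GARBAGE_COLLECTION_FAILED" };
  }
}
```

Modeling deletion as a separate `kind` keeps callers from treating it as just another status value.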

    schema.prisma
    repoManager.ts