Tags: TypeScript, advanced, AI

Semantic Code Search Engine

A semantic search engine for source code: it indexes a Git repository with code embeddings, accepts queries that describe intent, and returns the most relevant functions.

5 steps

Project steps

  1. AST parsing

    tree-sitter extracts functions and classes from JS/TS/Python source, along with their docstrings and signatures.

  2. Embedding generation

    Each function is embedded with text-embedding-3-small; requests are sent in batches.

  3. pgvector storage

    PostgreSQL with the pgvector extension; an HNSW index serves approximate nearest-neighbor search.

  4. Search API

    POST /search {query, language?, limit} → relevant functions, each with a similarity score.

  5. Incremental indexing

    Detects modified files via git diff and re-indexes only those.
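
Steps 3 and 4 can be sketched together as a query builder. This is a minimal illustration, not the project's actual code: the `functions` table, its columns, the `buildSearchQuery` helper, and the default and maximum limits are all assumptions made for the example. It relies on pgvector's `<=>` cosine-distance operator, which an HNSW index created with `vector_cosine_ops` can serve, and reports `1 - distance` as the score.

```typescript
// Body of POST /search as described in step 4.
interface SearchRequest {
  query: string;
  language?: string;
  limit?: number;
}

interface SqlQuery {
  text: string;
  values: unknown[];
}

// Assumed bounds; the project page does not specify them.
const DEFAULT_LIMIT = 10;
const MAX_LIMIT = 50;

// Builds a parameterized nearest-neighbor query.
// Hypothetical schema: functions(name, file_path, language, embedding vector).
function buildSearchQuery(req: SearchRequest, queryEmbedding: number[]): SqlQuery {
  if (req.query.trim() === "") {
    throw new Error("query must not be empty");
  }
  const requested =
    typeof req.limit === "number" && Number.isInteger(req.limit)
      ? req.limit
      : DEFAULT_LIMIT;
  const limit = Math.min(Math.max(requested, 1), MAX_LIMIT);

  // pgvector accepts a vector as a text literal of the form '[1,2,3]'.
  const values: unknown[] = [`[${queryEmbedding.join(",")}]`];

  let where = "";
  if (req.language !== undefined) {
    values.push(req.language);
    where = `WHERE language = $${values.length}`;
  }
  values.push(limit);

  const text = [
    // <=> is cosine distance, so 1 - distance is cosine similarity.
    "SELECT name, file_path, language, 1 - (embedding <=> $1::vector) AS score",
    "FROM functions",
    where,
    "ORDER BY embedding <=> $1::vector",
    `LIMIT $${values.length}`,
  ]
    .filter((line) => line !== "")
    .join("\n");

  return { text, values };
}
```

The returned `text` and `values` could be passed to a PostgreSQL client as a parameterized query. The query embedding would need to come from the same embedding model used at indexing time.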

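Step 5 can be sketched as a pure function over the output of `git diff --name-status <old>..<new>`, which prints one tab-separated line per changed file (a status such as `M`, `A`, `D`, or `R100`, followed by the path or paths). The `planReindex` name, the `IndexPlan` shape, and the list of indexed extensions are assumptions for illustration, not part of the project description.

```typescript
interface IndexPlan {
  reindex: string[]; // files whose functions need to be parsed and embedded again
  remove: string[];  // files whose stored rows should be deleted
}

// Assumed extensions for the JS/TS/Python sources named in step 1.
const INDEXED_EXTENSIONS = [".js", ".jsx", ".ts", ".tsx", ".py"];

function isIndexed(path: string): boolean {
  return INDEXED_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Turns `git diff --name-status` output into a list of files to re-index
// and a list of files to drop from the index.
function planReindex(nameStatusOutput: string): IndexPlan {
  const reindex = new Set<string>();
  const remove = new Set<string>();

  for (const line of nameStatusOutput.split(/\r?\n/)) {
    const fields = line.split("\t");
    if (fields.length < 2) continue; // blank or unrecognized line

    const kind = fields[0].charAt(0);          // "R100" -> "R"
    const oldPath = fields[1];
    const newPath = fields[fields.length - 1]; // equals oldPath unless rename/copy

    // Deleted files and the old side of a rename leave the index.
    if ((kind === "D" || kind === "R") && isIndexed(oldPath)) {
      remove.add(oldPath);
    }
    // Everything that still exists after the change is re-indexed.
    if (kind !== "D" && isIndexed(newPath)) {
      reindex.add(newPath);
    }
  }

  return { reindex: Array.from(reindex), remove: Array.from(remove) };
}
```

Keeping the parsing separate from the Git invocation means the plan can be checked against sample diff output without a repository on disk.
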
Ready to build this?

Fork the repo on GitHub and start building. A mentor will review your code when you open a PR.

Tech stack

TypeScript, tree-sitter, OpenAI Embeddings, pgvector, Fastify