repo-guide
The repo-guide repository contains a tool that uses AI to generate guides for code repositories. It leverages large language models (LLMs) to analyze code and directory structures and produce documentation in Markdown format, suitable for use with tools like MkDocs.
Dependencies:
- bleach and bleach-allowlist: Used for sanitizing HTML content, preventing potential security vulnerabilities when processing code snippets.
- click: Used for building command-line interfaces.
- gitpython: Used for interacting with Git repositories, retrieving file contents, and respecting .gitignore rules.
- llm and llm-gemini: Used to interface with LLMs, such as Gemini, for generating documentation from code. The llm library provides a general interface for interacting with LLMs, while llm-gemini provides specific support for Google's Gemini models.
- mkdocs and mkdocs-material: Used for generating static HTML documentation from Markdown files. mkdocs-material provides a specific theme to use with MkDocs.
- tqdm: Used for displaying progress bars during documentation generation.
- tiktoken: Used for counting tokens in text, which helps in managing LLM costs.
- magika: Used to detect file types, helping to skip binary files. (Optional dependency)
- pytest: Used for running tests. (Development dependency)
These dependencies are managed by pyproject.toml, which specifies the project's metadata, dependencies, and build configuration. The file also defines optional dependencies for magika, development, and pre-commit hooks.
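Because magika is optional, binary-file detection has to work whether or not it is installed. The following is a minimal sketch of that pattern, not the tool's actual code: `HAS_MAGIKA` and `looks_binary` are hypothetical names, and the NUL-byte and UTF-8 check is an assumed fallback heuristic used here in place of magika's own detection.

```python
import codecs
import importlib.util

# True when the optional magika package is importable in this environment.
HAS_MAGIKA = importlib.util.find_spec("magika") is not None


def looks_binary(data: bytes, sample_size: int = 1024) -> bool:
    """Fallback heuristic (hypothetical helper, not repo-guide's code).

    Treats content as binary when the leading sample contains a NUL byte
    or is not valid UTF-8.
    """
    sample = data[:sample_size]
    if b"\x00" in sample:
        return True
    # An incremental decoder tolerates a multi-byte character that was cut
    # off at the end of the sample, but still rejects invalid bytes.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

A caller could consult `HAS_MAGIKA` first and use `looks_binary` only when the package is missing.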
Key Subdirectories:
- src: Contains the source code for the repo-guide tool.
  - src/repo_guide: Holds the core logic, including the command-line interface (cli.py) and the DocGenerator class responsible for crawling directories, constructing prompts for the LLM, and generating documentation. The entry point is src/repo_guide/__main__.py, which calls the cli function defined in src/repo_guide/cli.py. The cli.py file defines the command-line interface using the click library.
- tests: Contains tests for validating the behavior of the repo-guide tool. The most interesting file in this directory is tests/test_repo_guide.py, which includes tests for prompt construction and end-to-end documentation generation.
- .github: Contains configuration files for GitHub Actions workflows.
  - .github/workflows: Contains YAML files defining automated workflows for testing, publishing the package to PyPI, and deploying the generated documentation to GitHub Pages. The key files include publish_site.yml, test.yml, and publish.yml.
Key Files:
- README.md: Provides an overview of the repo-guide tool, including installation instructions, usage examples, and troubleshooting tips.
- pyproject.toml: Specifies the project's metadata, dependencies, and build configuration. It uses the Hatch build system. It also defines development dependencies like pytest.
- .python-version: Specifies the Python version to use for the project (3.11).
- .gitignore: Specifies intentionally untracked files that Git should ignore.
- .pre-commit-config.yaml: Configures pre-commit hooks for code formatting and linting using Ruff.
- LICENSE: Contains the Apache 2.0 license.
- .gitattributes: Configures how Git handles line endings.
How it works:
The repo-guide tool takes a path to a Git repository (or a subdirectory within it) as input. It then uses the DocGenerator class to:
- Crawl the directory structure: It recursively traverses the input directory, identifying files and subdirectories.
- Filter files: It excludes files based on .gitignore rules, binary file detection (using magika if installed), and user-defined ignore patterns.
- Construct prompts: For each directory, it creates an XML prompt containing the directory's relative path, a list of its subdirectories with links to their generated README files, and the contents of the files within that directory. This prompt is carefully structured to provide the LLM with the necessary context to generate a meaningful description.
- Generate documentation: It sends the prompt to the LLM (e.g., Gemini) and receives a generated description in Markdown format.
- Write documentation: It writes the generated Markdown to a file named README.md within the corresponding directory in the output directory.
- Create MkDocs configuration: It generates a basic mkdocs.yml file to configure MkDocs for serving or building the generated documentation.
- Serve or deploy (optional): It can optionally start a local MkDocs server to preview the documentation or build and deploy the documentation to GitHub Pages.
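The crawl, filter, and prompt-construction steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the DocGenerator implementation: the function names (`is_ignored`, `build_prompt`, `crawl`) and the XML tag and attribute names are hypothetical, `fnmatch` patterns stand in for real .gitignore handling, and the LLM call itself is left out.

```python
import fnmatch
import os
from xml.sax.saxutils import escape, quoteattr


def is_ignored(name, patterns):
    """Return True when a file or directory name matches an ignore pattern."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def build_prompt(root, directory, patterns=()):
    """Build an XML prompt for one directory (tag names are assumptions)."""
    rel = os.path.relpath(directory, root)
    parts = [f"<directory path={quoteattr(rel)}>"]
    with os.scandir(directory) as it:
        entries = sorted(it, key=lambda entry: entry.name)
    for entry in entries:
        if is_ignored(entry.name, patterns):
            continue
        if entry.is_dir():
            # Link each subdirectory to the README that will be generated for it.
            link = f"{entry.name}/README.md"
            parts.append(
                f"  <subdirectory name={quoteattr(entry.name)} "
                f"readme={quoteattr(link)}/>"
            )
        elif entry.is_file():
            try:
                with open(entry.path, encoding="utf-8") as fh:
                    text = fh.read()
            except UnicodeDecodeError:
                continue  # not valid UTF-8: treat as binary and skip
            parts.append(
                f"  <file name={quoteattr(entry.name)}>{escape(text)}</file>"
            )
    parts.append("</directory>")
    return "\n".join(parts)


def crawl(root, patterns=()):
    """Yield (directory, prompt) pairs for root and each non-ignored subdirectory."""
    for dirpath, dirnames, _ in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not is_ignored(d, patterns)]
        yield dirpath, build_prompt(root, dirpath, patterns)
```

Each yielded prompt would then be sent to the model, and the returned Markdown written to a README.md in the matching output directory.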
The GitHub Actions workflows automate the testing, publishing, and deployment processes, ensuring that the tool is continuously tested and that the generated documentation is kept up-to-date on GitHub Pages.
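The MkDocs configuration step listed above can be illustrated with a small writer function. This is a sketch assuming only two settings, `site_name` and the `material` theme provided by mkdocs-material; `write_mkdocs_config` is a hypothetical helper, and the file the tool actually generates may contain more.

```python
import json
from pathlib import Path


def write_mkdocs_config(output_dir, site_name):
    """Write a minimal mkdocs.yml and return its path (hypothetical helper)."""
    # json.dumps yields a double-quoted string that is also a valid YAML scalar,
    # so quotes or colons in the site name cannot break the file.
    config = (
        f"site_name: {json.dumps(site_name)}\n"
        "theme:\n"
        "  name: material\n"
    )
    path = Path(output_dir) / "mkdocs.yml"
    path.write_text(config, encoding="utf-8")
    return path
```

With such a file in place, `mkdocs serve` previews the site locally and `mkdocs gh-deploy` publishes it to GitHub Pages.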