7 Commits

Author SHA1 Message Date
gitea-actions[bot]
41ed6e747b bump version to v0.7.0 2025-10-02 02:15:03 +00:00
0f837495fc fix formatting
All checks were successful
Security Scan / dependency-check (push) Successful in 38s
Security Scan / security (push) Successful in 42s
Test Suite / lint (push) Successful in 31s
Test Suite / test (3.11) (push) Successful in 1m28s
Test Suite / build (push) Successful in 39s
2025-10-01 19:08:59 -07:00
d66a20ddda rework server startup and cli
Some checks failed
Security Scan / dependency-check (push) Successful in 43s
Security Scan / security (push) Successful in 47s
Test Suite / lint (push) Failing after 29s
Test Suite / test (3.11) (push) Successful in 1m28s
Test Suite / build (push) Has been skipped
This changes the dockerfile as well.
2025-10-01 19:04:27 -07:00
gitea-actions[bot]
0d4145df06 bump version to v0.6.4 2025-10-01 14:54:17 +00:00
dfcfe4fd7c update release process and README
All checks were successful
Test Suite / lint (push) Successful in 30s
Security Scan / security (push) Successful in 36s
Security Scan / dependency-check (push) Successful in 29s
Test Suite / test (3.11) (push) Successful in 1m37s
Test Suite / build (push) Successful in 36s
2025-10-01 07:38:56 -07:00
314151e525 bump version to v0.6.3, first pypi release
Some checks failed
Security Scan / dependency-check (push) Failing after 38s
Security Scan / security (push) Successful in 46s
Test Suite / lint (push) Failing after 37s
Release / test (push) Failing after 22s
Release / build-and-release (push) Has been skipped
Test Suite / test (3.11) (push) Successful in 1m36s
Test Suite / build (push) Has been skipped
2025-10-01 06:16:06 -07:00
a93556132b add workflow dispatch - v0.6.2
All checks were successful
Test Suite / lint (push) Successful in 31s
Test Suite / build (push) Successful in 50s
Test Suite / test (3.11) (push) Successful in 1m32s
Release / test (push) Successful in 1m3s
Release / build-and-release (push) Successful in 35s
Security Scan / security (push) Successful in 49s
Security Scan / dependency-check (push) Successful in 52s
2025-09-20 10:04:18 -07:00
17 changed files with 377 additions and 176 deletions

@@ -0,0 +1,52 @@
name: Bump Version and Release

on:
  workflow_dispatch:
    inputs:
      bump_type:
        description: 'Version bump type'
        required: true
        type: choice
        options:
          - patch
          - minor
          - major

jobs:
  bump-and-release:
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITEA_TOKEN }}
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Bump version
        id: bump
        run: |
          python bump_version.py ${{ github.event.inputs.bump_type }}
          NEW_VERSION=$(grep -oP 'version = "\K[^"]+' pyproject.toml)
          echo "version=$NEW_VERSION" >> $GITHUB_OUTPUT
          echo "tag=v$NEW_VERSION" >> $GITHUB_OUTPUT

      - name: Commit and tag
        run: |
          git config user.name "gitea-actions[bot]"
          git config user.email "gitea-actions[bot]@users.noreply.gitea.io"
          git add pyproject.toml
          git commit -m "bump version to v${{ steps.bump.outputs.version }}"
          git tag v${{ steps.bump.outputs.version }}

      - name: Push changes
        run: |
          git push origin main
          git push origin v${{ steps.bump.outputs.version }}
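
The workflow above depends on `bump_version.py`, which is not shown in this diff. As a rough sketch of what such a script might do (the names `read_version` and `bump` are hypothetical, and the regex simply mirrors the `grep -oP` expression used in the workflow), a semantic-version bump could look like this:

```python
import re

# Mirrors the workflow's grep pattern: version = "X.Y.Z"
VERSION_RE = re.compile(r'version = "([^"]+)"')


def read_version(pyproject_text: str) -> str:
    """Return the first version string found in pyproject.toml text."""
    match = VERSION_RE.search(pyproject_text)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def bump(version: str, bump_type: str) -> str:
    """Bump a MAJOR.MINOR.PATCH version; lower components reset to zero."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump_type == "major":
        return f"{major + 1}.0.0"
    if bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {bump_type}")
```

The three branches correspond to the `patch`, `minor`, and `major` choices offered by the `bump_type` input.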

@@ -66,8 +66,8 @@ jobs:
echo "## Installation" >> release-notes.md
echo "" >> release-notes.md
echo '```bash' >> release-notes.md
echo 'uv sync' >> release-notes.md
echo 'uv run python main.py' >> release-notes.md
echo 'pip install embeddingbuddy' >> release-notes.md
echo 'embeddingbuddy serve' >> release-notes.md
echo '```' >> release-notes.md
- name: Create Release

@@ -4,6 +4,7 @@ on:
push:
tags:
- 'v[0-9]+.[0-9]+.[0-9]+'
workflow_dispatch:
env:
REGISTRY: ghcr.io

.github/workflows/pypi-release.yml (new file, 33 lines)

@@ -0,0 +1,33 @@
name: PyPI Release

on:
  push:
    tags:
      - 'v[0-9]+.[0-9]+.[0-9]+'
  workflow_dispatch:

jobs:
  pypi-publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write  # For trusted publishing

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Build package
        run: |
          uv build

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

@@ -21,29 +21,23 @@ uv sync
**Run the application:**
Development mode (with auto-reload):
Using the CLI (recommended):
```bash
uv run run_dev.py
# Production mode (no debug, no auto-reload)
embeddingbuddy serve
# Development mode (debug + auto-reload on code changes)
embeddingbuddy serve --dev
# Debug logging only (no auto-reload)
embeddingbuddy serve --debug
# With custom host/port
embeddingbuddy serve --host 0.0.0.0 --port 8080
```
Production mode (with Gunicorn WSGI server):
```bash
# First install production dependencies
uv sync --extra prod
# Then run in production mode
uv run run_prod.py
```
Legacy mode (basic Dash server):
```bash
uv run main.py
```
The app will be available at <http://127.0.0.1:8050>
The app will be available at <http://127.0.0.1:8050> by default
**Run tests:**
@@ -204,6 +198,52 @@ Uses modern Python stack with uv for dependency management:
- **Testing:** pytest for test framework
- **Dev Tools:** uv for package management
## CI/CD and Release Management
### Repository Setup
This project uses a **dual-repository workflow**:
- **Primary repository:** Gitea instance at `git.hawt.cloud` (read-write)
- **Mirror repository:** GitHub (read-only mirror)
### Workflow Organization
**Gitea Workflows (`.gitea/workflows/`):**
- **`bump-and-release.yml`** - Manual version bumping workflow
- Runs `bump_version.py` to update version in `pyproject.toml`
- Commits changes and creates git tag
- Pushes to Gitea (main branch + tag)
- Triggered manually via workflow_dispatch with choice of patch/minor/major bump
- **`release.yml`** - Automated release creation
- Triggered when version tags are pushed
- Runs tests, builds packages
- Creates Gitea release with artifacts
- **`test.yml`** - Test suite execution
- **`security.yml`** - Security scanning
**GitHub Workflows (`.github/workflows/`):**
- **`docker-release.yml`** - Builds and publishes Docker images
- **`pypi-release.yml`** - Publishes packages to PyPI
- These workflows are read-only (no git commits/pushes) and create artifacts only
### Release Process
1. Run manual bump workflow on Gitea: **Actions → Bump Version and Release**
2. Select version bump type (patch/minor/major)
3. Workflow commits version change and pushes tag to Gitea
4. Tag push triggers `release.yml` on Gitea (creates release)
5. GitHub mirror receives tag and triggers artifact builds (Docker, PyPI)
### Version Management
Use `bump_version.py` for version updates:
```bash
python bump_version.py patch # 0.3.0 -> 0.3.1
python bump_version.py minor # 0.3.0 -> 0.4.0
python bump_version.py major # 0.3.0 -> 1.0.0
```
## Development Guidelines
**When adding new features:**

@@ -23,9 +23,6 @@ COPY pyproject.toml uv.lock ./
# Copy source code (needed for editable install)
COPY src/ src/
COPY main.py .
COPY wsgi.py .
COPY run_prod.py .
COPY assets/ assets/
# Change ownership of source files before building (lighter I/O)
@@ -59,10 +56,7 @@ RUN chown appuser:appuser /app
# Copy files from builder with correct ownership
COPY --from=builder --chown=appuser:appuser /app/.venv /app/.venv
COPY --from=builder --chown=appuser:appuser /app/src /app/src
COPY --from=builder --chown=appuser:appuser /app/main.py /app/main.py
COPY --from=builder --chown=appuser:appuser /app/assets /app/assets
COPY --from=builder --chown=appuser:appuser /app/wsgi.py /app/wsgi.py
COPY --from=builder --chown=appuser:appuser /app/run_prod.py /app/run_prod.py
# Switch to non-root user
USER appuser
@@ -86,5 +80,5 @@ EXPOSE 8050
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
CMD python -c "import requests; requests.get('http://localhost:8050/', timeout=5)" || exit 1
# Run application with Gunicorn in production
CMD ["python", "run_prod.py"]
# Run application in production mode (no debug, no auto-reload)
CMD ["embeddingbuddy", "serve"]
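
The `HEALTHCHECK` above calls `requests.get` without inspecting the response, so it fails only when the request raises (connection refused, timeout) and not when the server answers with an error status such as 500. A stricter probe, sketched here with only the standard library (the name `is_healthy` and the injectable `opener` parameter are illustrative and not part of the project), might look like this:

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True only if the URL answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection errors, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False
```

The `opener` argument exists only so the function can be exercised without a running server.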

@@ -28,6 +28,57 @@ documents and prompts to understand how queries relate to your content.
- **Sidebar layout** with controls on left, large visualization area on right
- **Real-time visualization** optimized for small to medium datasets
## Quick Start
### Installation
**Option 1: Install with uv (recommended)**
```bash
# Install as a CLI tool (no need to clone the repo)
uv tool install embeddingbuddy
# Run the application
embeddingbuddy serve
```
**Option 2: Install with pip/pipx**
```bash
# Install with pipx (isolated environment)
pipx install embeddingbuddy
# Or install with pip
pip install embeddingbuddy
# Run the application
embeddingbuddy serve
```
**Option 3: Run with Docker**
```bash
# Pull and run the Docker image
docker run -p 8050:8050 ghcr.io/godber/embedding-buddy:latest
```
The application will be available at <http://127.0.0.1:8050>
### Using the Application
1. **Open your browser** to <http://127.0.0.1:8050>
2. **Upload your data**:
- Drag and drop an NDJSON file containing embeddings (see Data Format below)
- Optionally upload a second file with prompts to compare against documents
3. **Choose visualization settings**:
- Select dimensionality reduction method (PCA, t-SNE, or UMAP)
- Choose 2D or 3D visualization
- Pick color coding (by category, subcategory, or tags)
4. **Explore**:
- Click points to view full content
- Toggle prompt visibility
- Rotate and zoom 3D plots
## Data Format
EmbeddingBuddy accepts newline-delimited JSON (NDJSON) files for both documents
@@ -73,26 +124,18 @@ uv sync
2. **Run the application:**
**Development mode** (with auto-reload):
```bash
uv run run_dev.py
```
# Production mode (no debug, no auto-reload)
embeddingbuddy serve
**Production mode** (with Gunicorn WSGI server):
# Development mode (debug + auto-reload on code changes)
embeddingbuddy serve --dev
```bash
# Install production dependencies
uv sync --extra prod
# Debug logging only (no auto-reload)
embeddingbuddy serve --debug
# Run in production mode
uv run run_prod.py
```
**Legacy mode** (basic Dash server):
```bash
uv run main.py
# Custom host/port
embeddingbuddy serve --host 0.0.0.0 --port 8080
```
3. **Open your browser** to <http://127.0.0.1:8050>
@@ -180,10 +223,8 @@ src/embeddingbuddy/
│ └── interactions.py # User interaction callbacks
└── utils/ # Utility functions
main.py # Application runner (at project root)
main.py # Application runner (at project root)
run_dev.py # Development server runner
run_prod.py # Production server runner
# CLI entry point
embeddingbuddy serve # Main CLI command to start the server
```
### Testing

main.py (deleted, 10 lines)

@@ -1,10 +0,0 @@
from src.embeddingbuddy.app import create_app, run_app


def main():
    app = create_app()
    run_app(app)


if __name__ == "__main__":
    main()

@@ -1,6 +1,6 @@
[project]
name = "embeddingbuddy"
version = "0.6.1"
version = "0.7.0"
description = "A Python Dash application for interactive exploration and visualization of embedding vectors through dimensionality reduction techniques."
readme = "README.md"
requires-python = ">=3.11"
@@ -17,6 +17,10 @@ dependencies = [
"opensearch-py>=3.0.0",
]
[project.scripts]
embeddingbuddy = "embeddingbuddy.cli:main"
embeddingbuddy-serve = "embeddingbuddy.app:serve"
[project.optional-dependencies]
test = [
"pytest>=8.4.1",

@@ -1,32 +0,0 @@
#!/usr/bin/env python3
"""
Development runner with auto-reload enabled.
This runs the Dash development server with hot reloading.
"""

import os
from src.embeddingbuddy.app import create_app, run_app


def main():
    """Run the application in development mode with auto-reload."""
    # Force development settings
    os.environ["EMBEDDINGBUDDY_ENV"] = "development"
    os.environ["EMBEDDINGBUDDY_DEBUG"] = "true"

    # Check for OpenSearch disable flag (optional for testing)
    # Set EMBEDDINGBUDDY_OPENSEARCH_ENABLED=false to test without OpenSearch
    opensearch_status = os.getenv("EMBEDDINGBUDDY_OPENSEARCH_ENABLED", "true")
    opensearch_enabled = opensearch_status.lower() == "true"

    print("🚀 Starting EmbeddingBuddy in development mode...")
    print("📁 Auto-reload enabled - changes will trigger restart")
    print("🌐 Server will be available at http://127.0.0.1:8050")
    print(f"🔍 OpenSearch: {'Enabled' if opensearch_enabled else 'Disabled'}")
    print("⏹️ Press Ctrl+C to stop")

    app = create_app()

    # Run with development server (includes auto-reload when debug=True)
    run_app(app, debug=True)


if __name__ == "__main__":
    main()

@@ -1,52 +0,0 @@
#!/usr/bin/env python3
"""
Production runner using Gunicorn WSGI server.
This provides better performance and stability for production deployments.
"""

import os
import subprocess
import sys
from src.embeddingbuddy.config.settings import AppSettings


def main():
    """Run the application in production mode with Gunicorn."""
    # Force production settings
    os.environ["EMBEDDINGBUDDY_ENV"] = "production"
    os.environ["EMBEDDINGBUDDY_DEBUG"] = "false"

    # Disable OpenSearch by default in production (can be overridden by setting env var)
    if "EMBEDDINGBUDDY_OPENSEARCH_ENABLED" not in os.environ:
        os.environ["EMBEDDINGBUDDY_OPENSEARCH_ENABLED"] = "false"

    print("🚀 Starting EmbeddingBuddy in production mode...")
    print(f"⚙️ Workers: {AppSettings.GUNICORN_WORKERS}")
    print(f"🌐 Server will be available at http://{AppSettings.GUNICORN_BIND}")
    print("⏹️ Press Ctrl+C to stop")

    # Gunicorn command
    cmd = [
        "gunicorn",
        "--workers", str(AppSettings.GUNICORN_WORKERS),
        "--bind", AppSettings.GUNICORN_BIND,
        "--timeout", str(AppSettings.GUNICORN_TIMEOUT),
        "--keep-alive", str(AppSettings.GUNICORN_KEEPALIVE),
        "--access-logfile", "-",
        "--error-logfile", "-",
        "--log-level", "info",
        "wsgi:application"
    ]

    try:
        subprocess.run(cmd, check=True)
    except KeyboardInterrupt:
        print("\n🛑 Shutting down...")
        sys.exit(0)
    except subprocess.CalledProcessError as e:
        print(f"❌ Error running Gunicorn: {e}")
        sys.exit(1)
    except FileNotFoundError:
        print("❌ Gunicorn not found. Install it with: uv add gunicorn")
        print("💡 Or run in development mode with: python run_dev.py")
        sys.exit(1)


if __name__ == "__main__":
    main()

@@ -1,14 +1,20 @@
import dash
import dash_bootstrap_components as dbc
from .config.settings import AppSettings
from .ui.layout import AppLayout
from .ui.callbacks.data_processing import DataProcessingCallbacks
from .ui.callbacks.visualization import VisualizationCallbacks
from .ui.callbacks.interactions import InteractionCallbacks
"""
EmbeddingBuddy application factory and server functions.
This module contains the main application creation logic with imports
moved inside functions to avoid loading heavy dependencies at module level.
"""
def create_app():
"""Create and configure the Dash application instance."""
import os
import dash
import dash_bootstrap_components as dbc
from .ui.layout import AppLayout
from .ui.callbacks.data_processing import DataProcessingCallbacks
from .ui.callbacks.visualization import VisualizationCallbacks
from .ui.callbacks.interactions import InteractionCallbacks
# Get the project root directory (two levels up from this file)
project_root = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
@@ -124,6 +130,9 @@ def _register_client_side_callbacks(app):
def run_app(app=None, debug=None, host=None, port=None):
"""Run the Dash application with specified settings."""
from .config.settings import AppSettings
if app is None:
app = create_app()
@@ -134,6 +143,68 @@ def run_app(app=None, debug=None, host=None, port=None):
)
if __name__ == "__main__":
def serve(host=None, port=None, dev=False, debug=False):
"""Start the EmbeddingBuddy web server.
Args:
host: Host to bind to (default: 127.0.0.1)
port: Port to bind to (default: 8050)
dev: Development mode - enable debug logging and auto-reload (default: False)
debug: Enable debug logging only, no auto-reload (default: False)
"""
import os
from .config.settings import AppSettings
# Determine actual values to use
actual_host = host if host is not None else AppSettings.HOST
actual_port = port if port is not None else AppSettings.PORT
# Determine mode
# --dev takes precedence and enables both debug and auto-reload
# --debug enables only debug logging
# No flags = production mode (no debug, no auto-reload)
use_reloader = dev
use_debug = dev or debug
# Only print startup messages in main process (not in Flask reloader)
if not os.environ.get("WERKZEUG_RUN_MAIN"):
mode = "development" if dev else ("debug" if debug else "production")
print(f"Starting EmbeddingBuddy in {mode} mode...")
print("Loading dependencies (this may take a few seconds)...")
print(f"Server will start at http://{actual_host}:{actual_port}")
if use_reloader:
print("Auto-reload enabled - server will restart on code changes")
app = create_app()
run_app(app)
# Suppress Flask development server warning in production mode
if not use_debug and not use_reloader:
import warnings
import logging
# Suppress the werkzeug warning
warnings.filterwarnings("ignore", message=".*development server.*")
# Set werkzeug logger to ERROR level to suppress the warning
werkzeug_logger = logging.getLogger("werkzeug")
werkzeug_logger.setLevel(logging.ERROR)
# Use Flask's built-in server with appropriate settings
app.run(
debug=use_debug, host=actual_host, port=actual_port, use_reloader=use_reloader
)
def main():
"""Legacy entry point - redirects to cli module.
This is kept for backward compatibility but the main CLI
is now in embeddingbuddy.cli for faster startup.
"""
from .cli import main as cli_main
cli_main()
if __name__ == "__main__":
main()
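
The flag handling in `serve()` reduces to a small truth table. As a sketch (the `ServerMode` tuple and the `resolve_mode` name are hypothetical and do not appear in the project), the same logic can be isolated like this:

```python
from typing import NamedTuple


class ServerMode(NamedTuple):
    name: str
    debug: bool
    reloader: bool


def resolve_mode(dev: bool = False, debug: bool = False) -> ServerMode:
    """Map the --dev/--debug flags to the settings passed to app.run()."""
    use_reloader = dev        # only --dev enables auto-reload
    use_debug = dev or debug  # either flag enables debug logging
    name = "development" if dev else ("debug" if debug else "production")
    return ServerMode(name, use_debug, use_reloader)
```

With no flags the result is production mode with both debug and the reloader off, and `--dev` wins when both flags are given.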

src/embeddingbuddy/cli.py (new file, 67 lines)

@@ -0,0 +1,67 @@
"""
Lightweight CLI entry point for EmbeddingBuddy.

This module provides a fast command-line interface that only imports
heavy dependencies when actually needed by subcommands.
"""

import argparse
import sys


def main():
    """Main CLI entry point with minimal imports for fast help text."""
    parser = argparse.ArgumentParser(
        prog="embeddingbuddy",
        description="EmbeddingBuddy - Interactive embedding visualization tool",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  embeddingbuddy serve                 # Production mode (no debug, no auto-reload)
  embeddingbuddy serve --dev           # Development mode (debug + auto-reload)
  embeddingbuddy serve --debug         # Debug logging only (no auto-reload)
  embeddingbuddy serve --port 8080     # Custom port
  embeddingbuddy serve --host 0.0.0.0  # Bind to all interfaces
""",
    )

    subparsers = parser.add_subparsers(
        dest="command", help="Available commands", metavar="<command>"
    )

    # Serve subcommand
    serve_parser = subparsers.add_parser(
        "serve",
        help="Start the web server",
        description="Start the EmbeddingBuddy web server for interactive visualization",
    )
    serve_parser.add_argument(
        "--host", default=None, help="Host to bind to (default: 127.0.0.1)"
    )
    serve_parser.add_argument(
        "--port", type=int, default=None, help="Port to bind to (default: 8050)"
    )
    serve_parser.add_argument(
        "--dev",
        action="store_true",
        help="Development mode: enable debug logging and auto-reload",
    )
    serve_parser.add_argument(
        "--debug", action="store_true", help="Enable debug logging (no auto-reload)"
    )

    args = parser.parse_args()

    if args.command == "serve":
        # Only import heavy dependencies when actually running serve
        from embeddingbuddy.app import serve

        serve(host=args.host, port=args.port, dev=args.dev, debug=args.debug)
    else:
        # No command specified, show help
        parser.print_help()
        sys.exit(0)


if __name__ == "__main__":
    main()
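
To illustrate the dispatch in `cli.py`, here is a reduced sketch of the same parser. `build_parser` and `dispatch` are illustrative names, and `dispatch` returns a string instead of importing `embeddingbuddy.app`, so the example runs without the package installed:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same serve options as the CLI above."""
    parser = argparse.ArgumentParser(prog="embeddingbuddy")
    subparsers = parser.add_subparsers(dest="command", metavar="<command>")
    serve = subparsers.add_parser("serve", help="Start the web server")
    serve.add_argument("--host", default=None)
    serve.add_argument("--port", type=int, default=None)
    serve.add_argument("--dev", action="store_true")
    serve.add_argument("--debug", action="store_true")
    return parser


def dispatch(argv: list[str]) -> str:
    """Parse argv and report which branch main() would take."""
    args = build_parser().parse_args(argv)
    if args.command == "serve":
        # The real CLI imports embeddingbuddy.app at this point, so the
        # heavy dependencies load only on this path.
        return f"serve host={args.host} port={args.port} dev={args.dev} debug={args.debug}"
    return "help"
```

Because the subcommand is optional, an empty argument list leaves `args.command` as `None`, which is why the real CLI falls through to `print_help()`.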

@@ -69,7 +69,7 @@ class AppSettings:
TEXT_PREVIEW_LENGTH = 100
# App Configuration
DEBUG = os.getenv("EMBEDDINGBUDDY_DEBUG", "True").lower() == "true"
DEBUG = os.getenv("EMBEDDINGBUDDY_DEBUG", "False").lower() == "true"
HOST = os.getenv("EMBEDDINGBUDDY_HOST", "127.0.0.1")
PORT = int(os.getenv("EMBEDDINGBUDDY_PORT", "8050"))
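
The one-word change to the `DEBUG` default flips the behavior when `EMBEDDINGBUDDY_DEBUG` is unset. A minimal sketch of the parsing rule (the helper `env_flag` is hypothetical; `AppSettings` inlines the expression, and the `environ` argument stands in for `os.environ`):

```python
def env_flag(name: str, default: str, environ: dict[str, str]) -> bool:
    """Interpret an environment variable as a boolean, as AppSettings does."""
    return environ.get(name, default).lower() == "true"
```

Only the string `true`, in any letter case, enables the flag; values such as `1` or `yes` count as false.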

@@ -0,0 +1,12 @@
"""
WSGI entry point for production deployment.
Use this with a production WSGI server like Gunicorn.
"""
from embeddingbuddy.app import create_app
# Create the application instance
application = create_app()
# For compatibility with different WSGI servers
app = application

uv.lock (generated, 2 lines changed)

@@ -412,7 +412,7 @@ wheels = [
[[package]]
name = "embeddingbuddy"
version = "0.5.1"
version = "0.6.4"
source = { editable = "." }
dependencies = [
{ name = "dash" },

wsgi.py (deleted, 20 lines)

@@ -1,20 +0,0 @@
"""
WSGI entry point for production deployment.
Use this with a production WSGI server like Gunicorn.
"""

from src.embeddingbuddy.app import create_app

# Create the application instance
application = create_app()

# For compatibility with different WSGI servers
app = application

if __name__ == "__main__":
    # This won't be used in production, but useful for testing
    from src.embeddingbuddy.config.settings import AppSettings

    application.run(
        host=AppSettings.HOST,
        port=AppSettings.PORT,
        debug=False
    )