Nix Flake Development Environment

The Neko Agent project uses a Nix flake to provide reproducible, cross-platform development environments, with specialized configurations for different use cases. This document covers the flake's features, development shells, and usage patterns.

Overview

The flake (flake.nix) is designed around multiple specialized development environments that cater to different aspects of the project:

  • AI/ML Development - GPU-accelerated environments with CUDA support
  • Documentation - Publishing and development of project documentation
  • Container Operations - Docker and Neko server management
  • Performance Optimization - CPU-optimized builds with architecture-specific flags
  • TEE Deployment - Trusted Execution Environment deployment with attestation
  • Registry Management - Multi-registry container deployment support
  • Cross-Platform Support - Works on x86_64-linux and aarch64-darwin (Apple Silicon)

The architecture at a glance:

graph TB
    subgraph "Nix Flake Architecture"
        Flake[flake.nix]
        Inputs[External Inputs]
        Overlays[Custom Overlays]
        Shells[Development Shells]
        Packages[Docker Images]
        Apps[Utility Apps]
    end
    
    subgraph "External Dependencies"
        Nixpkgs[nixpkgs/nixos-unstable]
        MLPkgs[nixvital/ml-pkgs]
    end
    
    subgraph "Custom Overlays"
        WebRTC[WebRTC Stack]
        ML[ML Libraries]
        Audio[Audio Processing]
        Optimization[CPU Optimization]
    end
    
    subgraph "Development Shells"
        Default[default]
        GPU[gpu]
        AI[ai]
        Neko[neko]
        Docs[docs]
        CPUOpt[cpu-opt]
        GPUOpt[gpu-opt]
    end
    
    Flake --> Inputs
    Flake --> Overlays
    Flake --> Shells
    Flake --> Packages
    Flake --> Apps
    
    Inputs --> Nixpkgs
    Inputs --> MLPkgs
    
    Overlays --> WebRTC
    Overlays --> ML
    Overlays --> Audio
    Overlays --> Optimization
    
    Shells --> Default
    Shells --> GPU
    Shells --> AI
    Shells --> Neko
    Shells --> Docs
    Shells --> CPUOpt
    Shells --> GPUOpt
    Shells --> TEE

Flake Inputs

External Dependencies

inputs = {
  nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  ml-pkgs.url = "github:nixvital/ml-pkgs";
};

| Input | Source | Purpose |
|-------|--------|---------|
| nixpkgs | nixos-unstable | Latest packages and system libraries |
| ml-pkgs | nixvital/ml-pkgs | Specialized ML/AI packages (PyTorch, CUDA) |

Why nixos-unstable?

  • Latest packages - Access to newest versions of AI/ML libraries
  • CUDA support - Most recent NVIDIA driver and toolkit support (CUDA 12.8)
  • Python ecosystem - Up-to-date Python packages for transformers and WebRTC
  • Security updates - Timely security patches for all dependencies

Build Metadata and Reproducibility

The flake includes comprehensive build metadata for reproducible builds and attestation:

buildInfo = rec {
  timestamp = "${year}-${month}-${day}T${hour}:${minute}:${second}Z";
  revision = self.rev or self.dirtyRev or "unknown";
  shortRev = builtins.substring 0 8 revision;
  version = if (self ? rev) then shortRev else "${shortRev}-dirty";
  nixpkgsRev = nixpkgs.rev or "unknown";
  
  imageMetadata = {
    "org.opencontainers.image.title" = "Neko Agent";
    "org.opencontainers.image.created" = timestamp;
    "org.opencontainers.image.revision" = revision;
    "dev.neko.build.reproducible" = "true";
  };
};
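
The `year` through `second` components are not defined in the snippet above; a minimal sketch of how they can be derived, assuming the flake slices them out of `self.lastModifiedDate` (the `YYYYMMDDHHMMSS` string Nix records for a flake):

let
  lm = self.lastModifiedDate or "19700101000000";  # fallback for trees without metadata
  year   = builtins.substring 0  4 lm;
  month  = builtins.substring 4  2 lm;
  day    = builtins.substring 6  2 lm;
  hour   = builtins.substring 8  2 lm;
  minute = builtins.substring 10 2 lm;
  second = builtins.substring 12 2 lm;
in "${year}-${month}-${day}T${hour}:${minute}:${second}Z"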

Custom Overlay System

The flake uses a set of custom overlays to provide packages not available in standard nixpkgs:

WebRTC and Media Stack

nekoOverlays = [
  (import ./overlays/pylibsrtp.nix)     # Secure RTP protocol
  (import ./overlays/aioice.nix)        # Async ICE implementation  
  (import ./overlays/aiortc.nix)        # WebRTC for Python
  # ... more overlays
];

| Overlay | Package | Purpose |
|---------|---------|---------|
| pylibsrtp.nix | pylibsrtp | Secure Real-time Transport Protocol for WebRTC |
| aioice.nix | aioice | Asynchronous ICE (Interactive Connectivity Establishment) |
| aiortc.nix | aiortc | WebRTC implementation for Python with media support |

AI/ML and Audio Processing

| Overlay | Package | Purpose |
|---------|---------|---------|
| streaming.nix | streaming | MosaicML Streaming for training data |
| f5-tts.nix | f5-tts | F5-TTS voice synthesis model |
| vocos.nix | vocos | Neural vocoder for audio generation |
| ema-pytorch.nix | ema-pytorch | Exponential Moving Average for PyTorch |
| transformers-stream-generator.nix | transformers-stream-generator | Streaming text generation |
| bitsandbytes.nix | bitsandbytes | 8-bit optimizers for PyTorch |

Pi-Zero PyTorch Dependencies

The flake includes comprehensive packaging for pi-zero-pytorch and its dependencies:

| Overlay | Package | Purpose |
|---------|---------|---------|
| pi-zero-pytorch/pi-zero-pytorch.nix | pi-zero-pytorch | Main π0 implementation in PyTorch |
| pi-zero-pytorch/einx.nix | einx | Universal tensor operations with Einstein notation |
| pi-zero-pytorch/x-transformers.nix | x-transformers | Transformer architectures library |
| pi-zero-pytorch/rotary-embedding-torch.nix | rotary-embedding-torch | Rotary positional embeddings |
| pi-zero-pytorch/accelerated-scan.nix | accelerated-scan | Accelerated scan operations |
| pi-zero-pytorch/bidirectional-cross-attention.nix | bidirectional-cross-attention | Cross-attention mechanisms |
| pi-zero-pytorch/hl-gauss-pytorch.nix | hl-gauss-pytorch | Gaussian operations for ML |
| pi-zero-pytorch/evolutionary-policy-optimization.nix | evolutionary-policy-optimization | Evolution strategies |

Performance Optimization

| Overlay | Package | Purpose |
|---------|---------|---------|
| cached-path.nix | cached-path | Efficient file caching utilities |
| znver2-flags.nix | nekoZnver2Env | AMD Zen2 CPU optimization flags |
| vmm-cli.nix | vmm-cli | Virtual machine management CLI |

Example Znver2 Optimization:

# Generated environment variables for AMD Zen2 CPUs
export NIX_CFLAGS_COMPILE="-O3 -pipe -march=znver2 -mtune=znver2 -fno-plt"
export RUSTFLAGS="-C target-cpu=znver2 -C target-feature=+sse2,+sse4.2,+avx,+avx2,+fma,+bmi1,+bmi2"

External ML Packages

ml-pkgs.overlays.torch-family  # Provides torch-bin, torchvision-bin, etc.

Benefits:

  • Pre-compiled binaries - Faster setup without compilation
  • CUDA integration - Proper CUDA toolkit linkage
  • Consistent versions - Matching PyTorch ecosystem versions
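
All of these overlays feed a single nixpkgs import per system. A minimal sketch of that wiring, reusing `nekoOverlays` from above (`allowUnfree` is an assumption here, but it is generally required for the CUDA toolkit and torch-bin):

pkgs = import nixpkgs {
  inherit system;
  config.allowUnfree = true;  # CUDA toolkit and torch-bin are unfree
  overlays = nekoOverlays ++ [ ml-pkgs.overlays.torch-family ];
};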

Development Shells

1. Default Shell (default)

Purpose: Basic Python development with CPU-only PyTorch.

Usage:

nix develop
# or
nix develop .#default

Includes:

  • Python Environment: PyTorch CPU, Transformers, WebRTC stack
  • System Tools: FFmpeg, Git, Curl, Just, pkg-config
  • Node.js Ecosystem: Node 20, NPM for AI tools
  • AI CLI Tools: OpenAI Codex, Anthropic Claude Code (auto-installed)

Python Packages:

# Core ML/AI
transformers
torch (CPU)
torchvision
pillow
accelerate

# WebRTC and networking
websockets
av (PyAV for video processing)
pylibsrtp
aioice
aiortc

# Data and streaming
streaming (MosaicML)
f5-tts
numpy
scipy
zstandard
xxhash
tqdm

# Monitoring
prometheus-client

When to Use:

  • Initial project setup and exploration
  • Development on systems without NVIDIA GPUs
  • Testing compatibility with CPU-only environments
  • CI/CD pipelines where GPU access is unavailable

2. GPU Shell (gpu)

Purpose: GPU-accelerated development with CUDA 12.8 support.

Usage:

nix develop .#gpu

NVIDIA hosts: When running outside NixOS you will typically need nixGL to expose the system GPU. Use:

NIXPKGS_ALLOW_UNFREE=1 nix run --impure github:nix-community/nixGL#nixGLNvidia -- nix develop .#gpu

This wraps the GPU shell with the right OpenGL/EGL libraries from the host driver.

Additional Features over Default:

  • CUDA Toolkit 12.8 - Complete CUDA development environment
  • cuDNN and NCCL - Optimized neural network and communication libraries
  • GPU-enabled PyTorch - Tensor operations on NVIDIA GPUs
  • Environment Variables - Automatic CUDA path and library configuration

CUDA Environment Setup:

# Automatically configured
export CUDA_HOME=/nix/store/.../cuda-12.8
export CUDA_PATH=$CUDA_HOME
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# GPU control
export NVIDIA_VISIBLE_DEVICES=all
export NVIDIA_DRIVER_CAPABILITIES=compute,utility
export CUDA_MODULE_LOADING=LAZY
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

Verification Commands:

# Check CUDA installation
nvidia-smi
nvcc --version

# Test PyTorch GPU support
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'CUDA devices: {torch.cuda.device_count()}')"

When to Use:

  • AI model inference and training
  • GPU-accelerated image/video processing
  • Development requiring CUDA libraries
  • Performance-critical workloads

3. AI Shell (ai)

Purpose: Lightweight environment focused on AI development tools.

Usage:

nix develop .#ai

Includes:

  • Core System Tools - FFmpeg, Git, networking utilities
  • Node.js Environment - Node 20, NPM
  • AI CLI Tools - Automatic installation of OpenAI and Anthropic CLIs
  • Minimal Footprint - No heavy ML libraries, faster startup

AI Tools Installed:

# OpenAI Codex CLI
npm install -g @openai/codex

# Anthropic Claude Code CLI  
npm install -g @anthropic-ai/claude-code

Environment Setup:

# NPM global packages in project directory
export NPM_CONFIG_PREFIX=$PWD/.npm-global
export PATH=$NPM_CONFIG_PREFIX/bin:$PATH

When to Use:

  • AI-assisted development workflows
  • Code generation and review tasks
  • Integration with AI development services
  • Quick environment for AI tool testing

4. Neko Shell (neko)

Purpose: Container and Neko server management.

Usage:

nix develop .#neko

Container Stack:

  • Colima - Lightweight Docker runtime for macOS/Linux
  • Docker & Docker Compose - Container orchestration
  • Docker Buildx - Multi-platform image building
  • Networking Tools - curl, jq for API interaction

Custom Scripts:

# Neko service management script
neko-services up      # Start Neko server
neko-services down    # Stop services
neko-services logs    # View container logs
neko-services status  # Check service status
neko-services restart # Restart services
neko-services update  # Pull latest images and restart
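
The `neko-services` command is provided by the shell itself. A plausible sketch of such a wrapper with `pkgs.writeShellScriptBin`, assuming it delegates to docker compose (the flake's actual script may differ):

# Hypothetical wrapper; subcommands map onto docker compose.
neko-services = pkgs.writeShellScriptBin "neko-services" ''
  cmd="''${1:-status}"
  case "$cmd" in
    up)      docker compose up -d ;;
    down)    docker compose down ;;
    logs)    shift; docker compose logs -f "$@" ;;
    status)  docker compose ps ;;
    restart) docker compose restart ;;
    update)  docker compose pull && docker compose up -d ;;
    *)       echo "usage: neko-services {up|down|logs|status|restart|update}" >&2; exit 1 ;;
  esac
'';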

Colima Configuration:

# Automatically configured VM
colima start --vm-type vz --cpu 2 --memory 4 \
  --mount-type sshfs --mount "~:w"

Docker Environment:

# Automatic Docker socket configuration
export DOCKER_HOST="unix://$HOME/.colima/default/docker.sock"

When to Use:

  • Neko server development and testing
  • Container image building and deployment
  • Docker-based development workflows
  • Local testing of production deployments

5. Documentation Shell (docs)

Purpose: Documentation development, building, and publishing.

Usage:

nix develop .#docs

Documentation Stack:

  • mdBook - Rust-based documentation generator
  • mdBook Extensions:
    • mdbook-mermaid - Diagram support
    • mdbook-linkcheck - Link validation
    • mdbook-toc - Table of contents generation
  • Sphinx - Python documentation with reStructuredText support
  • Node.js - For additional tooling and preprocessing

Python Documentation Tools:

sphinx                # Documentation generator
sphinx-rtd-theme      # Read the Docs theme
myst-parser           # Markdown support for Sphinx
sphinxcontrib-mermaid # Mermaid diagrams in Sphinx

Available Commands:

# From inside docs/
mdbook serve --open     # Development server with live reload
mdbook build           # Build static documentation
mdbook test            # Test code examples and links

# Sphinx alternative
sphinx-build -b html source build/

When to Use:

  • Writing and editing project documentation
  • Building documentation for deployment
  • Testing documentation changes locally
  • Contributing to API reference and guides

6. CPU-Optimized Shell (cpu-opt)

Purpose: Performance-optimized CPU development.

Usage:

nix develop .#cpu-opt

Optimization Features:

  • Architecture-Specific Compilation - Znver2 flags for AMD CPUs
  • Optimized Python Environment - Performance-tuned package builds
  • Compiler Optimizations - -O3, -march=znver2, -mtune=znver2

Generated Optimization Flags (Linux only):

# Compiler flags
export NIX_CFLAGS_COMPILE="-O3 -pipe -march=znver2 -mtune=znver2 -fno-plt"

# Rust flags
export RUSTFLAGS="-C target-cpu=znver2 -C target-feature=+sse2,+sse4.2,+avx,+avx2,+fma,+bmi1,+bmi2 -C link-arg=-Wl,-O1 -C link-arg=--as-needed"

When to Use:

  • Performance-critical CPU workloads
  • Benchmarking and optimization work
  • Production builds targeting specific CPU architectures
  • Environments where every bit of CPU performance matters

7. GPU-Optimized Shell (gpu-opt)

Purpose: Maximum performance GPU development with optimizations.

Usage:

nix develop .#gpu-opt

Combined Optimizations:

  • All GPU features - CUDA 12.8, cuDNN, NCCL
  • CPU optimizations - Znver2 flags for host code
  • PyTorch optimizations - Optimized builds with CPU and GPU acceleration
  • Memory optimizations - Advanced CUDA memory management

GPU-Specific Optimizations:

# Target specific GPU architecture (configurable)
export TORCH_CUDA_ARCH_LIST=8.6  # RTX 30xx series

# Memory allocation strategy
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

Performance Verification:

# Check optimizations are active
echo $NIX_CFLAGS_COMPILE  # Should show znver2 flags
echo $TORCH_CUDA_ARCH_LIST  # Should show target GPU architecture

# Benchmark performance
python -c "
import torch
import time
x = torch.randn(1000, 1000, device='cuda')
start = time.time()
torch.mm(x, x)
print(f'GPU matrix multiply: {time.time() - start:.4f}s')
"

When to Use:

  • Maximum performance AI inference
  • GPU-accelerated training workloads
  • Performance benchmarking and optimization
  • Production deployments requiring peak performance

8. TEE Shell (tee)

Purpose: Trusted Execution Environment deployment and attestation.

Usage:

nix develop .#tee

TEE Deployment Stack:

  • Phala Cloud CLI - Modern CLI for TEE deployments
  • Legacy VMM CLI - Compatible with older dstack systems
  • Docker & Docker Compose - Container orchestration
  • Bun Runtime - Fast JavaScript runtime
  • Reproducible Image Builder - Attestation-ready container building

Available Commands:

# Modern Phala CLI
phala auth login <api-key>      # Authenticate with Phala Cloud
phala status                    # Check authentication status
phala cvms list                 # List Confidential VMs
phala nodes                     # List available TEE nodes

# Legacy VMM CLI (if needed)
vmm-cli lsvm                    # List virtual machines
vmm-cli lsimage                 # List available images
vmm-cli lsgpu                   # List available GPUs

# Reproducible builds
nix run .#build-images          # Build reproducible images
nix run .#deploy-to-tee         # Deploy with attestation metadata
nix run .#verify-attestation    # Verify TEE attestation

Multi-Registry Support:

# Deploy to ttl.sh (ephemeral registry)
NEKO_REGISTRY=ttl.sh NEKO_TTL=1h nix run .#deploy-to-tee
nix run .#deploy-to-ttl 24h

# Deploy to GitHub Container Registry
NEKO_REGISTRY=ghcr.io/your-org nix run .#deploy-to-tee

# Deploy to Docker Hub
NEKO_REGISTRY=docker.io/your-org nix run .#deploy-to-tee

# Deploy to local registry
NEKO_REGISTRY=localhost:5000/neko nix run .#deploy-to-tee

When to Use:

  • Deploying to Trusted Execution Environments
  • Creating attestable, reproducible deployments
  • Multi-registry container management
  • TEE-based inference deployments
  • Confidential computing workloads

Docker Images and Packages

The flake builds optimized Docker images for production deployment:

Available Images

The flake builds multiple specialized images for different components:

# Build all images
nix run .#build-images

# Agent images
nix build .#neko-agent-docker-generic
nix build .#neko-agent-docker-opt

# Capture images  
nix build .#neko-capture-docker-generic
nix build .#neko-capture-docker-opt

# YAP (TTS) images
nix build .#neko-yap-docker-generic
nix build .#neko-yap-docker-opt

# Train images
nix build .#neko-train-docker-generic
nix build .#neko-train-docker-opt

1. Generic CUDA Image (neko-agent-docker-generic)

Target: neko-agent:cuda12.8-generic

Features:

  • Portable CUDA - Includes PTX for forward compatibility
  • CUDA 12.8 - Full toolkit and libraries
  • Python Environment - All dependencies with torch-bin
  • Broad GPU Support - Runs on any GPU with compute capability 8.6 or newer (PTX JIT-compiles for newer architectures)

Configuration:

# Environment variables
CUDA_HOME=/nix/store/.../cuda-12.8
LD_LIBRARY_PATH=$CUDA_HOME/lib64:$CUDA_HOME/lib
CUDA_MODULE_LOADING=LAZY
TORCH_CUDA_ARCH_LIST=8.6+PTX  # Forward compatibility

Use Cases:

  • Multi-GPU deployment environments
  • Cloud platforms with varying GPU types
  • Development and testing across different hardware

2. Optimized Image (neko-agent-docker-opt)

Target: neko-agent:cuda12.8-sm86-v3

Features:

  • Specific GPU targeting - Optimized for RTX 30xx series (sm_86)
  • CPU optimizations - Znver2 architecture flags
  • Smaller size - No PTX, specific architecture only
  • Maximum performance - All available optimizations enabled

Configuration:

# Optimized environment
TORCH_CUDA_ARCH_LIST=8.6  # Specific architecture only
NIX_CFLAGS_COMPILE="-O3 -pipe -march=znver2 -mtune=znver2 -fno-plt"
RUSTFLAGS="-C target-cpu=znver2 ..."  # Rust optimizations

Use Cases:

  • Production deployments with known hardware
  • Performance-critical applications
  • Cost-optimized cloud instances

Image Building System

# Helper function for consistent container structure
mkRoot = paths: pkgs.buildEnv {
  name = "image-root";
  inherit paths;
  pathsToLink = [ "/bin" ];
};

# Generic image build
neko-agent-docker-generic = pkgs.dockerTools.buildImage {
  name = "neko-agent:cuda12.8-generic";
  created = "now";
  copyToRoot = mkRoot ([
    runnerGeneric
    pyEnvGeneric
    cuda.cudatoolkit
    cuda.cudnn
    cuda.nccl
    pkgs.bashInteractive
  ] ++ commonSystemPackages);
  config = {
    Env = baseEnv ++ [
      "CUDA_HOME=${cuda.cudatoolkit}"
      "LD_LIBRARY_PATH=${cuda.cudatoolkit}/lib64:${cuda.cudnn}/lib"
      "TORCH_CUDA_ARCH_LIST=8.6+PTX"
    ];
    WorkingDir = "/workspace";
    Entrypoint = [ "/bin/neko-agent" ];
  };
};

Utility Apps

The flake provides comprehensive utility applications for common tasks:

Documentation Apps

# Build documentation
nix run .#docs-build

# Serve documentation with live reload
nix run .#docs-serve

# Check documentation for issues
nix run .#docs-check
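
Each `nix run .#name` target is a flake app. A minimal sketch of how one of these could be defined (structure only; the actual script contents in the flake are assumed):

apps.docs-serve = {
  type = "app";
  program = toString (pkgs.writeShellScript "docs-serve" ''
    cd docs
    exec ${pkgs.mdbook}/bin/mdbook serve --open
  '');
};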

Build and Deployment Apps

# Build all Docker images with attestation metadata
nix run .#build-images

# TEE deployment with multi-registry support
nix run .#deploy-to-tee
nix run .#deploy-to-ttl 24h              # Quick ttl.sh deployment
nix run .#push-to-ttl 1h                 # Just push to ttl.sh

# Attestation verification
nix run .#verify-attestation <app-id> <expected-hash>

Container Registry Apps

# Local registry management
nix run .#start-registry                 # HTTP registry with auth
nix run .#start-registry-https           # HTTPS with Tailscale certs
nix run .#stop-registry

# Public exposure
nix run .#start-tailscale-funnel         # Expose via Tailscale Funnel
nix run .#start-cloudflare-tunnel        # Expose via Cloudflare Tunnel

Registry Configuration Examples:

# Environment variables for registry customization
NEKO_REGISTRY_PORT=5000
NEKO_REGISTRY_USER=neko
NEKO_REGISTRY_PASSWORD=pushme
NEKO_REGISTRY_DATA_DIR=$PWD/registry-data
NEKO_REGISTRY_AUTH_DIR=$PWD/auth
NEKO_REGISTRY_CERTS_DIR=$PWD/certs

# Tailscale Funnel setup
NEKO_REGISTRY=your-device.tail-scale.ts.net/neko

# Cloudflare Tunnel setup
NEKO_CF_TUNNEL_NAME=neko-registry
NEKO_CF_HOSTNAME=registry.example.com
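
A sketch of how `start-registry` might consume these variables, with defaults matching the list above (assumed, not the flake's exact script; auth uses the standard registry:2 htpasswd mechanism):

start-registry = pkgs.writeShellScript "start-registry" ''
  port="''${NEKO_REGISTRY_PORT:-5000}"
  data="''${NEKO_REGISTRY_DATA_DIR:-$PWD/registry-data}"
  auth="''${NEKO_REGISTRY_AUTH_DIR:-$PWD/auth}"
  mkdir -p "$data" "$auth"
  # Bcrypt htpasswd entry for NEKO_REGISTRY_USER/NEKO_REGISTRY_PASSWORD
  docker run --rm --entrypoint htpasswd httpd:2 -Bbn \
    "''${NEKO_REGISTRY_USER:-neko}" "''${NEKO_REGISTRY_PASSWORD:-pushme}" \
    > "$auth/htpasswd"
  docker run -d --name neko-registry -p "$port:5000" \
    -v "$data:/var/lib/registry" -v "$auth:/auth" \
    -e REGISTRY_AUTH=htpasswd \
    -e REGISTRY_AUTH_HTPASSWD_REALM=neko \
    -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
    registry:2
'';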

Common Development Workflows

Initial Setup

# Clone repository
git clone <repo-url>
cd neko-agent

# Enter development environment
nix develop .#gpu  # or .#default for CPU-only

# Verify setup
python -c "import torch; print(torch.cuda.is_available())"

AI Development Workflow

# 1. Enter GPU environment
nix develop .#gpu

# 2. Load environment variables (if .env exists)
# Automatically loaded by shell hook

# 3. Test model loading
python -c "
from transformers import Qwen2VLForConditionalGeneration
model = Qwen2VLForConditionalGeneration.from_pretrained('showlab/ShowUI-2B')
print('Model loaded successfully')
"

# 4. Run agent
uv run src/agent.py --task "Navigate to google.com"

Documentation Development

# 1. Enter docs environment
nix develop .#docs

# 2. Start development server
nix run .#docs-serve
# Opens browser to http://localhost:3000

# 3. Edit files in docs/src/
# Changes automatically reload in browser

# 4. Build for deployment
nix run .#docs-build

Container Development

# 1. Enter container environment
nix develop .#neko

# 2. Start Neko server
neko-services up

# 3. Check status
neko-services status

# 4. View logs
neko-services logs neko

# 5. Test connection
curl http://localhost:8080/health

Performance Optimization

# 1. Use optimized environment
nix develop .#gpu-opt

# 2. Verify optimizations
echo $NIX_CFLAGS_COMPILE
echo $TORCH_CUDA_ARCH_LIST

# 3. Run performance benchmarks
python benchmarks/inference_speed.py

# 4. Build optimized container
nix build .#neko-agent-docker-opt

TEE Deployment Workflow

# 1. Enter TEE environment
nix develop .#tee

# 2. Build reproducible images
nix run .#build-images

# 3. Deploy to TEE (with registry choice)
# Option A: Use ttl.sh for testing
nix run .#deploy-to-ttl 1h

# Option B: Use GitHub Container Registry
NEKO_REGISTRY=ghcr.io/your-org nix run .#deploy-to-tee

# Option C: Use local registry (start it first)
nix run .#start-registry  # In another terminal
NEKO_REGISTRY=localhost:5000/neko nix run .#deploy-to-tee

# 4. Verify attestation (inside TEE)
nix run .#verify-attestation <app-id> <compose-hash>

# 5. Check deployment status
phala cvms list  # Modern CLI
# or
vmm-cli lsvm    # Legacy CLI

Multi-Registry Development

# Setup local registry for testing
nix run .#start-registry

# Push images to multiple registries
docker tag neko-agent:latest localhost:5000/neko/agent:v1
docker push localhost:5000/neko/agent:v1

# Use Tailscale for team access
nix run .#start-tailscale-funnel

# Use Cloudflare for public access
nix run .#start-cloudflare-tunnel

Environment Variables and Configuration

Automatic .env Loading

All development shells automatically load .env files:

# .env file example
NEKO_WS=ws://localhost:8080/api/ws
NEKO_LOGLEVEL=DEBUG
CUDA_VISIBLE_DEVICES=0
TORCH_CUDA_ARCH_LIST=8.6
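
A minimal sketch of the shellHook fragment that implements this loading (the flake's actual hook may differ):

shellHook = ''
  if [ -f .env ]; then
    set -a       # auto-export every variable sourced below
    source .env
    set +a
  fi
'';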

Common Environment Variables

| Variable | Purpose | Default | Set By |
|----------|---------|---------|--------|
| CUDA_HOME | CUDA installation path | Auto-detected | GPU shells |
| CUDA_VISIBLE_DEVICES | GPU selection | all | User configurable |
| PYTORCH_CUDA_ALLOC_CONF | Memory strategy | expandable_segments:True | GPU shells |
| NPM_CONFIG_PREFIX | NPM global location | $PWD/.npm-global | All shells |
| NIX_CFLAGS_COMPILE | Compiler optimizations | Znver2 flags | Optimized shells |

Shell-Specific Variables

GPU Shells:

export CUDA_MODULE_LOADING=LAZY
export NVIDIA_DRIVER_CAPABILITIES=compute,utility
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$CUDA_HOME/lib

Documentation Shell:

# No specific variables, uses standard tool defaults

Container Shell:

export DOCKER_HOST="unix://$HOME/.colima/default/docker.sock"

Cross-Platform Support

Supported Systems

supportedSystems = [ "x86_64-linux" "aarch64-darwin" ];
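
Outputs are then generated once per system; a minimal sketch of the usual helper (the name `forAllSystems` is assumed):

forAllSystems = f: nixpkgs.lib.genAttrs supportedSystems f;

# Yields devShells.x86_64-linux.*, devShells.aarch64-darwin.*, ...
devShells = forAllSystems (system:
  let pkgs = import nixpkgs { inherit system; overlays = nekoOverlays; };
  in { default = pkgs.mkShell { /* ... */ }; });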

Platform-Specific Features

x86_64-linux:

  • Full GPU support - NVIDIA CUDA, Docker GPU passthrough
  • CPU optimizations - Znver2, Intel architecture targeting
  • Container building - Docker images with CUDA support

aarch64-darwin (Apple Silicon):

  • Metal Performance Shaders - GPU acceleration via MPS
  • Rosetta compatibility - x86_64 dependencies when needed
  • Native performance - ARM64-optimized packages

Platform Detection

# Conditional features based on platform
${pkgs.lib.optionalString pkgs.stdenv.isLinux ''
  source ${znver2File}
  echo "[cpu-opt] Using znver2 flags: $NIX_CFLAGS_COMPILE"
''}

Troubleshooting

Common Issues

CUDA Not Detected:

# Check NVIDIA drivers
nvidia-smi

# Verify CUDA environment
echo $CUDA_HOME
echo $LD_LIBRARY_PATH

# Test PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

Solution: Ensure NVIDIA drivers are installed and compatible with CUDA 12.8.

Docker Issues on macOS:

# Check Colima status
colima status

# Restart if needed
colima stop
colima start --vm-type vz --cpu 2 --memory 4

Slow Package Installation:

# Use binary cache
echo "substituters = https://cache.nixos.org https://cuda-maintainers.cachix.org" >> ~/.config/nix/nix.conf
echo "trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= cuda-maintainers.cachix.org-1:0dq3bujKpuEPiCgBv7/11NEBpCcEKUzZzUNjRgPTOOA=" >> ~/.config/nix/nix.conf

Memory Issues

GPU Memory:

# Monitor GPU memory
nvidia-smi -l 1

# Optimize PyTorch memory
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128,expandable_segments:True

System Memory:

# Check available memory
free -h

# Monitor during development
htop

Performance Issues

Check Optimizations:

# Verify CPU flags
cat /proc/cpuinfo | grep flags

# Check compiler optimizations
echo $NIX_CFLAGS_COMPILE

# Benchmark inference
python -c "
import torch
import time
device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.randn(1000, 1000, device=device)
start = time.time()
result = torch.mm(x, x)
print(f'{device} time: {time.time() - start:.4f}s')
"

Advanced Usage

Custom Overlays

Create project-specific overlays in overlays/:

# overlays/custom-package.nix
final: prev: {
  custom-package = prev.python3Packages.buildPythonPackage {
    pname = "custom-package";
    version = "1.0.0";
    src = prev.fetchFromGitHub {
      owner = "owner";
      repo = "repo";
      rev = "v1.0.0";
      sha256 = "...";
    };
    propagatedBuildInputs = with prev.python3Packages; [
      numpy
      torch
    ];
  };
}

Custom Development Shells

Add new shells to the flake:

# Add to devShells
experimental = pkgs.mkShell {
  buildInputs = commonSystemPackages ++ [
    # Custom packages
  ];
  shellHook = ''
    echo "Experimental environment loaded"
    # Custom setup
  '';
};

Environment Specialization

Create environment-specific configurations:

# .env.gpu
CUDA_VISIBLE_DEVICES=0
TORCH_CUDA_ARCH_LIST=8.6

# .env.multi-gpu  
CUDA_VISIBLE_DEVICES=0,1,2,3
NCCL_DEBUG=INFO

# Load specific environment
cp .env.gpu .env
nix develop .#gpu

Contributing to the Flake

Adding New Packages

  1. Create overlay in overlays/new-package.nix
  2. Add to overlay list in nekoOverlays
  3. Include in appropriate shells
  4. Test across platforms
  5. Update documentation

Testing Changes

# Test specific shell
nix develop .#shell-name --command python -c "import new_package"

# Test all shells
for shell in default gpu ai neko docs cpu-opt gpu-opt tee; do
  echo "Testing $shell..."
  nix develop .#$shell --command echo "✓ $shell loads successfully"
done

# Test image builds
nix build .#neko-agent-docker-generic
nix build .#neko-agent-docker-opt

Performance Considerations

  • Binary caches - Use Cachix for custom packages
  • Layer optimization - Minimize Docker image layers
  • Dependency management - Avoid unnecessary dependencies
  • Build reproducibility - Pin package versions when needed

This comprehensive flake system provides a robust, reproducible development environment that scales from local development to production deployment while maintaining consistency across different platforms and use cases.