
Serving

  • LMCache - Supercharge Your LLM with the Fastest KV Cache Layer
  • llm-d - Kubernetes-Native Distributed LLM Inference with vLLM
  • ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models
  • sglang - SGLang is a high-performance serving framework for large language models and multimodal models.
  • vllm - A high-throughput and memory-efficient inference and serving engine for LLMs

Guardrails

  • Guardrails - NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

API Gateways

  • bifrost - Fastest LLM gateway (50x faster than LiteLLM) with adaptive load balancing
  • litellm - Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]

Coding Agents

Serialization

  • toon - 🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.

Utilities

  • TokenCost - Easy token price estimates for 400+ LLMs. A TokenOps project.
  • opencommit - the #1 most feature-rich GPT wrapper for git — generate commit messages with an LLM in 1 sec — works with Claude, GPT, and every other provider; supports local Ollama models too

Models

| Creator | Name                 | Hugging Face | Ollama |
| ------- | -------------------- | ------------ | ------ |
| Alibaba | Qwen3-ASR            | HF           |        |
| Alibaba | Qwen3-VL             |              | Ollama |
| Alibaba | Qwen3.5              | HF           |        |
| BAAI    | bge-m3               | HF           | Ollama |
| Google  | TranslateGemma       | HF           |        |
| SCB 10X | typhoon-ocr-3b       |              | Ollama |
| SCB 10X | typhoon-translate-4b |              | Ollama |

llama-server

bash
~/ggml-org/llama.cpp/build-cuda/bin/llama-server --fim-qwen-7b-default --host 0.0.0.0 --port 8080
bash
~/ggml-org/llama.cpp/build-cuda/bin/llama-server --gpt-oss-120b-default --host 0.0.0.0 --port 8080

lemonade-server

bash
lemonade-server pull user.gemma-3-12b \
  --checkpoint unsloth/gemma-3-12b-it-GGUF:Q4_K_M  \
  --recipe llamacpp
bash
lemonade-server pull user.qwen3-30b \
  --checkpoint unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:Q4_K_M  \
  --recipe llamacpp
bash
lemonade-server pull user.qwen3-next-80b \
  --checkpoint Qwen/Qwen3-Next-80B-A3B-Instruct-GGUF:Q4_K_M  \
  --recipe llamacpp
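
In the pulls above, the `--checkpoint` value packs two pieces of information into one string: a Hugging Face repo and a quantization variant, separated by a colon. A minimal sketch of splitting that format (the parsing here is illustrative, not lemonade-server's actual code):

```python
# "--checkpoint" values look like "<Hugging Face repo>:<quantization variant>".
checkpoint = "unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:Q4_K_M"

# Split on the last colon so repo paths containing ":" would still work.
repo, quant = checkpoint.rsplit(":", 1)

print(repo)   # unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF
print(quant)  # Q4_K_M
```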

TTS

Request

Curl

bash
curl -u "username:password" -X POST https://example.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen3-1.7B-GGUF", "messages": [{"role": "user", "content": "Hello!"}]}'

Basic auth can also be provided as a request header:

bash
echo -n "username:password" | base64
# -> dXNlcm5hbWU6cGFzc3dvcmQ=

curl -X POST https://example.com/v1/chat/completions \
  -H "Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=" \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen3-1.7B-GGUF", "messages": [{"role": "user", "content": "Hello!"}]}'
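
The same header value can be computed programmatically. A minimal Python sketch — this mirrors what curl's `-u` flag and httpx's `auth=` do under the hood:

```python
import base64

username, password = "username", "password"

# Basic auth is "Basic " + base64("<username>:<password>").
token = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {"Authorization": f"Basic {token}"}

print(headers["Authorization"])  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```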

Python

python
import os

import httpx
from openai import OpenAI

http_client = httpx.Client(
    auth=(os.getenv("BASICAUTH_USERNAME"), os.getenv("BASICAUTH_PASSWORD"))
)

client = OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL"),
    api_key="dummy",  # openai sdk requires a dummy API key
    http_client=http_client,
)

completion = client.chat.completions.create(
    model=os.getenv("MODEL_NAME"),
    messages=[
        {"role": "user", "content": "Write a short poem about Python programming."}
    ],
)

print(completion.choices[0].message.content)

Streaming

python
import os

import httpx
from openai import OpenAI

http_client = httpx.Client(
    auth=(os.getenv("BASICAUTH_USERNAME"), os.getenv("BASICAUTH_PASSWORD"))
)

client = OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL"),
    api_key="dummy",  # openai sdk requires a dummy API key
    http_client=http_client,
)

stream = client.chat.completions.create(
    model=os.getenv("MODEL_NAME"),
    messages=[
        # {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short poem about Python programming."}
    ],
    stream=True,
    # max_tokens=500,
    # temperature=0.7
)

with open("response.txt", "w") as f:
    for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            f.write(content)
            print(content, end="", flush=True)

AzureOpenAI Endpoint

python
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="",  # set your Azure OpenAI key here
    api_version="2024-05-01-preview",  # use the version shown in Azure
    azure_endpoint="https://$SUBSCRIPTION_NAME.cognitiveservices.azure.com",
)
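
For reference, the SDK resolves requests against a URL of roughly this shape; the deployment name below is a hypothetical placeholder, not something from this page:

```python
azure_endpoint = "https://my-resource.cognitiveservices.azure.com"
deployment = "gpt-4o-mini"  # hypothetical Azure deployment name
api_version = "2024-05-01-preview"

# Azure routes by deployment name, with the API version as a query parameter.
url = (
    f"{azure_endpoint}/openai/deployments/{deployment}"
    f"/chat/completions?api-version={api_version}"
)
print(url)
```

Note that, unlike the plain `OpenAI` client, the `model` argument you pass to `client.chat.completions.create(...)` is the deployment name, not the underlying model name.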

Security

Misc

Hardware

Setting up NVIDIA DGX Spark with ggml

bash
bash <(curl -s https://ggml.ai/dgx-spark.sh)

Vendors

Google

OpenAI

Apps

  • gallery - A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.

Resources