OpenAI Internals: Structured Outputs to System Prompts Curious how OpenAI converts your schema into a hidden system message? We'll show you how it works!
Optimizing Claude MCP Server Usage: Leveraging Prompt Caching Multiple MCP server calls are billed like separate API requests, quickly inflating your token count. Discover how prompt caching - an undocumented feature for MCP servers - dramatically reduces these costs with minimal code changes.
OpenAI Internals: Remote MCP to System Prompts Ever wonder how OpenAI handles remote MCP servers? Wonder no more!
Claude Code Internals: Intercepting Requests Want to see what Claude Code is doing? We'll show you how to check!
Google Vertex AI Model Availability (Updated Hourly) Live list of all models on Google Cloud Vertex AI.
Where are my Cloudflare Email Workers running? A simple way to figure out which PoP your Cloudflare Email Workers are running in.
Automating Aurora DSQL and Cloudflare Hyperdrive Integration with Workers Learn how to automatically manage rotating credentials and IAM tokens for your PostgreSQL databases using Cloudflare Workers. Perfect for teams using Aurora DSQL or any database that requires periodic credential updates with Hyperdrive.
Automatic Anthropic to Vertex AI Failover using Cloudflare AI Gateway Getting the dreaded overloaded_error when using claude-3-5-sonnet-20241022? Use Cloudflare AI Gateway to automatically fail over to Vertex AI!
Identifying Anthropic Claude Errors on Amazon Bedrock Learn how to identify and troubleshoot errors with Claude 3.5 Sonnet on Amazon Bedrock using CloudWatch and CloudTrail.
Chaining OpenAI Models: Better and Faster This post explores combining AI models for an efficient, accurate, and cost-effective solution. We chain responses from a larger model (GPT-4o) to a smaller one (GPT-4o mini) to convert complex outputs into structured JSON. A Python script demonstrates this step-by-step process.
Optimizing Token Usage in OpenAI's JSON Mode with Stop Sequences By strategically setting stop conditions, you can cut down on unnecessary tokens, saving 20% on your output in our example. Learn how to implement this technique and even recover full tokens with logprobs.
Peeking Behind the Curtain: How to Gauge GPT's Confidence Discover how to gauge GPT's confidence using log probabilities. This post walks through a Python script that extracts and analyzes probabilities from GPT-4o's responses, helping you understand the certainty behind AI-generated answers.
Collecting Anthropic Server Logs with an Undocumented API Learn how to collect and analyze logs for Claude using an undocumented Anthropic API. This guide covers log collection, deduplication, and sorting by error or latency, helping you measure performance and track errors.
Using Regional Google Storage Buckets with S3 APIs Learn how to use regional GCS buckets with S3 API compatibility for data sovereignty and compliance. This guide covers creating a bucket, setting permissions, and using HMAC for S3 API access. Test your setup with rclone and create presigned URLs for secure file sharing.
Restricting Amazon Bedrock Regions with IAM Learn how to enhance data sovereignty by using AWS IAM to restrict Amazon Bedrock requests to specific regions. This guide provides a sample policy and crucial tips for managing cross-region inference while maintaining control over your AI processing locations.
Identifying Anthropic Claude Errors on Vertex AI Discover how to monitor Claude 3.5 Sonnet's performance on Google Vertex AI using Audit Logs and Log Explorer. Learn to identify and analyze errors, distinguish between user and system issues, and calculate accurate error rates to estimate your AI application's reliability.
Leveraging Anthropic's New Message Batches API and Prompt Caching: A Practical Guide Anthropic's API now offers Message Batches and Prompt Caching. Let's explore how to use these powerful new features with a practical example.
Enforcing Azure AD / Entra ID on Azure OpenAI A technical guide on enforcing Azure AD / Entra ID authentication for Azure OpenAI, including scripts to disable local auth and verify the changes.
Sending emails via SES with Cloudflare Workers Learn how to send emails from Cloudflare Workers via Amazon SES. Covers AWS setup, IAM config, Worker code, and security best practices.
Google Gemini Internals: Function Calling / Tools to Prompts Ever wonder how Google Gemini handles tool / function calling? Wonder no more!
OpenAI Internals: Function Calling / Tools to System Prompts Ever wonder how OpenAI handles tool / function calling? Wonder no more!
Security Analysis of CVE-2023-32439 On June 21, 2023, Apple rolled out a security update for Safari, tagging CVE-2023-32439 as a DFG type confusion issue.