Integrating Anthropic Claude API with Next.js: A Full‑Stack Example
A hands‑on walkthrough that builds a Next.js app, calls the Claude API, manages token budgeting, and renders AI‑generated results—perfect for full‑stack developers ready to add Claude‑powered features to their sites.
Project Setup & Prerequisites
To build a robust AI-powered application, starting with a clean, modern foundation is essential. For this integration, we will use Next.js, which provides the ideal hybrid of server-side and client-side capabilities required to handle secure API calls to Anthropic's servers. For more deep-dives into architectural patterns and full-stack development, visit the bkrsna.dev homepage.
Create a new Next.js app
Initialize your project using the latest version of Next.js. We recommend selecting TypeScript and the App Router during the setup process to ensure type safety and a modern routing structure.
Install required packages
Next.js ships with a built-in `fetch`, so no HTTP client is strictly required; Axios is optional if you prefer its interceptors and request helpers. A separate dotenv package is not needed either, because Next.js loads `.env.local` on its own.
Configure environment variables
To keep your Anthropic API key secure, never hardcode it into your source files. Instead, store it in a `.env.local` file at the root of your project. Next.js will automatically load these variables into `process.env` on the server side.
- Create a file named `.env.local` in the root directory.
- Add the following line: `ANTHROPIC_API_KEY=your_api_key_here`
- Add `.env.local` to your `.gitignore` file to prevent accidental leaks to GitHub.
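To make a missing key fail loudly instead of surfacing later as a confusing authentication error, a small guard can help. The sketch below is a hypothetical helper (`requireEnv` is not part of Next.js) and assumes the variable is named `ANTHROPIC_API_KEY`.

```typescript
// Hypothetical helper: read a required variable or fail fast with a clear message.
type Env = Record<string, string | undefined>;

export function requireEnv(name: string, env: Env = process.env): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing environment variable ${name}; add it to .env.local`);
  }
  return value;
}

// Usage (server-side code only, e.g. inside a route handler):
// const apiKey = requireEnv("ANTHROPIC_API_KEY");
```

Passing the environment as a parameter keeps the helper easy to test without touching the real `process.env`.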
Creating a Secure API Route for Claude
To keep your Anthropic Claude integration safe, route all AI calls through a server‑side API in Next.js. This isolates the secret API key, lets you enforce payload validation, and gives you a single place to trim or batch prompts so you stay within token limits.
Define the API endpoint
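A minimal sketch of such an endpoint follows, assuming the App Router (so the file would live at `app/api/claude/route.ts`) and a direct `fetch` call to the Messages API rather than Axios or the SDK. The character cap, the model ID, and the `{ text, usage }` response shape are illustrative choices, not requirements.

```typescript
// app/api/claude/route.ts (path assumes the App Router)
const ANTHROPIC_URL = "https://api.anthropic.com/v1/messages";
const MAX_PROMPT_CHARS = 8_000; // hypothetical cap; tune it to your own budget

export async function POST(req: Request): Promise<Response> {
  // Validate the payload before spending any tokens.
  const body = await req.json().catch(() => null);
  const prompt = typeof body?.prompt === "string" ? body.prompt.trim() : "";
  if (!prompt || prompt.length > MAX_PROMPT_CHARS) {
    return Response.json({ error: "Invalid prompt" }, { status: 400 });
  }

  // The key is read on the server only and never sent to the browser.
  const apiKey = process.env.ANTHROPIC_API_KEY;
  if (!apiKey) {
    return Response.json({ error: "Server is missing its API key" }, { status: 500 });
  }

  const upstream = await fetch(ANTHROPIC_URL, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model: "claude-3-5-sonnet-20241022", // assumed model ID; swap in the one you use
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  });

  const data = await upstream.json();
  if (!upstream.ok) {
    const message = data?.error?.message ?? "Upstream error";
    return Response.json({ error: message }, { status: upstream.status });
  }

  // Return only what the client needs: the first text block and the usage counts.
  const text = data.content?.[0]?.text ?? "";
  return Response.json({ text, usage: data.usage });
}
```

Because validation happens before the upstream call, malformed or oversized prompts never consume tokens.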
Implementing Token Budgeting Logic
Managing your token budget is critical to preventing unexpected costs and ensuring your application remains stable. Because Anthropic charges based on input and output tokens, implementing a budgeting layer between your Next.js frontend and the API is a best practice for any production-grade AI wrapper. For a deeper dive into pricing, refer to our Claude Token Cost Guide.
Calculate prompt token count
Before sending a request, your application should estimate the prompt size. While Anthropic provides precise counts in the response, client-side or middleware estimation helps you reject oversized prompts before they hit the API, saving latency and avoiding 400-series errors.
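As a rough illustration of pre-flight estimation, the sketch below uses a characters-per-token heuristic. The ratio of four characters per token is an assumption that roughly holds for English prose, not a constant published by Anthropic; for exact numbers, Anthropic's token-counting endpoint is the better source.

```typescript
// Rough heuristic only: about 4 characters per token for English text (assumption).
const CHARS_PER_TOKEN = 4;

export function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Reject oversized prompts before they reach the API.
export function isWithinPromptLimit(text: string, maxPromptTokens: number): boolean {
  return estimateTokens(text) <= maxPromptTokens;
}
```

Since the estimate can undercount for code or non-English text, it is best treated as a cheap first filter rather than a guarantee.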
Determine max response tokens
The `max_tokens` parameter controls the length of the model's response. To avoid hitting hard limits or incurring massive costs from runaway generations, it is advisable to set a safety margin—typically 10% below the absolute model limit.
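One way to apply that margin is sketched below. The function name and parameters are hypothetical; the context window and output cap are passed in rather than hard-coded because they differ per model.

```typescript
export function pickMaxTokens(
  contextWindow: number, // total tokens the model accepts (input + output)
  promptTokens: number,  // estimated or counted input size
  outputCap: number,     // per-model ceiling on generated tokens
  safetyMargin = 0.1,    // the 10% buffer described above
): number {
  const remaining = contextWindow - promptTokens;
  const budget = Math.floor(Math.min(remaining, outputCap) * (1 - safetyMargin));
  return Math.max(0, budget); // never return a negative value
}

// Example: a 1,000-token prompt with a 4,096-token output cap
// pickMaxTokens(200_000, 1_000, 4_096) returns 3686
```

A result of 0 signals that the prompt alone already fills the window and the request should be rejected or trimmed.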
Log usage to console or a file
For cost monitoring, log the `usage` object returned by the API. This includes `input_tokens` and `output_tokens`. Integrating these logs into a separate internal API endpoint allows you to monitor spend in real-time, which is essential when designing an AI wrapper business model.
- Use Anthropic's token-counting endpoint when you need exact counts rather than estimates
- Maintain a 10% safety buffer to prevent unexpected truncation
- Expose budget telemetry via a dedicated API endpoint for administrative monitoring
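The logging step could look something like the following sketch. `UsageLog` is a hypothetical class; it keeps totals in memory, which is an assumption that suits local development only, since serverless instances do not share memory and a production setup would persist these numbers elsewhere.

```typescript
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

export class UsageLog {
  private totals: Usage = { input_tokens: 0, output_tokens: 0 };

  // Call this with the `usage` object from each API response.
  record(usage: Usage): void {
    this.totals.input_tokens += usage.input_tokens;
    this.totals.output_tokens += usage.output_tokens;
    // One JSON line per call is easy to ship to a log aggregator later.
    console.log(JSON.stringify({ at: new Date().toISOString(), ...usage }));
  }

  // Running totals, suitable for a telemetry endpoint to return.
  snapshot(): Usage & { total_tokens: number } {
    const { input_tokens, output_tokens } = this.totals;
    return { input_tokens, output_tokens, total_tokens: input_tokens + output_tokens };
  }
}
```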
Building the Front-End UI
With the API route established, the next step is to create a user interface that allows users to interact with Claude. In Next.js, this involves building a client-side interface that captures user prompts, manages the request lifecycle, and renders the AI's response in real-time.
InputForm component
The InputForm serves as the primary interaction point. Using React's `useState` hook, we can track the user's input in a textarea and trigger a POST request to our `/api/claude` endpoint. To optimize performance, consider implementing a debounce mechanism or simple validation to prevent empty submissions.
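The submit logic behind such a form can be kept framework-free so it is easy to test. The sketch below assumes the route returns JSON shaped like `{ text, usage }`, which is an illustrative contract rather than something fixed by Next.js or Anthropic; `submitPrompt` is a hypothetical name.

```typescript
export interface ClaudeReply {
  text: string;
  usage?: { input_tokens: number; output_tokens: number };
}

export async function submitPrompt(
  prompt: string,
  fetchImpl: typeof fetch = fetch, // injectable so tests can stub the network
): Promise<ClaudeReply> {
  const trimmed = prompt.trim();
  // Simple validation: never send an empty prompt to the server.
  if (!trimmed) throw new Error("Please enter a prompt first.");

  const res = await fetchImpl("/api/claude", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ prompt: trimmed }),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return (await res.json()) as ClaudeReply;
}
```

Inside the component, the form's `onSubmit` handler would call `submitPrompt` with the textarea value held in `useState`.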
ResultsDisplay component
Once the API returns a response, the ResultsDisplay component renders the text. Since Claude's output can be lengthy, using a Markdown library is recommended for proper formatting. To help users monitor costs—as detailed in our Claude Token Cost Guide—it is a best practice to include a token usage badge showing the prompt and completion tokens consumed.
Loading and error states
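A seamless UX requires handling the 'in-between' moments. Showing a loading skeleton or spinner, and disabling the submit button while a request is in flight, prevents duplicate submissions. Catch API timeouts and rate-limit responses with try/catch in the submit handler and store the failure in component state so you can show clear feedback; React error boundaries only catch errors thrown during rendering, not rejected fetch calls.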
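One way to model these states is a small reducer, which can be plugged into React's `useReducer`. The state and action names below are hypothetical; the point of the sketch is that a second submit is ignored while a request is already loading.

```typescript
type RequestState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; text: string }
  | { status: "error"; message: string };

type RequestAction =
  | { type: "submit" }
  | { type: "resolve"; text: string }
  | { type: "reject"; message: string };

export function requestReducer(state: RequestState, action: RequestAction): RequestState {
  switch (action.type) {
    case "submit":
      // Ignore duplicate submissions while a request is in flight.
      return state.status === "loading" ? state : { status: "loading" };
    case "resolve":
      return { status: "success", text: action.text };
    case "reject":
      return { status: "error", message: action.message };
  }
}
```

The UI can then render a spinner for `loading`, the reply for `success`, and an inline message for `error` from a single source of truth.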
If you are building more complex data-driven interfaces, you can apply similar state management patterns used in our Commercial invoice guide to handle structured form data.
Error Handling & Edge Cases
Robust error handling is essential for a seamless Claude integration. This section covers how to detect and surface network failures, respect Anthropic’s rate‑limit headers, and guard against token‑budget overruns. By combining user‑friendly toast alerts, exponential‑backoff retries, and real‑time budget displays, developers can keep the UI responsive while preventing costly API calls from blowing the allocated quota.
Network errors
Network glitches, DNS timeouts, or dropped connections cause fetch to reject. Wrap the call in a try/catch and surface the error via a toast (e.g., react-hot-toast). Optionally, schedule a retry with exponential backoff so transient blips recover without user intervention.
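A minimal sketch of retrying with exponential backoff is shown below. The helper name `withRetry`, the default of three attempts, and the 500 ms base delay are all assumptions; toast handling is left to the caller.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

export async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  wait: (ms: number) => Promise<void> = sleep, // injectable for tests
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // The delay doubles on each attempt: 500 ms, 1000 ms, 2000 ms, ...
        await wait(baseDelayMs * 2 ** attempt);
      }
    }
  }
  // All attempts failed: surface the last error so the caller can show a toast.
  throw lastError;
}
```

In practice you would likely retry only on network failures and 5xx or 429 responses, not on validation errors that will fail again.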
API rate limits
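When you exceed a rate limit, Anthropic returns a 429 status with a `retry-after` header indicating how many seconds to wait. Responses also carry `anthropic-ratelimit-*` headers (for example `anthropic-ratelimit-requests-remaining`) that report remaining capacity. Inspect the response; if you are rate limited, show a polite message and pause further requests until the wait has elapsed. Respecting these limits avoids 429 cascades.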
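The pause can be derived from the response itself. The sketch below reads the `retry-after` header (in seconds) on a 429 and falls back to a default when the header is absent; the function name and the one-second fallback are assumptions.

```typescript
export function retryDelayMs(
  res: { status: number; headers: Headers },
  fallbackMs = 1000,
): number | null {
  if (res.status !== 429) return null; // not rate limited, no pause needed
  const retryAfterSeconds = Number(res.headers.get("retry-after"));
  return Number.isFinite(retryAfterSeconds) && retryAfterSeconds > 0
    ? retryAfterSeconds * 1000
    : fallbackMs;
}
```

A caller can disable the submit button for the returned number of milliseconds and show the remaining wait to the user.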
Token budget exceeded
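Anthropic does not know about the budget you allocate to each user, so your API route has to enforce it. When the accumulated prompt and completion tokens would surpass the user's allocated budget, reject the call before contacting Anthropic, for example with a 400-series error. On the client, detect this condition, alert the user about the overrun, and optionally provide a link to the token‑cost guide.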
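A budget guard of this kind might be sketched as follows. The error class and function are hypothetical, and the budget is assumed to be tracked by your own application.

```typescript
export class BudgetExceededError extends Error {
  constructor(
    public readonly remaining: number,
    public readonly requested: number,
  ) {
    super(`Token budget exceeded: ${requested} requested, ${remaining} remaining`);
    this.name = "BudgetExceededError";
  }
}

// Returns the tokens left after this call, or throws if the call would overrun.
export function assertWithinBudget(budget: number, spent: number, requested: number): number {
  const remaining = Math.max(0, budget - spent);
  if (requested > remaining) throw new BudgetExceededError(remaining, requested);
  return remaining - requested;
}
```

The returned value doubles as the "remaining budget" figure to display in the UI.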
- Show toast notifications for failures
- Retry logic with exponential backoff
- Display remaining budget to the user
By implementing these patterns, your Next.js front‑end remains resilient, your users stay informed, and you stay within budget. The same wrapper can be reused across all API endpoints for consistency.
Deploying to Vercel & Production Considerations
Add ANTHROPIC_API_KEY in Vercel dashboard
After your Next.js app is ready, push the repo to GitHub and import it in Vercel. In the Vercel dashboard, open the project settings, navigate to Environment Variables, and create a new variable named `ANTHROPIC_API_KEY`. Paste the secret key and set the environment to Production. Vercel encrypts environment variables at rest, and marking the variable as Sensitive additionally prevents its value from being read back in the dashboard. Never log the key yourself, and never prefix it with `NEXT_PUBLIC_`, which would expose it to the browser.
- Connect repo
- Add env var
- Select production
- Save
Enable Serverless Functions
Create an API route (e.g., `app/api/claude/route.ts` with the App Router, or `pages/api/claude.ts` with the Pages Router) that forwards requests to the Anthropic Claude endpoint. Vercel automatically deploys it as a serverless function, giving you isolated compute and automatic scaling. If cold‑start latency matters, you can opt the route into the Edge runtime, but be aware that edge runtimes have stricter bundle‑size limits and support only a subset of Node.js APIs.
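With the App Router, the runtime is chosen per route through a route segment config export. The snippet below is a sketch that assumes the handler lives in `app/api/claude/route.ts`.

```typescript
// app/api/claude/route.ts
// Route segment config: opt this handler into the Edge runtime.
// Omit this line (or use "nodejs") to stay on regular serverless functions.
export const runtime = "edge";
```

Before switching, check that the handler relies only on Web APIs such as `fetch`, `Request`, and `Response`.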
Set up a simple cost‑monitoring webhook
For production cost awareness, expose a lightweight endpoint at `/api/cost-monitor` that receives Claude usage payloads (token count, model, cost). Forward the data to Slack, a monitoring service, or a spreadsheet. Call this endpoint from your own API route after each Claude response, since that is where the `usage` data arrives; this gives you near‑real‑time spend visibility.
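A sketch of such an endpoint is shown below. The payload shape (`model`, `input_tokens`, `output_tokens`) and the `SLACK_WEBHOOK_URL` variable name are assumptions; the `{ text }` body is the format Slack incoming webhooks accept.

```typescript
interface UsageEvent {
  model: string;
  input_tokens: number;
  output_tokens: number;
}

// Validate the incoming payload; returns null when it is malformed.
export function parseUsageEvent(body: unknown): UsageEvent | null {
  if (typeof body !== "object" || body === null) return null;
  const { model, input_tokens, output_tokens } = body as Record<string, unknown>;
  if (typeof model !== "string") return null;
  if (!Number.isInteger(input_tokens) || !Number.isInteger(output_tokens)) return null;
  return { model, input_tokens: input_tokens as number, output_tokens: output_tokens as number };
}

// app/api/cost-monitor/route.ts (path assumes the App Router)
export async function POST(req: Request): Promise<Response> {
  const event = parseUsageEvent(await req.json().catch(() => null));
  if (!event) return Response.json({ error: "Invalid usage payload" }, { status: 400 });

  const slackUrl = process.env.SLACK_WEBHOOK_URL; // hypothetical variable name
  if (slackUrl) {
    const total = event.input_tokens + event.output_tokens;
    await fetch(slackUrl, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ text: `${event.model}: ${total} tokens` }),
    });
  }
  return Response.json({ ok: true });
}
```

Since this endpoint is internal, it is worth protecting with a shared secret or restricting it to server-to-server calls.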
- Vercel environment variable masking
- Cold‑start latency expectations
- Use Vercel Analytics to track API calls
Testing & Extending the Integration
In this final section we cover how to verify the Claude integration with automated tests and outline paths for future growth. Robust testing ensures the API route behaves correctly under token‑budget constraints and that UI components render expected responses.
Jest tests for /api/claude
We use Jest together with msw (Mock Service Worker) to intercept outgoing HTTP calls to Anthropic. The test suite spins up a mock server, supplies a fake API key, and asserts that the handler returns a properly formatted response while respecting the token budget.
The Jest suite for the API route covers the following cases:
- Mock Axios with nock or msw
- Test token budgeting edge case
- Verify error handling for rate limits
Running the suite in CI is straightforward. Add `npm test` to your GitHub Actions workflow, ensure the MSW server starts in the test environment, and fail the build on any token‑budget regression.
React Testing Library for components
Component tests render the chat UI, mock the /api/claude endpoint, and check that user messages appear, the loading spinner toggles, and the assistant reply is displayed. Snapshot testing can capture the markup after a successful call.
Future extensions
The architecture is deliberately model‑agnostic. Switching between Claude 3.5 Sonnet, Claude 3.5 Haiku, or Claude 3 Opus only requires a new environment variable and a small switch in the request builder. Multi‑model support can be exposed via a dropdown, and token‑budget logic can be abstracted into a shared utility.
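The "small switch" could be as simple as a lookup table. In the sketch below, the short keys and the `CLAUDE_MODEL` variable mentioned in the usage comment are hypothetical, and the model IDs are examples that should be checked against Anthropic's current model list.

```typescript
// Example model IDs; verify them against Anthropic's current model list.
const MODEL_IDS = {
  sonnet: "claude-3-5-sonnet-20241022",
  haiku: "claude-3-5-haiku-20241022",
} as const;

export type ModelKey = keyof typeof MODEL_IDS;

export function resolveModel(key: string | undefined, fallback: ModelKey = "sonnet"): string {
  const normalized = (key ?? "").toLowerCase();
  // Only accept keys that are explicitly listed; anything else uses the fallback.
  return Object.keys(MODEL_IDS).includes(normalized)
    ? MODEL_IDS[normalized as ModelKey]
    : MODEL_IDS[fallback];
}

// Usage in the request builder (CLAUDE_MODEL is a hypothetical variable name):
// const model = resolveModel(process.env.CLAUDE_MODEL);
```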
Context limits for these models are summarized in the FAQ below.
How do I obtain an Anthropic API key?
Sign up at the Anthropic website and log into the developer console. In the API Keys section click “Create new key”, name it, and copy the generated secret. Store the key safely (e.g., in an environment variable) and refer to the Claude Token Cost Guide 2026 for integration tips.
What is the maximum token limit for Claude 3.5 models?
Claude 3.5 Sonnet, Claude 3.5 Haiku, and Claude 3 Opus each support a context window of up to 200k tokens. The context window covers both input and output tokens, and output is additionally capped by a lower per‑model maximum, so plan your prompts accordingly. See the detailed breakdown in the Claude Token Cost Guide 2026.
Can I use this setup with other Next.js hosting providers besides Vercel?
Yes, the same Next.js codebase works on platforms like Netlify, AWS Amplify, Railway, or Render with minimal changes. Ensure the provider supports Node.js serverless functions or edge runtimes for the API proxy to Anthropic. Update your deployment scripts and environment variables accordingly.
How do I monitor real‑time token usage in production?
Check the usage page in the Anthropic Console to see aggregate token counts, or instrument your API wrapper to log the `input_tokens` and `output_tokens` fields returned with each response. Forward these logs to a monitoring tool such as Datadog, Grafana, or CloudWatch for real‑time alerts. If your organization has access to Anthropic's usage reporting API, you can also query it programmatically for custom dashboards.
Is there a way to stream Claude’s response instead of waiting for the full reply?
Anthropic’s API supports streaming via the `stream` flag, which returns Server‑Sent Events (SSE) as tokens are generated. Set `stream: true` in your request payload and consume the incremental chunks on the client side. This reduces perceived latency for interactive applications and works with the standard fetch API by reading the response body as a stream; Axios streaming support depends on the adapter and environment.
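To illustrate what consuming those chunks involves, the sketch below pulls the text out of a block of SSE lines. It assumes complete lines are passed in; a real client would need to buffer partial lines across network chunks. The function name is hypothetical.

```typescript
// Extract generated text from SSE lines emitted by a streaming Messages request.
export function extractTextDeltas(sse: string): string {
  let text = "";
  for (const line of sse.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip "event:" lines and blank separators
    try {
      const event = JSON.parse(line.slice(5).trim());
      // Text arrives in content_block_delta events carrying a text_delta payload.
      if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
        text += event.delta.text;
      }
    } catch {
      // Ignore lines that are not complete JSON (e.g. a chunk split mid-event).
    }
  }
  return text;
}
```

Appending each extracted piece to component state as it arrives produces the familiar typing effect in the UI.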
Conclusion
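This walkthrough covered the end‑to‑end workflow: a Next.js app, a server‑side API route that keeps the Anthropic key private, a token‑budgeting layer, a front end that renders Claude's replies, and a Vercel deployment backed by tests. The same pattern of secure proxy route, budget checks, and resilient UI is reusable for other AI integrations. Clone the GitHub repo and experiment with your own prompts or models, keep an eye on token costs as you go, and consider next steps such as adding caching or multi‑model support.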