Check Commands: Teaching AI to Catch and Correct Team Code
When teams use AI tools to generate code, consistency becomes critical. Learn how the check command lets AI tools discover and fix their own errors, and how AGENTS.md teaches your AI tools to generate correct code the first time. These create alignment across your entire team.
What if your team could use any AI coding tool—Cursor, Claude Code, GitHub Copilot—and still maintain perfect consistency? What if AI-generated code automatically met your standards before reaching code review? What if new developers onboarded their AI assistants in minutes instead of spending weeks learning your patterns?
Without proper verification, your team might face serious alignment problems:
- Different AI tools generate inconsistent code patterns and formatting
- AI hallucinations create dependencies that don't exist in your project
- Each tool has different preferences—arrow functions vs traditional functions, classes vs hooks
- Formatting chaos: tabs vs spaces, inconsistent indentation, different naming conventions
- Syntactically correct but functionally wrong code slips through high-level reviews
AI Hallucination: When AI tools generate references to code, packages, or APIs that don't actually exist in your project. The code looks correct and compiles, but uses imaginary dependencies or imports from non-existent paths.
The solution is check commands that verify AI-generated code against shared standards. These commands catch issues before code reaches review, enforce consistency across all AI tools, and let every developer use their preferred AI assistant while maintaining a unified codebase.
This post is the first in my "Guide the AI" series. I'll show your team how to use check commands to verify AI-generated code, align multiple AI tools around shared standards, and maintain consistency even when each developer uses a different AI assistant.
Throughout this post, I'll use a standard Node-based web application as the example. We'll be creating npm scripts in package.json that your team can run to verify AI-generated code. The same principles apply whether you're building a React frontend, Express backend, or full-stack application.
Why Check Commands Matter More with AI Tools
AI coding assistants come in different types. Some, like Cursor and GitHub Copilot, generate code in your editor, and you run the checks manually. Others, like Claude Code and Codex, are CLI-based tools that can read your files, run commands like npm run check directly, and even iterate on their own code based on test results.
Here's what changes when your team adopts AI coding assistants:
More code, less human oversight: Developers review AI-generated code at a high level but don't scrutinize every line. That's faster, but it means errors slip through.
Variability in what gets generated: The same prompt to Cursor and Claude Code produces different implementations. Different prompts to the same tool create different patterns.
Trust without verification: AI-generated code looks professional. It compiles. It often works. But it might not match your team's standards, use your team's libraries, or follow your team's patterns.
The solution isn't banning AI tools or creating 50-page style guides that nobody reads. It's using check commands.
What Is a Check Command?
A check command combines your tests, build verification, linting, and formatting checks into one command. Every developer runs it after AI generates code. Your CI runs it on every pull request. The check command doesn't care if a human or AI wrote the code—it verifies quality either way.
npm run check
One command. Same standards for all code. Same safety net whether one developer uses Cursor or another uses Claude Code.
Pull Request (PR): A request to merge your code changes into the main codebase. Other developers review your changes before they're accepted. Often abbreviated as "PR."
Here's what makes this approach essential for AI-assisted teams:
- Safety net for AI code - Catches issues in generated code before it reaches other developers
- Consistency across AI tools - Enforces team standards even when tools generate different code
- Fast feedback - Developers learn immediately if AI-generated code meets standards
- Teaches better prompting - When check fails, developers learn to prompt AI differently
- No tool lock-in - Team members can use different AI tools; check command ensures consistency
The beauty is that you don't build it all at once. Your team starts with one check and adds more as you discover what AI-generated code needs.
Iteration 1: Verify AI Code Actually Works
Let's build this together. I'll show you exactly how teams adopt this for AI-assisted development.
Start with tests. This is critical when AI generates code, because AI tools can produce code that compiles and looks right but doesn't meet the requirements.
Add this to your package.json:
{
  "scripts": {
    "test": "jest",
    "check": "npm run test"
  }
}
npm scripts: Custom commands defined in your package.json file. They let you run complex commands with simple shortcuts like npm run check.
Now everyone on the team adopts this workflow:
- Prompt AI tool to generate code
- Review the generated code
- Run npm run check
- If tests fail, adjust the prompt and regenerate
Why this matters with AI tools: AI can generate syntactically correct code that doesn't solve the actual problem. Tests verify behavior, not just syntax.
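For example, a behavior-level test might look like the sketch below. The applyDiscount helper and its rules are hypothetical, purely to illustrate checking behavior rather than syntax:
// applyDiscount.test.ts (illustrative only; applyDiscount is a hypothetical helper)
import { applyDiscount } from './applyDiscount';

test('applies a 50% discount to the order total', () => {
  // Verifies the actual behavior (the math), not just that the code compiles
  expect(applyDiscount(200, 0.5)).toBe(100);
});

test('rejects negative discount rates', () => {
  expect(() => applyDiscount(200, -0.5)).toThrow();
});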
Team Benefit: One developer can use Claude Code, another can use Cursor, and the check command ensures both tools generate code that passes the same tests. No more "it works on my machine with my AI tool."
Iteration 2: Catch AI Hallucinations Early
After a week, a developer asks their AI tool to add a new feature. The code looks good and the tests pass locally. But when they build the project before pushing, the build fails: the AI hallucinated package imports that don't exist in the project. They catch it before it reaches anyone else, fix the imports, and push working code. The lesson: the check command should catch this automatically.
AI Hallucination: When AI coding tools confidently generate code that references non-existent packages, functions, or APIs. Running checks locally catches these before they reach your codebase.
Time to add build verification.
Update your check script in package.json:
{
  "scripts": {
    "check": "npm run test && npm run build"
  }
}
The && operator means: run the second command only if the first succeeds.
Now everyone runs: npm run check
Why this matters with AI tools: AI coding assistants can hallucinate dependencies, use packages from their training data that aren't in your project, or import from the wrong path. Build verification catches these immediately.
Team Benefit: When build verification passes locally, the team knows the code will build in CI. No more "AI suggested a package we don't have" surprises.
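The backend check above assumes your package.json already defines a build script. If it doesn't, a minimal sketch for a TypeScript Node project (assuming tsc is configured through your tsconfig.json) might be:
{
  "scripts": {
    "build": "tsc",
    "check": "npm run test && npm run build"
  }
}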
Frontend variation: For frontend projects where you don't need a full build during development, use tsc --noEmit:
{
  "scripts": {
    "check": "npm run test && tsc --noEmit"
  }
}
What does tsc --noEmit do? TypeScript's compiler (tsc) can check your types without generating JavaScript files. The --noEmit flag means "verify types are correct but don't create output files." Perfect for catching when AI uses wrong types or non-existent imports.
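Running tsc this way assumes a tsconfig.json exists. A minimal illustrative sketch for a React + TypeScript frontend (the exact options depend on your project) might look like:
{
  "compilerOptions": {
    "strict": true,
    "noEmit": true,
    "target": "ES2020",
    "jsx": "react-jsx",
    "skipLibCheck": true
  },
  "include": ["src"]
}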
Iteration 3: Enforce Patterns Across AI Tools
Linting: Automated code analysis that finds patterns that might cause bugs or violate team standards. A linter checks things like unused variables, incorrect patterns, potential errors, and code complexity.
A few sprints later, your code reviews are catching inconsistencies. One developer's Cursor-generated code uses arrow functions everywhere. Another developer's Claude Code prefers traditional functions. Both tools generate working code, but the codebase looks like three different apps.
Your team decides to enforce linting.
Update your check script to include linting:
{
  "scripts": {
    "check": "npm run test && npm run build && npm run lint"
  }
}
Now your check command catches test failures, build errors, and linting violations.
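This assumes a lint script is already defined. A minimal sketch, assuming ESLint is installed and configured for your project (the exact invocation depends on your ESLint version and plugins):
{
  "scripts": {
    "lint": "eslint .",
    "check": "npm run test && npm run build && npm run lint"
  }
}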
Why this matters with AI tools: Each AI tool has preferences. Cursor might prefer classes, Claude Code might suggest hooks, Copilot might use different naming conventions. Linting enforces YOUR team's choice, overriding whatever the AI suggests.
Team Benefit: Code reviews stop being about "did you check what your AI tool generated?" Senior developers can focus on architecture and design instead of pointing out that Cursor used a different pattern than Claude Code.
A developer new to the team learns to prompt their AI tool better by seeing what linting catches. The check command becomes a teaching tool.
Iteration 4: One Code Style, Multiple AI Tools
Prettier: An opinionated code formatter that automatically formats your code according to consistent rules. It eliminates debates about code style by enforcing one standard format.
Code reviews still have one problem: formatting inconsistencies. Different AI tools generate different formatting styles. The same tool formats code differently depending on your prompt. Your codebase ends up looking like it was written by five different teams, each with their own conventions.
Your team adopts Prettier to end the formatting wars.
Add prettier verification to your check script:
{
  "scripts": {
    "check": "npm run test && npm run build && npm run lint && prettier --check \"src/**/*.{ts,js,json}\""
  }
}
Backend complete check:
npm run test && npm run build && npm run lint && prettier --check "src/**/*.{ts,js,json}"
Frontend complete check:
npm run test && tsc --noEmit && npm run lint && prettier --check "src/**/*.{ts,tsx,js,jsx,json,css}"
The frontend version also covers .tsx, .jsx, and .css files, since frontend code includes JSX components and stylesheets.
Why this matters with AI tools: AI tools will format code according to their defaults or their interpretation of your prompt. Prettier doesn't care what tool generated the code—it enforces one style for everyone.
Team Benefit: One developer's Cursor and another developer's Claude Code both generate code that looks identical after Prettier runs. New developers don't need to learn "the team's formatting style"—Prettier is the style, and the check command enforces it.
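A minimal .prettierrc sketch matching the conventions used in the AGENTS.md example later in this post (single quotes, 2-space indents); adjust it to your team's taste:
{
  "singleQuote": true,
  "tabWidth": 2
}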
Auto-fixing formatting issues:
When prettier --check fails, you don't want to manually fix formatting. Add a format script that auto-fixes:
{
  "scripts": {
    "format": "prettier --write \"src/**/*.{ts,js,json}\"",
    "check": "npm run test && npm run build && npm run lint && prettier --check \"src/**/*.{ts,js,json}\""
  }
}
Now when check fails due to formatting, run npm run format to auto-fix all formatting issues, then run check again.
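If you find yourself doing that often, one optional convenience (the script name is just an example) is a combined script that auto-fixes formatting and then re-runs the full check:
{
  "scripts": {
    "check:fix": "npm run format && npm run check"
  }
}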
For CLI-based AI tools: Configure your AI tool to run npm run format automatically when prettier --check fails. The tool will reformat code and verify it passes check before showing you the result.
Understanding the Chain: Why Order Matters
The check command runs in order: test → build → lint → format. This sequence is intentional:
- Test (fastest) - Catches broken functionality immediately
- Build (critical) - Catches hallucinated dependencies before worrying about style
- Lint (quality) - Enforces patterns after code compiles
- Format (auto-fixable) - Checks style last since Prettier can auto-fix it
Each step catches different AI mistakes. Run checks from most critical to least critical, and from hardest-to-fix to easiest-to-fix.
From Check to Fix: Making It Easy to Use AI
Your team now has a check command. But here's what you'll notice: developers prompt AI to generate code, run check, it fails due to formatting or linting, and they manually fix issues. That's tedious and defeats the purpose of AI assistance.
The solution depends on which type of AI tool your team uses:
Editor-based AI tools (Cursor, GitHub Copilot): These generate code in your editor but you run checks manually. When npm run check fails, you fix issues yourself or re-prompt the AI.
CLI-based AI tools (Claude Code, Codex): These command-line tools can run npm run check themselves. They read files, execute commands, and iterate on their own code based on check results.
For CLI tools, the workflow becomes:
- Prompt AI tool to generate code
- AI tool runs npm run check automatically
- AI tool sees failures (tests, build errors, linting, formatting)
- AI tool iterates and fixes issues itself
- Code reaches you already passing all checks
- Review and commit
Team Benefit: CLI-based AI tools eliminate manual check-fix-check cycles. Code reaches developers already passing all quality checks. Whether one developer uses Cursor (editor-based) or another uses Claude Code (CLI-based), both produce code that meets team standards—just with different amounts of automation.
AGENTS.md: Teaching AI Tools Your Team Standards
Here's the breakthrough: instead of fixing AI-generated code after the fact, teach your AI tools to generate correct code the first time.
That's what AGENTS.md does. It's a file that tells AI coding assistants what to generate.
Most modern AI tools (Cursor, Claude Code, GitHub Copilot with workspace context) read AGENTS.md automatically. They use it to understand your team's standards and generate code that matches.
Example AGENTS.md:
# Agent and LLM Rules
## Technology Stack
- npm for package manager (never suggest yarn or pnpm)
- TypeScript with strict mode enabled
- React 18 with functional components and hooks
- Prettier for all formatting (single quotes, 2-space indents)
- ESLint for code quality
## Code Generation Standards
When generating React components:
- Use functional components with hooks (never class components)
- Use TypeScript interfaces for props
- Import React as: import React from 'react'
- Place types in separate .types.ts files
- Export components as default exports
When generating API calls:
- Use fetch API (never axios unless specifically requested)
- Handle errors with try/catch
- Return typed responses using TypeScript
- Include loading and error states
## Quality Standards
- All generated code must pass `npm run check`
- Test coverage minimum: 80%
- No console.log in production code
- All functions must have JSDoc comments
Your check command enforces what your AGENTS.md specifies. As your standards evolve in AGENTS.md, update your check command to verify those standards automatically.
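For example, the "No console.log in production code" rule above maps directly to a lint rule. A minimal sketch using ESLint's built-in no-console rule (the config file name and format depend on your ESLint setup):
{
  "rules": {
    "no-console": "error"
  }
}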
How this works in practice:
- One developer uses Cursor: Cursor reads AGENTS.md, knows to generate functional components with hooks
- Another developer uses Claude Code: Claude Code reads the same AGENTS.md, generates the same pattern
- A developer new to the team uses any tool: AGENTS.md teaches their tool the team's standards
Team Benefit: Your AI tools become team members that actually follow the style guide. New developers configure their AI tool once with AGENTS.md and immediately generate code that matches team patterns.
Keeping AGENTS.md Effective Over Time
AGENTS.md isn't a write-once document. It's a living guide that evolves with your codebase and team practices.
Update AGENTS.md regularly when you notice:
- New code patterns emerge: Your team adopts a new library or architectural pattern? Add it to AGENTS.md so AI tools generate code using the new approach.
- AI makes mistakes: When AI tools generate code with the wrong import style or create components that don't match your patterns, update AGENTS.md to teach the AI tools what's correct.
- Team standards change: Switching from class components to hooks? Moving from axios to fetch? Update AGENTS.md immediately so all AI tools learn the new standard.
- Incorrect assumptions surface: If AI tools consistently assume your team uses a different tech stack or pattern, explicitly correct those assumptions in AGENTS.md.
Think of it as training your AI tools through feedback. Each update improves the quality of generated code across your entire team.
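For example, after noticing a recurring mistake, you might append a short corrections section to AGENTS.md. The specific rules below are hypothetical:
## Corrections (from review feedback)
- Do not add lodash; use native Array and Object methods
- Import shared types from src/types, never from component files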
The context limit challenge:
As your AGENTS.md file grows with more rules and patterns, it can get large—sometimes hundreds of lines. Large instruction files consume more of the AI tool's context window, leaving less space for your actual code and reducing the AI's effectiveness.
This is a real constraint for complex projects with extensive standards.
The solution: Split by task domain
Instead of one massive AGENTS.md file, split your rules into task-specific files:
- AGENTS-FRONTEND.md: React component patterns, CSS conventions, state management rules
- AGENTS-BACKEND.md: API design, database access patterns, authentication rules
- AGENTS-DATABASE.md: Schema conventions, migration patterns, query optimization standards
- AGENTS-TESTING.md: Test structure, mocking patterns, coverage requirements
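An illustrative repository layout (file names follow the list above; exact placement is up to your team):
AGENTS.md             # cross-cutting rules shared by every task
AGENTS-FRONTEND.md
AGENTS-BACKEND.md
AGENTS-DATABASE.md
AGENTS-TESTING.md
package.json
src/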
How to use task-specific files:
When working on a React component, a developer references AGENTS-FRONTEND.md in their Cursor prompts:
Following our AGENTS-FRONTEND.md standards, create a user profile component with avatar, name, and bio fields.
When building a new API endpoint, a developer references AGENTS-BACKEND.md with Claude Code:
Following our AGENTS-BACKEND.md standards, create a POST /api/users endpoint with proper error handling.
Each developer brings in only the relevant rules for their current task, keeping context usage efficient while maintaining comprehensive standards.
Team Benefit: Your team can maintain extensive, detailed standards without overwhelming AI tools. As standards grow, they scale efficiently across different domains without sacrificing code quality or AI effectiveness.
From Local to CI: Consistency Everywhere
The final step: use the same check command in CI that developers use locally.
GitHub Actions example:
name: Check AI-Generated Code
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
      - run: npm install
      - run: npm run check
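Optionally, the setup-node step can pin a Node version and cache npm dependencies for faster, more reproducible runs (the version shown is an assumption):
      - uses: actions/setup-node@v3
        with:
          node-version: '20'
          cache: 'npm'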
Why this matters with AI tools:
- CI doesn't know if human or AI wrote the code—just verifies quality
- Same safety net for everyone regardless of which AI tool they use
- Failed CI usually means "someone pushed AI-generated code without running check"
Team Benefit: If one developer's Cursor-generated code passes check locally, it passes in CI. If another developer's Claude Code passes check locally, it passes in CI. No more "my AI tool generated something that broke CI."
The check command is the common language across all AI tools and all developers.
Start This Week: Align Your Team's AI Tools
Here's what your team should do this week:
Monday: Create AGENTS.md with basic standards (technology stack, code patterns)
Tuesday: Tech lead uses AI to create check command, shares in standup
Wednesday: Each developer configures their AI tool (Cursor/Claude/Copilot) to read AGENTS.md
Thursday: Everyone uses AI to generate code, runs check command, shares what it caught
Friday: Retrospective—what did check commands catch? What should we add to AGENTS.md?
Next Monday: Add the next check based on what you learned
Pick one quality check your team runs inconsistently. Just one. Maybe it's tests. Maybe it's linting. Doesn't matter.
Create a check script in your package.json following the patterns shown earlier in this post. Run it. Verify it works. Document it in AGENTS.md. Share it with your team.
Next week, add one more check. Then another. Each iteration takes an hour.
Before you know it, your team will have a system where:
- AI tools generate code 10x faster
- Check commands verify quality automatically
- All code looks consistent regardless of which AI tool generated it
- New developers onboard their AI assistants in minutes
- Code reviews focus on architecture, not "did you check what your AI wrote?"
The best part? You adopted it iteratively. No massive upfront planning. No trying to document every rule. Just: generate with AI, verify with check, iterate.
Future Considerations
As your team scales AI-assisted development, two challenges emerge that are worth planning for:
Managing Verbose Check Command Output
CLI-based AI tools (like Claude Code) run your check commands and read the output. When your test suite grows to hundreds of tests or your linter reports dozens of issues, that output can consume significant portions of the AI tool's context window—leaving less space for your actual code.
Signs your check output is too verbose:
- AI tools struggle to maintain context about your codebase during long debugging sessions
- Check command output exceeds 500 lines when there are failures
- Multiple failing tests produce repetitive stack traces
Strategies to optimize:
{
  "scripts": {
    "test": "jest --verbose",
    "test:check": "jest --silent --maxWorkers=2",
    "lint": "eslint . --format stylish",
    "lint:check": "eslint . --format compact --quiet",
    "check": "npm run test:check && npm run build && npm run lint:check && prettier --check \"src/**/*.{ts,js,json}\""
  }
}
Notice the :check variants use flags like --silent, --quiet, and --format compact to reduce output verbosity while still catching failures.
Team Benefit: AI tools maintain better context about your codebase when check commands produce concise output. Developers get faster iterations because the AI can "remember" more of the actual code instead of filling its context with test output.
Code Coverage: Ensuring AI-Generated Tests Are Complete
AI tools can generate code quickly—and they can generate tests for that code. But are those tests actually comprehensive?
Code coverage tools measure what percentage of your code is executed by your tests. They catch when AI generates new features but only writes tests for the happy path, missing error handling, edge cases, or boundary conditions.
Adding coverage to your check command:
{
  "scripts": {
    "test": "jest",
    "test:coverage": "jest --coverage --coverageThreshold='{\"global\":{\"lines\":80,\"statements\":80}}'",
    "check": "npm run test:coverage && npm run build && npm run lint && prettier --check \"src/**/*.{ts,js,json}\""
  }
}
The --coverageThreshold flag fails the check command if coverage drops below 80%. When AI generates new code, it must also generate tests that achieve the coverage threshold.
Why this matters with AI tools: AI excels at generating the straightforward test cases but might skip edge cases, error paths, or boundary conditions. Coverage requirements force more complete test generation.
Team Benefit: Your team catches incomplete test coverage before code reaches review. When one developer prompts their AI tool to "add user authentication," the check command verifies that authentication includes tests for invalid passwords, expired tokens, and permission errors—not just the successful login case.
These considerations aren't urgent on day one. Start with basic check commands. Add coverage requirements and optimize verbosity as your team grows and your check commands mature.
In Conclusion
Teams that verify AI-generated code stay aligned. Teams that stay aligned ship faster and with fewer bugs.
When AI tools generate 10x more code, you need 10x better verification. Check commands are that verification.
Start small. Create AGENTS.md. Add a check command. Use AI to accelerate. Iterate as your team discovers what works.
The key takeaway: Your team's first AI-assisted project shouldn't be a feature. It should be creating the safety net (check commands) and the instruction manual (AGENTS.md) that make AI-assisted development reliable. Start there. Everything else follows.
