GrepAI: Semantic Code Search That Saves 27% on Claude Code

Discover GrepAI, the privacy-first CLI that reduces Claude Code token usage by 97%. Learn setup, MCP integration, and call graph tracing for AI-assisted development.

Richard Joseph Porter
11 min read
claude-code · developer-tools · ai-development · productivity · mcp

Traditional grep was built in 1973 for exact text matching. Modern AI-assisted development demands something smarter. GrepAI is that something: a privacy-first CLI that enables semantic code search through vector embeddings, helping AI agents like Claude Code understand your codebase by meaning rather than keywords.

After integrating GrepAI into my Claude Code workflow, I have seen dramatic improvements in token efficiency and search accuracy. This guide covers everything you need to know: what GrepAI does, how to install it, MCP server configuration, and practical workflows that can reduce your Claude Code costs by nearly 30%.

What Is GrepAI and Why Does It Matter?

GrepAI is an open-source command-line tool created by Yoan Bernabeu that transforms how AI agents navigate codebases. Instead of matching exact text patterns like traditional grep, GrepAI uses vector embeddings to understand code semantically.

When you search for "authentication logic," traditional grep requires you to know the exact function names, variable names, or comments in your codebase. GrepAI understands what authentication logic means and finds relevant code even when naming conventions vary across your project.

Core Capabilities

GrepAI provides four primary features that make it invaluable for AI-assisted development:

Semantic Search: Query your codebase using natural language. Ask for "error handling middleware" or "database connection pooling" and get relevant results regardless of how the code is actually named.

Call Graph Tracing: Before modifying any function, understand its dependencies. GrepAI traces both callers (what calls this function) and callees (what this function calls), providing the context AI agents need to make safe modifications.

100% Local Processing: Your code never leaves your machine. GrepAI runs entirely locally using Ollama for embeddings, making it suitable for proprietary codebases and security-conscious environments.

MCP Server Integration: GrepAI exposes its capabilities through the Model Context Protocol, allowing Claude Code, Cursor, and Windsurf to use semantic search as a native tool.

The Token Economics Problem GrepAI Solves

Every time Claude Code explores your codebase, it consumes tokens. Traditional grep returns dozens or hundreds of results, each requiring Claude to read and evaluate. This creates a cascade effect where a single search query can spawn multiple subagents, each with its own context window and cache creation charges.

GrepAI addresses this by returning only semantically relevant results. Instead of "here are 47 files containing the word 'auth'," GrepAI returns "here are the 5 files that implement authentication logic." Fewer results mean fewer tokens, lower costs, and faster responses.

Benchmark Results: 27.5% Cost Reduction

A controlled benchmark comparing GrepAI against traditional grep in Claude Code revealed significant savings. The test used the Excalidraw repository (155,000+ lines of TypeScript) with five real-world developer questions posed to both approaches.

Token and Cost Metrics

Metric                  | Traditional Grep | GrepAI  | Improvement
API Cost                | $6.78            | $4.92   | -27.5%
Input Tokens            | 51,147           | 1,326   | -97%
Cache Creation Tokens   | 563,883          | 162,289 | -71%
Tool Calls              | 139              | 62      | -55%
Subagents Spawned       | 5                | 0       | -100%
Glob Operations         | 13               | 0       | -100%

The most dramatic improvement came from eliminating subagent spawning entirely. Each subagent in Claude Code creates a separate context with its own cache creation charges. GrepAI's targeted semantic results prevented the need for additional agents, saving $2.51 on cache creation alone.

Why This Matters for Pro Plan Users

If you are on Claude Code's Pro plan ($20/month), these savings compound quickly. The benchmark's 27.5% cost reduction, driven largely by the 97% drop in input tokens and the 71% drop in cache creation, means you can handle significantly more development tasks before hitting usage limits: at those rates, the same usage budget covers roughly 1.4x as many comparable tasks ($6.78 / $4.92 ≈ 1.38). For developers who consistently bump against rate limits, GrepAI provides meaningful headroom without requiring a subscription upgrade. For broader strategies on managing Claude Code token consumption, see my guide on Claude Code token management.

Installing GrepAI

GrepAI supports macOS, Linux, and Windows with straightforward installation options.

macOS (Homebrew)

brew install yoanbernabeu/tap/grepai

Linux and macOS (Script)

curl -sSL https://raw.githubusercontent.com/yoanbernabeu/grepai/main/install.sh | sh

Windows (PowerShell)

irm https://raw.githubusercontent.com/yoanbernabeu/grepai/main/install.ps1 | iex

Setting Up the Embedding Provider

GrepAI requires an embedding model to create vector representations of your code. The recommended approach uses Ollama with the nomic-embed-text model for fully local, privacy-preserving operation.

Install Ollama from ollama.ai, then pull the embedding model:

ollama pull nomic-embed-text

For faster indexing at the cost of sending code to an external API, you can configure OpenAI's text-embedding-3-small model instead. This requires an API key and a small change to your .grepai/config.yaml file.
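
As a rough illustration, the provider switch lives in .grepai/config.yaml. The field names below are hypothetical placeholders (the exact schema can differ between GrepAI versions), so mirror whatever structure grepai init generated rather than copying this verbatim:

embedder:
  provider: openai                 # hypothetical keys; the default setup uses ollama + nomic-embed-text
  model: text-embedding-3-small
  api_key_env: OPENAI_API_KEY      # read the key from an environment variable instead of hardcoding it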

Basic Usage and Commands

Initialize a Project

Navigate to your project root and initialize GrepAI:

cd /path/to/your/project
grepai init

This creates a .grepai directory containing configuration and the vector index.

Start the Indexing Daemon

For real-time index updates as you modify code:

grepai watch

The daemon monitors file changes and updates the index automatically. GrepAI can index 10,000+ files in seconds, and searches complete in milliseconds.

Perform Semantic Searches

Search your codebase using natural language:

grepai search "error handling middleware"
grepai search "user authentication flow"
grepai search "database connection pooling"

Results include file paths, line numbers, relevance scores, and code previews.

Trace Function Dependencies

Before modifying any function, understand its impact:

# Find everything that calls the Login function
grepai trace callers "Login"

# Find everything the Login function calls
grepai trace callees "Login"

# Build a complete dependency graph
grepai trace graph "Login"

Call graph tracing supports Go, TypeScript, JavaScript, Python, PHP, Java, C, C#, C++, Rust, and Zig.

Check Index Status

Verify your index is healthy and up to date:

grepai status

MCP Server Integration with Claude Code

The real power of GrepAI emerges when integrated as an MCP server, making semantic search available as a native Claude Code tool.

Adding GrepAI to Claude Code

Run the following command to register GrepAI as an MCP server:

claude mcp add grepai -- grepai mcp-serve

After adding the server, Claude Code can access these tools:

  • grepai_search: Semantic code search with natural language
  • grepai_trace_callers: Find all functions calling a symbol
  • grepai_trace_callees: Find all functions a symbol calls
  • grepai_trace_graph: Build complete dependency graphs
  • grepai_index_status: Check index health metrics
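
Once the tools appear, it is worth confirming the registration before relying on it. Claude Code's claude mcp list command prints the configured servers, and the /mcp slash command inside an interactive session shows their connection status; both are standard Claude Code commands, not GrepAI-specific ones.

# Run from your project directory
claude mcp list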

Cursor IDE Configuration

For Cursor, create or edit .cursor/mcp.json in your project:

{
  "mcpServers": {
    "grepai": {
      "command": "grepai",
      "args": ["mcp-serve"]
    }
  }
}

On Windows or when working with projects outside the current directory, specify an explicit path:

{
  "mcpServers": {
    "grepai": {
      "command": "grepai",
      "args": ["mcp-serve", "/path/to/your/project"]
    }
  }
}

Windsurf Configuration

Windsurf supports the same MCP protocol. Add GrepAI through Windsurf's MCP settings using the stdio transport with the grepai mcp-serve command.
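
For reference, here is a minimal sketch of that configuration, assuming Windsurf reads an mcp_config.json (typically under ~/.codeium/windsurf/) with the same mcpServers schema shown above for Cursor; check Windsurf's MCP settings UI for the exact file location in your version:

{
  "mcpServers": {
    "grepai": {
      "command": "grepai",
      "args": ["mcp-serve"]
    }
  }
}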

Practical Workflows for AI-Assisted Development

With GrepAI integrated into Claude Code, you can leverage semantic search in your daily development workflows.

Workflow 1: Safe Refactoring

Before refactoring any function, understand its full impact:

Use grepai to trace all callers of the processPayment function,
then show me which files would need updates if I change its signature.

Claude Code uses grepai_trace_callers to find every location calling the function, then provides a comprehensive list of required changes. This eliminates the risk of breaking dependent code.

Workflow 2: Understanding Unfamiliar Codebases

When joining a new project or exploring an unfamiliar codebase:

Search for authentication and authorization logic in this codebase.
Show me how user permissions are checked.

GrepAI returns semantically relevant results even if the codebase uses terms like "access control," "permission guards," or "auth middleware" instead of the exact keywords you searched for.

Workflow 3: Bug Investigation

When debugging issues, search by behavior rather than implementation:

Find all code that handles database connection failures or retries.

Traditional grep would require knowing specific error class names or retry function names. GrepAI understands the semantic concept and finds relevant error handling regardless of naming conventions.

Workflow 4: Feature Implementation Planning

Before implementing new features, understand existing patterns:

Search for how this codebase implements caching.
Show me the caching patterns used and any cache invalidation logic.

This provides context that helps both you and Claude Code follow established patterns rather than introducing inconsistent implementations.

Optimizing GrepAI for Large Codebases

Configure Code Chunking

For very large files, adjust chunking parameters in .grepai/config.yaml to balance search accuracy with index size. Smaller chunks provide more precise results but increase index size.

Use Workspace Commands for Monorepos

If you work with multiple related projects, GrepAI supports workspace commands to manage indices across repositories:

grepai workspace add /path/to/frontend
grepai workspace add /path/to/backend
grepai workspace add /path/to/shared-libs

Exclude Build Artifacts

Ensure your .grepai configuration excludes generated files, node_modules, vendor directories, and build artifacts. This keeps the index focused on actual source code and improves search relevance.
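
As an illustration only, an ignore list in .grepai/config.yaml might look like the sketch below. The key name and pattern syntax are hypothetical, so match whatever exclusion settings your generated config already uses:

ignore:              # hypothetical key; check the config grepai init created
  - node_modules/
  - vendor/
  - dist/
  - build/
  - coverage/
  - "*.min.js"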

Comparing GrepAI to Alternatives

GrepAI vs Traditional Grep

Traditional grep excels at finding exact text matches quickly. It is faster for simple pattern matching but requires knowing implementation details. GrepAI shines when you know what you want but not how it is named.

IDE search tools like VS Code's search provide regex and file filtering but still rely on text matching. GrepAI's semantic understanding finds conceptually related code that IDE search would miss.

GitHub's code search works across repositories but requires internet connectivity and sends queries to external servers. GrepAI runs entirely locally, making it suitable for proprietary code and offline development.

Token Management Best Practices with GrepAI

To maximize token savings when using GrepAI with Claude Code:

Start Sessions with Semantic Context: Begin new tasks by having Claude use GrepAI to understand relevant code sections rather than exploring the codebase through file reads.

Use Call Graph Tracing Before Modifications: Always trace callers and callees before changing function signatures. This prevents Claude from missing dependent code and reduces correction iterations.

Combine with CLAUDE.md Optimization: Reference GrepAI capabilities in your CLAUDE.md file so Claude knows to use semantic search instead of traditional exploration. For comprehensive CLAUDE.md strategies, see my legacy codebase practitioner's guide.
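
For example, a short note along these lines in CLAUDE.md (wording is illustrative; the tool names are the ones listed in the MCP section above) steers Claude toward semantic search first:

- Prefer the grepai MCP tools (grepai_search, grepai_trace_callers, grepai_trace_callees) over built-in Grep/Glob when exploring this codebase.
- Before changing any function signature, trace its callers with grepai_trace_callers.
- Fall back to exact text search only for literal strings or regex patterns.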

Monitor MCP Server Impact: Use /context in Claude Code to verify GrepAI's tool definitions do not consume excessive context. GrepAI's MCP footprint is relatively small compared to the token savings it provides.

Frequently Asked Questions

Does GrepAI work with all programming languages?

GrepAI's semantic search works with any text-based code. Call graph tracing specifically supports Go, TypeScript, JavaScript, Python, PHP, Java, C, C#, C++, Rust, and Zig.

How much disk space does the index require?

Index size depends on codebase size and chunking configuration. A typical medium-sized project (10,000 files) creates an index of approximately 100-500MB.

Can I use GrepAI without Ollama?

Yes. GrepAI supports OpenAI's embedding API as an alternative. Configure your API key in .grepai/config.yaml to use text-embedding-3-small. This trades local privacy for potentially faster indexing.

Does GrepAI slow down my editor?

The grepai watch daemon runs in the background with minimal resource usage. Searches complete in milliseconds. Most developers report no perceptible impact on editor performance.

Is GrepAI suitable for enterprise codebases?

Yes. The 100% local operation makes GrepAI suitable for proprietary code with strict security requirements. No code or queries are sent to external servers when using Ollama embeddings.

Conclusion

GrepAI represents a meaningful evolution in how developers navigate and understand codebases, particularly when working with AI coding assistants. The 27.5% cost reduction and 97% input token decrease demonstrated in benchmarks translate to real savings for developers on Claude Code's Pro plan.

The key benefits worth highlighting:

  • Semantic search finds relevant code by meaning, not exact text matches
  • Call graph tracing enables safe refactoring with full dependency awareness
  • MCP integration makes these capabilities native to Claude Code, Cursor, and Windsurf
  • Local operation keeps proprietary code secure and enables offline development
  • Meaningful token savings extend your subscription value significantly

For developers already using Claude Code or other AI assistants, adding GrepAI to your toolkit requires minimal setup and provides immediate benefits. The combination of semantic understanding and local privacy makes it particularly valuable for professional development on proprietary codebases.

Start with grepai init in your next project and experience the difference semantic code search makes in AI-assisted development.


Richard Joseph Porter

Senior Laravel Developer with 14+ years of experience building scalable web applications. Specializing in PHP, Laravel, Vue.js, and AWS cloud infrastructure. Based in Cebu, Philippines, I help businesses modernize legacy systems and build high-performance APIs.
