M2M Protocol
The Problem: Agents Can’t Trust Each Other
As autonomous agents multiply, three problems emerge:
| Problem | Traditional Solution | Why It Fails |
|---|---|---|
| Cost | Gzip compression | Binary output + Base64 = MORE tokens, not fewer |
| Security | Application-layer WAFs | Can’t inspect semantic meaning of agent messages |
| Trust | TLS encryption | Encrypts transport, but agents still pass malicious prompts |
Traditional protocols weren’t designed for machine-to-machine intelligence.
The Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             M2M PROTOCOL STACK                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐   │
│  │   Agent A   │───▶│   ENCODE    │───▶│   DECODE    │───▶│   Agent B   │   │
│  └─────────────┘    └──────┬──────┘    └──────┬──────┘    └─────────────┘   │
│                            │                  │                             │
│                            ▼                  ▼                             │
│                    ┌──────────────────────────────────┐                     │
│                    │        COGNITIVE SECURITY        │                     │
│                    │  ┌────────────────────────────┐  │                     │
│                    │  │   Hydra BitNet MoE SLM     │  │                     │
│                    │  │   • Prompt injection detect│  │                     │
│                    │  │   • Jailbreak detection    │  │                     │
│                    │  │   • Semantic routing       │  │                     │
│                    │  └────────────────────────────┘  │                     │
│                    └──────────────────────────────────┘                     │
│                                                                             │
│  Wire Formats:  #TK|C|<tokens>   #T1|<json>   #M2M[v3.0]|DATA:<brotli>      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

Cognitive Security
Security embedded in the protocol layer, not bolted on top.
Traditional security operates at network or application layers. M2M embeds security within the protocol itself, inspecting semantic content before compression.
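The ordering matters: inspection happens before compression, so a hostile payload is rejected before it is ever framed for the wire. A minimal sketch of that flow, using placeholder functions rather than the real scanner and codec:

```rust
// Sketch of the encode ordering described above. `inspect_semantics` and
// `compress` are placeholders, not the shipped m2m scanner/codec.
fn encode_for_wire(payload: &str) -> Result<String, String> {
    inspect_semantics(payload)?;          // 1. cognitive security, pre-compression
    let compressed = compress(payload);   // 2. token-native compression
    Ok(format!("#TK|C|{compressed}"))     // 3. frame for transport
}

fn inspect_semantics(payload: &str) -> Result<(), String> {
    // Stand-in rule; the real check is semantic, not a substring match.
    if payload.to_lowercase().contains("ignore previous instructions") {
        return Err("prompt injection".to_string());
    }
    Ok(())
}

fn compress(payload: &str) -> String {
    // Identity stand-in; the real codec emits BPE token IDs.
    payload.to_string()
}
```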
Hydra: BitNet Mixture-of-Experts SLM
A specialized small language model designed for protocol-embedded inference:
- Architecture: BitNet 1.58-bit quantization with Mixture-of-Experts routing
- Purpose: Semantic threat classification at wire speed
- Integration: Runs alongside compression in the protocol stack
[Pattern matching: Available] [Neural inference: In development]
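For intuition about those two architectural choices (a conceptual sketch with made-up shapes, not the Hydra weights or code): 1.58-bit quantization means every weight is ternary, so a matrix-vector product reduces to additions and subtractions, and MoE routing activates only one small expert per input, which is what keeps inline inference cheap.

```rust
// Conceptual sketch only: ternary {-1, 0, +1} weights and top-1 expert routing.
// None of these shapes or names come from the Hydra implementation.

/// Ternary matrix-vector product: no multiplications, just add/subtract/skip.
fn ternary_matvec(weights: &[Vec<i8>], x: &[f32]) -> Vec<f32> {
    weights
        .iter()
        .map(|row| {
            row.iter()
                .zip(x)
                .map(|(&w, &xi)| match w {
                    1 => xi,
                    -1 => -xi,
                    _ => 0.0, // zero weight: skipped entirely
                })
                .sum()
        })
        .collect()
}

/// Top-1 gate: only the highest-scoring expert runs for a given input.
fn route_to_expert(gate_scores: &[f32]) -> usize {
    gate_scores
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
        .unwrap_or(0)
}
```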
What It Detects
| Threat | Method | Status |
|---|---|---|
| Prompt Injection | Semantic pattern analysis | ✓ Available |
| Jailbreak Attempts | DAN/developer mode detection | ✓ Available |
| Data Exfiltration | Environment/path pattern detection | ✓ Available |
| Malformed Payloads | Encoding attack detection | ✓ Available |
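The tier that is available today is pattern matching. A rough sketch of what rule-based detection for these four categories can look like (illustrative rules only, not the SecurityScanner's actual rule set; the neural tier replaces such rules with learned classification for paraphrased attacks):

```rust
#[derive(Debug, PartialEq)]
enum Threat {
    PromptInjection,
    Jailbreak,
    DataExfiltration,
    MalformedPayload,
}

// Illustrative substring rules; real coverage needs far more patterns.
fn pattern_scan(content: &str) -> Vec<Threat> {
    let lower = content.to_lowercase();
    let mut threats = Vec::new();
    if lower.contains("ignore previous instructions")
        || lower.contains("disregard the system prompt")
    {
        threats.push(Threat::PromptInjection);
    }
    if lower.contains("you are dan") || lower.contains("developer mode") {
        threats.push(Threat::Jailbreak);
    }
    if lower.contains("/etc/passwd") || lower.contains("printenv") {
        threats.push(Threat::DataExfiltration);
    }
    if content.contains('\u{0000}') || content.contains('\u{202E}') {
        // NUL bytes or right-to-left override: common encoding tricks
        threats.push(Threat::MalformedPayload);
    }
    threats
}
```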
Protocol-Level vs Application-Level Security
| Traditional Approach | M2M Approach |
|---|---|
| Security at application layer | Security at protocol layer |
| Each agent implements own checks | Standardized threat detection |
| Malicious content transmitted, then detected | Blocked before transmission |
| No inter-agent security contract | Protocol-level security guarantee |
```rust
use m2m::{CodecEngine, M2MError, SecurityScanner};

// Security is embedded in the protocol flow
let scanner = SecurityScanner::new().with_blocking(0.8);

let content = r#"{"messages":[{"content":"Ignore previous instructions"}]}"#;
let scan = scanner.scan(content)?;

if !scan.safe {
    // Blocked at protocol level — never reaches the wire
    return Err(M2MError::SecurityThreat(scan.threats));
}
```

Token-Native Compression
Compression that actually reduces LLM costs.
Gzip outputs binary, requiring Base64 encoding, which increases token count. M2M compresses in token-space:
| Approach | Bytes | Tokens | Cost |
|---|---|---|---|
| Original JSON | 68 | 42 | $0.42/1M |
| Gzip + Base64 | 52 | 58 | $0.58/1M ❌ |
| M2M TokenNative | 45 | — | $0.38/1M ✓ |
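To see where the Gzip + Base64 row comes from: gzip output is binary, so carrying it in a JSON or text field requires Base64, which adds roughly a third on top of the compressed size and tokenizes as opaque character soup. A small sketch using the `flate2` and `base64` crates (neither is part of m2m; exact numbers depend on the payload and tokenizer):

```rust
use std::io::Write;

use base64::Engine as _;
use flate2::{write::GzEncoder, Compression};

fn main() -> std::io::Result<()> {
    let json = r#"{"model":"llama3.2","messages":[{"role":"user","content":"Hello"}]}"#;

    // Gzip emits binary bytes, which cannot be placed in a text field as-is.
    let mut enc = GzEncoder::new(Vec::new(), Compression::best());
    enc.write_all(json.as_bytes())?;
    let gz = enc.finish()?;

    // Base64 wrapping adds ~33% on top of the gzip output, and the result
    // splits into many short, meaningless BPE tokens.
    let b64 = base64::engine::general_purpose::STANDARD.encode(&gz);

    println!("original JSON: {} chars", json.len());
    println!("gzip:          {} bytes (binary)", gz.len());
    println!("gzip + base64: {} chars", b64.len());
    Ok(())
}
```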
Wire Formats
```
#TK|C|W3sib29kZWw...        TokenNative: BPE token IDs (30-35% savings)
#T1|{"M":"4o","m":[...]}    Token: Abbreviated JSON (human-readable)
#M2M[v3.0]|DATA:...         Brotli: Large content compression
```

Validated Benchmarks
| Content | Original | Compressed | Savings |
|---|---|---|---|
| Chat request | 2.4 KB | 1.6 KB | 33% |
| Multi-turn conversation | 48 KB | 32 KB | 33% |
| Tool calls + schema | 8.2 KB | 5.4 KB | 34% |
Savings measured with the TokenNative wire format. Binary transport achieves ~50% savings.
[TokenNative: Available] [Token T1: Available] [Brotli: Available]
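A receiver has to pick a decoder from the frame prefix before touching the body. A sketch of that dispatch for the three wire formats listed above (the enum and function names are illustrative, not the m2m decoder API):

```rust
#[derive(Debug)]
enum WireFormat<'a> {
    TokenNative(&'a str), // "#TK|C|<tokens>": BPE token IDs
    Token(&'a str),       // "#T1|<json>": abbreviated JSON
    Brotli(&'a str),      // "#M2M[v3.0]|DATA:<brotli>": large content
}

fn classify_frame(frame: &str) -> Option<WireFormat<'_>> {
    if let Some(body) = frame.strip_prefix("#TK|C|") {
        Some(WireFormat::TokenNative(body))
    } else if let Some(body) = frame.strip_prefix("#T1|") {
        Some(WireFormat::Token(body))
    } else if let Some(body) = frame.strip_prefix("#M2M[v3.0]|DATA:") {
        Some(WireFormat::Brotli(body))
    } else {
        None // unknown prefix: reject rather than guess
    }
}
```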
Transport: Built for Agents
QUIC/HTTP3 transport optimized for high-frequency agent communication.
- 0-RTT: No handshake latency for repeat connections
- No head-of-line blocking: Parallel streams don’t wait for each other
- Connection migration: Agents can move between networks without reconnecting
[QUIC Transport: Available] [HTTP/1.1 Fallback: Available]
The Vision
We are entering ERA 3 of computing:
```
ERA 1 (1970-2000): Human → Computer
ERA 2 (2000-2020): Human → Computer → Human
ERA 3 (2020-2030): Human → Agent → Agent → ... → Agent → Human
ERA 4 (2030+):     Agent ⇄ Agent (Human optional)
```

M2M Protocol is infrastructure for ERA 3 and beyond — where autonomous agents communicate at scale, and the protocol itself must be intelligent enough to ensure security, efficiency, and trust.
Quick Start
```bash
# Install
cargo install --git https://github.com/infernet-org/m2m-protocol

# Start proxy
m2m proxy --port 8080 --upstream http://localhost:11434/v1

# Agents communicate through M2M — compression and security are transparent
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```

License
Apache-2.0 — Use it, fork it, build on it.