The Problem: Resource Hoarding
In large-scale computing clusters, efficient resource allocation is fundamentally crippled by a widespread behavioral phenomenon: resource hoarding. Agents systematically over-report their actual compute and memory requirements to ensure maximum availability for their own workloads. Traditional static allocation rules and classical auction mechanisms struggle here because they generally take reported demands at face value. This naïve trust leads to artificial scarcity, massive resource waste, and severely degraded overall cluster efficiency.
To solve this, we move beyond rigid, theoretical economic models and treat mechanism design as a machine learning problem. By analyzing real cluster telemetry, we can observe the exact gap between what an agent requests and what they actually use. This project introduces a Transformer-based auction system trained directly on this real-world data. It actively learns the patterns of hoarding and generates allocation strategies that intentionally ignore inflated requests, heavily penalizing hoarding behavior while maintaining robust system-wide welfare.
Real Workload Signals
Dataset Mapping
Reported Demand = assigned_memory
α = Reported / True Demand
Systematically observed in real data, indicating a repeated pattern of inflated requests.
Waste = Reported - True
Direct inefficiency metric causing cluster underutilization.
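The two metrics above can be computed per record as in the minimal sketch below. The `DemandRecord` class and its field names are hypothetical; the source only states that reported demand maps to `assigned_memory`, so the origin of the true-demand value is left as an assumption.

```python
from dataclasses import dataclass


@dataclass
class DemandRecord:
    """One workload record (hypothetical container, not the dataset schema)."""
    reported: float  # reported demand; the source maps this to assigned_memory
    true: float      # true demand, i.e. observed usage (source column is an assumption)


def hoarding_factor(rec: DemandRecord) -> float:
    """alpha = Reported / True Demand."""
    if rec.true <= 0:
        raise ValueError("true demand must be positive to compute alpha")
    return rec.reported / rec.true


def waste(rec: DemandRecord) -> float:
    """Waste = Reported - True."""
    return rec.reported - rec.true


# Illustrative values only, not taken from the dataset.
example = DemandRecord(reported=0.5, true=0.25)
alpha = hoarding_factor(example)   # 2.0
wasted = waste(example)            # 0.25
```

A record with α = 1 and zero waste corresponds to a truthful report; anything above that is capacity that was reserved but never used.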
α Distribution
Values range from 2 to 10, with strong clustering that reveals repeated hoarding patterns. The highest frequency occurs at the lower end, but noticeable spikes exist even at extreme multipliers. Agents are therefore not just adding small safety margins; the pattern suggests they are deliberately multiplying requests to manipulate the system.
Waste Distribution
Waste is mostly small but non-zero, with a long tail that causes extreme inefficiency. While most instances of waste seem minor in isolation, the cumulative effect of the long tail, where agents requested far more than they could ever use, creates massive stranded capacity across the entire cluster.
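One way to quantify the long-tail effect described here is to measure what share of total waste comes from the largest few records. The sketch below is an illustration under assumptions: `tail_waste_share` and its `top_fraction` parameter are hypothetical names, and the sample values are made up for the example rather than drawn from the cluster data.

```python
import math


def tail_waste_share(wastes, top_fraction=0.1):
    """Fraction of total waste contributed by the largest `top_fraction` of records."""
    if not wastes:
        raise ValueError("wastes must be non-empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(wastes, reverse=True)
    # Always count at least one record in the tail.
    k = max(1, math.ceil(top_fraction * len(ordered)))
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:k]) / total


# Illustrative values only: nine small wastes and one extreme one.
sample = [1, 1, 1, 1, 1, 1, 1, 1, 1, 91]
share = tail_waste_share(sample, top_fraction=0.1)  # 0.91
```

A share close to 1 for a small `top_fraction` would indicate that a handful of extreme requests accounts for most of the stranded capacity.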
Transformer Auction
A neural mechanism that maps reported states to efficient, stochastic outcomes.
Training Dynamics (Critical)
The model trades peak short-term welfare for long-term stability. As Regret and Waste plummet, the system learns robust allocation. (Regret is the incentive for an agent to lie: the lower the regret, the closer truthful reporting is to the optimal strategy.) Notice the sharp initial drop in Waste (orange) and Regret (green), which coincides with a stabilization of Welfare (blue). The Transformer in effect learns that maximizing raw welfare is impossible without inadvertently rewarding hoarders, and settles instead on a highly stable, low-regret equilibrium.
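To make the regret notion concrete, the sketch below estimates it empirically for one agent: the best utility gain available from inflating a report, compared with reporting truthfully. Everything here is an assumption for illustration, including the `proportional_share` baseline rule, the `utility` function (an agent cannot use more than its true demand), and the grid of `multipliers`; it is not the training loss used by the Transformer auction.

```python
def proportional_share(reports, capacity=1.0):
    """Baseline rule that trusts reports: split capacity in proportion to them."""
    total = sum(reports)
    if total <= capacity:
        return list(reports)
    return [capacity * r / total for r in reports]


def utility(allocation, true_demand):
    """Hypothetical utility: resources beyond true demand are useless."""
    return min(allocation, true_demand)


def empirical_regret(allocate, true_demands, agent, multipliers=(1.0, 2.0, 5.0, 10.0)):
    """Best gain the agent can obtain by inflating its report (0 if lying never helps)."""
    truthful = allocate(list(true_demands))
    base = utility(truthful[agent], true_demands[agent])
    best = base
    for m in multipliers:
        reports = list(true_demands)
        reports[agent] = m * true_demands[agent]  # inflate by factor alpha = m
        alloc = allocate(reports)
        best = max(best, utility(alloc[agent], true_demands[agent]))
    return best - base


# Two agents that each truly need 0.8 of a cluster with capacity 1.0.
regret_trusting = empirical_regret(proportional_share, [0.8, 0.8], agent=0)
# A rule that ignores reports gives the agent nothing to gain from lying.
regret_ignoring = empirical_regret(lambda reports: [0.5] * len(reports), [0.8, 0.8], agent=0)
```

Under the proportional rule the agent gains by over-reporting, so its regret is positive; under a rule that does not respond to inflated reports the regret is zero, which is the behavior the low-regret equilibrium aims for.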
Efficiency vs. True Demand
Points lie below the ideal (y = x) line: the model learns conservative allocation to avoid over-allocation and reduce waste. Keeping allocations below true demand builds in a structural safeguard. The system essentially acts as a strict auditor: even an agent that tells the truth receives just enough to function, which leaves little room for opportunistic over-allocation.
Hoarding Response
Most important: allocation does NOT increase with α (e.g., α ≈ 2.3 → alloc ≈ 0.6; α ≈ 9 → alloc ≈ 0.3). The system ignores inflated demand. This is the core breakthrough. In a traditional system, a higher requested amount yields a higher allocation. Here, the scatter plot shows a downward trend: the more an agent exaggerates its needs, the fewer resources it is statistically likely to receive.
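The direction of the trend can be checked with a simple least-squares slope of allocation against α. This is a minimal sketch: `allocation_slope` is a hypothetical helper, and the only data points used are the two approximate values quoted above, so the result indicates direction only, not a fitted model of the scatter plot.

```python
def allocation_slope(alphas, allocations):
    """Least-squares slope of allocation as a function of alpha."""
    n = len(alphas)
    if n != len(allocations) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_a = sum(alphas) / n
    mean_x = sum(allocations) / n
    var_a = sum((a - mean_a) ** 2 for a in alphas)
    if var_a == 0:
        raise ValueError("alphas must not all be identical")
    cov = sum((a - mean_a) * (x - mean_x) for a, x in zip(alphas, allocations))
    return cov / var_a


# The two approximate points quoted in the text.
slope = allocation_slope([2.3, 9.0], [0.6, 0.3])
hoarding_is_rewarded = slope > 0  # False: the slope is negative
```

A positive slope would mean inflated requests are rewarded; a slope at or below zero matches the claim that hoarding does not improve allocation.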
Generative Allocation Sampling
The model is generative, not deterministic. It produces diverse but bounded allocations (low variance, slightly stochastic). Notice how multiple inference runs cluster in specific ranges (~0.2, ~0.3, ~0.4, ~0.5-0.6), ensuring robustness under uncertainty. This stochastic nature means agents cannot easily reverse-engineer the exact allocation function. By maintaining small variations within distinct bands, the mechanism remains unpredictable enough to deter strategic manipulation while still guaranteeing reliable baseline performance for the workloads.
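The "diverse but bounded" behavior can be illustrated with a toy sampler that adds a small random jitter around a band center and clips the result to a valid range. This is only a stand-in under stated assumptions: the real allocations come from the trained Transformer, and `sample_allocation`, `sample_many`, and the `jitter` width are hypothetical names and values chosen for the example.

```python
import random


def sample_allocation(band_center, jitter=0.02, rng=None):
    """Draw one allocation near `band_center`, clipped to [0, 1]."""
    if rng is None:
        rng = random.Random()
    value = band_center + rng.uniform(-jitter, jitter)
    return min(1.0, max(0.0, value))


def sample_many(band_center, runs=5, seed=0):
    """Repeat sampling with a seeded generator so the runs are reproducible."""
    rng = random.Random(seed)
    return [sample_allocation(band_center, rng=rng) for _ in range(runs)]


# Several inference runs for a workload whose allocations cluster around ~0.3.
runs = sample_many(0.3, runs=5, seed=0)
spread = max(runs) - min(runs)  # never more than 2 * jitter
```

Each run returns a slightly different value, yet all of them stay inside the same narrow band, which is the combination of unpredictability and reliability the section describes.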
Hoarding Exists
Real cluster data proves agents systematically over-report (α ≫ 1).
Waste is Measurable
The gap between reported and true demand is a direct inefficiency metric.
Intelligent Trade-offs
The model explicitly learns to reduce waste and regret while maintaining acceptable welfare.
Hoarding is Penalized
Most importantly: Hoarding does NOT improve allocation outcomes.
Computer Market Implications
Mapping modeled behavior to real-world hardware profiles.
Low α Profile
High allocation efficiency
High α Profile
Lower allocation