Sometimes the Best Feature Engineering Is Throwing Features Away

TL;DR Classifying software hotfixes — the panic-mode patches you ship to fix something that’s broken in production right now — is hard for ML: tiny dataset (88 entries, 17 categories), brutal class imbalance, and expensive LLM features. HotCat reframes feature engineering as a search problem: NSGA-II evolves binary masks over 18 features, optimizing accuracy, NMI, and runtime simultaneously. A two-stage data augmentation lifts generalization from 55% → 72%. The Pareto front gives a balanced config: 59% accuracy, 0.58 NMI, 129 seconds. Most surprising: some features actively hurt — pruning them is both faster and more accurate.

Hotfixes are not normal bugs

In any normal software project, bugs queue up. They get triaged, prioritized, scheduled into sprints. Some sit there for months. ...

October 13, 2025 · 5 min · Giovanni Pinna
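The core idea in the HotCat preview — feature subsets as binary masks, kept or discarded by Pareto dominance over accuracy, NMI, and runtime — can be sketched in a few lines. This is a toy illustration, not the HotCat code: the objective values below are synthetic stand-ins, and in the real system each mask would be scored by actually training and evaluating the classifier.

```python
# Toy sketch of binary-mask feature selection with a Pareto filter.
# NOT the HotCat implementation: evaluate() fakes the three objectives
# (accuracy and NMI maximized, runtime minimized) so the dominance
# logic is self-contained and runnable.
import random

N_FEATURES = 18  # mask length, as in the post

def evaluate(mask):
    """Return synthetic (accuracy, NMI, runtime) for a feature mask.
    Seeded per-mask so results are deterministic."""
    k = sum(mask)
    rng = random.Random(hash(tuple(mask)))
    accuracy = 0.4 + 0.02 * k - 0.001 * k * k + rng.uniform(0, 0.05)
    nmi = 0.3 + 0.015 * k + rng.uniform(0, 0.05)
    runtime = 10 + 8 * k + rng.uniform(0, 5)  # more features -> slower
    return accuracy, nmi, runtime

def dominates(a, b):
    """a dominates b: no worse on every objective, strictly better on
    at least one (maximize accuracy and NMI, minimize runtime)."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    strictly = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly

def pareto_front(population):
    """Keep only the masks no other mask dominates."""
    scored = [(m, evaluate(m)) for m in population]
    return [(m, s) for m, s in scored
            if not any(dominates(t, s) for _, t in scored)]

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
              for _ in range(40)]
front = pareto_front(population)
```

NSGA-II adds selection, crossover, and bit-flip mutation on top of exactly this dominance test, evolving the mask population generation by generation instead of filtering a single random batch.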

Sometimes Your AI Agent Burns More Energy Optimizing Code Than the Code Will Ever Save

TL;DR AI coding agents burn 100,000+ tokens per task. When the task is “optimize this code’s performance,” the agent itself often costs more energy than the optimized code will ever save. We built GA4GC — Greener Agent for Greener Code — using NSGA-II to tune the agent’s own configuration against three objectives: code correctness, code speedup, and agent runtime. On a mini-SWE-agent powered by Gemini 2.5 Pro on the SWE-Perf benchmark, we got 37.7% runtime reduction while also improving correctness, with a 135× hypervolume improvement over defaults. Bonus finding: temperature is the single most important knob, and LLM hyperparameters control quality while agent constraints control cost — they can be tuned almost independently.

The energy paradox nobody talks about

Here’s a thing that should be obvious but isn’t: when you ask an AI agent to optimize the performance of your code, the agent’s own execution costs energy. A lot of energy. Often more than the code it’s optimizing will ever save. ...

October 13, 2025 · 5 min · Giovanni Pinna
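The GA4GC preview scores configurations by hypervolume — the region of objective space a Pareto front dominates relative to a reference point, which is what the “135× improvement over defaults” compares. A minimal sketch of the 2-D case (both objectives minimized: negated speedup and agent runtime); all numbers here are hypothetical, not GA4GC’s measurements:

```python
# Toy 2-D hypervolume sketch. NOT the GA4GC code: fronts and the
# reference point are made-up numbers chosen only to illustrate the
# indicator. Both coordinates are minimized.

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by reference point `ref`.
    Standard staircase sweep: sort by the first objective, then sum
    the rectangle each point adds beyond the previous best."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:  # skip points dominated within the set
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

# Hypothetical fronts: (negated speedup, agent runtime in minutes).
default_front = [(-1.0, 30.0)]
tuned_front = [(-1.4, 25.0), (-1.2, 18.0), (-1.05, 12.0)]
ref = (0.0, 40.0)

hv_default = hypervolume_2d(default_front, ref)  # 1.0 * 10 = 10.0
hv_tuned = hypervolume_2d(tuned_front, ref)      # 21 + 8.4 + 6.3 = 35.7
```

A larger hypervolume means the front pushes further toward better speedup and lower agent cost at the same time, which is why a single scalar can summarize progress on all objectives at once.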