<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Blog on Giovanni Pinna</title>
    <link>https://giovannipinna.net/posts/</link>
    <description>Recent content in Blog on Giovanni Pinna</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Tue, 14 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://giovannipinna.net/posts/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>There Is No &#34;Best&#34; AI Coding Agent — And That&#39;s the Whole Point</title>
      <link>https://giovannipinna.net/posts/msr2026-comparing-ai-agents/</link>
      <pubDate>Tue, 14 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/msr2026-comparing-ai-agents/</guid>
      <description>We looked at 7,156 pull requests from five AI coding agents on real open-source projects. The agent matters less than you&#39;d think. The kind of work matters far more.</description>
    </item>
    <item>
      <title>When AI Agents Lie About Their Own Code (Without Meaning To)</title>
      <link>https://giovannipinna.net/posts/msr2026-message-code-inconsistency/</link>
      <pubDate>Tue, 14 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/msr2026-message-code-inconsistency/</guid>
      <description>Only 1.7% of AI-authored pull requests have descriptions that don&#39;t match their code. Those PRs get accepted 51.7% less often and take 3.5× longer to merge. Trust is the bottleneck nobody is measuring.</description>
    </item>
    <item>
      <title>Sometimes the Best Feature Engineering Is Throwing Features Away</title>
      <link>https://giovannipinna.net/posts/ssbse2025-hotcat/</link>
      <pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/ssbse2025-hotcat/</guid>
      <description>Classifying urgent software hotfixes is hard: tiny dataset, brutal class imbalance, expensive LLM features. We let evolution pick which features to keep — and discovered some were actively making things worse.</description>
    </item>
    <item>
      <title>Sometimes Your AI Agent Burns More Energy Optimizing Code Than the Code Will Ever Save</title>
      <link>https://giovannipinna.net/posts/ssbse2025-ga4gc/</link>
      <pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/ssbse2025-ga4gc/</guid>
      <description>AI coding agents that &#39;optimize&#39; your code can cost more energy than they save until the code has run hundreds of thousands of times. We tuned the agent itself, and got 37.7% faster runs and better code at the same time.</description>
    </item>
    <item>
      <title>The Text-to-SQL Field Has a Measurement Problem</title>
      <link>https://giovannipinna.net/posts/scireports2025-text-to-sql-metrics/</link>
      <pubDate>Wed, 02 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/scireports2025-text-to-sql-metrics/</guid>
      <description>Every text-to-SQL benchmark today scores queries as either perfect or wrong. That&#39;s a coin flip dressed up as a metric. We built one that actually tells you how close you got.</description>
    </item>
    <item>
      <title>Making the LLM-Plus-Evolution Pipeline Actually Smart</title>
      <link>https://giovannipinna.net/posts/sncs2025-exploring-gi-effect/</link>
      <pubDate>Tue, 01 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/sncs2025-exploring-gi-effect/</guid>
      <description>Last year we showed evolution can fix LLM code. This year we made the evolution itself smarter — better selection, partial credit, fewer cycles — and got improvements in 11 of 12 cases.</description>
    </item>
    <item>
      <title>Improving LLM-Generated Code via Genetic Improvement: A Summary of Recent Advances</title>
      <link>https://giovannipinna.net/posts/italia2025-gi-summary/</link>
      <pubDate>Mon, 23 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/italia2025-gi-summary/</guid>
      <description>A comprehensive summary of our research program on applying Genetic Improvement to LLM-generated code, presented at the Italian national AI conference Ital-IA 2025.</description>
    </item>
    <item>
      <title>GPT-4 Can Make Court Rulings Easier to Read. It Can Also Lie to You About Them, Confidently.</title>
      <link>https://giovannipinna.net/posts/wiat2024-courts-to-comprehension/</link>
      <pubDate>Tue, 10 Dec 2024 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/wiat2024-courts-to-comprehension/</guid>
      <description>We asked 75 people to read summaries of Italian Constitutional Court rulings — written by experts, by GPT-4o, by a fine-tuned LLaMA, and the raw judgments themselves. The results say more about LLMs than about courts.</description>
    </item>
    <item>
      <title>What If We Stopped Asking ChatGPT to Fix Its Own Code?</title>
      <link>https://giovannipinna.net/posts/eurogp2024-gi-for-llm-code/</link>
      <pubDate>Wed, 03 Apr 2024 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/eurogp2024-gi-for-llm-code/</guid>
      <description>Self-correction is the default fix for buggy LLM code, but it has a ceiling. We tried something stranger — evolving the code instead — and it worked across every model we tested.</description>
    </item>
    <item>
      <title>Influence: Where Marketing Meets Artificial Intelligence</title>
      <link>https://giovannipinna.net/posts/influence-project/</link>
      <pubDate>Wed, 13 Jan 2021 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/influence-project/</guid>
      <description>A project exploring the intersection of marketing and artificial intelligence, using data analysis to generate targeted social media content.</description>
    </item>
    <item>
      <title>Book Review: Thinking, Fast and Slow by Daniel Kahneman</title>
      <link>https://giovannipinna.net/posts/pensieri-lenti-e-veloci/</link>
      <pubDate>Sun, 10 Jan 2021 10:00:00 +0100</pubDate>
      <guid>https://giovannipinna.net/posts/pensieri-lenti-e-veloci/</guid>
      <description>A review of Daniel Kahneman&#39;s Nobel Prize-winning work on how we make decisions and the cognitive biases that influence our thinking.</description>
    </item>
    <item>
      <title>Book Review: Don&#39;t Make Me Think by Steve Krug</title>
      <link>https://giovannipinna.net/posts/dont-make-me-think/</link>
      <pubDate>Sun, 10 Jan 2021 09:00:00 +0100</pubDate>
      <guid>https://giovannipinna.net/posts/dont-make-me-think/</guid>
      <description>A review of Steve Krug&#39;s classic guide to web usability and human-computer interaction.</description>
    </item>
  </channel>
</rss>
