<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>LLM on Giovanni Pinna</title>
    <link>https://giovannipinna.net/tags/llm/</link>
    <description>Recent content in LLM on Giovanni Pinna</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Wed, 02 Jul 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://giovannipinna.net/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>The Text-to-SQL Field Has a Measurement Problem</title>
      <link>https://giovannipinna.net/posts/scireports2025-text-to-sql-metrics/</link>
      <pubDate>Wed, 02 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/scireports2025-text-to-sql-metrics/</guid>
      <description>Every text-to-SQL benchmark today scores queries as either perfect or wrong. That's a coin flip dressed up as a metric. We built one that actually tells you how close you got.</description>
    </item>
    <item>
      <title>Making the LLM-Plus-Evolution Pipeline Actually Smart</title>
      <link>https://giovannipinna.net/posts/sncs2025-exploring-gi-effect/</link>
      <pubDate>Tue, 01 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/sncs2025-exploring-gi-effect/</guid>
      <description>Last year we showed evolution can fix LLM code. This year we made the evolution itself smarter — better selection, partial credit, fewer cycles — and got improvements in 11 of 12 cases.</description>
    </item>
    <item>
      <title>Improving LLM-Generated Code via Genetic Improvement: A Summary of Recent Advances</title>
      <link>https://giovannipinna.net/posts/italia2025-gi-summary/</link>
      <pubDate>Mon, 23 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/italia2025-gi-summary/</guid>
      <description>A comprehensive summary of our research program on applying Genetic Improvement to LLM-generated code, presented at the Italian national AI conference Ital-IA 2025.</description>
    </item>
    <item>
      <title>GPT-4 Can Make Court Rulings Easier to Read. It Can Also Lie to You About Them, Confidently.</title>
      <link>https://giovannipinna.net/posts/wiat2024-courts-to-comprehension/</link>
      <pubDate>Tue, 10 Dec 2024 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/wiat2024-courts-to-comprehension/</guid>
      <description>We asked 75 people to read summaries of Italian Constitutional Court rulings — written by experts, by GPT-4o, by a fine-tuned LLaMA, and the raw judgments themselves. The results say more about LLMs than about courts.</description>
    </item>
    <item>
      <title>What If We Stopped Asking ChatGPT to Fix Its Own Code?</title>
      <link>https://giovannipinna.net/posts/eurogp2024-gi-for-llm-code/</link>
      <pubDate>Wed, 03 Apr 2024 00:00:00 +0000</pubDate>
      <guid>https://giovannipinna.net/posts/eurogp2024-gi-for-llm-code/</guid>
      <description>Self-correction is the default fix for buggy LLM code, but it has a ceiling. We tried something stranger — evolving the code instead — and it worked across every model we tested.</description>
    </item>
  </channel>
</rss>
