Giovanni Pinna

AI Researcher & Engineer | Ph.D. in Applied Data Science & AI

University of Trieste

Welcome!

I am Giovanni Pinna, an AI Researcher and Engineer based in Trieste, Italy. I hold a Ph.D. in Applied Data Science & Artificial Intelligence from the University of Trieste (March 2026).

My research focuses on the intersection of Natural Language Processing, Large Language Models, and Evolutionary Computation — particularly on improving LLM-generated code through Genetic Improvement techniques and developing evaluation metrics for Text-to-SQL systems.

I have international research experience at University College London (UCL) in London, UK, and at NOVA IMS in Lisbon, Portugal. I am the author of 10+ publications in top international venues, including Scientific Reports (Nature), IEEE Access, and EuroGP.

Interests

💬 NLP & Large Language Models
🗄️ Text-to-SQL
🧬 Genetic Improvement
🤖 AI Coding Agents
🔍 RAG Systems

Education

🎓
Ph.D. Applied Data Science & AI
University of Trieste, 2023 – 2025
🎓
M.Sc. Computer Science Engineering
University of Trieste, 2019 – 2022
🎓
B.Sc. Computer Science Engineering
University of Trieste, 2015 – 2019

Experience

🏛️
Applied AI Scientist
PLUS S.r.l. / Area Science Park, 2023 – 2025
🇬🇧
Visiting Researcher
UCL — CREST Centre, Sep – Dec 2025
🇵🇹
Visiting Researcher
NOVA IMS — Lisbon, 2024 & 2025

🗞️ News

Apr 2026 Published two papers, “Comparing AI Coding Agents: A Task-Stratified Analysis of Pull Request Acceptance” and “Analyzing Message-Code Inconsistency in AI Coding Agent-Authored Pull Requests”, at MSR 2026.
Mar 2026 Completed my Ph.D. in Applied Data Science & AI at the University of Trieste!
Sep 2025 Started visiting research at University College London (UCL), in Prof. Federica Sarro's group.
2025 Published “Redefining Text-to-SQL Metrics” in Scientific Reports (Nature) and two papers at SSBSE 2025 (GA4GC and HotCat).

🎓 Ph.D. Thesis

Application of Large Language Models: Addressing Real-World Challenges

Ph.D. Thesis — University of Trieste, 2025

Large Language Models promise to reshape how we write code, query data, and access knowledge, but do they really deliver? This thesis probes that question across four fronts: how we evaluate LLMs, how they democratize expertise, how we make them reliable, and how we deploy them sustainably. A human-centered study on legal texts shows that domain experts still outperform state-of-the-art models precisely where the stakes are highest. To close the reliability gap, I introduce a Genetic Improvement framework that systematically repairs LLM-generated code, and a continuous Text-to-SQL metric that uncovers distinctions hidden by pass-or-fail scores. On sustainability, multi-objective optimization discovers coding-agent configurations with over a hundredfold hypervolume improvement. The result: a path to LLM deployment that is not only powerful, but trustworthy.

Supervisors: Prof. Luca Manzoni, Prof. Andrea De Lorenzo.

📄 Selected Publications

Redefining Text-to-SQL Metrics by Incorporating Semantic and Structural Similarity
G. Pinna, Y. Perezhohin, L. Manzoni, M. Castelli, A. De Lorenzo
Scientific Reports 15.1 (Nature), 2025
Paper
Comparing AI Coding Agents: A Task-Stratified Analysis of Pull Request Acceptance
G. Pinna, J. Gong, D. Williams, F. Sarro
arXiv:2602.08915, 2026
arXiv
Analyzing Message-Code Inconsistency in AI Coding Agent-Authored Pull Requests
J. Gong, G. Pinna, Y. Bian, J. M. Zhang
arXiv:2601.04886, 2026
🏆 Distinguished Mining Challenge Paper Award, MSR 2026
Paper
Enhancing Large Language Models-Based Code Generation by Leveraging Genetic Improvement
G. Pinna, D. Ravalico, L. Rovito, L. Manzoni, A. De Lorenzo
EuroGP 2024, Springer LNCS vol. 14631
Paper
An Artificial Intelligence System for Automatic Recognition of Punches in Fourteenth-Century Panel Painting
M. Zullich, V. Macovaz, G. Pinna, F.A. Pellegrino
IEEE Access, 2023
Paper

📝 Recent Posts

There Is No "Best" AI Coding Agent — And That's the Whole Point

April 14, 2026 · 4 min read

We looked at 7,156 pull requests from five AI coding agents on real open-source projects. The agent matters less than you'd think. The kind of work matters far more.

When AI Agents Lie About Their Own Code (Without Meaning To)

April 14, 2026 · 5 min read

Only 1.7% of AI-authored pull requests have descriptions that don't match their code. Those PRs get accepted 51.7% less often and take 3.5× longer to merge. Trust is the bottleneck nobody is measuring.

Sometimes the Best Feature Engineering Is Throwing Features Away

October 13, 2025 · 5 min read

Classifying urgent software hotfixes is hard: tiny dataset, brutal class imbalance, expensive LLM features. We let evolution pick which features to keep — and discovered some were actively making things worse.