Preprint / Working Paper, Year: 2022

Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

Abstract

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice for improving generation diversity is to sample multiple outputs from the model. However, there is no simple and robust way to select the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches to selecting high-quality questions from a set of LLM-generated candidates. Our method works under the constraints of 1) a black-box (non-modifiable) question generation model and 2) no access to human-annotated references, both of which are realistic limitations for real-world deployment of LLMs. With both automatic and human evaluations, we empirically demonstrate that our approach effectively selects questions of higher quality than greedy generation.
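The abstract describes the general recipe at a high level: draw several stochastic samples from a fixed, black-box generator, then rank the candidates with a second prompt rather than with human references. The paper's actual prompts and selection procedures are not reproduced here; the following is a minimal illustrative sketch of that recipe in Python, assuming a Hugging Face causal LM (gpt2 as a stand-in for a larger model), a made-up passage, and average token log-likelihood under a hypothetical scoring prompt as the ranking criterion.

# Hypothetical sketch: sample several candidate questions from a black-box
# generator, then score each with a separate prompt and keep the best.
# Model name, prompts, and the scoring heuristic are illustrative assumptions,
# not the paper's actual method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

context = "The Eiffel Tower, completed in 1889, is 330 metres tall."
gen_prompt = f"Passage: {context}\nWrite a question about the passage:\n"

# 1) Sample multiple stochastic candidates (diversity via temperature sampling).
inputs = tokenizer(gen_prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.9,
        num_return_sequences=5,
        max_new_tokens=30,
        pad_token_id=tokenizer.eos_token_id,
    )
prompt_len = inputs["input_ids"].shape[1]
candidates = [
    tokenizer.decode(o[prompt_len:], skip_special_tokens=True).strip()
    for o in outputs
]

# 2) Prompt-based scoring: average log-likelihood of the candidate's tokens
#    conditioned on a scoring prompt; needs no references or fine-tuning.
def score(question: str) -> float:
    if not question:
        return float("-inf")
    scoring_prompt = (
        f"Passage: {context}\n"
        "A good, answerable question about the passage is:\n"
    )
    # Assumes the prompt's tokenization is a prefix of the full tokenization
    # (holds here because the prompt ends with a newline token).
    prompt_ids = tokenizer(scoring_prompt, return_tensors="pt")["input_ids"]
    full_ids = tokenizer(scoring_prompt + question, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the logits predicts token i+1 of the input.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    target = full_ids[0, 1:]
    start = prompt_ids.shape[1] - 1  # first predicted question token
    question_lp = log_probs[start:].gather(1, target[start:].unsqueeze(1))
    return question_lp.mean().item()

best = max(candidates, key=score)
print("Selected question:", best)

For comparison, greedy decoding (do_sample=False) would produce the single baseline question that such a sample-then-select step is meant to improve on.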
Main file: 2209.11000.pdf (507.4 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03897371, version 1 (13-12-2022)

Identifiers

  • HAL Id: hal-03897371, version 1

Cite

Xingdi Yuan, Tong Wang, Yen-Hsiang Wang, Emery Fine, Rania Abdelghani, et al.. Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation. 2022. ⟨hal-03897371⟩

Collections

INRIA INRIA2
