[2023/12/11 ~ 12/17] Top ML Papers of the Week

(discuss.pytorch.kr)

Posted by ninebow on 2023-12-18

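Overview

This post is based on an automatic translation of the ML paper roundup that DAIR.AI publishes every week.
This week's selection is dominated by research on LLMs (Large Language Models). The papers analyze how LLMs can be applied and improved across a range of themes, including discoveries in mathematics, generalization, applications in medicine, and training methods that go beyond human data.
This trend reflects the attention that LLM development has drawn in artificial intelligence over the past few years. In particular, the arrival of large language models such as OpenAI's GPT-3 greatly widened the range of fields to which these models might be applied. It suggests that LLMs can play an important role in complex problem solving, beyond plain text processing, and this week's papers show their influence reaching established fields such as mathematics and medicine. The emphasis on transparency and openness also underlines the importance of trust and collaboration in research and applications.
Papers such as Weak-to-strong Generalization and Beyond Human Data for LLMs study the generalization ability and training methodology of LLMs. They can be read as attempts to move past the limited training setups of today's LLMs and develop stronger learning mechanisms. Taken together, they show LLM research moving beyond a narrow focus on raw performance toward markedly better generalization and usefulness.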

LLMs for Discoveries in Mathematical Sciences

Summary

Uses LLMs to search for new solutions in mathematics and computer science. The paper proposes FunSearch, which pairs a pre-trained LLM with a systematic evaluator and iterates between them, evolving low-scoring programs into high-scoring ones that uncover new knowledge. One of the key findings is that safeguarding against LLM hallucinations is important for producing mathematical discoveries and for solving other real-world problems.

Paper link

https://www.nature.com/articles/s41586-023-06924-6

Further reading

https://x.com/GoogleDeepMind/status/1735332722208284797
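
The generate-and-evaluate loop summarized in this entry can be illustrated with a toy sketch. It is not the authors' implementation: the LLM is replaced by a hypothetical `propose_variant` stub that randomly mutates two coefficients, the objective (recovering `3 * x + 1`) is made up for the example, and the search is a simple greedy hill climb. What it does show is the evaluator acting as a safeguard: a candidate that fails to run is scored `None` and can never replace the current best program.

```python
import random


def evaluate(source: str):
    """Systematic evaluator: run a candidate program and score it.

    Returns None when the candidate cannot be executed, so broken
    ("hallucinated") programs are rejected instead of being trusted.
    """
    namespace = {}
    try:
        exec(source, namespace)
        f = namespace["f"]
        # Toy objective (an assumption): reproduce 3 * x + 1 on a few inputs.
        return -sum((f(x) - (3 * x + 1)) ** 2 for x in range(5))
    except Exception:
        return None


def propose_variant(params, rng):
    """Hypothetical stand-in for the pre-trained LLM.

    Mutates the parent's coefficients and, 20% of the time, emits
    syntactically broken code to mimic an unusable completion.
    """
    a = params[0] + rng.choice([-1, 0, 1])
    b = params[1] + rng.choice([-1, 0, 1])
    if rng.random() < 0.2:
        return (a, b), f"def f(x): return {a} * x +"
    return (a, b), f"def f(x): return {a} * x + {b}"


def funsearch_sketch(iterations=300, seed=0):
    """Greedy evolve loop: propose, evaluate, keep only valid improvements."""
    rng = random.Random(seed)
    best_params = (0, 0)
    best_score = evaluate("def f(x): return 0 * x + 0")
    for _ in range(iterations):
        params, source = propose_variant(best_params, rng)
        score = evaluate(source)
        # Keep a candidate only if it ran and strictly improved the score.
        if score is not None and score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    print(funsearch_sketch())
```

Scores are negative squared errors, so 0 is the best possible value. Because improvements are accepted only after the evaluator has actually executed the program, a faulty proposal costs one wasted iteration but cannot corrupt the result.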

Weak-to-strong Generalization

Summary

Studies whether supervision from a weak model can elicit the full capabilities of a stronger model. The authors find that strong pretrained models naively fine-tuned on labels generated by a weak model can outperform their weak supervisors, and report that fine-tuning GPT-4 with a GPT-2-level supervisor recovers close to GPT-3.5-level performance on NLP tasks.

Paper link

https://cdn.openai.com/papers/weak-to-strong-generalization.pdf

Further reading

https://x.com/OpenAI/status/1735349718765715913
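
One way to make "recovering" performance concrete is to ask what fraction of the gap between the weak supervisor and the strong model's ceiling is closed by weak-to-strong training. The sketch below assumes that definition (to my understanding it corresponds to the paper's "performance gap recovered" metric); the function name and the accuracy values in the example are illustrative, not results from the paper.

```python
def performance_gap_recovered(weak: float, weak_to_strong: float,
                              strong_ceiling: float) -> float:
    """Fraction of the weak-to-ceiling gap closed by weak-to-strong training.

    weak:            accuracy of the weak supervisor
    weak_to_strong:  accuracy of the strong model trained on weak labels
    strong_ceiling:  accuracy of the strong model trained on ground truth
    """
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong_ceiling must exceed weak performance")
    return (weak_to_strong - weak) / gap


if __name__ == "__main__":
    # Hypothetical accuracies, chosen only to show the arithmetic.
    print(performance_gap_recovered(weak=0.60, weak_to_strong=0.72,
                                    strong_ceiling=0.80))
```

A value of 0 means the strong student did no better than its weak supervisor, and 1 means it matched a strong model trained on ground-truth labels.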

Audiobox

Summary

A unified flow-matching-based model capable of generating various audio modalities. It designs description-based and example-based prompting to improve controllability and to unify the speech and sound generation paradigms, and adapts a self-supervised infilling objective to pre-train on large quantities of unlabeled audio. The model performs well on speech and sound generation and unlocks new methods for generating audio with novel vocal and acoustic styles.

Paper link

https://ai.meta.com/research/publications/…

Further reading

https://x.com/AIatMeta/status/1734257634008531453
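
The summary does not spell out Audiobox's training objective, so the following is only a generic illustration of what "flow matching" means. It assumes the linear (optimal-transport) conditional path commonly used in the flow matching literature, and it omits the masking used for infilling and any conditioning on prompts. The function names and the `sigma_min` default are choices made for this sketch.

```python
import random


def flow_matching_example(x1, t, rng, sigma_min=1e-5):
    """Build one training example for a flow-matching regression loss.

    x1 is a data sample (for instance one audio feature frame) and t is a
    time in [0, 1]. Returns the noise sample x0, the interpolated point
    x_t, and the target velocity a network would be trained to predict.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]
    scale = 1.0 - sigma_min
    # Point on the straight path from noise (t = 0) to data (t = 1).
    x_t = [(1.0 - scale * t) * n + t * d for n, d in zip(x0, x1)]
    # Time derivative of that path, used as the regression target.
    target = [d - scale * n for n, d in zip(x0, x1)]
    return x0, x_t, target


def squared_error(prediction, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(prediction, target)) / len(target)


if __name__ == "__main__":
    rng = random.Random(0)
    x0, x_t, target = flow_matching_example([0.5, -1.0, 2.0], t=0.3, rng=rng)
    # A real model would predict the velocity from (x_t, t); zeros stand in here.
    print(squared_error([0.0, 0.0, 0.0], target))
```

In an actual model, a neural network would take `x_t` and `t` (plus any conditioning) and be trained to minimize `squared_error` against `target`; generation then integrates the learned velocity field from noise to data.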

Mathematical Language Models: A Survey

Summary

A survey of LLM progress on mathematical tasks. It covers papers and resources on LLM research around prompting techniques and tasks such as math word problem solving and theorem proving.

Abstract

In recent years, there has been remarkable progress in leveraging Language Models (LMs), encompassing Pre-trained Language Models (PLMs) and Large-scale Language Models (LLMs), within the domain of mathematics. This paper conducts a comprehensive survey of mathematical LMs, systematically categorizing pivotal research endeavors from two distinct perspectives: tasks and methodologies. The landscape reveals a large number of proposed mathematical LLMs, which are further delineated into instruction learning, tool-based methods, fundamental CoT techniques, and advanced CoT methodologies. In addition, our survey entails the compilation of over 60 mathematical datasets, including training datasets, benchmark datasets, and augmented datasets. Addressing the primary challenges and delineating future trajectories within the field of mathematical LMs, this survey is positioned as a valuable resource, poised to facilitate and inspire future innovation among researchers invested in advancing this domain.

Paper link

https://arxiv.org/abs/2312.07622

Further reading

https://x.com/omarsar0/status/1735323577392542084

LLM360: Towards Fully Transparent Open-Source LLMs

Summary

Proposes LLM360 to support open and collaborative AI research by making the end-to-end LLM training process transparent and reproducible. The initiative releases two 7B-parameter LLMs pre-trained from scratch, Amber and CrystalCoder, together with their training code, data, intermediate checkpoints, and analyses.

Abstract

The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most LLMs have only released partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics. These choices hinder progress in the field by degrading transparency into the training of LLMs and forcing teams to rediscover many details in the training process. We present LLM360, an initiative to fully open-source LLMs, which advocates for all training code and data, model checkpoints, and intermediate results to be made available to the community. The goal of LLM360 is to support open and collaborative AI research by making the end-to-end LLM training process transparent and reproducible by everyone. As a first step of LLM360, we release two 7B parameter LLMs pre-trained from scratch, Amber and CrystalCoder, including their training code, data, intermediate checkpoints, and analyses (at https://www.llm360.ai). We are committed to continually pushing the boundaries of LLMs through this open-source effort. More large-scale and stronger models are underway and will be released in the future.

Paper link

https://arxiv.org/abs/2312.06550

Further reading

https://x.com/omarsar0/status/1734591071575744820

A Survey of Large Language Models in Medicine: Principles, Applications, and Challenges

Summary

A comprehensive survey (analyzing 300+ papers) of LLMs in medicine, including an overview of the principles, applications, and challenges faced by LLMs in medicine.

Abstract

Large language models (LLMs), such as ChatGPT, have received substantial attention due to their impressive human language understanding and generation capabilities. Therefore, the application of LLMs in medicine to assist physicians and patient care emerges as a promising research direction in both artificial intelligence and clinical medicine. To reflect this trend, this survey provides a comprehensive overview of the principles, applications, and challenges faced by LLMs in medicine. Specifically, we aim to address the following questions: 1) How can medical LLMs be built? 2) What are the downstream performances of medical LLMs? 3) How can medical LLMs be utilized in real-world clinical practice? 4) What challenges arise from the use of medical LLMs? and 5) How can we better construct and utilize medical LLMs? As a result, this survey aims to provide insights into the opportunities and challenges of LLMs in medicine and serve as a valuable resource for constructing practical and effective medical LLMs. A regularly updated list of practical guides on medical LLMs can be found at https://github.com/AI-in-Health/MedLLMsPracticalGuide.

Paper link

https://arxiv.org/abs/2311.05112

Further reading

https://x.com/omarsar0/status/1734599425568231513

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Summary

Proposes an approach for self-training with feedback that can substantially reduce dependence on human-generated data. Model-generated data combined with a reward function improves the performance of LLMs on problem-solving tasks.

Abstract

Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times. Testing on advanced MATH reasoning and APPS coding benchmarks using PaLM-2 models, we find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data. Overall, our findings suggest self-training with feedback can substantially reduce dependence on human-generated data.

Paper link

https://arxiv.org/abs/2312.06585

Further reading

https://x.com/omarsar0/status/1734953578274386002
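
The three steps listed in the abstract (sample and filter with binary feedback, fine-tune on the kept samples, repeat) can be mimicked with a deliberately tiny stand-in. Everything concrete here is an assumption made for illustration: the "model" is just a probability table over answer offsets for addition problems, and "fine-tuning" is a blend of the old table with the empirical distribution of the filtered samples. It is not the paper's training code.

```python
import random

OFFSETS = (-1, 0, 1)  # the toy model answers a + b + offset; 0 is correct


def sample_answer(weights, a, b, rng):
    """Draw one answer from the toy model for the problem a + b."""
    offset = rng.choices(OFFSETS, weights=[weights[o] for o in OFFSETS])[0]
    return a + b + offset, offset


def rest_em_sketch(problems, iterations=3, samples_per_problem=8, seed=0):
    """Return the model's probability of answering correctly per iteration."""
    rng = random.Random(seed)
    weights = {o: 1 / 3 for o in OFFSETS}
    history = [weights[0]]
    for _ in range(iterations):
        # Step 1: generate samples and keep those passing binary feedback.
        kept = []
        for a, b in problems:
            for _ in range(samples_per_problem):
                answer, offset = sample_answer(weights, a, b, rng)
                if answer == a + b:
                    kept.append(offset)
        if not kept:
            break  # nothing passed the filter, so there is nothing to train on
        # Step 2: "fine-tune" by moving toward the filtered distribution.
        empirical = {o: kept.count(o) / len(kept) for o in OFFSETS}
        weights = {o: 0.5 * weights[o] + 0.5 * empirical[o] for o in OFFSETS}
        history.append(weights[0])
        # Step 3: the loop repeats with the updated model.
    return history


if __name__ == "__main__":
    print(rest_em_sketch([(1, 2), (3, 4), (5, 6), (7, 8)]))
```

Because only verified-correct samples survive the filter, each update shifts probability mass toward correct behavior without using any human-written solutions, which is the point the abstract makes.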

Gaussian-SLAM

Summary

A neural RGBD SLAM method capable of photorealistically reconstructing real-world scenes without compromising speed and efficiency. It extends classical 3D Gaussians for scene representation to overcome the limitations of previous methods.

Paper link

https://vladimiryugay.github.io/gaussian_slam/

Further reading

https://x.com/vlyug/status/1734683948440252480

Pearl: A Production-ready Reinforcement Learning Agent

Summary

Introduces a new production-ready RL agent software package that enables researchers and practitioners to develop RL agents that adapt to environments with limited observability, sparse feedback, and high stochasticity.

Abstract

Reinforcement Learning (RL) offers a versatile framework for achieving long-term goals. Its generality allows us to formalize a wide range of problems that real-world intelligent systems encounter, such as dealing with delayed rewards, handling partial observability, addressing the exploration and exploitation dilemma, utilizing offline data to improve online performance, and ensuring safety constraints are met. Despite considerable progress made by the RL research community in addressing these issues, existing open-source RL libraries tend to focus on a narrow portion of the RL solution pipeline, leaving other aspects largely unattended. This paper introduces Pearl, a Production-ready RL agent software package explicitly designed to embrace these challenges in a modular fashion. In addition to presenting preliminary benchmark results, this paper highlights Pearl's industry adoptions to demonstrate its readiness for production usage. Pearl is open sourced on Github at github.com/facebookresearch/pearl and its official website is located at pearlagent.github.io.

Paper link

https://arxiv.org/abs/2312.03814

Further reading

https://x.com/ZheqingZhu/status/1732880717263352149

Quip

Summary

Compresses trained model weights into a lower-precision format to reduce memory requirements. The approach combines lattice codebooks with incoherence processing to create 2-bit quantized models, and significantly closes the gap between 2-bit quantized LLMs and unquantized 16-bit models.
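
To give a feel for the memory savings involved, the back-of-the-envelope calculation below compares weight storage at 16 and 2 bits per weight. The 7-billion-parameter model size is a hypothetical example, and the estimate counts weights only: codebooks, activations, and any other overhead are ignored.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    params = 7_000_000_000  # hypothetical model size, for illustration only
    for bits in (16, 2):
        print(f"{bits}-bit: {weight_memory_gib(params, bits):.2f} GiB")
```

Going from 16 to 2 bits shrinks weight storage by a factor of eight, which is why closing the accuracy gap at that precision matters for running large models on limited hardware.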
