Home 🏠
Tag: Reinforcement-Learning
2 items with this tag.
Feb 21, 2025
From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control
Reinforcement-Learning
Feb 17, 2025
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
Reinforcement-Learning