Research Papers

Core Architecture Papers

➙ "Attention Is All You Need" (Google Brain, Vaswani et al., 2017)

➙  "BERT: Pre-training of Deep Bidirectional Transformers" (Google AI Language, Devlin et al., 2018)

➙ "Language Models are Few-Shot Learners" (OpenAI, Brown et al., 2020)

➙ "LLaMA: Open and Efficient Foundation Language Models" (AI at Meta, Touvron et al., 2023)

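The core operation behind all four papers above is the scaled dot-product attention defined in "Attention Is All You Need": softmax(Q K^T / sqrt(d_k)) V. A minimal NumPy sketch follows; this is a single head with no masking and no learned projections, and all variable names are illustrative.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    # Similarity scores, scaled by sqrt(d_k) so softmax gradients stay stable.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax, subtracting the max for numerical stability.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 4 tokens with 8-dimensional keys and values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)  # shape (4, 8)
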
Scaling & Efficiency Papers

➙ "Scaling Laws for Neural Language Models" (OpenAI Kaplan et al., 2020)

➙ "Training Compute-Optimal Large Language Models" (Google DeepMind Hoffmann et al., 2022)

➙ "Flash Attention: Fast and Memory-Efficient Exact Attention" (Stanford University Department of Computer Science, Dao et al., 2022)

➙ "Mixture of Experts with Expert Choice" (Google, Zhou et al., 2022)

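The compute-optimal recipe from Hoffmann et al. reduces to simple arithmetic under two widely quoted approximations: training FLOPs C ≈ 6·N·D, and roughly 20 training tokens per parameter (a rule of thumb, not the paper's exact fitted law). A back-of-the-envelope sketch:

import math

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Rough compute-optimal split of a FLOP budget (Hoffmann et al., 2022).

    Substituting D = tokens_per_param * N into C = 6 * N * D gives
    C = 6 * tokens_per_param * N**2, so N = sqrt(C / (6 * tokens_per_param)).
    """
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# 5.88e23 FLOPs is just 6*N*D for Chinchilla's published 70B params
# and 1.4T tokens, so the sketch recovers that shape:
n, d = chinchilla_optimal(5.88e23)
print(f"params ~ {n:.1e}, tokens ~ {d:.1e}")  # ~7.0e10 params, ~1.4e12 tokens
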
➙ "Constitutional AI: Harmlessness from AI Feedback" (Anthropic, Bai et al., 2022)

➙ "Training Language Models to Follow Instructions" (OpenAI, Ouyang et al):

➙ "Learning to Summarize from Human Feedback" (OpenAI, Stiennon et al., 2020)

➙ "LoRA: Low-Rank Adaptation of Large Language Models" (Microsoft, Carnegie Mellon University, Hu et al., 2021)

➙ "Spectrum: Targeted Training on Signal to Noise Ratio" (Arcee.ai, Hartford et al., 2024):

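Among the fine-tuning papers above, LoRA's central trick is easy to state in code: freeze the pretrained weight W and learn a low-rank update (alpha / r) * B @ A instead. A minimal NumPy sketch, with shapes and initialization following the paper's convention (A Gaussian, B zero); all variable names are illustrative.

import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass of a LoRA-adapted linear layer (Hu et al., 2021).

    W is the frozen (d_out, d_in) pretrained weight; A is (r, d_in) and
    B is (d_out, r) with rank r << min(d_in, d_out). Only A and B train.
    """
    r = A.shape[0]
    # Materialized here for clarity; real implementations keep the factors
    # separate and compute x @ A.T @ B.T to avoid forming the full matrix.
    delta_W = (alpha / r) * (B @ A)
    return x @ (W + delta_W).T

# Toy shapes: d_in = d_out = 64, rank r = 4.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))          # frozen pretrained weight
A = rng.normal(size=(4, 64)) * 0.01    # trainable, small Gaussian init
B = np.zeros((64, 4))                  # trainable, zero init: update starts at 0
x = rng.normal(size=(2, 64))
out = lora_forward(x, W, A, B)         # shape (2, 64)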