More Thinking Solves Harder Problems
Machine Learning Research

In machine learning, an easy task and a more difficult version of the same task — say, a maze that covers a smaller or larger area — are often learned separately. A new study shows that recurrent neural networks can generalize from one to the …

2 min read
AI With a Sense of Style

The process known as image-to-image style transfer — mapping, say, the character of a painting’s brushstrokes onto a photo — can render inconsistent results. When they apply the styles of different artists to the same target …

3 min read
Team Players

Playing a team sport involves a fluid blend of individual and group skills. Researchers integrated both types of action into realistic humanoid agents that play football (known as soccer in the U.S.).

3 min read
Perceptrons Are All You Need

The paper that introduced the transformer famously declared, “Attention is all you need.” To the contrary, new work shows you may not need transformer-style attention at all. What’s new: Hanxiao Liu and colleagues at Google …

2 min read
Solar System

Astronomers may use deep learning to keep the sun in focus. What’s new: Researchers at the U.S. National Aeronautics and Space Administration (NASA), Catholic University of America, University of Oslo, …

2 min read
Ask Me in a Different Way

Pretrained language models like GPT-3 have shown notable proficiency in few-shot learning. Given a prompt that includes a few example questions and answers (the shots) plus an unanswered question (the task), such models can …

3 min read
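The few-shot setup the teaser describes is, mechanically, just string construction: worked question-answer pairs followed by the open question. A minimal sketch of that idea — the format and arithmetic examples are invented for illustration, not GPT-3's actual API or the paper's prompts:

```python
def build_prompt(shots, task):
    """Assemble a few-shot prompt: worked examples, then the unanswered task."""
    lines = [f"Q: {q}\nA: {a}" for q, a in shots]
    lines.append(f"Q: {task}\nA:")  # the model is expected to complete after the final "A:"
    return "\n\n".join(lines)

shots = [("What is 2 + 2?", "4"), ("What is 3 + 5?", "8")]
prompt = build_prompt(shots, "What is 7 + 6?")
print(prompt)
```

The "different way" of asking that the headline hints at amounts to varying this template — how the shots are phrased and ordered can change what the model returns.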
Weak Foundations Make Weak Models

A new study examines a major strain of recent research: huge models pretrained on immense quantities of uncurated, unlabeled data and then fine-tuned on a smaller, curated corpus. The sprawling 200-page document …

2 min read
More Reliable Pretraining

Pretraining methods generate basic representations for later fine-tuning, but they’re prone to certain issues that can throw them off-kilter. New work proposes a solution. What’s new: Researchers at …

2 min read
Solve RL With This One Weird Trick

The previous state-of-the-art model for playing vintage Atari games took advantage of a number of advances in reinforcement learning (RL). The new champion is a basic RL architecture plus a trick borrowed from image …

2 min read
Sharper Attention

Self-attention enables transformer networks to track relationships between distant tokens — such as text characters — in long sequences, but the computational resources required grow quadratically with input size. New …

2 min read
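The quadratic cost the teaser mentions is easy to see in code: the attention score matrix for a sequence of n tokens has n × n entries. A minimal NumPy sketch of scaled dot-product self-attention (identity projections and arbitrary dimensions, chosen for brevity rather than taken from any particular paper):

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a sequence x of shape (n, d).

    The score matrix q @ k.T has shape (n, n), so memory and compute
    grow quadratically with sequence length n.
    """
    n, d = x.shape
    q, k, v = x, x, x                               # identity projections for simplicity
    scores = q @ k.T / np.sqrt(d)                   # (n, n) -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                              # (n, d)

x = np.random.randn(512, 64)
out = self_attention(x)
print(out.shape)  # same shape as the input, but the intermediate scores were 512 x 512
```

Doubling the sequence length from 512 to 1,024 quadruples the score matrix, which is why efficient-attention research targets that intermediate product.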
Smaller Models, Bigger Biases

Compression methods like parameter pruning and quantization can shrink neural networks for use in devices like smartphones with little impact on accuracy — but they also exacerbate a network’s bias. Do compressed models …

3 min read
GANs for Smaller Data

Trained on a small dataset, generative adversarial networks (GANs) tend to generate either replicas of the training data or noisy output. A new method spurs them to produce satisfying variations.

2 min read
Upgrade for ReLU

The activation function known as ReLU builds complex nonlinear functions across layers of a neural network, making functions that outline flat faces and sharp edges. But how much of the world breaks down into perfect polyhedra?

2 min read
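The "flat faces and sharp edges" the teaser alludes to follow from ReLU making every network output piecewise linear: between the kinks where hidden units switch on or off, the function is exactly a line. A toy one-hidden-layer network with hand-picked weights (purely illustrative, not from the paper) shows this:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# A tiny 1-input, 2-hidden-unit, 1-output ReLU network with fixed weights.
w1, b1 = np.array([1.0, -1.0]), np.array([0.0, 1.0])
w2, b2 = np.array([1.0, 2.0]), 0.5

def net(x):
    h = relu(x * w1 + b1)  # each hidden unit switches on at its own kink (x = 0 and x = 1)
    return h @ w2 + b2     # the output is piecewise linear in x, with at most 3 segments

# Between kinks the slope is constant; it only changes where a unit flips state.
for x in (-3.0, -1.0, 0.0, 0.5, 1.0, 2.0, 3.0):
    print(x, net(x))
```

With only two units there are just three linear pieces; deep networks compose many such kinks into the complex polyhedral surfaces the article questions.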
Revenge of the Perceptrons

Why use a complex model when a simple one will do? New work shows that the simplest multilayer neural network, with a small twist, can perform some tasks as well as today’s most sophisticated architectures.

2 min read
Transformers: Smarter Than You Think

The transformer architecture has shown an uncanny ability to model not only language but also images and proteins. New research found that it can apply what it learns from the first domain to the others.

2 min read
