Unsupervised Learning


Graph showing the difference in test error when keeping hard versus easy examples
Unsupervised Learning

Unsupervised Data Pruning: New method removes useless machine learning data.

Large datasets often contain overly similar examples that consume training cycles without contributing to learning. A new paper shows how to identify such redundant examples, even when they're not labeled.
High-level overview of the STEGO architecture at training and prediction steps
Unsupervised Learning

Segmented Images, No Labeled Data: Improved unsupervised learning for semantic segmentation

Training a model to separate the objects in a picture typically requires labeled images for best results. Recent work upped the ante for training without labels.
Flowcharts show how a new contrastive learning approach uses metadata to improve AI image classifiers
Unsupervised Learning

Learning From Metadata: Descriptive Text Improves Performance for AI Image Classification Systems

Images in the wild may not come with labels, but they often include metadata. A new training method takes advantage of this information to improve contrastive learning.
Everlaw's clustering feature organizing thousands of documents
Unsupervised Learning

Order in the Court: Machine Learning Tool from Everlaw Finds Legal Evidence

Machine learning is helping lawyers sift through mountains of documents to find evidence. The legal technology company Everlaw launched a clustering feature that automatically organizes up to 25 million documents for lawyers gathering evidence to be used during a trial.
Graph of parameters versus average accuracy across 14 NLP tasks
Unsupervised Learning

GPT-Free: Meta Releases OPT, a Family of Open Source Large Language Models

Itching to get your hands on a fully trained large language model? The wait is over. Meta introduced the OPT family of transformer-based language models with nearly unfettered access to source code and trained weights.
Shifted Patch Tokenization (SPT) | Locality Self-Attention (LSA)
Unsupervised Learning

Less Data for Vision Transformers: Boosting Vision Transformer Performance with Less Data

Vision Transformer (ViT) outperformed convolutional neural networks in image classification, but it required more training data. New work enabled ViT and its variants to outperform other architectures with less training data.
Multimodal deep learning model
Unsupervised Learning

AI Versus the Garbage Heap: How Amazon uses AI to cut waste.

Amazon reported long-term success using machine learning to shrink its environmental footprint. The online retailer developed a system that fuses product descriptions, images, and structured data to decide how an item should be packed for shipping.
A conversation between a human and an open-domain chatbot.
Unsupervised Learning

Long-Haul Chatbot: Facebook Chatbot Carries On Long Conversations

Facebook released a chatbot that summarizes dialog on the fly and uses the summary to generate further repartee.
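For the gist of that mechanism, here is a minimal illustrative sketch of a summarize-then-respond loop. The summarize and generate_reply functions are hypothetical placeholders, not Facebook's actual components; the point is that each new reply is conditioned on a running summary rather than the full transcript.

# Illustrative sketch only: a chatbot loop that conditions each reply on a
# running summary of the conversation instead of the full transcript.
# summarize() and generate_reply() are hypothetical placeholders for learned models.

def summarize(history):
    # Placeholder: a real system would run an abstractive summarization model.
    return " | ".join(history[-4:])

def generate_reply(summary, user_message):
    # Placeholder: a real system would condition a language model on the summary.
    return f"[reply given summary: {summary}] You said: {user_message}"

history = []
for user_message in ["Hi there!", "What did we talk about so far?"]:
    summary = summarize(history)            # compress the dialog so far
    reply = generate_reply(summary, user_message)
    history.extend([user_message, reply])   # keep the raw turns for the next summary
    print(reply)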
Animated chart shows how AI can help robots locate key spatial coordinates.
Unsupervised Learning

Finding Useful Points in Space: Keypoint3D Helps Robots Locate Spatial Coordinates

A new machine learning method aims to improve a machine’s ability to identify and locate points of interest.
Series of examples of accurate and inaccurate matches between images and text
Unsupervised Learning

Crawl the Web, Absorb the Bias: NLP Models Absorb Biases from Web Training Data

The emerging generation of trillion-parameter models needs datasets of billions of examples, but the most readily available source of examples on that scale — the web — is polluted with bias and antisocial expressions. A new study examines the issue.
Animated image showing how a transformer architecture processes an image
Unsupervised Learning

Transformer Speed-Up Sped Up: How to Speed Up Image Transformers

The transformer architecture is notoriously inefficient when processing long sequences, a problem for images, which are essentially long sequences of pixels. One way around this is to break up input images and process the pieces separately, as in the sketch below.
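As a rough, hypothetical illustration of that idea (not the paper's method), the following sketch splits an image into non-overlapping patches so a transformer would see a few hundred patch tokens instead of tens of thousands of pixel positions:

# Illustrative sketch: turn an (H, W, C) image into a short sequence of patch tokens.
import numpy as np

def image_to_patch_tokens(image, patch_size=16):
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "dimensions must divide evenly"
    patches = image.reshape(h // patch_size, patch_size, w // patch_size, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)                # (H/p, W/p, p, p, C)
    return patches.reshape(-1, patch_size * patch_size * c)   # one flattened token per patch

tokens = image_to_patch_tokens(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768): 196 patch tokens instead of 50,176 pixel positions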
Series of images showing some of the findings of the new study by researchers at Stanford’s Institute for Human-Centered AI
Unsupervised Learning

Weak Foundations Make Weak Models: Foundation AI Models Pass Flaws to Fine-Tuned Variants

A new study examines a major strain of recent research: huge models pretrained on immense quantities of uncurated, unlabeled data and then fine-tuned on a smaller, curated corpus.
Information about a new unsupervised pretraining method called VICReg
Unsupervised Learning

More Reliable Pretraining: Pretraining Method Helps AI Learn Useful Representations

Pretraining methods generate basic representations for later fine-tuning, but they’re prone to collapse, in which the model maps all inputs to nearly identical representations. New work proposes a solution.
System designed to isolate changes in the pose of a two-dimensional figure
Unsupervised Learning

Motion Mapper: An AI system that automates animations for video game sprites

In some animated games, different characters can perform the same actions — say, walking, jumping, or casting spells. A new system learned from unlabeled data to transfer such motions from one character to another.
Data related to SElf-supERvised (SEER), an image classifier pretrained on unlabeled images
Unsupervised Learning

Pretraining on Uncurated Data: How unlabeled data improved computer vision accuracy.

It’s well established that pretraining a model on a large dataset improves performance on fine-tuned tasks. In sufficient quantity and paired with a big model, even data scraped from the internet at random can contribute to the performance boost.