Focal Transformer
This is the codebase for our recently released paper “Focal Self-attention for Local-Global Interactions in Vision Transformers”. It develops a new sparse self-attention mechanism, called focal self-attention, for more effective and efficient vision transformers. The…
Document AI (Intelligent Document Processing)
Document AI, or Document Intelligence, is a new research topic that refers to techniques for automatically reading, understanding, and analyzing business documents. Understanding business documents is an incredibly challenging task due…
Network Architecture Search for Face Enhancement
Reference-Based Defect Detection Network
Efficient Self-Supervised Vision Transformers (EsViT)
This is a research project exploring self-supervised learning (SSL) for computer vision. It aims to learn general-purpose image features from raw pixels without relying on manual supervision, and the learned networks serve as the…