About me

I’m a second-year master’s student majoring in Computer Science at Columbia University. I’m a member of the Digital Video and Multimedia (DVMM) Lab and the Robotic Manipulation and Mobility (ROAM) Lab at Columbia University, where I’m fortunate to be advised by Prof. Shih-Fu Chang and Guangxing Han on few-shot learning with Transformers, and by Prof. Matei Ciocarlie, Prof. Shuran Song, and Jingxi Xu on tactile exploration for 3D objects. I also have the great pleasure of working with Prof. Krzysztof Choromanski on several projects related to kernelized attention in Transformers and Graph Neural Networks.

My research spans several fields of machine learning, including representation learning, reinforcement learning, the design of deep learning architectures (e.g., Transformers, GNNs), and Monte Carlo methods. Despite this diversity, I’m chiefly drawn to theory-grounded algorithms with applications in computer vision and robotics. Specifically, my research aims to make algorithms more efficient [1, 5, 7] and scalable [4, 6], and to design simple but effective learning algorithms [2] as better alternatives to traditional heuristics [3].

If you are interested in my research and would like to collaborate, please feel free to contact me via email! :)

I’m applying for Fall 2023 CS Ph.D. programs and looking for Spring & Summer 2023 research assistant positions. Feel free to reach out!


1. (Preprint 2023) Efficient Graph Field Integrators Meet Point Clouds

Krzysztof Choromanski*, Arijit Sehanobish*, Han Lin*, Yunfan Zhao*, Eli Berger, Alvin Pan, Tetiana Parshakova, Tianyi Zhang, David Watkins, Valerii Likhosherstov, Somnath Basu Roy Chowdhury, Avinava Dubey, Deepali Jain, Tamas Sarlos, Snigdha Chaturvedi, Adrian Weller

Highlight: We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.

2. (Preprint 2023) Supervised Masked Knowledge Distillation for Few-shot Transformers

Han Lin*, Guangxing Han*, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang
[Paper coming soon][Code][Slides]

Highlight: We propose a novel framework for few-shot Transformers which incorporates label information into self-distillation. In contrast to previous self-supervised methods, our framework allows intra-class knowledge distillation on both class and patch tokens, and introduces the challenging task of masked patch token reconstruction across intra-class images.

3. (ICRA 2023) Active Tactile Exploration for 3D Object Recognition

Jingxi Xu*, Han Lin*, Shuran Song, Matei Ciocarlie

Highlight: We propose TANDEM3D, a co-training framework for exploration and decision making, applied to 3D object recognition with tactile signals. TANDEM3D is based on a novel encoder that builds a 3D object representation from contact positions and normals using PointNet++, and enables 6DOF movement.

4. (ICML 2022) From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

Krzysztof Choromanski*, Han Lin*, Haoxian Chen*, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten

Highlight: We leverage mathematical techniques ranging from spectral analysis to dynamic programming and random walks, and propose a comprehensive approach for incorporating various masking mechanisms into Transformer architectures in a scalable way, including efficient d-dimensional RPE-masking and graph-kernel masking.

5. (ICLR 2022) Hybrid Random Features

Krzysztof Choromanski*, Han Lin*, Haoxian Chen*, Yuanzhe Ma*, Arijit Sehanobish*, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

Highlight: We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs), equipped with strong theoretical guarantees: unbiased approximation and strictly smaller worst-case relative errors than their counterparts.

6. (Preprint 2021) Graph Kernel Attention Transformers

Krzysztof Choromanski*, Han Lin*, Haoxian Chen*, Jack Parker-Holder

Highlight: We introduce a new class of graph neural networks, called GKAT, by combining several concepts that were so far studied independently: graph kernels, attention-based networks with structural priors, and, more recently, efficient Transformer architectures that apply small-memory-footprint implicit attention methods via low-rank decomposition techniques.

7. (NeurIPS 2020) Demystifying Orthogonal Monte Carlo and Beyond

Han Lin*, Haoxian Chen*, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

Highlight: In this paper we shed new light on the theoretical principles behind Orthogonal Monte Carlo (OMC), applying the theory of negatively dependent random variables to obtain several new concentration results. We also propose a novel extension of the method leveraging number theory techniques and particle algorithms, called Near-Orthogonal Monte Carlo (NOMC).

* Co-First Authors, Equal Contribution.
SlidesLive video recording and conference poster presenter for [5, 7].
GitHub code maintainer for [2, 4, 5, 7], contributor for [3].

Teaching Assistants

Academic Services

  • Conference Reviewer: ICML 2022, 2023; NeurIPS 2022
  • Conference Volunteer: RSS 2022