P.S. The paper descriptions are based on my personal understanding. Some text was extracted from the abstracts and reviews.
Domain
LLM
ZhenghaoLin2024NeurIPS scores training tokens with a reference model and then trains the language model with a loss focused selectively on the higher-scoring tokens, arguing that "not all tokens are what you need"
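The selection mechanism can be sketched in a few lines (a sketch of the idea, not the authors' code; `selective_lm_loss` and the 50% keep ratio are illustrative choices of mine):

```python
import numpy as np

def selective_lm_loss(train_losses, ref_losses, keep_ratio=0.5):
    """Average the training loss only over the tokens whose excess loss
    (current-model loss minus reference-model loss) is highest."""
    excess = train_losses - ref_losses         # per-token usefulness score
    k = max(1, int(len(excess) * keep_ratio))  # how many tokens to keep
    keep = np.argsort(excess)[-k:]             # indices of top-k scoring tokens
    return train_losses[keep].mean()

# Toy example: tokens 1 and 3 have high excess loss, so only they count.
train = np.array([2.0, 5.0, 1.0, 4.0])
ref = np.array([1.9, 1.0, 0.9, 1.5])
loss = selective_lm_loss(train, ref)  # mean of 5.0 and 4.0 -> 4.5
```

With `keep_ratio=1.0` this degrades to the ordinary mean loss over all tokens.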
Quantization
HaokunLin2024NeurIPS utilizes rotation and permutation transformations to more effectively mitigate both massive and normal outliers when quantizing LLMs
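The rotation half of the trick can be illustrated with a toy example (generic rotation-based outlier smoothing, not the paper's specific transforms): an orthogonal rotation can be folded into the weights without changing the layer's output, while spreading a massive outlier across channels so the quantizer sees a smaller dynamic range.

```python
import numpy as np

def hadamard(n):
    """Orthonormal Hadamard matrix via Sylvester's construction (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

x = np.array([100.0, 1.0, 1.0, 1.0])  # toy activation with one massive outlier
R = hadamard(4)
x_rot = R @ x                          # outlier spread across all channels

# Since R is orthogonal, W @ x == (W @ R.T) @ (R @ x): rotating activations
# and folding R.T into the weights leaves the layer's output unchanged.
W = np.arange(8.0).reshape(2, 4)
y_plain, y_rot = W @ x, (W @ R.T) @ x_rot
```

Here the per-channel maximum drops from 100 to about 51 while the matmul result is bit-for-bit the same computation mathematically.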
VladimirMalinovskii2024NeurIPS proposes a quantization-aware strategy for fine-tuning LLMs after quantization, improving their performance especially in the extreme-compression regime
Alignment
JiamingJi2024NeurIPS proposes Aligner, a small model that learns the correctional residuals between preferred and dis-preferred answers, which can be used as a model-agnostic, plug-and-play alignment module with only one-off training
Evaluation
ZheHu2024NeurIPS evaluates vision language models' abilities to understand human humor with the YesBut dataset
RicardoDominguezOlmedo2024NeurIPS evaluates LLMs' answers to survey questions designed for humans and reveals a strong positional bias; once that bias is controlled for, the responses resemble answering uniformly at random rather than any human population. It is therefore dangerous to interpret an LLM's survey responses as if they came from a real human being
ArjunPanickssery2024NeurIPS evaluates the self-preference bias of LLMs acting as evaluators, an issue with wide implications for LLM benchmarking, reward modeling, and self-refinement
QiguangChen2024NeurIPS introduces the reasoning boundary (RB) as (a) a quantitative metric for assessing CoT capabilities and (b) an explanation of how certain strategies optimize CoT performance
ChengyiCai2024NeurIPS uses Bayesian-guided label mapping for visual reprogramming, replacing the simple one-to-one mapping between training and downstream labels
ShangziXue2024NeurIPS introduces a reasoning tree framework consisting of decompose-analyze-rethink steps; the decompose step builds sub-trees while the rethink step reflects on and updates the parent tree
ShaotengLiu2024NeurIPS breaks a task down into subtasks and dynamically decides whether to solve each subtask with code generated by the LLM or with a "traditional" RL agent
Computer vision
2D
JiaqingZhang2024NeurIPS proposes single-stage end-to-end training for multi-modal fusion detection
zhengruiXu2024NeurIPS uses a diffusion model as a feature extractor for discriminative tasks
MichaelLuo2024NeurIPS automatically selects and composes task-specific adapters for diffusion models based on a user-provided prompt
Generation
KeyuTian2024NeurIPS presents a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction", diverging from the standard raster-scan "next-token prediction"
The new paradigm makes the models exhibit two important properties of LLMs: scaling laws and zero-shot task generalization #📖
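The scale-by-scale schedule can be sketched as follows (a toy numpy illustration of the coarse-to-fine loop; the stub predictor stands in for the actual transformer, and all names here are mine):

```python
import numpy as np

def resize(img, size):
    """Nearest-neighbor resize of a square map to size x size."""
    idx = np.arange(size) * img.shape[0] // size
    return img[np.ix_(idx, idx)]

def next_scale_generation(scales, predict_residual):
    """Coarse-to-fine generation: at each scale the model predicts a whole
    residual map conditioned on the upsampled reconstruction so far,
    instead of predicting tokens one-by-one in raster order."""
    recon = np.zeros((scales[0], scales[0]))
    for s in scales:
        recon = resize(recon, s)                 # condition on all coarser scales
        recon = recon + predict_residual(recon)  # predict the s x s map at once
    return recon

# Stub "model": nudge the reconstruction halfway toward a fixed target image.
target = np.add.outer(np.arange(8.0), np.arange(8.0))
stub = lambda recon: 0.5 * (resize(target, recon.shape[0]) - recon)
out = next_scale_generation([1, 2, 4, 8], stub)
```

Each pass refines the previous scales' output, so the reconstruction error shrinks as the resolution grows; the real model replaces the stub with a transformer over multi-scale token maps.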
TianhongLi2024NeurIPS+ improves unconditional image generation by using a latent representation to condition the image generation process
Video
Generation
SichengXu2024NeurIPS learns a disentangled face latent space for facial dynamics and head motion, which is then used for real-time audio-to-facial-video generation
3D
Reconstruction
RuiqiGao2024NeurIPS uses multi-view diffusion model to generate novel views for 3D reconstruction
Segmentation
ChangliWu2024NeurIPS uses spatial information to enhance 3D referring expression segmentation
Spatial-temporal
JunhaoCai2024NeurIPS uses Gaussians for simulation & physical property estimation
ZhongchaoYi2024NeurIPS proposes cooperative multi-dimensional and multi-task learning for urban intelligence
Generation
MinghuaLiu2024NeurIPS generates meshes with 3D sparse voxels as the representation, instead of a triplane #📖 P.S. Trained with 8xH100 for 1 week
Recommendation
ShenLi2024NeurIPS proposes using response time as a cue for learning human preferences. Specifically, it combines choices and response times to estimate human utility functions, grounded in the EZ diffusion model from psychology. P.S. It claims that combining such extra info accelerates the preference learning process. Hum... Claiming extra info boosts performance sounds trivial, yet claiming it accelerates learning sounds brilliant! Clever one.
AI4Science
Physics & chemistry
GangLiu2024NeurIPS proposes Graph DiT for conditioned molecular design, with a condition encoder to learn the representation of numerical and categorical properties and a Transformer-based graph denoiser to achieve molecular graph denoising
NicholasGao2024NeurIPS designs over-parametrized, fully learnable neural wave functions, facilitating the use of learnable generalized wave functions for simulating the ground state of many-electron systems
YuliaRubanova2024NeurIPS uses learned signed-distance functions (SDFs) to represent the object shapes and to speed up distance computation for GNN-based rigid simulation. It's the first GNN-based simulator that scales to scenes with hundreds of objects and up to 1.1 million nodes
Neuroscience
SpencerRooke2024NeurIPS finds that (i) the number of contexts storable by the hippocampus grows exponentially with the number of place cells and (ii) there is a trade-off between high-resolution encoding of position and the number of storable contexts
ZixuanGong2024NeurIPS proposes NeuroClips for fMRI-to-video decoding. It first reconstructs video keyframes from a high-level semantic flow, then injects both the keyframes and low-level perception flows into a pre-trained T2V diffusion model for video reconstruction
Healthcare
YubinKim2024NeurIPS introduces a multi-agent framework that enforces a collaboration structure on a team of LLMs for medical decision-making
AI4Math
PDE
ZekunShi2024NeurIPS uses a stochastic Taylor derivative estimator for efficient amortization of differential operators, boosting the speed of evaluating high-order differential operators in large-scale problems, e.g. solving PDEs and running Physics-Informed Neural Networks (PINNs)
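The core machinery can be illustrated with a tiny forward-mode "jet" (truncated Taylor series): one forward pass yields the second directional derivative v^T H v, and averaging over random Rademacher directions gives a Hutchinson-style estimate of the Laplacian. This is a toy illustration of the Taylor-mode idea the estimator builds on, not the authors' implementation.

```python
import numpy as np

class Jet:
    """Degree-2 truncated Taylor series a0 + a1*t + a2*t^2.
    Pushing jets through a computation is forward-mode Taylor AD:
    one pass gives f(x + t*v) up to second order, so 2*a2 = v^T H v."""
    def __init__(self, a0, a1=0.0, a2=0.0):
        self.a = (a0, a1, a2)
    def __add__(self, o):
        o = o if isinstance(o, Jet) else Jet(o)
        return Jet(*(p + q for p, q in zip(self.a, o.a)))
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Jet) else Jet(o)
        (a0, a1, a2), (b0, b1, b2) = self.a, o.a
        # Cauchy product, truncated at degree 2.
        return Jet(a0 * b0, a0 * b1 + a1 * b0, a0 * b2 + a1 * b1 + a2 * b0)
    __rmul__ = __mul__

def vHv(f, x, v):
    """Second directional derivative v^T H v of f at x in ONE forward pass."""
    jets = [Jet(xi, vi) for xi, vi in zip(x, v)]
    return 2.0 * f(jets).a[2]

def laplacian_estimate(f, x, n_samples=64, seed=0):
    """Hutchinson-style estimate: E_v[v^T H v] = tr(H) for Rademacher v."""
    rng = np.random.default_rng(seed)
    d = len(x)
    return np.mean([vHv(f, x, rng.choice([-1.0, 1.0], d)) for _ in range(n_samples)])

f = lambda xs: sum(xi * xi for xi in xs)  # u(x) = ||x||^2, true Laplacian = 2d
est = laplacian_estimate(f, np.ones(3))
```

For this quadratic the estimator is exact (v^T(2I)v = 2d for every Rademacher v); the point is that no backward pass or nested autodiff is needed.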
Causal inference
FengXie2024NeurIPS theoretically investigates the identification of the bi-directional MR from observational data and develops a cluster fusion-like method for causal inference
SiyuanGuo2024NeurIPS develops a causal inference framework for exchangeable generative processes, which naturally arise in multi-environment data, extending current work from i.i.d. (independent and identically distributed) to non-i.i.d. settings
Diffusion
TianweiYin2024NeurIPS enhances one-step generation by improving the distillation scheme with a two-time-scale update rule and a GAN loss (rather than images sampled from the teacher model)
TeroKarras2024NeurIPS guides the model with a smaller, less-trained version of the model itself, leading to improved variation compared with guiding with an unconditional model
SangwoongYoon2024NeurIPS uses inverse reinforcement learning (IRL) to improve the sample quality of diffusion generative model
Transformer
TianyuHe2024NeurIPS investigates transformer models' out-of-distribution in-context learning abilities with a set of constructed arithmetic tasks
YuhongChou2024NeurIPS unifies existing linear complexity attention and proposes Meta Linear Attention (MetaLA), to replace the conventional softmax attention
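The family being unified can be illustrated by generic causal linear attention (a generic sketch, not MetaLA's specific design): replacing softmax(QK^T) with a feature map phi lets the key-value statistics be carried as a fixed-size running state, so decoding costs O(1) per token; the recurrent form below is exactly equivalent to the quadratic form.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Causal linear attention: attention weight between q_i and k_j is
    phi(q_i) . phi(k_j), so sum_j phi(k_j) v_j^T can be kept as a running
    state S instead of re-reading the whole prefix at every step."""
    d_k, d_v = K.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))   # running sum of phi(k_j) v_j^T
    z = np.zeros(d_k)          # running sum of phi(k_j), for normalization
    out = []
    for q, k, v in zip(Q, K, V):
        S += np.outer(phi(k), v)
        z += phi(k)
        out.append(phi(q) @ S / (phi(q) @ z))
    return np.array(out)

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((5, 4)) for _ in range(3))
Y = linear_attention(Q, K, V)  # shape (5, 4), computed in O(n) total
```

The `phi` here is an arbitrary positive feature map of my choosing; different linear-attention variants differ mainly in how they parametrize this map and the state update, which is exactly the design space such unification work analyzes.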
YutaoSun2024NeurIPS introduces a decoder-decoder architecture, YOCO, which only caches key-value pairs once, reducing memory demands and prefill latency
GNN
DongxiaoHe2024NeurIPS reveals representation scattering as the common mechanism behind various contrastive representation learning methods for GNNs and proposes the Scattering Graph Representation Learning (SGRL) framework
RaffaelePaolino2024NeurIPS introduces a new hierarchy of graph isomorphism tests, alternative to the standard k-WL hierarchy
IoannisKalogeropoulos2024NeurIPS proposes a new GNN-based meta-network design that includes scaling symmetries, instead of only the permutation symmetries that have been investigated before
Theory
ArthurdaCunha2024NeurIPS closes the gap to Boosting's theoretical lower bound on the trade-off between p (the number of training rounds) and t (the total parallel work per round)
JinZhang2024NeurIPS establishes Rademacher-complexity generalization upper bounds for various tree-based retrievers
XinChen2024NeurIPS achieves optimal clustering in Gaussian Mixture Models with anisotropic covariance structures
Supervision
Reinforcement learning
XiongHuiChen2024NeurIPS introduces policy learning from tutorial books and verifies the idea by outperforming a GPT-based agent without using real data for training
JaydenTeoh2024NeurIPS proposes coverage-based novelty evaluation for unsupervised environment design (UED)
Federated learning
YongzheJia2024NeurIPS introduces a federated learning (FL) framework to address system heterogeneity and domain shifts in edge-computing environments. It employs a Model Fusion Pruning (MFP) module to generate personalized compact local models and a Domain Adaptive Regularization (DAR) module to enhance performance across multiple domains
Optimization algorithm
AaronDefazio2024NeurIPS introduces schedule-free AdamW, without additional hyper-parameters over standard optimizers with momentum. It's based on the authors' theory that unifies scheduling and iterate averaging
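A minimal sketch of the idea (my paraphrase, with plain SGD in place of AdamW; the interpolation constant and averaging weights follow my reading of the method, so treat the details as assumptions): gradients are evaluated at an interpolation y of the base iterate z and the running average x, and the averaging itself, not a schedule, does the annealing.

```python
import numpy as np

def schedule_free_sgd(grad, w0, lr=0.1, beta=0.9, steps=500):
    """Schedule-free training sketch: base iterate z, running average x,
    gradients evaluated at the interpolation y = (1-beta)*z + beta*x.
    No learning-rate schedule; the average x is returned as the solution."""
    z = np.asarray(w0, dtype=float)
    x = z.copy()
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x  # gradient evaluation point
        z = z - lr * grad(y)           # plain SGD step on the base iterate
        x = x + (z - x) / t            # equal-weight running average
    return x

# Minimize f(w) = ||w||^2 / 2 (gradient is w) from a fixed start.
w = schedule_free_sgd(lambda w: w, [5.0, -3.0])
```

Note the method has no warmup/decay knobs beyond `lr` and `beta`; the authors also released an optimizer package implementing the AdamW variant.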
RohanAlur2024NeurIPS keeps humans in the loop to incorporate side information on instances that are algorithmically indistinguishable
Do you have any ideas or comments? Please join the discussion on X👇