
Enabling Performant and Flexible Model-Internal Observability for LLM Inference

A performant, flexible deep model inspector that turns internal LLM observability into a first-class systems primitive, with only 0.4%–6.8% offline / ~6% online overhead, 2×–15× lower …

Nengneng Yu

Reliable and Resilient Collective Communication Library for LLM Training and Serving

A fault-tolerant, NCCL-compatible collective communication library that keeps LLM training and serving alive under NIC/link failures with <1.1% training / <3% inference overhead.

Wei Wang

TabSyM: A Generative Pipeline for Small Multi-Cohort Omics Tabular Data

Diffusion-based generative pipeline for small, high-dimensional, cross-cohort omics tabular data.

Nengneng Yu