2025
SpatialDINO - 3D self-supervised vision transformer for LLSM
Research · Completed
Designed and trained a 3D self-supervised vision transformer for label-free segmentation and tracking of subcellular dynamics in lattice light-sheet microscopy (LLSM), pre-trained on 2.4 TB (180k volumes) across 24 A100s. On downstream evaluation, it outperformed a prior approach co-led by Nobel laureate Eric Betzig.
● What I shipped
- Adapted DINO-style self-distillation to 3D: student/teacher ViTs over LLSM volumes with native 3D iBOT block masking.
- Introduced KMeans content-aware 3D cropping, no-positional-encoding 3D ViTs (NoPE), and a 3D adaptation of SINDER for singular-defect repair.
- Built a streaming encoder with token-store + online softmax for full-volume inference at million-token sequence lengths.
- Beat the prior SOTA, including the Nobel-laureate-led approach, on downstream subcellular structure prediction.
- Released as a first-author bioRxiv preprint; engineering log at /writing/spatialdino-lessons.
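For readers unfamiliar with the online softmax mentioned in the streaming-encoder bullet, here is a minimal sketch of the general technique, not the SpatialDINO implementation. It assumes a single query vector and plain Python lists, and the names `naive_attention` and `streaming_attention` are hypothetical. The token store is not modelled; key/value chunks are simply passed in one at a time.

```python
import math


def naive_attention(q, keys, values):
    """Reference: softmax(q . k) weighted sum, with all keys in memory at once."""
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def streaming_attention(q, kv_chunks):
    """Same result, consuming (keys, values) chunks one at a time.

    Only a running max, a running normaliser, and a running un-normalised
    output are kept, so memory does not grow with sequence length.
    """
    running_max = -math.inf
    running_sum = 0.0
    acc = None
    for keys, values in kv_chunks:
        if not keys:
            continue
        scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
        new_max = max(running_max, max(scores))
        # Rescale the state accumulated so far to the new max.
        # On the first chunk this is exp(-inf) == 0.0, which is harmless
        # because the state is still all zeros.
        scale = math.exp(running_max - new_max)
        if acc is None:
            acc = [0.0] * len(values[0])
        running_sum *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, values):
            w = math.exp(s - new_max)
            running_sum += w
            for i, x in enumerate(v):
                acc[i] += w * x
        running_max = new_max
    if acc is None:
        raise ValueError("streaming_attention received no keys")
    return [a / running_sum for a in acc]


# Usage: split ten toy tokens into chunks of four and compare both paths.
q = [0.5, -1.0, 2.0]
keys = [[0.1 * i, 0.2 * (i % 3), -0.05 * i] for i in range(10)]
values = [[float(i), float(i * i % 7)] for i in range(10)]
chunks = [(keys[i:i + 4], values[i:i + 4]) for i in range(0, 10, 4)]
full = naive_attention(q, keys, values)
streamed = streaming_attention(q, chunks)
```

The two results should agree to floating-point tolerance. The rescaling step is what lets the normaliser be updated chunk by chunk without revisiting earlier tokens.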