arkashj.com

Personal website, canonical knowledge hub, and O-1 visa evidence portfolio for Arkash Jain: first author of SpatialDINO, Harvard ML researcher (Kirchhausen Lab), Head of Forward Deployed Engineering at Benmore Technologies, and author of four publications across cell biology, chemical physics, and self-supervised learning.

This README is intentionally long. It consolidates the website's contents — biography, credentials, publications, experience, knowledge, projects, internal tools, life changelog, stack, and external links — into a single navigable Markdown reference. Treat it as the canonical text-only mirror of arkashj.com.


Who is Arkash

Arkash Jain is a researcher and engineer working at the intersection of self-supervised computer vision, distributed systems, and forward-deployed AI consulting. He is currently Head of Forward Deployed Engineering at Benmore Technologies in Chicago, where he was the second technical hire and helped scale revenue 887% across 2025–2026 while serving as the lead engineer on engagements across SaaS, healthcare, NIL athletics, compliance (Vanta / SOC 2 / NIST / FedRAMP), and consumer verticals.

Before Benmore, he spent fifteen months as an ML researcher in the Kirchhausen Lab at Harvard Medical School / Boston Children's Hospital, where he designed SpatialDINO, the first 3D self-supervised vision transformer for cryo-electron tomography. SpatialDINO outperformed a prior approach led by a Nobel laureate on downstream subcellular structure prediction; he is first author of the paper, which is currently a BioRxiv preprint. Two follow-on papers from the same lab were published in the Journal of Cell Biology in 2025 and 2026.

He started college at Boston University in 2020 as a physics student. By the end of his freshman year he was a UROP Scholar (1 of 5 freshmen selected university-wide), working on ultrafast 2D infrared spectroscopy of supercritical fluids in Larry Ziegler's lab. That work culminated in a co-authored 2022 paper in the Journal of Chemical Physics. He then pivoted into computer science, completed an accelerated BA/MS dual-degree in Math + CS / CS in four years, graduated Magna Cum Laude, was named the Marvin Freedman Scholar (1 of 6 in the entire BU mathematics department), TA'd four classes (CS411, CS131, EK301, MA581), and authored a Master's thesis on dynamic checkpointing in Apache Flink under the BU distributed-systems group.

In parallel he interned twice at Battery Ventures (sourcing + diligence), once at Boston Children's Hospital (an ALS resource discovery web app built to Section 508 / WCAG 2.1 AA), and once at ZeroSync (production Rust on tokio + NATS JetStream + a Merkle-tree POC for tamper-evident sync). He also turned down VC offers from MassMutual Ventures and State Street to stay at Battery, and was admitted to UCL, NTU Singapore, NYU, and Dartmouth before choosing BU.

He writes weekly at arkash.substack.com on AI hardware, finance, distributed systems, geopolitics, and venture strategy. He is the co-host of the STU STREET podcast (long-form interviews with founders, athletes, and professors, originally on WTBU). He has 7 distributed-systems articles on Medium and active accounts on X / Twitter and LinkedIn.

He arrived in the United States from Chandigarh, India in September 2020. This website is the central evidence hub for his O-1 visa application — every page is structured to be Google-indexed, link-rich, and built around verifiable external citations.


Identity at a glance

| Field | Detail |
| --- | --- |
| Name | Arkash Jain |
| Title | Head of Forward Deployed Engineering, Benmore Technologies |
| Prior | ML Researcher, Kirchhausen Lab, Harvard Medical School / Boston Children's Hospital |
| Education | Harvard University (Postgraduate Research, Computer Vision & AI, 2024–2025) · Boston University (BA Math + CS, MS CS; 4-yr accelerated, 2020–2024, Magna Cum Laude) |
| Citizenship | Indian national (in the US since Sep 2020); O-1 visa applicant |
| Hometown | Chandigarh, India |
| Currently in | Boston / Chicago (remote-first) |
| Languages spoken | English, Hindi, Punjabi |
| Languages programmed | Python, TypeScript, Rust, Go, Java, OCaml, C, R, MATLAB, JavaScript, Bash |
| First paper | Nov 2022, J. Chem. Phys. (chemical physics) |
| First-author paper | 2025, SpatialDINO on BioRxiv (3D self-supervised ViT) |
| Open-source | PyTorch issue #144779 (RDZV Infiniband backend) |
| Currently writing | Weekly long-form essays at arkash.substack.com |

Life changelog

A chronological retelling of the path so far, in six named phases (Phase 0 through Phase 5). Headings link to the on-site deep-dive timeline entries where one exists; otherwise the supporting source is linked inline.

Phase 0 — Wanting to be a physicist (2018–2020, Chandigarh, India)

Two years of E&M, particle physics, organic and physical chemistry, and optics — the standard JEE Advanced curriculum for kids who wanted to do physics in India. Sat JEE Advanced (≈1M candidates), placed roughly All-India rank 8,000 (top percentile). Sat AP Calculus, AP Physics C: Mechanics, and AP Physics C: E&M. Was admitted to University College London, Nanyang Technological University Singapore, Boston University, NYU, and Dartmouth. Picked Boston University — the physics department and proximity to MIT and Harvard labs were the deciding factors.

Phase 1 — Physicist at BU (Sep 2020 – 2022)

Arrived in the United States in September 2020. Joined Larry Ziegler's ultrafast spectroscopy lab as a freshman under PhD student Matt Rotondaro. Aligned femtosecond ultrafast laser systems for 2D infrared spectroscopy. Prepared supercritical Xe and SF₆ fluid samples for near-critical-density studies. Wrote the auto-correlation analysis code for rotational and vibrational energy relaxation traces.

Selected as NSF UROP Scholar (1 of 5 freshmen across the entire university) for the project "Ultrafast Two Dimensional Infrared Spectroscopy of Supercritical Fluids: Energy Relaxation and Local Critical Slowing Effects." Co-authored the Nov 2022 Journal of Chemical Physics paper on N₂O dynamics in supercritical solvents — IBC breakdown and critical slowing near the critical point — and a companion paper in the Journal of Physical Chemistry Letters (ACS).

The lab also taught me that I liked the math and the code more than the optics bench. By the end of sophomore year I had pivoted to computer science.

Phase 2 — Venture capital (Dec 2021 – Aug 2022)

Sourcing intern at Battery Ventures under Dallin Bills, working alongside GP Michael Brown. Got fluent in the early-stage B2B SaaS investment vocabulary: Rule of 40, ARR growth vs. burn multiples, gross retention vs. logo churn vs. net dollar retention, the magic number, CAC payback. Sourced three deals to partner-meeting stage (including CarNow). Got VC offers from MassMutual Ventures (Feb 2022) and State Street (Summer 2022); turned both down to return to Battery for a diligence summer.

Diligence intern at Battery the following summer — embedded with a portfolio company on its EU expansion strategy: pricing, GTM motion, regulatory fit, competitive landscape across the European market.

Phase 3 — Engineer + researcher at BU (2022 – May 2024)

Admitted to BU's accelerated BA/MS in Computer Science (BA Math + CS / MS CS, four years instead of six). TA'd four classes for ~300 students each: CS411 (Software Engineering), CS131 (Discrete Mathematics), EK301 (Mechanics), MA581 (Probability). Co-instructed a 300-level Mechanics course as a sophomore while carrying an 18-credit load and a 20-hour work week.

Seven classes, each documented as a coursework deep-dive:

  • CS 350 — Distributed Systems: from-scratch Raft consensus in Go — leader election, log replication, snapshot install RPC. Plus a MapReduce coordinator + worker with plugin-loaded map/reduce functions.
  • CS 320 — Concepts of Programming Languages: hand-written lexer, parser, and stack-machine evaluator for a BNF language in OCaml.
  • DS 522 — Optimization: Adam / AMSGrad / RMSProp comparison; article evaluations of Reddi 2018 and Toulis 2016 (implicit SGD).
  • CS 561 — Data Mechanics / Cloud: built a generated-HTML mini-internet and ran PageRank locally vs. on GCP.
  • CS 630 — Advanced Algorithms: 331,776-instance enumeration of Gale-Shapley to study average-case behavior; reservoir sampling.
  • CS 611 — OOP & Design Patterns: Monsters & Heroes turn-based RPG + a Java Swing trading platform with singleton persistence.
  • MA 582 — Mathematical Statistics (graduate): rigorous inference — MLE, sufficient statistics, MGFs, asymptotics.
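
The Gale-Shapley item above is the most self-contained algorithm in the list, so here is a minimal sketch of proposer-side deferred acceptance. It is not the coursework code; the function name and the array-of-rankings representation are assumptions made for illustration.

```typescript
// Proposer-optimal Gale-Shapley (deferred acceptance).
// proposerPrefs[p] lists receivers in p's order of preference, best first;
// receiverPrefs[r] lists proposers likewise. Returns match[p] = receiver.
function galeShapley(proposerPrefs: number[][], receiverPrefs: number[][]): number[] {
  const n = proposerPrefs.length;
  // rank[r][p] = position of proposer p in receiver r's list (lower is better).
  const rank: number[][] = receiverPrefs.map((prefs) => {
    const row = new Array<number>(n).fill(0);
    prefs.forEach((p, i) => {
      row[p] = i;
    });
    return row;
  });
  const nextChoice = new Array<number>(n).fill(0); // next list index each proposer will try
  const heldBy = new Array<number>(n).fill(-1); // heldBy[r] = proposer that r currently holds
  const free: number[] = [];
  for (let p = n - 1; p >= 0; p--) free.push(p);

  while (free.length > 0) {
    const p = free.pop() as number;
    const r = proposerPrefs[p][nextChoice[p]];
    nextChoice[p] += 1;
    const current = heldBy[r];
    if (current === -1) {
      heldBy[r] = p; // r was unmatched and tentatively accepts
    } else if (rank[r][p] < rank[r][current]) {
      heldBy[r] = p; // r trades up and releases its previous proposer
      free.push(current);
    } else {
      free.push(p); // r rejects p, who will try the next receiver
    }
  }

  const match = new Array<number>(n).fill(-1);
  heldBy.forEach((p, r) => {
    match[p] = r;
  });
  return match;
}

const example = galeShapley(
  [[0, 1, 2], [0, 2, 1], [0, 1, 2]],
  [[1, 0, 2], [0, 1, 2], [0, 1, 2]],
);
console.log(example); // → [1, 0, 2]
```

For n = 4, each agent has 4! = 24 possible rankings and 24^4 = 331,776, so the instance count quoted above is consistent with enumerating every preference profile on one side of the market. That reading is an assumption, not something this README states.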

Distributed Systems Research / BU Master's thesis: dynamic checkpointing in Apache Flink. Static checkpoint intervals are a tax in idle periods and a stall during bursts — built a controller that adapted cadence from live backpressure signals. Instrumented the Flink JobManager to surface per-operator backpressure ratios as a control signal. Benchmarked RocksDB state backend against in-memory; quantified write-amplification tradeoffs. Validated on the NEXMARK streaming benchmark; measured tail-latency wins on bursty workloads.
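
A backpressure-driven cadence controller of the kind described above could look roughly like the sketch below. This is not the thesis implementation: the class name, the watermark thresholds, the doubling and halving rule, and the choice to lengthen the interval under load are all assumptions made to show the shape of the feedback loop.

```typescript
// Hypothetical configuration; none of these names come from Flink or the thesis.
interface ControllerConfig {
  minIntervalMs: number;
  maxIntervalMs: number;
  highWatermark: number; // smoothed backpressure above this: checkpoint less often
  lowWatermark: number; // smoothed backpressure below this: checkpoint more often
  smoothing: number; // EMA weight on the newest sample, in (0, 1]
}

class CheckpointIntervalController {
  private smoothed = 0;
  private intervalMs: number;
  private readonly cfg: ControllerConfig;

  constructor(initialIntervalMs: number, cfg: ControllerConfig) {
    this.intervalMs = initialIntervalMs;
    this.cfg = cfg;
  }

  // Feed one backpressure ratio in [0, 1]; returns the next checkpoint interval.
  update(backpressureRatio: number): number {
    const sample = Math.min(1, Math.max(0, backpressureRatio));
    // Exponential moving average so a single spike does not flip the cadence.
    this.smoothed = this.cfg.smoothing * sample + (1 - this.cfg.smoothing) * this.smoothed;
    if (this.smoothed > this.cfg.highWatermark) {
      this.intervalMs *= 2;
    } else if (this.smoothed < this.cfg.lowWatermark) {
      this.intervalMs /= 2;
    }
    // Clamp so the controller can never disable or flood checkpointing.
    this.intervalMs = Math.min(this.cfg.maxIntervalMs, Math.max(this.cfg.minIntervalMs, this.intervalMs));
    return this.intervalMs;
  }
}

const controller = new CheckpointIntervalController(10_000, {
  minIntervalMs: 1_000,
  maxIntervalMs: 60_000,
  highWatermark: 0.5,
  lowWatermark: 0.1,
  smoothing: 1,
});
console.log(controller.update(0.9)); // → 20000
```

The dead band between the two watermarks is a deliberate choice in this sketch: it keeps the interval steady under moderate load instead of oscillating on every sample.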

Internships during this phase:

  • Boston Children's Hospital (Spring 2023) — Built an internal ALS resource-discovery web app for clinicians and patient families. React frontend wired to a Strapi headless CMS so non-technical staff could update content without a deploy. Swagger-documented REST API. Section 508 / WCAG 2.1 AA compliance — keyboard-only nav, ARIA landmarks, focus-visible rings, sufficient contrast, skip links, screen-reader tested. Validated search/filter UX with real ALS clinicians.
  • ZeroSync (Summer 2023) — First production-grade Rust. Built an Excel-side marketplace and a server-side ingestion pipeline that converted unstructured data (CSVs, JSON dumps, free-form Excel) into structured records flowing through NATS JetStream. Spent the first two weeks deep in The Rust Book — the borrow checker forces you to internalize ownership, lifetimes, and Send/Sync before you can ship anything async. tokio + async-trait for concurrent I/O across hundreds of NATS subjects. Merkle-tree POC (repo) — SHA-256 + canonical JSON hashing + sorted pairwise concat — for tamper-evident sync of records across the pipeline. Excel side: JavaScript Office Add-in scaffolded with Yeoman (yo generator-office); generated TLS dev certificates with office-addin-dev-certs, trusted them in macOS Keychain, wired their paths into nats.conf so the add-in published over TLS.
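
The Merkle-tree idea in the ZeroSync entry (SHA-256, canonical JSON hashing, sorted pairwise concat) can be sketched as follows. The original POC is in Rust; this TypeScript version is an illustrative re-expression, not a port of that repository. It reads "sorted pairwise concat" as ordering the two child hashes before concatenating them, which is an assumption, as are all function names.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Canonical JSON: object keys are sorted recursively so that records with the
// same content always serialize to the same string.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  const keys = Object.keys(value).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k])).join(",") + "}";
}

function sha256Hex(data: string): string {
  return createHash("sha256").update(data, "utf8").digest("hex");
}

// Sorted pairwise concat: order the two child hashes before hashing the parent,
// so the parent does not depend on which child sits on the left.
function hashPair(a: string, b: string): string {
  return a <= b ? sha256Hex(a + b) : sha256Hex(b + a);
}

function merkleRoot(records: Json[]): string {
  if (records.length === 0) return sha256Hex("");
  let level = records.map((r) => sha256Hex(canonicalize(r)));
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // An unpaired last node is promoted unchanged (one of several conventions).
      next.push(i + 1 < level.length ? hashPair(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

// Key order does not change the root; changing a value does.
console.log(merkleRoot([{ id: 1, v: "a" }]) === merkleRoot([{ v: "a", id: 1 }])); // → true
console.log(merkleRoot([{ id: 1, v: "a" }]) === merkleRoot([{ id: 1, v: "b" }])); // → false
```

Two parties that compute the same root over their copies of a batch can conclude the batches match without exchanging the records themselves; a differing root signals that at least one record was altered.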

Graduated Magna Cum Laude from Boston University in May 2024 — BA in Math & CS, MS in CS — and was named the Marvin Freedman Scholar (1 of 6 mathematics undergraduates department-wide).

Phase 4 — Harvard Medical School / Kirchhausen Lab (May 2024 – Aug 2025)

Joined the Kirchhausen Lab at Harvard Medical School / Boston Children's Hospital, under Tom Kirchhausen (member, National Academy of Arts and Sciences). The lab images subcellular structures at near-atomic resolution via cryo-electron tomography; my job was to make the resulting volumes interpretable at scale.

Trained on multi-node DGX clusters: A100 / H100 GPUs, NVLink intra-node, Infiniband inter-node, RAID + custom NVMe storage tier. Used PyTorch FSDP with bf16 mixed precision and activation checkpointing to fit large 3D vision transformers. Diagnosed and reported a Rendezvous (RDZV) backend issue affecting Infiniband multi-node training — filed PyTorch issue #144779.

SpatialDINO: designed and trained the first 3D self-supervised vision transformer for subcellular structure prediction from cryo-electron tomograms. Adapted DINO-style self-supervised learning (self-distillation between student and teacher ViTs, with no negative pairs) to 3D volumetric tomograms. Pretrained on unannotated tomograms; fine-tuned on a small labeled set for vesicle / organelle classification. Beat the prior SOTA, including a Nobel-laureate-led approach, on downstream evaluation. Released as a first-author BioRxiv preprint. The lessons are written up at /knowledge/ai/spatialdino-lessons.
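
For readers unfamiliar with the student/teacher setup, the core arithmetic of DINO-style training can be shown on plain arrays. This is a toy sketch of the published DINO recipe, not SpatialDINO's code: the real model is PyTorch over 3D volumes, and the function names here are assumptions. The temperatures are the defaults from the original DINO paper; multi-crop augmentation and the running update of the center vector are omitted.

```typescript
// Log-softmax with temperature, computed with the log-sum-exp trick for stability.
function logSoftmax(logits: number[], temperature: number): number[] {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled);
  const logSumExp = max + Math.log(scaled.reduce((acc, x) => acc + Math.exp(x - max), 0));
  return scaled.map((x) => x - logSumExp);
}

// Cross-entropy between the teacher's centered, sharpened distribution and the
// student's distribution. In DINO no gradient flows through the teacher.
function dinoLoss(
  studentLogits: number[],
  teacherLogits: number[],
  center: number[],
  studentTemp = 0.1,
  teacherTemp = 0.04,
): number {
  const centered = teacherLogits.map((t, i) => t - center[i]);
  const teacherProbs = logSoftmax(centered, teacherTemp).map((lp) => Math.exp(lp));
  const studentLogProbs = logSoftmax(studentLogits, studentTemp);
  return -teacherProbs.reduce((acc, p, i) => acc + p * studentLogProbs[i], 0);
}

// Teacher weights follow the student as an exponential moving average.
function emaUpdate(teacher: number[], student: number[], momentum: number): number[] {
  return teacher.map((t, i) => momentum * t + (1 - momentum) * student[i]);
}

// The loss is small when the student agrees with the teacher and large when it does not.
console.log(dinoLoss([2, 0], [2, 0], [0, 0]) < dinoLoss([0, 2], [2, 0], [0, 0])); // → true
```

Centering and the low teacher temperature pull in opposite directions (one discourages a single dominant output, the other discourages a uniform one), which is how DINO avoids collapse without negative pairs.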

Two co-authored follow-on papers appeared in the Journal of Cell Biology (2025 and 2026): a volumetric reconstruction of mammalian ER exit sites at unprecedented resolution via FIB-SEM and learned segmentation; and a UNET architecture for semi-supervised segmentation. (JCB 225, e202504178)

Phase 5 — Benmore Technologies (Aug 2025 – present)

Joined Benmore Technologies as Employee #2 — Forward Deployed Strategist & Engineer. Embedded into client engineering teams, scoped systems end-to-end, and shipped production code from day one. Onboarded the first ten clients across SaaS, healthcare, NIL athletics, compliance (Vanta / SOC 2 / NIST / FedRAMP), and consumer verticals — including Patriot Safety Services (compliance for Chevron / NextTier, $5–10M / yr), Nobel Gas, and Sun Theory.

Cross-stack: Stripe, Django, Next.js, FastAPI, React Native, plus authoring Claude Code skill systems at scale (one of the first companies to use Claude Code commercially in production engagements). Authored the Benmore Foundry CLI — internal orchestration layer for SMB AI consulting engagements.

Promoted to Head of FDE in April 2026, leading the forward-deployed engineering practice across all client engagements. Revenue acceleration during this period: $150k total → $150k every 15 days — 887% growth in six months. Headcount 8 → 40. The full mechanism is written up at /writing/the-fde-feedback-loop.


Education

| Institution | Degree | Dates | Notes |
| --- | --- | --- | --- |
| Harvard University | Postgraduate Research, Computer Vision & AI | 2024–2025 | Kirchhausen Lab. SpatialDINO + 2 Journal of Cell Biology papers. PyTorch upstream contribution. |
| Boston University | BA, Mathematics + Computer Science · MS, Computer Science (4-yr accelerated dual-degree) | 2020–2024 | Magna Cum Laude. Marvin Freedman Scholar (1 of 6 in math dept). NSF UROP Scholar (1 of 5 freshmen university-wide). Dean's List every semester. Master's thesis: Dynamic Checkpointing in Apache Flink. |

Verifiable credentials (PDF) are linked under Verifiable credentials below.


Awards & honors

  • NSF UROP Scholar (2021): 1 of 5 freshmen selected university-wide at BU. Project: Ultrafast Two-Dimensional Infrared Spectroscopy of Supercritical Fluids: Energy Relaxation and Local Critical Slowing Effects. Sources: UROP Symposium 2021 brochure (PDF) · UROP 2020-2021 awarded students
  • Marvin Freedman Scholar (2024) — 1 of 6 mathematics undergraduates honored department-wide at BU.
  • Magna Cum Laude (2024) — Boston University.
  • Dean's List — every semester at BU.
  • JEE Advanced (2019) — ≈8,000 / ≈1M All-India rank.
  • Battery Ventures — admitted to two consecutive internships (sourcing 2021–22, diligence 2022); chosen over MassMutual Ventures and State Street offers.
  • University admissions (2020) — UCL, NTU Singapore, BU, NYU, Dartmouth (chose BU).
  • PyTorch open-source contribution — issue #144779 accepted into the upstream tracker for the RDZV / Infiniband backend.

Publications

Four formal publications, in reverse-chronological order. All have permanent DOI / preprint server URLs.

1. SpatialDINO: Self-Supervised Learning for 3D Vision Transformers — BioRxiv, 2025

First-author, Kirchhausen Lab, Harvard Medical School. The first 3D self-supervised vision transformer for subcellular structure prediction from cryo-electron tomograms. Beat a prior approach led by a Nobel laureate on downstream evaluation.

2. Close-up of Vesicular ER Exit Sites by Volume Electron Imaging using FIB-SEM — Journal of Cell Biology, 2026

Volumetric reconstruction of mammalian ER exit sites at unprecedented resolution via FIB-SEM and learned segmentation. Co-author, Kirchhausen Lab.

3. UNET for Semi-Supervised Segmentation — Journal of Cell Biology, 2025

Co-author. Semi-supervised UNET segmentation as part of the broader Kirchhausen Lab program.

4. Ultrafast 2DIR comparison of rotational energy transfer, isolated binary collision breakdown, and near-critical fluctuations in Xe and SF₆ solutions — Journal of Chemical Physics, Nov 2022

First publication. Femtosecond two-dimensional infrared spectroscopy of N₂O dynamics in supercritical xenon and SF₆ — IBC breakdown and critical slowing near the critical point. Companion paper in Journal of Physical Chemistry Letters (ACS).


Open-source contributions

  • pytorch/pytorch#144779 — Diagnosed and reported a Rendezvous (RDZV) backend issue affecting Infiniband-backed multi-node distributed training during SpatialDINO scale-out on Harvard's DGX cluster.
  • ArkashJ/merkle_tree — Rust Merkle-tree POC (SHA-256 + canonical JSON hashing + sorted pairwise concat) for tamper-evident sync of NATS-streamed records. Built during ZeroSync internship; full write-up at /knowledge/distributed-systems/merkle-tree-rust-poc.
  • ArkashJ/Raft — From-scratch Raft consensus implementation in Go: leader election, log replication, snapshot install RPC.
  • ArkashJ/CloudComputing — BU CS591 coursework: MapReduce, Spark, distributed kv-store experiments.
  • ArkashJ/NEXMARK-Benchmark — Streaming-systems benchmark suite implementation against Apache Flink.
  • ArkashJ/implict-SGD-implementation — Implicit stochastic gradient descent — convergence improvements for ill-conditioned problems.
  • ArkashJ/CS411_labs — TA materials and reference solutions for the BU undergraduate software-engineering course.
  • ArkashJ/excel_connector — Yeoman-scaffolded JavaScript Office Add-in talking to a Rust + NATS pipeline.

Profile: github.com/ArkashJ.


Experience

| Role | Org | Dates | Location |
| --- | --- | --- | --- |
| Head of FDE (Forward Deployed Strategist & Engineer) | Benmore Technologies | Aug 2025 – present | Remote (Chicago HQ) |
| ML Researcher | Harvard Medical School / Kirchhausen Lab | May 2024 – Aug 2025 | Boston, MA |
| SWE Intern (Rust) | ZeroSync | May – Aug 2023 | Remote |
| SWE Intern (ALS resource tool) | Boston Children's Hospital | Jan – May 2023 | Boston, MA |
| TA + Distributed Systems Researcher | Boston University | 2021 – 2024 | Boston, MA |
| Analyst, Diligence Extern | Battery Ventures | May – Aug 2022 | Boston, MA |
| Analyst, Sourcing Extern | Battery Ventures | Dec 2021 – Apr 2022 | Boston, MA |
| Undergraduate Researcher (NSF UROP) | BU Chemistry / Ziegler Lab | Jan – Aug 2021 | Boston, MA |

Detailed bullets per role live at arkashj.com/experience.


Internal tooling I author and maintain

| Tool | Purpose | Stack |
| --- | --- | --- |
| Benmore Foundry CLI | Internal orchestration layer for SMB AI consulting engagements: kicks off scoped agents, books work, manages handoffs | Python · Typer · Claude Code |
| RTK (Rust Token Killer) | Token-optimized CLI proxy for Claude Code; 60–90% token savings on dev operations through transparent rewrite hooks | Rust · Claude Code Hooks |
| Compound Engineering Skills | Authored Claude Code skills for code review, debugging, planning, brainstorming, frontend design; used by the team daily | Markdown · Claude Code Skills |
| Excalidraw Discovery Flows | Reusable client-discovery diagram set used during scoping engagements at Benmore | Excalidraw · Process |

Internal tools page: arkashj.com/work.


Projects

A representative slice — full list with GitHub links lives at arkashj.com/projects.

  • SpatialDINO (2025) — First 3D SSL ViT for cryo-ET subcellular structures. PyTorch · FSDP · DGX · cryo-ET.
  • Raft (Go) (2023) — From-scratch consensus impl: leader election, log replication, snapshotting.
  • CloudComputing (2023) — BU CS591 — MapReduce, Spark, distributed kv-store.
  • NEXMARK Benchmark (2023) — Streaming-systems benchmark against Apache Flink.
  • Implicit SGD (2024) — Convergence on ill-conditioned problems.
  • CS411 Labs (2023) — TA materials for BU software-engineering course.
  • Benmore Foundry CLI (2025) — Python · Typer · Claude Code orchestration.
  • Dynamic Checkpointing in Apache Flink (2024) — BU thesis; adaptive cadence on backpressure.
  • OCaml Interpreter (2022) — Tree-walking interpreter for a typed functional language.
  • Spotify ↔ YouTube transfer (2022) — Migrate playlists between music platforms via API matching.
  • STU STREET podcast (2022 –) — Co-hosted long-form interviews on WTBU.
  • ALS Resource Tool (2023) — Resource-discovery tool for ALS patients — Django + Postgres at Boston Children's.
  • merkle_tree (Rust) (2023) — Tamper-evident sync POC at ZeroSync.

Knowledge — the six domains

The site organizes durable, citable knowledge into six domains. Each page is an MDX deep-dive with external sources.

| Domain | What's in it |
| --- | --- |
| AI | Self-supervised learning, vision transformers, distributed training infrastructure (FSDP, NCCL, Infiniband), SpatialDINO lessons. |
| Finance | Aggregation theory, AI infrastructure as a structural trade, public thesis tracker. |
| Distributed Systems | Flink, RocksDB, Raft, MapReduce, compression, checkpointing, Merkle trees. |
| Math | Optimizers, convergence, intuition behind the proofs. |
| Physics | Supercritical fluids, nuclear reactor efficiency, why I left physics. |
| Software | Stack evolution, Claude Code, the tools that make me 10×: TypeScript strict, ergonomics, agentic engineering. |

Writing

arkash.substack.com — weekly long-form essays. Topics: AI hardware, economics, finance, geopolitics, venture strategy.

Site-hosted essays are MDX files rendered at /writing/[slug].

Also active on Medium @arkjain — 7 distributed-systems articles.


Media — podcasts, press, talks

  • STU STREET — co-hosted long-form interview podcast. Originally on WTBU. 25 episodes, including conversations with Benmore leadership and BU faculty. Available on Apple Podcasts and Spotify; episode embeds live at arkashj.com/media.
  • Benmore talks — internal talks, embedded on /media.
  • Trustpilot reviews — 5★ social proof, surfaced on /media.

Verifiable credentials

PDF originals are committed to the repository under public/images/files/ and served from the website at /credentials.

The full credentials page including external profile cross-references (BU CS, Kirchhausen Lab, ORCID, PubMed) lives at arkashj.com/credentials.


About this repository

The remainder of this README is for engineers reading the source. Skip if you only came for the bio.

Quickstart

git clone https://github.com/ArkashJ/Personal-Website.git
cd Personal-Website
npm install
npm run dev          # http://localhost:3000

Production build:

npm run build && npm run start

Stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 15.5 (App Router) + React 19 |
| Language | TypeScript strict |
| Styling | Tailwind CSS 3 + CSS variables for dark/light theme |
| Fonts | Geist Sans + Geist Mono via next/font (geist package) |
| Content | MDX rendered via next-mdx-remote/rsc (server components, zero client JS) |
| Embeds | Tweet · YouTube · LinkedIn · Substack · Gist (in MDX) via react-tweet and bespoke components |
| Theme | next-themes with data-theme attribute, sun/moon toggle |
| Icons | lucide-react |
| Search | cmdk (Cmd+K command palette) |
| Markdown | react-markdown + remark-gfm + rehype-slug for /docs |
| SEO | Native app/sitemap.ts + app/robots.ts + per-page OG via next/og |
| JSON-LD | Person · Article · ScholarlyArticle schemas via lib/structured-data.ts |
| Deployment | Vercel (auto-deploy from main via GitHub integration) |
| CI | GitHub Actions: lint + format check + build on every PR / push |
| Pre-commit | Husky + lint-staged (Prettier + ESLint on staged files) |

Repository tree

Personal-Website/
├── app/                         # Next.js App Router
│   ├── about/                   # Life Changelog
│   │   ├── archive/             # Pre-revamp legacy bio (snapshot)
│   │   └── timeline/[slug]/     # Per-milestone deep dives
│   ├── architecture/            # 6 React/SVG diagrams of the stack
│   ├── coursework/              # BU + Harvard coursework hub
│   │   └── [slug]/
│   ├── credentials/             # Verifiable PDF credentials
│   ├── docs/                    # In-site rendering of docs/*.md
│   │   └── [slug]/
│   ├── experience/              # 8 reverse-chrono work entries
│   ├── knowledge/               # 6 domain hub
│   │   └── [domain]/[slug]/     # MDX deep dives
│   ├── learnings/               # 12+ hard-won lessons
│   ├── media/                   # Podcasts · press · Substack · Medium
│   ├── projects/                # Real GitHub projects
│   ├── research/                # 4 papers + ML stack + PyTorch contribution
│   ├── stack/                   # uses.tech-style — 36 entries × 7 categories
│   ├── VC/                      # Server redirect (legacy)
│   ├── Volunteering/            # Server redirect (legacy)
│   ├── work/                    # Internal CLIs (Foundry, RTK, Skills, Excalidraw)
│   ├── writing/                 # Tagged essay index
│   │   └── [slug]/              # MDX articles
│   ├── apple-icon.tsx           # Apple touch icon
│   ├── globals.css              # Tailwind + CSS-variable theme
│   ├── layout.tsx               # Root layout (Person JSON-LD, Nav, Footer, fonts)
│   ├── manifest.ts              # PWA manifest
│   ├── not-found.tsx            # 404
│   ├── opengraph-image.tsx      # Static OG for /
│   ├── page.tsx                 # Homepage (Hero, Arc, Now, Research, Work, Projects)
│   ├── robots.ts                # robots.txt MetadataRoute
│   └── sitemap.ts               # sitemap.xml MetadataRoute
├── components/
│   ├── architecture/            # SVG diagrams
│   ├── docs/                    # Doc-rendering helpers
│   ├── embeds/                  # <Tweet> <YouTube> <LinkedInPost> <Substack> <Gist>
│   ├── layout/                  # <Nav> <Footer> <Container> <SectionHeader> <Pill> <HeroDemo>
│   ├── sections/                # <PaperCard> <ProjectCard> <TimelineItem> ...
│   ├── seo/                     # <JsonLd>
│   ├── ui/                      # <BackLink> <CommandPalette> <InstitutionLogo> ...
│   ├── MdxContent.tsx           # next-mdx-remote/rsc renderer
│   ├── ThemeProvider.tsx
│   └── ThemeToggle.tsx
├── content/
│   ├── coursework/              # Course MDX
│   │   ├── fall-2023/
│   │   └── spring-2023/
│   ├── knowledge/               # 6 domain MDX directories
│   │   ├── ai/
│   │   ├── distributed-systems/
│   │   ├── finance/
│   │   ├── math/
│   │   ├── physics/
│   │   └── software/
│   └── writing/                 # Long-form MDX essays
├── lib/                         # Typed data + helpers (single source of truth)
│   ├── content.ts               # MDX frontmatter loaders
│   ├── coursework.ts            # Course data
│   ├── data.ts                  # Papers, experience, projects, timeline, ...
│   ├── docs.ts                  # docs/*.md loader for /docs
│   ├── finance.ts               # Theses + trade log
│   ├── learnings.ts             # Learnings cards
│   ├── media.ts                 # Podcast, Medium, Substack, press
│   ├── metadata.ts              # buildMetadata() factory
│   ├── og.tsx                   # Shared 1200×630 OG renderer
│   ├── site.ts                  # SITE constants + NAV_LINKS
│   ├── stack.ts                 # uses.tech entries
│   └── structured-data.ts       # Person · Article · ScholarlyArticle JSON-LD
├── public/
│   ├── favicon.svg              # Next.js favicon convention
│   ├── images/                  # SINGLE consolidated location for all site images
│   │   ├── profile.jpeg         # Author photo (Timeline, JSON-LD, OG fallback)
│   │   ├── logos/               # Institution logos (BU, Harvard, BCH, NSF)
│   │   ├── files/               # Verifiable PDF credentials
│   │   └── legacy/              # Pre-revamp assets (archival; do not add new content)
│   ├── timeline/                # Reserved for per-milestone hero images
│   ├── llms.txt · llms-full.txt # AI crawler guidance
│   ├── humans.txt · robots.txt
│   └── *.html · *.txt           # Google + IndexNow verification keys
├── docs/
│   ├── architecture/            # ASCII flow archive
│   ├── screenshots/             # Dev screenshots (gitignored except this dir)
│   └── superpowers/             # Specs, plans, notes from build sessions
├── scripts/                     # Vercel + IndexNow + build utilities
├── types/                       # Type augmentation (CSS imports, etc.)
├── .github/workflows/ci.yml     # GitHub Actions — lint + format + build
├── .husky/                      # Pre-commit hooks
├── CHANGELOG.md                 # Every release
├── CLAUDE.md                    # Agent / Claude Code orientation
├── LICENSE                      # Apache 2.0
├── README.md                    # ← you are here
├── next.config.js               # Security headers + image remotePatterns
├── package.json                 # Dependencies + scripts
├── tailwind.config.js           # Theme + sharp edges
├── tsconfig.json                # Strict mode, @/* aliases
└── vercel.json                  # Headers + caching

Routes

/                                — Hero · Arc · Now · Research · Work · Projects · Knowledge · Writing
/about                           — Life Changelog
/about/timeline/[slug]           — Per-milestone deep dive
/about/archive                   — Pre-revamp legacy bio
/research                        — 4 papers + ML stack + PyTorch contribution
/experience                      — 8 reverse-chrono entries
/projects                        — 13 real projects with GitHub links
/work                            — Foundry · RTK · Skills · Excalidraw
/writing                         — Essay index
/writing/[slug]                  — MDX article
/knowledge                       — 6 domains
/knowledge/[domain]              — domain index
/knowledge/[domain]/[slug]       — MDX deep dive
/coursework                      — BU + Harvard coursework
/coursework/[slug]               — Course detail
/credentials                     — Verifiable PDFs
/media                           — Podcasts, Medium, Substack, press
/stack                           — uses.tech-style page (36 × 7)
/learnings                       — 12+ lessons
/architecture                    — 6 React/SVG diagrams
/docs · /docs/[slug]             — In-site rendering of docs/*.md
/sitemap.xml                     — All static + dynamic MDX routes
/robots.txt                      — Allow-all + sitemap pointer
/manifest.webmanifest            — PWA manifest

Common commands

npm run dev                # Dev server
npm run build              # Production build (must pass before push)
npm run start              # Serve the built output
npm run lint               # ESLint
npm run lint:fix           # ESLint --fix
npm run format             # Prettier write
npm run format:check       # Prettier check (CI uses this)

vercel deploy --prod --yes # Manual production deploy

# Branch flow
git checkout -b feat/short-name
git add -A && git commit -m "feat(scope): one-line"
git push -u origin feat/short-name
gh pr create --base main --head feat/short-name --title "..." --body "..."
gh pr merge --squash --delete-branch

Adding content

| Surface | How |
| --- | --- |
| New writing post | Add MDX to content/writing/*.mdx with frontmatter (title, date, tags, description). Picked up automatically. |
| New knowledge article | content/knowledge/[domain]/*.mdx. New domain = new folder + entry in KNOWLEDGE_DOMAINS in lib/data.ts. |
| New paper / experience / project / internal tool | Edit the relevant array in lib/data.ts. |
| New podcast / Medium / Substack / press | lib/media.ts. |
| New thesis / trade | lib/finance.ts. |
| New stack entry | lib/stack.ts. |
| New learning | lib/learnings.ts. |
| New nav link | NAV_LINKS in lib/site.ts. |
| New course | lib/coursework.ts + optional MDX in content/coursework/. |
| New verifiable credential | Drop PDF in public/images/files/ + entry in app/credentials/page.tsx. |
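
The four frontmatter fields required for a writing post (title, date, tags, description) could be checked with something like the sketch below. The interface and both function names are hypothetical; the real loader in lib/content.ts may be shaped differently, and the ISO date format is an assumption.

```typescript
// Hypothetical shape of the frontmatter fields listed in the table above.
interface WritingFrontmatter {
  title: string;
  date: string; // assumed to be an ISO date such as "2026-04-26"
  tags: string[];
  description: string;
}

// Returns a list of problems; an empty list means the frontmatter looks usable.
function validateFrontmatter(raw: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of ["title", "date", "description"]) {
    const value = raw[key];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`${key} must be a non-empty string`);
    }
  }
  const date = raw["date"];
  if (typeof date === "string" && date.trim() !== "" && Number.isNaN(Date.parse(date))) {
    problems.push("date must parse as a date");
  }
  const tags = raw["tags"];
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    problems.push("tags must be an array of strings");
  }
  return problems;
}

// Throws with every problem at once so a broken post fails loudly at build time.
function parseFrontmatter(raw: Record<string, unknown>): WritingFrontmatter {
  const problems = validateFrontmatter(raw);
  if (problems.length > 0) throw new Error(problems.join("; "));
  return raw as unknown as WritingFrontmatter;
}

const post = parseFrontmatter({
  title: "The FDE feedback loop",
  date: "2026-04-26",
  tags: ["software"],
  description: "Example entry for illustration only.",
});
console.log(post.tags.length); // → 1
```

Collecting all problems before throwing, rather than stopping at the first, keeps the fix-and-rebuild cycle short when a post is missing several fields.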

Embedding social posts in MDX

<Tweet id="1234567890" />
<YouTube id="dQw4w9WgXcQ" />
<LinkedInPost urn="7165432109876543210" />
<Substack publication="arkash" slug="some-post" />
<Gist user="ArkashJ" id="abc123def456" />

Components live in components/embeds/, wired via components/MdxContent.tsx.

SEO infrastructure

Because this site is the central evidence hub for an O-1 visa application, SEO is load-bearing.

  • app/sitemap.ts enumerates every static and dynamic MDX route.
  • app/robots.ts allow-all + sitemap pointer.
  • Per-page generateMetadata() via buildMetadata() (lib/metadata.ts).
  • Per-page JSON-LD via lib/structured-data.ts — Person on every page; ScholarlyArticle on research; Article on writing.
  • Static OG images for top-level routes via opengraph-image.tsx.
  • Dynamic per-post OG for MDX routes (opengraph-image.tsx colocated in dynamic route folders) using next/og + the shared template in lib/og.tsx.
  • IndexNow keyfile committed at the repo root (.indexnow-key); submission script in scripts/.
  • Google + Bing verification keys in public/.
  • LLM crawler guidance: public/llms.txt and public/llms-full.txt.
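
In the Next.js App Router, app/sitemap.ts default-exports a function that returns an array of entries typed as MetadataRoute.Sitemap. The sketch below shows one way the "every static and dynamic MDX route" enumeration might be assembled. It is not the repository's code: SitemapEntry is a local stand-in for the framework type so the example runs without Next.js installed, and buildSitemap and its parameters are assumptions.

```typescript
// Local stand-in for one entry of Next.js's MetadataRoute.Sitemap.
interface SitemapEntry {
  url: string;
  lastModified: Date;
}

const BASE_URL = "https://www.arkashj.com";

// Static routes come from a fixed list; dynamic routes are derived from the MDX
// slugs found under each section (for example "/writing" or "/knowledge/ai").
function buildSitemap(
  staticRoutes: string[],
  mdxSlugs: Record<string, string[]>,
  now: Date,
): SitemapEntry[] {
  const paths = [...staticRoutes];
  for (const [section, slugs] of Object.entries(mdxSlugs)) {
    for (const slug of slugs) paths.push(`${section}/${slug}`);
  }
  return paths.map((path) => ({
    url: path === "/" ? BASE_URL : `${BASE_URL}${path}`,
    lastModified: now,
  }));
}

const entries = buildSitemap(
  ["/", "/about", "/research"],
  {
    "/writing": ["the-fde-feedback-loop"],
    "/knowledge/ai": ["spatialdino-lessons"],
  },
  new Date(),
);
console.log(entries.map((e) => e.url).join("\n"));
```

In the real file the slug lists would presumably come from the MDX loaders rather than literals, so a new post appears in the sitemap without a manual edit.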

Image conventions

All site images live under public/images/ as of v2.3.0. References:

  • Static assets in MDX or JSX: /images/<subpath>/<file>.
  • TypeScript imports for static optimization: import x from '@/public/images/...'.
  • Dev / exploratory captures go in docs/screenshots/ (screenshots are gitignored everywhere except this directory).

ffmpeg demo recipes

# Screen recording → WebM (primary)
ffmpeg -i recording.mov -c:v libvpx-vp9 -crf 30 -b:v 0 -an -vf "scale=1200:-2" public/demos/[name]/demo.webm

# MP4 fallback (Safari)
ffmpeg -i recording.mov -c:v libx264 -crf 23 -an -vf "scale=1200:-2" public/demos/[name]/demo.mp4

# GIF (GitHub READMEs only)
ffmpeg -i recording.mov -vf "fps=12,scale=900:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" output.gif

Custom domain

This repo claims https://www.arkashj.com everywhere (sitemap, JSON-LD, OG images). To make that real:

vercel domains add arkashj.com
# Vercel prints DNS records — add at your registrar:
#   arkashj.com         A     76.76.21.21
#   www.arkashj.com     CNAME cname.vercel-dns.com
# SSL provisions automatically.

Until that's done the site lives at the latest vercel.app deploy URL (run vercel ls to see).

Documentation

  • 📘 CLAUDE.md — agent / Claude Code orientation (current architecture, data layer, image conventions)
  • 📋 CHANGELOG.md — every release, every change
  • 📂 docs/HANDOFF.md — extended project orientation
  • 📂 docs/TODO.md — open work, prioritized
  • 🎨 docs/superpowers/specs/2026-04-26-personal-website-revamp-design.md — original v2 design intent
  • 🏗️ arkashj.com/architecture — 6 live React/SVG diagrams of the running stack
  • 🌐 /docs route — same docs rendered as a polished in-site reading experience

License

Apache 2.0 © 2026 Arkash Jain — see LICENSE for full text.