arkashj.com
Personal website, canonical knowledge hub, and O-1 visa evidence portfolio for Arkash Jain — first-author of SpatialDINO, Harvard ML researcher (Kirchhausen Lab), Head of Forward Deployed Engineering at Benmore Technologies, four-time published author across cell biology, chemical physics, and self-supervised learning.
This README is intentionally long. It consolidates the website's contents — biography, credentials, publications, experience, knowledge, projects, internal tools, life changelog, stack, and external links — into a single navigable Markdown reference. Treat it as the canonical text-only mirror of arkashj.com.
Table of contents
- Who is Arkash · How to reach me
- Identity at a glance
- Life changelog — 2018 → present
- Education
- Awards & honors
- Publications
- Open-source contributions
- Experience
- Internal tooling I author and maintain
- Projects
- Knowledge — the six domains
- Writing
- Media — podcasts, press, talks
- Verifiable credentials
- About this repository
- Quickstart · Stack · Repository tree · Routes
- Adding content
- SEO infrastructure
- Image conventions
- License
Who is Arkash
Arkash Jain is a researcher and engineer working at the intersection of self-supervised computer vision, distributed systems, and forward-deployed AI consulting. He is currently Head of Forward Deployed Engineering at Benmore Technologies in Chicago, where he was the second technical hire and helped scale revenue 887% across 2025–2026 while serving as the lead engineer on engagements across SaaS, healthcare, NIL athletics, compliance (Vanta / SOC 2 / NIST / FedRAMP), and consumer verticals.
Before Benmore, he spent fifteen months as an ML researcher in the Kirchhausen Lab at Harvard Medical School / Boston Children's Hospital, where he designed SpatialDINO — the first 3D self-supervised vision transformer for cryo-electron tomography. SpatialDINO beat a prior approach led by a Nobel laureate on downstream subcellular structure prediction, was first-authored, and is currently a BioRxiv preprint. Two follow-on papers from the same lab were published in the Journal of Cell Biology in 2025 and 2026.
He started college at Boston University in 2020 as a physics student. By the end of his freshman year he was a UROP Scholar (1 of 5 freshmen selected university-wide), working on ultrafast 2D infrared spectroscopy of supercritical fluids in Larry Ziegler's lab. That work culminated in a co-authored 2022 paper in the Journal of Chemical Physics. He then pivoted into computer science, completed an accelerated BA/MS dual-degree in Math + CS / CS in four years, graduated Magna Cum Laude, was named the Marvin Freedman Scholar (1 of 6 in the entire BU mathematics department), TA'd four classes (CS411, CS131, EK301, MA581), and authored a Master's thesis on dynamic checkpointing in Apache Flink under the BU distributed-systems group.
In parallel he interned twice at Battery Ventures (sourcing + diligence), once at Boston Children's Hospital (an ALS resource discovery web app built to Section 508 / WCAG 2.1 AA), and once at ZeroSync (production Rust on tokio + NATS JetStream + a Merkle-tree POC for tamper-evident sync). He also turned down VC offers from MassMutual Ventures and State Street to stay at Battery, and was admitted to UCL, NTU Singapore, NYU, and Dartmouth before choosing BU.
He writes weekly at arkash.substack.com on AI hardware, finance, distributed systems, geopolitics, and venture strategy. He is the co-host of the STU STREET podcast (long-form interviews with founders, athletes, and professors, originally on WTBU). He has 7 distributed-systems articles on Medium and active accounts on X / Twitter and LinkedIn.
He arrived in the United States from Chandigarh, India in September 2020. This website is the central evidence hub for his O-1 visa application — every page is structured to be Google-indexed, link-rich, and built around verifiable external citations.
How to reach me
- Email — arkash@benmore.tech
- GitHub — @ArkashJ
- LinkedIn — /in/arkashj
- X / Twitter — @_arkash
- Substack — arkash.substack.com
- Medium — @arkjain
- ORCID — 0000-0003-2692-7472
- PubMed — arkash jain
- Google Scholar — profile
- BU CS profile — bu.edu/cs/profiles/arkash-jain
- Harvard / Kirchhausen Lab profile — kirchhausen.hms.harvard.edu
- STU STREET podcast — Apple · Spotify · YouTube (multi-episode)
Identity at a glance
| Field | Details |
|---|---|
| Name | Arkash Jain |
| Title | Head of Forward Deployed Engineering, Benmore Technologies |
| Prior | ML Researcher, Kirchhausen Lab — Harvard Medical School / Boston Children's Hospital |
| Education | Harvard University (Postgraduate Research, Computer Vision & AI, 2024–2025) · Boston University (BA Math + CS, MS CS — 4-yr accelerated, 2020–2024, Magna Cum Laude) |
| Citizenship | Indian national (in the US since Sep 2020); O-1 visa applicant |
| Hometown | Chandigarh, India |
| Currently in | Boston / Chicago (remote-first) |
| Languages spoken | English, Hindi, Punjabi |
| Languages programmed | Python, TypeScript, Rust, Go, Java, OCaml, C, R, MATLAB, JavaScript, Bash |
| First paper | Nov 2022 — J. Chem. Phys. (chemical physics) |
| First-author paper | 2025 — SpatialDINO on BioRxiv (3D self-supervised ViT) |
| Open-source | PyTorch issue #144779 — RDZV Infiniband backend |
| Currently writing | Weekly long-form essays at arkash.substack.com |
Life changelog
A chronological retelling of the path so far, in six named phases (Phase 0 through Phase 5). Headings link to the on-site deep-dive timeline entries where one exists; otherwise the supporting source is linked inline.
Phase 0 — Wanting to be a physicist (2018–2020, Chandigarh, India)
Two years of E&M, particle physics, organic and physical chemistry, and optics — the standard JEE Advanced curriculum for kids who wanted to do physics in India. Sat JEE Advanced (≈1M candidates), placed roughly All-India rank 8,000 (top percentile). Sat AP Calculus, AP Physics C: Mechanics, and AP Physics C: E&M. Was admitted to University College London, Nanyang Technological University Singapore, Boston University, NYU, and Dartmouth. Picked Boston University — the physics department and proximity to MIT and Harvard labs were the deciding factors.
Phase 1 — Physicist at BU (Sep 2020 – 2022)
Arrived in the United States in September 2020. Joined Larry Ziegler's ultrafast spectroscopy lab as a freshman under PhD student Matt Rotondaro. Aligned femtosecond ultrafast laser systems for 2D infrared spectroscopy. Prepared supercritical Xe and SF₆ fluid samples for near-critical-density studies. Wrote the auto-correlation analysis code for rotational and vibrational energy relaxation traces.
Selected as NSF UROP Scholar (1 of 5 freshmen across the entire university) for the project "Ultrafast Two Dimensional Infrared Spectroscopy of Supercritical Fluids: Energy Relaxation and Local Critical Slowing Effects." Co-authored the Nov 2022 Journal of Chemical Physics paper on N₂O dynamics in supercritical solvents — IBC breakdown and critical slowing near the critical point — and a companion paper in the Journal of Physical Chemistry Letters (ACS).
The lab also taught me that I liked the math and the code more than the optics bench. By the end of sophomore year I had pivoted to computer science.
Phase 2 — Venture capital (Dec 2021 – Aug 2022)
Sourcing intern at Battery Ventures under Dallin Bills, working alongside GP Michael Brown. Got fluent in the early-stage B2B SaaS investment vocabulary: Rule of 40, ARR growth vs. burn multiples, gross retention vs. logo churn vs. net dollar retention, the magic number, CAC payback. Sourced three deals to partner-meeting stage (including CarNow). Got VC offers from MassMutual Ventures (Feb 2022) and State Street (Summer 2022); turned both down to return to Battery for a diligence summer.
Diligence intern at Battery the following summer — embedded with a portfolio company on its EU expansion strategy: pricing, GTM motion, regulatory fit, competitive landscape across the European market.
Phase 3 — Engineer + researcher at BU (2022 – May 2024)
Admitted to BU's accelerated BA/MS in Computer Science (BA Math + CS / MS CS, four years instead of six). TA'd four classes for ~300 students each: CS411 (Software Engineering), CS131 (Discrete Mathematics), EK301 (Mechanics), MA581 (Probability). Co-instructed a 300-level Mechanics course as a sophomore while carrying an 18-credit load and a 20-hour work week.
Seven classes that year, each documented as a coursework deep-dive:
- CS 350 — Distributed Systems: from-scratch Raft consensus in Go — leader election, log replication, snapshot install RPC. Plus a MapReduce coordinator + worker with plugin-loaded map/reduce functions.
- CS 320 — Concepts of Programming Languages: hand-written lexer, parser, and stack-machine evaluator for a BNF language in OCaml.
- DS 522 — Optimization: Adam / AMSGrad / RMSProp comparison; article evaluations of Reddi 2018 and Toulis 2016 (implicit SGD).
- CS 561 — Data Mechanics / Cloud: built a generated-HTML mini-internet and ran PageRank locally vs. on GCP.
- CS 630 — Advanced Algorithms: 331,776-instance enumeration of Gale-Shapley to study average-case behavior; reservoir sampling.
- CS 611 — OOP & Design Patterns: Monsters & Heroes turn-based RPG + a Java Swing trading platform with singleton persistence.
- MA 582 — Mathematical Statistics (graduate): rigorous inference — MLE, sufficient statistics, MGFs, asymptotics.
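For the CS 630 item above, here is a minimal sketch of proposer-optimal Gale-Shapley that also counts proposals, the kind of per-instance cost an average-case enumeration could tally. This is not the coursework code: the function name `galeShapley`, the `Prefs` type, and the 3×3 example instance are illustrative assumptions. As a side note on the count, 331,776 equals 24^4 = (4!)^4, which would be consistent with enumerating four preference lists over four candidates, though the source does not say how the instances were generated.

```typescript
// prefs[i] is person i's ranking of the other side, best first.
type Prefs = number[][];

// Proposer-optimal Gale-Shapley. Returns match[p] = receiver matched to
// proposer p, plus the number of proposals made along the way.
function galeShapley(
  proposers: Prefs,
  receivers: Prefs,
): { match: number[]; proposals: number } {
  const n = proposers.length;
  // rank[r][p] = position of proposer p in receiver r's list (lower is better).
  const rank = receivers.map((list) => {
    const positions = new Array<number>(n).fill(0);
    list.forEach((p, pos) => {
      positions[p] = pos;
    });
    return positions;
  });
  const next = new Array<number>(n).fill(0); // next list index each proposer tries
  const holder = new Array<number>(n).fill(-1); // holder[r] = proposer r currently holds
  const free: number[] = [];
  for (let p = n - 1; p >= 0; p--) free.push(p); // stack of unmatched proposers

  let proposals = 0;
  while (free.length > 0) {
    const p = free.pop() as number;
    const r = proposers[p][next[p]];
    next[p] += 1;
    proposals += 1;
    const current = holder[r];
    if (current === -1) {
      holder[r] = p; // receiver was unmatched: accept
    } else if (rank[r][p] < rank[r][current]) {
      holder[r] = p; // receiver trades up; old partner becomes free
      free.push(current);
    } else {
      free.push(p); // rejected; p will try the next receiver on its list
    }
  }

  const match = new Array<number>(n).fill(-1);
  holder.forEach((p, r) => {
    match[p] = r;
  });
  return { match, proposals };
}

// Hypothetical 3x3 instance, for illustration only.
const men: Prefs = [[0, 1, 2], [0, 2, 1], [1, 0, 2]];
const women: Prefs = [[1, 0, 2], [0, 1, 2], [0, 1, 2]];
const result = galeShapley(men, women);
console.log(result.match, result.proposals);
```

On this instance the matching is `[1, 0, 2]` after 6 proposals. Summing `proposals` over every instance in an enumeration and dividing by the instance count gives an empirical average-case cost.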
Distributed Systems Research / BU Master's thesis: dynamic checkpointing in Apache Flink. Static checkpoint intervals are a tax in idle periods and a stall during bursts — built a controller that adapted cadence from live backpressure signals. Instrumented the Flink JobManager to surface per-operator backpressure ratios as a control signal. Benchmarked RocksDB state backend against in-memory; quantified write-amplification tradeoffs. Validated on the NEXMARK streaming benchmark; measured tail-latency wins on bursty workloads.
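As a rough illustration of the idea in the thesis summary above, the sketch below smooths per-operator backpressure ratios and maps the result onto a checkpoint interval clamped between a minimum and a maximum. It is not the thesis controller and does not call any Flink API. Every name here (`CheckpointIntervalController`, `ControllerConfig`, `onBackpressure`) is hypothetical, and because the source does not say whether the interval grows or shrinks under pressure, the direction is left as a configuration choice.

```typescript
type PressureResponse = "lengthen" | "shorten";

interface ControllerConfig {
  minIntervalMs: number;
  maxIntervalMs: number;
  smoothing: number; // weight of the newest sample, in (0, 1]
  onBackpressure: PressureResponse; // assumption: policy direction is configurable
}

class CheckpointIntervalController {
  private readonly cfg: ControllerConfig;
  private smoothed = 0; // exponentially smoothed backpressure, in [0, 1]

  constructor(cfg: ControllerConfig) {
    this.cfg = cfg;
  }

  // operatorRatios: per-operator backpressure ratios in [0, 1].
  // The most backpressured operator drives the decision.
  nextIntervalMs(operatorRatios: number[]): number {
    const worst = operatorRatios.reduce((m, r) => Math.max(m, r), 0);
    const sample = Math.min(1, Math.max(0, worst));
    this.smoothed =
      this.cfg.smoothing * sample + (1 - this.cfg.smoothing) * this.smoothed;
    const position =
      this.cfg.onBackpressure === "lengthen" ? this.smoothed : 1 - this.smoothed;
    const span = this.cfg.maxIntervalMs - this.cfg.minIntervalMs;
    return Math.round(this.cfg.minIntervalMs + position * span);
  }
}

// Illustrative values only.
const controller = new CheckpointIntervalController({
  minIntervalMs: 1000,
  maxIntervalMs: 9000,
  smoothing: 0.5,
  onBackpressure: "lengthen",
});
console.log(controller.nextIntervalMs([0.2, 1.0])); // first sample, smoothed to 0.5
console.log(controller.nextIntervalMs([0.2, 1.0])); // second sample, smoothed to 0.75
```

The smoothing term keeps a single noisy sample from swinging the cadence; a production controller would also need hysteresis and a guard against changing the interval while a checkpoint is in flight.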
Internships during this phase:
- Boston Children's Hospital (Spring 2023) — Built an internal ALS resource-discovery web app for clinicians and patient families. React frontend wired to a Strapi headless CMS so non-technical staff could update content without a deploy. Swagger-documented REST API. Section 508 / WCAG 2.1 AA compliance — keyboard-only nav, ARIA landmarks, focus-visible rings, sufficient contrast, skip links, screen-reader tested. Validated search/filter UX with real ALS clinicians.
- ZeroSync (Summer 2023) — First production-grade Rust. Built an Excel-side marketplace and a server-side ingestion pipeline that converted unstructured data (CSVs, JSON dumps, free-form Excel) into structured records flowing through NATS JetStream. Spent the first two weeks deep in The Rust Book — the borrow checker forces you to internalize ownership, lifetimes, and Send/Sync before you can ship anything async.
tokio + async-trait for concurrent I/O across hundreds of NATS subjects. Merkle-tree POC (repo) — SHA-256 + canonical JSON hashing + sorted pairwise concat — for tamper-evident sync of records across the pipeline. Excel side: JavaScript Office Add-in scaffolded with Yeoman (yo generator-office); generated TLS dev certificates with office-addin-dev-certs, trusted them in macOS Keychain, wired their paths into nats.conf so the add-in published over TLS.
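The three ingredients named for the Merkle-tree POC (SHA-256, canonical JSON hashing, sorted pairwise concatenation) can be sketched as follows. The original POC is in Rust; this TypeScript version is only an illustration of the technique, not a port of that repo. The helper names, the empty-input convention, and the rule for promoting an odd node are assumptions.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Canonical JSON: object keys are sorted recursively, so records that are
// equal up to key order serialize to the same string.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalJson(v)).join(",")}]`;
  const keys = Object.keys(value).sort();
  const parts = keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(value[k])}`);
  return `{${parts.join(",")}}`;
}

function sha256Hex(data: string): string {
  return createHash("sha256").update(data, "utf8").digest("hex");
}

// Sorted pairwise concat: order the two child hashes before hashing them
// together, so the parent does not depend on which sibling came first.
function hashPair(a: string, b: string): string {
  const [lo, hi] = a <= b ? [a, b] : [b, a];
  return sha256Hex(lo + hi);
}

function merkleRoot(records: Json[]): string {
  // Assumption: an empty batch hashes to SHA-256 of the empty string.
  if (records.length === 0) return sha256Hex("");
  let level = records.map((r) => sha256Hex(canonicalJson(r)));
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // Assumption: an unpaired last node is promoted unchanged.
      next.push(i + 1 < level.length ? hashPair(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

// Illustrative records only.
const root = merkleRoot([
  { id: 1, name: "alpha" },
  { id: 2, name: "beta" },
  { id: 3, name: "gamma" },
]);
console.log(root);
```

Two parties that compute the same root over their copies of a batch agree on every record; a changed field anywhere produces a different root, which is what makes the sync tamper-evident.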
Graduated Magna Cum Laude from Boston University in May 2024 — BA in Math & CS, MS in CS — and was named the Marvin Freedman Scholar (1 of 6 mathematics undergraduates department-wide).
Phase 4 — Harvard Medical School / Kirchhausen Lab (May 2024 – Aug 2025)
Joined the Kirchhausen Lab at Harvard Medical School / Boston Children's Hospital, under Tom Kirchhausen (member, National Academy of Arts and Sciences). The lab images subcellular structures at near-atomic resolution via cryo-electron tomography; my job was to make the resulting volumes interpretable at scale.
Trained on multi-node DGX clusters: A100 / H100 GPUs, NVLink intra-node, Infiniband inter-node, RAID + custom NVMe storage tier. Used PyTorch FSDP with bf16 mixed precision and activation checkpointing to fit large 3D vision transformers. Diagnosed and reported a Rendezvous (RDZV) backend issue affecting Infiniband multi-node training — filed PyTorch issue #144779.
SpatialDINO — designed and trained the first 3D self-supervised vision transformer for subcellular structure prediction from cryo-electron tomograms. Adapted DINO-style self-supervised self-distillation into 3D — student/teacher ViTs over volumetric tomograms. Pretrained on unannotated tomograms; fine-tuned on a tiny labeled set for vesicle / organelle classification. Beat the prior SOTA, including a Nobel-laureate-led approach, on downstream evaluation. Released as a BioRxiv preprint, first-author. The lessons are written up at /knowledge/ai/spatialdino-lessons.
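To make the student/teacher mechanism concrete, here is a minimal sketch of the generic DINO objective for one pair of views: a cross-entropy between a centered, sharpened teacher distribution and the student distribution, plus the exponential-moving-average update that lets the teacher track the student. It operates on output logits, so it says nothing about how SpatialDINO handles 3D volumes, and it is not taken from that codebase. The function names and the default temperatures are illustrative assumptions.

```typescript
// Numerically stable log-softmax with a temperature.
function logSoftmax(logits: number[], temperature: number): number[] {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled);
  const logSum = Math.log(scaled.reduce((acc, x) => acc + Math.exp(x - max), 0));
  return scaled.map((x) => x - max - logSum);
}

// Cross-entropy H(teacher, student) for one pair of views.
// The teacher output is centered (to discourage collapse onto one dimension)
// and sharpened with a lower temperature; in training, no gradient flows
// through the teacher.
function dinoLoss(
  studentLogits: number[],
  teacherLogits: number[],
  center: number[],
  studentTemp = 0.1, // illustrative default
  teacherTemp = 0.04, // illustrative default
): number {
  const centered = teacherLogits.map((x, i) => x - center[i]);
  const teacherProbs = logSoftmax(centered, teacherTemp).map((v) => Math.exp(v));
  const studentLogProbs = logSoftmax(studentLogits, studentTemp);
  return -teacherProbs.reduce((acc, p, i) => acc + p * studentLogProbs[i], 0);
}

// Teacher parameters follow the student as an exponential moving average.
function emaUpdate(teacher: number[], student: number[], momentum: number): number[] {
  return teacher.map((t, i) => momentum * t + (1 - momentum) * student[i]);
}

// Toy 3-dimensional outputs, for illustration only.
const center = [0, 0, 0];
const teacherOut = [2, 0, -1];
console.log(dinoLoss([2, 0, -1], teacherOut, center)); // student agrees: small loss
console.log(dinoLoss([-1, 0, 2], teacherOut, center)); // student disagrees: large loss
console.log(emaUpdate([1, 1], [0, 0], 0.9));
```

Because the target comes from the teacher rather than from labels, the same loss applies to unannotated data, which is what allows pretraining on unlabeled tomograms before fine-tuning on a small labeled set.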
Two co-authored follow-on papers appeared in the Journal of Cell Biology (2025 and 2026): a volumetric reconstruction of mammalian ER exit sites at unprecedented resolution via FIB-SEM and learned segmentation; and a UNET architecture for semi-supervised segmentation. (JCB 225, e202504178)
Phase 5 — Benmore Technologies (Aug 2025 – present)
Joined Benmore Technologies as Employee #2 — Forward Deployed Strategist & Engineer. Embedded into client engineering teams, scoped systems end-to-end, and shipped production code from day one. Onboarded the first ten clients across SaaS, healthcare, NIL athletics, compliance (Vanta / SOC 2 / NIST / FedRAMP), and consumer verticals — including Patriot Safety Services (compliance for Chevron / NextTier, $5–10M / yr), Nobel Gas, and Sun Theory.
Cross-stack: Stripe, Django, Next.js, FastAPI, React Native, plus authoring Claude Code skill systems at scale (one of the first companies to use Claude Code commercially in production engagements). Authored the Benmore Foundry CLI — internal orchestration layer for SMB AI consulting engagements.
Promoted to Head of FDE in April 2026, leading the forward-deployed engineering practice across all client engagements. Revenue acceleration during this period: $150k total → $150k every 15 days — 887% growth in six months. Headcount 8 → 40. The full mechanism is written up at /writing/the-fde-feedback-loop.
Education
| Institution | Degree | Dates | Notes |
|---|---|---|---|
| Harvard University | Postgraduate Research, Computer Vision & AI | 2024–2025 | Kirchhausen Lab. SpatialDINO + 2 Journal of Cell Biology papers. PyTorch upstream contribution. |
| Boston University | BA — Mathematics + Computer Science · MS — Computer Science (4-yr accelerated dual-degree) | 2020–2024 | Magna Cum Laude. Marvin Freedman Scholar (1 of 6 in math dept). NSF UROP Scholar (1 of 5 freshmen university-wide). Dean's List every semester. Master's thesis: Dynamic Checkpointing in Apache Flink. |
Verifiable credentials (PDF) are linked under Verifiable credentials below.
Awards & honors
- NSF UROP Scholar (2021) — 1 of 5 freshmen selected university-wide at BU. Project: Ultrafast Two-Dimensional Infrared Spectroscopy of Supercritical Fluids: Energy Relaxation and Local Critical Slowing Effects. — UROP Symposium 2021 brochure (PDF) · UROP 2020-2021 awarded students
- Marvin Freedman Scholar (2024) — 1 of 6 mathematics undergraduates honored department-wide at BU.
- Magna Cum Laude (2024) — Boston University.
- Dean's List — every semester at BU.
- JEE Advanced (2019) — ≈8,000 / ≈1M All-India rank.
- Battery Ventures — admitted to two consecutive internships (sourcing 2021–22, diligence 2022); chosen over MassMutual Ventures and State Street offers.
- University admissions (2020) — UCL, NTU Singapore, BU, NYU, Dartmouth (chose BU).
- PyTorch open-source contribution — issue #144779 accepted into the upstream tracker for the RDZV / Infiniband backend.
Publications
Four formal publications: the first-author preprint first, then the co-authored papers in reverse-chronological order. All have permanent DOI / preprint server URLs.
1. SpatialDINO: Self-Supervised Learning for 3D Vision Transformers — BioRxiv, 2025
First-author, Kirchhausen Lab, Harvard Medical School. The first 3D self-supervised vision transformer for subcellular structure prediction from cryo-electron tomograms. Beat a prior approach led by a Nobel laureate on downstream evaluation.
2. Close-up of Vesicular ER Exit Sites by Volume Electron Imaging using FIB-SEM — Journal of Cell Biology, 2026
Volumetric reconstruction of mammalian ER exit sites at unprecedented resolution via FIB-SEM and learned segmentation. Co-author, Kirchhausen Lab.
3. UNET for Semi-Supervised Segmentation — Journal of Cell Biology, 2025
Co-author. Semi-supervised UNET segmentation as part of the broader Kirchhausen Lab program.
4. Ultrafast 2DIR comparison of rotational energy transfer, isolated binary collision breakdown, and near-critical fluctuations in Xe and SF₆ solutions — Journal of Chemical Physics, Nov 2022
First publication. Femtosecond two-dimensional infrared spectroscopy of N₂O dynamics in supercritical xenon and SF₆ — IBC breakdown and critical slowing near the critical point. Companion paper in Journal of Physical Chemistry Letters (ACS).
- DOI: 10.1063/5.0118395
- AIP: pubs.aip.org/aip/jcp/article-abstract/157/17/174305
- PubMed: 36347695
- ACS companion: J. Phys. Chem. Lett.
Open-source contributions
- pytorch/pytorch#144779 — Diagnosed and reported a Rendezvous (RDZV) backend issue affecting Infiniband-backed multi-node distributed training during SpatialDINO scale-out on Harvard's DGX cluster.
- ArkashJ/merkle_tree — Rust Merkle-tree POC (SHA-256 + canonical JSON hashing + sorted pairwise concat) for tamper-evident sync of NATS-streamed records. Built during ZeroSync internship; full write-up at /knowledge/distributed-systems/merkle-tree-rust-poc.
- ArkashJ/Raft — From-scratch Raft consensus implementation in Go: leader election, log replication, snapshot install RPC.
- ArkashJ/CloudComputing — BU CS591 coursework: MapReduce, Spark, distributed kv-store experiments.
- ArkashJ/NEXMARK-Benchmark — Streaming-systems benchmark suite implementation against Apache Flink.
- ArkashJ/implict-SGD-implementation — Implicit stochastic gradient descent — convergence improvements for ill-conditioned problems.
- ArkashJ/CS411_labs — TA materials and reference solutions for the BU undergraduate software-engineering course.
- ArkashJ/excel_connector — Yeoman-scaffolded JavaScript Office Add-in talking to a Rust + NATS pipeline.
Profile: github.com/ArkashJ.
Experience
| Role | Org | Dates | Location |
|---|---|---|---|
| Head of FDE — Forward Deployed Strategist & Engineer | Benmore Technologies | Aug 2025 – present | Remote (Chicago HQ) |
| ML Researcher | Harvard Medical School / Kirchhausen Lab | May 2024 – Aug 2025 | Boston, MA |
| SWE Intern (Rust) | ZeroSync | May – Aug 2023 | Remote |
| SWE Intern (ALS resource tool) | Boston Children's Hospital | Jan – May 2023 | Boston, MA |
| TA + Distributed Systems Researcher | Boston University | 2021 – 2024 | Boston, MA |
| Analyst — Diligence Extern | Battery Ventures | May – Aug 2022 | Boston, MA |
| Analyst — Sourcing Extern | Battery Ventures | Dec 2021 – Apr 2022 | Boston, MA |
| Undergraduate Researcher (NSF UROP) | BU Chemistry / Ziegler Lab | Jan – Aug 2021 | Boston, MA |
Detailed bullets per role live at arkashj.com/experience.
Internal tooling I author and maintain
| Tool | Purpose | Stack |
|---|---|---|
| Benmore Foundry CLI | Internal orchestration layer for SMB AI consulting engagements — kicks off scoped agents, books work, manages handoffs | Python · Typer · Claude Code |
| RTK — Rust Token Killer | Token-optimized CLI proxy for Claude Code; 60–90% token savings on dev operations through transparent rewrite hooks | Rust · Claude Code Hooks |
| Compound Engineering Skills | Authored Claude Code skills for code review, debugging, planning, brainstorming, frontend design — used by team daily | Markdown · Claude Code Skills |
| Excalidraw Discovery Flows | Reusable client-discovery diagram set used during scoping engagements at Benmore | Excalidraw · Process |
Internal tools page: arkashj.com/work.
Projects
A representative slice — full list with GitHub links lives at arkashj.com/projects.
- SpatialDINO (2025) — First 3D SSL ViT for cryo-ET subcellular structures. PyTorch · FSDP · DGX · cryo-ET.
- Raft (Go) (2023) — From-scratch consensus impl: leader election, log replication, snapshotting.
- CloudComputing (2023) — BU CS591 — MapReduce, Spark, distributed kv-store.
- NEXMARK Benchmark (2023) — Streaming-systems benchmark against Apache Flink.
- Implicit SGD (2024) — Convergence on ill-conditioned problems.
- CS411 Labs (2023) — TA materials for BU software-engineering course.
- Benmore Foundry CLI (2025) — Python · Typer · Claude Code orchestration.
- Dynamic Checkpointing in Apache Flink (2024) — BU thesis; adaptive cadence on backpressure.
- OCaml Interpreter (2022) — Tree-walking interpreter for a typed functional language.
- Spotify ↔ YouTube transfer (2022) — Migrate playlists between music platforms via API matching.
- STU STREET podcast (2022 –) — Co-hosted long-form interviews on WTBU.
- ALS Resource Tool (2023) — Resource-discovery tool for ALS patients — React + Strapi at Boston Children's.
- merkle_tree (Rust) (2023) — Tamper-evident sync POC at ZeroSync.
Knowledge — the six domains
The site organizes durable, citable knowledge into six domains. Each page is an MDX deep-dive with external sources.
| Domain | What's in it |
|---|---|
| AI | Self-supervised learning, vision transformers, distributed training infrastructure (FSDP, NCCL, Infiniband), SpatialDINO lessons. |
| Finance | Aggregation theory, AI infrastructure as a structural trade, public thesis tracker. |
| Distributed Systems | Flink, RocksDB, Raft, MapReduce, compression, checkpointing, Merkle trees. |
| Math | Optimizers, convergence, intuition behind the proofs. |
| Physics | Supercritical fluids, nuclear reactor efficiency, why I left physics. |
| Software | Stack evolution, Claude Code, the tools that make me 10× — TypeScript strict, ergonomics, agentic engineering. |
Writing
arkash.substack.com — weekly long-form essays. Topics: AI hardware, economics, finance, geopolitics, venture strategy.
Site-hosted essays (MDX, render at /writing/[slug]):
- why-fde — why forward-deployed engineering is the right model right now.
- the-fde-feedback-loop — the engagement mechanism behind 887% growth.
- o1-visa-evidence-hub — building a website as O-1 evidence.
- distributed-checkpointing — adaptive cadence under backpressure (Flink thesis).
- sample-ai-hardware — the AI hardware stack from training to deploy.
Also active on Medium @arkjain — 7 distributed-systems articles.
Media — podcasts, press, talks
- STU STREET — co-hosted long-form interview podcast. Originally on WTBU. 25 episodes, including conversations with Benmore leadership and BU faculty. Available on Apple Podcasts and Spotify; episode embeds live at arkashj.com/media.
- Benmore talks — internal talks, embedded on /media.
- Trustpilot reviews — 5★ social proof, surfaced on /media.
Verifiable credentials
PDF originals are committed to the repository under public/images/files/ and served from the website at /credentials:
- BA — Mathematics & CS, Boston University
- MS — Computer Science, Boston University
- Harvard University ID
The full credentials page including external profile cross-references (BU CS, Kirchhausen Lab, ORCID, PubMed) lives at arkashj.com/credentials.
About this repository
The remainder of this README is for engineers reading the source. Skip if you only came for the bio.
Quickstart
git clone https://github.com/ArkashJ/Personal-Website.git
cd Personal-Website
npm install
npm run dev # http://localhost:3000
Production build:
npm run build && npm run start
Stack
| Layer | Technology |
|---|---|
| Framework | Next.js 15.5 (App Router) + React 19 |
| Language | TypeScript strict |
| Styling | Tailwind CSS 3 + CSS variables for dark/light theme |
| Fonts | Geist Sans + Geist Mono via next/font (geist package) |
| Content | MDX rendered via next-mdx-remote/rsc (server components, zero client JS) |
| Embeds | Tweet · YouTube · LinkedIn · Substack · Gist (in MDX) via react-tweet and bespoke components |
| Theme | next-themes with data-theme attribute, sun/moon toggle |
| Icons | lucide-react |
| Search | cmdk — Cmd+K command palette |
| Markdown | react-markdown + remark-gfm + rehype-slug for /docs |
| SEO | Native app/sitemap.ts + app/robots.ts + per-page OG via next/og |
| JSON-LD | Person · Article · ScholarlyArticle schemas via lib/structured-data.ts |
| Deployment | Vercel (auto-deploy from main via GitHub integration) |
| CI | GitHub Actions — lint + format check + build on every PR / push |
| Pre-commit | Husky + lint-staged (Prettier + ESLint on staged files) |
Repository tree
Personal-Website/
├── app/ # Next.js App Router
│ ├── about/ # Life Changelog
│ │ ├── archive/ # Pre-revamp legacy bio (snapshot)
│ │ └── timeline/[slug]/ # Per-milestone deep dives
│ ├── architecture/ # 6 React/SVG diagrams of the stack
│ ├── coursework/ # BU + Harvard coursework hub
│ │ └── [slug]/
│ ├── credentials/ # Verifiable PDF credentials
│ ├── docs/ # In-site rendering of docs/*.md
│ │ └── [slug]/
│ ├── experience/ # 8 reverse-chrono work entries
│ ├── knowledge/ # 6 domain hub
│ │ └── [domain]/[slug]/ # MDX deep dives
│ ├── learnings/ # 12+ hard-won lessons
│ ├── media/ # Podcasts · press · Substack · Medium
│ ├── projects/ # Real GitHub projects
│ ├── research/ # 4 papers + ML stack + PyTorch contribution
│ ├── stack/ # uses.tech-style — 36 entries × 7 categories
│ ├── VC/ # Server redirect (legacy)
│ ├── Volunteering/ # Server redirect (legacy)
│ ├── work/ # Internal CLIs (Foundry, RTK, Skills, Excalidraw)
│ ├── writing/ # Tagged essay index
│ │ └── [slug]/ # MDX articles
│ ├── apple-icon.tsx # Apple touch icon
│ ├── globals.css # Tailwind + CSS-variable theme
│ ├── layout.tsx # Root layout (Person JSON-LD, Nav, Footer, fonts)
│ ├── manifest.ts # PWA manifest
│ ├── not-found.tsx # 404
│ ├── opengraph-image.tsx # Static OG for /
│ ├── page.tsx # Homepage (Hero, Arc, Now, Research, Work, Projects)
│ ├── robots.ts # robots.txt MetadataRoute
│ └── sitemap.ts # sitemap.xml MetadataRoute
├── components/
│ ├── architecture/ # SVG diagrams
│ ├── docs/ # Doc-rendering helpers
│ ├── embeds/ # <Tweet> <YouTube> <LinkedInPost> <Substack> <Gist>
│ ├── layout/ # <Nav> <Footer> <Container> <SectionHeader> <Pill> <HeroDemo>
│ ├── sections/ # <PaperCard> <ProjectCard> <TimelineItem> ...
│ ├── seo/ # <JsonLd>
│ ├── ui/ # <BackLink> <CommandPalette> <InstitutionLogo> ...
│ ├── MdxContent.tsx # next-mdx-remote/rsc renderer
│ ├── ThemeProvider.tsx
│ └── ThemeToggle.tsx
├── content/
│ ├── coursework/ # Course MDX
│ │ ├── fall-2023/
│ │ └── spring-2023/
│ ├── knowledge/ # 6 domain MDX directories
│ │ ├── ai/
│ │ ├── distributed-systems/
│ │ ├── finance/
│ │ ├── math/
│ │ ├── physics/
│ │ └── software/
│ └── writing/ # Long-form MDX essays
├── lib/ # Typed data + helpers (single source of truth)
│ ├── content.ts # MDX frontmatter loaders
│ ├── coursework.ts # Course data
│ ├── data.ts # Papers, experience, projects, timeline, ...
│ ├── docs.ts # docs/*.md loader for /docs
│ ├── finance.ts # Theses + trade log
│ ├── learnings.ts # Learnings cards
│ ├── media.ts # Podcast, Medium, Substack, press
│ ├── metadata.ts # buildMetadata() factory
│ ├── og.tsx # Shared 1200×630 OG renderer
│ ├── site.ts # SITE constants + NAV_LINKS
│ ├── stack.ts # uses.tech entries
│ └── structured-data.ts # Person · Article · ScholarlyArticle JSON-LD
├── public/
│ ├── favicon.svg # Next.js favicon convention
│ ├── images/ # SINGLE consolidated location for all site images
│ │ ├── profile.jpeg # Author photo (Timeline, JSON-LD, OG fallback)
│ │ ├── logos/ # Institution logos (BU, Harvard, BCH, NSF)
│ │ ├── files/ # Verifiable PDF credentials
│ │ └── legacy/ # Pre-revamp assets (archival; do not add new content)
│ ├── timeline/ # Reserved for per-milestone hero images
│ ├── llms.txt · llms-full.txt # AI crawler guidance
│ ├── humans.txt · robots.txt
│ └── *.html · *.txt # Google + IndexNow verification keys
├── docs/
│ ├── architecture/ # ASCII flow archive
│ ├── screenshots/ # Dev screenshots (gitignored except this dir)
│ └── superpowers/ # Specs, plans, notes from build sessions
├── scripts/ # Vercel + IndexNow + build utilities
├── types/ # Type augmentation (CSS imports, etc.)
├── .github/workflows/ci.yml # GitHub Actions — lint + format + build
├── .husky/ # Pre-commit hooks
├── CHANGELOG.md # Every release
├── CLAUDE.md # Agent / Claude Code orientation
├── LICENSE # Apache 2.0
├── README.md # ← you are here
├── next.config.js # Security headers + image remotePatterns
├── package.json # Dependencies + scripts
├── tailwind.config.js # Theme + sharp edges
├── tsconfig.json # Strict mode, @/* aliases
└── vercel.json # Headers + caching
Routes
/ — Hero · Arc · Now · Research · Work · Projects · Knowledge · Writing
/about — Life Changelog
/about/timeline/[slug] — Per-milestone deep dive
/about/archive — Pre-revamp legacy bio
/research — 4 papers + ML stack + PyTorch contribution
/experience — 8 reverse-chrono entries
/projects — 13 real projects with GitHub links
/work — Foundry · RTK · Skills · Excalidraw
/writing — Essay index
/writing/[slug] — MDX article
/knowledge — 6 domains
/knowledge/[domain] — domain index
/knowledge/[domain]/[slug] — MDX deep dive
/coursework — BU + Harvard coursework
/coursework/[slug] — Course detail
/credentials — Verifiable PDFs
/media — Podcasts, Medium, Substack, press
/stack — uses.tech-style page (36 × 7)
/learnings — 12+ lessons
/architecture — 6 React/SVG diagrams
/docs · /docs/[slug] — In-site rendering of docs/*.md
/sitemap.xml — All static + dynamic MDX routes
/robots.txt — Allow-all + sitemap pointer
/manifest.webmanifest — PWA manifest
Common commands
npm run dev # Dev server
npm run build # Production build (must pass before push)
npm run start # Serve the built output
npm run lint # ESLint
npm run lint:fix # ESLint --fix
npm run format # Prettier write
npm run format:check # Prettier check (CI uses this)
vercel deploy --prod --yes # Manual production deploy
# Branch flow
git checkout -b feat/short-name
git add -A && git commit -m "feat(scope): one-line"
git push -u origin feat/short-name
gh pr create --base main --head feat/short-name --title "..." --body "..."
gh pr merge --squash --delete-branch
Adding content
| Surface | How |
|---|---|
| New writing post | Add MDX to content/writing/*.mdx with frontmatter (title, date, tags, description). Picked up automatically. |
| New knowledge article | content/knowledge/[domain]/*.mdx. New domain = new folder + entry in KNOWLEDGE_DOMAINS in lib/data.ts. |
| New paper / experience / project / internal tool | Edit the relevant array in lib/data.ts. |
| New podcast / Medium / Substack / press | lib/media.ts. |
| New thesis / trade | lib/finance.ts. |
| New stack entry | lib/stack.ts. |
| New learning | lib/learnings.ts. |
| New nav link | NAV_LINKS in lib/site.ts. |
| New course | lib/coursework.ts + optional MDX in content/coursework/. |
| New verifiable credential | Drop PDF in public/images/files/ + entry in app/credentials/page.tsx. |
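For the writing-post row in the table above, the sketch below shows one way the four frontmatter fields (title, date, tags, description) could be read from an MDX source string. It is a hand-rolled illustration, not the loader in lib/content.ts, whose implementation is not shown in this README; `parseFrontmatter` and `PostFrontmatter` are hypothetical names, and the parser only handles flat `key: value` lines with an inline `[a, b]` tag list.

```typescript
interface PostFrontmatter {
  title: string;
  date: string;
  tags: string[];
  description: string;
}

const stripQuotes = (s: string): string => s.replace(/^["']|["']$/g, "");

// Hypothetical helper: splits a leading `---` block from the MDX body and
// validates that the four expected fields are present.
function parseFrontmatter(source: string): { data: PostFrontmatter; body: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) throw new Error("missing frontmatter block");

  const fields = new Map<string, string>();
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":"); // split on the first colon only
    if (idx === -1) continue;
    fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }

  const required = (key: string): string => {
    const value = fields.get(key);
    if (!value) throw new Error(`frontmatter is missing "${key}"`);
    return stripQuotes(value);
  };

  const tags = required("tags")
    .replace(/^\[|\]$/g, "")
    .split(",")
    .map((t) => stripQuotes(t.trim()))
    .filter((t) => t.length > 0);

  return {
    data: {
      title: required("title"),
      date: required("date"),
      tags,
      description: required("description"),
    },
    body: source.slice(match[0].length),
  };
}

// Illustrative post; the slug and field values are made up.
const example = [
  "---",
  'title: "Example post"',
  "date: 2026-01-01",
  "tags: [ai, distributed-systems]",
  "description: A placeholder description",
  "---",
  "# Heading",
  "",
].join("\n");
console.log(parseFrontmatter(example).data);
```

Failing loudly on a missing field at load time is a deliberate choice here: a post with no description would otherwise ship with empty metadata.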
Embedding social posts in MDX
<Tweet id="1234567890" />
<YouTube id="dQw4w9WgXcQ" />
<LinkedInPost urn="7165432109876543210" />
<Substack publication="arkash" slug="some-post" />
<Gist user="ArkashJ" id="abc123def456" />
Components live in components/embeds/, wired via components/MdxContent.tsx.
SEO infrastructure
Because this site is the central evidence hub for an O-1 visa application, SEO is load-bearing.
- app/sitemap.ts enumerates every static and dynamic MDX route.
- app/robots.ts: allow-all + sitemap pointer.
- Per-page generateMetadata() via buildMetadata() (lib/metadata.ts).
- Per-page JSON-LD via lib/structured-data.ts — Person on every page; ScholarlyArticle on research; Article on writing.
- Static OG images for top-level routes via opengraph-image.tsx.
- Dynamic per-post OG for MDX routes (opengraph-image.tsx colocated in dynamic route folders) using next/og + the shared template in lib/og.tsx.
- IndexNow keyfile committed at the repo root (.indexnow-key); submission script in scripts/.
- Google + Bing verification keys in public/.
- LLM crawler guidance: public/llms.txt and public/llms-full.txt.
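As an illustration of the per-page JSON-LD item above, the sketch below builds a schema.org Person and a ScholarlyArticle object and serializes one into a script tag. It is not the contents of lib/structured-data.ts; the function names and the exact set of properties are assumptions, while the name, URLs, ORCID, title, and DOI are the ones listed elsewhere in this README.

```typescript
type JsonLd = Record<string, unknown>;

function personJsonLd(name: string, url: string, sameAs: string[]): JsonLd {
  return { "@context": "https://schema.org", "@type": "Person", name, url, sameAs };
}

function scholarlyArticleJsonLd(
  headline: string,
  datePublished: string,
  doi: string,
  author: JsonLd,
): JsonLd {
  return {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    headline,
    datePublished,
    identifier: `https://doi.org/${doi}`,
    author, // one of the paper's authors; a full record would list all of them
  };
}

// Escape "<" so a string value can never close the surrounding script tag.
function toJsonLdScript(data: JsonLd): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

const person = personJsonLd("Arkash Jain", "https://www.arkashj.com", [
  "https://github.com/ArkashJ",
  "https://orcid.org/0000-0003-2692-7472",
]);
const paper = scholarlyArticleJsonLd(
  "Ultrafast 2DIR comparison of rotational energy transfer, isolated binary collision breakdown, and near-critical fluctuations in Xe and SF₆ solutions",
  "2022-11",
  "10.1063/5.0118395",
  person,
);
console.log(toJsonLdScript(paper));
```

Keeping the builders as pure functions means the same objects can be unit-tested and reused by both the root layout (Person) and the research pages (ScholarlyArticle).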
Image conventions
All site images live under public/images/ as of v2.3.0. References:
- Static assets in MDX or JSX: /images/<subpath>/<file>.
- TypeScript imports for static optimization: import x from '@/public/images/...'.
- Dev / exploratory captures go in docs/screenshots/ (gitignored except docs/).
ffmpeg demo recipes
# Screen recording → WebM (primary)
ffmpeg -i recording.mov -c:v libvpx-vp9 -crf 30 -b:v 0 -an -vf "scale=1200:-2" public/demos/[name]/demo.webm
# MP4 fallback (Safari)
ffmpeg -i recording.mov -c:v libx264 -crf 23 -an -vf "scale=1200:-2" public/demos/[name]/demo.mp4
# GIF (GitHub READMEs only)
ffmpeg -i recording.mov -vf "fps=12,scale=900:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" output.gif
Custom domain
This repo claims https://www.arkashj.com everywhere (sitemap, JSON-LD, OG images). To make that real:
vercel domains add arkashj.com
# Vercel prints DNS records — add at your registrar:
# arkashj.com A 76.76.21.21
# www.arkashj.com CNAME cname.vercel-dns.com
# SSL provisions automatically.
Until that's done the site lives at the latest vercel.app deploy URL (run vercel ls to see).
Documentation
- 📘 CLAUDE.md — agent / Claude Code orientation (current architecture, data layer, image conventions)
- 📋 CHANGELOG.md — every release, every change
- 📂 docs/HANDOFF.md — extended project orientation
- 📂 docs/TODO.md — open work, prioritized
- 🎨 docs/superpowers/specs/2026-04-26-personal-website-revamp-design.md — original v2 design intent
- 🏗️ arkashj.com/architecture — 6 live React/SVG diagrams of the running stack
- 🌐 /docs route — same docs rendered as a polished in-site reading experience
License
Apache 2.0 © 2026 Arkash Jain — see LICENSE for full text.