
📊 `pbs-tui`: TUI for PBS Job Scheduler Monitoring

A terminal dashboard for monitoring PBS Pro job schedulers with interactive keybindings and snapshot modes.

Sam Foreman 2025-09-17

pbs-tui

Figure 1: A terminal dashboard for monitoring PBS Pro schedulers

👀 Overview

pbs-tui is a terminal user interface built with Textual for monitoring PBS Pro schedulers at the Argonne Leadership Computing Facility.

The dashboard surfaces job, queue, and node activity in a single view and refreshes itself automatically so operators can track workload health in real time.

🐣 Getting Started

  • Try it with uv:

    # install uv if necessary
    # curl -LsSf https://astral.sh/uv/install.sh | sh
    uv run --with pbs-tui pbs-tui
  • Or install and run:

    python3 -m pip install pbs-tui
    pbs-tui

✨ Features

  • Live PBS data – prefers the JSON (-F json) output of qstat/pbsnodes and falls back to XML or text parsing so schedulers without newer flags continue to work.

    • Automatic refresh – updates every 30 seconds by default with a manual refresh binding (r).
    • Summary cards – quick totals for job states, node states, and queue health.
  • Inline snapshot – render the current queue as a Rich table with pbs-tui --inline

    • Save to file – write the snapshot to a Markdown file with pbs-tui --inline --file snapshot.md
  • Fallback sample data – optional bundled data makes it easy to demo the interface without connecting to a production scheduler (PBS_TUI_SAMPLE_DATA=1).

🎹 Key bindings

Table 1: Use the arrow keys/PageUp/PageDown to move through rows once a table has focus.

| Key | Action                   |
| --- | ------------------------ |
| q   | Quit the application     |
| r   | Refresh immediately      |
| j   | Focus the jobs table     |
| n   | Focus the nodes table    |
| u   | Focus the queues table   |
| ^-p | Open the command palette |

🧪 Sample mode

If you want to explore the UI without a live PBS cluster, export PBS_TUI_SAMPLE_DATA=1 (or pass force_sample=True to PBSDataFetcher). The application will display bundled example jobs, nodes, and queues along with a warning banner indicating that the data is synthetic.

Headless / automated runs

For automated testing or CI environments without an interactive terminal, you can run the TUI in headless mode by exporting PBS_TUI_HEADLESS=1. Pairing this with PBS_TUI_AUTOPILOT=quit triggers the q binding automatically after startup, so pbs-tui exits cleanly once the interface has rendered its first update.
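A CI smoke test built on these variables might look like the following Python sketch. The environment variable names come from the text above; the helper names (`headless_env`, `smoke_test`), the 60-second timeout, and the decision to also enable sample data are assumptions made for illustration. The run is skipped when `pbs-tui` is not on the PATH.

```python
import os
import shutil
import subprocess
from typing import Mapping, Optional


def headless_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with the headless toggles set."""
    env = dict(os.environ if base is None else base)
    env["PBS_TUI_HEADLESS"] = "1"       # run without an interactive terminal
    env["PBS_TUI_AUTOPILOT"] = "quit"   # trigger the q binding after startup
    env["PBS_TUI_SAMPLE_DATA"] = "1"    # assumption: the CI host has no PBS cluster
    return env


def smoke_test(timeout: float = 60.0) -> Optional[int]:
    """Run pbs-tui headlessly; return its exit code, or None if it is not installed."""
    exe = shutil.which("pbs-tui")
    if exe is None:
        return None
    result = subprocess.run([exe], env=headless_env(), timeout=timeout)
    return result.returncode


if __name__ == "__main__":
    code = smoke_test()
    print("pbs-tui not found; skipped" if code is None else f"exit code: {code}")
```

Passing the environment explicitly to `subprocess.run` keeps the toggles scoped to the child process instead of leaking into the rest of the test session.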

Inline snapshot mode

When running non-interactively you can emit a Rich-rendered table summarising the active PBS jobs instead of starting the Textual interface:

PBS_TUI_SAMPLE_DATA=1 pbs-tui --inline

The command prints a table that can be pasted into terminals that support Unicode box drawing. Pass --file snapshot.md alongside --inline to also write an aligned Markdown table to snapshot.md for sharing in chat or documentation systems. Any warnings raised while collecting data are written to standard error so they remain visible in logs.
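To illustrate what an aligned Markdown table involves, here is a minimal Python sketch. It is not pbs-tui's implementation: the function name `to_markdown_table`, the column headers, and the sample rows are all hypothetical. The warning is written to standard error, mirroring the behaviour described above.

```python
import sys
from typing import Iterable, Sequence


def to_markdown_table(headers: Sequence[str], rows: Iterable[Sequence[str]]) -> str:
    """Render rows as a Markdown table whose columns are padded to equal width."""
    rows = [list(map(str, row)) for row in rows]
    # Each column is as wide as its longest cell (at least 3, for the '---' rule).
    widths = [
        max(3, len(header), *(len(row[i]) for row in rows))
        for i, header in enumerate(headers)
    ]

    def line(cells: Sequence[str]) -> str:
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    rule = "| " + " | ".join("-" * width for width in widths) + " |"
    return "\n".join([line(headers), rule, *(line(row) for row in rows)])


if __name__ == "__main__":
    # Hypothetical job rows; real data would come from the scheduler.
    jobs = [("101.pbs", "train", "R"), ("102.pbs", "eval-long-name", "Q")]
    print(to_markdown_table(("Job ID", "Name", "State"), jobs))
    print("warning: rows above are made up", file=sys.stderr)
```

Padding every cell to the column width keeps the raw Markdown readable in a chat window or a diff, even before it is rendered.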

Architecture

  • pbs_tui.fetcher.PBSDataFetcher orchestrates qstat/pbsnodes calls, preferring JSON output and falling back to XML/text before converting everything into structured dataclasses (Job, Node, Queue).
  • pbs_tui.app.PBSTUI is the Textual application that renders the dashboard, periodically asks the fetcher for new data, and updates the widgets.
  • pbs_tui.samples.sample_snapshot provides the demonstration snapshot used when PBS commands cannot be executed.
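The JSON-first strategy with later fallbacks can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in pbs_tui.fetcher: the function names, the trimmed-down `Job` fields, the assumed JSON shape (a top-level `Jobs` mapping keyed by job ID), and the `key = value` text layout are hypothetical, and the XML step is left out for brevity.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Job:
    job_id: str
    name: str
    state: str


def parse_json(text: str) -> List[Job]:
    """Parse JSON shaped like {"Jobs": {id: {"Job_Name": ..., "job_state": ...}}}."""
    data = json.loads(text)  # raises ValueError when the text is not JSON
    return [
        Job(job_id, attrs.get("Job_Name", ""), attrs.get("job_state", ""))
        for job_id, attrs in data["Jobs"].items()
    ]


def parse_text(text: str) -> List[Job]:
    """Parse 'Job Id: X' headers followed by indented 'key = value' lines."""
    jobs: List[Job] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Job Id:"):
            jobs.append(Job(line.split(":", 1)[1].strip(), "", ""))
        elif " = " in line and jobs:
            key, value = line.split(" = ", 1)
            if key == "Job_Name":
                jobs[-1].name = value
            elif key == "job_state":
                jobs[-1].state = value
    return jobs


def parse_with_fallback(
    text: str, parsers: Sequence[Callable[[str], List[Job]]]
) -> List[Job]:
    """Try each parser in order and return the first result that succeeds."""
    errors = []
    for parser in parsers:
        try:
            return parser(text)
        except (ValueError, KeyError, TypeError, AttributeError) as exc:
            errors.append(f"{parser.__name__}: {exc}")
    raise ValueError("all parsers failed: " + "; ".join(errors))


if __name__ == "__main__":
    as_json = '{"Jobs": {"101.pbs": {"Job_Name": "train", "job_state": "R"}}}'
    as_text = "Job Id: 102.pbs\n    Job_Name = eval\n    job_state = Q\n"
    for output in (as_json, as_text):
        print(parse_with_fallback(output, [parse_json, parse_text]))
```

Ordering the parsers from most to least structured means the richest format wins whenever the scheduler supports it, while older installations still produce the same `Job` objects.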

The UI styles are defined in pbs_tui/app.tcss. Adjust the CSS to change layout or theme attributes.

Development notes

  • The application refresh interval defaults to 30 seconds. Pass a different value to PBSTUI(refresh_interval=...) if desired.
  • Errors encountered while running PBS commands are surfaced in the status bar so operators can quickly see when data is stale.
  • When both PBS utilities are unavailable and the fallback is disabled, the UI will show an empty dashboard with an error message in the status bar.
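The stale-data behaviour described in these notes can be illustrated with a small Python sketch: keep the last good snapshot and put the error into a status string. The names `DashboardState` and `refresh_once` are hypothetical, and the real application updates Textual widgets rather than plain strings.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class DashboardState:
    snapshot: Optional[Any] = None      # last successfully fetched data
    fetched_at: Optional[float] = None  # timestamp of that fetch
    status: str = "waiting for first refresh"


def refresh_once(
    state: DashboardState,
    fetch: Callable[[], Any],
    now: Callable[[], float] = time.time,
) -> DashboardState:
    """Fetch new data; on failure keep the old snapshot and report the error."""
    try:
        snapshot = fetch()
    except Exception as exc:  # broad on purpose: any failure should reach the status bar
        if state.fetched_at is None:
            age = "no data yet"
        else:
            age = f"data is {now() - state.fetched_at:.0f}s old"
        return DashboardState(state.snapshot, state.fetched_at, f"error: {exc} ({age})")
    return DashboardState(snapshot, now(), "ok")


if __name__ == "__main__":
    def failing_fetch():
        raise RuntimeError("qstat not found")

    state = refresh_once(DashboardState(), lambda: {"jobs": 3})
    state = refresh_once(state, failing_fetch)
    print(state.snapshot, "|", state.status)
```

Returning a new state instead of mutating the old one makes each refresh easy to test, and a timer (every 30 seconds by default, per the note above) would simply call `refresh_once` in a loop.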

Screenshots

  • pbs-tui
  • Keys and Help Panel
  • Command palette
  • Theme support