1. Tools

  • Teamwork: git & github + CI/CD
  • Environment:
    • Docker & Singularity (Apptainer)
    • Linux (Debian & Ubuntu & RockyLinux & Fedora)
    • Windows & WSL2 (pin memory)
    • CMake + vcpkg
    • pip/conda/uv
  • IDE: VSCode (Linux/Windows) & VS (Windows)
  • Others: Bash & Powershell & Vim & Tmux

2. Languages & Frameworks

  • Languages: C & C++ & CUDA & Python & pytorch & libtorch & Triton
  • Inference: vLLM & sglang
  • SFT: trl & unsloth
  • RLHF: verl
  • Compute Graph: MLIR

4. Concepts

4.1. NLP

  • Inference:
    • TP (Megatron)
    • Quantization: PTQ & QAT & GPTQ
    • Pruning: Unstructured & Structured
    • Paged Attention & Flash Attention & MQA & GQA & DeepSeek Sparse Attention (MLA)
    • Prefix Caching
    • Continuous Batching
    • Chunked Prefill
    • Speculative Decoding
    • Sampling: Top-k & Top-p & Temperature & Beam Search
  • Training:
  • Cluster:
    • Apptainer + Slurm + Docker + Module
    • torchrun & deepspeed & accelerate & bitsandbytes
    • NCCL & Gloo & MPI
    • Ray
    • InfiniBand
    • Evaluation: HPCG