2 changes: 2 additions & 0 deletions .gitignore
@@ -81,3 +81,5 @@ compile_commands.json
# nvFuser environment configuration tool temporary files
nvfuser_env.sh
.nvfuser-apply.*.sh

venv
131 changes: 128 additions & 3 deletions README.md
@@ -94,13 +94,138 @@ If you already cloned without `--recursive`, initialize submodules:
git submodule update --init --recursive
```

-2. Install Python dependencies:
+2. Install system dependencies:

The build requires several system packages. The list below is probably
incomplete; it is expressed as the packages that must be added to the
`ubuntu:24.04` Docker image. Run the commands as `root`.

```bash
apt update  # in a fresh Docker container this may report errors until ca-certificates is installed
apt-get -y install ca-certificates wget  # wget is not part of the base ubuntu:24.04 image
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update
apt-get -y install \
  build-essential \
  cmake \
  cuda-toolkit-13-2 \
  git \
  libcurl4-openssl-dev \
  libedit-dev \
  libnccl-dev \
  libzstd-dev \
  llvm-dev \
  python3-pip \
  virtualenv \
  #
```

Some additional packages are required for runtime operation and tests:

```bash
apt-get install \
  cudnn9-cuda-13-2=9.22.0.52-1 \
  libcudnn9-cuda-13=9.22.0.52-1 \
  libcudnn9-static-cuda-13=9.22.0.52-1 \
  libcudnn9-dev-cuda-13=9.22.0.52-1 \
  libcudnn9-headers-cuda-13=9.22.0.52-1
```

The specific versions and packages must be adapted to suit the CUDA version
that you have installed and the dependencies of the matching version of the
`cudnn9-cuda-13-2` package. A plain `apt install cudnn9-cuda-13-2` may fail
with version conflicts among the automatically installed dependency
libraries.
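One way to keep the pinned versions consistent is to generate the package list from a single version string. The sketch below assumes the package names follow the pattern used in the command above; `cudnn_pins` is a hypothetical helper, not part of any tool. `apt-cache madison cudnn9-cuda-13-2` lists the versions that your configured repositories offer.

```shell
# Hypothetical helper: print pinned cuDNN package specs for one cuDNN version.
# Assumption: package names follow the pattern shown in the README command.
# Arguments: <cudnn-version> <cuda-major> <cuda-minor>
cudnn_pins() {
  version="$1"
  cuda_major="$2"
  cuda_minor="$3"
  # The meta-package carries both the CUDA major and minor version.
  printf 'cudnn9-cuda-%s-%s=%s\n' "$cuda_major" "$cuda_minor" "$version"
  # The library packages carry only the CUDA major version.
  for pkg in libcudnn9 libcudnn9-static libcudnn9-dev libcudnn9-headers; do
    printf '%s-cuda-%s=%s\n' "$pkg" "$cuda_major" "$version"
  done
}

# Example usage (as root), after checking the available versions:
#   apt-cache madison cudnn9-cuda-13-2
#   apt-get install $(cudnn_pins 9.22.0.52-1 13 2)
```

Pinning every package to the same version string avoids the conflict described above, where the dependency libraries resolve to a different version than the meta-package.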

3. Set up CUDA

Building against CUDA requires a few environment variables to be set. For
example:

```bash
cat >> ~/.bashrc <<'ENDOFHERE'
export CUDA_HOME=/usr/local/cuda
export PATH="${CUDA_HOME}/bin:${PATH}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export CUDACXX="${CUDA_HOME}/bin/nvcc"
ENDOFHERE
```

Then log out and back in, or otherwise start a new shell, so that the
variables take effect.
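To confirm that the new shell picked up the settings, a small check along the following lines may help. This is only a sketch: `check_cuda_env` is a hypothetical helper, and it verifies only that the variables are consistent with each other, not that the toolkit itself works.

```shell
# Hypothetical helper: sanity-check the CUDA-related environment variables.
check_cuda_env() {
  if [ -z "${CUDA_HOME:-}" ]; then
    echo "CUDA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "${CUDA_HOME}/bin/nvcc" ]; then
    echo "no executable nvcc under ${CUDA_HOME}/bin" >&2
    return 1
  fi
  # Check that the CUDA bin directory is one of the PATH entries.
  case ":${PATH}:" in
    *":${CUDA_HOME}/bin:"*) ;;
    *)
      echo "${CUDA_HOME}/bin is not on PATH" >&2
      return 1
      ;;
  esac
  echo "CUDA environment looks consistent: ${CUDA_HOME}"
}

# Example usage:
#   check_cuda_env && nvcc --version
```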

4. Create a Python virtual environment

Outside a virtual environment, `pip` installs packages into the system-wide
(or per-user) Python installation. Avoid this on a system whose Python is
managed by a package manager: on distributions that mark their Python as
externally managed (PEP 668), including Ubuntu 24.04, recent versions of `pip`
refuse to do so in order to prevent conflicts with system packages. Instead,
create a Python virtual environment and install everything into it:

```bash
virtualenv venv
. ./venv/bin/activate
```

5. Install Python dependencies:

```bash
pip install -r requirements.txt
```

-3. Build and install nvFuser:
Some build-time dependencies are missing from `requirements.txt`. To install
them, execute:

```bash
pip install setuptools wheel numpy
pip install torch --index-url https://download.pytorch.org/whl/cu132
```

The specific index URL for Torch should be adapted to suit the CUDA version
that you have installed.
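The index URL above encodes the CUDA version as `cu` followed by the major and minor version numbers without the dot. A minimal sketch of deriving it is shown below; `torch_index_url` is a hypothetical helper, and PyTorch does not publish wheels for every CUDA release, so check that the resulting index actually exists before relying on it.

```shell
# Hypothetical helper: build the PyTorch wheel index URL for a CUDA version.
# Assumption: the URL follows the "cu<major><minor>" pattern seen above.
torch_index_url() {
  ver="$1"
  case "$ver" in
    [0-9]*.[0-9]*) ;;
    *)
      echo "expected a version such as 13.2, got: $ver" >&2
      return 1
      ;;
  esac
  major="${ver%%.*}"   # text before the first dot
  rest="${ver#*.}"     # text after the first dot
  minor="${rest%%.*}"  # drop any patch component
  printf 'https://download.pytorch.org/whl/cu%s%s\n' "$major" "$minor"
}

# Example usage, reading the toolkit version from nvcc:
#   cuda_ver="$(nvcc --version | sed -n 's/.*release \([0-9]*\.[0-9]*\).*/\1/p')"
#   pip install torch --index-url "$(torch_index_url "$cuda_ver")"
```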

Some additional packages are required for runtime operation and tests:

```bash
pip install \
  apache-tvm-ffi \
  expecttest \
  looseversion \
  nvidia-cutlass-dsl \
  pytest \
  thunder \
  transformers
pip install \
  --no-build-isolation transformer_engine[pytorch]
Comment on lines +196 to +199

P1 The trailing backslash on the transformers line causes shell line continuation, so the second pip install and its arguments are concatenated onto the first command. The shell sees one long command: pip install apache-tvm-ffi ... transformers pip install --no-build-isolation transformer_engine[pytorch], which passes the literal strings pip, install, and --no-build-isolation as package names — the transformer_engine installation never runs as a separate command.

Suggested change
- thunder \
- transformers \
- pip install \
- --no-build-isolation transformer_engine[pytorch]
+ thunder \
+ transformers
+ pip install \
+ --no-build-isolation transformer_engine[pytorch]

pip install \
lightning-thunder
```

You may need to restart your shell before it finds the executables newly
installed into the virtual environment. To check, confirm that
`command -V pytest` points at the virtual environment's copy of `pytest`. If it
does not, `hash -r` or `hash -r pytest` may fix the issue by clearing the
shell's cache of command locations.
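The check described above can be sketched as a small function. `resolves_in_venv` is a hypothetical helper; it relies on the `VIRTUAL_ENV` variable that the activation script sets, and it treats any path under that directory as belonging to the virtual environment.

```shell
# Hypothetical helper: verify that a command resolves inside the active
# virtual environment. Assumes the activation script has set VIRTUAL_ENV.
resolves_in_venv() {
  cmd="$1"
  if [ -z "${VIRTUAL_ENV:-}" ]; then
    echo "no virtual environment is active" >&2
    return 1
  fi
  hash -r  # forget cached command locations before looking the command up
  path="$(command -v "$cmd")" || {
    echo "$cmd not found on PATH" >&2
    return 1
  }
  case "$path" in
    "${VIRTUAL_ENV}"/*)
      echo "$cmd -> $path"
      ;;
    *)
      echo "$cmd resolves outside the virtual environment: $path" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   resolves_in_venv pytest
```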

6. Optionally select a GPU architecture to build for

*IF* you are building within Docker, you will need to either:
- Expose your GPU to the container (not described here), or
- Set environment variables that tell the build process which GPU architecture
  to compile for (see below).

To find your GPU architecture, run the following on the host:

```bash
nvidia-smi --query-gpu=compute_cap --format=csv
```

To tell the build process which architecture to build for, export the reported
compute capability (`8.9` below is only an example):

```bash
export TORCH_CUDA_ARCH_LIST="8.9"
```
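The two steps above can be combined so that the value is derived from the `nvidia-smi` output rather than typed by hand. The following is a sketch under stated assumptions: `arch_list_from_caps` is a hypothetical helper, and it assumes that `TORCH_CUDA_ARCH_LIST` accepts a semicolon-separated list when the host has GPUs of more than one architecture.

```shell
# Hypothetical helper: turn `nvidia-smi --query-gpu=compute_cap` output (read
# from standard input) into a de-duplicated, semicolon-separated list.
arch_list_from_caps() {
  # Strip spaces and carriage returns, keep only lines that look like "8.9"
  # (this drops the CSV header), de-duplicate, and join with ";".
  tr -d ' \r' | grep -E '^[0-9]+\.[0-9]+$' | sort -u | paste -sd ';' -
}

# Example usage on the host; pass the resulting value into the container:
#   nvidia-smi --query-gpu=compute_cap --format=csv | arch_list_from_caps
#   export TORCH_CUDA_ARCH_LIST="8.9"   # value printed by the command above
```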

7. Build and install nvFuser:

```bash
pip install --no-build-isolation -e python -v
```

@@ -146,7 +271,7 @@ MAX_JOBS=8 NVFUSER_BUILD_BUILD_TYPE=Debug pip install --no-build-isolation -e py
Test your installation with a simple fusion:

```python
-python -c "import nvfuser; print('nvFuser successfully imported from:', nvfuser.__file__)"
+python -c "import nvfuser_direct as nvfuser; print('nvFuser successfully imported from:', nvfuser.__file__)"
```

Run the Python test suite: