Conda and Anaconda
Conda is a package manager, virtual environment manager, and more.
Conda is a package manager, similar to pip. It handles installing, updating, and removing your packages. Its advantages over pip are that it has built-in support for isolated environments for different projects, and that it can install data science libraries that are not written in Python (e.g., R or C libraries). It is widely used for data science.
Anaconda is a "batteries included" distribution of Python that includes over 150 data science packages. It uses Conda as its package manager.
Conda vs Pip
Both Conda and Pip are package managers written in Python. The following table shows the differences between the two [1]:
| |Conda|Pip|
|---|---|---|
|manages|binaries|wheel or source|
|can require compilers|no|yes|
|creates isolated environments|yes, built-in|no, requires virtualenv or venv|
|package sources|Anaconda repo or cloud|PyPI|
|recommended for|data science|Python-only code|
Setting up a Conda Environment
To use Conda, you need to configure your shell environment using the conda init command. You only need to do this once.
After that, whenever you log in to the HPC, your shell will be configured for Conda.
From a login node:
- If you need a specific version of Anaconda, use anaconda/VERSION as the module name. To see a list of versions available on the cluster, run module avail anaconda.
- If you are using a shell other than bash (e.g., fish), pass that shell's name to conda init instead of bash.
- If you are using a shell other than bash, you will need to source that shell's initialization file; if you do not know that file's name, you can log out and log back in to the HPC instead.
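The one-time setup described above can be sketched as a small shell function. This is a minimal sketch rather than the cluster's official procedure: the function name setup_conda is hypothetical, and it assumes that the default module is named anaconda (the name used with module avail anaconda) and that the standard module load command is available on the login node.

```shell
# One-time Conda shell setup, wrapped in a function (hypothetical name).
# Usage: setup_conda [SHELL_NAME]    e.g. setup_conda fish
setup_conda() {
    # Assumption: the default Anaconda module is called "anaconda".
    # Use anaconda/VERSION here if you need a specific version.
    module load anaconda

    # Add the Conda startup block to the shell's initialization file.
    # Falls back to bash when no shell name is given.
    conda init "${1:-bash}"
}
```

After running setup_conda (or, for example, setup_conda fish), source your shell initialization file or log out and back in so that the change takes effect.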
The conda initialization may cause your shell to take a second or two longer to load.
If you no longer need to use the conda package manager, you can edit your shell initialization script (~/.bashrc in the Bash environment) and remove the lines between and including # >>> conda initialize >>> and # <<< conda initialize <<<.
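If you would rather not edit the file by hand, the same removal can be sketched with sed. This is an illustrative sketch: the function name remove_conda_init is hypothetical, and it assumes the two marker lines appear in the file exactly as shown above. It keeps a backup copy with a .bak suffix in case you want to restore the original.

```shell
# Remove the block that conda init added to a shell initialization file.
# Usage: remove_conda_init [FILE]    (defaults to ~/.bashrc)
remove_conda_init() {
    rc_file="${1:-$HOME/.bashrc}"

    # Keep a backup of the original file next to it.
    cp "$rc_file" "$rc_file.bak" || return 1

    # Delete every line from the opening marker through the closing marker,
    # inclusive, and write the result back to the original file.
    sed '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' \
        "$rc_file.bak" > "$rc_file"
}
```

Open a new shell afterwards (or log out and back in) so that the change takes effect.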
Managing Conda Environments
You can create as many Conda environments as you need. Each environment is isolated in its own directory.
For example, to create a Conda environment named my_conda_app:
Read and accept the prompts. When the script completes, you will see the following:
Run the command conda activate my_conda_app:
Work in your Conda environment:
To deactivate your Conda environment, run conda deactivate.
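The create, activate, work, and deactivate cycle above can be sketched as two small shell functions. This is a minimal sketch under stated assumptions: the function names create_env and run_in_env are hypothetical, and it assumes that conda init has already been run so that conda activate works in your shell.

```shell
# Name of the example environment used in this section.
ENV_NAME="my_conda_app"

# Create the environment; Conda asks for confirmation before proceeding.
create_env() {
    conda create --name "$ENV_NAME"
}

# Activate the environment, run the given command inside it, then
# deactivate again and return the command's exit status.
# Usage: run_in_env COMMAND [ARGS...]
run_in_env() {
    conda activate "$ENV_NAME" || return 1
    "$@"
    status=$?
    conda deactivate
    return "$status"
}
```

For example, run_in_env python --version would report the Python version seen inside the environment, assuming Python has been installed in it.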
We provide several pre-installed Anaconda environments globally on the HPC. To load the default Anaconda version, load the environment module:
If you need a specific version, you can use the module avail anaconda command, which will show the available versions on the HPC:
You can confirm that Anaconda was activated successfully by checking the path of the Python executable that your shell finds.
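One way to perform this check is sketched below. It is an illustration only: the function name python_is_from_anaconda is hypothetical, and it assumes that the Anaconda installation directory contains the text "conda" somewhere in its path (for example, an anaconda3 directory), which may not hold on every cluster.

```shell
# Succeed if the first python on PATH lives under a directory whose
# path contains "conda"; fail otherwise (or if no python is found).
python_is_from_anaconda() {
    # command -v prints the path of the executable the shell would run.
    py_path="$(command -v python)" || return 1

    case "$py_path" in
        *conda*) return 0 ;;   # assumption: Anaconda paths contain "conda"
        *)       return 1 ;;
    esac
}
```

You could then run python_is_from_anaconda && echo "Anaconda Python is active" after loading the module.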
Discovering Anaconda Packages
The list of packages included with Anaconda differs based on which module you use. The best way to determine what packages are installed in Anaconda is to run the conda list command.
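To check for one particular package rather than reading the whole list, the output of conda list can be filtered. The sketch below is illustrative: the function name anaconda_has_package is hypothetical, and it relies on conda list printing one package per line with the package name in the first column.

```shell
# Succeed if the named package appears in the output of conda list.
# Usage: anaconda_has_package PACKAGE_NAME
anaconda_has_package() {
    # Compare the first column of each line against the requested name;
    # exit status 0 means the package was found, 1 means it was not.
    conda list | awk -v pkg="$1" '$1 == pkg { found = 1 } END { exit !found }'
}
```

For example, anaconda_has_package numpy && echo "numpy is installed" prints a message only when numpy is present in the active environment.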
Installing Anaconda in your home directory
If you want to use the latest version of Anaconda, you can create a conda environment in your home directory and install Anaconda in it.
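A minimal sketch of this approach follows. It assumes that the anaconda metapackage is available from the channels configured on the cluster; the function name create_anaconda_env and the default environment name my_anaconda are hypothetical. When the central Conda installation is not writable, named environments are typically created under your home directory.

```shell
# Create a personal environment that contains the Anaconda distribution.
# Usage: create_anaconda_env [ENV_NAME]    (defaults to my_anaconda)
create_anaconda_env() {
    env_name="${1:-my_anaconda}"   # hypothetical default name

    # Installing the "anaconda" metapackage pulls in the packages that
    # make up the distribution; this can be a large download.
    conda create --name "$env_name" anaconda
}
```

Once the environment has been created, activate it with conda activate followed by the environment name.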