
Time-Scales

Built with PyTorch Lightning · Config: Hydra · Code style: black

Investigating work relating to learning across time scales, temporal abstraction, and event modeling.

Getting Started

Installing Dependencies

Clone and navigate to the project:

git clone https://github.com/APRashedAhmed/time-scales.git
cd time-scales

Install the conda environment, any extra dependencies, and the repo:

# Install the desired base environment
./run install conda-envs/ts102u2.yaml --dev --jupyter

Create softlinks to the data and weights directories on data_cifs:

cd time-scales
ln -s /media/data_cifs/projects/prj_timescales/arasheda/data .
ln -s /media/data_cifs/projects/prj_timescales/arasheda/weights .

The repo uses the bouncing ball task repo to generate bouncing ball sequences. To install it as a package:

# Make sure the environment is activated
conda activate ts102u2

# Clone the repo
git clone [email protected]:APRashedAhmed/Bouncing-Ball-Task.git

# Navigate into the directory
cd Bouncing-Ball-Task

# Install the package
pip install -e .

[OPTIONAL] Create a `.env` file for environment variables that aren't tracked by git. Start by copying the template and adding the relevant information:

$ cp .env.template .env
$ more .env

# This is a template of the file that can be used for storing private and
# user-specific environment variables, like keys or system paths. By default,
# .env is excluded from version control, and the variables declared in .env
# are loaded automatically in run.py.

NEPTUNE_API_TOKEN=""
SLACK_WEBHOOK_URL=""
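The comment notes that these variables are loaded automatically in run.py. Conceptually, that loading amounts to something like the following minimal sketch (illustrative only; the project itself likely uses a library such as python-dotenv):

```python
import os


def load_env(path=".env"):
    """Minimal .env loader: parse KEY="value" lines and export them.

    Illustrative sketch only -- skips blank lines and comments, and
    strips surrounding double quotes from values.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Ignore blanks, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')
```
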

Running Experiments

Train model with default configuration

# Default
./run experiment

# Train using GPU 3
./run experiment --cuda 3

# Train using both GPU 5 and 6
./run experiment --cuda 5,6 -a trainer.gpus=2

# Rerun the experiment 10 times
./run experiment -n 10

Train model with chosen experiment configuration from configs/experiment/

./run experiment -a experiment=experiment_name

You can override any parameter from the command line like this:

./run experiment -a trainer.max_epochs=20 datamodule.batch_size=64

To see more examples of how to use the run experiment API, run with the -e argument:

./run experiment -e

Importing from timescales

Since the project is installed as a package, all of its components are importable:

# Import the top-level module
import timescales as ts

# Import a specific model
from timescales.models import LinearDecoder

# Import a specific datamodule
from timescales.datamodules import SAYCamDataModule

Jupyter Notebooks

Some portions of the project were done in Jupyter notebooks. To include the Jupyter packages, run the install script with the -j or --jupyter flag:

./run install <path_to_yaml> --jupyter

To install the dependencies manually, run the following command to update the existing conda environment:

conda env update -f conda-envs/jupyter.yaml

See the README file in directories with jupyter notebooks (workbooks for example) for more details.

Development

Installing Requirements

Additional requirements are necessary for development. To include the development packages, run the install script with the -d or --dev flag:

./run install <path_to_yaml> --dev

To install the dependencies manually, run the following command to update the existing conda environment:

conda env update -f conda-envs/dev.yaml

Then install the pre-commit hooks for the project:

pre-commit install

Development Workflow

The minimal workflow is as follows:

  1. Write your PyTorch Lightning model (see linear_decoder.py for an example)
  2. Write your PyTorch Lightning datamodule (see saycam.py for an example)
  3. Write your experiment workflow if needed (see train_test.py for an example)
  4. Write your experiment config, containing paths to your model and datamodule (see temporal_classification.py for an example)
  5. Run the experiment with the corresponding config: ./run experiment -a experiment=<experiment_name>
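An experiment config from step 4 might look roughly like the following. This is a hypothetical sketch: the field names follow common Hydra/Lightning template conventions and may not match this repo's exact schema (consult configs/experiment/ for real examples):

```yaml
# @package _global_
# Hypothetical configs/experiment/temporal_classification config

defaults:
  - override /model: linear_decoder    # points at a config in configs/model/
  - override /datamodule: saycam       # points at a config in configs/datamodule/

trainer:
  max_epochs: 20

datamodule:
  batch_size: 64
```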

Precommit Hooks

When committing changes, a set of pre-commit hooks is run before the commit executes, broadly checking for code oversights and formatting (see .pre-commit-config.yaml for the full list of hooks used in this project):

# Commit command that passes
$ git commit -m '<commit message>'

Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Passed
isort....................................................................Passed
[<branch> <commit hash>] <commit message>

In the event that a commit does not pass all the hooks, the offending files are modified to comply with the hook requirements:

# Commit command that has two failures
$ git commit -m '<commit message>'

Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Failed
- hook id: black
- files were modified by this hook

reformatted timescales/datamodules/saycam.py
All done! ✨ 🍰 ✨
1 file reformatted, 48 files left unchanged.

isort....................................................................Failed
- hook id: isort
- files were modified by this hook

Fixing timescales/datamodules/video.py

Checking git status confirms the previously staged files have new changes:

$ git status

On branch <branch>
Your branch is up-to-date with 'origin/<branch>'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   timescales/datamodules/saycam.py
	modified:   timescales/datamodules/video.py

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   timescales/datamodules/saycam.py
	modified:   timescales/datamodules/video.py

These files need to be re-staged before running the commit again, after which the pre-commit hooks should pass:

# Stage the files
$ git add timescales/datamodules/saycam.py timescales/datamodules/video.py

# Commit with additional changes that now pass precommit hooks
$ git commit -m '<commit message>'

Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Passed
isort....................................................................Passed
[<branch> <commit hash>] <commit message>

The hooks can also be run directly without executing a commit:

# Run commit hooks on currently staged files
pre-commit run

# Run commit hooks on all valid files
pre-commit run -a
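The hooks reported above correspond to a pre-commit config along these lines (an illustrative sketch with hypothetical version pins; see the repo's .pre-commit-config.yaml for the actual configuration):

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0  # hypothetical pin
    hooks:
      - id: trailing-whitespace
      - id: debug-statements
      - id: detect-private-key
      - id: check-yaml
      - id: check-merge-conflict
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 23.3.0  # hypothetical pin
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0  # hypothetical pin
    hooks:
      - id: isort
```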

Navigating the Project

Directory Structure

The repo is structured as follows:

├── .autoenv.template          # Template to auto activate a conda environment
│                              # and enable hydra tab completion
│
├── conda-envs/                # Directory for conda environments
│   ├── cuda10.yaml                # CUDA 10 environment
│   ├── cuda11.yaml                # Identical to above but with CUDA 11
│   └── dev.yaml                   # Development tools
│
├── configs/                   # Hydra configuration files
│   ├── callbacks/                 # Callback configs
│   ├── datamodule/                # Datamodule configs
│   ├── experiment/                # Experiment configs
│   ├── logger/                    # Logger configs (neptune only for now)
│   ├── model/                     # Model configs
│   ├── trainer/                   # Trainer configs
│   │
│   └── config.yaml                # Main project configuration file
│
├── data/                      # Directory to all data used in the repo
│   ├── raw/                       # Raw, unaltered data
│   ├── interim/                   # Data that has not been fully processed
│   └── processed/                 # Fully processed and usable data
│
├── .env.template              # Template of file for storing private
│                              # environment variables
│
├── .gitignore                 # List of files/folders ignored by git
├── LICENSE                    # Brown software license
│
├── logs/                      # Directory for logs
│   ├── multiruns/                 # Multirun logs
│   └── runs/                      # Single run logs
│
├── models/ -> weights/        # Softlink to weights directory (legacy)
├── notebooks/                 # Directory for jupyter notebooks. Naming 
│                              # convention is a number (for ordering), the 
│                              # creator's initials, and a short `-` delimited 
│                              # description, e.g. `1.0-apra-initial-data-exploration.ipynb`
│
├── notes.md                   # Development notes for the repo
├── .pre-commit-config.yaml    # Config for precommit hooks
├── pytest.ini                 # Pytest config file
├── README.md                  # README for the repo
│
├── run.py                     # Run any pipeline with chosen experiment 
│                              # configuration
│
├── setup.py                   # Python setup file for installing the repo
│
├── timescales/                 # Main code (importables) for the repo
│   ├── babyvision/                # Code taken from the Lake repo
│   ├── base/                      # Base classes
│   ├── constants.py               # Constants
│   ├── datamodules/               # Pytorch Lightning datamodules
│   ├── experiments/               # Specific experiment workflows
│   ├── index.py                   # Paths to locations outside the importables
│   ├── loss.py                    # Custom loss functions
│   ├── metrics.py                 # Custom metrics
│   ├── models/                    # Pytorch Lightning models
│   ├── tests/                     # Unit tests
│   └── utils/                     # Utility scripts
│
└── weights                    # Directory of saved weights and checkpoints

Logs

Hydra creates a new working directory for every executed run.
By default, logs have the following structure:

│
├── logs
│   ├── runs                    # Folder for logs generated from single runs
│   │   ├── 2021-02-15              # Date of executing run
│   │   │   ├── 16-50-49                # Hour of executing run
│   │   │   │   ├── .hydra                  # Hydra logs
│   │   │   │   ├── wandb                   # Weights&Biases logs
│   │   │   │   ├── checkpoints             # Training checkpoints
│   │   │   │   └── ...                     # Any other thing saved during training
│   │   │   ├── ...
│   │   │   └── ...
│   │   ├── ...
│   │   └── ...
│   │
│   └── multiruns               # Folder for logs generated from multiruns (sweeps)
│       ├── 2021-02-15_16-50-49     # Date and hour of executing sweep
│       │   ├── 0                       # Job number
│       │   │   ├── .hydra                  # Hydra logs
│       │   │   ├── wandb                   # Weights&Biases logs
│       │   │   ├── checkpoints             # Training checkpoints
│       │   │   └── ...                     # Any other thing saved during training
│       │   ├── 1
│       │   ├── 2
│       │   └── ...
│       ├── ...
│       └── ...
│

You can change this structure by modifying the paths in the main project configuration (configs/config.yaml).
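For reference, run directories like these are typically controlled by entries along the following lines in the main config. This is a sketch using standard Hydra keys and interpolation syntax, not necessarily this repo's exact values:

```yaml
hydra:
  run:
    # Single runs: logs/runs/<date>/<time>
    dir: logs/runs/${now:%Y-%m-%d}/${now:%H-%M-%S}
  sweep:
    # Multiruns (sweeps): logs/multiruns/<date>_<time>/<job number>
    dir: logs/multiruns/${now:%Y-%m-%d}_${now:%H-%M-%S}
    subdir: ${hydra.job.num}
```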
