Investigating work relating to learning across time scales, temporal abstraction, and event modeling.
Clone and navigate to the project:
git clone https://github.com/APRashedAhmed/time-scales.git
cd time-scales

Install the conda environment, any extra dependencies, and the repo:
# Install the desired base environment
./run install conda-envs/ts102u2.yaml --dev --jupyter

Create softlinks to data_cifs and weights:
cd time-scales
ln -s /media/data_cifs/projects/prj_timescales/arasheda/data .
ln -s /media/data_cifs/projects/prj_timescales/arasheda/weights .
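If the data_cifs mount is unavailable, `ln -s` still succeeds but leaves a dangling link that only fails later, when training tries to read from it. A quick way to tell the two cases apart, sketched here with temporary paths rather than the real mounts:

```shell
# Create one healthy and one dangling symlink in a scratch directory
tmp=$(mktemp -d)
mkdir "$tmp/real_target"
ln -s "$tmp/real_target" "$tmp/data"     # target exists -> healthy link
ln -s "$tmp/not_mounted" "$tmp/weights"  # target missing -> dangling link
# `-L` is true for any symlink; `-e` follows the link, so it fails on a dangling one
data_ok=$([ -L "$tmp/data" ] && [ -e "$tmp/data" ] && echo yes || echo no)
weights_ok=$([ -L "$tmp/weights" ] && [ -e "$tmp/weights" ] && echo yes || echo no)
echo "data resolves: $data_ok; weights resolves: $weights_ok"
rm -rf "$tmp"
```

Running the same `-L`/`-e` checks on the real `data` and `weights` links in the repo root catches an unmounted data_cifs before a run fails on missing files.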
The repo uses the Bouncing-Ball-Task repo to generate bouncing ball sequences. To install it as a package:
# Make sure the environment is activated
conda activate ts102u2
# Clone the repo
git clone [email protected]:APRashedAhmed/Bouncing-Ball-Task.git
# Navigate into the directory
cd Bouncing-Ball-Task
# Install the package
pip install -e .

[OPTIONAL] Create a .env file with environment variables that aren't tracked by git. Start by copying the template and adding the relevant information:
$ cp .env.template .env
$ more .env
# This is a template of the file that can be used for storing private and
# user-specific environment variables, like keys or system paths. By default,
# .env is excluded from version control; the variables declared in .env are
# loaded automatically in run.py
NEPTUNE_API_TOKEN=""
SLACK_WEBHOOK_URL=""

Train the model with the default configuration:
# Default
./run experiment
# Train using GPU 3
./run experiment --cuda 3
# Train using both GPU 5 and 6
./run experiment --cuda 5,6 -a trainer.gpus=2
# Rerun the experiment 10 times
./run experiment -n 10

Train the model with a chosen experiment configuration from configs/experiment/:
./run experiment -a experiment=experiment_name

You can override any parameter from the command line like this:
./run experiment -a trainer.max_epochs=20 datamodule.batch_size=64

To see more examples of how to use the run experiment API, run with the -e argument:
./run experiment -e

Since the project was installed, all its components are importable:
# Import the top-level module
import timescales as ts
# Import a specific model
from timescales.models import LinearDecoder
# Import a specific datamodule
from timescales.datamodules import SAYCamDataModule

Some portions of the project were done using Jupyter notebooks. To include the Jupyter packages, run the install script with the -j or --jupyter flags:
./run install <path_to_yaml> --jupyter

To install the dependencies manually, run the following command to update the existing conda environment:
conda env update -f conda-envs/jupyter.yaml

See the README file in directories with Jupyter notebooks (workbooks, for example) for more details.
Additional requirements are necessary for development. To include the development
packages, run the install script with the -d or --dev flags:
./run install <path_to_yaml> --dev

To install the dependencies manually, run the following command to update the existing conda environment:
conda env update -f conda-envs/dev.yaml

Now install the pre-commit hooks for the project:
pre-commit install

The minimal workflow is as follows:
- Write your PyTorch Lightning model (see linear_decoder.py for an example)
- Write your PyTorch Lightning datamodule (see saycam.py for an example)
- Write your experiment workflow if needed (see train_test.py for an example)
- Write your experiment config, containing paths to your model and datamodule (see temporal_classification.py for an example)
- Run the experiment with the corresponding config:
./run experiment -a experiment=<experiment_name>
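As a sketch of the config step above: a Hydra experiment config typically composes the model and datamodule configs and overrides a few parameters. The file name, config group entries, and values below are hypothetical; the repo's real conventions are in configs/experiment/:

```yaml
# configs/experiment/my_experiment.yaml (hypothetical example)
# @package _global_
defaults:
  - override /model: linear_decoder   # picks configs/model/linear_decoder.yaml
  - override /datamodule: saycam      # picks configs/datamodule/saycam.yaml

trainer:
  max_epochs: 20

datamodule:
  batch_size: 64
```

A config like this would then be run with `./run experiment -a experiment=my_experiment`.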
When committing changes, a set of pre-commit hooks is run before the commit executes; these broadly check for code oversights and formatting (see .pre-commit-config.yaml for the full list of hooks used in this project):
# Commit command that passes
$ git commit -m '<commit message>'
Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Passed
isort....................................................................Passed
[<branch> <commit hash>] <commit message>

If a commit does not pass all the hooks, the failing files will be modified to comply with the hook requirements:
# Commit command that has two failures
$ git commit -m '<commit message>'
Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Failed
- hook id: black
- files were modified by this hook
reformatted timescales/datamodules/saycam.py
All done! ✨ 🍰 ✨
1 file reformatted, 48 files left unchanged.
isort....................................................................Failed
- hook id: isort
- files were modified by this hook
Fixing timescales/datamodules/video.py

Checking git status confirms the previously staged files have new changes:
$ git status
On branch <branch>
Your branch is up-to-date with 'origin/<branch>'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: timescales/datamodules/saycam.py
modified: timescales/datamodules/video.py
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: timescales/datamodules/saycam.py
modified: timescales/datamodules/video.py
These files need to be re-staged before running the commit again, after which the pre-commit hooks should pass:
# Stage the files
$ git add timescales/datamodules/saycam.py timescales/datamodules/video.py
# Commit with additional changes that now pass precommit hooks
$ git commit -m '<commit message>'
Trim Trailing Whitespace.................................................Passed
Debug Statements (Python)................................................Passed
Detect Private Key.......................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Fix End of Files.........................................................Passed
black....................................................................Passed
isort....................................................................Passed
[<branch> <commit hash>] <commit message>

The hooks can also be run directly without executing a commit:
# Run commit hooks on currently staged files
pre-commit run
# Run commit hooks on all valid files
pre-commit run -a

The repo is structured as follows:
├── .autoenv.template # Template to auto activate a conda environment
│ # and enable hydra tab completion
│
├── conda-envs/ # Directory for conda environments
│ ├── cuda10.yaml # CUDA 10 environment
│ ├── cuda11.yaml # Identical to above but with CUDA 11
│ └── dev.yaml # Development tools
│
├── configs/ # Hydra configuration files
│ ├── callbacks/ # Callback configs
│ ├── datamodule/ # Datamodule configs
│ ├── experiment/ # Experiment configs
│ ├── logger/ # Logger configs (neptune only for now)
│ ├── model/ # Model configs
│ ├── trainer/ # Trainer configs
│ │
│ └── config.yaml # Main project configuration file
│
├── data/ # Directory to all data used in the repo
│ ├── raw/ # Raw, unaltered data
│ ├── interim/ # Data that has not been fully processed
│ └── processed/ # Fully processed and usable data
│
├── .env.template # Template of file for storing private
│ # environment variables
│
├── .gitignore # List of files/folders ignored by git
├── LICENSE # Brown software license
│
├── logs/ # Directory for logs
│ ├── multiruns/ # Multirun logs
│ └── runs/ # Single run logs
│
├── models/ -> weights/ # Softlink to weights directory (legacy)
├── notebooks/ # Directory for jupyter notebooks. Naming
│ # convention is a number (for ordering), the
│ # creator's initials, and a short `-` delimited
│ # description, e.g. `1.0-apra-initial-data-exploration.ipynb`
│
├── notes.md # Development notes for the repo
├── .pre-commit-config.yaml # Config for precommit hooks
├── pytest.ini # Pytest config file
├── README.md # README for the repo
│
├── run.py # Run any pipeline with chosen experiment
│ # configuration
│
├── setup.py # Python setup file for installing the repo
│
├── timescales/ # Main code (importables) for the repo
│ ├── babyvision/ # Code taken from the Lake repo
│ ├── base/ # Base classes
│ ├── constants.py # Constants
│ ├── datamodules/ # Pytorch Lightning datamodules
│ ├── experiments/ # Specific experiment workflows
│ ├── index.py # Paths to locations outside the importables
│ ├── loss.py # Custom loss functions
│ ├── metrics.py # Custom metrics
│ ├── models/ # Pytorch Lightning models
│ ├── tests/ # Unit tests
│ └── utils/ # Utility scripts
│
└── weights # Directory of saved weights and checkpoints
Hydra creates a new working directory for every executed run.
By default, logs have the following structure:
│
├── logs
│ ├── runs # Folder for logs generated from single runs
│ │ ├── 2021-02-15 # Date of executing run
│ │ │ ├── 16-50-49 # Hour of executing run
│ │ │ │ ├── .hydra # Hydra logs
│ │ │ │ ├── wandb # Weights&Biases logs
│ │ │ │ ├── checkpoints # Training checkpoints
│ │ │ │ └── ... # Any other thing saved during training
│ │ │ ├── ...
│ │ │ └── ...
│ │ ├── ...
│ │ └── ...
│ │
│ └── multiruns # Folder for logs generated from multiruns (sweeps)
│ ├── 2021-02-15_16-50-49 # Date and hour of executing sweep
│ │ ├── 0 # Job number
│ │ │ ├── .hydra # Hydra logs
│ │ │ ├── wandb # Weights&Biases logs
│ │ │ ├── checkpoints # Training checkpoints
│ │ │ └── ... # Any other thing saved during training
│ │ ├── 1
│ │ ├── 2
│ │ └── ...
│ ├── ...
│ └── ...
│
You can change this structure by modifying the paths in the main project configuration (configs/config.yaml).
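The layout above follows from Hydra's run and sweep directory settings. A sketch of what the corresponding block in the main configuration might look like, with paths inferred from the tree shown rather than copied from the repo:

```yaml
hydra:
  run:
    # single runs: logs/runs/<date>/<time>
    dir: logs/runs/${now:%Y-%m-%d}/${now:%H-%M-%S}
  sweep:
    # multiruns: logs/multiruns/<date>_<time>/<job number>
    dir: logs/multiruns/${now:%Y-%m-%d}_${now:%H-%M-%S}
    subdir: ${hydra.job.num}
```

Changing `dir` or `subdir` here relocates every run's working directory, since Hydra resolves the `${now:...}` and `${hydra.job.num}` interpolations at launch time.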