Merged
29 changes: 29 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,29 @@
name: CI

on:
  push:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: python -m pip install -e ".[dev]"

      - name: Run tests
        run: python -m pytest

      - name: Build package
        run: python -m build
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,5 +1,6 @@
out
*.ugli.*
.DS_Store

# Byte-compiled / optimized / DLL files
__pycache__/
@@ -127,6 +128,7 @@ venv.bak/
.mypy_cache/
.dmypy.json
dmypy.json
.ruff_cache/

# Pyre type checker
.pyre/
81 changes: 34 additions & 47 deletions README.md
@@ -1,71 +1,58 @@
# pymini

Python minifier. Built to operate on entire libraries, persisting and supporting minification across files.
`pymini` minifies Python source code by simplifying syntax, shortening identifiers, and stripping unnecessary whitespace. It supports single-file input and small groups of related modules.

## Installation

pip install pymini

## Usage
## Status

pymini [options] <file>
This project is maintained as an AST-based minifier for Python 3.9+ code. It is best suited to scripts and small module graphs that use straightforward imports such as `from module import name`.

To uglify a library, use the following options to preserve
your ability to import and use the library's publicly-facing
utilities.
## Installation

pymini --keep-module-names --keep-global-variables <file>
```bash
python3 -m pip install pymini
```

## Comparison
## CLI

We run comparisons against the following:
Minify a single file, a directory, or a glob:

- pyminify - https://github.com/dflook/python-minifier
- pyminifier - https://github.com/liftoff/pyminifier
- mnfy - https://github.com/brettcannon/mnfy (does not work on Python >3.4)
```bash
pymini "src/**/*.py" -o out
```

To repeat our results, run the following to setup.
If you need module names and top-level public symbols to remain stable, keep them explicitly:

```
pip install python-minifier
pip install setuptools==57.5.0 && pip install pyminifier # hack to get pyminifer to install
pip install mnfy # if you're running python3.4
pip install pymini # ours
```bash
pymini src --keep-module-names --keep-global-variables -o out
```

Then, run the following to get mini'd versions of the sample file `sample/test.py`, which comes from `pyminifer`'s repository.
Create a single bundled output file:

```
mkdir -p out
pyminify --rename-globals --remove-literal-statements sample/test.py > out/pyminify.py
pyminifier --obfuscate sample/test.py > out/pyminifier.py
python -m mnfy sample/test.py > out/mnfy.py
uglipy sample/test.py > out/pyminiest.py
```bash
pymini src --single-file -o out/bundle.py
```

Then, run `ls -lh out`. You should see the following.
Without `--keep-module-names`, output filenames may also be shortened as part of the minification pass.

```
total 24
-rw-r--r-- 1 alvinwan staff 414B Nov 25 01:22 pyminiest.py
-rw-r--r-- 1 alvinwan staff 602B Nov 25 01:19 pyminifier.py
-rw-r--r-- 1 alvinwan staff 490B Nov 25 01:18 pyminify.py
```
## Python API

By comparison, the original file size was 1355B; `uglipy` achieves the smallest file size, 16% smaller than `pyminify` and 30% smaller than `pyminifier`, improving the best possible obfuscated file size reduction from 64% to 71%. We can also test against `test2.py`, which comes from `pyminify`'s repository.
```python
from pymini import minify

sources, modules = minify(
    [
        "def square(x):\n return x ** 2\n",
        "from main import square\nprint(square(3))\n",
    ],
    ["main", "side"],
)
```
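
The two values returned by `minify` pair up positionally; `pymini/cli.py` in this change iterates them with `zip`. A minimal sketch of writing such pairs to disk follows. The helper name `write_modules` and the stand-in lists are hypothetical: they only mimic the shape of `sources, modules` so the snippet runs without pymini installed, and they are not real minifier output.

```python
from pathlib import Path
import tempfile


def write_modules(sources, modules, out_dir):
    """Write each source string to <out_dir>/<module>.py and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source, module in zip(sources, modules):
        destination = out_dir / f"{module}.py"
        destination.write_text(source, encoding="utf-8")
        written.append(destination)
    return written


# Stand-in values (hypothetical), shaped like a `sources, modules` pair.
demo_sources = ["def f(x):return x**2\n", "from a import f\nprint(f(3))\n"]
demo_modules = ["a", "b"]

with tempfile.TemporaryDirectory() as tmp:
    paths = write_modules(demo_sources, demo_modules, tmp)
    names = [p.name for p in paths]
    contents = [p.read_text(encoding="utf-8") for p in paths]
```

In real use, the two stand-in lists would be replaced by the result of a `minify(...)` call.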
-rw-r--r-- 1 alvinwan staff 914B Nov 25 02:09 pyminiest.py
-rw-r--r-- 1 alvinwan staff 1.4K Nov 25 01:32 pyminifier.py
-rw-r--r-- 1 alvinwan staff 977B Nov 25 01:32 pyminify.py
```

By comparison, the original file size was 1990B. `uglipy`'s file size is 6% smaller than `pyminify` and 34% smaller than `pyminifier`, improving the best possible obfuscated file size reduction from 51% to 54%.

## Develop
## Development

Run tests using the following, from the root directory
Install development dependencies and run the test suite:

```bash
python3 -m pip install -e ".[dev]"
python3 -m pytest
```
py.test --doctest-modules
```
Empty file removed conftest.py
Empty file.
12 changes: 12 additions & 0 deletions pymini/__init__.py
@@ -0,0 +1,12 @@
"""Public package interface for pymini."""

from importlib.metadata import PackageNotFoundError, version

from .pymini import minify

try:
    __version__ = version("pymini")
except PackageNotFoundError:
    __version__ = "0.0.0"

__all__ = ["__version__", "minify"]
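
The version lookup above falls back to a placeholder when the distribution metadata is missing, for example when running from a source checkout that was never installed. A small sketch of the same pattern as a reusable function follows; `resolve_version` and the distribution name used in the demo are hypothetical and not part of this change.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(distribution: str, default: str = "0.0.0") -> str:
    """Return the installed version of `distribution`, or `default` if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return default


# A name that is very unlikely to be installed triggers the fallback path.
missing = "surely-not-an-installed-distribution-xyz"
fallback = resolve_version(missing)
custom_fallback = resolve_version(missing, default="dev")
```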
5 changes: 5 additions & 0 deletions pymini/__main__.py
@@ -0,0 +1,5 @@
from .cli import main


if __name__ == "__main__":
    raise SystemExit(main())
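
Raising `SystemExit` with the return value of `main` turns that integer into the process exit status. The sketch below illustrates the mechanism with a stand-in `main`; its argument handling and the `run` helper are hypothetical and only exist to make the exit code observable without terminating the interpreter.

```python
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Hypothetical stand-in entry point: 0 when given arguments, 2 otherwise."""
    args = list(argv or [])
    return 0 if args else 2


def run(argv):
    """Invoke main the way a __main__ module would and capture the exit status."""
    try:
        raise SystemExit(main(argv))
    except SystemExit as exc:
        return exc.code


ok_code = run(["input.py"])
bad_code = run([])
```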
131 changes: 115 additions & 16 deletions pymini/cli.py
@@ -1,35 +1,134 @@
import glob
from argparse import ArgumentParser
from pathlib import Path
from typing import Iterable, Optional, Sequence

from pymini import __version__
from pymini.pymini import minify
from argparse import ArgumentParser

def main():
    parser = ArgumentParser()


def build_parser() -> ArgumentParser:
    parser = ArgumentParser(prog="pymini")
    parser.add_argument('path', help='Path to the file or directory to minify')
    parser.add_argument('--keep-module-names', action='store_true', help='Keep module names as they are. Useful for compressing libraries')
    parser.add_argument('--keep-global-variables', action='store_true', help='Keep global variables as they are. Useful for compressing libraries')
    parser.add_argument('--single-file', action='store_true', help='Concatenate all outputs into a single file')
    parser.add_argument('-o', '--output', help='Path to the output directory', default='./')
    args = parser.parse_args()
    parser.add_argument('--version', action='version', version=f'%(prog)s {__version__}')
    return parser


def resolve_python_files(path: str) -> tuple[list[Path], Optional[Path]]:
    candidate = Path(path)
    if candidate.is_file():
        return ([candidate], None) if is_python_source(candidate) else ([], None)
    if candidate.is_dir():
        return (sorted(
            file_path for file_path in candidate.rglob("*.py")
            if is_python_source(file_path)
        ), candidate)
    return (sorted(
        Path(file_path) for file_path in glob.glob(path, recursive=True)
        if Path(file_path).is_file() and is_python_source(Path(file_path))
    ), None)


def is_python_source(path: Path) -> bool:
    return path.suffix == ".py" and ".ugli." not in path.name


def module_name_from_relative_path(path: Path) -> str:
    parts = list(path.with_suffix("").parts)
    if len(parts) > 1 and parts[-1] == "__init__":
        parts = parts[:-1]
    return ".".join(parts)


def load_sources(paths: Iterable[Path], *, module_root: Optional[Path]) -> tuple[list[str], list[str], dict[str, Path]]:
    sources, modules = [], []
    for path in glob.iglob(args.path):
        if not path.endswith('.py') or '.ugli.' in path:
            continue
        with open(path) as f:
            sources.append(f.read())
            modules.append(Path(path).stem)
    module_to_output_path = {}
    for path in paths:
        sources.append(path.read_text(encoding="utf-8"))
        if module_root is None:
            module = path.stem
            output_path = Path(path.name)
        else:
            output_path = path.relative_to(module_root)
            module = module_name_from_relative_path(output_path)
        modules.append(module)
        module_to_output_path[module] = output_path
    return sources, modules, module_to_output_path


def ensure_unique_modules(modules: Sequence[str]) -> None:
    duplicates = sorted({module for module in modules if modules.count(module) > 1})
    if duplicates:
        duplicate_list = ", ".join(repr(module) for module in duplicates)
        raise ValueError(
            f"input resolves to duplicate module names: {duplicate_list}. "
            "Pass a package root directory instead of a narrower glob or file list."
        )


def write_outputs(
    sources: Sequence[str],
    modules: Sequence[str],
    output: Path,
    *,
    single_file: bool,
    keep_module_names: bool,
    module_to_output_path: dict[str, Path],
) -> None:
    if single_file:
        destination = output if output.suffix == ".py" else output / f"{modules[0]}.py"
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(sources[0], encoding="utf-8")
        return

    if output.suffix == ".py":
        raise ValueError("output must be a directory unless --single-file is set")

    output.mkdir(parents=True, exist_ok=True)
    for source, module in zip(sources, modules):
        destination = (
            output / module_to_output_path[module]
            if keep_module_names and module in module_to_output_path
            else output / f"{module}.py"
        )
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(source, encoding="utf-8")


def main(argv: Optional[Sequence[str]] = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    paths, module_root = resolve_python_files(args.path)
    if not paths:
        parser.error(f"no Python files matched {args.path!r}")

    try:
        sources, modules, module_to_output_path = load_sources(paths, module_root=module_root)
        ensure_unique_modules(modules)
    except ValueError as exc:
        parser.error(str(exc))
    cleaned, modules = minify(
        sources, modules, keep_module_names=args.keep_module_names,
        keep_global_variables=args.keep_global_variables,
        output_single_file=args.single_file
    )
    output = Path(args.output)
    output.mkdir(parents=True, exist_ok=True)
    for source, module in zip(cleaned, modules):
        with open(output / f'{module}.py', 'w') as f:
            f.write(source)
    try:
        write_outputs(
            cleaned,
            modules,
            Path(args.output),
            single_file=args.single_file,
            keep_module_names=args.keep_module_names,
            module_to_output_path=module_to_output_path,
        )
    except ValueError as exc:
        parser.error(str(exc))
    return 0


if __name__ == '__main__':
    main()
    raise SystemExit(main())
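
The module-naming rule in `module_name_from_relative_path` and the duplicate check in `ensure_unique_modules` can be exercised in isolation. The sketch below mirrors the naming rule from this diff, but it swaps the `modules.count(...)` scan for `collections.Counter`, which is an alternative technique and not what the PR uses. The function names `module_name` and `duplicate_modules` and the sample paths are hypothetical.

```python
from collections import Counter
from pathlib import Path


def module_name(relative_path: Path) -> str:
    """Drop the ".py" suffix and fold a nested __init__ into its package name."""
    parts = list(relative_path.with_suffix("").parts)
    if len(parts) > 1 and parts[-1] == "__init__":
        parts = parts[:-1]
    return ".".join(parts)


def duplicate_modules(modules):
    """Return the sorted module names that occur more than once."""
    return sorted(name for name, count in Counter(modules).items() if count > 1)


# Paths relative to a package root map to dotted module names.
names = [module_name(Path(p)) for p in ("pkg/__init__.py", "pkg/core.py", "__init__.py")]

# Without a package root only the file stem is used, so these two inputs collide.
stems = [Path(p).stem for p in ("a/util.py", "b/util.py")]
dupes = duplicate_modules(stems)
```

This collision case is what the error message in `ensure_unique_modules` addresses by suggesting a package root directory instead of a narrower glob.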