
Artifact download script #226

Merged
sfc-gh-jrasley merged 3 commits into main from jrasley/prod-scripts
Jul 1, 2025

Conversation

sfc-gh-jrasley (Collaborator) commented Jul 1, 2025

Adds a script that parses a training config and downloads all the required data or model weights from HF. This is especially helpful when preparing prod training environments.

I would use `arctic_training process-data config.yaml` to download data sources, but that requires a pre-defined tokenizer and adds processing time that is not always needed, and the processed output is not always transferable across models.
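For readers who want a feel for the approach, here is a minimal sketch of the idea described above. It is not the script from this PR. The config keys (`model.name_or_path`, `data.sources`) and the helper names `collect_artifacts`, `download_all`, and `main` are assumptions made for illustration. The only external call is `snapshot_download` from `huggingface_hub`, which fetches a repo into the local HF cache without tokenizing or processing anything.

```python
from typing import Callable, Dict, List, Optional, Tuple

Artifact = Tuple[str, str]  # (repo_id, repo_type), repo_type is "model" or "dataset"


def collect_artifacts(config: Dict) -> List[Artifact]:
    """List the HF repos named in an already-parsed training config.

    The key names used here are assumed for illustration, not taken from the PR.
    """
    artifacts: List[Artifact] = []
    model_name = (config.get("model") or {}).get("name_or_path")
    if model_name:
        artifacts.append((model_name, "model"))
    for source in (config.get("data") or {}).get("sources") or []:
        # Assumption: a source is either a bare repo id or a mapping with a name field.
        name = source if isinstance(source, str) else source.get("name_or_path")
        if name:
            artifacts.append((name, "dataset"))
    return artifacts


def download_all(
    artifacts: List[Artifact],
    download: Optional[Callable[..., str]] = None,
) -> List[str]:
    """Download every artifact and return the local paths, in order."""
    if download is None:
        # Imported lazily so collect_artifacts works without huggingface_hub installed.
        from huggingface_hub import snapshot_download

        download = snapshot_download
    return [
        download(repo_id=repo_id, repo_type=repo_type)
        for repo_id, repo_type in artifacts
    ]


def main(config_path: str) -> None:
    """Parse a YAML config file and pre-fetch everything it references."""
    import yaml  # PyYAML

    with open(config_path) as f:
        config = yaml.safe_load(f)
    for path in download_all(collect_artifacts(config)):
        print(path)
```

Calling `main("config.yaml")` on a prod host would warm the HF cache ahead of training. Keeping the config walk separate from the download step makes it easy to dry-run the list of repos before pulling any weights.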

sfc-gh-jrasley enabled auto-merge (squash) July 1, 2025 04:53
sfc-gh-jrasley changed the title from "Data download script" to "Artifact download script" Jul 1, 2025
sfc-gh-jrasley merged commit 9e622f6 into main Jul 1, 2025
5 checks passed
sfc-gh-jrasley deleted the jrasley/prod-scripts branch July 1, 2025 19:40

2 participants