Typical workflow
Step 0: Set up AWS credentials (one time)
Add a named profile to ~/.aws/credentials:
[my-aws-profile]
aws_access_key_id = your_aws_access_key_id
aws_secret_access_key = your_aws_secret_access_key
Step 1: Install spotml cli (one time)
pip install spotml --upgrade
Step 2: Configure spotml.yaml
project:
  name: mnist
  syncFilters:
    - exclude:
        - .git/*
        - .idea/*
        - '*/__pycache__/*'
containers:
  - &DEFAULT_CONTAINER
    projectDir: /workspace/project
    image: tensorflow/tensorflow:latest-py3
    volumeMounts:
      - name: workspace
        mountPath: /workspace
    env:
      PYTHONPATH: /workspace/project
    ports:
      # tensorboard
      - containerPort: 6006
        hostPort: 6006
      # jupyter
      - containerPort: 8888
        hostPort: 8888
instances:
  - name: aws-1
    provider: aws
    parameters:
      region: us-east-1
      instanceType: t2.large
      spotStrategy: on-demand
      ports: [6006, 6007, 8888]
      rootVolumeSize: 125
      volumes:
        - name: workspace
          parameters:
            size: 50
scripts:
  train: |
    python train.py
  tensorboard: |
    tensorboard --port 6006 --logdir results/
  jupyter: |
    CUDA_VISIBLE_DEVICES="" jupyter notebook --allow-root --ip 0.0.0.0
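The configuration above repeats a few names and numbers in more than one place: the volume name `workspace` appears under both `volumeMounts` and `volumes`, and the container host ports also appear in the instance's `ports` list. The sketch below is a minimal consistency check for those two points. It is not part of spotml: `check_config` and `EXAMPLE` are hypothetical names, and the two rules it enforces (every mounted volume must be defined on the instance, every published host port must be opened on the instance) are assumptions about how the file is interpreted, not documented behaviour.

```python
from __future__ import annotations


def check_config(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed spotml.yaml.

    Assumed rules (not confirmed by the spotml documentation):
      1. every container volumeMount name is defined under the instance's volumes;
      2. every container hostPort is listed in the instance's ports.
    """
    problems: list[str] = []
    containers = config.get("containers", [])
    instances = config.get("instances", [])

    # Collect what the containers ask for.
    mounted = {m["name"] for c in containers for m in c.get("volumeMounts", [])}
    host_ports = {
        p["hostPort"] for c in containers for p in c.get("ports", []) if "hostPort" in p
    }

    # Compare against what each instance provides.
    for inst in instances:
        params = inst.get("parameters", {})
        volumes = {v["name"] for v in params.get("volumes", [])}
        open_ports = set(params.get("ports", []))
        for name in sorted(mounted - volumes):
            problems.append(f"{inst['name']}: volume '{name}' is mounted but not defined")
        for port in sorted(host_ports - open_ports):
            problems.append(f"{inst['name']}: host port {port} is published but not opened")
    return problems


# The relevant parts of the configuration above, written as a Python dict.
# With PyYAML installed, yaml.safe_load(open("spotml.yaml")) would produce
# a dict of the same shape from the file itself.
EXAMPLE = {
    "containers": [
        {
            "volumeMounts": [{"name": "workspace", "mountPath": "/workspace"}],
            "ports": [
                {"containerPort": 6006, "hostPort": 6006},
                {"containerPort": 8888, "hostPort": 8888},
            ],
        }
    ],
    "instances": [
        {
            "name": "aws-1",
            "parameters": {
                "ports": [6006, 6007, 8888],
                "volumes": [{"name": "workspace", "parameters": {"size": 50}}],
            },
        }
    ],
}

print(check_config(EXAMPLE))  # prints []
```

An empty list means the two assumed rules hold for the example. Renaming the volume in only one place, or dropping 8888 from the instance's `ports`, would produce one message per mismatch.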
Step 3: Create a Dockerfile (optional)
Step 4: Start an instance

Step 5: SSH into the instance

Step 6: Schedule a managed run (optional)
Step 7: Check run status (optional)


