Training Service
Overview
The Training Service on Highrise Cloud provides a web interface for managing machine learning model training tasks. With this service, you can deploy, monitor, and manage your training tasks using the computational resources available on the platform.
Accessing the Training Dashboard
Step 1: Log in to the Highrise Cloud Console
Log in to the Highrise Cloud Console and navigate to the Training section.
Step 2: View Training Tasks
Upon entering the Training section, you will see a list of all your current training tasks. Each task is displayed with its name, GPU configuration, GPU memory, model, resource usage, and available operations.
Creating a New Training Task
Step 1: Click the "New" Button
In the task list, click the "New" ("+") button to create a new training task.
Step 2: Configure Task Parameters
Fill in the necessary details for your training task using the following parameters:
| Parameter | Description |
|---|---|
| Name | A unique identifier for your training task. |
| GPUs | Select the number and type of GPUs required for your task (e.g., 2 * Tesla P40). |
| GPU Memory | The amount of GPU memory allocated for the task (e.g., 24GiB). |
| Model | Choose the model to be used for training (e.g., meta-llama-3-8b-instruct). |
Configuring Your Task
Ensure that you select the appropriate GPUs and memory to match your training requirements.
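Before filling in the form, it can help to write the intended settings down and sanity-check them. The sketch below is only a local planning aid: `TrainingTaskConfig` and its field names are hypothetical and simply mirror the four form parameters above. It does not call any Highrise Cloud API, and the checks are illustrative assumptions rather than the platform's actual validation rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingTaskConfig:
    """Hypothetical local record of the form fields (not a platform API)."""

    name: str            # "Name" field
    gpu_count: int       # number part of the "GPUs" field
    gpu_type: str        # type part of the "GPUs" field
    gpu_memory_gib: int  # "GPU Memory" field, in GiB
    model: str           # "Model" field

    def validate(self) -> List[str]:
        """Return a list of problems found; an empty list means none."""
        problems = []
        if not self.name.strip():
            problems.append("Name must not be empty.")
        if self.gpu_count < 1:
            problems.append("At least one GPU is required.")
        if not self.gpu_type.strip():
            problems.append("GPU type must not be empty.")
        if self.gpu_memory_gib < 1:
            problems.append("GPU memory must be at least 1 GiB.")
        if not self.model.strip():
            problems.append("Model must not be empty.")
        return problems

    def gpu_label(self) -> str:
        """Format the GPU selection the way the table's example writes it."""
        return f"{self.gpu_count} * {self.gpu_type}"


# Example values taken from the parameter table; the task name is made up.
config = TrainingTaskConfig(
    name="llama3-demo",
    gpu_count=2,
    gpu_type="Tesla P40",
    gpu_memory_gib=24,
    model="meta-llama-3-8b-instruct",
)
print(config.gpu_label())  # 2 * Tesla P40
print(config.validate())   # []
```

Keeping such a record next to your experiment notes makes it easier to recreate a task with the same settings later.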
Step 3: Review and Deploy
Review your configurations and click "Confirm" to deploy the training task.
Managing Training Tasks
Viewing Task Details
Click on a task name to view detailed information about its status, resource usage, and performance metrics.
Deleting a Task
To delete a training task, click the "Delete" button next to the task in the list.
Permanent Deletion
Deletion is permanent and cannot be undone. Ensure that you no longer need the task before deleting.
TensorBoard Integration
For tasks that support TensorBoard, click the "TensorBoard" link to visualize training metrics and logs.
Monitoring Resource Usage
Check the "Resource" column to see the GPU hours consumed by each task. Tracking this figure helps you manage costs and resource allocation.
Cost Management
Monitor your resource usage to optimize costs and resource utilization.
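As a rough illustration of how GPU hours add up, the sketch below assumes the common convention that GPU hours equal the number of GPUs multiplied by the wall-clock hours a task has run. The source does not state how Highrise Cloud computes the "Resource" column, and the hourly rate and task names used here are made-up placeholders, so treat the numbers as an example only.

```python
def gpu_hours(gpu_count: int, wall_clock_hours: float) -> float:
    """GPU hours under the assumed convention: GPUs x hours of runtime."""
    if gpu_count < 0 or wall_clock_hours < 0:
        raise ValueError("gpu_count and wall_clock_hours must be non-negative")
    return gpu_count * wall_clock_hours


def estimated_cost(total_gpu_hours: float, rate_per_gpu_hour: float) -> float:
    """Estimated cost for a given (hypothetical) price per GPU hour."""
    return total_gpu_hours * rate_per_gpu_hour


# Hypothetical tasks: name -> (number of GPUs, hours of runtime)
tasks = {
    "task-a": (2, 3.5),
    "task-b": (1, 10.0),
}

total = sum(gpu_hours(count, hours) for count, hours in tasks.values())
print(total)                       # 17.0
print(estimated_cost(total, 1.5))  # 25.5, using a placeholder rate of 1.5
```

Comparing an estimate like this with the value shown in the "Resource" column is one way to spot tasks that have been running longer than intended.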
Next Steps
After deploying your training tasks, you can use the Training Dashboard to monitor their progress and performance. Adjust your training configurations as needed to optimize results. For further assistance or to explore advanced features, refer to the related sections or visit our support page on the Highrise Cloud platform.
For more information on fine-tuning your models, proceed to the Finetune Service.