This guide will show you how to customize an LLM for your application through fine-tuning. We’ll cover all of the modeling options, data inputs, and what to expect after training is complete.

You can use the Data Composer tool to generate a dataset if you do not have one. Otherwise, you can format your own according to the specifications below.

Training Your Model

Under the “Train” section on the home screen, you’ll see multiple model types available for fine-tuning. We currently support training:

  • Large Language Models (text)
  • Text Embeddings
  • Image Classification
  • Image Detection
  • Image Segmentation

Click the model type that best fits your task. The language models require a properly formatted dataset; the expected format is described in the training form, and you can create one with the Data Composer if you don’t have your own. The image models synthesize a dataset as part of training from the labels you provide.
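The training form spells out the exact dataset specification, but as a rough illustration, language-model fine-tuning data is commonly stored as JSONL, with one prompt/completion pair per line. The snippet below is a minimal sketch of that layout; the field names (prompt, completion) and the file name train.jsonl are illustrative assumptions, not the platform’s required schema.

    import json

    # Hypothetical example records. The field names required by the training
    # form may differ; check the dataset specification shown there.
    examples = [
        {"prompt": "Summarize: The quick brown fox jumps over the lazy dog.",
         "completion": "A fox jumps over a dog."},
        {"prompt": "Translate to French: Good morning.",
         "completion": "Bonjour."},
    ]

    # Write one JSON object per line (JSONL), a common fine-tuning format.
    with open("train.jsonl", "w", encoding="utf-8") as f:
        for record in examples:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

    # Quick sanity check: every line should parse and contain both fields.
    with open("train.jsonl", encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            record = json.loads(line)
            assert "prompt" in record and "completion" in record, f"Line {i} is missing a field"

A quick check like this can catch malformed lines before you upload the file to the training form.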

The dial lets you choose a model architecture that trades off speed against accuracy. Select the setting that best fits your application.

Once you’ve submitted a training job, you can track its progress in the “Models” tab, found in the left-hand hamburger menu. When training is complete, click your model to open its dashboard. The dashboard shows metadata about the selected model architecture, the task type, and model performance metrics. On the left panel of the dashboard, you’ll find further visualizations of training metrics, including:

  • loss/accuracy curves
  • text evaluations
  • plan reasoning

depending on the modeling task.

The right panel includes options for:

  • Web browser-hosted model testing
  • Inference code examples (see the sketch after this list)
  • Model conversions
  • Deployment options

depending on the model type.
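The inference code examples in the dashboard are generated for your specific model and endpoint, so prefer those. As a rough sketch of what such a call typically looks like, the snippet below sends a JSON prompt to a hosted model over HTTP; the endpoint URL, API key, and payload fields are placeholders, not the platform’s actual API.

    import json
    import urllib.request

    # Placeholder values. Copy the real endpoint URL and credentials from the
    # inference code examples in your model's dashboard.
    ENDPOINT = "https://example.com/v1/models/my-model/generate"
    API_KEY = "YOUR_API_KEY"

    payload = {"prompt": "Write a one-sentence product description for a smart mug."}

    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

    # Send the request and print the model's JSON response.
    with urllib.request.urlopen(request) as response:
        print(json.loads(response.read().decode("utf-8")))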

What’s next?

Great, you’ve specialized your model for your use case using a custom dataset. Explore the other tools to dig deeper into measuring model performance and to review your deployment options: