Data Composer
With as little context as a seed phrase, you can design datasets for fine-tuning to a specific use case. The tool can also augment an existing dataset if you provide a file or a Hugging Face dataset. Click the [Data Composer] tool and give your data generation job a name. Then provide an example prompt or a dataset to generate samples from.
Score
You may also want to score the quality of a dataset. The Remyx Score tool, powered by Prometheus 2, helps you judge data against a rubric that emphasizes the qualities and tasks you want to maximize. We’ll show you how to design a rubric and format your data for scoring. The Score tool is currently available for text data; more data modalities are coming soon!
Design A Rubric
To design a rubric, first describe the criteria that ground the scoring. For example, if you want to judge responses on how respectful they are, state that quality directly in your criteria description.
Format Your Data
Your dataset must include three string columns named prompt, response, and reference_answer. These columns represent input values, output values, and ideal output values, respectively. If your column names differ, the columns are assumed to appear in this order. You can point to a publicly available Hugging Face dataset or upload a CSV file.
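As a minimal sketch of the format described above, the following Python snippet writes a CSV with the three required columns and runs a basic check before upload. The sample rows and the helper names write_scoring_csv and check_scoring_csv are hypothetical and exist only for illustration; they are not part of any Remyx tooling.

```python
import csv
import io

# Column names required by the Score tool, in the expected order.
REQUIRED_COLUMNS = ["prompt", "response", "reference_answer"]

# Hypothetical sample rows, for illustration only.
rows = [
    {
        "prompt": "How do I reset my password?",
        "response": "Click 'Forgot password' on the sign-in page.",
        "reference_answer": "Use the 'Forgot password' link; a reset email is sent.",
    },
    {
        "prompt": "Can I change my username?",
        "response": "Yes, you can change it under account settings.",
        "reference_answer": "Usernames can be changed in account settings.",
    },
]


def write_scoring_csv(rows, fileobj):
    """Write rows to fileobj with the three required columns as the header."""
    writer = csv.DictWriter(fileobj, fieldnames=REQUIRED_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


def check_scoring_csv(fileobj):
    """Return a list of problems found; an empty list means the file looks fine."""
    reader = csv.DictReader(fileobj)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    for index, row in enumerate(reader, start=1):
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"data row {index}: empty {col}")
    return problems


# Write to an in-memory buffer here; with a real file, open it with newline="".
buffer = io.StringIO()
write_scoring_csv(rows, buffer)
buffer.seek(0)
print(check_scoring_csv(buffer))
```

Naming the columns explicitly avoids relying on the positional fallback, which matters if your file carries extra columns.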
Once you’ve provided all the inputs and submitted your scoring job, you’ll be redirected to the Scores tab, where you can follow the job’s progress. When the job is complete, click its name to see the results. On the right-hand side, you’ll see the full rubric created from your criteria and positive/negative descriptions. In the center, you’ll see a preview of the generated scores, including two new columns added to your dataset: feedback and score.