Train a Foundry Tools custom translation model

A custom translation model provides translations for a specific language pair. The outcome of a successful training is a model. To train a custom translation model, three mutually exclusive document types are required: training, tuning, and testing. If you provide only training data when you queue a training, custom translation automatically assembles the tuning and testing data: it takes a random subset of sentences from your training documents and excludes those sentences from the training data itself. A minimum of 10,000 parallel training sentences is required to train a full model.
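To make the automatic assembly of tuning and testing data concrete, here is a minimal sketch of holding out random, non-overlapping subsets from a parallel corpus. It is an illustration only, not the service's implementation: the function name `split_parallel_corpus`, the fixed seed, and the held-out sizes are assumptions, and the service's actual subset sizes aren't stated here.

```python
import random


def split_parallel_corpus(pairs, tuning_size, testing_size, seed=0):
    """Hold out random, non-overlapping tuning and testing subsets.

    `pairs` is a list of (source_sentence, target_sentence) tuples.
    The held-out sentences are removed from the training set, so the
    three returned lists are mutually exclusive.
    """
    if tuning_size + testing_size >= len(pairs):
        raise ValueError("held-out sets must be smaller than the corpus")

    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    indices = list(range(len(pairs)))
    rng.shuffle(indices)

    tuning_idx = set(indices[:tuning_size])
    testing_idx = set(indices[tuning_size:tuning_size + testing_size])

    tuning = [pairs[i] for i in sorted(tuning_idx)]
    testing = [pairs[i] for i in sorted(testing_idx)]
    training = [
        pair for i, pair in enumerate(pairs)
        if i not in tuning_idx and i not in testing_idx
    ]
    return training, tuning, testing


# Hypothetical corpus and sizes, chosen only for illustration.
corpus = [(f"source {i}", f"target {i}") for i in range(10_000)]
training, tuning, testing = split_parallel_corpus(corpus, 500, 500)
```

Because held-out sentences are removed from the training list, a corpus that sits exactly at the 10,000-sentence minimum ends up with fewer than 10,000 sentences in the training set itself.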

Create your custom translation model

Here are the steps to create a custom translation model:

  1. Follow Upload data, then continue here.

  2. After the data is processed, select Train model from the left menu.

    Screenshot depicting the train model blade.

  3. Type the Model name.

  4. Select Training type.

    Note

    Full training displays all uploaded document types. Dictionary-only displays dictionary documents only.

  5. Select Next.

    Screenshot illustrating train model blade.

  6. Select the data you want to use for training and review the training cost associated with the selected number of sentences.

    Screenshot depicting a view of the train model blade.

  7. Select Next.

  8. Review and select Train model.

    Screenshot illustrating the train model blade.

When to select dictionary-only training

For better results, we recommend letting the system learn from your training data. However, when you don't have enough parallel sentences to meet the 10,000-sentence minimum, and your sentences and compound nouns must be rendered as-is, use dictionary-only training. Dictionary-only training typically completes faster than full training. The resulting model uses the baseline model for translation along with the dictionaries you added. BLEU scores and a test report aren't available for dictionary-only models.
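The guidance above can be summarized as a small decision helper. This is a sketch under stated assumptions: the function name and the returned labels are hypothetical and are not values used by the service; only the 10,000-sentence threshold comes from this article.

```python
MIN_FULL_TRAINING_SENTENCES = 10_000  # minimum stated in this article


def suggest_training_type(parallel_sentence_count, has_dictionary):
    """Suggest a training type from the amount of data available.

    Returns one of three hypothetical labels: "full",
    "dictionary-only", or "insufficient data".
    """
    if parallel_sentence_count >= MIN_FULL_TRAINING_SENTENCES:
        # Enough parallel data: let the system learn from it.
        return "full"
    if has_dictionary:
        # Too few sentences, but terms can still be rendered as-is.
        return "dictionary-only"
    return "insufficient data"


print(suggest_training_type(2_500, has_dictionary=True))
```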

Note

Custom translation doesn't sentence-align dictionary files. Therefore, it's important that there are an equal number of source and target phrases/sentences in your dictionary documents and that they're precisely aligned. If not, the document upload fails.
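Because dictionary files aren't sentence-aligned for you, it can help to check alignment locally before uploading. The sketch below assumes a hypothetical layout of one phrase per line in two parallel lists (for example, read from a source file and a target file); the dictionary file formats the service accepts aren't described here, so adapt the reading step to your format.

```python
def find_alignment_problems(source_lines, target_lines):
    """Return human-readable alignment problems for a dictionary pair.

    Entries are matched by position, so line N of the source must
    correspond to line N of the target.
    """
    problems = []

    if len(source_lines) != len(target_lines):
        problems.append(
            f"line count mismatch: {len(source_lines)} source "
            f"vs {len(target_lines)} target"
        )

    # Compare only the lines both sides have; a count mismatch is
    # already reported above.
    for number, (src, tgt) in enumerate(
        zip(source_lines, target_lines), start=1
    ):
        if bool(src.strip()) != bool(tgt.strip()):
            problems.append(f"line {number}: one side is empty")

    return problems


# Hypothetical dictionary entries for illustration.
source = ["solid-state drive", "power supply unit"]
target = ["Solid-State-Laufwerk", "Netzteil"]
print(find_alignment_problems(source, target))
```

An empty result means the two sides have the same number of entries and no entry is blank on only one side. It doesn't verify that each pair is a correct translation.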

Model details

  1. After successful model training, select Train model from the left menu, then select the model name.

  2. Select the Model Name to review the training date and time, the total training time, the number of sentences used for training, tuning, testing, and dictionary, and whether the system generated the test and tuning sets. Use the Category ID to make translation requests.

  3. Evaluate the model BLEU score. Review the test set: BLEU score is the score of the custom translation model, and Baseline BLEU is the score of the pretrained baseline model used for customization. A higher BLEU score means higher translation quality using the custom translation model.

    Screenshot illustrating the model details.
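As an illustration of how the Category ID is used in a translation request, the sketch below builds (but doesn't send) a request. It assumes the Translator Text API v3 `translate` endpoint, its `category` query parameter, and key and region headers; none of these appear in this article, so confirm them against the current API reference. The key, region, and Category ID values are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumed global endpoint of the Translator Text API v3.
TRANSLATE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, source_lang, target_lang,
                            category_id, key, region):
    """Build a POST request that routes translation to a custom model."""
    query = urllib.parse.urlencode({
        "api-version": "3.0",
        "from": source_lang,
        "to": target_lang,
        "category": category_id,  # Category ID from Model details
    })
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json; charset=UTF-8",
    }
    return urllib.request.Request(
        f"{TRANSLATE_ENDPOINT}?{query}",
        data=body,
        headers=headers,
        method="POST",
    )


# Placeholder values; replace them with your own before sending.
request = build_translate_request(
    "Hello", "en", "de", "my-category-id", "my-key", "my-region"
)
print(request.full_url)
# To send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The language pair in the request should match the pair the model was trained for.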

Duplicate model

  1. Select Train model from the left menu.

  2. Hover over the model name and check the selection button.

  3. Select Duplicate model.

  4. Fill in New model name.

  5. Keep Train immediately checked if you don't need to add more data and are ready to train. Otherwise, check Save as draft.

  6. Select Duplicate.

    Note

    If you save the model as Draft, Model details is updated with the model name in Draft status.

    To add more documents, select the model name, select Manage data from the left menu, and then follow Upload data.

    Screenshot illustrating the duplicate model blade.

Next steps