Words with Prefix Sub

frankxwang/dpo-prefix-sharing

Each DPO training example consists of a shared prompt, a "chosen" response, and a "rejected" response. Instead of computing the shared prompt twice, we combine the prompt and pair of responses into a ...
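
The snippet above is cut off before it says what the prompt and responses are combined into, so the following is only a sketch under stated assumptions, not the repository's implementation. It assumes the three parts are packed into one sequence and that a custom attention mask keeps the rejected response from attending to the chosen one. The function name `build_shared_prefix_example`, the mask layout, and the position-id scheme are all hypothetical.

```python
def build_shared_prefix_example(prompt, chosen, rejected):
    """Pack prompt + chosen + rejected token ids into one sequence.

    Assumption (not confirmed by the source): the prompt is encoded once,
    and each response attends only to the prompt and to itself.
    """
    p, c, r = len(prompt), len(chosen), len(rejected)
    tokens = list(prompt) + list(chosen) + list(rejected)
    n = p + c + r

    # Segment ids: 0 = prompt, 1 = chosen, 2 = rejected.
    segment = [0] * p + [1] * c + [2] * r

    # Position ids: the rejected response restarts right after the prompt,
    # as if it directly followed the prompt in its own sequence.
    positions = list(range(p + c)) + list(range(p, p + r))

    # mask[i][j] is True when token i may attend to token j.
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):  # causal: only current and earlier tokens
            # Allowed if j is in the prompt or in the same segment as i.
            mask[i][j] = segment[j] == 0 or segment[j] == segment[i]
    return tokens, positions, mask


# Hypothetical token ids, for illustration only.
tokens, positions, mask = build_shared_prefix_example([1, 2], [3, 4], [5])
```

With this layout the prompt tokens appear once in the packed sequence, while the last row of `mask` shows the rejected token attending to the prompt and itself but not to the chosen response.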
