This process allows the creation of two distinct datasets within Open-Critic-GPT:

Code-Preference-Pairs Dataset: (DPO) Contains pairs of otherwise identical code examples. The only difference is that the rejected example has the bugged code 'surgically transplanted in', while the accepted example is left unchanged.
Open-Critic-GPT Dataset: (SFT) Trains the model to find bugs and produce working code from broken code.
Both datasets span a total of 127 different languages/structures. Some may have been lost in conversion: the process started with 122K examples and ended with 55K because of a lack of structured output. A fine-tuned model may produce better-structured outputs.
Both datasets contain ~55K examples each, and both are built from the same set of parent examples.
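
To illustrate how one parent example can feed both datasets, here is a minimal sketch. The class `ParentExample`, the helper functions, and every field name (`instruction`, `working_code`, `bugged_code`, `critique`, `prompt`, `chosen`, `rejected`, `response`) are assumptions made for illustration, not the actual schema of either dataset.

```python
from dataclasses import dataclass


@dataclass
class ParentExample:
    # Hypothetical field names; the real datasets may use a different schema.
    instruction: str
    working_code: str
    bugged_code: str
    critique: str


def to_preference_pair(p: ParentExample) -> dict:
    """Code-Preference-Pairs style record: same prompt, working code is
    accepted, the bugged variant is rejected."""
    return {
        "prompt": p.instruction,
        "chosen": p.working_code,
        "rejected": p.bugged_code,
    }


def to_critic_example(p: ParentExample) -> dict:
    """Open-Critic-GPT style record: the model is shown broken code and is
    expected to explain the bug and return working code."""
    return {
        "prompt": f"Find the bugs in this code and fix them:\n{p.bugged_code}",
        "response": f"{p.critique}\n\nCorrected code:\n{p.working_code}",
    }


# A toy parent example (invented for this sketch, not taken from the data).
parent = ParentExample(
    instruction="Write a function that returns the sum of a list.",
    working_code="def total(xs):\n    return sum(xs)",
    bugged_code="def total(xs):\n    return sum(xs[1:])",
    critique="The slice xs[1:] skips the first element of the list.",
)

pair = to_preference_pair(parent)
critic = to_critic_example(parent)
```

Because both records are derived from the same parent, the two datasets end up with matching example counts, which is consistent with the ~55K figure quoted for each.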