Dataset Card for Alpaca-Cleaned
Repository: https://github.com/gururise/AlpacaDataCleaned
Dataset Description
This is a cleaned version of the original Alpaca Dataset released by Stanford. The following issues have been identified in the original release and fixed in this dataset:

Hallucinations: Many instructions in the original dataset referenced data on the internet, which caused GPT-3 to hallucinate an answer.
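
As a rough illustration of the hallucination issue, the sketch below flags records whose prompt contains a URL. It is not the procedure used to clean this dataset. It assumes the Alpaca record layout with `instruction`, `input`, and `output` keys. The names `URL_PATTERN`, `references_internet`, and `split_records` are hypothetical, and the sample records are made up for the example.

```python
import re

# Hypothetical heuristic (an assumption, not the dataset authors' method):
# a URL in the prompt suggests the task depends on web content that the
# model could not actually read when generating its answer.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def references_internet(record):
    """Return True if the instruction or input field contains a URL."""
    text = f"{record.get('instruction', '')} {record.get('input', '')}"
    return bool(URL_PATTERN.search(text))


def split_records(records):
    """Split records into (kept, flagged) lists using the URL heuristic."""
    kept, flagged = [], []
    for record in records:
        if references_internet(record):
            flagged.append(record)
        else:
            kept.append(record)
    return kept, flagged


# Made-up sample records in the assumed Alpaca layout.
sample = [
    {
        "instruction": "Summarize the article at https://example.com/news.",
        "input": "",
        "output": "...",
    },
    {
        "instruction": "Give three tips for staying healthy.",
        "input": "",
        "output": "...",
    },
]

kept, flagged = split_records(sample)
```

A URL match only marks a record as a candidate for review. Some instructions mention a URL without needing its content, and others refer to online material without any URL, so flagged records would still need manual inspection.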