This dataset is derived from the publication ‘Albanian fake news detection’ by Ercan Canhasi, Rexhep Shijaku, and Erblin Berisha. Their GitHub repository can be found at this URL, and their soon-to-be-published work can be found at this URL.

This dataset consists of a small corpus of news items that the study’s authors manually labeled as fake or not fake. The figure below depicts the entire corpus formation process:

Figure: Process of building the Alb-Fake-News-Corpus

I was delighted to discover the authors’ work and wanted to publish it in a DataFrame-friendly format so that others can build on it.
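
As a rough illustration of what "DataFrame-friendly" could look like in practice, here is a minimal sketch using pandas. The column names `text` and `label`, the label values `fake` and `real`, and the placeholder rows are all assumptions made for this example; they are not taken from the actual corpus, so check the real file's schema before relying on any of them.

```python
import io

import pandas as pd

# Hypothetical stand-in for the dataset file. The real file name, column
# names, and label values may differ; these rows are placeholders only.
PLACEHOLDER_CSV = io.StringIO(
    "text,label\n"
    "placeholder article one,fake\n"
    "placeholder article two,real\n"
    "placeholder article three,fake\n"
)


def load_corpus(source):
    """Read the corpus into a DataFrame and add a boolean `is_fake` column."""
    df = pd.read_csv(source)
    # Assumes the manual assessment is stored as the string "fake" or "real".
    df["is_fake"] = df["label"].eq("fake")
    return df


corpus = load_corpus(PLACEHOLDER_CSV)

# Count how many items carry each label.
label_counts = corpus["label"].value_counts()
print(label_counts)
```

To try this on the real data, pass the path of the downloaded file to `load_corpus` instead of the placeholder buffer and adjust the column names to match.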