TransWikia.com

Public dataset for news articles with their associated categories for multilabel data classification

Data Science Asked by Snehil R Singh on January 20, 2021

I am wondering whether there are any public datasets of news articles, from the New York Times (NYT) or a similar source, labeled with news categories such as politics, entertainment, lifestyle, general news, and sports.

I want to use such a dataset for multilabel classification of sentences or paragraphs, i.e. a sentence could belong to politics, entertainment, sports, or all of them, so I need a dataset in which each item can carry more than one label. I plan to train a classifier on such a dataset and use it for predictions. However, I couldn't find any. Are any such datasets known to be available?
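To make the multilabel requirement concrete, here is a minimal sketch of the target format such a dataset would need: one 0/1 entry per category, with several entries allowed to be 1 at once. The category list, the example sentences, and the helper name `to_multi_hot` are all hypothetical and only serve as illustration.

```python
# Hypothetical category list, for illustration only.
CATEGORIES = ["politics", "entertainment", "lifestyle", "sports"]


def to_multi_hot(labels, categories=CATEGORIES):
    """Return a 0/1 vector with a 1 for every category present in `labels`."""
    unknown = set(labels) - set(categories)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if category in labels else 0 for category in categories]


# Made-up examples: the first sentence carries two labels, the second only one.
examples = [
    ("The senator attended the film premiere.", {"politics", "entertainment"}),
    ("The home team won the final.", {"sports"}),
]

targets = [to_multi_hot(labels) for _, labels in examples]
print(targets)
```

A multilabel classifier would then predict one independent yes/no decision per position of this vector, rather than a single class.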

I want a dataset something like this, but for news categories:
[image]

One Answer

I found one on Kaggle by searching "news categories". I believe this dataset should work for you. It's a JSON file which contains a link to the article, an associated category (e.g. "crime"), headline, authors, date, and a short description.
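As a rough sketch of how a file like that could be read into (text, category) pairs: the snippet below assumes one JSON object per line and assumes the field names `category`, `headline`, and `short_description`; check both against the actual download. The two records are made up so that the example runs without the file.

```python
import json
from collections import Counter

# Two made-up records in the shape described above.
# One JSON object per line and these exact field names are assumptions.
SAMPLE = """\
{"category": "CRIME", "headline": "Example headline one", "authors": "A. Writer", "link": "https://example.com/1", "short_description": "Example description one.", "date": "2021-01-01"}
{"category": "SPORTS", "headline": "Example headline two", "authors": "B. Writer", "link": "https://example.com/2", "short_description": "Example description two.", "date": "2021-01-02"}
"""


def load_records(lines):
    """Parse JSON-lines input into a list of (text, category) pairs."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        # Use headline plus short description as the text to classify.
        text = f"{row['headline']} {row['short_description']}".strip()
        records.append((text, row["category"]))
    return records


records = load_records(SAMPLE.splitlines())
print(Counter(category for _, category in records))
```

For the real file, the same function can be fed an open file handle, e.g. `with open(path, encoding="utf-8") as f: load_records(f)`, where `path` is wherever you saved the download. Note that each record carries a single category, so on its own this gives multiclass rather than multilabel targets.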

You may also want to check out the Open Data Stack Exchange:

https://opendata.stackexchange.com/

Enjoy!

Kaggle dataset: https://www.kaggle.com/rmisra/news-category-dataset

Answered by adamcatto on January 20, 2021
