Text classification

Question

I am using the SVM algorithm for text classification. Where can I find a Twitter dataset, and how can I use it in Weka or any other tool?

tensor · Answer

These are a few sites I found for that. I am not an R programmer, so I am not familiar with the Weka tools or how to use them, but I hope this helps. You can find them here and here.

Isaac Freitas · Answer

Twitter has rules that limit the sharing of complete datasets; in general, only the tweet IDs may be shared (see this discussion and the developer agreement). Tools such as twarc can be used to "rehydrate" the tweets from those IDs by calling the Twitter API and retrieving the full data. The Twitter API is rate limited, which can make this a somewhat slow process.

For a package that works in R, see RTextTools, or check out RWeka, which bridges the gap between R and Java so that you can use Weka from R. If you are using Python, you can also use scikit-learn's SVM implementation.
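As a rough illustration of the Python route, here is a minimal sketch of an SVM text classifier in scikit-learn. It assumes you have already loaded your tweets into a list of strings with matching labels; the six example texts and the "pos"/"neg" labels below are made up purely for demonstration, and TF-IDF features with a linear SVM are just one reasonable starting point, not the only way to set this up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for a real, labeled tweet dataset.
train_texts = [
    "I love this phone, it is great",
    "what a wonderful happy day",
    "great service, love it",
    "I hate this, it is terrible",
    "awful service, very bad experience",
    "terrible day, I hate waiting",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Turn raw text into TF-IDF vectors, then fit a linear SVM on them.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

# Classify new, unseen text.
new_texts = ["love great", "hate terrible"]
predictions = model.predict(new_texts)
for text, label in zip(new_texts, predictions):
    print(f"{text!r} -> {label}")
```

With a real dataset you would hold out part of the data (for example with sklearn.model_selection.train_test_split) and check accuracy on it before trusting the predictions.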

H Lim · Answer

This collection of Twitter datasets might help you find the dataset you're looking for. It contains mainly sentiment analysis datasets, but also moderation and classification datasets.
