TransWikia.com

Binary classification: how to transform features into real numbers?

Data Science Asked on December 21, 2020

I want to train a binary classification algorithm for spam detection using a labeled dataset. The dataset has the following features:

Email address, text message (split into subject and corpus), date

An example of data is:

Email | Subject | Corpus | Date 
[email protected] | Example | this is just an example of my dataset  | 2020/08/20

What I would like is to transform the features into real numbers and binarize the email addresses.
As algorithms, I was thinking of SVM and/or Naïve Bayes.

My difficulty, however, is how to transform the features into real numbers so that my classifier has more features to work with.

I am using Python.

Could you please give me an example of how to do it?

One Answer

The term you are looking for is text classification. There are many tutorials and papers on it, for example this tutorial and this survey.

Answered by N. Kiefer on December 21, 2020
