3.4 Naïve Bayes Model in Machine Learning

Bayes Theorem

Bayes Theorem was introduced by the English statistician Thomas Bayes in the 18th century.

It is a mathematical method used to calculate the probability of an event when some other related event has already happened.

It helps models make predictions using probability.

Bayes Theorem is also called Bayes Rule or Bayes Law.

Bayes Theorem is used in many machine learning tasks, for example:

  • Email classification (Spam or Not Spam)

  • Medical diagnosis

  • Text classification

  • Fraud detection

Bayes Theorem Formula

P(A|B) = [ P(B|A) × P(A) ] / P(B)

P(A) → Prior Probability : Probability of event A before any new information.

P(B) → Marginal Probability : Probability of event B.

P(A|B) → Posterior Probability : Probability of A after B has occurred.

P(B|A) → Likelihood : Probability of B happening when A has already occurred.
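The four quantities above can be seen in action with a short Python sketch. The disease/test numbers below are hypothetical illustration values, not from the text:

```python
# Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical medical-diagnosis example:
# A = "patient has the disease", B = "test is positive".

p_a = 0.01              # P(A): prior, 1% of people have the disease
p_b_given_a = 0.95      # P(B|A): likelihood, test detects 95% of sick patients
p_b_given_not_a = 0.05  # test is wrongly positive for 5% of healthy patients

# P(B): marginal probability via the law of total probability
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# P(A|B): posterior probability
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # → 0.161
```

Even with an accurate test, the posterior stays low here because the prior is so small; this is exactly the kind of update Bayes Theorem captures.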


1. Conditional Probability

  • When the probability of one event depends on another event.

            Example: Probability of rain given that clouds are present.

2. Prior Probability

  • Probability of an event based on previous knowledge or past data.

            Example: Past data shows 30% of emails are spam.

3. Posterior Probability

  • Updated probability after considering new information.

            Example: After seeing spam words in an email, the probability that it is spam increases.

4. Joint Probability

  • The probability of two or more events happening together.

            Example: Probability that it is cloudy and raining at the same time.
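The relationship between these concepts can be sketched in Python: a conditional probability is a joint probability divided by a marginal probability. The weather numbers below are hypothetical:

```python
# Conditional probability from joint and marginal probabilities.
# Hypothetical weather numbers, chosen for illustration:
p_cloudy = 0.40           # marginal: P(cloudy)
p_cloudy_and_rain = 0.10  # joint: P(cloudy and rain)

# P(rain | cloudy) = P(cloudy and rain) / P(cloudy)
p_rain_given_cloudy = p_cloudy_and_rain / p_cloudy
print(round(p_rain_given_cloudy, 2))  # → 0.25
```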


Naïve Bayes Model

  • Naïve Bayes is a probabilistic classification algorithm in machine learning based on Bayes Theorem.
  • It predicts the category or class of a data point using probability.
  • The word “Naïve” means the model assumes that all features are independent of each other, even if they are related in real life. This simple assumption makes the algorithm very fast and easy to use.
  • For each class, the model calculates the probability that the data point belongs to it; the class with the highest probability is selected as the prediction.
  • It is widely used in spam detection, text classification, and sentiment analysis.

Naïve Bayes is widely used in many applications such as:

  • Email spam detection

  • Sentiment analysis (positive or negative reviews)

  • Text classification

  • Document categorization

  • Medical diagnosis


Naïve Bayes Formula

The model is based on Bayes Theorem:

P(A|B) = [ P(B|A) × P(A) ] / P(B)

Where:

  • P(A) = Prior probability

  • P(B) = Probability of evidence

  • P(A|B) = Posterior probability

  • P(B|A) = Likelihood

The algorithm calculates these probabilities and chooses the class with the highest value.
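The "choose the class with the highest value" rule can be sketched as a small Python function. The priors and word likelihoods below are hypothetical illustration values:

```python
# Sketch of the Naive Bayes decision rule: score each class by
# prior * product of per-feature likelihoods, then pick the argmax.
# All probabilities below are hypothetical illustration values.

priors = {"Spam": 0.6, "Not Spam": 0.4}

# P(word | class) for two example words
likelihoods = {
    "Spam":     {"free": 0.67, "offer": 0.50},
    "Not Spam": {"free": 0.25, "offer": 0.10},
}

def classify(words):
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for w in words:
            # multiplying likelihoods is the "naive" independence assumption
            score *= likelihoods[cls][w]
        scores[cls] = score
    return max(scores, key=scores.get)

print(classify(["free", "offer"]))  # → Spam
```

Because P(B) is the same for every class, the sketch skips dividing by it; only the relative scores matter for the argmax.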


Types of Naïve Bayes

There are mainly three types:

  1. Gaussian Naïve Bayes
    Used for continuous numerical data like height, weight, marks.

  2. Multinomial Naïve Bayes
    Used for text classification such as email filtering.

  3. Bernoulli Naïve Bayes
    Used when data has binary values like Yes/No or 0/1.
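For the Gaussian variant, the likelihood of a continuous feature is the normal density evaluated at the feature value. A minimal sketch, with hypothetical mean/standard-deviation values:

```python
import math

# Gaussian Naive Bayes uses the normal density as P(feature | class).
# The class statistics below are hypothetical illustration values.

def gaussian_pdf(x, mean, std):
    """Normal probability density, used as the likelihood of a continuous feature."""
    coeff = 1.0 / (std * math.sqrt(2 * math.pi))
    return coeff * math.exp(-((x - mean) ** 2) / (2 * std ** 2))

# Suppose heights (cm) in class A have mean 170, std 5,
# and class B has mean 180, std 5; both classes are equally likely.
x = 172
score_a = 0.5 * gaussian_pdf(x, 170, 5)
score_b = 0.5 * gaussian_pdf(x, 180, 5)
print("A" if score_a > score_b else "B")  # → A
```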


Advantages of Naïve Bayes

  • Simple and easy to understand

  • Very fast algorithm

  • Works well with large datasets

  • Good for text classification problems


Limitations

  • Assumes features are independent (which may not always be true)

  • Accuracy may drop when features are strongly correlated


Naïve Bayes – Step-by-Step Process

Example: email spam detection.

We want to classify an email as:

  • Spam

  • Not Spam

Step 1: Training Data

Suppose we have 10 emails in our training data: 6 are Spam and 4 are Not Spam.

So the prior probabilities are:

                                                    P(Spam) = 6 / 10 = 0.6

                                                    P(Not Spam) = 4 / 10 = 0.4


Step 2: Word Information

From past emails we observe the word “Free”: it appears in 4 of the 6 Spam emails and in 1 of the 4 Not Spam emails.

So the probabilities are:

P(Free | Spam) = 4 / 6 ≈ 0.67

P(Free | Not Spam) = 1 / 4 = 0.25


Step 3: Naïve Bayes Formula

We use Bayes theorem.

For classification we compare:

P(Spam | Free)
P(Not Spam | Free)

Since P(Free) is the same for both classes, we compare only:

P(Free | Spam) × P(Spam)
P(Free | Not Spam) × P(Not Spam)

Step 4: Calculate Probabilities

For Spam

P(Free | Spam) × P(Spam)

= 0.67 × 0.6
= 0.402


For Not Spam

P(Free | Not Spam) × P(Not Spam)

= 0.25 × 0.4
= 0.10


Step 5: Compare Results

Since 0.402 > 0.10, the Spam score is higher.

Prediction: Email = Spam
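The five steps above can be reproduced with a short Python sketch. Working with exact fractions gives 0.40 vs 0.10 (the text rounds 4/6 to 0.67 first, giving 0.402), so the prediction is the same:

```python
# The worked spam example, reproduced in Python.
# Training data: 6 Spam / 4 Not Spam emails;
# "Free" appears in 4 Spam and 1 Not Spam email.

p_spam = 6 / 10
p_not_spam = 4 / 10
p_free_given_spam = 4 / 6
p_free_given_not_spam = 1 / 4

# Compare likelihood * prior for each class (P(Free) cancels out)
score_spam = p_free_given_spam * p_spam              # ≈ 0.40
score_not_spam = p_free_given_not_spam * p_not_spam  # = 0.10

print("Spam" if score_spam > score_not_spam else "Not Spam")  # → Spam
```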


