ML 1.4 Semi-Supervised Machine Learning and Reinforcement Learning

- January 27, 2026

Semi-Supervised Learning: -

Semi-supervised machine learning is a learning approach that uses a small amount of labelled data and a large amount of unlabelled data to train a model.

Semi-supervised learning is needed because:

1. Labelling data is time-consuming and expensive

2. Unlabelled data is easy to collect

3. Semi-supervised learning can improve accuracy compared to training on only a small labelled dataset

How semi-supervised learning works:

1. Start with a small labelled dataset

2. Train an initial model

3. Use the model to predict labels for unlabelled data

4. Select confident predictions

5. Retrain the model using both labelled and newly labelled data
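The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming one-dimensional data and a very simple nearest-centroid "model"; the function names, the confidence measure and the threshold of 0.6 are all assumptions made for this example, not part of any library.

```python
from statistics import mean

def fit_centroids(points, labels):
    """Toy 'model': the average position of the points in each class."""
    classes = sorted(set(labels))
    return {c: mean(p for p, l in zip(points, labels) if l == c) for c in classes}

def predict_with_confidence(centroids, x):
    """Return (label, confidence). Needs at least two classes.

    Confidence is an assumed margin in [0, 1]: close to 1 when x is much
    nearer to one centroid than to the next, close to 0 when it is in between.
    """
    dists = sorted((abs(x - centre), label) for label, centre in centroids.items())
    (d_best, label), (d_second, _) = dists[0], dists[1]
    total = d_best + d_second
    confidence = 1.0 if total == 0 else (d_second - d_best) / total
    return label, confidence

def self_train(labelled_x, labelled_y, unlabelled_x, threshold=0.6, max_rounds=5):
    x, y = list(labelled_x), list(labelled_y)      # step 1: small labelled set
    pool = list(unlabelled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(x, y)            # steps 2 and 5: (re)train
        confident, remaining = [], []
        for point in pool:                         # step 3: predict labels
            label, conf = predict_with_confidence(centroids, point)
            if conf >= threshold:                  # step 4: keep confident ones
                confident.append((point, label))
            else:
                remaining.append(point)
        if not confident:                          # nothing new to learn from
            break
        for point, label in confident:
            x.append(point)
            y.append(label)
        pool = remaining
    return fit_centroids(x, y), pool
```

For example, `self_train([1.0, 9.0], ["low", "high"], [1.5, 2.0, 8.0, 8.5, 5.2])` pseudo-labels the four points that sit near a labelled example, while the ambiguous point 5.2 stays in the unlabelled pool because the model is never confident about it.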

Example: - Imagine a classroom:

The teacher solves a few problems on the board (labelled data).

The students then solve many similar problems by observing patterns (unlabelled data).

The teacher corrects only the most important ones.

Learning improves even with limited guidance.

Common semi-supervised techniques

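1. Self-training
The model labels unlabelled data by itself and retrains on its own confident predictions

2. Co-training
Two models, each trained on a different view (set of features) of the data, label examples for each other

3. Graph-based methods
Data points are connected in a graph, and labels spread from labelled points to similar unlabelled neighbours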
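To make the graph-based idea concrete, here is a small sketch of label spreading on a graph with two classes. It assumes the graph is already given as a dictionary of neighbours and that the two classes are encoded as +1.0 and -1.0; the function name and this encoding are choices made for the example only.

```python
def propagate_labels(edges, seeds, n_iters=20):
    """Spread seed labels over an undirected graph.

    edges: dict mapping each node to a list of its neighbours
    seeds: dict mapping labelled nodes to +1.0 or -1.0
    Returns a dict node -> score; the sign of the score is the predicted class.
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(n_iters):
        updated = {}
        for node, neighbours in edges.items():
            if node in seeds:
                updated[node] = seeds[node]        # labelled nodes never change
            elif neighbours:
                # an unlabelled node takes the average score of its neighbours
                updated[node] = sum(scores[n] for n in neighbours) / len(neighbours)
            else:
                updated[node] = 0.0                # isolated node: no information
        scores = updated
    return scores

# A chain a - b - c - d - e with only the two end points labelled.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
scores = propagate_labels(chain, {"a": 1.0, "e": -1.0})
```

In this chain, node b ends up with a positive score because it is next to the positive seed, node d ends up negative, and the middle node c stays at zero because it is equally far from both labelled points.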

Reinforcement Learning: -

Reinforcement learning is a type of machine learning in which a machine (called an agent) learns how to behave in an environment by performing actions and receiving rewards or penalties for them.

For example, training a robot to navigate a maze.

The goal of reinforcement learning is to learn a policy, which is a mapping from states to actions, that maximizes the expected cumulative reward over time.

Good action → reward

Bad action → no reward or penalty

Over time, the agent learns what actions are best.

How reinforcement learning works:

1. Agent observes the current state

2. Agent chooses an action

3. Environment responds with a reward or penalty

4. Agent updates its strategy

5. Process repeats until optimal behaviour is learned
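The loop above can be sketched with a deliberately tiny, made-up environment. Here the state is whether a room is "hot" or "cold", the actions are "fan" and "heater", and the reward depends only on the current state and action. All of these names, the reward values and the exploration rate `epsilon` are assumptions for illustration.

```python
import random

def run_agent(n_steps=500, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    states, actions = ["hot", "cold"], ["fan", "heater"]
    correct = {"hot": "fan", "cold": "heater"}     # known to the environment only
    value = {(s, a): 0.0 for s in states for a in actions}   # estimated reward
    count = {(s, a): 0 for s in states for a in actions}

    for _ in range(n_steps):                       # 5. the process repeats
        state = rng.choice(states)                 # 1. observe the current state
        if rng.random() < epsilon:                 # 2. choose an action:
            action = rng.choice(actions)           #    sometimes explore,
        else:                                      #    otherwise use the best guess
            action = max(actions, key=lambda a: value[(state, a)])
        reward = 1.0 if action == correct[state] else -1.0   # 3. reward or penalty
        count[(state, action)] += 1                # 4. update the strategy
        step_size = 1.0 / count[(state, action)]   #    (running average of rewards)
        value[(state, action)] += step_size * (reward - value[(state, action)])

    # The learned strategy: the best-looking action in each state.
    return {s: max(actions, key=lambda a: value[(s, a)]) for s in states}
```

After enough steps the agent settles on using the fan when it is hot and the heater when it is cold, even though it was never told this rule directly. Real reinforcement learning problems are harder because actions also change which states come next.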

There are two main types of reinforcement learning:

a) Model-based reinforcement learning: -

In model-based reinforcement learning, the agent learns a model of the environment, including the transition probabilities between states and the rewards associated with each state-action pair.

The agent then uses this model to plan its actions in order to maximize its expected reward. Value Iteration and Policy Iteration are two classic planning algorithms that can be used once the model is known or has been learned.

The agent first learns how the environment works, then uses that knowledge to plan its actions.

Common algorithms

Value Iteration: a method where the agent works out how good each state is and, from that, decides the best action. In short: first find the value of every state, then choose actions.
Policy Iteration: a method where the agent starts with a policy (a rule for choosing actions), checks how good it is, and then improves it. In short: start with a plan, test it, improve it, repeat.
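As a rough sketch of Value Iteration, assume the model is already available as a dictionary that lists, for each state and action, the possible outcomes with their probabilities and rewards. The corridor environment, the function names and the discount factor `gamma=0.9` below are assumptions chosen for this example.

```python
def corridor_model(n_cells=4):
    """Hypothetical corridor: cells 0..n_cells-1, the last cell is the goal."""
    goal = n_cells - 1
    model = {}
    for s in range(goal):                          # the goal cell is terminal
        right, left = s + 1, max(s - 1, 0)
        # each entry: list of (probability, next_state, reward)
        model[(s, "right")] = [(1.0, right, 1.0 if right == goal else 0.0)]
        model[(s, "left")] = [(1.0, left, 0.0)]
    return model

def value_iteration(states, actions, model, gamma=0.9, tol=1e-8):
    def q(s, a, values):
        # expected reward now plus discounted value of where we end up
        return sum(p * (r + gamma * values[s2]) for p, s2, r in model[(s, a)])

    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            available = [a for a in actions if (s, a) in model]
            if not available:                      # terminal state keeps value 0
                continue
            best = max(q(s, a, values) for a in available)
            delta = max(delta, abs(best - values[s]))
            values[s] = best                       # first: the value of each state
        if delta < tol:                            # stop when values stop changing
            break

    policy = {}
    for s in states:
        available = [a for a in actions if (s, a) in model]
        if available:                              # then: choose the best action
            policy[s] = max(available, key=lambda a: q(s, a, values))
    return values, policy
```

Calling `value_iteration(range(4), ["left", "right"], corridor_model())` gives higher values to cells closer to the goal, and the resulting policy moves right in every non-goal cell. Policy Iteration would reach the same policy here by alternating between evaluating a policy and improving it.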

b) Model-free reinforcement learning: -

In model-free reinforcement learning, the agent learns a policy directly from experience without explicitly building a model of the environment.

The agent does not learn how the environment works.
Instead, it learns directly from trial and error.

The agent interacts with the environment and updates its policy based on the rewards it receives. Some popular model-free reinforcement learning algorithms include Q-Learning, SARSA, and deep reinforcement learning methods such as Deep Q-Networks (DQN).
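A minimal sketch of tabular Q-Learning shows the trial-and-error idea. The agent below never sees the rules of the environment; it only calls `step` and observes what happens. The four-cell corridor, the function names and the values of `alpha`, `gamma` and `epsilon` are assumptions made for this example.

```python
import random

def step(state, action):
    """Hypothetical corridor with cells 0..3; reaching cell 3 ends the episode."""
    next_state = min(state + 1, 3) if action == "right" else max(state - 1, 0)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward, next_state == 3

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    actions = ["left", "right"]
    q = {(s, a): 0.0 for s in range(4) for a in actions}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:             # explore with a random action
                action = rng.choice(actions)
            else:                                  # otherwise act greedily,
                best = max(q[(state, a)] for a in actions)   # ties broken randomly
                action = rng.choice([a for a in actions if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q-Learning target: reward plus discounted best value of next state
            future = 0.0 if done else max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q
```

After training, the learned values prefer "right" over "left" in every non-goal cell, so acting greedily on the table walks straight to the goal. SARSA differs mainly in the target: it uses the value of the action the agent actually takes next instead of the best one.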


ROHIT's Smart Class Room
