ML 1.4 Semi-Supervised Machine Learning and Reinforcement Learning
Semi-Supervised Learning: -
- Semi-supervised machine learning is a learning approach that uses a small amount of labelled data and a large amount of unlabelled data to train a model.
- Semi-supervised learning is needed because:
1. Labelling data is time-consuming and expensive
2. Unlabelled data is easy to collect
3. Semi-supervised learning can improve accuracy compared to using only a small labelled dataset
How semi-supervised learning works:
1. Start with a small labelled dataset
2. Train an initial model
3. Use the model to predict labels for unlabelled data
4. Select confident predictions
5. Retrain the model using both labelled and newly labelled data
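The five steps above can be sketched as a small self-training loop. This is only an illustration under stated assumptions: the data is made-up 1-D numbers, the "model" is a simple nearest-centroid classifier, and the names fit_centroids, predict_with_confidence, self_train and the 0.8 confidence threshold are all hypothetical choices, not part of these notes.

```python
# Minimal self-training sketch on 1-D data (data, model and threshold are assumptions).

def fit_centroids(points, labels):
    """Train a nearest-centroid model: one mean value per class."""
    centroids = {}
    for cls in set(labels):
        members = [x for x, y in zip(points, labels) if y == cls]
        centroids[cls] = sum(members) / len(members)
    return centroids


def predict_with_confidence(centroids, x):
    """Return (predicted class, confidence in [0.5, 1]) for one point."""
    # Sort classes by distance to their centroid; the closest class wins.
    dists = sorted((abs(x - c), cls) for cls, c in centroids.items())
    (d_best, cls_best), (d_next, _) = dists[0], dists[1]
    total = d_best + d_next
    # Confidence is high when the point is much closer to one centroid.
    confidence = 1.0 if total == 0 else d_next / total
    return cls_best, confidence


def self_train(labelled_x, labelled_y, unlabelled_x, threshold=0.8, max_rounds=10):
    x, y = list(labelled_x), list(labelled_y)   # step 1: small labelled set
    pool = list(unlabelled_x)
    for _ in range(max_rounds):
        model = fit_centroids(x, y)             # steps 2 and 5: (re)train
        remaining, added = [], 0
        for point in pool:
            cls, conf = predict_with_confidence(model, point)  # step 3
            if conf >= threshold:               # step 4: keep confident ones
                x.append(point)
                y.append(cls)
                added += 1
            else:
                remaining.append(point)
        pool = remaining
        if added == 0:                          # nothing new to learn from
            break
    return fit_centroids(x, y), pool


model, leftover = self_train(
    [1.0, 9.0], ["low", "high"],
    [0.5, 1.5, 2.0, 8.0, 8.5, 9.5, 5.2],
)
print(model, leftover)
```

In this toy run the point 5.2 sits almost halfway between the two classes, so it never passes the confidence threshold and stays unlabelled. That is the purpose of step 4: uncertain predictions are kept out of the training set so that wrong labels are less likely to be reinforced.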
Example: - Imagine a classroom:
- Teacher solves a few problems on the board (labelled data)
- Students then solve many similar problems by observing patterns (unlabelled data)
- Teacher corrects only important ones
Learning improves even with limited guidance.
Common semi-supervised techniques
1. Self-training: the model labels unlabelled data by itself
2. Co-training: two models, each trained on a different view of the data, teach each other
3. Graph-based methods: similar data points influence each other
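The graph-based idea ("similar data points influence each other") can be illustrated with a very simplified label propagation sketch. Assumptions: there are only two classes, encoded as scores +1 ("A") and -1 ("B"); the graph, the function name propagate_labels and the number of iterations are hypothetical.

```python
# Simplified label propagation on a small graph (two classes, scores +1 / -1).

def propagate_labels(neighbours, known, iterations=100):
    """neighbours: node -> list of adjacent nodes; known: node -> +1.0 or -1.0."""
    scores = {node: known.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in known:
                updated[node] = known[node]   # labelled nodes stay fixed
            else:
                # An unlabelled node takes the average score of its neighbours.
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return {node: ("A" if s > 0 else "B") for node, s in scores.items()}


# A chain of six points: 0 - 1 - 2 - 3 - 4 - 5, with only the two ends labelled.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = propagate_labels(chain, known={0: 1.0, 5: -1.0})
print(labels)
```

Nodes closer to the "A" end inherit label A and nodes closer to the "B" end inherit label B, even though only two of the six points were labelled to begin with.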
4) Reinforcement Learning: -
- Reinforcement learning is a type of machine learning where an agent learns to interact with an environment by performing actions and receiving rewards or penalties based on its actions.
- In simpler terms, a machine (called an agent) learns by taking actions and receiving rewards or punishments.
- For example, training a robot to navigate a maze.
- The goal of reinforcement learning is to learn a policy, which is a mapping from states to actions, that maximizes the expected cumulative reward over time.
- Good action → reward
- Bad action → no reward or penalty
- Over time, the agent learns what actions are best.
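To make "cumulative reward over time" concrete, here is a small sketch that adds up the rewards of one episode. It assumes a discount factor gamma (a common convention, not stated in these notes) that makes later rewards count slightly less; with gamma = 1 it is a plain sum. The function name discounted_return is hypothetical.

```python
# Cumulative (discounted) reward of one episode; gamma is an assumed parameter.

def discounted_return(rewards, gamma=0.9):
    """Compute r0 + gamma*r1 + gamma^2*r2 + ... for a list of rewards."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (everything later).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


print(discounted_return([0.0, 0.0, 1.0]))             # reward arrives late: about 0.81
print(discounted_return([1.0, -1.0, 2.0], gamma=1.0)) # plain sum: 2.0
```

A policy is considered better when, on average, it leads to a larger value of this sum.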
How reinforcement learning works:
1. Agent observes the current state
2. Agent chooses an action
3. Environment responds with a reward or penalty
4. Agent updates its strategy
5. Process repeats until the agent's behaviour stops improving (ideally, until optimal behaviour is learned)
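The loop above can be sketched with a toy example. Assumptions: the environment has a single state and two actions, "right" is rewarded (+1) and "left" is penalised (-1); the agent keeps an average reward per action as its strategy and sometimes explores at random (epsilon-greedy). All names and numbers here are hypothetical.

```python
import random

ACTIONS = ["left", "right"]


def environment_step(state, action):
    """Hypothetical environment: 'right' is rewarded, 'left' is penalised."""
    reward = 1.0 if action == "right" else -1.0
    next_state = "start"   # this toy environment has only one state
    return next_state, reward


def run_agent(steps=200, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    estimates = {a: 0.0 for a in ACTIONS}   # average reward seen per action
    counts = {a: 0 for a in ACTIONS}
    state = "start"                         # step 1: observe the current state
    for _ in range(steps):                  # step 5: repeat
        # Step 2: choose an action (explore sometimes, otherwise pick the best).
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: estimates[a])
        # Step 3: the environment responds with a reward or penalty.
        state, reward = environment_step(state, action)
        # Step 4: update the strategy (running average of rewards).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


estimates = run_agent()
print(estimates)
```

After a few steps the estimate for "right" is higher than for "left", so the agent mostly chooses "right" and only tries "left" when it explores.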
There are two main types of reinforcement learning:
a) Model-based reinforcement learning: -
- In model-based reinforcement learning, the agent learns a model of the environment, including the transition probabilities between states and the rewards associated with each state-action pair.
- The agent then uses this model to plan its actions in order to maximize its expected reward. Popular planning algorithms used with such a model include Value Iteration and Policy Iteration.
- The agent first learns how the environment works, then uses that knowledge to plan its actions.
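One simple way to "learn a model" is to count what happened in past experience. The sketch below estimates transition probabilities and average rewards from a list of observed (state, action, reward, next state) records. The function name estimate_model and the example data are hypothetical.

```python
from collections import defaultdict


def estimate_model(transitions):
    """Estimate transition probabilities and mean rewards by counting.

    transitions: list of (state, action, reward, next_state) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for state, action, reward, next_state in transitions:
        counts[(state, action)][next_state] += 1
        reward_sums[(state, action)] += reward

    probabilities, mean_rewards = {}, {}
    for pair, next_counts in counts.items():
        n = sum(next_counts.values())
        # Fraction of times each next state followed this state-action pair.
        probabilities[pair] = {s2: c / n for s2, c in next_counts.items()}
        mean_rewards[pair] = reward_sums[pair] / n
    return probabilities, mean_rewards


# Made-up experience: from A, "go" usually leads to B but sometimes stays in A.
experience = [
    ("A", "go", 0.0, "B"),
    ("A", "go", 0.0, "B"),
    ("A", "go", 0.0, "A"),
    ("A", "go", 0.0, "B"),
    ("B", "go", 1.0, "end"),
]
probabilities, mean_rewards = estimate_model(experience)
print(probabilities[("A", "go")], mean_rewards[("B", "go")])
```

Once such estimates exist, the agent can plan with them instead of trying every action in the real environment.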
Common algorithms
- Value Iteration: the agent works out how good each state is (its value) and, from those values, decides the best action. In short: first find the value of every state, then choose actions.
- Policy Iteration: the agent starts with a policy (a rule for choosing actions), checks how good it is, and then improves it. In short: start with a plan, test it, improve it, repeat.
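Both methods can be sketched on a tiny, fully known environment. Assumptions: a corridor of states 0 to 3 where state 3 is the goal, two actions ("left", "right"), a reward of 1 for reaching the goal, and a discount factor GAMMA = 0.9. The model format (a list of (probability, next state, reward) entries per state-action pair) and all function names are hypothetical.

```python
GAMMA = 0.9
STATES = [0, 1, 2, 3]            # state 3 is the terminal goal
NON_TERMINAL = STATES[:-1]
ACTIONS = ["left", "right"]


def build_model():
    """Known model: (state, action) -> list of (probability, next_state, reward)."""
    model = {}
    for s in NON_TERMINAL:
        model[(s, "left")] = [(1.0, max(s - 1, 0), 0.0)]
        model[(s, "right")] = [(1.0, s + 1, 1.0 if s + 1 == 3 else 0.0)]
    return model


def q_value(model, values, s, a):
    """Expected reward of taking action a in state s, then following `values`."""
    return sum(p * (r + GAMMA * values[s2]) for p, s2, r in model[(s, a)])


def greedy_policy(model, values):
    return {s: max(ACTIONS, key=lambda a: q_value(model, values, s, a))
            for s in NON_TERMINAL}


def value_iteration(model, tol=1e-9):
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in NON_TERMINAL:
            best = max(q_value(model, values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:                       # values have stopped changing
            return values, greedy_policy(model, values)


def evaluate_policy(model, policy, tol=1e-9):
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in NON_TERMINAL:
            v = q_value(model, values, s, policy[s])
            delta = max(delta, abs(v - values[s]))
            values[s] = v
        if delta < tol:
            return values


def policy_iteration(model):
    policy = {s: "left" for s in NON_TERMINAL}      # start with a (bad) plan
    while True:
        values = evaluate_policy(model, policy)     # test it
        improved = greedy_policy(model, values)     # improve it
        if improved == policy:                      # no change: stop
            return values, policy
        policy = improved                           # repeat


model = build_model()
print(value_iteration(model))
print(policy_iteration(model))
```

On this example both methods end with the same policy (always move right); they differ only in the route taken to reach it, which matches the two summaries above.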
b) Model-free reinforcement learning: -
- In model-free reinforcement learning, the agent learns a policy directly from experience without explicitly building a model of the environment.
- The agent does not learn how the environment works. Instead, it learns directly from trial and error.
- The agent interacts with the environment and updates its policy based on the rewards it receives. Some popular model-free reinforcement learning algorithms include Q-Learning, SARSA, and deep reinforcement learning methods such as Deep Q-Networks (DQN).
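As an illustration of model-free learning, here is a minimal tabular Q-Learning sketch. Assumptions: a corridor of states 0 to 4 with the goal at state 4, a reward of 1 for reaching the goal, and hypothetical values for the learning rate (alpha), discount factor (gamma) and exploration rate (epsilon). The agent never looks inside the step function; it only sees the results of its actions.

```python
import random

N_STATES = 5                     # states 0..4, state 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment dynamics, hidden from the agent (it only sees the outputs)."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore sometimes, otherwise act greedily
            # (ties are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q-Learning target: reward now + best estimated value afterwards.
            future = 0.0 if done else gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
    return q


q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

SARSA follows the same loop, but its target uses the value of the action the agent actually takes next instead of the best available one.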