Machine Learning (Week 1)

What is Machine Learning?

  1. The field of study that gives computers the ability to learn without being explicitly programmed.
  2. A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

Supervised Learning

In supervised learning we are given a data set and already know what the correct output should look like. There is a relationship between the input and the output.

Two types of supervised learning

  1. Classification
  2. Regression

Regression: We are trying to predict results within a continuous output, meaning we map input variables to some continuous function. Ex. Given a picture of a person, we have to predict their age. (In this example the input data is mapped to a continuous function, which we can then use to predict the person's age.)

Classification: We are trying to predict results in a discrete output, meaning we map input variables into discrete categories. Ex. We have to classify a tumor as malignant or benign.
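To make the difference concrete, here is a minimal sketch in Python. Both functions, the `wrinkle_score` feature, and all the numbers are made up for illustration (these are not trained models). The only point is that a regression model returns a continuous number, while a classifier returns one of a fixed set of labels.

```python
def predict_age(wrinkle_score: float) -> float:
    """Toy regression: maps an input to a continuous value (an age in years).

    The feature and the coefficients are invented for illustration.
    """
    return 18.0 + 6.5 * wrinkle_score


def classify_tumor(size_cm: float) -> str:
    """Toy classification: maps an input to one of two discrete labels.

    The 3.0 cm threshold is an invented value, not medical guidance.
    """
    return "malignant" if size_cm > 3.0 else "benign"


print(predict_age(2.0))     # a real number
print(classify_tumor(1.2))  # one of two labels
```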

Unsupervised Learning

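Unsupervised learning allows us to approach problems with little or no idea of what the results should look like. We can find structure in data even when we don't know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

In unsupervised learning there is no feedback based on the prediction results, because there are no known correct outputs to check against.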

Eg. Clustering news articles on Google News, where we don’t know anything about the data in advance, is unsupervised learning.
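As a rough illustration of clustering, here is a tiny sketch of k-means on one-dimensional data. K-means is not named in these notes; it is used here only as one common clustering method. The data points and the starting centers are invented.

```python
def kmeans_1d(points, centers, iterations=10):
    """Very small 1-D k-means: returns final centers and a cluster index per point."""
    centers = list(centers)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its points.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return centers, labels


# Unlabeled data: nobody told the algorithm that there are two groups here.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, labels = kmeans_1d(data, centers=[0.0, 5.0])
print(centers)
print(labels)
```

The algorithm never sees a "correct" label. It only groups points that are close to each other, which is what makes it unsupervised.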

Model Representation

We use x^{(i)} to denote the “input” variables (living area in the house price example used here), also called input features, and y^{(i)} to denote the “output” or target variable that we are trying to predict (price). A pair (x^{(i)}, y^{(i)}) is called a training example.
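In code, a training set can be sketched as a list of (x, y) pairs. The numbers below are invented, and the name `m` for the number of training examples is an assumption that these notes do not define.

```python
# Hypothetical training set: (living area in square feet, price in $1000s).
# The numbers are invented for illustration.
training_set = [
    (1000, 200),
    (1500, 300),
    (2000, 400),
]

m = len(training_set)  # number of training examples (assumed name)

# The first training example (x^(1), y^(1)). Python indexes from 0,
# while the mathematical notation usually counts from 1.
x_1, y_1 = training_set[0]
print(m, x_1, y_1)
```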

Cost Function

We can measure the accuracy of our hypothesis function by using a cost function.

The cost function used here is otherwise called the “squared error function” or “mean squared error”.

Goal: Choose values of Θ1 and Θ2 so that the value of the hypothesis is close to y for our training examples (x, y). In other words, the cost function should be at its minimum.
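For reference, the squared error cost function is commonly written as follows. This is a sketch under assumptions these notes do not spell out: the hypothesis is taken to be a straight line with Θ1 as the intercept and Θ2 as the slope, and m denotes the number of training examples.

```latex
J(\Theta_1, \Theta_2) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2,
\qquad h(x) = \Theta_1 + \Theta_2 x
```

The factor 1/2 is a convention that makes the derivative cleaner. It does not change where the minimum is.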

We try to reduce the cost function. The lower the cost function, the better the hypothesis.
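Here is a minimal sketch of a squared error cost in Python. It assumes a straight-line hypothesis h(x) = theta1 + theta2 * x (the form is an assumption, as is the halving convention), and the data is invented.

```python
def hypothesis(theta1, theta2, x):
    """Straight-line hypothesis h(x) = theta1 + theta2 * x (assumed form)."""
    return theta1 + theta2 * x


def cost(theta1, theta2, data):
    """Squared error cost: mean of the squared errors, halved by convention."""
    m = len(data)
    total = sum((hypothesis(theta1, theta2, x) - y) ** 2 for x, y in data)
    return total / (2 * m)


# Invented data lying exactly on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

perfect = cost(0.0, 2.0, data)  # hypothesis matches the data exactly
worse = cost(0.0, 1.0, data)    # hypothesis y = x is further from the data
print(perfect, worse)
```

The hypothesis that fits the data exactly has a cost of zero, and the worse fit has a higher cost.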

In a contour plot of the cost function, all points on the same contour line have the same cost value.

Gradient Descent

Gradient descent is an algorithm for minimizing the cost function.

Running the gradient descent algorithm takes us to a local minimum of the cost function. In the figure, the red arrow marks a local minimum.

We take the derivative of the cost function with respect to theta to get the slope at the current point. The slope tells us which direction is uphill, so to move toward the minimum we multiply the derivative by alpha and subtract the result from theta. Alpha is called the learning rate. A larger alpha takes bigger steps and can converge faster, but if alpha is too large the algorithm can overshoot the minimum and fail to converge. A smaller alpha converges more slowly. We have to choose alpha carefully.
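The update described above is theta := theta - alpha * (dJ/dtheta). Here is a small sketch on a toy cost J(theta) = (theta - 3)**2, chosen only because its derivative is easy to write by hand. The cost and the alpha values are invented for illustration.

```python
def gradient_descent(theta, alpha, steps):
    """Minimize the toy cost J(theta) = (theta - 3)**2 by gradient descent."""
    for _ in range(steps):
        slope = 2 * (theta - 3)        # derivative dJ/dtheta at the current theta
        theta = theta - alpha * slope  # step against the slope, scaled by alpha
    return theta


small = gradient_descent(theta=0.0, alpha=0.01, steps=50)   # slow progress
good = gradient_descent(theta=0.0, alpha=0.1, steps=50)     # ends close to 3
too_big = gradient_descent(theta=0.0, alpha=1.1, steps=50)  # overshoots and diverges
print(small, good, too_big)
```

With the same number of steps, the small alpha is still far from the minimum at 3, the moderate alpha lands very close to it, and the too-large alpha moves further away on every step.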


When it reaches a local minimum, the slope at that point is zero, so the value of theta stops changing.

There is no need to decrease alpha over time, because the derivative becomes smaller as we approach a local minimum, so the steps shrink automatically.
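The shrinking steps can be seen in a short sketch. It uses an invented toy cost (theta - 3)**2 and keeps alpha fixed for the whole run.

```python
theta = 0.0
alpha = 0.1  # fixed for the whole run
step_sizes = []
for _ in range(5):
    slope = 2 * (theta - 3)  # derivative of the toy cost (theta - 3)**2
    step = alpha * slope
    step_sizes.append(abs(step))
    theta -= step

print(step_sizes)
```

Each step is smaller than the one before it, even though alpha never changes, because the slope itself gets smaller near the minimum.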

Gradient Descent for Linear Regression

When specifically applied to linear regression, a new form of the gradient descent equation can be derived.
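One way to write the resulting update rules is sketched here. It assumes a straight-line hypothesis h(x) = Θ1 + Θ2 x with Θ1 as the intercept and Θ2 as the slope, a squared error cost with the 1/(2m) convention, m training examples, and learning rate α. These conventions are assumptions, not something the notes state.

```latex
\text{repeat until convergence:} \\
\Theta_1 := \Theta_1 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) \\
\Theta_2 := \Theta_2 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x^{(i)}
```

Both parameters are updated simultaneously, meaning both sums are computed from the old values of Θ1 and Θ2 before either one is changed.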

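Gradient descent that uses every training example at each step is called batch gradient descent. Since the squared error cost function for linear regression is always convex, it has no local minima other than the global minimum, so gradient descent always converges to the global minimum (as long as alpha is not too large).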
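Putting this together, here is a minimal sketch of batch gradient descent for a straight-line hypothesis h(x) = theta1 + theta2 * x. The function name, the default alpha and step count, and the data are all assumptions chosen for illustration.

```python
def batch_gradient_descent(data, alpha=0.1, steps=2000):
    """Fit h(x) = theta1 + theta2 * x by batch gradient descent."""
    theta1, theta2 = 0.0, 0.0
    m = len(data)
    for _ in range(steps):
        # Errors over the WHOLE training set: this is what makes it "batch".
        errors = [(theta1 + theta2 * x) - y for x, y in data]
        grad1 = sum(errors) / m
        grad2 = sum(e * x for e, (x, _) in zip(errors, data)) / m
        # Simultaneous update: both gradients were computed from the old values.
        theta1 -= alpha * grad1
        theta2 -= alpha * grad2
    return theta1, theta2


# Invented data lying exactly on the line y = 1 + 2x.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
theta1, theta2 = batch_gradient_descent(data)
print(theta1, theta2)
```

Because the data lies exactly on a line, the fitted values should end up very close to an intercept of 1 and a slope of 2.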

Thanks for reading.


