# Machine Learning (Week 1)

**What is Machine Learning?**

- The field of study that gives computers the ability to learn without being explicitly programmed.
- A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

**Supervised Learning**

In supervised learning we are given a data set and already know what the correct output should look like; there is a relationship between the input and the output.

There are two types of supervised learning:

- Classification
- Regression

Regression:- We are trying to predict a result within a continuous output, meaning we map input variables to some continuous function. E.g. given a picture of a person, predict their age: the input is mapped to a continuous function whose output is the predicted age.

Classification:- We are trying to predict a result in a discrete output, meaning we map input variables into discrete categories. E.g. given a tumor, classify it as malignant or benign.
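The distinction can be sketched with two toy predictors; the rules below are made up for illustration, not real models:

```python
# Regression: continuous output (a number on a continuous scale).
def predict_age(pixels_mean):
    return 0.5 * pixels_mean + 10.0    # an arbitrary linear rule, for illustration

# Classification: discrete output (one of a fixed set of labels).
def classify_tumor(size_cm):
    return "malignant" if size_cm > 3.0 else "benign"

print(predict_age(40.0))        # a continuous value: 30.0
print(classify_tumor(2.0))      # a discrete label: "benign"
```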

**Unsupervised Learning**

Unsupervised learning allows us to approach problems with little or no idea what the results should look like.

We can derive this structure by clustering the data based on relationships among the variables in the data.

Unlike supervised learning, there is no feedback based on the prediction results.

E.g. clustering of news stories on Google News, where we don’t know anything about the data in advance, is unsupervised learning.
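The clustering idea can be sketched with a toy 1-D k-means on made-up points (this is just the grouping idea, not the algorithm Google News actually uses):

```python
# Toy 1-D k-means with k=2: assign each point to the nearer center,
# then move each center to the mean of its cluster, and repeat.
def kmeans_1d(points, c1, c2, steps=10):
    for _ in range(steps):
        a = [p for p in points if abs(p - c1) <= abs(p - c2)]
        b = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(a) / len(a) if a else c1   # center of first cluster
        c2 = sum(b) / len(b) if b else c2   # center of second cluster
    return c1, c2

points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]   # two obvious groups
print(kmeans_1d(points, 0.0, 5.0))          # centers settle near each group
```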

# Model Representation

We use x^{(i)} to denote the “input” variables (living area in this example), also called input features, and y^{(i)} to denote the “output” or target variable that we are trying to predict (price). A pair (x^{(i)}, y^{(i)}) is called a training example.
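In code, a training set is just a list of (x, y) pairs and the hypothesis is a linear function of x; the numbers below are illustrative, not real housing data:

```python
# Training examples (x^(i), y^(i)): (living area, price), made-up values.
training_set = [(2104, 400), (1416, 232), (1534, 315)]

def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

x1, y1 = training_set[0]                 # the first training example
print(hypothesis(0.0, 0.1, x1))          # predicted price for x^(1)
```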

**Cost Function**

We can measure the accuracy of our hypothesis function by using a cost function.

This function is otherwise called the “Squared error function”, or “Mean squared error”.

Goal:- Choose values θ0 and θ1 so that the hypothesis h(x) is close to y for our training examples (x, y), i.e. the cost function is minimized.

We try to reduce the cost function: the lower the cost, the better the hypothesis.
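The squared-error cost can be written directly from its definition, J = 1/(2m) · Σ (h(x) − y)²; the data here is illustrative:

```python
def cost(theta0, theta1, data):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(data)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in data) / (2 * m)

data = [(1, 1), (2, 2), (3, 3)]      # generated from y = x exactly
print(cost(0.0, 1.0, data))          # perfect fit, so the cost is 0.0
print(cost(0.0, 0.0, data))          # a worse hypothesis has higher cost
```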

**Gradient Descent**

An algorithm for minimizing the cost function.

After running gradient descent we reach a local minimum of the cost function (marked by the red arrow in the lecture figure).

We take the derivative of the cost function with respect to θ to get the slope at the current point; this slope tells us which direction to move. Since we want to move towards the minimum, we multiply the slope by α and subtract it from θ. The parameter α is called the learning rate. If α is too large the algorithm may overshoot and fail to converge; if α is too small it converges slowly, so α has to be chosen carefully.

θ1 := θ1 − α * dJ(θ1)/dθ1

When it reaches a local minimum, the slope at that point is zero, so the value of θ stops changing.

There is no need to decrease α over time, because the derivative becomes smaller as we approach the local minimum, so the step size shrinks automatically.
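The update rule θ := θ − α · dJ/dθ can be sketched on a simple one-variable cost, J(θ) = (θ − 3)², whose minimum is at θ = 3 (the cost and its derivative here are illustrative, not the linear-regression cost):

```python
def gradient_descent(dJ, theta, alpha, steps):
    """Repeatedly apply theta := theta - alpha * dJ(theta)."""
    for _ in range(steps):
        theta = theta - alpha * dJ(theta)   # step downhill along the slope
    return theta

dJ = lambda t: 2 * (t - 3)                  # derivative of J(t) = (t - 3)^2
theta = gradient_descent(dJ, theta=0.0, alpha=0.1, steps=100)
print(round(theta, 4))                      # converges to 3.0
```

Note that α stays fixed: the steps still shrink near the minimum because the derivative itself goes to zero there.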

**Gradient Descent for Linear Regression**

When specifically applied to linear regression, a new form of the gradient descent equations can be derived.

Gradient descent that uses all the training examples in each update step is called batch gradient descent. Since the cost function for linear regression is always convex, gradient descent here always converges to the global minimum.
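Putting the pieces together, batch gradient descent for linear regression sums the gradient over every training example at each step; the data and learning rate below are illustrative:

```python
def batch_gradient_descent(data, alpha=0.01, steps=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0
    m = len(data)
    for _ in range(steps):
        # Gradients of J w.r.t. theta0 and theta1, averaged over all examples.
        g0 = sum(theta0 + theta1 * x - y for x, y in data) / m
        g1 = sum((theta0 + theta1 * x - y) * x for x, y in data) / m
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * g0, theta1 - alpha * g1
    return theta0, theta1

data = [(1, 3), (2, 5), (3, 7)]           # generated from y = 2x + 1
theta0, theta1 = batch_gradient_descent(data)
print(round(theta0, 2), round(theta1, 2))  # close to 1.0 and 2.0
```

Because this cost is convex, the result does not depend on getting lucky with the starting point: from θ = (0, 0) it still reaches the global minimum.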

Thanks for reading.