 ### Hendra Bunyamin

Forgiven sinner and Lecturer at Maranatha Christian University

### Deriving Normal Equation of Linear Regression Model

The cost function was explained in Week 1 and Week 2 of the Machine Learning course taught by Andrew Ng. This post explains how to derive the normal equation for linear regression with multiple variables. Readers should ideally have studied Week 1 and Week 2 before reading this post.

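The cost function of linear regression with multiple variables, $J(\theta)$, is formulated as follows:

$$
\begin{equation}
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right)^2
\label{eq:cost-function}
\end{equation}
$$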

where $m$ is the number of instances in the dataset, $h_{\theta}(x^{(i)})$ is our hypothesis, also known as the prediction model, for the $i$th instance, and $y^{(i)}$ is the true value for the $i$th instance.

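We have also studied that, with $n$ features and the convention $x_0 = 1$,

$$
\begin{equation}
h_{\theta}(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n
\label{eq:the-hyphotesis}
\end{equation}
$$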

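By substituting \eqref{eq:the-hyphotesis} into \eqref{eq:cost-function}, we obtain

$$
\begin{equation}
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right)^2
\label{eq:derivation-5}
\end{equation}
$$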

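By defining the design matrix, whose $i$th row is the transposed feature vector of the $i$th instance,

$$
X = \begin{pmatrix} (x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T \end{pmatrix}
$$

and the vector of true values

$$
y = \begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{pmatrix}
$$

also the parameter vector

$$
\theta = \begin{pmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}
$$

Equation \eqref{eq:derivation-5} becomes

$$
\begin{equation}
J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)
= \frac{1}{2m} \Big( \underbrace{\theta^T X^T X \theta}_{\text{Part I}} - 2 \underbrace{(X^T y)^T \theta}_{\text{Part II}} + y^T y \Big)
\label{eq:derivation-10}
\end{equation}
$$

The two cross terms $\theta^T X^T y$ and $y^T X \theta$ are scalars and transposes of each other, so they are equal and combine into $2 (X^T y)^T \theta$.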
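As a quick sanity check, the summation form and the matrix form $\frac{1}{2m}(X\theta - y)^T (X\theta - y)$ of the cost can be compared numerically. The sketch below uses NumPy; the function names `cost_sum` and `cost_matrix` and the small dataset are made up for illustration and are not part of the course material.

```python
import numpy as np

def cost_sum(X, y, theta):
    """Cost as an explicit sum over the m instances."""
    m = X.shape[0]
    total = 0.0
    for i in range(m):
        prediction = theta @ X[i]            # theta^T x^(i)
        total += (prediction - y[i]) ** 2
    return total / (2 * m)

def cost_matrix(X, y, theta):
    """Cost in matrix form: (X theta - y)^T (X theta - y) / (2m)."""
    m = X.shape[0]
    residual = X @ theta - y
    return (residual @ residual) / (2 * m)

# Made-up data: 4 instances, one feature, plus the intercept column x_0 = 1.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 3.0, 5.0, 4.0])
theta = np.array([0.5, 1.0])

# Both forms should print the same value (0.375 for this data).
print(cost_sum(X, y, theta), cost_matrix(X, y, theta))
```

The matrix form avoids the Python loop, which is one reason it tends to be preferred in practice.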

We have arrived at a matrix form of the linear regression cost function. Our next step is:

How can we minimize the cost function in Equation \eqref{eq:derivation-10}?

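We will employ differentiation formulas from Matrix Calculus; specifically, we use two scalar-by-vector identities with denominator layout (result: column vector). The identities are as follows. For a symmetric matrix $A$ that does not depend on $x$,

$$
\begin{equation}
\frac{\partial \, x^T A x}{\partial x} = 2 A x
\label{eq:identity-1}
\end{equation}
$$

and, for a vector $a$ that does not depend on $x$,

$$
\begin{equation}
\frac{\partial \, a^T x}{\partial x} = a
\label{eq:identity-2}
\end{equation}
$$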
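The two standard denominator-layout identities, $\partial (x^T A x) / \partial x = 2Ax$ for symmetric $A$ and $\partial (a^T x) / \partial x = a$, can be checked numerically with finite differences. This is only a sketch: the helper `numerical_gradient` and the random test values are assumptions introduced here for illustration.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        grad[k] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))
A = B + B.T                  # B + B^T is always symmetric
a = rng.normal(size=3)
x = rng.normal(size=3)

quad_numeric = numerical_gradient(lambda v: v @ A @ v, x)   # d(x^T A x)/dx
lin_numeric = numerical_gradient(lambda v: a @ v, x)        # d(a^T x)/dx

# Compare against the closed-form results 2Ax and a.
print(np.allclose(quad_numeric, 2 * A @ x, atol=1e-5))
print(np.allclose(lin_numeric, a, atol=1e-5))
```

The symmetry assumption matters for the first identity: for a general square $A$ the gradient is $(A + A^T)x$, which reduces to $2Ax$ only when $A = A^T$.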

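Now equipped with these identities, let us minimize Equation \eqref{eq:derivation-10} by computing the first derivative of $J(\theta)$. Part I, the quadratic term $\theta^T X^T X \theta$, is differentiated with Equation \eqref{eq:identity-1}, which applies because $X^T X$ is symmetric. Part II, the linear term $(X^T y)^T \theta$, is differentiated with Equation \eqref{eq:identity-2}. The term $y^T y$ does not depend on $\theta$, so its derivative is zero:

$$
\begin{equation}
\frac{\partial J(\theta)}{\partial \theta}
= \frac{1}{2m} \left( 2 X^T X \theta - 2 X^T y \right)
= \frac{1}{m} \left( X^T X \theta - X^T y \right)
\end{equation}
$$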

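In order to find the $\theta$ which minimizes Equation \eqref{eq:derivation-10}, we need to solve

$$
\begin{equation}
\frac{\partial J(\theta)}{\partial \theta} = 0
\quad \Longleftrightarrow \quad
X^T X \theta = X^T y
\end{equation}
$$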

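At last, assuming $X^T X$ is invertible, we have derived the normal equation of the linear regression model, that is

$$
\begin{equation}
\theta = (X^T X)^{-1} X^T y
\end{equation}
$$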
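To see the normal equation $\theta = (X^T X)^{-1} X^T y$ at work, here is a minimal NumPy sketch. The function name `normal_equation` and the dataset are made up for illustration. The sketch solves the linear system $X^T X \theta = X^T y$ with `np.linalg.solve` instead of forming the inverse explicitly, which is a common numerical preference rather than something the derivation requires, and it assumes $X^T X$ is invertible.

```python
import numpy as np

def normal_equation(X, y):
    """Solve X^T X theta = X^T y for theta (assumes X^T X is invertible)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up data: 4 instances, one feature, plus the intercept column x_0 = 1.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 3.0, 5.0, 4.0])

theta = normal_equation(X, y)

# The gradient (X^T X theta - X^T y) / m should vanish at the solution.
m = X.shape[0]
gradient = (X.T @ X @ theta - X.T @ y) / m

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta)                               # approximately [1.5, 0.8]
print(np.allclose(gradient, 0.0))
print(np.allclose(theta, theta_lstsq))
```

When $X^T X$ is singular, for example because of duplicated or linearly dependent features, `np.linalg.solve` raises an error; `np.linalg.lstsq` or `np.linalg.pinv` would be the usual fallback in that case.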

Written on August 18, 2019