In linear regression with one variable, we have a *design matrix* ($X$) that represents our dataset; its shape is as follows:

\begin{equation} X = \begin{bmatrix} x^{(1)} \\ x^{(2)} \\ \vdots \\ x^{(m)} \end{bmatrix} \label{eq:x-dataset} \end{equation}
We say that the *design matrix* ($X$) in Equation \eqref{eq:x-dataset} has $m$ *training examples* and 1 *feature*, $x$.

The **logistic regression** model will be trained on $X$. Readers who are not familiar with logistic regression can study the model in Week 4 of Andrew Ng's *Machine Learning* course. The **logistic regression** model is defined as follows:

\begin{equation} h_{\theta}(x) = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x)}} \end{equation}
Furthermore, the **logistic regression** model has a *cost function* $J(\theta)$,

\begin{equation} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}\left( h_{\theta}(x^{(i)}), y^{(i)} \right) \label{eq:cost-function} \end{equation}

with

\begin{equation} \mathrm{Cost}\left( h_{\theta}(x^{(i)}), y^{(i)} \right) = -y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) - (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \label{eq:cost-logistic} \end{equation}
and $x^{(i)}$ is the $i$th *training example*, while $y^{(i)}$ is the label or class of training example $x^{(i)}$.
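As a quick illustration (a sketch, not part of the original derivation), the hypothesis and cost defined above can be evaluated numerically; the dataset and parameter values below are made up for the example:

```python
import math

def h(theta0, theta1, x):
    """Logistic regression hypothesis for one feature x."""
    return 1.0 / (1.0 + math.exp(-(theta0 + theta1 * x)))

def cost(theta0, theta1, xs, ys):
    """Cross-entropy cost J(theta) averaged over the m training examples."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        p = h(theta0, theta1, x)
        total += -y * math.log(p) - (1 - y) * math.log(1 - p)
    return total / m

# Toy dataset (made up for illustration): one feature, binary labels.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
print(cost(0.0, 1.0, xs, ys))
```

With $\theta_0 = \theta_1 = 0$ every prediction is $0.5$ and the cost is $\log 2 \approx 0.693$; parameters that separate the classes give a lower cost.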

This article explains how to calculate the partial derivatives of the **logistic regression** *cost function* with respect to $\theta_0$ and $\theta_1$. These partial derivatives are also called the *gradient*, $\frac{\partial J}{\partial \theta}$.

**The Complete Form of Logistic Regression *Cost Function***

By combining Equations \eqref{eq:cost-function} and \eqref{eq:cost-logistic}, a more detailed *cost function* is obtained as follows:

\begin{equation} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) + (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \right] \end{equation}
Next, $\frac{\partial J}{\partial \theta_0}$ and $\frac{\partial J}{\partial \theta_1}$ will be computed. First, we compute the partial derivative of $h_{\theta}(x)$ with respect to $\theta_0$, that is, $\frac{\partial h_{\theta}}{ \partial \theta_0 }$.

From calculus, the derivative of $\frac{u(x)}{v(x)}$, where $u(x)$ and $v(x)$ are both functions of $x$, is

\begin{equation} \left( \frac{u}{v} \right)^{\prime} = \frac{u^{\prime} v - u v^{\prime}}{v^2} \label{eq:formula-derivatif} \end{equation}
where $u^{\prime}$ and $v^{\prime}$ are the first derivatives of $u$ and $v$, respectively.

We use the formula in Equation \eqref{eq:formula-derivatif} to calculate $\frac{\partial h_{\theta}}{ \partial \theta_0 }$ and $\frac{\partial h_{\theta}}{ \partial \theta_1 }$ as follows:

\begin{equation} \frac{\partial h_{\theta}}{\partial \theta_0} = \frac{0 \cdot \left( 1 + e^{-(\theta_0 + \theta_1 x)} \right) - 1 \cdot \left( -e^{-(\theta_0 + \theta_1 x)} \right)}{\left( 1 + e^{-(\theta_0 + \theta_1 x)} \right)^2} = \frac{e^{-(\theta_0 + \theta_1 x)}}{\left( 1 + e^{-(\theta_0 + \theta_1 x)} \right)^2} = h_{\theta}(x) \left( 1 - h_{\theta}(x) \right) \label{eq:formula-derivatif-theta0} \end{equation}

and

\begin{equation} \frac{\partial h_{\theta}}{\partial \theta_1} = h_{\theta}(x) \left( 1 - h_{\theta}(x) \right) x \label{eq:formula-derivatif-theta1} \end{equation}
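The standard sigmoid identity $\frac{\partial h_{\theta}}{\partial \theta_0} = h_{\theta}(x)(1 - h_{\theta}(x))$ can be checked numerically with a central finite difference; the parameter values below are arbitrary, chosen only for the check:

```python
import math

def h(theta0, theta1, x):
    """Logistic hypothesis for one feature x."""
    return 1.0 / (1.0 + math.exp(-(theta0 + theta1 * x)))

# Arbitrary (made-up) point at which to compare the two derivatives.
theta0, theta1, x = 0.3, -1.2, 0.7
eps = 1e-6

# Central finite difference of h with respect to theta0.
numeric = (h(theta0 + eps, theta1, x) - h(theta0 - eps, theta1, x)) / (2 * eps)

# Analytic form derived above: h * (1 - h).
analytic = h(theta0, theta1, x) * (1 - h(theta0, theta1, x))

print(abs(numeric - analytic))  # the difference should be tiny
```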
**Calculate $\frac{\partial J}{\partial \theta_0}$**

The partial derivative $\frac{\partial J}{\partial \theta_0}$ is calculated as follows:

\begin{equation} \frac{\partial J}{\partial \theta_0} = -\frac{1}{m} \sum_{i=1}^{m} \left[ \underbrace{ \frac{\partial}{\partial \theta_0} \left( y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) \right) }_{\text{Part I}} + \underbrace{ \frac{\partial}{\partial \theta_0} \left( (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \right) }_{\text{Part II}} \right] \label{eq:bagian2-theta0} \end{equation}
Part I of Equation \eqref{eq:bagian2-theta0} is calculated with the *chain rule* together with Equation \eqref{eq:formula-derivatif-theta0}, and becomes

\begin{equation} \frac{\partial}{\partial \theta_0} \left( y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) \right) = y^{(i)} \frac{1}{h_{\theta}(x^{(i)})} h_{\theta}(x^{(i)}) \left( 1 - h_{\theta}(x^{(i)}) \right) = y^{(i)} \left( 1 - h_{\theta}(x^{(i)}) \right) \label{eq:bagian-I-theta0} \end{equation}
Part II of Equation \eqref{eq:bagian2-theta0} is also calculated with the chain rule and Equation \eqref{eq:formula-derivatif-theta0}, and becomes

\begin{equation} \frac{\partial}{\partial \theta_0} \left( (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \right) = (1 - y^{(i)}) \frac{- h_{\theta}(x^{(i)}) \left( 1 - h_{\theta}(x^{(i)}) \right)}{1 - h_{\theta}(x^{(i)})} = -(1 - y^{(i)}) h_{\theta}(x^{(i)}) \label{eq:bagian-II-theta0} \end{equation}
By substituting Equation \eqref{eq:bagian-I-theta0} and Equation \eqref{eq:bagian-II-theta0} into Equation \eqref{eq:bagian2-theta0}, we obtain

\begin{equation} \frac{\partial J}{\partial \theta_0} = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \left( 1 - h_{\theta}(x^{(i)}) \right) - (1 - y^{(i)}) h_{\theta}(x^{(i)}) \right] = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} - h_{\theta}(x^{(i)}) \right] = \frac{1}{m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) \end{equation}
**Calculate $\frac{\partial J}{\partial \theta_1}$**

The partial derivative $\frac{\partial J}{\partial \theta_1}$ can be calculated as follows:

\begin{equation} \frac{\partial J}{\partial \theta_1} = -\frac{1}{m} \sum_{i=1}^{m} \left[ \underbrace{ \frac{\partial}{\partial \theta_1} \left( y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) \right) }_{\text{Part I}} + \underbrace{ \frac{\partial}{\partial \theta_1} \left( (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \right) }_{\text{Part II}} \right] \label{eq:bagian2-theta1} \end{equation}
Part I of Equation \eqref{eq:bagian2-theta1} is calculated by the *chain rule* and Equation \eqref{eq:formula-derivatif-theta1}, and becomes

\begin{equation} \frac{\partial}{\partial \theta_1} \left( y^{(i)} \log\left( h_{\theta}(x^{(i)}) \right) \right) = y^{(i)} \frac{1}{h_{\theta}(x^{(i)})} h_{\theta}(x^{(i)}) \left( 1 - h_{\theta}(x^{(i)}) \right) x^{(i)} = y^{(i)} \left( 1 - h_{\theta}(x^{(i)}) \right) x^{(i)} \label{eq:bagian-I-theta1} \end{equation}
Part II of Equation \eqref{eq:bagian2-theta1} is also calculated with the chain rule and Equation \eqref{eq:formula-derivatif-theta1}, and becomes

\begin{equation} \frac{\partial}{\partial \theta_1} \left( (1 - y^{(i)}) \log\left( 1 - h_{\theta}(x^{(i)}) \right) \right) = (1 - y^{(i)}) \frac{- h_{\theta}(x^{(i)}) \left( 1 - h_{\theta}(x^{(i)}) \right) x^{(i)}}{1 - h_{\theta}(x^{(i)})} = -(1 - y^{(i)}) h_{\theta}(x^{(i)}) x^{(i)} \label{eq:bagian-II-theta1} \end{equation}
Again, by substituting Equation \eqref{eq:bagian-I-theta1} and Equation \eqref{eq:bagian-II-theta1} into Equation \eqref{eq:bagian2-theta1}, we obtain

\begin{equation} \frac{\partial J}{\partial \theta_1} = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} - h_{\theta}(x^{(i)}) \right] x^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) x^{(i)} \end{equation}
Therefore, the *gradient* of the **logistic regression** model with one variable, $x$, and two parameters, $\theta_0$ and $\theta_1$, is

\begin{equation} \frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) \end{equation}

and

\begin{equation} \frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( h_{\theta}(x^{(i)}) - y^{(i)} \right) x^{(i)} \end{equation}
In general, the *gradient* of a **logistic regression** model with $n$ variables, $x_1, x_2, \ldots, x_n$, and $n+1$ parameters, $\theta_0, \theta_1, \ldots, \theta_n$, is

\begin{equation} \frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} ( h_{\theta}(x^{(i)}) - y^{(i)} ) x_{j}^{(i)}. \end{equation}

Additionally, in the case $j=0$, we have $x_0^{(i)} = 1$ for $i = 1, 2, \ldots, m$.
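As a final sanity check (an illustrative sketch, not part of the original article), the gradient formula above can be verified numerically by comparing it with a central finite difference of the cost $J$; the toy dataset and parameter values below are made up:

```python
import math

def h(theta, x):
    """Hypothesis with theta = (theta0, theta1) and one feature x."""
    return 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x)))

def J(theta, xs, ys):
    """Cross-entropy cost averaged over the m training examples."""
    m = len(xs)
    return sum(-y * math.log(h(theta, x)) - (1 - y) * math.log(1 - h(theta, x))
               for x, y in zip(xs, ys)) / m

def gradient(theta, xs, ys):
    """Analytic gradient: (1/m) * sum (h - y) * x_j, with x_0 = 1."""
    m = len(xs)
    errs = [h(theta, x) - y for x, y in zip(xs, ys)]
    return [sum(errs) / m, sum(e * x for e, x in zip(errs, xs)) / m]

# Made-up toy dataset and parameters.
xs = [-2.0, -0.5, 0.5, 2.0]
ys = [0, 0, 1, 1]
theta = [0.1, 0.4]

analytic = gradient(theta, xs, ys)

# Central finite difference of J with respect to each parameter.
eps = 1e-6
numeric = []
for j in range(2):
    tp = list(theta); tp[j] += eps
    tm = list(theta); tm[j] -= eps
    numeric.append((J(tp, xs, ys) - J(tm, xs, ys)) / (2 * eps))

print(analytic, numeric)  # the two gradients should agree closely
```

This kind of gradient check is a common way to catch sign or indexing mistakes before using the gradient in gradient descent.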