Enhanced Gradient Descent for Linear Regression

import numpy as np

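def gradient_descent(X, y, theta, alpha, num_iters):
    m = len(y)  # number of training examples
    for i in range(num_iters):
        h = np.dot(X, theta)  # predictions for the current theta
        loss = h - y  # residuals
        gradient = np.dot(X.T, loss) / m  # gradient of the cost with respect to theta
        theta = theta - alpha * gradient  # step against the gradient

    return theta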


Code Analysis

The code above implements batch gradient descent for linear regression. On each iteration it computes the predictions, the residuals, and the gradient of the cost function, then moves theta a step of size alpha against that gradient to reduce the cost.
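
For reference, the update performed by the loop can be written out explicitly. This is a sketch that assumes the cost is the halved mean squared error, which matches the `np.sum(loss ** 2) / (2 * m)` expression used in the enhanced version; here m is the number of training examples and X is the design matrix.

```latex
J(\theta) = \frac{1}{2m} \lVert X\theta - y \rVert^2, \qquad
\nabla_\theta J(\theta) = \frac{1}{m} X^\top (X\theta - y), \qquad
\theta \leftarrow \theta - \alpha \, \nabla_\theta J(\theta)
```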


Identified Issues

  1. The function does not define initial theta values; it depends on the caller to supply them.
  2. It has no convergence criterion, so it always runs all num_iters iterations even after the algorithm has converged.
  3. The learning rate (alpha) is fixed and might not be optimal for all datasets.
  4. It does not record cost function values, so convergence cannot be verified.


Suggested Improvements

  1. Pass the initial theta values in as an explicit function parameter.
  2. Add a convergence criterion that stops iterating once updates fall below a tolerance, measured on the cost function or on theta itself.
  3. Implement a dynamic learning rate adjustment for better convergence.
  4. Track and store cost function values at each iteration for analysis.
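
Enhanced Code

import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters, tol=1e-6):
    m = len(y)  # number of training examples
    cost_history = []  # cost measured before each update
    for i in range(num_iters):
        h = np.dot(X, theta)  # predictions for the current theta
        loss = h - y  # residuals
        gradient = np.dot(X.T, loss) / m
        prev_theta = theta.copy()
        theta = theta - alpha * gradient
        cost = np.sum(loss ** 2) / (2 * m)  # halved mean squared error
        cost_history.append(cost)
        # Stop once the update to theta is smaller than the tolerance
        if np.linalg.norm(theta - prev_theta) < tol:
            break

    return theta, cost_history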
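
The function above still uses a fixed alpha, so the dynamic learning rate suggestion is not covered by it. One possible way to sketch that idea is shown below: a step that raises the cost is rejected and alpha is halved, while an accepted step lets alpha grow slightly. The function name `gradient_descent_adaptive`, the helper `compute_cost`, and the `shrink` and `grow` values are illustrative assumptions, not part of the original code, and other schedules would work as well.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Halved mean squared error for the given theta."""
    m = len(y)
    loss = np.dot(X, theta) - y
    return np.sum(loss ** 2) / (2 * m)

def gradient_descent_adaptive(X, y, theta, alpha, num_iters, tol=1e-6,
                              shrink=0.5, grow=1.05):
    # Hypothetical variant: shrink and grow are illustrative choices
    m = len(y)
    cost = compute_cost(X, y, theta)
    cost_history = [cost]
    for i in range(num_iters):
        gradient = np.dot(X.T, np.dot(X, theta) - y) / m
        candidate = theta - alpha * gradient
        candidate_cost = compute_cost(X, y, candidate)
        if candidate_cost > cost:
            # The step overshot: reject it and retry with a smaller alpha
            alpha *= shrink
            continue
        step_size = np.linalg.norm(candidate - theta)
        theta, cost = candidate, candidate_cost
        cost_history.append(cost)
        if step_size < tol:
            break
        alpha *= grow  # accepted step: try a slightly larger alpha next time
    return theta, cost_history, alpha

# Usage with a deliberately oversized starting learning rate
rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]

theta_adaptive, cost_history, final_alpha = gradient_descent_adaptive(
    X_b, y, np.zeros((2, 1)), alpha=5.0, num_iters=5000)
print("Learned theta values:", theta_adaptive.ravel())
print("Final alpha:", final_alpha)
```

Because a step is only accepted when it does not increase the cost, the recorded cost history never goes up, at the price of one extra cost evaluation per iteration.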

Code Usage Example

import numpy as np

# Generate sample data
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

# Initialize theta values
theta_initial = np.random.randn(2, 1)

# Run gradient descent
learned_theta, cost_history = gradient_descent(X_b, y, theta_initial, 0.1, 1000)

print("Learned theta values:", learned_theta)
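
One way to check the result is to compare it against the closed-form least-squares solution and to confirm that the recorded cost never increases. The sketch below repeats a working version of the enhanced function so that it runs on its own; the fixed seed and the use of `np.linalg.lstsq` as the reference are assumptions added for this check, not part of the original example.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters, tol=1e-6):
    # Enhanced gradient descent, repeated here so the snippet is self-contained
    m = len(y)
    cost_history = []
    for i in range(num_iters):
        loss = np.dot(X, theta) - y
        gradient = np.dot(X.T, loss) / m
        prev_theta = theta.copy()
        theta = theta - alpha * gradient
        cost_history.append(np.sum(loss ** 2) / (2 * m))
        if np.linalg.norm(theta - prev_theta) < tol:
            break
    return theta, cost_history

# Same kind of sample data as the usage example, with a fixed seed
rng = np.random.default_rng(42)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]

learned_theta, cost_history = gradient_descent(
    X_b, y, rng.standard_normal((2, 1)), 0.1, 1000)

# Closed-form least-squares solution used as the reference
theta_exact, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print("Iterations run:", len(cost_history))
print("Max difference from least squares:",
      np.max(np.abs(learned_theta - theta_exact)))
print("Cost never increased:", bool(np.all(np.diff(cost_history) <= 0)))
```

A small difference from the least-squares solution together with a non-increasing cost history suggests that alpha and tol are reasonable for this dataset.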

