## Chunking the Algorithm

1. Randomly initialize parameters for the hypothesis function
2. Apply the logistic function to the linear hypothesis function
3. Calculate the partial derivatives (Saket Thavanani wrote a good post on this titled *The derivative of Cost function for Logistic Regression*)
4. Update the parameters
5. Repeat steps 2–4 for *n* iterations (or until the cost function is minimized)
6. Inference
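For reference, these are the quantities the steps above manipulate, written out explicitly (they match the `sigmoid` and loss computations in the code further down):

$$h(x) = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}}$$

$$J(w, b) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h\big(x^{(i)}\big) + \big(1 - y^{(i)}\big)\log\big(1 - h\big(x^{(i)}\big)\big)\Big]$$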

**Implementation**

For this section I leverage three Python libraries: NumPy for linear algebra, Pandas for data manipulation, and Scikit-Learn for machine learning tools.

```python
import numpy as np
import pandas as pd

from sklearn.metrics import accuracy_score
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
```

First, we need a dataset. I use `sklearn.datasets.load_breast_cancer`, which is a classic binary classification dataset (see the scikit-learn documentation).

```python
# loading the dataset
dataset = load_breast_cancer(as_frame=True)
df = pd.DataFrame(data=dataset.data)
df["target"] = dataset.target
df.head()
```

Next, we split the predictors from the response variable, then create a training and test set.

```python
# separating X and y
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# splitting into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, shuffle=True, random_state=24
)
```
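If you want to confirm the split sizes (my assumption here follows from `train_size=0.75` on 569 rows, which scikit-learn resolves to 426 training and 143 test instances):

```python
# 75/25 split of the 569 instances
print(X_train.shape, X_test.shape)  # expected: (426, 30) (143, 30)
```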

Much of the work we did to build Linear Regression from scratch (see the link below) can be borrowed, with a few slight changes, to adjust our model for classification using Logistic Regression.

defparam_init(X):"""Initialize parameters__________________Input(s)X: Training data__________________Output(s)params: Dictionary containing coefficients"""

params = {}# initialize dictionary

_, n_features = X.shape# shape of training data

# initializing coefficents to 0

params["W"] = np.zeros(n_features)

params["b"] = 0returnparamsdefget_z(X, W, b):"""Calculates Linear Function__________________Input(s)X: Training dataW: Weight coefficientsb: bias coefficients__________________Output(s)z: a Linear function"""

z = np.dot(X, W) + breturnzdefsigmoid(z):"""Logit model_________________Input(s)z: Linear model_________________Output(s)g: Logit function applied to linear model"""

g = 1 / (1 + np.exp(-z))returngdefgradient_descent(X, y, params, lr, n_iter):"""Gradient descent to minimize cost function__________________Input(s)X: Training datay: Labelsparams: Dictionary contatining coefficientslr: learning rate__________________Output(s)params: Dictionary containing optimized coefficients"""

W = params["W"]

b = params["b"]

m = X.shape[0]# number of training instances

for_inrange(n_iter):# prediction with random weights

g = sigmoid(get_z(X, W, b))# calculate the loss

loss = -1/m * np.sum(y * np.log(g)) + (1 - y) * np.log(1-g)# partial derivative of weights

dW = 1/m * np.dot(X.T, (g - y))

db = 1/m * np.sum(g - y)# updates to coefficients

W -= lr * dW

b -= lr * dbparams["W"] = W

params["b"] = breturnparamsdeftrain(X, y, lr=0.01, n_iter=1000):"""Train Linear Regression model with Gradient decent__________________Input(s)X: Training datay: Labelslr: learning raten_iter: Number of iterations__________________Output(s)params: Dictionary containing optimized coefficients"""

init_params = param_init(X)

params = gradient_descent(X, y, init_params, lr, n_iter)returnparamsdefpredict(X_test, params):"""Train Linear Regression model with Gradient decent__________________Input(s)X: Unseen dataparams: Dictionary contianing optimized weights from training__________________Output(s)prediction of model"""

z = np.dot(X_test, params["W"]) + params["b"]

y_pred = sigmoid(z) >= 0.5returny_pred.astype("int")
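As a quick spot check that the helpers behave as expected (the values below are standard properties of the sigmoid, not outputs from the article):

```python
# sigmoid is 0.5 at z = 0 and saturates toward 0 and 1 at the extremes
print(sigmoid(0))                    # 0.5
print(sigmoid(np.array([-10, 10])))  # ≈ [4.54e-05, 0.9999546]
```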

The notable differences are that we now apply the sigmoid function to our linear model; at inference time we classify every output of 0.5 or greater from the sigmoid as class 1 (class 0 otherwise); and we use a different cost function for our classification model, since MSE would make the loss function non-convex. To learn more about the cost function used, you should definitely read *The derivative of Cost function for Logistic Regression*.
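For reference, the partial derivatives implemented as `dW` and `db` above are the standard gradients of that cross-entropy cost:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X^\top (g - y), \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\big(g^{(i)} - y^{(i)}\big)$$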

```python
params = train(X_train, y_train)  # train model
y_pred = predict(X_test, params)  # inference

lr = LogisticRegression(C=0.01)
lr.fit(X_train, y_train)
sklearn_y_pred = lr.predict(X_test)

print(f"My Implementation: {accuracy_score(y_test, y_pred)}\n"
      f"Sklearn Implementation: {accuracy_score(y_test, sklearn_y_pred)}")
```

```
My Implementation: 0.9300699300699301
Sklearn Implementation: 0.9300699300699301
```

Great, we obtain the same accuracy as the Scikit-Learn implementation.
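One detail worth flagging: scikit-learn's `LogisticRegression` applies L2 regularization by default (and `C=0.01` makes that penalty fairly strong), whereas our from-scratch model is unregularized; the accuracies happen to match here anyway. For a closer like-for-like comparison, one option is to make the penalty negligible with a very large `C`. The snippet below is a sketch of that idea, not part of the original comparison:

```python
# a very large C makes the L2 penalty negligible, approximating our
# unregularized from-scratch model; max_iter is raised to help convergence
lr_unreg = LogisticRegression(C=1e10, max_iter=10000)
lr_unreg.fit(X_train, y_train)
print(accuracy_score(y_test, lr_unreg.predict(X_test)))
```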

Now, we will repeat this with object-oriented programming (OOP), which is considered much better for collaboration.

```python
class LogReg():
    """
    Custom-made Logistic Regression class
    """
    def __init__(self, lr=0.01, n_iter=1000):
        self.lr = lr
        self.n_iter = n_iter
        self.params = {}

    def param_init(self, X_train):
        """
        Initialize parameters
        __________________
        Input(s)
        X_train: Training data
        """
        _, n_features = X_train.shape  # shape of training data

        # initializing coefficients to 0
        self.params["W"] = np.zeros(n_features)
        self.params["b"] = 0
        return self

    def get_z(self, X, W, b):
        """
        Calculates Linear Function
        __________________
        Input(s)
        X: Training data
        W: Weight coefficients
        b: bias coefficient
        __________________
        Output(s)
        z: a Linear function
        """
        z = np.dot(X, W) + b
        return z

    def sigmoid(self, z):
        """
        Logit model
        _________________
        Input(s)
        z: Linear model
        _________________
        Output(s)
        g: Logit function applied to linear model
        """
        g = 1 / (1 + np.exp(-z))
        return g

    def gradient_descent(self, X_train, y_train):
        """
        Gradient descent to minimize cost function
        __________________
        Input(s)
        X_train: Training data
        y_train: Labels
        __________________
        Output(s)
        self: with optimized coefficients stored in self.params
        """
        W = self.params["W"]
        b = self.params["b"]
        m = X_train.shape[0]

        for _ in range(self.n_iter):
            # prediction with current weights
            g = self.sigmoid(self.get_z(X_train, W, b))

            # calculate the loss (cross-entropy)
            loss = -1/m * np.sum(y_train * np.log(g) + (1 - y_train) * np.log(1 - g))

            # partial derivatives of the cost w.r.t. weights and bias
            dW = 1/m * np.dot(X_train.T, (g - y_train))
            db = 1/m * np.sum(g - y_train)

            # updates to coefficients
            W -= self.lr * dW
            b -= self.lr * db

        self.params["W"] = W
        self.params["b"] = b
        return self

    def train(self, X_train, y_train):
        """
        Train model with Gradient descent
        __________________
        Input(s)
        X_train: Training data
        y_train: Labels
        __________________
        Output(s)
        self: the fitted model
        """
        self.param_init(X_train)
        self.gradient_descent(X_train, y_train)
        return self

    def predict(self, X_test):
        """
        Inference
        __________________
        Input(s)
        X_test: Unseen data
        __________________
        Output(s)
        y_preds: Predictions of the model
        """
        g = self.sigmoid(np.dot(X_test, self.params["W"]) + self.params["b"])
        y_preds = (g >= 0.5).astype("int")
        return y_preds
```

To check whether we implemented it correctly, we can see if the predictions are the same as those of our procedural implementation, which we already know is approximately equal to Scikit-learn's implementation.

```python
logreg = LogReg()
logreg.train(X_train, y_train)
oop_y_pred = logreg.predict(X_test)

oop_y_pred == y_pred
```

This returns an array that is True for each value.
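Equivalently, you can collapse that element-wise comparison into a single boolean (a small convenience, assuming both prediction arrays are NumPy arrays):

```python
# True only if every OOP prediction matches the procedural prediction
print(np.array_equal(oop_y_pred, y_pred))  # expected: True
```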