Binary classification

Goal:

If elements of two classes need to be distinguished, the task is called binary classification. Logistic regression is a classic model for this task. The output is a number between 0 and 1: the closer it is to either extreme, the more confident the model is in its decision. If the output is very close to 0.5, the model is not confident.
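As a minimal sketch of how such an output can be read, the snippet below maps a value between 0 and 1 to a class index and a confidence. The helper name `interpret` and the 0.5 threshold are assumptions chosen for illustration, not part of any library.

```python
def interpret(p, threshold=0.5):
    """Map a model output p in [0, 1] to a class index and a confidence.

    Hypothetical helper: the 0.5 threshold is an assumed default.
    """
    predicted = 1 if p > threshold else 0
    # Confidence is the distance-based score of the chosen class
    confidence = p if predicted == 1 else 1.0 - p
    return predicted, confidence

for p in (0.02, 0.48, 0.97):
    label, conf = interpret(p)
    print(f"output={p:.2f} -> class {label}, confidence {conf:.2f}")
```

An output of 0.48 still yields a class, but its confidence is only slightly above one half, which is the "not confident" case described above.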

CNN: Convolution, Max pooling

Convolutional neural networks, as the name implies, contain convolutional layers, each defined by a kernel size and a number of kernels. The stride parameter can also be specified; it is the step size with which the kernel moves during convolution. The convolutional layers are usually followed by max pooling layers. A 2x2 max pooling layer halves the width and height of its input by keeping only the largest of the 4 elements in each 2x2 block of the image, so the output holds a quarter as many values. Smaller data is cheaper to process.
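The effect of stride and 2x2 max pooling on the data size can be sketched with NumPy. The helper names `conv_output_size` and `max_pool_2x2` are hypothetical, and the sketch assumes a convolution without padding and an input with even height and width.

```python
import numpy as np

def conv_output_size(n, kernel, stride=1):
    """Output length along one axis for a convolution without padding."""
    return (n - kernel) // stride + 1

def max_pool_2x2(image):
    """2x2 max pooling (stride 2) on a 2D array with even height and width."""
    h, w = image.shape
    # Split the image into 2x2 blocks, then take the maximum of each block
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

print(conv_output_size(28, 3))            # 26
print(conv_output_size(28, 3, stride=2))  # 13

image = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
])
print(max_pool_2x2(image))
# [[4 5]
#  [6 7]]
```

The 4x4 input becomes a 2x2 output: each side is halved and only the largest value of every block survives.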

Batch size

Most of the time it is not possible to feed the entire training dataset to the algorithm in one go, due to the large size and memory limitations of computers. The solution is to divide the data into small groups, called batches. The batch size parameter is the number of samples in each of these small groups.
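Splitting a dataset into batches can be illustrated in a few lines. This is only a sketch: `iterate_batches` is a hypothetical helper, and the 10-sample array stands in for a real dataset.

```python
import numpy as np

def iterate_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = np.arange(10)  # Stand-in for 10 training samples
batches = list(iterate_batches(samples, batch_size=4))
for batch in batches:
    print(batch)
# [0 1 2 3]
# [4 5 6 7]
# [8 9]
```

The last batch is smaller whenever the dataset size is not a multiple of the batch size.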

Epoch

Training is an iterative process. In an epoch, the number of iterations is equal to the number of batches that together make up the entire dataset. The entire dataset passes through the network in each epoch, and the weights, biases and convolution kernels are tuned along the way. Ideally, the accuracy increases with each epoch.
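The relationship between dataset size, batch size and iterations can be checked with a short calculation. The figure of 10,000 training samples is an assumption made for the example; the batch size of 128 and the 10 epochs match the training script below.

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover the dataset once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

num_samples = 10_000  # Assumed dataset size, for illustration only
batch_size = 128
epochs = 10

per_epoch = iterations_per_epoch(num_samples, batch_size)
print(per_epoch)           # 79
print(per_epoch * epochs)  # 790 iterations over the whole training run
```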

Help

Install Keras: https://keras.io/getting_started

Install TensorFlow: https://www.tensorflow.org/install

MNIST images: Download

Source code: Training the model

        
from keras import utils, layers, Sequential

# Load and preprocess images
data_dir = "path-to-resources/mnist-binary/train"
train_data, val_data = utils.image_dataset_from_directory(
    data_dir,
    labels = "inferred",
    label_mode = "binary",
    batch_size = 128, # Default would be 32
    image_size = (28, 28),
    color_mode = "grayscale",
    subset = "both",
    validation_split = 0.2,  # 20% of data used for validation
    seed = 123  # Optional random seed for shuffling and transformations
)

# Normalize the images
train_data = train_data.map(lambda image, label: (image / 255.0, label))
val_data = val_data.map(lambda image, label: (image / 255.0, label))

# Build the CNN model
model = Sequential([
        layers.Input(shape = (28, 28, 1)),
        layers.Conv2D(32, kernel_size = (3, 3), activation = "relu"),
        layers.MaxPooling2D(pool_size = (2, 2)),
        layers.Conv2D(64, kernel_size = (3, 3), activation = "relu"),
        layers.MaxPooling2D(pool_size = (2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
])
model.compile(loss = "binary_crossentropy", optimizer = "adam", metrics = ["accuracy"])
model.summary()

# Train the model for 10 epochs
model.fit(train_data, epochs = 10, validation_data = val_data)

# Evaluate the trained model on the validation set
score = model.evaluate(val_data, verbose=0)
print("Validation loss:", score[0])
print("Validation accuracy:", score[1])

# Save the model in a '.keras' file
model.save('mnist_binary_classifier.keras')        
      

Source code: Testing the model

        
import os
from keras import models
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Load the saved model
model = models.load_model('mnist_binary_classifier.keras')

test_folder = 'path-to-resources/mnist-binary/test'

# Define classes
class_1 = 7 # Would also work as "Seven"
class_2 = 8 # Would also work as "Eight"

# Function to load and preprocess an image
def preprocess_image(img_path):
    img = Image.open(img_path).convert("L")
    img = img.resize((28, 28))
    img_data = np.array(img, dtype=np.float32) / 255.0  # Normalize pixel values to [0, 1]
    img_data = np.expand_dims(img_data, axis=(0, -1))  # Add batch and channel axes: (1, 28, 28, 1)
    return img, img_data

# Get a sorted list of all .jpg image files in the folder
test_image_paths = sorted(
    os.path.join(test_folder, fname)
    for fname in os.listdir(test_folder)
    if fname.endswith('.jpg')
)

# Loop through each image, preprocess, predict, and display the result
plt.figure(figsize=(10, 5))
for i, img_path in enumerate(test_image_paths[:10]): 
    test_image, test_image_prep = preprocess_image(img_path)
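    prediction = model.predict(test_image_prep, verbose=0)[0][0]
    # Training labels are inferred from the alphanumerically sorted folder names,
    # so this assumes the folder of class_1 maps to 0 and the folder of class_2 to 1
    predicted_class = class_2 if prediction > 0.5 else class_1
    confidence = prediction if prediction > 0.5 else 1 - prediction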
    plt.subplot(2, 5, i + 1)
    plt.imshow(test_image, cmap="gray")
    plt.axis('off')
    plt.title(f"Predicted: {predicted_class}\n{confidence:.8f}")
plt.show()