When the elements of two classes need to be distinguished, the task is called binary classification. Logistic regression is the classic method for it, and the term is sometimes used for the task itself. The output is a number between 0 and 1: the closer it is to either extreme, the more confident the model is in its decision. If the output is very close to 0.5, the model is not confident.
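As a minimal sketch of how such an output could be read, assuming the model ends in a single sigmoid unit and a decision threshold of 0.5. The helper names `sigmoid` and `interpret` are illustrative only and not part of any library:

```python
import math

def sigmoid(z: float) -> float:
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def interpret(p: float, threshold: float = 0.5) -> tuple[int, float]:
    """Return (predicted class, confidence) for a sigmoid output p."""
    predicted_class = 1 if p > threshold else 0
    # Confidence is the distance-based view of p: p itself for class 1,
    # and 1 - p for class 0.
    confidence = p if predicted_class == 1 else 1.0 - p
    return predicted_class, confidence

for score in (-4.0, 0.1, 4.0):
    p = sigmoid(score)
    print(score, round(p, 3), interpret(p))
```

Scores far from zero land near 0 or 1 and give high confidence, while a score near zero lands near 0.5 and gives a confidence barely above one half.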
Convolutional neural networks, as the name implies, contain convolutional layers, each defined by the size and number of its kernels. A stride parameter can also be specified, which is the step size of the kernel during convolution. Convolutional layers are usually followed by max pooling layers. A 2x2 max pooling layer halves the width and height of its input by keeping only the largest of the 4 elements in each 2x2 block of the image, so the output holds a quarter as many values. Smaller data is easier to work with.
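The pooling step can be illustrated with a small pure-Python sketch. It assumes a single-channel image stored as a list of rows with even width and height, and the function name `max_pool_2x2` is made up for this example:

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a list-of-lists image (even dimensions)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            # Keep the largest of the four values in this 2x2 block
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 3],
]
pooled = max_pool_2x2(image)
print(pooled)  # [[4, 5], [6, 7]]
```

The 4x4 input becomes a 2x2 output: each side is halved and 16 values shrink to 4.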
It is usually not possible to feed the entire training dataset to the algorithm in one go, because the dataset is large and computer memory is limited. The solution is to divide the data into small groups, called batches. The batch size parameter is the number of samples in each of these groups.
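A minimal sketch of batching, assuming the samples fit in a plain Python list. The generator name `batches` is hypothetical, and real data pipelines usually also shuffle and load lazily:

```python
def batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

data = list(range(10))
for batch in batches(data, 4):
    print(batch)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```

Note that the last batch can be smaller than the batch size when the dataset does not divide evenly.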
Training is an iterative process. In one epoch, the number of iterations equals the number of batches needed to cover the entire dataset. The entire dataset flows through the network in each epoch, and the weights, biases and convolution kernels are tuned along the way. Ideally, the accuracy increases with each epoch.
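The relationship between dataset size, batch size and iterations can be worked out directly. The dataset size of 10,000 below is a made-up figure for illustration, and the sketch assumes a final partial batch still counts as one iteration:

```python
import math

def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover the dataset once."""
    return math.ceil(num_samples / batch_size)

num_samples = 10_000  # hypothetical dataset size
batch_size = 128
epochs = 10

per_epoch = iterations_per_epoch(num_samples, batch_size)
print(per_epoch)           # 79
print(per_epoch * epochs)  # 790
```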
Install Keras: https://keras.io/getting_started
Install TensorFlow: https://www.tensorflow.org/install
MNIST images: Download
from keras import utils, layers, Sequential
# Load and preprocess images
data_dir = "path-to-resources/mnist-binary/train"
train_data, val_data = utils.image_dataset_from_directory(
    data_dir,
    labels = "inferred",
    label_mode = "binary",
    batch_size = 128, # Default would be 32
    image_size = (28, 28),
    color_mode = "grayscale",
    subset = "both",
    validation_split = 0.2, # 20% of data used for validation
    seed = 123 # Random seed for shuffling; keeps the train/validation split reproducible
)
# Normalize the images
train_data = train_data.map(lambda image, label: (image / 255.0, label))
val_data = val_data.map(lambda image, label: (image / 255.0, label))
# Build the CNN model
model = Sequential([
    layers.Input(shape = (28, 28, 1)),
    layers.Conv2D(32, kernel_size = (3, 3), activation = "relu"),
    layers.MaxPooling2D(pool_size = (2, 2)),
    layers.Conv2D(64, kernel_size = (3, 3), activation = "relu"),
    layers.MaxPooling2D(pool_size = (2, 2)),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss = "binary_crossentropy", optimizer = "adam", metrics = ["accuracy"])
model.summary()
# Train the model for 10 epochs
model.fit(train_data, epochs = 10, validation_data = val_data)
# Evaluate the trained model on the validation set
score = model.evaluate(val_data, verbose=0)
print("Validation loss:", score[0])
print("Validation accuracy:", score[1])
# Save the model in a '.keras' file
model.save('mnist_binary_classifier.keras')
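To see how the image shrinks as it passes through the model above, the layer output sizes can be traced by hand. This is a sketch that assumes the Keras defaults used in the script: no padding, a stride of 1 for `Conv2D`, and a pooling stride equal to the pool size. The helper names `conv_out` and `pool_out` are illustrative only:

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded convolution along one axis."""
    return (size - kernel) // stride + 1

def pool_out(size: int, pool: int = 2) -> int:
    """Output length of unpadded max pooling with stride equal to the pool size."""
    return size // pool

size = 28                  # input images are 28x28
size = conv_out(size, 3)   # 26 after the first 3x3 convolution
size = pool_out(size)      # 13 after the first 2x2 pooling
size = conv_out(size, 3)   # 11 after the second 3x3 convolution
size = pool_out(size)      # 5 after the second 2x2 pooling (11 // 2)

flattened = size * size * 64  # 64 kernels in the last convolutional layer
print(size, flattened)        # 5 1600
```

These figures should match the output shapes printed by `model.summary()`.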
import os
from keras import models
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
# Load the saved model
model = models.load_model('mnist_binary_classifier.keras')
test_folder = 'path-to-resources/mnist-binary/test'
# Define classes
class_1 = 7 # Would also work as "Seven"
class_2 = 8 # Would also work as "Eight"
# Function to load and preprocess an image
def preprocess_image(img_path):
    img = Image.open(img_path).convert("L") # Grayscale
    img = img.resize((28, 28))
    img_data = np.array(img, dtype=np.uint8)
    img_data = img_data / 255.0 # Normalize pixel values
    # Add batch and channel axes: (28, 28) -> (1, 28, 28, 1)
    img_data = np.expand_dims(img_data, axis=(0, -1))
    return img, img_data
# Get a list of all image files in the folder
test_image_paths = [
    os.path.join(test_folder, fname)
    for fname in os.listdir(test_folder)
    if fname.endswith('.jpg')
]
# Loop through each image, preprocess, predict, and display the result
plt.figure(figsize=(10, 5))
for i, img_path in enumerate(test_image_paths[:10]):
    test_image, test_image_prep = preprocess_image(img_path)
    prediction = model.predict(test_image_prep)[0][0]
    predicted_class = class_2 if prediction > 0.5 else class_1
    confidence = prediction if prediction > 0.5 else 1 - prediction
    plt.subplot(2, 5, i + 1)
    plt.imshow(test_image, cmap="gray")
    plt.axis('off')
    plt.title(f"Predicted: {predicted_class}\n{confidence:.8f}")
plt.show()