Comprehensive overview of my cybersecurity AI research work
This paper explores machine learning applications in cybersecurity, specifically using the CICIDS2017 dataset to identify patterns in malicious network traffic. It provides an in-depth comparison of CNN, LSTM, Random Forest, and Autoencoder models, focusing on detection accuracy, training time, and false positive rates. The paper also includes an analysis of model interpretability and its implications for real-world deployment in intrusion detection systems.
The final deliverable is a Python-based machine learning pipeline designed to classify and flag potential cyber threats in real time. It includes a streamlined preprocessing module for transforming raw network traffic into structured feature sets, multiple ML model implementations, and visualization of evaluation metrics such as ROC curves and confusion matrices. The system is optimized for rapid detection and supports modular integration into existing cybersecurity frameworks.
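A minimal sketch of what such a preprocessing stage can look like is shown below. The CSV path, the "Label" column, and the 80/20 split are illustrative assumptions rather than the pipeline's exact schema; the sequence models additionally reshape the scaled rows into (samples, timesteps, features) windows, which is omitted here.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load CICIDS2017-style flow records (file name is hypothetical)
df = pd.read_csv("flows.csv")

# Drop rows with missing or infinite values produced by flow extraction
df = df.replace([np.inf, -np.inf], np.nan).dropna()

# Binary target: 0 = benign, 1 = any attack class (column name assumed)
y = (df["Label"] != "BENIGN").astype(int)
X = df.drop(columns=["Label"]).select_dtypes("number")

# Stratified 80/20 split, then scale features using statistics from the training set only
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)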
This year-long project aimed to build an end-to-end system for cyber threat detection by leveraging both supervised and unsupervised learning models. The process involved extensive data preprocessing, statistical analysis, model training and evaluation, and finally real-world simulation testing. By combining deep learning and classical methods, the project achieved significant improvements in accuracy and reliability compared to standard intrusion detection benchmarks.
Convolutional neural networks (CNNs) were used to capture spatial patterns and detect abnormalities in network feature distributions. CNN models were effective in detecting DDoS and brute-force attacks by learning distinct packet-level patterns. The architecture was optimized with dropout and batch normalization to prevent overfitting, and hyperparameter tuning via grid search further improved generalization.
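A minimal sketch of such a 1D CNN over the scaled flow-feature vectors is shown below; the filter counts, kernel sizes, and dropout rate are illustrative assumptions, not the exact architecture used. Grid search over these values can then be layered on top of this skeleton.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, BatchNormalization, Dropout, Flatten, Dense

num_features = X_train.shape[1]  # flow features, reshaped to (samples, num_features, 1) before training

cnn = Sequential([
    Conv1D(64, kernel_size=3, activation='relu', input_shape=(num_features, 1)),
    BatchNormalization(),
    MaxPooling1D(pool_size=2),
    Dropout(0.3),
    Conv1D(32, kernel_size=3, activation='relu'),
    BatchNormalization(),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid'),
])
cnn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])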
LSTM networks excelled at capturing temporal dependencies in network flows, making them particularly useful for detecting slow-rate or persistent attacks. Their sequential memory capabilities allowed for higher accuracy in classifying time-dependent anomalies compared to other deep learning models. Layer stacking and bidirectional LSTM variants were also tested for performance enhancement.
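As a complement to the full training script below, here is a minimal sketch of the bidirectional variant mentioned above; the unit counts and the timesteps/num_features placeholders are illustrative assumptions.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dropout, Dense

bi_lstm = Sequential([
    # Each recurrent layer reads the flow sequence forwards and backwards
    Bidirectional(LSTM(64, return_sequences=True), input_shape=(timesteps, num_features)),
    Dropout(0.3),
    Bidirectional(LSTM(32)),
    Dense(1, activation='sigmoid'),
])
bi_lstm.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])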
Used as a baseline for performance comparison, Random Forest (RF) provided fast and interpretable results. It excelled in feature importance ranking and demonstrated strong performance on structured tabular data, making it a robust classical method for preliminary detection tasks. Ensemble methods and feature bagging improved robustness across different attack classes.
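A minimal sketch of this baseline and its feature-importance ranking is shown below; the hyperparameters and the feature_names list (the column names kept after preprocessing) are illustrative assumptions.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=200, max_features='sqrt', n_jobs=-1, random_state=42)
rf.fit(X_train, y_train)

# Impurity-based importance ranking for the tabular flow features
importances = pd.Series(rf.feature_importances_, index=feature_names).sort_values(ascending=False)
print(importances.head(10))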
Autoencoders provided an unsupervised approach, reconstructing input features and flagging anomalies based on reconstruction error. This method performed well in identifying zero-day and rare attacks due to its ability to generalize across different traffic types and attack classes. A combination of sparse and denoising autoencoders was explored to reduce overfitting and enhance anomaly detection accuracy.
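A minimal sketch of the reconstruction-error approach is shown below, assuming X_benign holds scaled benign-only flows and X_test the held-out mix; the layer widths and the 99th-percentile cutoff are illustrative choices rather than the tuned configuration.

import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

num_features = X_benign.shape[1]

# Dense autoencoder trained to reconstruct benign traffic only
inputs = Input(shape=(num_features,))
encoded = Dense(32, activation='relu')(inputs)
encoded = Dense(16, activation='relu')(encoded)
decoded = Dense(32, activation='relu')(encoded)
decoded = Dense(num_features, activation='linear')(decoded)

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(X_benign, X_benign, epochs=30, batch_size=128, validation_split=0.1)

# Flows whose reconstruction error exceeds the benign 99th percentile are flagged
train_errors = np.mean((autoencoder.predict(X_benign) - X_benign) ** 2, axis=1)
threshold = np.percentile(train_errors, 99)
test_errors = np.mean((autoencoder.predict(X_test) - X_test) ** 2, axis=1)
anomalies = test_errors > threshold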
# LSTM Model Training for Cyber Threat Detection
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
# Stacked LSTM layers capture temporal structure in the flow sequences;
# the final Dense sigmoid layer gives a binary benign/attack prediction.
model = Sequential()
model.add(LSTM(128, input_shape=(X_train.shape[1], X_train.shape[2]), return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(LSTM(64, return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Stop training once validation loss stops improving and keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=64,
    validation_split=0.2,
    callbacks=[early_stop]
)
Validation Accuracy: 96.8%
F1 Score: 0.957
False Positive Rate: 1.8%
Detection Latency: 0.28 seconds
Model Confidence (avg): 94.1%
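The metrics above can be derived from the trained model's predictions along the following lines; the 0.5 decision threshold and the X_test/y_test names are assumptions carried over from the earlier split, and detection latency would be measured separately by timing inference.

from sklearn.metrics import confusion_matrix, f1_score, roc_curve, auc

y_prob = model.predict(X_test).ravel()
y_pred = (y_prob >= 0.5).astype(int)

# Accuracy, false positive rate, and F1 from the confusion matrix
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
false_positive_rate = fp / (fp + tn)
f1 = f1_score(y_test, y_pred)

# ROC curve points and area under the curve for the plots described earlier
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
print(f"accuracy={accuracy:.3f}  f1={f1:.3f}  fpr={false_positive_rate:.3f}  auc={roc_auc:.3f}")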