Research
My research interests focus on representation learning, non-linear signal processing, and big data analytics.
My research aims are (1) to rethink the use of machine learning models through bidirectional representation and learning; (2) to develop big-data methods that extract and use explainable and useful priors, such as hierarchical information, to learn and recall Big-K patterns; and (3) to explore stochastic-resonance-like benefits in machine learning methods.
Potential areas of application include seq2seq tasks, inverse problems, Markov decision processes, generative AI, and optical computing.
Bidirectional Representation and Learning
The central theme here is to introduce and utilize backward mapping as a form of feedback information for controlling the learning and recall of patterns with machine learning models.
The bidirectional representation focuses on a family of models that share the same set of parameters for bidirectional (forward and backward) inference or approximation between the input space X and the output space Y. The forward inference asks the effect-question: What is the effect of input x? The backward inference poses the cause-question: What is the cause of output y?
Bidirectional learning finds the best set of parameters for such a model representation. The probabilistic representation of bidirectional training follows from the bidirectional optimization of the model's posterior. This training approach introduces a feedback structure into both the training and the probabilistic structure of bidirectional models.
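One hedged way to write this objective, in my own notation rather than a formula quoted from the papers: a single parameter set Θ serves both directions, and training maximizes the forward and backward log-likelihoods together with a log-prior in the Bayesian case.

```latex
% Hedged sketch (my notation): one parameter set \Theta serves both inference directions.
\Theta^{*} \;=\; \arg\max_{\Theta}\;
  \underbrace{\ln p(\mathbf{y}\mid\mathbf{x},\Theta)}_{\text{forward: effect of }\mathbf{x}}
  \;+\;
  \underbrace{\ln p(\mathbf{x}\mid\mathbf{y},\Theta)}_{\text{backward: cause of }\mathbf{y}}
  \;+\;
  \underbrace{\ln p(\Theta)}_{\text{prior (Bayesian case)}}
```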
My contribution
I introduced the bidirectional backpropagation algorithm.
The new bidirectional backpropagation (B-BP) algorithm trains a neural network in the forward and backward directions through the same web of synapses. Signals pass forward through the weight matrices and neurons. They pass backward through the transposed weight matrices and the same neurons. The forward pass produces the likelihood of an observed output y while the backward pass produces the dual likelihood of an observed input x.
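A minimal numerical sketch of one such bidirectional pass, assuming a single logistic hidden layer and illustrative layer sizes (this is my own toy example, not the authors' code):

```python
# Toy sketch of a bidirectional pass: forward through W1, W2 to a softmax output,
# backward through the transposes W2.T, W1.T to an identity-neuron input estimate.
# Layer sizes, the logistic hidden layer, and the loss weighting are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 3
W1 = rng.normal(scale=0.1, size=(n_hid, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hid))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x):
    """Forward pass: input x -> softmax class probabilities."""
    h = sigmoid(W1 @ x)
    return softmax(W2 @ h)

def backward(y):
    """Backward pass through the SAME weights, transposed: target y -> input estimate."""
    h_back = sigmoid(W2.T @ y)
    return W1.T @ h_back          # identity input neurons act as the hidden regressor

x = rng.normal(size=n_in)          # input pattern
y = np.array([1.0, 0.0, 0.0])      # one-hot class target

a_fwd = forward(x)                 # multinomial likelihood -> cross-entropy error
x_back = backward(y)               # normal likelihood -> squared error
forward_error = -np.sum(y * np.log(a_fwd + 1e-12))
backward_error = 0.5 * np.sum((x - x_back) ** 2)
joint_error = forward_error + backward_error   # B-BP descends on the joint error
print(forward_error, backward_error, joint_error)
```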
Neural classifiers have an inherent but often ignored bidirectional structure. This arises because the identity input neurons define a hidden regressor when the network runs in the backward direction. Ordinary BP ignores such backward training. So it ignores the hidden regressor in the backward direction when training a classifier. Bayesian B-BP further allows prior probabilities to shape the optimization of the network’s global posterior probability structure. Figure 1 shows the bidirectional structure of a deep neural classifier and its hidden regressor in the backward direction.
B-BP tends to improve the performance of neural classifiers and generative adversarial networks (GANs).

Figure 1: Bayesian Bidirectional Backpropagation: The network maximizes the joint forward and backward posterior probability. The network diagram shows the simple but practical case of a Laplacian or Lasso-like prior on the input weights. The identity neurons at the input field give rise to a vector normal likelihood and thus a squared-error function in the backward direction. The softmax neurons at the output field give rise to a multinomial likelihood and thus a cross-entropy error function in the forward direction. So the network acts as a classifier in the forward direction and a regressor in the backward direction.
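Putting the caption's pieces together, here is a hedged sketch of the resulting training error in my own notation: the multinomial forward likelihood yields a cross-entropy term, the vector normal backward likelihood yields a squared-error term, and the Laplacian prior on the input weights yields a Lasso-like L1 penalty.

```latex
% Hedged sketch (my notation) of the Bayesian B-BP error that the network minimizes.
E(\Theta) \;=\;
  \underbrace{-\sum_{k} y_{k}\,\ln a_{k}(\mathbf{x},\Theta)}_{\text{forward cross-entropy}}
  \;+\;
  \underbrace{\tfrac{1}{2}\,\bigl\lVert \mathbf{x}-\hat{\mathbf{x}}(\mathbf{y},\Theta)\bigr\rVert^{2}}_{\text{backward squared error}}
  \;+\;
  \underbrace{\lambda\,\lVert W \rVert_{1}}_{\text{Laplacian (Lasso) prior}}
```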
Publications
- B. Kosko and O. Adigun. "Bidirectional Variational Autoencoders." IEEE World Congress on Computational Intelligence (WCCI), 2024. doi:10.1109/IJCNN60899.2024.10650379
- O. Adigun and B. Kosko. "Bidirectional Backpropagation Autoencoding Networks for Image Compression and Denoising." 22nd IEEE International Conference on Machine Learning and Applications (ICMLA), 2023. doi:10.1109/ICMLA58977.2023.00107
- O. Adigun and B. Kosko. "Deeper Bidirectional Neural Networks with Generalized Non-Vanishing Hidden Neurons." 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022. doi:10.1109/ICMLA55696.2022.00017
- O. Adigun and B. Kosko. "Bidirectional Backpropagation for High-Capacity Blocking Networks." 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021. doi:10.1109/ICMLA52953.2021.00118
- O. Adigun and B. Kosko. "Bayesian Bidirectional Backpropagation Learning." International Joint Conference on Neural Networks (IJCNN), IEEE WCCI, 2021. doi:10.1109/IJCNN52387.2021.9533873
- O. Adigun and B. Kosko. "Bidirectional Backpropagation." IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020. doi:10.1109/TSMC.2019.2916096
- O. Adigun and B. Kosko. "Noise-Boosted Bidirectional Backpropagation and Adversarial Learning." Neural Networks, 2019. doi:10.1016/j.neunet.2019.09.016
- O. Adigun and B. Kosko. "Training Generative Adversarial Networks with Bidirectional Backpropagation." 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 2018. doi:10.1109/ICMLA.2018.00190
- O. Adigun and B. Kosko. "Bidirectional Representation and Backpropagation Learning." International Joint Conference on Advances in Big Data Analytics, pp. 3-9, 2016. [Paper]
Large-Scale Pattern Recognition
The central theme of this topic is to explore the intersection between coding theory and machine learning. The goal is to design methods for picking data-specific codesets that capture explainable and useful priors (such as hierarchical structure) and increase the pattern capacity of neural classifiers and other machine learning models.
My contribution
I introduced deep-neural classifier blocking with random logistic coding. This method increases the capacity of deep-neural classifiers. I also introduced the new generalized nonvanishing activation (G-NoVa) for deep networks.
Logistic coding picks binary codewords that represent class patterns at the output layer of neural classifiers or neural blocks. This scheme increases the pattern capacity of neural classifiers because logistic coding picks K pattern codewords from the 2^M vertices of the M-dimensional binary or bipolar hypercube where M is the number of output neurons. Figure 2(a) shows how logistic coding increases the pattern capacity. Random logistic coding randomly picks K codewords from the 2^M possible codewords and exploits the approximate orthogonality of random codewords in high dimensions. Figure 2(b) shows a set of random logistic codewords with K = 100 and reduced codelength M ∈ {20, 60}.
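A small self-contained sketch of this scheme (my own illustration, not the papers' code): sample K codewords of length M with sampling probability p = 0.5, measure their approximate orthogonality in bipolar form, and decode an output vector by its nearest codeword.

```python
# Random logistic coding sketch: pick K codewords from the 2^M hypercube vertices
# and check that distinct random codewords are nearly orthogonal for large M.
import numpy as np

def random_logistic_codebook(K, M, p=0.5, seed=0):
    """Sample K binary codewords of length M (vertices of the M-cube)."""
    rng = np.random.default_rng(seed)
    return (rng.random((K, M)) < p).astype(float)

def decode(output_vector, codebook):
    """Classify by the nearest codeword (squared-distance match)."""
    return int(np.argmin(np.sum((codebook - output_vector) ** 2, axis=1)))

K, M = 100, 60
C = random_logistic_codebook(K, M)

# Bipolar form {-1, +1}: the normalized correlation of two independent random
# codewords concentrates near 0, which is the approximate-orthogonality property.
B = 2.0 * C - 1.0
corr = (B @ B.T) / M
off_diag = corr[~np.eye(K, dtype=bool)]
print("mean |correlation| between distinct codewords:", np.abs(off_diag).mean())
print("decoding codeword 7 returns class:", decode(C[7], C))
```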
Random logistic coding combines with blocking to further increase the capacity of a deep neural classifier. Blocking means breaking down a very deep neural classifier into contiguous small blocks (sub-networks). Blocking maximizes the complete likelihood of a deep network instead of the output likelihood.
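A hedged sketch of the blocking objective in my own notation: if the network splits into J blocks with parameters Θ_1, ..., Θ_J, and block j maps the previous block's output h_{j-1} to its own logistic codeword target c_j, then training maximizes the sum of the per-block log-likelihoods rather than the log-likelihood of the final output alone.

```latex
% Hedged sketch (my notation) of the complete log-likelihood over J contiguous blocks.
\ln p\!\left(\mathbf{c}_{1},\dots,\mathbf{c}_{J}\mid\mathbf{x},\Theta\right)
  \;=\; \sum_{j=1}^{J} \ln p\!\left(\mathbf{c}_{j}\mid\mathbf{h}_{j-1},\Theta_{j}\right),
  \qquad \mathbf{h}_{0} = \mathbf{x}
```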

(a). Softmax coding vs. Logistic coding

(b). Random logistic coding
Figure 2: Softmax 1-in-K coding versus logistic coding: Logistic coding gives Big-K capacity because it uses all 2^K vertices of the unit hypercube. 1-in-K coding has a limited capacity because it uses only the K vertices of the embedded probability simplex. (a). 1-in-K coding with a maximum capacity of 3 classes because the code length M = 3. The simplex has only 3 vertices in this case. (b). These are codewords generated from the random logistic coding scheme with sampling probability p = 0.5, K = 100, and M ∈ {20, 60, 100}. M is the code length.
Publications
- O. Adigun and B. Kosko. "Deeper Bidirectional Neural Networks with Generalized Non-Vanishing Hidden Neurons." 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022. doi:10.1109/ICMLA55696.2022.00017
- O. Adigun and B. Kosko. "Deeper Neural Networks with Non-Vanishing Logistic Hidden Units: NoVa vs. ReLU Neurons." 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021. doi:10.1109/ICMLA52953.2021.00227
- O. Adigun and B. Kosko. "Bidirectional Backpropagation for High-Capacity Blocking Networks." 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021. doi:10.1109/ICMLA52953.2021.00118
- O. Adigun and B. Kosko. "High Capacity Deep Neural Classifiers with Logistic Neurons and Random Coding." International Joint Conference on Neural Networks (IJCNN), IEEE WCCI, 2020. doi:10.1109/IJCNN48605.2020.9207218 [Video]
Patent Application: B. Kosko and O. Adigun. "Blocking Neural Networks for High Capacity", US 2023/0316050 A1, Oct. 5, 2023. [Link]
Noise Benefit in Machine Learning
This topic explores the common theme of stochasticity in machine learning and in nonlinear signal-processing methods that benefit from stochastic resonance. The goal is to define and characterize beneficial noise signals that improve the performance of machine learning methods.
My contribution:
I introduced the new noisy recurrent backpropagation algorithm for time-varying signals or patterns.
The noisy recurrent backpropagation (N-RBP) algorithm is the noise-enhanced form of the RBP algorithm that trains deep neural networks on time-varying signals. This feedback architecture uses the reduction of the backpropagation algorithm to the expectation-maximization (EM) algorithm for iterative maximum-likelihood estimation. N-RBP injects beneficial noisy-expectation-maximization (NEM) noise into the neurons of recurrent neural networks. This noise boost makes the current training signal more probable as the system climbs the nearest hill of the log-likelihood. It differs from the simple dither or blind white noise that improves some nonlinear signal-processing techniques through stochastic resonance.
The N-RBP improves the training speed and performance of recurrent neural classifiers and regressors. Figure 3 shows how beneficial NEM RBP noise differs for a classifier and a regressor. The noise benefit also extends to the new bidirectional backpropagation.
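A minimal sketch (my own illustration using the Figure 3 values, not code from the papers) of screening candidate noise against the two NEM sufficient conditions as I read them: a half-space test for the softmax classifier and a sphere test for the Gaussian regressor.

```python
# NEM noise screening sketch. Assumptions: for a softmax/cross-entropy classifier the
# target noise n must satisfy n . ln(a) >= 0 (one side of the NEM hyperplane); for a
# Gaussian/squared-error regressor it must satisfy ||y + n - a|| <= ||y - a|| (inside
# the NEM sphere). Accepted noise is added to the training target before the update.
import numpy as np

rng = np.random.default_rng(0)

def nem_ok_classifier(n, a):
    """Half-space test with normal vector ln(a)."""
    return float(n @ np.log(a)) >= 0.0

def nem_ok_regressor(n, a, y):
    """Sphere test: noisy target no farther from the activation than the clean target."""
    return np.linalg.norm(y + n - a) <= np.linalg.norm(y - a)

# Figure 3(a): classifier activation and one-hot target.
a_cls, y_cls = np.array([0.6, 0.3, 0.1]), np.array([1.0, 0.0, 0.0])
# Figure 3(b): regressor activation and target.
a_reg, y_reg = np.array([1.0, 2.0, 1.0]), np.array([2.0, 3.0, 2.0])

candidates = rng.normal(scale=0.5, size=(10_000, 3))
cls_rate = np.mean([nem_ok_classifier(n, a_cls) for n in candidates])
reg_rate = np.mean([nem_ok_regressor(n, a_reg, y_reg) for n in candidates])
print("classifier NEM acceptance rate:", cls_rate)
print("regressor NEM acceptance rate:", reg_rate)
```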


(a). Additive NEM noise: Classifier
(b). Additive NEM noise: Regressor
Figure 3: Beneficial NEM noise for a recurrent backpropagation classifier and regressor. The NEM inequality leads to different beneficial noise samples for the two networks and speeds up their backpropagation training on average. (a) Classifier NEM RBP noise lies below the NEM hyperplane in noise space. Here the output activation was a = [0.6, 0.3, 0.1] and the target was y = [1, 0, 0]. (b) Regression NEM RBP noise lies inside the sphere that arises from the NEM inequality. The output activation was a = [1.0, 2.0, 1.0] and the target was y = [2.0, 3.0, 2.0].
Publications:
- O. Adigun and B. Kosko. "Noise-Boosted Recurrent Backpropagation." Neurocomputing, vol. 559, 126438, 2023. doi:10.1016/j.neucom.2023.126438
- O. Adigun and B. Kosko. "Noise-Boosted Bidirectional Backpropagation and Adversarial Learning." Neural Networks, vol. 120, pp. 9-31, 2019. doi:10.1016/j.neunet.2019.09.016
- O. Adigun and B. Kosko. "Using Noise to Speed Up Video Classification with Recurrent Backpropagation." International Joint Conference on Neural Networks (IJCNN), pp. 108-115, 2017. doi:10.1109/IJCNN.2017.7965843 [Best Paper Award]
Intellectual Property: SmartNoise [Link]
Other Collaborations
- P. Olsen, R. Chandra, and O. Adigun. "Super Resolution for Satellite Images." U.S. Patent No. 12,045,311, July 23, 2024. [Patent]
- O. Adigun, P. Olsen, and R. Chandra. "Location-Aware Super-Resolution for Satellite Data Fusion." IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2022. doi:10.1109/IGARSS46834.2022.9884391
- Q. Jiang, O. Adigun, H. Narasimhan, M. Milani Fard, and M. Gupta. "Optimizing Black-Box Metrics with Adaptive Surrogates." International Conference on Machine Learning (ICML), pp. 4784-4793, PMLR, 2020. [Paper]