Optimizing Portfolios with Modified NSGA-II Solutions
Kaiyuan Lou
Duke Kunshan University
Suzhou, China
Abstract: In financial management, portfolio optimization
remains a central challenge. From Markowitz's Mean-
Variance Optimization to modern deep learning methods,
capturing the multi-faceted nature of portfolio decisions is
complex. This paper emphasizes the potential of Multi-
objective Evolutionary Algorithms (MOEAs), specifically the
NSGA-II. We propose key modifications to NSGA-II, including
refined selection strategies, dynamic mutation parameters, and
optimized population initialization. Initial results showcase
enhanced performance and a more comprehensive Pareto front.
Additionally, we employ Markov Chain Monte Carlo
(MCMC) for a ten-year portfolio projection. Future avenues
include comparisons with other algorithms and expanding our
refined NSGA-II's application to diverse optimization tasks.
Keywords: NSGA-II, portfolio optimization, multi-objective
optimization, evolutionary algorithm
I. INTRODUCTION
In the field of financial management, portfolio
optimization stands as a cornerstone, striving to craft the
ideal balance of assets that maximizes returns while
minimizing risks. However, achieving this balance is often a
complex process, with numerous variables and objectives to
consider, where algorithms play an instrumental role. One
such powerful algorithm, the NSGA-II [1], has been lauded
for its capabilities in multi-objective optimization scenarios.
Yet, as the complexities of financial markets evolve and the
demands of investors become more nuanced, there is a
pressing need to further refine and adapt these computational
strategies, and NSGA-II leaves room for further
enhancement. This article aims to bridge this gap by
proposing and examining a set of principal modifications to
the NSGA-II algorithm. Through a series of modifications,
including improved tournament selection strategies, dynamic
mutation parameters, and optimized population initialization,
we hope to push the boundaries of what's possible in
generating optimal asset allocations. In the subsequent
sections, we introduce a literature review detailing portfolio
optimization algorithms and Multi-objective Evolutionary
Algorithms (MOEAs). The article then elucidates our
proposed methodology and presents the contrast between the
traditional and our enhanced NSGA-II approaches. We
further employ Markov Chain Monte Carlo (MCMC) for a
ten-year return prediction comparison. In the concluding
section, we offer a thorough evaluation and discussion of our
findings.
II. LITERATURE REVIEW
Portfolio optimization has consistently anchored financial
management and investment analysis discussions over the
past decades. Classical Linear Programming (LP) models,
such as Markowitz's Mean-Variance Optimization (MVO)
and the Black-Litterman Model, laid the foundation for
understanding risk-return dynamics. As the financial
landscape evolved, researchers began exploring advanced
strategies such as machine learning [2] and deep
learning [3] to enhance portfolio performance.
However, as pointed out by Mansini et al. [4], given
the multiple-criteria nature of the portfolio problem, a single
LP model might not capture all the nuances of real-world
portfolio optimization. Moreover, models based on Artificial
Neural Networks (ANNs) struggle to generate the entire
Pareto front due to the structural limitations of neural networks.
This brings our attention to Multi-objective Evolutionary
Algorithms (MOEAs), specifically NSGA-II [1]. Renowned
for its capability to mirror real-world problems, NSGA-II
presents an entire Pareto front instead of just a single
optimum. Following its success, several augmentations
emerged. MNSGA-II [5] introduced the Dynamic Crowding
Distance (DCD) to maintain diversity in populations, while a
Novel Ranking Scheme [6] optimized algorithmic time
complexity. NSGA-III [7] and its variants, such as U-
NSGA-III [8] and θ-NSGA-III [9], inherited NSGA-II's core
principles but substituted crowding distance with reference
directions to bolster diversity. Further enhancements came
from Zimin Gu and Gaige Wang [10], who integrated
information feedback mechanisms with NSGA-III for large-
scale multi-objective optimization.
While NSGA-III shows advancements over NSGA-II, the
latter still garners considerable attention among academic
researchers. This is partly because NSGA-III is only one
possible evolution of the traditional NSGA-II, and there are
many other possibilities. Recent studies have delved into
accelerating the crossover process within NSGA-II [11],
while others have proven non-trivial lower bounds on its runtime [12].
Notably, a contemporary research piece [13] combined deep
learning with NSGA-II, addressing emission dispatch
challenges.
In light of these developments, our intention is clear: to
revisit the core NSGA-II algorithm and enhance its
efficiency through strategic modifications. Unlike NSGA-III,
which "pins" solutions uniformly across the solution space
using strict rules, our interest lies in guiding individuals to
explore the solution space more intelligently and effectively.
III. METHOD
To enhance the efficacy of the NSGA-II algorithm, three
principal modifications are suggested. These
recommendations emphasize (1) the normalization of
population distribution, (2) the promotion of mutation and
exploration as convergence approaches, and (3) the
optimization of the initial population distribution.
Additionally, throughout our implementation, we employ an
Improved Crowding Distance formula [13] to further
enhance performance.
We performed an empirical test using data from 26
stocks collected across various sectors for a portfolio
optimization task, with the goal of pinpointing an optimal
Pareto front.
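For concreteness, a minimal Python sketch of the standard mean-variance objectives such a task typically uses is given below; the exact objective definitions are an assumption on our part, and the function and variable names are illustrative.

import numpy as np

def portfolio_objectives(weights, mean_returns, cov_matrix, periods_per_year=252):
    """Evaluate one individual (a weight vector over the 26 stocks).

    weights      : array of shape (26,), non-negative and summing to 1
    mean_returns : per-period mean return of each asset
    cov_matrix   : per-period covariance matrix of asset returns
    """
    annual_return = float(weights @ mean_returns) * periods_per_year
    annual_volatility = float(np.sqrt(weights @ cov_matrix @ weights * periods_per_year))
    # NSGA-II minimizes every objective, so the return is negated.
    return -annual_return, annual_volatility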
A. Population Distribution
In the conventional NSGA-II algorithm, the fronts and
the crowding distance are the two critical factors that
determine whether a new individual will survive or an
existing individual will become a parent after tournament
selection.
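For reference, the crowded-comparison step used in the standard NSGA-II tournament [1] can be sketched as follows; the helper name and array layout are illustrative.

def crowded_compare(i, j, front, crowding):
    """Return the winner of a binary tournament between individuals i and j.

    front    : non-domination front rank of each individual (lower is better)
    crowding : crowding distance of each individual (higher is better)
    """
    if front[i] != front[j]:
        return i if front[i] < front[j] else j
    return i if crowding[i] > crowding[j] else j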
While crowding distance and tournament selection are
mechanisms intended to encourage population diversity, it is
still possible for the population to cluster within a certain
segment of the Pareto-optimal front, rather than achieving an
equal distribution throughout. During tournament
comparisons, the rank attributed to the non-domination front
takes precedence over crowding distance. Consequently, if a
subset of the population gravitates towards a specific region
on the Pareto front that inherently has a superior rank, this
subset is likely to dominate over other solutions with a lower
rank in the non-dominance hierarchy. Such dominance has
the potential to pull solutions of a lower rank towards its
vicinity, culminating in a progressively denser concentration
in these regions.
Fig. 1. Population distribution along the Pareto front generated by NSGA-II
Fig. 1 illustrates an optimal solution set established by
the NSGA-II for the portfolio optimization task. The plot
distinctly reveals that the solutions are extremely dense in the
center of the front and excessively sparse on both sides. This
uneven distribution is not the desired outcome. An
imbalanced distribution of solutions implies that the
algorithm might overlook potential viable solutions on both
ends. On the other hand, solutions in the center might be
over-explored (since this part may have already converged,
an excess of computing resources can only lead to marginal
improvements), which would result in a wastage of
computational resources.
To address this, we refined the tournament selection
strategy. In the traditional NSGA-II algorithm, candidates for
the tournament are chosen uniformly at random from the entire
population, which means the selection is distributed over the
population rather than over the possible solution space. If the
population distribution is skewed, the tournament selection
will be biased as well. Unlike NSGA-III [7], which uses
reference directions and introduces additional computational
complexity, we do not "pin" solutions around reference
vectors, but still offer them a certain opportunity to explore.
The improved selection strategy is governed by the following quantities:
f, the array of non-domination front ranks of the solutions;
c, the array of crowding distances of the solutions;
k, the number of candidates in the tournament; and
b, the bias parameter, where 0 means no bias (the algorithm
reduces to standard NSGA-II) and 1 means strictly picking the
solutions with the highest crowding distance.
The novel strategy we propose is to bias the random
selection in favor of less crowded solutions, making these
more likely to be selected as tournament candidates. This
approach will ensure that sparse areas (those needing more
exploration) are more actively explored.
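Since the exact selection formula is not reproduced here, the following is only a sketch of one way the biased candidate draw could be realized in terms of c, k, and b; the drawn candidates are still compared by front rank and crowding distance afterwards, and the helper name is ours.

import numpy as np

def biased_tournament_candidates(c, k, b, rng=None):
    """Draw k tournament candidates, biased toward sparse regions of the front.

    c : crowding distances of the current population
    k : number of candidates in the tournament
    b : bias in [0, 1]; 0 falls back to uniform random sampling (standard
        NSGA-II), 1 strictly picks the k solutions with the highest crowding
        distance.
    """
    rng = rng or np.random.default_rng()
    if rng.random() < b:
        # Biased draw: take the least crowded (sparsest) solutions.
        return np.argsort(c)[::-1][:k]
    # Unbiased draw: sample uniformly from the whole population.
    return rng.choice(len(c), size=k, replace=False)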
Fig. 2. Comparison of population distributions along the Pareto front
Fig. 2 displays the results produced by both the original
NSGA-II and the NSGA-II with an optimized selection
strategy on the same task. From the visualization, it's evident
that the optimized version yields a more continuous and
superior Pareto front.
B. Dynamic Mutation Parameters
In Evolutionary Algorithms, two critical parameters
pertain to mutation: the mutation probability and sigma. The
former dictates the likelihood of a mutation occurring, while
the latter determines the magnitude of the mutation. Usually,
as the generations increase, a decay mechanism will be
introduced to reduce the mutation probability. This idea
bears some resemblance to the optimizer and learning rate
concepts in Artificial Neural Networks (ANN).
In our methodology, we dynamically adjust both the
mutation probability and sigma throughout the evolutionary
process to promote exploration and maintain diversity
within the population. The foundational principle for this
dynamic adaptation is straightforward: the denser the Pareto
front becomes, the higher the likelihood and magnitude of
mutations. By making the population more "energetic",
encouraging it to jump out of local optima as convergence
nears, the algorithm may require more generations to
converge. However, this increases the chances of
discovering superior solutions.
We've retained the decay mechanism, ensuring that over
time the dynamic mutation probability and sigma gradually
diminish, thereby not hindering the eventual convergence of
solutions.
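One plausible realization of this schedule is sketched below, assuming the density of the first front is measured through its average crowding distance; the constants and names are illustrative rather than the exact values used in our experiments.

import numpy as np

def dynamic_mutation_params(base_prob, base_sigma, generation, front_crowding,
                            decay=0.995, boost=0.5):
    """Scale mutation probability and sigma with how crowded the first front is.

    base_prob, base_sigma : the static values a standard NSGA-II would use
    generation            : current generation index (drives the decay)
    front_crowding        : crowding distances of the current first front
    """
    front_crowding = np.asarray(front_crowding)
    finite = front_crowding[np.isfinite(front_crowding)]
    # Small average crowding distance => dense front => stronger mutation.
    density = 1.0 / (1.0 + finite.mean()) if finite.size else 0.0
    scale = (1.0 + boost * density) * decay ** generation
    return min(1.0, base_prob * scale), base_sigma * scale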
C. Population Initialization
In NSGA-II, the population is initialized randomly. For
portfolio optimization tasks, the starting position of the
population plays a crucial role. If random initialization
serendipitously yields several promising solutions, the
Pareto fronts can potentially improve. This is because these
superior solutions will rapidly produce numerous offspring,
minimizing the time spent on exploration and increasing the
likelihood of convergence to an optimal solution.
The size of the population is pivotal in this context. A
larger population ensures more extensive coverage of the
solution space, thereby increasing the probability of an
advantageous start. However, a significant drawback is that
larger populations demand considerable computational
resources, which often isn't feasible.
A strategic approach to harness the benefits of both a
good start from a larger population and the computational
efficiency of a smaller one involves an initial "burst" of
exploration. Start with an exceptionally large population
size for just one generation to scout for optimal starting
positions, and then scale down to a standard-sized
population for subsequent exploration. This technique can
markedly enhance result quality with negligible additional
computational overhead.
Additionally, employing a combination of rank in the
non-dominated front and crowding distance to downsize the
population post-initialization yields dual benefits. Apart
from achieving a superior Pareto front, the overall
distribution of the population aligns more closely with the
Pareto front. This proximity facilitates accelerated
exploration in subsequent generations, optimizing the
algorithm's efficiency and efficacy.
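A compact sketch of this burst-then-downsize initialization follows, assuming NSGA-II-style helpers fast_non_dominated_sort and crowding_distance are available; the population sizes shown are illustrative.

import numpy as np

def burst_initialize(random_individual, evaluate, burst_size=3000, pop_size=100):
    """Generate one very large generation, then keep the best pop_size members.

    random_individual : callable returning one random weight vector
    evaluate          : callable mapping an individual to its objective tuple
    Assumes fast_non_dominated_sort(objs) -> list of fronts (index lists) and
    crowding_distance(objs, front) -> array, as in the original NSGA-II.
    """
    population = [random_individual() for _ in range(burst_size)]
    objs = np.array([evaluate(ind) for ind in population])
    survivors = []
    for front in fast_non_dominated_sort(objs):
        if len(survivors) + len(front) <= pop_size:
            survivors.extend(front)              # the whole front fits
        else:
            cd = crowding_distance(objs, front)  # keep the least crowded remainder
            order = np.argsort(cd)[::-1]
            survivors.extend(front[i] for i in order[: pop_size - len(survivors)])
            break
    return [population[i] for i in survivors]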
Fig. 3. Standard initialization vs. optimized initialization
The two graphs in Fig. 3 vividly highlight the stark
contrast between standard initialization and optimized
initialization. It's clear that the optimized approach
outperforms the standard one significantly. The blue dots
symbolize random solutions, providing a visual of the
solution space; red dots represent the optimal solutions
identified, while the yellow dots signify other members of
the population. Even though both scenarios encompass
3,000 individuals, the latter not only produces superior
Pareto fronts but also boasts a more favorable distribution of
the population.
IV. EXPERIMENTAL ANALYSIS
For a portfolio optimization task, three different setups
were executed and compared using the same dataset:
Standard NSGA-II
NSGA-II with selection optimization
NSGA-II enhanced with selection optimization,
dynamic mutation, and optimized initialization
Upon analyzing the Pareto fronts generated by each of
these algorithms, we proceeded to select the solution with
the highest Sharpe ratio from each set. These chosen
solutions were then subjected to Markov Chain Monte Carlo
(MCMC) analysis to further evaluate their projected
performance.
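Selecting the representative solution from each front is straightforward; a minimal sketch is given below, with the risk-free rate as an assumed parameter.

import numpy as np

def pick_max_sharpe(front_returns, front_volatilities, risk_free_rate=0.02):
    """Return the index and Sharpe ratio of the best Pareto-front solution.

    front_returns, front_volatilities : annualized return and volatility of
    each Pareto-front solution; risk_free_rate is an assumption.
    """
    sharpe = (np.asarray(front_returns) - risk_free_rate) / np.asarray(front_volatilities)
    return int(np.argmax(sharpe)), float(np.max(sharpe))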
A. Optimized Pareto Fronts
Throughout the iterations, the hypervolume of the Pareto
front was used to track and visualize the performance of the
three NSGA-II variants.
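For the two objectives considered here (return to maximize, volatility to minimize), the hypervolume against a reference point can be computed by sorting the front and summing rectangular slices; a minimal two-objective sketch, with the reference point chosen as an assumption, is shown below.

import numpy as np

def hypervolume_2d(returns, volatilities, ref=(0.0, 1.0)):
    """Hypervolume of a front that maximizes return and minimizes volatility.

    ref : (worst acceptable return, worst acceptable volatility); this choice
          is an assumption, not a value taken from our experiments.
    """
    # Convert to a pure minimization problem: (volatility, -return).
    pts = np.column_stack([np.asarray(volatilities), -np.asarray(returns)])
    r1, r2 = ref[1], -ref[0]
    pts = pts[(pts[:, 0] < r1) & (pts[:, 1] < r2)]  # keep points dominating ref
    pts = pts[np.argsort(pts[:, 0])]                # sort by volatility ascending
    hv, best_f2 = 0.0, r2
    for f1, f2 in pts:
        if f2 < best_f2:                            # skip dominated points
            hv += (r1 - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv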
Fig. 4. Comparison of hypervolume across iterations
The graph clearly illustrates the superior performance of
the fully optimized strategy, followed by the selection-
optimized NSGA-II, while the standard NSGA-II trails
behind.
From the visual representation:
Starting Point: Both the standard NSGA-II and
the selection-optimized variant start with nearly
identical hypervolume values, suggesting a similar
initial performance. In stark contrast, the fully
optimized strategy with its tailored initialization
begins its journey with a significantly enhanced
hypervolume, indicating a promising outset.
Evolutionary Progression: The standard NSGA-II
showcases two prominent increases in its hypervolume,
signifying two key moments when its Pareto front
witnessed substantial advancements. On the other
hand, the two optimized versions depict more
consistent and frequent upticks in their hypervolumes.
This regularity in growth accentuates the
effectiveness of the refined tournament selection
strategy, affirming its role in directing the
population's exploration of the solution space more
proficiently.
Convergence and Duration: The algorithm
equipped with dynamic mutation parameters exhibits
a decelerated convergence rate, leading to an
extended training duration. However, its hypervolume
continues to expand, especially as it nears
convergence. This persistence in growth underscores
the advantage of dynamic mutation, allowing the
algorithm to continually refine and potentially
discover superior solutions even in the latter stages of
the optimization process.
Fig. 5. Pareto fronts found by the three algorithms
Fig. 5 illustrates the final solution sets produced by all
three algorithms, highlighting the distribution of the Pareto
front determined by each. The diagram effectively
showcases the advancements in selection optimization,
dynamic mutation parameters, and initialization
optimization.
When compared to the Pareto front of the standard
NSGA-II, selection optimization offers a Pareto front with
superior distribution and a broader coverage of all potential
solutions. The standard version not only lacks solutions for
portfolios with extreme returns but also offers limited
optimization for solutions not situated at the center.
However, in the standard NSGA-II's favor, the high
population density in the middle indicates that this narrow
region has been thoroughly explored, so its solutions
within this range outperform those of the selection-optimized
algorithm.
The introduction of dynamic mutation parameters and
initialization optimization nudges the Pareto front closer to
the ideal direction. In the fully optimized version, the entire
front outperforms the other two algorithms. This implies
that for every solution found by the other algorithms, the
fully optimized algorithm has identified solutions that are
superior in both return and volatility, achieving complete
dominance. Consequently, as shown in the plot with the
Sharpe Ratio, the optimized algorithm gives a higher value.
B. MCMC Projection
Markov Chain Monte Carlo (MCMC) is a statistical
method that uses the principles of Markov chains to sample
from a probability distribution. This sampling technique is
particularly useful in high-dimensional spaces that are
difficult to explore directly, making it a robust tool for
forecasting complex systems.
In portfolio optimization, a Pareto front that offers
comprehensive coverage of the solution space, catering to
both aggressive and conservative investors, is not our only
aim. We also seek the single best portfolio considering both
return and volatility. For each algorithm, we therefore choose
the solution with the highest Sharpe ratio as the optimal one
and employ MCMC to forecast its performance over the next
ten years for comparison.
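The exact MCMC configuration is not reproduced here; the simplified Monte Carlo sketch below, which assumes daily asset returns drawn i.i.d. from a multivariate normal fitted to historical data, illustrates how per-year return distributions such as those reported in Tables I to III can be produced.

import numpy as np

def simulate_ten_year_paths(weights, mean_daily, cov_daily, n_paths=10000,
                            days_per_year=252, years=10, seed=0):
    """Simulate cumulative portfolio returns, year by year, for n_paths paths.

    Returns an array of shape (n_paths, years); e.g. (paths < 0).mean(axis=0)
    gives the per-year probability of a negative cumulative return.
    """
    rng = np.random.default_rng(seed)
    cumulative = np.zeros((n_paths, years))
    value = np.ones(n_paths)
    for year in range(years):
        # Draw one year's worth of daily returns for every path and asset.
        daily = rng.multivariate_normal(mean_daily, cov_daily,
                                        size=(n_paths, days_per_year))
        portfolio_daily = daily @ weights                 # per-day portfolio return
        value *= np.prod(1.0 + portfolio_daily, axis=1)   # compound over the year
        cumulative[:, year] = value - 1.0
    return cumulative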
Fig. 6. Comparison of MCMC projections for the three algorithms
The box plot in Fig. 6 displays the returns projected by
MCMC over the next ten years. From the diagram, it is
evident that the selection-optimized solution excels in terms
of maximum return, minimum return, the bulk of outcomes,
and the average, and the fully optimized version surpasses
even that.
TABLE I. RETURN FOR NSGA-II
Year    <0       >0       >=5      >=10
1       30.86    69.14     0.00     0.00
2       23.43    76.66     0.00     0.00
3       18.54    81.46     0.11     0.00
4       15.80    84.20     0.87     0.03
5       12.77    87.23     3.05     0.12
6       10.52    89.48     6.53     0.65
7        8.93    91.07    12.02     1.83
8        7.84    92.52    18.06     4.21
9        6.64    93.54    24.63     7.20
10       5.31    94.69    31.05    11.29
TABLE II. RETURN FOR SELECTION-OPTIMIZED NSGA-II
Year    <0       >0       >=5      >=10
1       30.46    69.54     0.00     0.00
2       23.65    76.35     0.01     0.00
3       18.70    81.30     0.22     0.00
4       15.61    84.39     1.49     0.05
5       12.47    87.53     4.50     0.24
6       10.62    89.38     8.96     1.27
7        8.56    91.44    14.45     3.11
8        7.29    92.71    20.44     5.74
9        5.94    94.06    27.43     9.54
10       5.19    94.81    34.09    14.19
TABLE III. RETURN FOR FULLY OPTIMIZED NSGA-II
Year    <0       >0       >=5      >=10
1       25.27    74.73     0.00     0.00
2       17.39    82.61     0.01     0.00
3       13.25    26.75     0.43     0.00
4        9.97    90.03     2.54     0.16
5        7.30    92.70     7.37     0.70
6        5.74    94.26    14.83     2.67
7        4.61    95.39    23.66     6.34
8        3.38    96.62    32.17    11.24
9        2.68    97.32    41.47    17.65
10       2.09    97.91    50.53    24.46
The three tables generated by MCMC further underscore
the dominance of the optimized algorithms' returns over the
standard NSGA-II. Table I shows the results of the standard
NSGA-II algorithm, Table II illustrates the improvements
made through selection optimization, and Table III presents
the fully optimized results, which dominate the other two.
For each year in the upcoming decade, the optimized solution
presents a higher likelihood of yielding superior returns.
V. CONCLUSION
Over the course of this study, we first revisited the
foundational principles of portfolio optimization and delved
into the nuances of Multi-objective Evolutionary Algorithms,
specifically the NSGA-II. Our primary goal was to optimize
the performance of the original NSGA-II algorithm by
applying a set of strategic modifications. Our proposed
enhancements aimed to help individuals explore the solution
space more intelligently and effectively. Through a range of
tests and comparisons, particularly using MCMC to forecast
returns over a decade, we observed evident improvements in
the modified NSGA-II's efficiency.
Additionally, with the growing interest in the application
of deep neural networks (DNN) to optimization tasks,
contrasting our model with DNN-based optimization
strategies could provide fresh insights and pave the way for
hybrid solutions.
In summary, while many advancements like NSGA-III
have exhibited superior capabilities, the potential of the
original NSGA-II remains vast. By tweaking its rules and
functions, we've illustrated the algorithm's potential in
navigating the intricacies of modern portfolio optimization.
REFERENCES
[1] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, Apr. 2002, doi: https://doi.org/10.1109/4235.996017.
[2] G.-Y. Ban, N. El Karoui, and A. E. B. Lim, “Machine Learning and Portfolio Optimization,” Management Science, vol. 64, no. 3, pp. 1136–1154, Mar. 2018, doi: https://doi.org/10.1287/mnsc.2016.2644.
[3] Z. Zhang, S. Zohren, and S. Roberts, “Deep Learning for Portfolio Optimization,” The Journal of Financial Data Science, vol. 2, no. 4, pp. 8–20, Aug. 2020, doi: https://doi.org/10.3905/jfds.2020.1.042.
[4] R. Mansini, W. Ogryczak, and M. G. Speranza, “Twenty years of linear programming based portfolio optimization,” European Journal of Operational Research, vol. 234, no. 2, pp. 518–535, Apr. 2014, doi: https://doi.org/10.1016/j.ejor.2013.08.035.
[5] S. Dhanalakshmi, S. Kannan, K. Mahadevan, and S. Baskar, “Application of modified NSGA-II algorithm to Combined Economic and Emission Dispatch problem,” International Journal of Electrical Power & Energy Systems, vol. 33, no. 4, pp. 992–1002, May 2011, doi: https://doi.org/10.1016/j.ijepes.2011.01.014.
[6] R. D’Souza, K. Chandra Sekaran, and A. Kandasamy, “Improved NSGA-II Based on a Novel Ranking Scheme,” Feb. 2010, doi: https://doi.org/10.48550/arxiv.1002.4005.
[7] K. Deb and H. Jain, “An Evolutionary Many-Objective Optimization Algorithm Using Reference-Point-Based Nondominated Sorting Approach, Part I: Solving Problems With Box Constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 577–601, Aug. 2014, doi: https://doi.org/10.1109/tevc.2013.2281535.
[8] H. Seada and K. Deb, “U-NSGA-III: A Unified Evolutionary Optimization Procedure for Single, Multiple, and Many Objectives: Proof-of-Principle Results,” pp. 34–49, Jan. 2015, doi: https://doi.org/10.1007/978-3-319-15892-1_3.
[9] Q. Liu, X. Liu, J. Wu, and Y. Li, “An Improved NSGA-III Algorithm Using Genetic K-means Clustering Algorithm,” IEEE Access, 2019, doi: https://doi.org/10.1109/access.2019.2960531.
[10] Z.-M. Gu and G.-G. Wang, “Improving NSGA-III algorithms with information feedback models for large-scale many-objective optimization,” Future Generation Computer Systems, vol. 107, pp. 49–69, Jun. 2020, doi: https://doi.org/10.1016/j.future.2020.01.048.
[11] B. Doerr and Z. Qu, “Runtime Analysis for the NSGA-II: Provable Speed-Ups from Crossover,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 10, pp. 12399–12407, Jun. 2023, doi: https://doi.org/10.1609/aaai.v37i10.26461.
[12] B. Doerr and Z. Qu, “From Understanding the Population Dynamics of the NSGA-II to the First Proven Lower Bounds,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 10, pp. 12408–12416, Jun. 2023, doi: https://doi.org/10.1609/aaai.v37i10.26462.
[13] X. Chu and X. Yu, “Improved Crowding Distance for NSGA-II,” Nov. 2018, doi: https://doi.org/10.48550/arxiv.1811.12667.