Optimizing Portfolios with Modified NSGA-II Solutions
Kaiyuan Lou
Duke Kunshan University
Suzhou, China
Abstract: In financial management, portfolio optimization
remains a central challenge. From Markowitz's Mean-
Variance Optimization to modern deep learning methods,
capturing the multi-faceted nature of portfolio decisions is
complex. This paper emphasizes the potential of Multi-
objective Evolutionary Algorithms (MOEAs), specifically the
NSGA-II. We propose key modifications to NSGA-II, including
refined selection strategies, dynamic mutation parameters, and
optimized population initialization. Initial results showcase
enhanced performance and a more comprehensive Pareto front.
Additionally, we employ Markov Chain Monte Carlo
(MCMC) for a ten-year portfolio projection. Future avenues
include comparisons with other algorithms and expanding our
refined NSGA-II's application to diverse optimization tasks.
Keywords: NSGA-II, portfolio optimization, multi-objective
optimization, evolutionary algorithm
I. INTRODUCTION
In the field of financial management, portfolio
optimization stands as a cornerstone, striving to craft the
ideal balance of assets that maximizes returns while
minimizing risks. However, achieving this balance is often a
complex process, with numerous variables and objectives to
consider, where algorithms play an instrumental role. One
such powerful algorithm, the NSGA-II [1], has been lauded
for its capabilities in multi-objective optimization scenarios.
Yet, as the complexities of financial markets evolve and the
demands of investors become more nuanced, there is a
pressing need to further refine and adapt these computational
strategies, and NSGA-II leaves room for further
enhancement. This article aims to bridge this gap by
proposing and examining a set of principal modifications to
the NSGA-II algorithm. Through a series of modifications,
including improved tournament selection strategies, dynamic
mutation parameters, and optimized population initialization,
we hope to push the boundaries of what's possible in
generating optimal asset allocations. In the subsequent
sections, we introduce a literature review detailing portfolio
optimization algorithms and Multi-objective Evolutionary
Algorithms (MOEAs). The article then elucidates our
proposed methodology and presents the contrast between the
traditional and our enhanced NSGA-II approaches. We
further employ Markov Chain Monte Carlo (MCMC) for a
ten-year return prediction comparison. In the concluding
section, we offer a thorough evaluation and discussion of our
findings.
II. LITERATURE REVIEW
Portfolio optimization has consistently anchored financial
management and investment analysis discussions over the
past decades. Classical Linear Programming (LP) models,
such as Markowitz's Mean-Variance Optimization (MVO)
and the Black-Litterman Model, laid the foundation for
understanding risk-return dynamics. As the financial
landscape evolved, researchers began exploring advanced
strategies such as machine learning [2] and deep
learning [3] to enhance portfolio performance.
However, as pointed out by Mansini et al. [4], given
the multiple-criteria nature of the portfolio problem, a single
LP model might not capture all the nuances of real-world
portfolio optimization. Moreover, models based on Artificial
Neural Networks (ANNs) struggle to generate the entire
Pareto front due to the structural limitations of neural networks.
This brings our attention to Multi-objective Evolutionary
Algorithms (MOEAs), specifically NSGA-II [1]. Renowned
for its capability to mirror real-world problems, NSGA-II
presents an entire Pareto front instead of just a single
optimum. Following its success, several augmentations
emerged. MNSGA-II [5] introduced the Dynamic Crowding
Distance (DCD) to maintain diversity in populations, while a
Novel Ranking Scheme [6] optimized algorithmic time
complexity. NSGA-III [7] and its variants, such as U-
NSGA-III [8] and θ-NSGA-III [9], inherited NSGA-II's core
principles but substituted crowding distance with reference
directions to bolster diversity. Further enhancements came
from Zimin Gu and Gaige Wang [10], who integrated
information feedback mechanisms with NSGA-III for large-
scale multi-objective optimization.
While NSGA-III shows advancements over NSGA-II, the
latter still garners considerable attention among academic
researchers. This is partly because NSGA-III is only one
possible evolution of the traditional NSGA-II, and there are
many other possibilities. Recent studies have delved into
accelerating the crossover process within NSGA-II [11],
while others have proven non-trivial lower bounds on its runtime [12].
Notably, a contemporary research piece [13] combined deep
learning with NSGA-II, addressing emission dispatch
challenges.
In light of these developments, our intention is clear: to
revisit the core NSGA-II algorithm and enhance its
efficiency through strategic modifications. Unlike NSGA-III,
which "pins" solutions uniformly across the solution space
using strict rules, our interest lies in guiding individuals to
explore the solution space more intelligently and effectively.
III. METHOD
To enhance the efficacy of the NSGA-II algorithm, three
principal modifications are suggested. These
recommendations emphasize (1) the normalization of
population distribution, (2) the promotion of mutation and
exploration as convergence approaches, and (3) the
optimization of the initial population distribution.
Additionally, throughout our implementation, we employ an
Improved Crowding Distance formula [13] to further
enhance performance.
We performed an empirical test using data from 26
stocks collected across various sectors for a portfolio
optimization task, with the goal of pinpointing an optimal
Pareto front.
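For concreteness, a minimal Python sketch of the standard mean-variance objectives such a task typically uses is given below; the exact objective definitions are an assumption on our part, and the function and variable names are illustrative.

import numpy as np

def portfolio_objectives(weights, mean_returns, cov_matrix, periods_per_year=252):
    """Evaluate one individual (a weight vector over the 26 stocks).

    weights      : array of shape (26,), non-negative and summing to 1
    mean_returns : per-period mean return of each asset
    cov_matrix   : per-period covariance matrix of asset returns
    """
    annual_return = float(weights @ mean_returns) * periods_per_year
    annual_volatility = float(np.sqrt(weights @ cov_matrix @ weights * periods_per_year))
    # NSGA-II minimizes every objective, so the return is negated.
    return -annual_return, annual_volatility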
A. Population Distribution
In the conventional NSGA-II algorithm, the fronts and
the crowding distance are the two critical factors that
determine whether a new individual will survive or an
existing individual will become a parent after tournament
selection.
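For reference, the crowded-comparison step used in the standard NSGA-II tournament [1] can be sketched as follows; the helper name and array layout are illustrative.

def crowded_compare(i, j, front, crowding):
    """Return the winner of a binary tournament between individuals i and j.

    front    : non-domination front rank of each individual (lower is better)
    crowding : crowding distance of each individual (higher is better)
    """
    if front[i] != front[j]:
        return i if front[i] < front[j] else j
    return i if crowding[i] > crowding[j] else j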
While crowding distance and tournament selection are
mechanisms intended to encourage population diversity, it is
still possible for the population to cluster within a certain
segment of the Pareto-optimal front, rather than achieving an
equal distribution throughout. During tournament
comparisons, the rank attributed to the non-domination front
takes precedence over crowding distance. Consequently, if a
subset of the population gravitates towards a specific region
on the Pareto front that inherently has a superior rank, this
subset is likely to dominate over other solutions with a lower
rank in the non-dominance hierarchy. Such dominance has
the potential to pull solutions of a lower rank towards its
vicinity, culminating in a progressively denser concentration
in these regions.
Fig. 1. Population distribution along the Pareto front generated by NSGA-II
Fig. 1 illustrates an optimal solution set established by
the NSGA-II for the portfolio optimization task. The plot
distinctly reveals that the solutions are extremely dense in the
center of the front and excessively sparse on both sides. This
uneven distribution is not the desired outcome. An
imbalanced distribution of solutions implies that the
algorithm might overlook potential viable solutions on both
ends. On the other hand, solutions in the center might be
over-explored (since this part may have already converged,
an excess of computing resources can only lead to marginal
improvements), which would result in a wastage of
computational resources.
To address this, we refined the tournament selection
strategy. In the traditional NSGA-II algorithm, candidates for
the tournament are chosen uniformly at random from the entire
population, which means the selection is distributed over the
population rather than over the possible solution space. If the
population distribution is skewed, the tournament selection
will be biased as well. Unlike NSGA-III [7], which uses
reference directions and introduces additional computational
complexity, we do not "pin" solutions around reference
vectors, but still offer them a certain opportunity to explore.
The improved selection strategy is governed by the following quantities:
f, the array of non-domination front ranks of the solutions;
c, the array of crowding distances of the solutions;
k, the number of candidates in the tournament; and
b, the bias parameter, where 0 means no bias (the algorithm
reduces to standard NSGA-II) and 1 means strictly picking the
solutions with the highest crowding distance.
The novel strategy we propose is to bias the random
selection in favor of less crowded solutions, making these
more likely to be selected as tournament candidates. This
approach will ensure that sparse areas (those needing more
exploration) are more actively explored.
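Since the exact selection formula is not reproduced here, the following is only a sketch of one way the biased candidate draw could be realized in terms of c, k, and b; the drawn candidates are still compared by front rank and crowding distance afterwards, and the helper name is ours.

import numpy as np

def biased_tournament_candidates(c, k, b, rng=None):
    """Draw k tournament candidates, biased toward sparse regions of the front.

    c : crowding distances of the current population
    k : number of candidates in the tournament
    b : bias in [0, 1]; 0 falls back to uniform random sampling (standard
        NSGA-II), 1 strictly picks the k solutions with the highest crowding
        distance.
    """
    rng = rng or np.random.default_rng()
    if rng.random() < b:
        # Biased draw: take the least crowded (sparsest) solutions.
        return np.argsort(c)[::-1][:k]
    # Unbiased draw: sample uniformly from the whole population.
    return rng.choice(len(c), size=k, replace=False)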
Fig. 2. Comparison of population distributions along the Pareto front
Fig. 2 displays the results produced by both the original
NSGA-II and the NSGA-II with an optimized selection
strategy on the same task. From the visualization, it's evident
that the optimized version yields a more continuous and
superior Pareto front.
B. Dynamic Mutation Parameters
In Evolutionary Algorithms, two critical parameters
pertain to mutation: the mutation probability and sigma. The
former dictates the likelihood of a mutation occurring, while
the latter determines the magnitude of the mutation. Usually,
as the generations increase, a decay mechanism will be
introduced to reduce the mutation probability. This idea
bears some resemblance to the optimizer and learning rate
concepts in Artificial Neural Networks (ANN).
In our methodology, we dynamically adjust both the
mutation probability and sigma throughout the evolutionary
process to promote exploration and maintain diversity
within the population. The foundational principle for this
dynamic adaptation is straightforward: the denser the Pareto
front becomes, the higher the likelihood and magnitude of
mutations. By making the population more "energetic",
encouraging it to jump out of local optima as convergence
nears, the algorithm may require more generations to
converge. However, this increases the chances of
discovering superior solutions.
We've retained the decay mechanism, ensuring that over
time the dynamic mutation probability and sigma gradually
diminish, thereby not hindering the eventual convergence of
solutions.
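One plausible realization of this schedule is sketched below, assuming the density of the first front is measured through its average crowding distance; the constants and names are illustrative rather than the exact values used in our experiments.

import numpy as np

def dynamic_mutation_params(base_prob, base_sigma, generation, front_crowding,
                            decay=0.995, boost=0.5):
    """Scale mutation probability and sigma with how crowded the first front is.

    base_prob, base_sigma : the static values a standard NSGA-II would use
    generation            : current generation index (drives the decay)
    front_crowding        : crowding distances of the current first front
    """
    front_crowding = np.asarray(front_crowding)
    finite = front_crowding[np.isfinite(front_crowding)]
    # Small average crowding distance => dense front => stronger mutation.
    density = 1.0 / (1.0 + finite.mean()) if finite.size else 0.0
    scale = (1.0 + boost * density) * decay ** generation
    return min(1.0, base_prob * scale), base_sigma * scale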
C. Population Initialization
In NSGA-II, the population is initialized randomly. For
portfolio optimization tasks, the starting position of the
population plays a crucial role. If random initialization
serendipitously yields several promising solutions, the
Pareto fronts can potentially improve. This is because these
superior solutions will rapidly produce numerous offspring,
minimizing the time spent on exploration and increasing the
likelihood of convergence to an optimal solution.
The size of the population is pivotal in this context. A
larger population ensures more extensive coverage of the
solution space, thereby increasing the probability of an
advantageous start. However, a significant drawback is that
larger populations demand considerable computational
resources, which often isn't feasible.
A strategic approach to harness the benefits of both a
good start from a larger population and the computational
efficiency of a smaller one involves an initial "burst" of
exploration. Start with an exceptionally large population
size for just one generation to scout for optimal starting
positions, and then scale down to a standard-sized
population for subsequent exploration. This technique can
markedly enhance result quality with negligible additional
computational overhead.
Additionally, employing a combination of rank in the
non-dominated front and crowding distance to downsize the
population post-initialization yields dual benefits. Apart
from achieving a superior Pareto front, the overall
distribution of the population aligns more closely with the
Pareto front. This proximity facilitates accelerated
exploration in subsequent generations, optimizing the
algorithm's efficiency and efficacy.
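A compact sketch of this burst-then-downsize initialization follows, assuming NSGA-II-style helpers fast_non_dominated_sort and crowding_distance are available; the population sizes shown are illustrative.

import numpy as np

def burst_initialize(random_individual, evaluate, burst_size=3000, pop_size=100):
    """Generate one very large generation, then keep the best pop_size members.

    random_individual : callable returning one random weight vector
    evaluate          : callable mapping an individual to its objective tuple
    Assumes fast_non_dominated_sort(objs) -> list of fronts (index lists) and
    crowding_distance(objs, front) -> array, as in the original NSGA-II.
    """
    population = [random_individual() for _ in range(burst_size)]
    objs = np.array([evaluate(ind) for ind in population])
    survivors = []
    for front in fast_non_dominated_sort(objs):
        if len(survivors) + len(front) <= pop_size:
            survivors.extend(front)              # the whole front fits
        else:
            cd = crowding_distance(objs, front)  # keep the least crowded remainder
            order = np.argsort(cd)[::-1]
            survivors.extend(front[i] for i in order[: pop_size - len(survivors)])
            break
    return [population[i] for i in survivors]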
Fig. 3. Standard initialization vs. optimized initialization
The two graphs in Fig. 3 vividly highlight the stark
contrast between standard initialization and optimized
initialization. It's clear that the optimized approach
outperforms the standard one significantly. The blue dots
symbolize random solutions, providing a visual of the
solution space; red dots represent the optimal solutions
identified, while the yellow dots signify other members of
the population. Even though both scenarios encompass
3,000 individuals, the latter not only produces superior
Pareto fronts but also boasts a more favorable distribution of
the population.
IV. EXPERIMENTAL ANALYSIS
For a portfolio optimization task, three different setups
were executed and compared using the same dataset:
Standard NSGA-II
NSGA-II with selection optimization
NSGA-II enhanced with selection optimization,
dynamic mutation, and optimized initialization
Upon analyzing the Pareto fronts generated by each of
these algorithms, we proceeded to select the solution with
the highest Sharpe ratio from each set. These chosen
solutions were then subjected to Markov Chain Monte Carlo
(MCMC) analysis to further evaluate their projected
performance.
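Selecting the representative solution from each front is straightforward; a minimal sketch is given below, with the risk-free rate as an assumed parameter.

import numpy as np

def pick_max_sharpe(front_returns, front_volatilities, risk_free_rate=0.02):
    """Return the index and Sharpe ratio of the best Pareto-front solution.

    front_returns, front_volatilities : annualized return and volatility of
    each Pareto-front solution; risk_free_rate is an assumption.
    """
    sharpe = (np.asarray(front_returns) - risk_free_rate) / np.asarray(front_volatilities)
    return int(np.argmax(sharpe)), float(np.max(sharpe))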
A. Optimized Pareto Fronts
Throughout the iterations, the hypervolume of the Pareto
front was used to track and visualize the performance of the
three NSGA-II variants.
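For the two objectives considered here (return to maximize, volatility to minimize), the hypervolume against a reference point can be computed by sorting the front and summing rectangular slices; a minimal two-objective sketch, with the reference point chosen as an assumption, is shown below.

import numpy as np

def hypervolume_2d(returns, volatilities, ref=(0.0, 1.0)):
    """Hypervolume of a front that maximizes return and minimizes volatility.

    ref : (worst acceptable return, worst acceptable volatility); this choice
          is an assumption, not a value taken from our experiments.
    """
    # Convert to a pure minimization problem: (volatility, -return).
    pts = np.column_stack([np.asarray(volatilities), -np.asarray(returns)])
    r1, r2 = ref[1], -ref[0]
    pts = pts[(pts[:, 0] < r1) & (pts[:, 1] < r2)]  # keep points dominating ref
    pts = pts[np.argsort(pts[:, 0])]                # sort by volatility ascending
    hv, best_f2 = 0.0, r2
    for f1, f2 in pts:
        if f2 < best_f2:                            # skip dominated points
            hv += (r1 - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv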
Fig. 4. Comparison of hypervolume across iterations
The graph clearly illustrates the superior performance of
the fully optimized strategy, followed by the selection-
optimized NSGA-II, while the standard NSGA-II trails
behind.
From the visual representation:
Starting Point: Both the standard NSGA-II and
the selection-optimized variant start with nearly
identical hypervolume values, suggesting a similar
initial performance. In stark contrast, the fully
optimized strategy with its tailored initialization
begins its journey with a significantly enhanced
hypervolume, indicating a promising outset.
Evolutionary Progression: The standard NSGA-II
showcases two prominent increases in its hypervolume,
signifying two key moments when its Pareto front
witnessed substantial advancements. On the other
hand, the two optimized versions depict more
consistent and frequent upticks in their hypervolumes.
This regularity in growth accentuates the
effectiveness of the refined tournament selection
strategy, affirming its role in directing the
population's exploration of the solution space more
proficiently.
Convergence and Duration: The algorithm
equipped with dynamic mutation parameters exhibits
a decelerated convergence rate, leading to an
extended training duration. However, its hypervolume
continues to expand, especially as it nears
convergence. This persistence in growth underscores
the advantage of dynamic mutation, allowing the
algorithm to continually refine and potentially
discover superior solutions even in the latter stages of
the optimization process.
Fig. 5. Pareto fronts found by the three algorithms
Fig. 5 illustrates the final solution sets produced by all
three algorithms, highlighting the distribution of the Pareto
front determined by each. The diagram effectively
showcases the advancements in selection optimization,
dynamic mutation parameters, and initialization
optimization.
When compared to the Pareto front of the standard
NSGA-II, selection optimization offers a Pareto front with
superior distribution and a broader coverage of all potential
solutions. The standard version not only lacks solutions for
portfolios with extreme returns but also offers limited
optimization for solutions not situated at the center.
However, in the standard NSGA-II's favor, the high
population density in the middle indicates that this narrow
region has been thoroughly explored, so its solutions
within this range outperform those of the selection-optimized
algorithm.
The introduction of dynamic mutation parameters and
initialization optimization nudges the Pareto front closer to
the ideal direction. In the fully optimized version, the entire
front outperforms the other two algorithms. This implies
that for every solution found by the other algorithms, the
fully optimized algorithm has identified solutions that are
superior in both return and volatility, achieving complete
dominance. Consequently, as shown in the plot with the
Sharpe Ratio, the optimized algorithm gives a higher value.
B. MCMC Projection
Markov Chain Monte Carlo (MCMC) is a statistical
method that uses the principles of Markov chains to sample
from a probability distribution. This sampling technique is
particularly useful in high-dimensional spaces that are
difficult to explore directly, making it a robust tool for
forecasting complex systems.
In portfolio optimization, a Pareto front that offers
comprehensive coverage of the solution space, catering to
both aggressive and conservative investors, is not our only
aim. We also seek the single best portfolio considering both
return and volatility. For each algorithm, we therefore choose
the solution with the highest Sharpe ratio as the optimal one
and employ MCMC to forecast its performance over the next
ten years for comparison.
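The exact MCMC configuration is not reproduced here; the simplified Monte Carlo sketch below, which assumes daily asset returns drawn i.i.d. from a multivariate normal fitted to historical data, illustrates how per-year return distributions such as those reported in Tables I to III can be produced.

import numpy as np

def simulate_ten_year_paths(weights, mean_daily, cov_daily, n_paths=10000,
                            days_per_year=252, years=10, seed=0):
    """Simulate cumulative portfolio returns, year by year, for n_paths paths.

    Returns an array of shape (n_paths, years); e.g. (paths < 0).mean(axis=0)
    gives the per-year probability of a negative cumulative return.
    """
    rng = np.random.default_rng(seed)
    cumulative = np.zeros((n_paths, years))
    value = np.ones(n_paths)
    for year in range(years):
        # Draw one year's worth of daily returns for every path and asset.
        daily = rng.multivariate_normal(mean_daily, cov_daily,
                                        size=(n_paths, days_per_year))
        portfolio_daily = daily @ weights                 # per-day portfolio return
        value *= np.prod(1.0 + portfolio_daily, axis=1)   # compound over the year
        cumulative[:, year] = value - 1.0
    return cumulative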
Fig. 6. Comparison of MCMC projections for the three algorithms
The box plot in Fig. 6 displays the returns projected by
MCMC over the next ten years. From the diagram, it is
evident that the selection-optimized solution excels in terms
of maximum return, minimum return, the bulk of outcomes,
and the average, and the fully optimized version surpasses
even that.
TABLE I. RETURN FOR NSGA-II
Year    <0       >0       >=5      >=10
1       30.86    69.14     0.00     0.00
2       23.43    76.66     0.00     0.00
3       18.54    81.46     0.11     0.00
4       15.80    84.20     0.87     0.03
5       12.77    87.23     3.05     0.12
6       10.52    89.48     6.53     0.65
7        8.93    91.07    12.02     1.83
8        7.84    92.52    18.06     4.21
9        6.64    93.54    24.63     7.20
10       5.31    94.69    31.05    11.29
TABLE II. RETURN FOR SELECTION-OPTIMIZED NSGA-II
Year    <0       >0       >=5      >=10
1       30.46    69.54     0.00     0.00
2       23.65    76.35     0.01     0.00
3       18.70    81.30     0.22     0.00
4       15.61    84.39     1.49     0.05
5       12.47    87.53     4.50     0.24
6       10.62    89.38     8.96     1.27
7        8.56    91.44    14.45     3.11
8        7.29    92.71    20.44     5.74
9        5.94    94.06    27.43     9.54
10       5.19    94.81    34.09    14.19
TABLE III. RETURN FOR FULLY OPTIMIZED NSGA-II
Year    <0       >0       >=5      >=10
1       25.27    74.73     0.00     0.00
2       17.39    82.61     0.01     0.00
3       13.25    26.75     0.43     0.00
4        9.97    90.03     2.54     0.16
5        7.30    92.70     7.37     0.70
6        5.74    94.26    14.83     2.67
7        4.61    95.39    23.66     6.34
8        3.38    96.62    32.17    11.24
9        2.68    97.32    41.47    17.65
10       2.09    97.91    50.53    24.46
The three tables generated by MCMC further underscore
the dominance of the optimized algorithms' returns over the
standard NSGA-II. Table I shows the results of the standard
NSGA-II algorithm, Table II illustrates the improvements
made through selection optimization, and Table III presents
the fully optimized results, which dominate the other two.
For each year in the upcoming decade, the optimized solution
presents a higher likelihood of yielding superior returns.
V. CONCLUSION
Over the course of this study, we first revisited the
foundational principles of portfolio optimization and delved
into the nuances of Multi-objective Evolutionary Algorithms,
specifically the NSGA-II. Our primary goal was to optimize
the performance of the original NSGA-II algorithm by
applying a set of strategic modifications. Our proposed
enhancements aimed to help individuals explore the solution
space more intelligently and effectively. Through a range of
tests and comparisons, particularly using MCMC to forecast
returns over a decade, we observed evident improvements in
the modified NSGA-II's efficiency.
Additionally, with the growing interest in the application
of deep neural networks (DNN) to optimization tasks,
contrasting our model with DNN-based optimization
strategies could provide fresh insights and pave the way for
hybrid solutions.
In summary, while many advancements like NSGA-III
have exhibited superior capabilities, the potential of the
original NSGA-II remains vast. By tweaking its rules and
functions, we've illustrated the algorithm's potential in
navigating the intricacies of modern portfolio optimization.
REFERENCES
[1] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, Apr. 2002, doi: https://doi.org/10.1109/4235.996017.
[2] G.-Y. Ban, N. El Karoui, and A. E. B. Lim, “Machine Learning and Portfolio Optimization,” Management Science, vol. 64, no. 3, pp. 1136–1154, Mar. 2018, doi: https://doi.org/10.1287/mnsc.2016.2644.
[3] Z. Zhang, S. Zohren, and S. Roberts, “Deep Learning for Portfolio Optimization,” The Journal of Financial Data Science, vol. 2, no. 4, pp. 8–20, Aug. 2020, doi: https://doi.org/10.3905/jfds.2020.1.042.
[4] R. Mansini, W. Ogryczak, and M. G. Speranza, “Twenty years of linear programming based portfolio optimization,” European Journal of Operational Research, vol. 234, no. 2, pp. 518–535, Apr. 2014, doi: https://doi.org/10.1016/j.ejor.2013.08.035.
[5] S. Dhanalakshmi, S. Kannan, K. Mahadevan, and S. Baskar, “Application of modified NSGA-II algorithm to Combined Economic and Emission Dispatch problem,” International Journal of Electrical Power & Energy Systems, vol. 33, no. 4, pp. 992–1002, May 2011, doi: https://doi.org/10.1016/j.ijepes.2011.01.014.
[6] R. D’Souza, K. Chandra Sekaran, and A. Kandasamy, “Improved NSGA-II Based on a Novel Ranking Scheme,” Feb. 2010, doi: https://doi.org/10.48550/arxiv.1002.4005.
[7] K. Deb and H. Jain, “An Evolutionary Many-Objective Optimization Algorithm Using Reference-Point-Based Nondominated Sorting Approach, Part I: Solving Problems With Box Constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 577–601, Aug. 2014, doi: https://doi.org/10.1109/tevc.2013.2281535.
[8] H. Seada and K. Deb, “U-NSGA-III: A Unified Evolutionary Optimization Procedure for Single, Multiple, and Many Objectives: Proof-of-Principle Results,” pp. 34–49, Jan. 2015, doi: https://doi.org/10.1007/978-3-319-15892-1_3.
[9] Q. Liu, X. Liu, J. Wu, and Y. Li, “An Improved NSGA-III Algorithm Using Genetic K-means Clustering Algorithm,” IEEE Access, 2019, doi: https://doi.org/10.1109/access.2019.2960531.
[10] Z.-M. Gu and G.-G. Wang, “Improving NSGA-III algorithms with information feedback models for large-scale many-objective optimization,” Future Generation Computer Systems, vol. 107, pp. 49–69, Jun. 2020, doi: https://doi.org/10.1016/j.future.2020.01.048.
[11] B. Doerr and Z. Qu, “Runtime Analysis for the NSGA-II: Provable Speed-Ups from Crossover,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 10, pp. 12399–12407, Jun. 2023, doi: https://doi.org/10.1609/aaai.v37i10.26461.
[12] B. Doerr and Z. Qu, “From Understanding the Population Dynamics of the NSGA-II to the First Proven Lower Bounds,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 10, pp. 12408–12416, Jun. 2023, doi: https://doi.org/10.1609/aaai.v37i10.26462.
[13] X. Chu and X. Yu, “Improved Crowding Distance for NSGA-II,” Nov. 2018, doi: https://doi.org/10.48550/arxiv.1811.12667.