Dynamic Portfolio Management with Deep Reinforcement Learning (Portfolio of Fidelity® Funds)

DRL is used to provide dynamic asset allocation for a portfolio

Investment Industry
Reinforcement Learning
A2C
PPO
OpenAI Gym
finrl
stable baselines3
pyfolio
Author

Kobus Esterhuysen

Published

November 19, 2021

1. Problem: Dynamic Asset Allocation

Asset allocation refers to the partitioning of available funds in a portfolio among various investment products. Investment products may be categorized in multiple ways. One common (and high-level) classification contains three classes:

  • Stocks
  • Bonds (fixed income)
  • Cash and cash equivalents

Other classification systems may divide each of these categories into subclasses, for example stocks into US and international stocks, or into small-cap, mid-cap, and large-cap stocks.

Another aspect of asset allocation concerns the underlying strategy. Some common strategies are:

  • Tactical asset allocation
  • Strategic asset allocation
  • Constant weight asset allocation
  • Integrated asset allocation
  • Insured asset allocation
  • Dynamic asset allocation

We will focus on the last of these strategies: dynamic asset allocation. With this approach the mix of assets is adjusted at regular intervals to capitalize on the strengthening and weakening of the economy and the rise and fall of markets. This strategy traditionally depends on the judgment of a portfolio manager. In this project, however, we will attempt to replace the portfolio manager with an AI agent trained on past market behavior by means of Deep Reinforcement Learning.

For this Proof-Of-Concept (POC) project we will keep things as simple as possible. The portfolio will consist of 9 Fidelity mutual funds. We will use the DJIA as a handy reference against which the performance of our agent’s dynamic behavior can be compared. Our agent will have the opportunity to adjust the mix of these funds on a daily basis. In practice, a 401(k) agent, for example, might be set up to make monthly adjustments to reduce transaction costs or to avoid trading-related constraints. For simplicity, trading costs will not be taken into account for now.

We have selected one mutual fund from each of the following Fidelity fund categories:

  • Large Value
    • Fidelity® Blue Chip Value Fund (FBCVX)
  • Small/Mid Value
    • Fidelity® Mid-Cap Value Fund (FSMVX)
  • Income-Oriented
    • Fidelity® Dividend Growth Fund (FDGFX)
  • Large Blend
    • Fidelity® US Low Volatility Equity Fund (FULVX)
  • Small/Mid Blend
    • Fidelity® Stock Selector Small-Cap Fund (FDSCX)
  • Go-Anywhere
    • Fidelity® Capital Appreciation Fund (FDCAX)
  • Large Growth
    • Fidelity® Growth Discovery Fund (FDSVX)
  • Small/Mid Growth
    • Fidelity® Small-Cap Growth Fund (FCPGX)
  • Diversifiers
    • Fidelity® Founders Fund (FIFNX)

Our agent will have $1,000,000 to invest when the project starts.

To implement this POC we will make use of the FinRL framework, as well as Yahoo Finance (via the yfinance downloader) to acquire financial data.

2. Solution Proposal

Investment decisions are sequential by nature. Furthermore, a decision that looks optimal in the present may turn out not to be optimal over the longer term. Then there are the complexities of the investment landscape: varying market conditions, disruptive political events, and other economic uncertainties. Reinforcement Learning is well suited to this kind of sequential decision problem.

The solution requires the setup of a digital twin of the investor’s portfolio. In RL terms this model of the portfolio is called an environment. The environment maintains a state, which is modified by applying actions to it.
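To make this interface concrete, the sketch below shows the bare Gym contract such an environment must satisfy. This is not the FinRL class used later (StockPortfolioEnv); it is a minimal, assumed skeleton with placeholder transition logic.

import gym
import numpy as np

class PortfolioEnvSketch(gym.Env):
    """Minimal sketch of the environment contract; not the actual FinRL class."""

    def __init__(self, n_assets=9, initial_amount=1_000_000):
        self.n_assets = n_assets
        self.initial_amount = initial_amount
        # one action per fund: the fraction of the portfolio to allocate this cycle
        self.action_space = gym.spaces.Box(low=0.0, high=1.0, shape=(n_assets,))
        # one state component per fund: the current value of the holdings
        self.observation_space = gym.spaces.Box(low=0.0, high=np.inf, shape=(n_assets,))

    def reset(self):
        self.state = np.full(self.n_assets, self.initial_amount / self.n_assets)
        return self.state

    def step(self, action):
        # placeholder transition: apply the allocation, advance one trading day,
        # revalue the holdings, and compute the reward
        reward, done, info = 0.0, False, {}
        return self.state, reward, done, info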

We will choose the following state vector (measured daily in our case) for the environment:

\[ \Large \begin{aligned} s_1 &= \text{Value of FBCVX holdings} \\ s_2 &= \text{Value of FSMVX holdings} \\ s_3 &= \text{Value of FDGFX holdings} \\ \text{...} \\ s_8 &= \text{Value of FCPGX holdings} \\ s_9 &= \text{Value of FIFNX holdings} \end{aligned} \]

The following action vector (applied daily in our case) will be set up to influence the environment/portfolio:

\[ \Large \begin{aligned} a_1 &= \text{Fraction to be invested in FBCVX this cycle} \\ a_2 &= \text{Fraction to be invested in FSMVX this cycle} \\ a_3 &= \text{Fraction to be invested in FDGFX this cycle} \\ \text{...} \\ a_8 &= \text{Fraction to be invested in FCPGX this cycle} \\ a_9 &= \text{Fraction to be invested in FIFNX this cycle} \end{aligned} \]

All action values are in the [0, 1] interval.
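Note that a raw policy output is not guaranteed to respect this interval or to sum to one, so the action is normalized into proper portfolio weights before being applied. A minimal sketch of such a normalization using a softmax (FinRL's portfolio environment does something along these lines):

import numpy as np

def actions_to_weights(raw_actions):
    """Map raw policy outputs to non-negative weights that sum to 1 (softmax)."""
    e = np.exp(raw_actions - np.max(raw_actions))  # subtract max for numerical stability
    return e / e.sum()

# example: 9 raw action values -> 9 portfolio weights
weights = actions_to_weights(np.array([0.3, -1.2, 0.7, 0.0, 0.1, 0.5, -0.4, 0.2, 0.9]))
print(weights.sum())  # 1.0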

The reward r is given by

\[ \Large \begin{aligned} r(s,a,s') &= \log(v'/v) \end{aligned} \]

where \(v\) and \(v'\) are the portfolio values at states \(s\) and \(s'\) respectively.
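As a quick numerical check of this definition: a portfolio that grows from $1,000,000 to $1,010,000 in one step earns a reward of log(1.01) ≈ 0.00995, while a drop of the same size yields roughly the negative of that.

import numpy as np

def step_reward(v, v_next):
    """Log return of the portfolio value over one step."""
    return np.log(v_next / v)

print(step_reward(1_000_000, 1_010_000))  # ~ 0.00995 (gain)
print(step_reward(1_000_000,   990_000))  # ~ -0.01005 (loss)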

The model of the portfolio/environment will have the following parameters:

\[ \Large \begin{aligned} \theta_1 &= \text{Initial Amount} \\ \theta_2 &= \text{Transaction Cost Pct} \\ \theta_3 &= \text{Size of State and Action Spaces} \\ \theta_4 &= \text{Reward Scaling} \\ \theta_5 &= \text{Technical Indicator List} \end{aligned} \]
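In practice these parameters are collected into a keyword-argument dictionary that is handed to the environment constructor. The sketch below is illustrative only; the key names are assumptions in the style of FinRL's StockPortfolioEnv and may differ slightly between FinRL versions.

# illustrative parameter dictionary (key names assumed, in the style of StockPortfolioEnv)
stock_dimension = 9                       # number of funds in the portfolio
env_kwargs = {
    "initial_amount": 1_000_000,          # theta_1
    "transaction_cost_pct": 0.0,          # theta_2
    "stock_dim": stock_dimension,         # theta_3: size of the state/action spaces
    "state_space": stock_dimension,
    "action_space": stock_dimension,
    "reward_scaling": 1e-1,               # theta_4
    "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"],  # theta_5
}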

3. Implementation of the Solution

To implement the environment we use the OpenAI Gym tooling. The agent is implemented with the FinRL Python library, which wraps stable-baselines3. Two algorithms will be investigated for the agent’s policy:

  • Advantage Actor-Critic (A2C)
  • Proximal Policy Optimization (PPO)

This implementation allows the agent to allocate the investor’s funds among the 9 selected Fidelity funds. The goal is to maximize net worth at the end of the investment horizon.
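Both algorithms are available through stable-baselines3, which FinRL's DRLAgent class wraps. As a hedged sketch (bypassing the wrapper), they could be instantiated directly against a Gym environment; here env is assumed to be the portfolio environment constructed later in the notebook.

from stable_baselines3 import A2C, PPO

# env is assumed to be the (vectorized) portfolio environment built later in this notebook
model_a2c = A2C("MlpPolicy", env, verbose=0)
model_ppo = PPO("MlpPolicy", env, verbose=0)

model_a2c.learn(total_timesteps=10_000)  # short training run, for illustration only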

The plotly library will be used for visualization. The code will run on the Google Colab platform. To start with, we install the Python packages needed.

# hide
# install plotly and finrl library
!pip install plotly==4.4.1
!wget https://github.com/plotly/orca/releases/download/v1.2.1/orca-1.2.1-x86_64.AppImage -O /usr/local/bin/orca
!chmod +x /usr/local/bin/orca
!apt-get install xvfb libgtk2.0-0 libgconf-2-4
!pip install git+https://github.com/AI4Finance-LLC/FinRL-Library.git
!pip install PyPortfolioOpt
(Installation output trimmed. The run completes successfully, installing finrl 0.3.3, elegantrl 0.3.2, pyfolio 0.9.2, stable-baselines3 1.3.0, yfinance 0.1.66, PyPortfolioOpt 1.5.1, and cvxpy 1.1.17, among other dependencies.)

Import the packages needed:

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('Agg')
%matplotlib inline
import datetime
from finrl.apps import config
from finrl.neo_finrl.preprocessor.yahoodownloader import YahooDownloader
from finrl.neo_finrl.preprocessor.preprocessors import FeatureEngineer, data_split
from finrl.neo_finrl.env_portfolio_allocation.env_portfolio import StockPortfolioEnv
from finrl.drl_agents.stablebaselines3.models import DRLAgent
from finrl.plot import backtest_stats, backtest_plot, get_daily_return, get_baseline,convert_daily_return_to_pyfolio_ts

import gym
from gym.utils import seeding
from gym import spaces
from stable_baselines3.common.vec_env import DummyVecEnv

from pyfolio import timeseries
import plotly
import plotly.graph_objs as go
/usr/local/lib/python3.7/dist-packages/pyfolio/pos.py:27: UserWarning: Module "zipline.assets" not found; multipliers will not be applied to position notionals.
  'Module "zipline.assets" not found; multipliers will not be applied'
import sys
sys.path.append("../FinRL-Library")
# hide
pd.set_option('display.max_rows', 100)

Set up some directories:

import os
if not os.path.exists("./" + config.DATA_SAVE_DIR):
    os.makedirs("./" + config.DATA_SAVE_DIR)
if not os.path.exists("./" + config.TRAINED_MODEL_DIR):
    os.makedirs("./" + config.TRAINED_MODEL_DIR)
if not os.path.exists("./" + config.TENSORBOARD_LOG_DIR):
    os.makedirs("./" + config.TENSORBOARD_LOG_DIR)
if not os.path.exists("./" + config.RESULTS_DIR):
    os.makedirs("./" + config.RESULTS_DIR)

3.0 Parameters

DATA_START = '2008-01-01' 
TRAIN_START = '2009-01-01'
TRADE_START = '2020-07-01'
DATA_END = '2021-09-01'
LOOKBACK = 252 #trading days in one year
INITIAL_AMOUNT = 1_000_000 #dollars
TRANSACTION_COST_PCT = 0
REWARD_SCALING = 1e-1 #scaling factor applied to the reward signal

3.1 Download Data

We use daily price data from Yahoo Finance.

# hide
config
<module 'finrl.apps.config' from '/usr/local/lib/python3.7/dist-packages/finrl/apps/config.py'>
# hide
config.DOW_30_TICKER
['AXP',
 'AMGN',
 'AAPL',
 'BA',
 'CAT',
 'CSCO',
 'CVX',
 'GS',
 'HD',
 'HON',
 'IBM',
 'INTC',
 'JNJ',
 'KO',
 'JPM',
 'MCD',
 'MMM',
 'MRK',
 'MSFT',
 'NKE',
 'PG',
 'TRV',
 'UNH',
 'CRM',
 'VZ',
 'V',
 'WBA',
 'WMT',
 'DIS',
 'DOW']
# https://www.fidelity.com/mutual-funds/fidelity-funds/overview
my_tickers = [
  # Large Value
  'FBCVX', #Fidelity® Blue Chip Value Fund
  # 'FLVEX', #Fidelity® Large-Cap Value Enhanced Index Fund
  # 'FSLVX', #Fidelity® Stock Selector Large-Cap Value Fund
  # 'FVDFX', #Fidelity® Value Discovery Fund

  # Small/Mid Value
  # 'FLPSX', #Fidelity® Low-Priced Stock Fund
  'FSMVX', #Fidelity® Mid-Cap Value Fund
  # 'FDVLX', #Fidelity® Value Fund
  # 'FSLSX', #Fidelity® Value Strategies Fund
  # 'FCPVX', #Fidelity® Small-Cap Value Fund

  # Income-Oriented
  # 'FEQTX', #Fidelity® Equity Dividend Income Fund
  # 'FEQIX', #Fidelity® Equity-Income Fund
  # 'FGRIX', #Fidelity® Growth & Income Portfolio Fund
  'FDGFX', #Fidelity® Dividend Growth Fund

  # Large Blend
  # 'FSEBX', #Fidelity® Sustainability U.S. Equity Fund, NEW
  'FULVX', #Fidelity® US Low Volatility Equity Fund
  # 'FDEQX', #Fidelity® Disciplined Equity Fund
  # 'FLCEX', #Fidelity® Large-Cap Core Enhanced Index Fund
  # 'FLCSX', #Fidelity® Large-Cap Stock Fund
  # 'FGRTX', #Fidelity® Mega-Cap Stock Fund

  # Small/Mid Blend
  # 'FMEIX', #Fidelity® Mid-Cap Enhanced Index Fund
  # 'FCPEX', #Fidelity® Small-Cap Enhanced Index Fund
  # 'FSLCX', #Fidelity® Small-Cap Stock Fund
  'FDSCX', #Fidelity® Stock Selector Small-Cap Fund
  # 'FSCRX', #Fidelity® Small-Cap Discovery Fund

  # Go-Anywhere
  'FDCAX', #Fidelity® Capital Appreciation Fund
  # 'FCNTX', #Fidelity® Contrafund®
  # 'FMAGX', #Fidelity® Magellan® Fund
  # 'FMILX', #Fidelity® New Millennium Fund

  # Large Growth
  # 'FBGRX', #Fidelity® Blue Chip Growth Fund
  # 'FEXPX', #Fidelity® Export & Multinational Fund
  # 'FTQGX', #Fidelity® Focused Stock Fund
  # 'FFIDX', #Fidelity® Fund
  'FDSVX', #Fidelity® Growth Discovery Fund
  # 'FDGRX', #Fidelity® Growth Company Fund
  # 'FLGEX', #Fidelity® Large-Cap Growth Enhanced Index Fund
  # 'FOCPX', #Fidelity® OTC Portfolio
  # 'FDSSX', #Fidelity® Stock Selector All Cap Fund
  # 'FTRNX', #Fidelity® Trend Fund

  # Small/Mid Growth
  # 'FDEGX', #Fidelity® Growth Strategies Fund
  # 'FMCSX', #Fidelity® Mid-Cap Stock Fund
  # 'FSSMX', #Fidelity® Stock Selector Mid-Cap Fund
  'FCPGX', #Fidelity® Small-Cap Growth Fund

  # Diversifiers
  # 'FWOMX', #Fidelity® Women's Leadership Fund
  'FIFNX', #Fidelity® Founders Fund
  # 'FLVCX', #Fidelity® Leveraged Company Stock Fund
]
df = YahooDownloader(start_date = DATA_START,
                     end_date = DATA_END,
                     # ticker_list = config.DOW_30_TICKER).fetch_data()
                     ticker_list = my_tickers).fetch_data()
(yfinance progress bars for the nine individual ticker downloads omitted)
Shape of DataFrame:  (25183, 8)
df
date open high low close volume tic day
0 2008-01-02 14.420000 14.420000 14.420000 11.721002 0 FBCVX 2
1 2008-01-02 15.660000 15.660000 15.660000 6.663240 0 FCPGX 2
2 2008-01-02 26.370001 26.370001 26.370001 11.040798 0 FDCAX 2
3 2008-01-02 28.930000 28.930000 28.930000 10.950327 0 FDGFX 2
4 2008-01-02 19.670000 19.670000 19.670000 11.192782 0 FDSCX 2
... ... ... ... ... ... ... ... ...
25178 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1
25179 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1
25180 2021-08-31 19.110001 19.110001 19.110001 19.110001 0 FIFNX 1
25181 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1
25182 2021-08-31 12.160000 12.160000 12.160000 12.160000 0 FULVX 1

25183 rows × 8 columns

# hide
print(df['day'].unique())
# df.loc[100:150, 'day'] #assume day-of-week: 0 - 4
[2 3 4 0 1]
# 
# Verify the 9 unique tickers
# https://www.investopedia.com/ask/answers/who-or-what-is-dow-jones/
lst = list(df['tic'].unique())
print(len(lst), lst)
9 ['FBCVX', 'FCPGX', 'FDCAX', 'FDGFX', 'FDSCX', 'FDSVX', 'FSMVX', 'FIFNX', 'FULVX']

3.2 Data Understanding and Preparation

We will keep showing snippets of the data set as it evolves to aid understanding. We need to check for missing data and to do some feature engineering; we rely on FinRL’s FeatureEngineer class for both. Some of the technical indicators used are:

  • Moving Average Convergence Divergence (MACD)

The MACD is primarily used to gauge the strength of a stock’s price movement. It does this by measuring the divergence of two exponential moving averages (EMAs), commonly a 12-period EMA and a 26-period EMA (a small pandas sketch follows after this list).

  • Relative Strength Index (RSI)

The RSI aims to indicate whether a market is considered to be overbought or oversold in relation to recent price levels.

  • Commodity Channel Index (CCI)

The CCI is a momentum-based oscillator used to help determine when an investment vehicle is reaching an overbought or oversold condition.

FinRL also provides a financial turbulence index, which measures extreme asset price fluctuations (we do not use it here, since use_turbulence=False is passed to the FeatureEngineer below).
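As a concrete example of the first of these indicators, here is a minimal pandas sketch of the MACD line. The indicators in this notebook actually come from the stockstats library via FeatureEngineer; the function and column names below are assumptions for illustration.

import pandas as pd

def macd_line(close: pd.Series, fast: int = 12, slow: int = 26) -> pd.Series:
    """MACD line: fast EMA of the close price minus slow EMA of the close price."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    return ema_fast - ema_slow

# e.g. macd_line(df[df['tic'] == 'FBCVX'].set_index('date')['close'])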

3.2.1 Add technical indicators

%%time
fe = FeatureEngineer(
  use_technical_indicator=True,
  use_turbulence=False,
  user_defined_feature=False)
df = fe.preprocess_data(df)
Successfully added technical indicators
CPU times: user 15.6 s, sys: 1.76 s, total: 17.4 s
Wall time: 15.7 s
# df.head(100)
df
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma
0 2008-01-02 14.420000 14.420000 14.420000 11.721002 0 FBCVX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.721002 11.721002
3441 2008-01-02 15.660000 15.660000 15.660000 6.663240 0 FCPGX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 6.663240 6.663240
6882 2008-01-02 26.370001 26.370001 26.370001 11.040798 0 FDCAX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.040798 11.040798
10323 2008-01-02 28.930000 28.930000 28.930000 10.950327 0 FDGFX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 10.950327 10.950327
13764 2008-01-02 19.670000 19.670000 19.670000 11.192782 0 FDSCX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.192782 11.192782
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
10322 2021-08-31 50.099998 50.099998 50.099998 50.099998 0 FDCAX 1 0.485560 50.233594 47.837405 63.290963 186.262675 33.935129 48.798333 48.066166
13763 2021-08-31 37.290001 37.290001 37.290001 35.175362 0 FDGFX 1 0.160866 35.429981 34.401928 56.769455 108.062182 14.914749 34.818795 34.498705
17204 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1 0.355394 36.956232 34.422768 56.811784 158.442705 22.592461 35.431666 35.356833
20645 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1 0.616046 56.891673 53.754767 63.869985 13.372480 1.499019 55.064085 54.107896
24086 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1 0.269141 29.717076 28.292924 56.730943 100.071603 17.984678 28.728666 28.399833

24087 rows × 16 columns

We see that the FeatureEngineer has added some features:

  • macd
  • boll_ub (upper Bollinger Band)
  • boll_lb (lower Bollinger Band)
  • rsi_30 (with a lookback of 30)
  • cci_30 (with a lookback of 30)
  • dx_30 (with a lookback of 30)
  • close_30_sma (close price simple moving average with a lookback of 30)
  • close_60_sma (close price simple moving average with a lookback of 60)
# hide
# on stockstats library: (seems like FeatureEngineer makes use of it)
# https://medium.com/codex/this-python-library-will-help-you-get-stock-technical-indicators-in-one-line-of-code-c11ed2c8e45f

3.2.2 Add covariance matrix as a feature

Adding the covariance matrix of the funds’ returns as a feature has some advantages. In particular, it can be used to quantify the risk (standard deviation) associated with any given set of portfolio weights.
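To illustrate: given a weight vector w and the covariance matrix Σ of asset returns, the portfolio variance is wᵀΣw and the risk is its square root. Below is a toy sketch with synthetic returns and a hypothetical equal-weight portfolio; the real covariance matrices are computed from the trailing year of fund returns further down.

import numpy as np

rng = np.random.default_rng(0)
toy_returns = rng.normal(0.0, 0.01, size=(252, 9))  # synthetic daily returns for 9 funds
cov = np.cov(toy_returns, rowvar=False)             # 9 x 9 covariance matrix

w = np.full(9, 1 / 9)                               # hypothetical equal-weight portfolio
portfolio_variance = w @ cov @ w                    # w^T . Sigma . w
portfolio_risk = np.sqrt(portfolio_variance)        # standard deviation (risk)
print(portfolio_risk)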

df = df.sort_values(['date','tic'], ignore_index=True)
df
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma
0 2008-01-02 14.420000 14.420000 14.420000 11.721002 0 FBCVX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.721002 11.721002
1 2008-01-02 15.660000 15.660000 15.660000 6.663240 0 FCPGX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 6.663240 6.663240
2 2008-01-02 26.370001 26.370001 26.370001 11.040798 0 FDCAX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.040798 11.040798
3 2008-01-02 28.930000 28.930000 28.930000 10.950327 0 FDGFX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 10.950327 10.950327
4 2008-01-02 19.670000 19.670000 19.670000 11.192782 0 FDSCX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.192782 11.192782
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
24082 2021-08-31 50.099998 50.099998 50.099998 50.099998 0 FDCAX 1 0.485560 50.233594 47.837405 63.290963 186.262675 33.935129 48.798333 48.066166
24083 2021-08-31 37.290001 37.290001 37.290001 35.175362 0 FDGFX 1 0.160866 35.429981 34.401928 56.769455 108.062182 14.914749 34.818795 34.498705
24084 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1 0.355394 36.956232 34.422768 56.811784 158.442705 22.592461 35.431666 35.356833
24085 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1 0.616046 56.891673 53.754767 63.869985 13.372480 1.499019 55.064085 54.107896
24086 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1 0.269141 29.717076 28.292924 56.730943 100.071603 17.984678 28.728666 28.399833

24087 rows × 16 columns

df.index = df.date.factorize()[0] #now each unique date maps to a single integer index value
df
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma
0 2008-01-02 14.420000 14.420000 14.420000 11.721002 0 FBCVX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.721002 11.721002
0 2008-01-02 15.660000 15.660000 15.660000 6.663240 0 FCPGX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 6.663240 6.663240
0 2008-01-02 26.370001 26.370001 26.370001 11.040798 0 FDCAX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.040798 11.040798
0 2008-01-02 28.930000 28.930000 28.930000 10.950327 0 FDGFX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 10.950327 10.950327
0 2008-01-02 19.670000 19.670000 19.670000 11.192782 0 FDSCX 2 0.000000 11.743293 11.674326 0.000000 -66.666667 100.000000 11.192782 11.192782
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3440 2021-08-31 50.099998 50.099998 50.099998 50.099998 0 FDCAX 1 0.485560 50.233594 47.837405 63.290963 186.262675 33.935129 48.798333 48.066166
3440 2021-08-31 37.290001 37.290001 37.290001 35.175362 0 FDGFX 1 0.160866 35.429981 34.401928 56.769455 108.062182 14.914749 34.818795 34.498705
3440 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1 0.355394 36.956232 34.422768 56.811784 158.442705 22.592461 35.431666 35.356833
3440 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1 0.616046 56.891673 53.754767 63.869985 13.372480 1.499019 55.064085 54.107896
3440 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1 0.269141 29.717076 28.292924 56.730943 100.071603 17.984678 28.728666 28.399833

24087 rows × 16 columns

# hide
# len(df.index.unique())
# range(lookback, len(df.index.unique()))
lst = [i for i in range(LOOKBACK, len(df.index.unique()))]
lst[:20], lst[-1]
([252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271], 3440)
%%time
cov_list = []    #one covariance matrix of fund returns per trading day
return_list = [] #one frame of trailing one-year daily returns per trading day
for i in range(LOOKBACK, len(df.index.unique())):
  data_lookback = df.loc[i-LOOKBACK:i, :] #rows for the trailing LOOKBACK trading days
  price_lookback = data_lookback.pivot_table(index='date', columns='tic', values='close')
  return_lookback = price_lookback.pct_change().dropna() #daily returns per fund
  return_list.append(return_lookback)
  covs = return_lookback.cov().values #9x9 covariance matrix of the daily returns
  cov_list.append(covs)
CPU times: user 43.9 s, sys: 581 ms, total: 44.4 s
Wall time: 43.8 s
len(df['date'].unique()), len(df['date'].unique()[LOOKBACK:])
(3441, 3189)
len(cov_list), len(return_list)
(3189, 3189)
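As a consistency check (using the df, LOOKBACK, and cov_list objects defined above), the number of covariance matrices should equal the number of unique trading dates minus the 252-day lookback window, and each matrix should be 7 × 7:

# 3441 unique dates - 252 lookback days = 3189 covariance matrices
assert len(cov_list) == len(df['date'].unique()) - LOOKBACK
assert cov_list[0].shape == (len(df['tic'].unique()), len(df['tic'].unique()))  # 7 x 7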
# hide
# cov_list[0]
# return_list[:1]
# 
# form a dataframe with the cov_list and return_list
df_cov = pd.DataFrame({'date':df.date.unique()[LOOKBACK:], 'cov_list':cov_list, 'return_list':return_list})
df_cov
date cov_list return_list
0 2008-12-31 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
1 2009-01-02 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
2 2009-01-05 [[0.0008942290687322837, 0.0007206030648001816... tic FBCVX FCPGX FDCAX ... ...
3 2009-01-06 [[0.0008951474693283972, 0.0007217056039965993... tic FBCVX FCPGX FDCAX ... ...
4 2009-01-07 [[0.0008975149225319093, 0.0007241915142094865... tic FBCVX FCPGX FDCAX ... ...
... ... ... ...
3184 2021-08-25 [[8.139554607177268e-05, 7.076161408798078e-05... tic FBCVX FCPGX FDCAX ... ...
3185 2021-08-26 [[8.15840000747442e-05, 7.098311940863228e-05,... tic FBCVX FCPGX FDCAX ... ...
3186 2021-08-27 [[8.155510926610815e-05, 7.162918231818502e-05... tic FBCVX FCPGX FDCAX ... ...
3187 2021-08-30 [[8.162658991218995e-05, 7.155445795151861e-05... tic FBCVX FCPGX FDCAX ... ...
3188 2021-08-31 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...

3189 rows × 3 columns

# 
# merge df_cov with the main dataframe
df = df.merge(df_cov, on='date')
df
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma cov_list return_list
0 2008-12-31 7.900000 7.900000 7.900000 6.548186 0 FBCVX 2 -0.003032 6.625188 6.032323 48.240961 102.573711 6.437515 6.187183 6.453092 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
1 2008-12-31 8.690000 8.690000 8.690000 3.697545 0 FCPGX 2 0.015371 3.687956 3.319084 48.638490 137.594713 11.647507 3.427356 3.594576 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
2 2008-12-31 15.730000 15.730000 15.730000 6.669960 0 FDCAX 2 0.025542 6.761376 6.140492 48.908868 110.240063 8.794697 6.289111 6.487707 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
3 2008-12-31 15.790000 15.790000 15.790000 6.350790 0 FDGFX 2 0.025277 6.407065 5.710326 48.573492 115.808269 8.665272 5.894904 6.192935 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
4 2008-12-31 10.530000 10.530000 10.530000 6.006501 0 FDSCX 2 0.025076 6.001624 5.402930 48.564497 128.462223 10.677362 5.559625 5.839951 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
22318 2021-08-31 50.099998 50.099998 50.099998 50.099998 0 FDCAX 1 0.485560 50.233594 47.837405 63.290963 186.262675 33.935129 48.798333 48.066166 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22319 2021-08-31 37.290001 37.290001 37.290001 35.175362 0 FDGFX 1 0.160866 35.429981 34.401928 56.769455 108.062182 14.914749 34.818795 34.498705 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22320 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1 0.355394 36.956232 34.422768 56.811784 158.442705 22.592461 35.431666 35.356833 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22321 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1 0.616046 56.891673 53.754767 63.869985 13.372480 1.499019 55.064085 54.107896 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22322 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1 0.269141 29.717076 28.292924 56.730943 100.071603 17.984678 28.728666 28.399833 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...

22323 rows × 18 columns
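Because the merge on date is an inner join, the first 252 trading days (which have no covariance entry) are dropped: 3,189 dates × 7 funds = 22,323 rows, down from 3,441 × 7 = 24,087. A quick check against the objects defined above:

# every remaining date has one row per fund and one covariance entry
assert len(df) == len(df_cov) * len(df['tic'].unique())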

df = df.sort_values(['date','tic']).reset_index(drop=True)
df
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma cov_list return_list
0 2008-12-31 7.900000 7.900000 7.900000 6.548186 0 FBCVX 2 -0.003032 6.625188 6.032323 48.240961 102.573711 6.437515 6.187183 6.453092 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
1 2008-12-31 8.690000 8.690000 8.690000 3.697545 0 FCPGX 2 0.015371 3.687956 3.319084 48.638490 137.594713 11.647507 3.427356 3.594576 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
2 2008-12-31 15.730000 15.730000 15.730000 6.669960 0 FDCAX 2 0.025542 6.761376 6.140492 48.908868 110.240063 8.794697 6.289111 6.487707 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
3 2008-12-31 15.790000 15.790000 15.790000 6.350790 0 FDGFX 2 0.025277 6.407065 5.710326 48.573492 115.808269 8.665272 5.894904 6.192935 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
4 2008-12-31 10.530000 10.530000 10.530000 6.006501 0 FDSCX 2 0.025076 6.001624 5.402930 48.564497 128.462223 10.677362 5.559625 5.839951 [[0.0008916783690050918, 0.0007199340713349982... tic FBCVX FCPGX FDCAX ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
22318 2021-08-31 50.099998 50.099998 50.099998 50.099998 0 FDCAX 1 0.485560 50.233594 47.837405 63.290963 186.262675 33.935129 48.798333 48.066166 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22319 2021-08-31 37.290001 37.290001 37.290001 35.175362 0 FDGFX 1 0.160866 35.429981 34.401928 56.769455 108.062182 14.914749 34.818795 34.498705 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22320 2021-08-31 36.700001 36.700001 36.700001 36.700001 0 FDSCX 1 0.355394 36.956232 34.422768 56.811784 158.442705 22.592461 35.431666 35.356833 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22321 2021-08-31 56.790001 56.790001 56.790001 56.790001 0 FDSVX 1 0.616046 56.891673 53.754767 63.869985 13.372480 1.499019 55.064085 54.107896 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...
22322 2021-08-31 29.379999 29.379999 29.379999 29.379999 0 FSMVX 1 0.269141 29.717076 28.292924 56.730943 100.071603 17.984678 28.728666 28.399833 [[8.137781480231036e-05, 7.150181363650654e-05... tic FBCVX FCPGX FDCAX ... ...

22323 rows × 18 columns

3.3 Modeling

The portfolio within the market is modeled with the OpenAI Gym framework; in RL terms this is the environment. The agent is built with the FinRL framework, which wraps Stable Baselines3. As the agent interacts with the environment it gradually learns an allocation strategy from the reward signal; the reward at each step is the total value of the portfolio.
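For orientation, the interaction between agent and environment follows the standard Gym loop sketched below. Stable Baselines3 runs the equivalent of this loop internally when we train the agent later, so the snippet is only illustrative; choose_action is a hypothetical stand-in for the agent's policy.

# Illustrative Gym interaction loop (Stable Baselines3 performs the
# equivalent of this internally during training).
obs = env.reset()
done = False
while not done:
    action = choose_action(obs)                  # hypothetical policy
    obs, reward, done, info = env.step(action)   # reward = portfolio value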

3.3.1 Training data

TRAIN_START, TRADE_START
('2009-01-01', '2020-07-01')
train = data_split(df, TRAIN_START, TRADE_START)
train
date open high low close volume tic day macd boll_ub boll_lb rsi_30 cci_30 dx_30 close_30_sma close_60_sma cov_list return_list
0 2009-01-02 8.150000 8.150000 8.150000 6.755406 0 FBCVX 4 0.030432 6.702175 6.010826 49.991970 152.551464 13.863211 6.209512 6.445367 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
0 2009-01-02 8.870000 8.870000 8.870000 3.774134 0 FCPGX 4 0.032454 3.737249 3.305107 49.880034 162.366452 16.518011 3.442107 3.588619 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
0 2009-01-02 16.280001 16.280001 16.280001 6.903177 0 FDCAX 4 0.058199 6.834997 6.136275 51.154864 170.139325 17.925843 6.316292 6.487900 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
0 2009-01-02 16.370001 16.370001 16.370001 6.584068 0 FDGFX 4 0.060919 6.498272 5.695824 50.606719 165.116529 16.838559 5.921930 6.184977 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
0 2009-01-02 10.740000 10.740000 10.740000 6.126287 0 FDSCX 4 0.052369 6.074467 5.390190 49.704302 150.989986 15.152331 5.582884 5.829957 [[0.0008961331319844811, 0.0007229547753450831... tic FBCVX FCPGX FDCAX ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2892 2020-06-30 35.930000 35.930000 35.930000 33.073399 0 FDCAX 1 0.501019 33.705111 31.615998 57.598799 82.309417 17.475881 32.193710 30.489567 [[0.000538045795431353, 0.0004554362595919771,... tic FBCVX FCPGX FDCAX ... ...
2892 2020-06-30 25.700001 25.700001 25.700001 23.743460 0 FDGFX 1 0.034722 26.677363 22.296985 50.881596 -25.411227 0.081819 24.063118 22.812814 [[0.000538045795431353, 0.0004554362595919771,... tic FBCVX FCPGX FDCAX ... ...
2892 2020-06-30 22.870001 22.870001 22.870001 22.532381 0 FDSCX 1 0.174035 23.630273 21.395077 53.642964 31.031452 8.899931 22.311030 21.079976 [[0.000538045795431353, 0.0004554362595919771,... tic FBCVX FCPGX FDCAX ... ...
2892 2020-06-30 45.220001 45.220001 45.220001 37.346981 0 FDSVX 1 0.708226 37.957114 35.252713 59.415280 99.055416 21.233245 36.033532 33.922269 [[0.000538045795431353, 0.0004554362595919771,... tic FBCVX FCPGX FDCAX ... ...
2892 2020-06-30 18.670000 18.670000 18.670000 18.302086 0 FSMVX 1 0.013502 20.369239 17.161312 51.025374 -19.628866 1.400737 18.481807 17.546607 [[0.000538045795431353, 0.0004554362595919771,... tic FBCVX FCPGX FDCAX ... ...

20251 rows × 18 columns
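FinRL's data_split selects the rows whose dates fall in the requested interval and re-factorizes the index so that the training set again starts at day 0. The sketch below paraphrases what it does; see the installed FinRL version for the exact implementation.

# Rough sketch of FinRL's data_split (paraphrased, not the library source).
def data_split_sketch(df, start, end):
    data = df[(df.date >= start) & (df.date < end)]
    data = data.sort_values(['date', 'tic'], ignore_index=True)
    data.index = data.date.factorize()[0]   # day 0, 1, 2, ... within the split
    return data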

3.3.2 Portfolio (Environment)

class Portfolio(gym.Env):
    """A portfolio/market environment
    Attributes
    ----------
        df: DataFrame
            input data
        stock_dim : int
            number of unique stocks
        hmax : int
            maximum number of shares to trade
        initial_amount : int
            start money
        transaction_cost_pct: float
            transaction cost percentage per trade
        reward_scaling: float
            scaling factor for reward, good for training
        state_space: int
            the dimension of input features
        action_space: int
            equals stock dimension
        tech_indicator_list: list
            a list of technical indicator names
        turbulence_threshold: int
            a threshold to control risk aversion
        day: int
            an increment number to control date
    Methods
    -------
    step()
        apply the agent's actions (softmax-normalized into portfolio weights),
        compute the reward, and return the next observation
    reset()
        reset the environment to the first trading day
    render()
        return the current state
    save_asset_memory()
        return account value at each time step
    save_action_memory()
        return actions/positions at each time step
    """
    metadata = {'render.modes': ['human']}

    def __init__(self, 
                df,
                stock_dim,
                hmax,
                initial_amount,
                transaction_cost_pct,
                reward_scaling,
                state_space,
                action_space,
                tech_indicator_list,
                turbulence_threshold=None,
                lookback=LOOKBACK,
                day=0):
        #super(StockEnv, self).__init__()
        #money = 10 , scope = 1
        self.day = day
        self.lookback = lookback
        self.df = df
        self.stock_dim = stock_dim
        self.hmax = hmax
        self.initial_amount = initial_amount
        self.transaction_cost_pct = transaction_cost_pct
        self.reward_scaling = reward_scaling
        self.state_space = state_space
        self.action_space = action_space
        self.tech_indicator_list = tech_indicator_list

        # action_space normalization and shape is self.stock_dim
        self.action_space = spaces.Box(low=0, high=1, shape=(self.action_space,)) 
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(self.state_space + len(self.tech_indicator_list), self.state_space))

        # load data from a pandas dataframe
        self.data = self.df.loc[self.day, :]
        self.covs = self.data['cov_list'].values[0]
        self.state =  np.append(np.array(self.covs), [self.data[tech].values.tolist() for tech in self.tech_indicator_list ], axis=0)
        self.terminal = False
        self.turbulence_threshold = turbulence_threshold
        # initialize state: initial portfolio return + individual stock returns + individual weights
        self.portfolio_value = self.initial_amount

        # memorize portfolio value each step
        self.asset_memory = [self.initial_amount]
        # memorize portfolio return each step
        self.portfolio_return_memory = [0]
        self.actions_memory = [[1/self.stock_dim]*self.stock_dim]
        self.date_memory = [self.data.date.unique()[0]]

    def step(self, actions):
        self.terminal = self.day >= len(self.df.index.unique()) - 1

        if self.terminal:
            df = pd.DataFrame(self.portfolio_return_memory)
            df.columns = ['daily_return']
            plt.plot(df.daily_return.cumsum(), 'r')
            plt.savefig('results/cumulative_reward.png')
            plt.close()
            
            plt.plot(self.portfolio_return_memory, 'r')
            plt.savefig('results/rewards.png')
            plt.close()

            print("=================================")
            print("begin_total_asset:{}".format(self.asset_memory[0]))           
            print("end_total_asset:{}".format(self.portfolio_value))

            df_daily_return = pd.DataFrame(self.portfolio_return_memory)
            df_daily_return.columns = ['daily_return']
            if df_daily_return['daily_return'].std() !=0:
              sharpe = (252**0.5)*df_daily_return['daily_return'].mean()/ \
                       df_daily_return['daily_return'].std()
              print("Sharpe: ",sharpe)
            print("=================================")
            
            return self.state, self.reward, self.terminal, {}
        else:
            weights = self.softmax_normalization(actions) 
            self.actions_memory.append(weights)
            last_day_memory = self.data

            #load next state
            self.day += 1
            self.data = self.df.loc[self.day,:]
            self.covs = self.data['cov_list'].values[0]
            self.state =  np.append(np.array(self.covs), [self.data[tech].values.tolist() for tech in self.tech_indicator_list ], axis=0)
            portfolio_return = sum(((self.data.close.values / last_day_memory.close.values)-1)*weights)
            log_portfolio_return = np.log(sum((self.data.close.values / last_day_memory.close.values)*weights)) #. computed but not used further
            # update portfolio value
            new_portfolio_value = self.portfolio_value*(1+portfolio_return)
            self.portfolio_value = new_portfolio_value

            # save into memory
            self.portfolio_return_memory.append(portfolio_return)
            self.date_memory.append(self.data.date.unique()[0])            
            self.asset_memory.append(new_portfolio_value)

            # the reward is the new portfolio value (the end portfolio value at the terminal step)
            self.reward = new_portfolio_value
        return self.state, self.reward, self.terminal, {}

    def reset(self):
        self.asset_memory = [self.initial_amount]
        self.day = 0
        self.data = self.df.loc[self.day,:]
        # load states
        self.covs = self.data['cov_list'].values[0]
        self.state =  np.append(np.array(self.covs), [self.data[tech].values.tolist() for tech in self.tech_indicator_list ], axis=0)
        self.portfolio_value = self.initial_amount
        #self.cost = 0
        #self.trades = 0
        self.terminal = False 
        self.portfolio_return_memory = [0]
        self.actions_memory=[[1/self.stock_dim]*self.stock_dim]
        self.date_memory=[self.data.date.unique()[0]] 
        return self.state
    
    def render(self, mode='human'):
        return self.state
        
    def softmax_normalization(self, actions):
        numerator = np.exp(actions)
        denominator = np.sum(np.exp(actions))
        softmax_output = numerator/denominator
        return softmax_output

    def save_asset_memory(self):
        date_list = self.date_memory
        portfolio_return = self.portfolio_return_memory
        #print(len(date_list))
        #print(len(asset_list))
        df_account_value = pd.DataFrame({'date':date_list,'daily_return':portfolio_return})
        return df_account_value

    def save_action_memory(self):
        # date and close price length must match actions length
        date_list = self.date_memory
        df_date = pd.DataFrame(date_list)
        df_date.columns = ['date']
        
        action_list = self.actions_memory
        df_actions = pd.DataFrame(action_list)
        df_actions.columns = self.data.tic.values
        df_actions.index = df_date.date
        #df_actions = pd.DataFrame({'date':date_list,'actions':action_list})
        return df_actions

    def _seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def get_sb_env(self):
        e = DummyVecEnv([lambda: self])
        obs = e.reset()
        return e, obs
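The arithmetic inside step() is simple: the raw actions are softmax-normalized into weights that sum to one, and the day's portfolio return is the weighted sum of the per-fund price ratios. A small self-contained numeric illustration with made-up prices for three hypothetical funds:

import numpy as np
# Made-up one-step illustration of the weight/return arithmetic in step().
raw_actions = np.array([0.2, -0.1, 0.5])
weights = np.exp(raw_actions) / np.exp(raw_actions).sum()          # softmax, sums to 1
prev_close = np.array([10.0, 20.0, 30.0])
new_close  = np.array([10.5, 19.8, 30.9])
portfolio_return = np.sum((new_close / prev_close - 1) * weights)  # weighted daily return
new_portfolio_value = 1_000_000 * (1 + portfolio_return)
print(weights.round(3), round(portfolio_return, 4), round(new_portfolio_value, 2))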
# hide
#. was 29 in original notebook !
stock_dimension = len(train['tic'].unique())
state_space = stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")
Stock Dimension: 7, State Space: 7
# hide
config.TECHNICAL_INDICATORS_LIST
['macd',
 'boll_ub',
 'boll_lb',
 'rsi_30',
 'cci_30',
 'dx_30',
 'close_30_sma',
 'close_60_sma']
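Note that only 7 of the 9 funds selected at the outset survive the preprocessing, presumably because the other two lack price history back to 2008, so both the state and action dimensions are 7. The observation returned by the environment stacks the 7×7 return-covariance matrix on top of one row per technical indicator, giving a (7 + 8) × 7 = 15 × 7 array. A quick check against the objects defined above:

# observation = covariance matrix (7 x 7) stacked with 8 indicator rows -> (15, 7)
expected_obs_shape = (stock_dimension + len(config.TECHNICAL_INDICATORS_LIST), stock_dimension)
print(expected_obs_shape)   # (15, 7)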
env_kwargs = {
    "hmax": 100, 
    "initial_amount": INITIAL_AMOUNT, 
    "transaction_cost_pct": TRANSACTION_COST_PCT, 
    "state_space": state_space, 
    "stock_dim": stock_dimension, 
    "tech_indicator_list": config.TECHNICAL_INDICATORS_LIST, 
    "action_space": stock_dimension, 
    "reward_scaling": REWARD_SCALING,
}
e_train_gym = Portfolio(df=train, **env_kwargs)
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))
<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>

3.3.3 Portfolio Manager (Agent)

  • We investigate the performance of two models for the agent:

    • A2C (Advantage Actor-Critic)
    • PPO (Proximal Policy Optimization)

Both algorithms are used through their Stable Baselines3 implementations (the PyTorch successor to OpenAI's Baselines), accessed via FinRL's DRLAgent wrapper.

# hide
# https://towardsdatascience.com/finrl-for-quantitative-finance-tutorial-for-portfolio-allocation-9b417660c7cd
# hide
# agent = DRLAgent(env=env_train)
# DDPG_PARAMS = {
#   # "n_steps": 10, 
#   # "ent_coef": 0.005, 
#   "learning_rate": 0.0004,
# }
# model_ddpg = agent.get_model(model_name="ddpg", model_kwargs=DDPG_PARAMS)
# model_ddpg
# hide
# %%time
# trained_ddpg = agent.train_model(model=model_ddpg, tb_log_name='ddpg', total_timesteps=1_000)
# hide
# agent = DRLAgent(env=env_train)
# SAC_PARAMS = {
#   # "n_steps": 10, 
#   # "ent_coef": 0.005, 
#   "learning_rate": 0.0004,
# }
# model_sac = agent.get_model(model_name="sac", model_kwargs=SAC_PARAMS)
# model_sac
# hide
# %%time
# trained_sac = agent.train_model(model=model_sac, tb_log_name='sac', total_timesteps=1_000)
# hide
# agent = DRLAgent(env=env_train)
# TD3_PARAMS = {
#   # "n_steps": 10, 
#   # "ent_coef": 0.005, 
#   "learning_rate": 0.0004,
# }
# model_td3 = agent.get_model(model_name="td3", model_kwargs=TD3_PARAMS)
# model_td3
# hide
# %%time
# trained_sac = agent.train_model(model=model_sac, tb_log_name='sac', total_timesteps=1_000)
# hide
# agent = DRLAgent(env=env_train)
# MADDDPG_PARAMS = {
#   # "n_steps": 10, 
#   # "ent_coef": 0.005, 
#   "learning_rate": 0.0004,
# }
# model_madddpg = agent.get_model(model_name="madddpg", model_kwargs=MADDDPG_PARAMS)
# model_madddpg
# hide
# %%time
# trained_madddpg = agent.train_model(model=model_madddpg, tb_log_name='madddpg', total_timesteps=1_000)
3.3.3.1 A2C
agent = DRLAgent(env=env_train)
A2C_PARAMS = {
  "n_steps": 10, 
  "ent_coef": 0.005, 
  "learning_rate": 0.0004,
}
model_a2c = agent.get_model(model_name="a2c", model_kwargs=A2C_PARAMS)
model_a2c
{'n_steps': 10, 'ent_coef': 0.005, 'learning_rate': 0.0004}
Using cuda device
<stable_baselines3.a2c.a2c.A2C at 0x7fc9fc325d10>
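For reference, the get_model call above roughly corresponds to constructing the Stable Baselines3 model directly with an MLP policy. The sketch below is only an approximation; FinRL may apply additional defaults such as the policy network architecture.

from stable_baselines3 import A2C
# Approximate direct-SB3 equivalent of agent.get_model("a2c", ...) above.
model_a2c_direct = A2C(
    policy="MlpPolicy",
    env=env_train,
    n_steps=10,
    ent_coef=0.005,
    learning_rate=0.0004,
    tensorboard_log="tensorboard_log/a2c",
)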
%%time
trained_a2c = agent.train_model(model=model_a2c, tb_log_name='a2c', total_timesteps=50_000)
Logging to tensorboard_log/a2c/a2c_1
-------------------------------------
| time/                 |           |
|    fps                | 84        |
|    iterations         | 100       |
|    time_elapsed       | 11        |
|    total_timesteps    | 1000      |
| train/                |           |
|    entropy_loss       | -9.88     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 99        |
|    policy_loss        | 9.6e+07   |
|    reward             | 1770969.9 |
|    std                | 0.993     |
|    value_loss         | 1.09e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 126       |
|    iterations         | 200       |
|    time_elapsed       | 15        |
|    total_timesteps    | 2000      |
| train/                |           |
|    entropy_loss       | -9.87     |
|    explained_variance | -2.38e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 199       |
|    policy_loss        | 1.51e+08  |
|    reward             | 2978305.2 |
|    std                | 0.992     |
|    value_loss         | 3.02e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4110079.559657067
Sharpe:  0.6979807594386715
=================================
-------------------------------------
| time/                 |           |
|    fps                | 148       |
|    iterations         | 300       |
|    time_elapsed       | 20        |
|    total_timesteps    | 3000      |
| train/                |           |
|    entropy_loss       | -9.88     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 299       |
|    policy_loss        | 5.4e+07   |
|    reward             | 1094988.2 |
|    std                | 0.992     |
|    value_loss         | 3.82e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 164       |
|    iterations         | 400       |
|    time_elapsed       | 24        |
|    total_timesteps    | 4000      |
| train/                |           |
|    entropy_loss       | -9.89     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 399       |
|    policy_loss        | 1.04e+08  |
|    reward             | 2092528.2 |
|    std                | 0.993     |
|    value_loss         | 1.53e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 175       |
|    iterations         | 500       |
|    time_elapsed       | 28        |
|    total_timesteps    | 5000      |
| train/                |           |
|    entropy_loss       | -9.88     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 499       |
|    policy_loss        | 1.49e+08  |
|    reward             | 3133837.2 |
|    std                | 0.992     |
|    value_loss         | 3.65e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3976451.6208797866
Sharpe:  0.6857333704791829
=================================
-------------------------------------
| time/                 |           |
|    fps                | 182       |
|    iterations         | 600       |
|    time_elapsed       | 32        |
|    total_timesteps    | 6000      |
| train/                |           |
|    entropy_loss       | -9.86     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 599       |
|    policy_loss        | 6.88e+07  |
|    reward             | 1255545.6 |
|    std                | 0.99      |
|    value_loss         | 6.26e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 190       |
|    iterations         | 700       |
|    time_elapsed       | 36        |
|    total_timesteps    | 7000      |
| train/                |           |
|    entropy_loss       | -9.85     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 699       |
|    policy_loss        | 1.3e+08   |
|    reward             | 2371918.2 |
|    std                | 0.989     |
|    value_loss         | 1.86e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 196       |
|    iterations         | 800       |
|    time_elapsed       | 40        |
|    total_timesteps    | 8000      |
| train/                |           |
|    entropy_loss       | -9.86     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 799       |
|    policy_loss        | 1.8e+08   |
|    reward             | 3547805.2 |
|    std                | 0.989     |
|    value_loss         | 4.39e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4161686.6336636837
Sharpe:  0.7041053033538833
=================================
-------------------------------------
| time/                 |           |
|    fps                | 199       |
|    iterations         | 900       |
|    time_elapsed       | 45        |
|    total_timesteps    | 9000      |
| train/                |           |
|    entropy_loss       | -9.84     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 899       |
|    policy_loss        | 7.17e+07  |
|    reward             | 1521407.1 |
|    std                | 0.987     |
|    value_loss         | 7.52e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 203       |
|    iterations         | 1000      |
|    time_elapsed       | 49        |
|    total_timesteps    | 10000     |
| train/                |           |
|    entropy_loss       | -9.83     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 999       |
|    policy_loss        | 1.41e+08  |
|    reward             | 2594842.5 |
|    std                | 0.986     |
|    value_loss         | 2.43e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 206       |
|    iterations         | 1100      |
|    time_elapsed       | 53        |
|    total_timesteps    | 11000     |
| train/                |           |
|    entropy_loss       | -9.82     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 1099      |
|    policy_loss        | 2.02e+08  |
|    reward             | 3621684.2 |
|    std                | 0.985     |
|    value_loss         | 5.13e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4182719.841788154
Sharpe:  0.7065226810322777
=================================
-------------------------------------
| time/                 |           |
|    fps                | 208       |
|    iterations         | 1200      |
|    time_elapsed       | 57        |
|    total_timesteps    | 12000     |
| train/                |           |
|    entropy_loss       | -9.8      |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 1199      |
|    policy_loss        | 6.41e+07  |
|    reward             | 1435004.8 |
|    std                | 0.982     |
|    value_loss         | 6.42e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 211       |
|    iterations         | 1300      |
|    time_elapsed       | 61        |
|    total_timesteps    | 13000     |
| train/                |           |
|    entropy_loss       | -9.79     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 1299      |
|    policy_loss        | 1.38e+08  |
|    reward             | 2707523.2 |
|    std                | 0.98      |
|    value_loss         | 2.51e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 213       |
|    iterations         | 1400      |
|    time_elapsed       | 65        |
|    total_timesteps    | 14000     |
| train/                |           |
|    entropy_loss       | -9.78     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 1399      |
|    policy_loss        | 2.29e+08  |
|    reward             | 4180936.8 |
|    std                | 0.979     |
|    value_loss         | 6.08e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4271331.5898148315
Sharpe:  0.7141078133246254
=================================
-------------------------------------
| time/                 |           |
|    fps                | 214       |
|    iterations         | 1500      |
|    time_elapsed       | 69        |
|    total_timesteps    | 15000     |
| train/                |           |
|    entropy_loss       | -9.78     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 1499      |
|    policy_loss        | 8.03e+07  |
|    reward             | 1770714.2 |
|    std                | 0.978     |
|    value_loss         | 1.03e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 216       |
|    iterations         | 1600      |
|    time_elapsed       | 73        |
|    total_timesteps    | 16000     |
| train/                |           |
|    entropy_loss       | -9.78     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 1599      |
|    policy_loss        | 1.33e+08  |
|    reward             | 2799781.5 |
|    std                | 0.979     |
|    value_loss         | 2.76e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 218       |
|    iterations         | 1700      |
|    time_elapsed       | 77        |
|    total_timesteps    | 17000     |
| train/                |           |
|    entropy_loss       | -9.77     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 1699      |
|    policy_loss        | 2.09e+08  |
|    reward             | 3782954.2 |
|    std                | 0.977     |
|    value_loss         | 4.7e+14   |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4133498.8876433456
Sharpe:  0.7009423253577455
=================================
-------------------------------------
| time/                 |           |
|    fps                | 218       |
|    iterations         | 1800      |
|    time_elapsed       | 82        |
|    total_timesteps    | 18000     |
| train/                |           |
|    entropy_loss       | -9.75     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 1799      |
|    policy_loss        | 8.77e+07  |
|    reward             | 1747342.0 |
|    std                | 0.974     |
|    value_loss         | 1.08e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 220       |
|    iterations         | 1900      |
|    time_elapsed       | 86        |
|    total_timesteps    | 19000     |
| train/                |           |
|    entropy_loss       | -9.73     |
|    explained_variance | 1.79e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 1899      |
|    policy_loss        | 1.55e+08  |
|    reward             | 2885395.5 |
|    std                | 0.972     |
|    value_loss         | 2.98e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 221       |
|    iterations         | 2000      |
|    time_elapsed       | 90        |
|    total_timesteps    | 20000     |
| train/                |           |
|    entropy_loss       | -9.72     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 1999      |
|    policy_loss        | 2.15e+08  |
|    reward             | 3976518.0 |
|    std                | 0.971     |
|    value_loss         | 5.31e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4046834.5908127716
Sharpe:  0.6919991450241213
=================================
-------------------------------------
| time/                 |           |
|    fps                | 221       |
|    iterations         | 2100      |
|    time_elapsed       | 94        |
|    total_timesteps    | 21000     |
| train/                |           |
|    entropy_loss       | -9.71     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 2099      |
|    policy_loss        | 8.41e+07  |
|    reward             | 1543899.6 |
|    std                | 0.968     |
|    value_loss         | 8.56e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 222       |
|    iterations         | 2200      |
|    time_elapsed       | 98        |
|    total_timesteps    | 22000     |
| train/                |           |
|    entropy_loss       | -9.71     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 2199      |
|    policy_loss        | 1.46e+08  |
|    reward             | 2804042.8 |
|    std                | 0.968     |
|    value_loss         | 3.04e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 223       |
|    iterations         | 2300      |
|    time_elapsed       | 102       |
|    total_timesteps    | 23000     |
| train/                |           |
|    entropy_loss       | -9.7      |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 2299      |
|    policy_loss        | 1.99e+08  |
|    reward             | 4275337.5 |
|    std                | 0.967     |
|    value_loss         | 6.46e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4052447.3183236937
Sharpe:  0.6928298846678986
=================================
-------------------------------------
| time/                 |           |
|    fps                | 223       |
|    iterations         | 2400      |
|    time_elapsed       | 107       |
|    total_timesteps    | 24000     |
| train/                |           |
|    entropy_loss       | -9.68     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 2399      |
|    policy_loss        | 8.6e+07   |
|    reward             | 1638133.4 |
|    std                | 0.965     |
|    value_loss         | 1.05e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 224       |
|    iterations         | 2500      |
|    time_elapsed       | 111       |
|    total_timesteps    | 25000     |
| train/                |           |
|    entropy_loss       | -9.68     |
|    explained_variance | -2.38e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 2499      |
|    policy_loss        | 1.35e+08  |
|    reward             | 2705907.2 |
|    std                | 0.965     |
|    value_loss         | 2.72e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 225       |
|    iterations         | 2600      |
|    time_elapsed       | 115       |
|    total_timesteps    | 26000     |
| train/                |           |
|    entropy_loss       | -9.67     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 2599      |
|    policy_loss        | 1.97e+08  |
|    reward             | 3837006.5 |
|    std                | 0.964     |
|    value_loss         | 4.66e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4123691.04929031
Sharpe:  0.700619600708785
=================================
-------------------------------------
| time/                 |           |
|    fps                | 225       |
|    iterations         | 2700      |
|    time_elapsed       | 119       |
|    total_timesteps    | 27000     |
| train/                |           |
|    entropy_loss       | -9.68     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 2699      |
|    policy_loss        | 8.79e+07  |
|    reward             | 1748944.0 |
|    std                | 0.965     |
|    value_loss         | 1.14e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 226       |
|    iterations         | 2800      |
|    time_elapsed       | 123       |
|    total_timesteps    | 28000     |
| train/                |           |
|    entropy_loss       | -9.67     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 2799      |
|    policy_loss        | 1.77e+08  |
|    reward             | 2919523.0 |
|    std                | 0.964     |
|    value_loss         | 3.12e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4164192.583783833
Sharpe:  0.7051087865111408
=================================
-------------------------------------
| time/                 |           |
|    fps                | 226       |
|    iterations         | 2900      |
|    time_elapsed       | 127       |
|    total_timesteps    | 29000     |
| train/                |           |
|    entropy_loss       | -9.65     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 2899      |
|    policy_loss        | 4.63e+07  |
|    reward             | 951793.56 |
|    std                | 0.961     |
|    value_loss         | 2.8e+13   |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 227       |
|    iterations         | 3000      |
|    time_elapsed       | 132       |
|    total_timesteps    | 30000     |
| train/                |           |
|    entropy_loss       | -9.65     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 2999      |
|    policy_loss        | 9.13e+07  |
|    reward             | 1990526.0 |
|    std                | 0.96      |
|    value_loss         | 1.45e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 227       |
|    iterations         | 3100      |
|    time_elapsed       | 136       |
|    total_timesteps    | 31000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 3099      |
|    policy_loss        | 1.58e+08  |
|    reward             | 3170679.8 |
|    std                | 0.958     |
|    value_loss         | 3.69e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4067124.4699334074
Sharpe:  0.6948431965746262
=================================
-------------------------------------
| time/                 |           |
|    fps                | 227       |
|    iterations         | 3200      |
|    time_elapsed       | 140       |
|    total_timesteps    | 32000     |
| train/                |           |
|    entropy_loss       | -9.64     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 3199      |
|    policy_loss        | 6.1e+07   |
|    reward             | 1306404.8 |
|    std                | 0.959     |
|    value_loss         | 5.29e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 228       |
|    iterations         | 3300      |
|    time_elapsed       | 144       |
|    total_timesteps    | 33000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 3299      |
|    policy_loss        | 9.71e+07  |
|    reward             | 2191711.2 |
|    std                | 0.958     |
|    value_loss         | 1.7e+14   |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 228       |
|    iterations         | 3400      |
|    time_elapsed       | 148       |
|    total_timesteps    | 34000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 3399      |
|    policy_loss        | 1.56e+08  |
|    reward             | 3346152.2 |
|    std                | 0.958     |
|    value_loss         | 4.1e+14   |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4034374.463013146
Sharpe:  0.6906021236478135
=================================
-------------------------------------
| time/                 |           |
|    fps                | 228       |
|    iterations         | 3500      |
|    time_elapsed       | 153       |
|    total_timesteps    | 35000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 3499      |
|    policy_loss        | 6.01e+07  |
|    reward             | 1337613.8 |
|    std                | 0.957     |
|    value_loss         | 6.05e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 229       |
|    iterations         | 3600      |
|    time_elapsed       | 157       |
|    total_timesteps    | 36000     |
| train/                |           |
|    entropy_loss       | -9.64     |
|    explained_variance | 1.79e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 3599      |
|    policy_loss        | 1.4e+08   |
|    reward             | 2489865.5 |
|    std                | 0.96      |
|    value_loss         | 2.28e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 229       |
|    iterations         | 3700      |
|    time_elapsed       | 161       |
|    total_timesteps    | 37000     |
| train/                |           |
|    entropy_loss       | -9.64     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 3699      |
|    policy_loss        | 1.88e+08  |
|    reward             | 4019591.0 |
|    std                | 0.96      |
|    value_loss         | 5.58e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4119809.4408771424
Sharpe:  0.7003255647826773
=================================
-------------------------------------
| time/                 |           |
|    fps                | 229       |
|    iterations         | 3800      |
|    time_elapsed       | 165       |
|    total_timesteps    | 38000     |
| train/                |           |
|    entropy_loss       | -9.65     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 3799      |
|    policy_loss        | 6.5e+07   |
|    reward             | 1363994.8 |
|    std                | 0.96      |
|    value_loss         | 6.05e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 229       |
|    iterations         | 3900      |
|    time_elapsed       | 169       |
|    total_timesteps    | 39000     |
| train/                |           |
|    entropy_loss       | -9.64     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 3899      |
|    policy_loss        | 1.15e+08  |
|    reward             | 2599753.0 |
|    std                | 0.959     |
|    value_loss         | 2.45e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 230       |
|    iterations         | 4000      |
|    time_elapsed       | 173       |
|    total_timesteps    | 40000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 3999      |
|    policy_loss        | 2.09e+08  |
|    reward             | 3773874.0 |
|    std                | 0.957     |
|    value_loss         | 5.35e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4030151.780621599
Sharpe:  0.691531028386293
=================================
-------------------------------------
| time/                 |           |
|    fps                | 230       |
|    iterations         | 4100      |
|    time_elapsed       | 178       |
|    total_timesteps    | 41000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 4099      |
|    policy_loss        | 9.04e+07  |
|    reward             | 1612590.9 |
|    std                | 0.957     |
|    value_loss         | 8.73e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 230       |
|    iterations         | 4200      |
|    time_elapsed       | 182       |
|    total_timesteps    | 42000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 4199      |
|    policy_loss        | 1.48e+08  |
|    reward             | 2647467.8 |
|    std                | 0.958     |
|    value_loss         | 2.66e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 230       |
|    iterations         | 4300      |
|    time_elapsed       | 186       |
|    total_timesteps    | 43000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 4299      |
|    policy_loss        | 1.79e+08  |
|    reward             | 3713941.2 |
|    std                | 0.957     |
|    value_loss         | 5.12e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4077198.0467443867
Sharpe:  0.695992993463236
=================================
-------------------------------------
| time/                 |           |
|    fps                | 230       |
|    iterations         | 4400      |
|    time_elapsed       | 190       |
|    total_timesteps    | 44000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 0         |
|    learning_rate      | 0.0004    |
|    n_updates          | 4399      |
|    policy_loss        | 8.99e+07  |
|    reward             | 1753287.2 |
|    std                | 0.957     |
|    value_loss         | 1.12e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 231       |
|    iterations         | 4500      |
|    time_elapsed       | 194       |
|    total_timesteps    | 45000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 4499      |
|    policy_loss        | 1.6e+08   |
|    reward             | 3013050.0 |
|    std                | 0.957     |
|    value_loss         | 3.14e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 231       |
|    iterations         | 4600      |
|    time_elapsed       | 198       |
|    total_timesteps    | 46000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 4599      |
|    policy_loss        | 1.91e+08  |
|    reward             | 4056158.0 |
|    std                | 0.958     |
|    value_loss         | 6.01e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4110861.0279944185
Sharpe:  0.6995703979641089
=================================
-------------------------------------
| time/                 |           |
|    fps                | 231       |
|    iterations         | 4700      |
|    time_elapsed       | 203       |
|    total_timesteps    | 47000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0004    |
|    n_updates          | 4699      |
|    policy_loss        | 6.95e+07  |
|    reward             | 1597928.8 |
|    std                | 0.958     |
|    value_loss         | 7.49e+13  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 231       |
|    iterations         | 4800      |
|    time_elapsed       | 207       |
|    total_timesteps    | 48000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 4799      |
|    policy_loss        | 1.2e+08   |
|    reward             | 2745886.8 |
|    std                | 0.958     |
|    value_loss         | 2.58e+14  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 232       |
|    iterations         | 4900      |
|    time_elapsed       | 211       |
|    total_timesteps    | 49000     |
| train/                |           |
|    entropy_loss       | -9.62     |
|    explained_variance | 5.96e-08  |
|    learning_rate      | 0.0004    |
|    n_updates          | 4899      |
|    policy_loss        | 1.83e+08  |
|    reward             | 4007404.8 |
|    std                | 0.957     |
|    value_loss         | 5.87e+14  |
-------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4086023.462167963
Sharpe:  0.6967232525311317
=================================
-------------------------------------
| time/                 |           |
|    fps                | 231       |
|    iterations         | 5000      |
|    time_elapsed       | 215       |
|    total_timesteps    | 50000     |
| train/                |           |
|    entropy_loss       | -9.63     |
|    explained_variance | 1.19e-07  |
|    learning_rate      | 0.0004    |
|    n_updates          | 4999      |
|    policy_loss        | 9.05e+07  |
|    reward             | 1792232.6 |
|    std                | 0.958     |
|    value_loss         | 1.13e+14  |
-------------------------------------
CPU times: user 3min 29s, sys: 2.08 s, total: 3min 31s
Wall time: 3min 35s
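The Sharpe values printed at the end of each training episode come from the environment's terminal branch: the mean of the daily portfolio returns divided by their standard deviation, annualized with √252. A self-contained illustration with made-up returns:

import pandas as pd
# Made-up daily returns; same formula as in Portfolio.step()'s terminal branch.
daily_returns = pd.Series([0.001, -0.002, 0.0015, 0.0005, 0.002])
sharpe = (252 ** 0.5) * daily_returns.mean() / daily_returns.std()
print(round(sharpe, 3))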
3.3.3.2 PPO
agent = DRLAgent(env = env_train)
PPO_PARAMS = {
  "n_steps": 2048,
  "ent_coef": 0.005,
  "learning_rate": 0.001,
  "batch_size": 128,
}
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)
model_ppo
{'n_steps': 2048, 'ent_coef': 0.005, 'learning_rate': 0.001, 'batch_size': 128}
Using cuda device
<stable_baselines3.ppo.ppo.PPO at 0x7fc981276f10>
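As with A2C, the call above is roughly equivalent to constructing the Stable Baselines3 PPO model directly (an approximation; FinRL may add its own defaults). With n_steps=2048 and batch_size=128, each rollout of 2,048 steps is split into 16 minibatches per optimization epoch.

from stable_baselines3 import PPO
# Approximate direct-SB3 equivalent of agent.get_model("ppo", ...) above.
model_ppo_direct = PPO(
    policy="MlpPolicy",
    env=env_train,
    n_steps=2048,
    ent_coef=0.005,
    learning_rate=0.001,
    batch_size=128,
    tensorboard_log="tensorboard_log/ppo",
)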
%%time
# train PPO agent
trained_ppo = agent.train_model(model=model_ppo, tb_log_name='ppo', total_timesteps=50_000)
Logging to tensorboard_log/ppo/ppo_1
----------------------------------
| time/              |           |
|    fps             | 307       |
|    iterations      | 1         |
|    time_elapsed    | 6         |
|    total_timesteps | 2048      |
| train/             |           |
|    reward          | 3199747.8 |
----------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3913564.3893383564
Sharpe:  0.6788581038411543
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 275          |
|    iterations           | 2            |
|    time_elapsed         | 14           |
|    total_timesteps      | 4096         |
| train/                  |              |
|    approx_kl            | 8.119969e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 5.96e-08     |
|    learning_rate        | 0.001        |
|    loss                 | 6.38e+14     |
|    n_updates            | 10           |
|    policy_gradient_loss | -1.26e-06    |
|    reward               | 2267978.8    |
|    std                  | 1            |
|    value_loss           | 1.28e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3994425.904324733
Sharpe:  0.6872894807654567
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 267          |
|    iterations           | 3            |
|    time_elapsed         | 22           |
|    total_timesteps      | 6144         |
| train/                  |              |
|    approx_kl            | 8.032657e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 9.55e+14     |
|    n_updates            | 20           |
|    policy_gradient_loss | -1.22e-06    |
|    reward               | 1323024.1    |
|    std                  | 1            |
|    value_loss           | 1.95e+15     |
------------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 266          |
|    iterations           | 4            |
|    time_elapsed         | 30           |
|    total_timesteps      | 8192         |
| train/                  |              |
|    approx_kl            | 8.294592e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.001        |
|    loss                 | 1.13e+15     |
|    n_updates            | 30           |
|    policy_gradient_loss | -1.11e-06    |
|    reward               | 4083331.2    |
|    std                  | 1            |
|    value_loss           | 2.51e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4150878.4127395344
Sharpe:  0.7027972589326308
=================================
----------------------------------------
| time/                   |            |
|    fps                  | 262        |
|    iterations           | 5          |
|    time_elapsed         | 39         |
|    total_timesteps      | 10240      |
| train/                  |            |
|    approx_kl            | 7.5961e-09 |
|    clip_fraction        | 0          |
|    clip_range           | 0.2        |
|    entropy_loss         | -9.93      |
|    explained_variance   | 2.38e-07   |
|    learning_rate        | 0.001      |
|    loss                 | 9.87e+14   |
|    n_updates            | 40         |
|    policy_gradient_loss | -9.28e-07  |
|    reward               | 2970283.2  |
|    std                  | 1          |
|    value_loss           | 1.95e+15   |
----------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3998877.2903906824
Sharpe:  0.6871327226982832
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 261          |
|    iterations           | 6            |
|    time_elapsed         | 46           |
|    total_timesteps      | 12288        |
| train/                  |              |
|    approx_kl            | 7.508788e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 8.7e+14      |
|    n_updates            | 50           |
|    policy_gradient_loss | -9.23e-07    |
|    reward               | 1590401.4    |
|    std                  | 1            |
|    value_loss           | 1.84e+15     |
------------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 261          |
|    iterations           | 7            |
|    time_elapsed         | 54           |
|    total_timesteps      | 14336        |
| train/                  |              |
|    approx_kl            | 8.556526e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 1.26e+15     |
|    n_updates            | 60           |
|    policy_gradient_loss | -1.26e-06    |
|    reward               | 4469934.5    |
|    std                  | 1            |
|    value_loss           | 2.36e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4096060.7173107606
Sharpe:  0.6982076816270193
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 259          |
|    iterations           | 8            |
|    time_elapsed         | 63           |
|    total_timesteps      | 16384        |
| train/                  |              |
|    approx_kl            | 6.315531e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.001        |
|    loss                 | 1.29e+15     |
|    n_updates            | 70           |
|    policy_gradient_loss | -9.07e-07    |
|    reward               | 2862059.0    |
|    std                  | 1            |
|    value_loss           | 2.49e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4058297.672070343
Sharpe:  0.694059504812043
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 258          |
|    iterations           | 9            |
|    time_elapsed         | 71           |
|    total_timesteps      | 18432        |
| train/                  |              |
|    approx_kl            | 8.789357e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 7.67e+14     |
|    n_updates            | 80           |
|    policy_gradient_loss | -1.49e-06    |
|    reward               | 2085537.0    |
|    std                  | 1            |
|    value_loss           | 1.35e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4194078.3287386466
Sharpe:  0.7088772216329564
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 258          |
|    iterations           | 10           |
|    time_elapsed         | 79           |
|    total_timesteps      | 20480        |
| train/                  |              |
|    approx_kl            | 7.683411e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 1.19e-07     |
|    learning_rate        | 0.001        |
|    loss                 | 1.05e+15     |
|    n_updates            | 90           |
|    policy_gradient_loss | -1.07e-06    |
|    reward               | 1272599.6    |
|    std                  | 1            |
|    value_loss           | 2.08e+15     |
------------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 258          |
|    iterations           | 11           |
|    time_elapsed         | 87           |
|    total_timesteps      | 22528        |
| train/                  |              |
|    approx_kl            | 7.945346e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 1.46e+15     |
|    n_updates            | 100          |
|    policy_gradient_loss | -1.14e-06    |
|    reward               | 4016451.2    |
|    std                  | 1            |
|    value_loss           | 2.78e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4195896.423487038
Sharpe:  0.7074945647632417
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 257          |
|    iterations           | 12           |
|    time_elapsed         | 95           |
|    total_timesteps      | 24576        |
| train/                  |              |
|    approx_kl            | 8.178176e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 8.19e+14     |
|    n_updates            | 110          |
|    policy_gradient_loss | -1.2e-06     |
|    reward               | 2698685.2    |
|    std                  | 1            |
|    value_loss           | 1.71e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4200279.1240972085
Sharpe:  0.7078889896224082
=================================
-------------------------------------------
| time/                   |               |
|    fps                  | 256           |
|    iterations           | 13            |
|    time_elapsed         | 103           |
|    total_timesteps      | 26624         |
| train/                  |               |
|    approx_kl            | 6.9849193e-09 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -9.93         |
|    explained_variance   | -2.38e-07     |
|    learning_rate        | 0.001         |
|    loss                 | 9.95e+14      |
|    n_updates            | 120           |
|    policy_gradient_loss | -9.4e-07      |
|    reward               | 1757732.2     |
|    std                  | 1             |
|    value_loss           | 1.96e+15      |
-------------------------------------------
-------------------------------------------
| time/                   |               |
|    fps                  | 257           |
|    iterations           | 14            |
|    time_elapsed         | 111           |
|    total_timesteps      | 28672         |
| train/                  |               |
|    approx_kl            | 6.9849193e-09 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -9.93         |
|    explained_variance   | 1.19e-07      |
|    learning_rate        | 0.001         |
|    loss                 | 1.29e+15      |
|    n_updates            | 130           |
|    policy_gradient_loss | -9.14e-07     |
|    reward               | 3963198.2     |
|    std                  | 1             |
|    value_loss           | 2.63e+15      |
-------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4117226.909721386
Sharpe:  0.7009549483123549
=================================
-------------------------------------------
| time/                   |               |
|    fps                  | 256           |
|    iterations           | 15            |
|    time_elapsed         | 119           |
|    total_timesteps      | 30720         |
| train/                  |               |
|    approx_kl            | 8.8475645e-09 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -9.93         |
|    explained_variance   | 1.19e-07      |
|    learning_rate        | 0.001         |
|    loss                 | 1.14e+15      |
|    n_updates            | 140           |
|    policy_gradient_loss | -1.46e-06     |
|    reward               | 2481026.2     |
|    std                  | 1             |
|    value_loss           | 2.14e+15      |
-------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3976828.667046468
Sharpe:  0.686149853760772
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 255          |
|    iterations           | 16           |
|    time_elapsed         | 128          |
|    total_timesteps      | 32768        |
| train/                  |              |
|    approx_kl            | 9.313226e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.001        |
|    loss                 | 7.51e+14     |
|    n_updates            | 150          |
|    policy_gradient_loss | -1.65e-06    |
|    reward               | 1788779.9    |
|    std                  | 1            |
|    value_loss           | 1.54e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4011862.64593621
Sharpe:  0.6882706958170082
=================================
-----------------------------------------
| time/                   |             |
|    fps                  | 255         |
|    iterations           | 17          |
|    time_elapsed         | 136         |
|    total_timesteps      | 34816       |
| train/                  |             |
|    approx_kl            | 8.96398e-09 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -9.93       |
|    explained_variance   | 0           |
|    learning_rate        | 0.001       |
|    loss                 | 1.03e+15    |
|    n_updates            | 160         |
|    policy_gradient_loss | -1.76e-06   |
|    reward               | 1022607.75  |
|    std                  | 1           |
|    value_loss           | 2.15e+15    |
-----------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 255          |
|    iterations           | 18           |
|    time_elapsed         | 144          |
|    total_timesteps      | 36864        |
| train/                  |              |
|    approx_kl            | 6.722985e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.001        |
|    loss                 | 1.33e+15     |
|    n_updates            | 170          |
|    policy_gradient_loss | -6.24e-07    |
|    reward               | 3342887.0    |
|    std                  | 1            |
|    value_loss           | 2.65e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3874303.678110981
Sharpe:  0.6749724874730472
=================================
-----------------------------------------
| time/                   |             |
|    fps                  | 255         |
|    iterations           | 19          |
|    time_elapsed         | 152         |
|    total_timesteps      | 38912       |
| train/                  |             |
|    approx_kl            | 7.82893e-09 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -9.93       |
|    explained_variance   | 0           |
|    learning_rate        | 0.001       |
|    loss                 | 6.35e+14    |
|    n_updates            | 180         |
|    policy_gradient_loss | -1.17e-06   |
|    reward               | 2591912.2   |
|    std                  | 1           |
|    value_loss           | 1.39e+15    |
-----------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4075396.0292179375
Sharpe:  0.6960579699082773
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 255          |
|    iterations           | 20           |
|    time_elapsed         | 160          |
|    total_timesteps      | 40960        |
| train/                  |              |
|    approx_kl            | 7.945346e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 1.19e-07     |
|    learning_rate        | 0.001        |
|    loss                 | 9.7e+14      |
|    n_updates            | 190          |
|    policy_gradient_loss | -8.17e-07    |
|    reward               | 1483324.1    |
|    std                  | 1            |
|    value_loss           | 1.91e+15     |
------------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 255          |
|    iterations           | 21           |
|    time_elapsed         | 168          |
|    total_timesteps      | 43008        |
| train/                  |              |
|    approx_kl            | 9.546056e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 1.38e+15     |
|    n_updates            | 200          |
|    policy_gradient_loss | -1.91e-06    |
|    reward               | 3484793.0    |
|    std                  | 1            |
|    value_loss           | 2.54e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4115279.627057315
Sharpe:  0.7000977033778167
=================================
-------------------------------------------
| time/                   |               |
|    fps                  | 254           |
|    iterations           | 22            |
|    time_elapsed         | 176           |
|    total_timesteps      | 45056         |
| train/                  |               |
|    approx_kl            | 5.6170393e-09 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -9.93         |
|    explained_variance   | 0             |
|    learning_rate        | 0.001         |
|    loss                 | 1.01e+15      |
|    n_updates            | 210           |
|    policy_gradient_loss | -9.59e-07     |
|    reward               | 3015818.0     |
|    std                  | 1             |
|    value_loss           | 2.04e+15      |
-------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:4155706.236713753
Sharpe:  0.703959200241537
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 254          |
|    iterations           | 23           |
|    time_elapsed         | 185          |
|    total_timesteps      | 47104        |
| train/                  |              |
|    approx_kl            | 9.022187e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 8.5e+14      |
|    n_updates            | 220          |
|    policy_gradient_loss | -1.94e-06    |
|    reward               | 1757937.2    |
|    std                  | 1            |
|    value_loss           | 1.72e+15     |
------------------------------------------
------------------------------------------
| time/                   |              |
|    fps                  | 254          |
|    iterations           | 24           |
|    time_elapsed         | 193          |
|    total_timesteps      | 49152        |
| train/                  |              |
|    approx_kl            | 7.887138e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 0            |
|    learning_rate        | 0.001        |
|    loss                 | 1.13e+15     |
|    n_updates            | 230          |
|    policy_gradient_loss | -1.5e-06     |
|    reward               | 3765411.8    |
|    std                  | 1            |
|    value_loss           | 2.44e+15     |
------------------------------------------
=================================
begin_total_asset:1000000
end_total_asset:3983449.2318723113
Sharpe:  0.686623843915132
=================================
------------------------------------------
| time/                   |              |
|    fps                  | 254          |
|    iterations           | 25           |
|    time_elapsed         | 201          |
|    total_timesteps      | 51200        |
| train/                  |              |
|    approx_kl            | 6.548362e-09 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -9.93        |
|    explained_variance   | 5.96e-08     |
|    learning_rate        | 0.001        |
|    loss                 | 1.28e+15     |
|    n_updates            | 240          |
|    policy_gradient_loss | -7.5e-07     |
|    reward               | 3082500.2    |
|    std                  | 1            |
|    value_loss           | 2.53e+15     |
------------------------------------------
CPU times: user 3min 21s, sys: 1.52 s, total: 3min 23s
Wall time: 3min 22s

3.4 Evaluation

We now use the most recent portion of the data to evaluate the performance of the two trained agents (A2C and PPO). This step is also referred to as back-testing, or simply trading. None of this data was seen during training. The start date of the trading window is captured in the parameter TRADE_START.

TRADE_START
'2020-07-01'
# hide
env_kwargs
{'action_space': 7,
 'hmax': 100,
 'initial_amount': 1000000,
 'reward_scaling': 0.1,
 'state_space': 7,
 'stock_dim': 7,
 'tech_indicator_list': ['macd',
  'boll_ub',
  'boll_lb',
  'rsi_30',
  'cci_30',
  'dx_30',
  'close_30_sma',
  'close_60_sma'],
 'transaction_cost_pct': 0}
trade = data_split(df, TRADE_START, DATA_END)
e_trade_gym = Portfolio(df=trade, **env_kwargs)
e_trade_gym
<__main__.Portfolio at 0x7fc9816700d0>
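For reference, data_split (from the FinRL library) essentially restricts the combined price DataFrame to the requested date window and re-indexes it by trading day. Below is a minimal sketch of that idea, not the library's actual code; it assumes df carries the date and tic columns used throughout this notebook. The assert is a quick sanity check that the trading window starts no earlier than TRADE_START.
# Minimal sketch of the date-window filtering idea behind data_split
# (illustrative only; not the FinRL implementation)
def split_by_date(data, start, end, date_col="date"):
    window = data[(data[date_col] >= start) & (data[date_col] < end)]
    return window.sort_values([date_col, "tic"], ignore_index=True)

# Sanity check: the trading window should start no earlier than TRADE_START
assert trade.date.min() >= TRADE_START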
# Download the DJIA as a baseline, then compute its summary statistics and cumulative return
baseline_df = get_baseline(
        ticker="^DJI",
        start=TRADE_START,
        end=DATA_END)
baseline_df_stats = backtest_stats(baseline_df, value_col_name='close')
baseline_returns = get_daily_return(baseline_df, value_col_name="close")
dji_cumpod = (baseline_returns + 1).cumprod() - 1
[*********************100%***********************]  1 of 1 completed
Shape of DataFrame:  (295, 8)
Annual return          0.311845
Cumulative returns     0.374034
Annual volatility      0.140762
Sharpe ratio           2.006165
Calmar ratio           3.491806
Stability              0.950106
Max drawdown          -0.089308
Omega ratio            1.397014
Sortino ratio          2.988706
Skew                        NaN
Kurtosis                    NaN
Tail ratio             1.094883
Daily value at risk   -0.016614
dtype: float64
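The dji_cumpod line above compounds the baseline's daily returns into a cumulative-return series: each day contributes a growth factor of (1 + r), and the running product minus 1 is the return since the start of the window. A minimal standalone illustration of the same arithmetic, using three hypothetical daily returns:
import pandas as pd

# Three hypothetical daily returns: +1%, -0.5%, +2%
r = pd.Series([0.01, -0.005, 0.02])
cum = (r + 1).cumprod() - 1
print(cum.round(6))  # -> 0.010000, 0.004950, 0.025049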
# Run each trained policy over the trading window to get daily returns and actions
df_daily_return_a2c, df_actions_a2c = DRLAgent.DRL_prediction(model=trained_a2c, environment=e_trade_gym)
df_daily_return_ppo, df_actions_ppo = DRLAgent.DRL_prediction(model=trained_ppo, environment=e_trade_gym)
time_ind = pd.Series(df_daily_return_a2c.date)
# Compound the daily returns into cumulative-return series
a2c_cumpod = (df_daily_return_a2c.daily_return + 1).cumprod() - 1
ppo_cumpod = (df_daily_return_ppo.daily_return + 1).cumprod() - 1
# Convert to pyfolio-compatible return series for the performance statistics
DRL_strat_a2c = convert_daily_return_to_pyfolio_ts(df_daily_return_a2c)
DRL_strat_ppo = convert_daily_return_to_pyfolio_ts(df_daily_return_ppo)

perf_func = timeseries.perf_stats
perf_stats_all_a2c = perf_func(returns=DRL_strat_a2c, factor_returns=DRL_strat_a2c, positions=None, transactions=None, turnover_denom="AGB")
perf_stats_all_ppo = perf_func(returns=DRL_strat_ppo, factor_returns=DRL_strat_ppo, positions=None, transactions=None, turnover_denom="AGB")
=================================
begin_total_asset:1000000
end_total_asset:1524583.336446113
Sharpe:  2.369076062885316
=================================
hit end!
=================================
begin_total_asset:1000000
end_total_asset:1507488.2263027485
Sharpe:  2.2708997438017855
=================================
hit end!
# hide
len(df_actions_a2c.columns)
7
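The pyfolio statistics computed above (perf_stats_all_a2c and perf_stats_all_ppo) are not displayed in this run. A convenient way to compare the two agents, sketched here, is to concatenate the two results into a single table:
import pandas as pd

# Side-by-side view of the pyfolio performance metrics for the two agents
perf_stats_cmp = pd.concat([perf_stats_all_a2c, perf_stats_all_ppo],
                           axis=1, keys=["A2C", "PPO"])
print(perf_stats_cmp)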

3.4.1 Inspect actions

For the sake of interest, we inspect some of the actions taken by the A2C agent.

# Inspect the daily actions (portfolio weights) chosen by the A2C agent
df_actions_a2c
FBCVX FCPGX FDCAX FDGFX FDSCX FDSVX FSMVX
date
2020-07-01 0.142857 0.142857 0.142857 0.142857 0.142857 0.142857 0.142857
2020-07-02 0.197429 0.103983 0.103983 0.103983 0.103983 0.103983 0.282655
2020-07-06 0.242351 0.089156 0.089156 0.242351 0.158675 0.089156 0.089156
2020-07-07 0.250948 0.230227 0.092319 0.149551 0.092319 0.092319 0.092319
2020-07-08 0.112299 0.214671 0.078973 0.214671 0.214671 0.078973 0.085741
... ... ... ... ... ... ... ...
2021-08-25 0.257359 0.106575 0.094677 0.094677 0.094677 0.094677 0.257359
2021-08-26 0.270076 0.151744 0.099356 0.099356 0.099356 0.099356 0.180758
2021-08-27 0.222533 0.081865 0.081865 0.222533 0.081865 0.086806 0.222533
2021-08-30 0.124118 0.186260 0.085769 0.233143 0.124187 0.085769 0.160754
2021-08-31 0.223638 0.223638 0.082272 0.082272 0.082272 0.082272 0.223638

295 rows × 7 columns
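Each row of df_actions_a2c is the target allocation across the seven funds for that trading day, and the first day is an equal split of 1/7 per fund, so the weights in a row should sum to (approximately) 1. A quick sanity check, sketched here:
import numpy as np

# Every daily action should be a set of portfolio weights summing to ~1
row_sums = df_actions_a2c.sum(axis=1)
print(row_sums.min(), row_sums.max())
assert np.allclose(row_sums, 1.0, atol=1e-6)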

Here is a visualization of the A2C agent's actions on two of the funds (note that the FSMVX series is plotted with a minus sign, so it appears mirrored below the zero line):

# hide
my_tickers
['FBCVX',
 'FSMVX',
 'FDGFX',
 'FULVX',
 'FDSCX',
 'FDCAX',
 'FDSVX',
 'FCPGX',
 'FIFNX']
fig = go.Figure()
fig.update_layout(width=900, height=600)
fig.add_trace(go.Scatter(x=time_ind, y=df_actions_a2c['FBCVX'], mode='lines', name='FBCVX A2C'))
# FSMVX is plotted negated (mirrored below the zero line) to keep the two traces visually separate
fig.add_trace(go.Scatter(x=time_ind, y=-df_actions_a2c['FSMVX'], mode='lines', name='FSMVX A2C'))
fig.update_layout(
    legend=dict(
        x=0,
        y=1,
        traceorder="normal",
        font=dict(
            family="sans-serif",
            size=20,
            color="black"
        ),
        bgcolor="White",
        bordercolor="white",
        borderwidth=2))
fig.update_layout(title={
        'text': "A2C Actions on FBCVX and FSMVX",
        'y': 0.87,
        'x': 0.48,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.update_layout(
    paper_bgcolor='rgba(1, 1, 0, 0)',
    plot_bgcolor='rgba(1, 1, 0, 0)',
    xaxis_title="Date",
    yaxis = dict(titlefont=dict(size=26), title="Daily Actions"),
    font=dict(size=15))
# fig.update_layout(font_size = 20)
fig.update_traces(line=dict(width=2))
fig.update_xaxes(showline=True, linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(showline=True,linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(zeroline=True, zerolinewidth=1, zerolinecolor='Grey')
fig.show()

3.4.2 Inspect daily return

Here is a visualization of the daily return of the portfolio under each of the two agents:

fig = go.Figure()
fig.update_layout(width=900, height=600)
fig.add_trace(go.Scatter(x=time_ind, y=DRL_strat_a2c, mode='lines', name='A2C'))
fig.add_trace(go.Scatter(x=time_ind, y=DRL_strat_ppo, mode='lines', name='PPO'))
fig.update_layout(
    legend=dict(
        x=0,
        y=1,
        traceorder="normal",
        font=dict(
            family="sans-serif",
            size=20,
            color="black"
        ),
        bgcolor="White",
        bordercolor="white",
        borderwidth=2))
fig.update_layout(title={
        'text': "Daily Return of A2C & PPO",
        'y': 0.87,
        'x': 0.48,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.update_layout(
    paper_bgcolor='rgba(1, 1, 0, 0)',
    plot_bgcolor='rgba(1, 1, 0, 0)',
    xaxis_title="Date",
    yaxis = dict(titlefont=dict(size=26), title="Daily Return"),
    font=dict(size=15))
# fig.update_layout(font_size = 20)
fig.update_traces(line=dict(width=2))
fig.update_xaxes(showline=True, linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(showline=True,linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(zeroline=True, zerolinewidth=1, zerolinecolor='Grey')
fig.show()

3.4.3 Inspect cumulative return

Finally, we inspect the cumulative return of the portfolio achieved by each of the agents, with the DJIA index serving as a baseline for reference. Both agents end up with a larger cumulative return than the DJIA.
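As a quick numeric check of that claim before plotting, we can print the final value of each cumulative-return series (a minimal sketch using the series computed earlier; output not shown here):
# Final cumulative return over the trading window for each strategy
print(f"A2C : {a2c_cumpod.iloc[-1]:.2%}")
print(f"PPO : {ppo_cumpod.iloc[-1]:.2%}")
print(f"DJIA: {dji_cumpod.iloc[-1]:.2%}")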

fig = go.Figure()
fig.update_layout(width=900, height=900)
fig.add_trace(go.Scatter(x=time_ind, y=dji_cumpod, mode='lines', name='DJIA', line=dict(color="#a9a9a9")))
fig.add_trace(go.Scatter(x=time_ind, y=a2c_cumpod, mode='lines', name='A2C'))
fig.add_trace(go.Scatter(x=time_ind, y=ppo_cumpod, mode='lines', name='PPO'))
fig.update_layout(
    legend=dict(
        x=0,
        y=1,
        traceorder="normal",
        font=dict(
            family="sans-serif",
            size=16,
            color="black"
        ),
        bgcolor="White",
        bordercolor="white",
        borderwidth=2))
fig.update_layout(title={
        'text': "Cumulative Return of A2C & PPO against DJIA",
        'y': 0.92,
        'x': 0.48,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.update_layout(
    paper_bgcolor='rgba(1, 1, 0, 0)',
    plot_bgcolor='rgba(1, 1, 0, 0)',
    xaxis_title="Date",
    yaxis = dict(titlefont=dict(size=26), title="Cumulative Return"),
    font=dict(size=15))
# fig.update_layout(font_size = 20)
fig.update_traces(line=dict(width=2))
fig.update_xaxes(showline=True, linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(showline=True,linecolor='black', showgrid=False, gridwidth=1, gridcolor='Black', mirror=True)
fig.update_yaxes(zeroline=True, zerolinewidth=1, zerolinecolor='Grey')
fig.show()

Disclaimer

This content is intended for research purposes only and is not meant to be used in any form of trading. Past performance is no guarantee of future results. If you suffer losses from making use of this content, directly or indirectly, you alone are responsible for those losses. The author will not be held responsible in any way.