Portfolio Optimization and the PortfolioAnalytics package in R


The aim of this post is not to make you an expert in portfolio theory, sorry! On the other hand, I suspect that being an expert on the subject does not necessarily imply wealth (apart from the intellectual kind), so being able to handle the tools as a layman might get you on your way to a beach house on some tropical island to retire to. Modern Portfolio Theory is, however, a complicated subject that involves many economic and mathematical concepts, so I will give you a small crash course in its basics before showing you how to use R and the PortfolioAnalytics package to make optimal portfolio decisions.

The basics of Modern Portfolio Theory

In finance, investors want to maximize monetary gains by investing in funds and stocks that deliver a return at the lowest possible level of risk. Hence, given two investments offering the same return on investment (ROI), a rational investor will always choose the one with the lower risk. It goes without saying that putting all your money on one “horse” is either a risky investment (if it is an outsider, you face a substantial risk of losing everything you invested) or a poor one (a sure winner pays a low ROI). So investors usually accept some risk, hedge it, and still maintain a satisfactory ROI by investing in an adequately large number of assets with varying risk levels. The aim of the game is to diversify your investment in such a way as to maximize the net gain (some investments will generate losses while others will generate positive returns). This seems an easy task, no? On further thought, you might realize that choosing the assets to invest in is only part of the problem. Given N assets, each with its own risk, what proportion of your capital should go into each asset in order to maximize the return on investment? This is the core of the portfolio optimization problem, and the work that earned H. Markowitz the Nobel Prize in economics.
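To see why spreading money over several assets lowers risk, here is a minimal sketch in base R. It assumes two hypothetical assets with the same volatility (20%) and uses the standard two-asset formula for the standard deviation of a portfolio; the function name and the numbers are made up for illustration.

```r
# Standard deviation of a two-asset portfolio with weight w in asset 1
# (sd1, sd2: volatilities of the two assets, rho: their correlation)
portfolio_sd <- function(w, sd1, sd2, rho) {
  sqrt(w^2 * sd1^2 + (1 - w)^2 * sd2^2 + 2 * w * (1 - w) * rho * sd1 * sd2)
}

# Hypothetical assets, both with a volatility of 20%
sd_all_in_one   <- portfolio_sd(1.0, 0.2, 0.2, rho = 0)  # everything on one "horse"
sd_uncorrelated <- portfolio_sd(0.5, 0.2, 0.2, rho = 0)  # 50/50, uncorrelated assets
sd_correlated   <- portfolio_sd(0.5, 0.2, 0.2, rho = 1)  # 50/50, perfectly correlated

print(c(sd_all_in_one, sd_uncorrelated, sd_correlated))
```

The 50/50 mix of uncorrelated assets is less volatile than either asset alone, while mixing perfectly correlated assets brings no benefit at all. This is why the covariances between assets matter as much as their individual risks.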

Risk is measured by the variability of returns: the standard deviation of each asset’s return and the covariances between assets. For \(N\) assets, there is a vector \mu=(\mu_1,\mu_2,\ldots,\mu_N )^t of expected returns per invested monetary unit (dollar, pound, Swedish krona…) and a covariance matrix
C_{N,N} =\begin{pmatrix} c_{1,1} & c_{1,2} & \cdots & c_{1,N}\\ \vdots & \vdots & \ddots & \vdots \\ c_{N,1} & c_{N,2} & \cdots & c_{N,N} \end{pmatrix}
between the N assets. To simplify things, in this description we assume that the covariance matrix is positive definite, that is, x^t C_{N,N} x > 0 for every nonzero x=\left( x_{1},x_{2},\cdots,x_{N} \right)^t . A covariance matrix that is not positive definite is singular and cannot be inverted directly (a problem that can sometimes be handled with singular value decomposition methods). Now, given that the expected returns and the covariance matrix are known, investors need to decide how much risk they are willing to take to make a profit, balancing ROI against risk. We introduce a non-negative risk aversion factor \(\lambda \), specific to each investor. It shows the extent of risk the investor is willing to take when investing their money (the larger \lambda, the more risk the investor accepts). Whenever you talk to your pension planner, they ask how much risk you are willing to take when investing in funds. They are simply asking for your \lambda. Writing x for the vector of portfolio weights (x_i is the fraction of the capital placed in asset i), portfolio optimization simply amounts to solving the quadratic problem

\min_x \left(-\lambda \mu^{t} x+\frac{1}{2} x^{t} C_{N,N} x \;\middle|\; e^{t} x=1\right)

where e^t=(1,1,\ldots, 1)_{1,N}. The equality constraint expresses that the investor wants to invest all of their capital. When \lambda \rightarrow 0, the investor only wishes to minimize risk, while when \lambda grows sufficiently large, the term -\lambda\mu^t x dominates, meaning that the investor favors the assets with maximum ROI. We say that a portfolio is variance efficient if, for a fixed expected return, no other portfolio has a smaller variance. In the same way, we say that a portfolio is expected return efficient if, for a fixed variance, no other portfolio has a larger expected return.
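For the equality-constrained problem (reading the linear term as -\lambda \mu^t x), the solution can be written down directly from the Lagrangian conditions. The sketch below is one way to do it in base R; the three-asset inputs and the function names are hypothetical, and no box constraint is imposed, so weights may turn negative (short positions).

```r
# Hypothetical inputs for three assets (illustrative numbers, not market data)
mu <- c(0.05, 0.08, 0.12)                      # expected returns
C  <- matrix(c(0.040, 0.006, 0.004,
               0.006, 0.090, 0.010,
               0.004, 0.010, 0.160), nrow = 3, byrow = TRUE)

# The closed form below needs C to be positive definite (all eigenvalues > 0)
stopifnot(all(eigen(C, symmetric = TRUE, only.values = TRUE)$values > 0))

# Minimize -lambda * mu'x + 0.5 * x'Cx subject to sum(x) = 1.
# Stationarity of the Lagrangian gives C x = lambda * mu + gamma * e,
# and gamma is fixed by the full-investment constraint e'x = 1.
solve_markowitz <- function(mu, C, lambda) {
  e       <- rep(1, length(mu))
  Cinv_mu <- solve(C, mu)
  Cinv_e  <- solve(C, e)
  gamma   <- (1 - lambda * sum(Cinv_mu)) / sum(Cinv_e)
  lambda * Cinv_mu + gamma * Cinv_e
}

portfolio_return   <- function(x) sum(x * mu)
portfolio_variance <- function(x) drop(t(x) %*% C %*% x)

x_cautious <- solve_markowitz(mu, C, lambda = 0)    # pure minimum variance
x_bold     <- solve_markowitz(mu, C, lambda = 0.5)  # more weight on return

print(rbind(cautious = x_cautious, bold = x_bold))
```

Raising \lambda moves the weights towards the assets with higher expected return, and both the expected return and the variance of the portfolio go up with it.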
It should be clear from the above that the quadratic problem does not have a single answer: each value of \lambda yields a different optimal portfolio. We call efficient frontier (see figure 1 below) the curve of all efficient portfolios in the risk-return plane. A portfolio is said to be efficient if it maximizes the expected return for a given amount of risk or, equivalently, if it minimizes the risk for a given expected return.
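The frontier can be traced numerically by fixing a target return and minimizing the variance, one target at a time. Here is a minimal base R sketch that solves the corresponding linear (KKT) system; the inputs are hypothetical, the function name is made up, and only the full-investment and target-return constraints are imposed.

```r
# Hypothetical inputs for three assets (illustrative numbers, not market data)
mu <- c(0.05, 0.08, 0.12)
C  <- matrix(c(0.040, 0.006, 0.004,
               0.006, 0.090, 0.010,
               0.004, 0.010, 0.160), nrow = 3, byrow = TRUE)

# Minimum variance weights for a given target return.
# The optimality conditions form a linear system in (x, g1, g2):
#   C x + g1 * e + g2 * mu = 0,   e'x = 1,   mu'x = target
min_var_weights <- function(mu, C, target) {
  n   <- length(mu)
  e   <- rep(1, n)
  K   <- rbind(cbind(C, e, mu),
               c(e, 0, 0),
               c(mu, 0, 0))
  rhs <- c(rep(0, n), 1, target)
  solve(K, rhs)[1:n]
}

# Risk and return of the efficient portfolio for a grid of target returns
targets  <- seq(0.07, 0.12, length.out = 6)
frontier <- t(sapply(targets, function(r) {
  x <- min_var_weights(mu, C, r)
  c(Return = sum(x * mu), Risk = sqrt(drop(t(x) %*% C %*% x)))
}))

print(frontier)
```

Plotting Return against Risk for these points gives the upper branch of the frontier: every extra bit of expected return has to be paid for with extra risk.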


We will not go into the mathematical details of how to solve the optimization problem above, but we shall give a very quick introduction to some commonly encountered constraints, since they are of substantial interest in modern portfolio theory and for any investor bound by economic realities such as regulations, taxes and transaction costs. We give definitions for these but invite the reader to look in any textbook on modern portfolio theory for other constraint types, or to visit the National Bureau of Economic Research.

Sometimes investors are forbidden by law to hold certain assets, or to hold more than a certain amount of them. Some countries, for instance, have regulations that limit the share of the market foreign investors may own, in order to protect national interests. Such constraints need to be addressed when setting up the optimization of a portfolio. As for taxes, well, everyone knows that a monetary gain is almost always taxed. Just as the salary you earn by trading your skills is taxed, so are the gains you make by investing. It is therefore important to take this constraint into account when optimizing a portfolio, and re-optimizing an out-of-date portfolio certainly needs to take taxation into account as well.

Anyone who has at some point held funds or stocks understands that selling assets or changing the weights in a portfolio comes with costs, so-called transaction costs. Now, the optimal portfolio changes with time, and there is therefore an incentive to re-optimize the portfolio. Since every trade incurs transaction costs, trading too frequently will counteract the objective. The optimal strategy is therefore to find the frequency of re-optimization and trading that appropriately balances the cost of transactions against the cost of holding a non-optimal portfolio.
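The trade-off can be made concrete with a few lines of base R. The sketch below assumes a simple proportional cost model (a fixed percentage of the traded value), which is only one of many possible cost models; all weights, amounts and rates are hypothetical.

```r
w_old <- c(0.50, 0.30, 0.20)   # current weights (hypothetical)
w_new <- c(0.40, 0.40, 0.20)   # re-optimized weights (hypothetical)
portfolio_value <- 100000      # hypothetical, in any currency
cost_rate <- 0.001             # assumed cost: 0.1% of the traded value

# Traded fraction of the portfolio: each sale and each purchase is counted
turnover <- sum(abs(w_new - w_old))
transaction_cost <- cost_rate * turnover * portfolio_value

# Rebalancing pays off only if the expected gain exceeds the cost
expected_gain <- 35            # hypothetical expected benefit of the new weights
rebalance <- expected_gain > transaction_cost

print(c(turnover = turnover, cost = transaction_cost))
print(rebalance)
```

With a smaller expected gain, or a higher cost rate, the same change in weights would not be worth trading.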

We have spoken quite a lot about constraints and now turn to the objectives of a portfolio. It is true that a completely safe and secure investment does not exist, but we can get close to it by purchasing government-issued securities in stable economic systems or by buying corporate bonds from top companies. Such securities are the best means of preserving principal while receiving a specified rate of return. The safest investments are usually found in the money market, as treasury bills and commercial paper, or in the fixed-income market, as municipal and other government bonds and corporate bonds (listed here in roughly increasing order of risk). As these securities increase in risk, they also increase in potential yield.

The safest investments are also the ones that are likely to have the lowest rate of income return or yield. Investors must inevitably sacrifice a degree of safety if they want to increase their yields. As yield increases, safety generally goes down, and vice versa.

In order to increase their ROI and take on risk above that of money market instruments or government bonds, investors may choose to purchase corporate bonds or preferred shares with lower investment ratings (among A, AA and AAA bonds, A bonds are riskier than AA and AAA but usually offer a greater rate of return).

Most investors, even the most risk-averse ones, want some income generation in their portfolios, even if it’s just to keep up with the economy’s rate of inflation. But maximizing income return can be tricky, especially for individuals who depend on a fixed sum from their portfolio every month. A retiree who requires a monthly pension wants a portfolio with reasonably safe assets that provide funds over and above other income-generating assets, such as pension plans.

We end the basics of modern portfolio theory by looking at a very important concept, namely Expected Shortfall (ES), also called Expected Tail Loss (ETL). ES/ETL estimates the risk of an investment by focusing on its less profitable outcomes: at level q, it is the expected return in the worst q% of cases. For high values of q it only ignores the most profitable but unlikely outcomes, while for small values of q it focuses on the worst losses.
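A historical estimate of ES is easy to sketch in base R: take the returns that fall at or below the q-quantile and average them. The example below uses simulated daily returns (an assumption, not market data), and the function name is made up; the PerformanceAnalytics package also ships an ES() function for real work.

```r
set.seed(42)
# Simulated daily returns (assumption: normally distributed, for illustration only)
r <- rnorm(1000, mean = 0.0005, sd = 0.01)

# Historical expected shortfall: mean return in the worst q fraction of cases
expected_shortfall <- function(returns, q = 0.05) {
  cutoff <- quantile(returns, probs = q, names = FALSE)
  mean(returns[returns <= cutoff])
}

es_05 <- expected_shortfall(r, q = 0.05)  # average of the worst 5% of days
es_50 <- expected_shortfall(r, q = 0.50)  # average of the worse half of days

print(c(es_05 = es_05, es_50 = es_50))
```

The smaller q is, the further into the loss tail the estimate looks; with q = 1 no outcome is ignored and the estimate is just the mean return.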

PortfolioAnalytics and quantmod packages in R

No data, no trade! An important step in constructing a portfolio is the acquisition of market information. To import daily trade data for stocks, the quantmod package is essential: it gives access to information on a huge number of stocks from different sources, such as Yahoo Finance and Google Finance. In the following example we are going to download historical data for MSFT (Microsoft), SBUX (Starbucks Corporation), IBM (IBM Common Stock), AAPL (Apple Inc.), ^GSPC (S&P 500) and AMZN (Amazon.com, Inc.).

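library(quantmod)  # provides getSymbols()
# each series is loaded into an object named after its ticker (^GSPC becomes GSPC)
StockData  = getSymbols(c("MSFT", "SBUX", "IBM", "AAPL", "^GSPC", "AMZN"))
> head(MSFT, 10)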
           MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
2007-01-03     29.91     30.25    29.40      29.86    76935100      22.96563
2007-01-04     29.70     29.97    29.44      29.81    45774500      22.92717
2007-01-05     29.63     29.75    29.45      29.64    44607200      22.79642
2007-01-08     29.65     30.10    29.53      29.93    50220200      23.01947
2007-01-09     30.00     30.18    29.73      29.96    44636600      23.04254
2007-01-10     29.80     29.89    29.43      29.66    55017400      22.81180
2007-01-11     29.76     30.75    29.65      30.70    99464300      23.61168
2007-01-12     30.65     31.39    30.64      31.21   103972500      24.00392
2007-01-16     31.26     31.45    31.03      31.16    62379600      23.96547
2007-01-17     31.26     31.44    31.01      31.10    58519600      23.91933
> tail(MSFT, 10)
           MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
2017-11-17     83.12     83.12    82.24      82.40    22079000         82.40
2017-11-20     82.40     82.59    82.25      82.53    16315000         82.53
2017-11-21     82.74     83.84    82.74      83.72    21237500         83.72
2017-11-22     83.83     83.90    83.04      83.11    20553100         83.11
2017-11-24     83.01     83.43    82.78      83.26     7425600         83.26
2017-11-27     83.31     83.98    83.30      83.87    18265200         83.87
2017-11-28     84.07     85.06    84.02      84.88    21926000         84.88
2017-11-29     84.71     84.92    83.18      83.34    27381100         83.34
2017-11-30     83.51     84.52    83.34      84.17    33054600         84.17
2017-12-01     83.60     84.81    83.22      84.26    29287900         84.2

As you can see, the data is updated daily and gives the value at opening, the stock’s highest and lowest values, the closing value, the volume traded and the adjusted value. The quantmod package also offers options to view the data graphically.



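# addTA() draws on an already open chart (e.g. one created with chartSeries(MSFT)); "on" selects the panel
addTA(EMA(Cl(MSFT)), on = 3, col = 4)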


Generating random portfolio and finding the optimal weights on stocks

As you can see from the graphs above, a lot of information on stocks and funds can be downloaded from a wide range of sources. That being said, it is quite difficult to make sense of these huge datasets and to determine which mix of assets will maximize returns and at the same time minimize risk. Besides, as we explained above, there is a multitude of efficient portfolios to choose from. The idea is to choose the optimal one. Mathematically, it is the portfolio on the efficient frontier at which the best possible capital allocation line (CAL) is tangent to the frontier. Finding it without a systematic method is quite a challenge. We show here how this is done using the PortfolioAnalytics package in R. So, let’s assume we have acquired all the data we need. We will work with the adjusted prices of the data above.

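library(PortfolioAnalytics)  # attaches PerformanceAnalytics, which provides CalculateReturns()
prices.data   =  merge.zoo(MSFT[,6], SBUX[,6], IBM[,6], AAPL[,6], GSPC[,6], AMZN[,6])
colnames(prices.data)   =  c("MSFT", "SBUX", "IBM", "AAPL", "^GSPC", "AMZN")  # ticker names, as in the output below
returns.data   =  CalculateReturns(prices.data)
returns.data   =  na.omit(returns.data)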
> tail(returns.data,5)
                   MSFT         SBUX          IBM         AAPL         ^GSPC         AMZN
2017-11-27  0.007326459 -0.015668997 0.0009220232 -0.005029462 -0.0003842577  0.008288327
2017-11-28  0.012042375  0.013414416 0.0032241414 -0.005858975  0.0098485126 -0.001864797
2017-11-29 -0.018143273  0.015001730 0.0070833737 -0.020743115 -0.0003692258 -0.027086090
2017-11-30  0.009959228  0.005390402 0.0027352523  0.013984010  0.0081909505  0.013330216
2017-12-01  0.001069312 -0.008647527 0.0051308306 -0.004655240 -0.0020245306 -0.012237114
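By default, CalculateReturns() computes simple (discrete) returns, i.e. the relative change from one day to the next; the first day has no predecessor and comes out as NA, which is why na.omit() is applied. A minimal base R sketch of the same calculation, using the first four MSFT adjusted prices shown earlier (the variable names are made up):

```r
# First four MSFT adjusted prices from the head(MSFT, 10) output
prices <- c(22.96563, 22.92717, 22.79642, 23.01947)

# Simple (discrete) return: P_t / P_{t-1} - 1
simple_returns <- prices[-1] / prices[-length(prices)] - 1

# Log return: log(P_t) - log(P_{t-1}), close to the simple return for small moves
log_returns <- diff(log(prices))

print(round(cbind(simple_returns, log_returns), 6))
```

Four prices give three returns. For daily data the two definitions differ only slightly, but they should not be mixed within one analysis.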

Next, we calculate the mean returns and the covariance matrix (remember from our introduction that both are needed to set up the quadratic problem with constraints).

meanReturns = colMeans(returns.data)
covMat = cov(returns.data)
> covMat
              MSFT         SBUX          IBM         AAPL        ^GSPC         AMZN
MSFT  0.0002985838 0.0001626595 0.0001275137 0.0001616140 0.0001534461 0.0002071854
SBUX  0.0001626595 0.0003980188 0.0001302177 0.0001684972 0.0001681332 0.0002187349
IBM   0.0001275137 0.0001302177 0.0001923346 0.0001283155 0.0001222845 0.0001428121
AAPL  0.0001616140 0.0001684972 0.0001283155 0.0004040247 0.0001546177 0.0002241833
^GSPC 0.0001534461 0.0001681332 0.0001222845 0.0001546177 0.0001606410 0.0001785572
AMZN  0.0002071854 0.0002187349 0.0001428121 0.0002241833 0.0001785572 0.0006390490     

Next, we specify which assets to include in our portfolio (our shopping cart) and decide on the constraints to be added. Here we choose a box constraint (i.e., a lower and an upper bound on each weight) and demand a full investment (i.e., we go all in).

port   =  portfolio.spec(assets = c("MSFT", "SBUX", "IBM", "AAPL", "^GSPC", "AMZN"))
port   =  add.constraint(port, type = "box", min = 0.05, max = 0.8)
port   =  add.constraint(portfolio = port, type = "full_investment")
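In plain terms, the two constraints say that every weight must lie between 0.05 and 0.8 and that the weights must sum to one. The base R sketch below checks a weight vector against these rules; the function name and the tolerance are assumptions for illustration and are not part of PortfolioAnalytics.

```r
# TRUE if the weights satisfy the box and full-investment constraints
satisfies_constraints <- function(w, min = 0.05, max = 0.8, tol = 1e-8) {
  all(w >= min - tol) && all(w <= max + tol) && abs(sum(w) - 1) < tol
}

w_ok       <- c(0.050, 0.062, 0.058, 0.056, 0.050, 0.724)  # optimal weights reported at the end of this post
w_too_big  <- c(0.90, 0.02, 0.02, 0.02, 0.02, 0.02)        # violates both bounds
w_not_full <- rep(0.10, 6)                                 # only 60% invested

print(c(satisfies_constraints(w_ok),
        satisfies_constraints(w_too_big),
        satisfies_constraints(w_not_full)))
```

A portfolio that passes this check is what the rest of the post calls feasible.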

As we mentioned above, there are infinitely many possible portfolios: some disastrous, some average, and only a few that make sense if you don’t want to lose your investment. We cannot generate all of them, but we can construct enough for an analysis to make sense. We therefore use the random_portfolios() function in the PortfolioAnalytics package. We then add the objective of minimum variance and optimize, and do the same with the objective of maximum return. Remember that our objectives are to minimize risk (minimum variance) and to maximize profit (return). A note of caution! random_portfolios() does not take diversification constraints into account, which means they have to be added at a later stage.

rportfolios   =  random_portfolios(port, permutations = 50000, rp_method = "sample")
minvar.port   =  add.objective(port, type = "risk", name = "var")
minvar.opt   =  optimize.portfolio(returns.data, minvar.port, optimize_method = "random", rp = rportfolios)
maxret.port   =  add.objective(port, type = "return", name = "mean")
maxret.opt   =  optimize.portfolio(returns.data, maxret.port, optimize_method = "random",  rp = rportfolios)
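The idea behind optimizing over random portfolios can be sketched in base R: generate many feasible weight vectors, evaluate each one, and keep the best. The sampling scheme below (normalized exponential draws plus rejection) is a simple stand-in chosen for illustration, not the algorithm behind rp_method = "sample", and the three-asset inputs are hypothetical.

```r
set.seed(1)
# Hypothetical inputs for three assets (illustrative numbers, not market data)
mu <- c(0.05, 0.08, 0.12)
C  <- matrix(c(0.040, 0.006, 0.004,
               0.006, 0.090, 0.010,
               0.004, 0.010, 0.160), nrow = 3, byrow = TRUE)
n_assets <- 3
n_port   <- 5000

# Random points on the simplex: non-negative rows that sum to one
d <- matrix(rexp(n_port * n_assets), ncol = n_assets)
d <- d / rowSums(d)

# Shift and shrink so that every weight is >= 0.05 and each row still sums to one
w <- 0.05 + (1 - n_assets * 0.05) * d
# Enforce the upper bound of 0.8 by rejecting the rows that break it
w <- w[apply(w, 1, max) <= 0.8, , drop = FALSE]

# Variance and expected return of every remaining random portfolio
port_var <- rowSums((w %*% C) * w)
port_ret <- drop(w %*% mu)

best_minvar <- w[which.min(port_var), ]  # lowest variance among the candidates
best_maxret <- w[which.max(port_ret), ]  # highest expected return among the candidates

print(round(rbind(best_minvar, best_maxret), 3))
```

The result is only as good as the candidates that happen to be drawn, which is why a large number of permutations is used in the post.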

So, basically, we now know which proportions to choose between the different stocks. Now that we have the minimum variance as well as the maximum return portfolios, we can determine our efficient frontier. We also need the portfolios that meet our constraints. Such portfolios are said to be feasible, and for them we compute the feasible standard deviations, the feasible returns (means) and the ratio “feasible means” / “feasible standard deviation”.

minret   =  0.06/100
# sum() returns a plain number; %*% would return a 1x1 matrix
maxret   =  sum(maxret.opt$weights * meanReturns)
vec   =  seq(minret, maxret, length.out = 100)
eff.frontier   =  data.frame(Risk = rep(NA, length(vec)),
                           Return = rep(NA, length(vec)), 
                           SharpeRatio = rep(NA, length(vec)))
frontier.weights   =  mat.or.vec(nr = length(vec), nc = ncol(returns.data))
colnames(frontier.weights)   =  colnames(returns.data)
# For each target return, find the minimum variance portfolio
# (optimize_method = "ROI" requires the ROI package and its solver plugins)
for(i in 1:length(vec)){
  eff.port   =  add.constraint(port, type = "return", name = "mean", return_target = vec[i])
  eff.port   =  add.objective(eff.port, type = "risk", name = "var")
  eff.port   =  optimize.portfolio(returns.data, eff.port, optimize_method = "ROI")
  eff.frontier$Risk[i]   =  sqrt(t(eff.port$weights) %*% covMat %*% eff.port$weights)
  eff.frontier$Return[i]   =  eff.port$weights %*% meanReturns
  eff.frontier$SharpeRatio[i]   =  eff.frontier$Return[i] / eff.frontier$Risk[i]
  frontier.weights[i,] = eff.port$weights  
  print(paste(round(i/length(vec) * 100, 0), "% done..."))
}
# Risk, return and Sharpe ratio of every random (feasible) portfolio
feasible.sd   =  apply(rportfolios, 1, function(x){
  return(sqrt(matrix(x, nrow = 1) %*% covMat %*% matrix(x, ncol = 1)))})
feasible.means   =  apply(rportfolios, 1, function(x){
  return(x %*% meanReturns)})
feasible.sr   =  feasible.means / feasible.sd

The only thing remaining is to view the optimal proportions of the different stocks.

PortfolioAnalytics Optimization
optimize.portfolio(R = returns.data, portfolio = maxret.port, optimize_method = "random", rp = rportfolios)
Optimal Weights: 
0.050 0.062 0.058 0.056 0.050 0.724 

Just for the fun of it, let’s plot the efficient frontier! The picture below is just a snapshot of an interactive plotly plot which can be zoomed into. The colors of the portfolios represent the Sharpe ratio: the average return earned in excess of the risk-free rate per unit of volatility or total risk (the code above takes the risk-free rate to be zero).
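For a single series of returns, the Sharpe ratio is a one-liner. The sketch below is a minimal base R version with a hypothetical function name and made-up daily returns; the risk-free rate defaults to zero, which is also what the ratio of means to standard deviations used for the feasible portfolios amounts to.

```r
# Sharpe ratio: mean excess return per unit of volatility
sharpe_ratio <- function(returns, rf = 0) {
  (mean(returns) - rf) / sd(returns)
}

# Hypothetical daily returns
daily <- c(0.012, -0.004, 0.007, 0.001, -0.009, 0.015)

sr_zero_rf <- sharpe_ratio(daily)               # risk-free rate taken as zero
sr_with_rf <- sharpe_ratio(daily, rf = 0.0001)  # assumed daily risk-free rate

print(c(sr_zero_rf, sr_with_rf))
```

A positive risk-free rate lowers the ratio, since only the return in excess of it counts as a reward for taking risk.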


Have fun trading!



