In [1]:

```
from scipy import stats
import pandas as pd
import numpy as np
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.io as pio
import statsmodels.api as sm
import pathlib
import sys
utils_path = pathlib.Path().absolute().parent.parent
sys.path.append(str(utils_path))
import utils.layout as lay
from utils.functions import PCA
```

In [2]:

```
pio.templates.default = 'simple_white+blog_mra'
```

This section describes how to derive equally weighted estimates for volatility and correlation.

The traditional approach to calculating volatility entails deriving an unbiased estimate of the variance using an equally weighted average of squared returns, then converting that value into a volatility estimate by applying the square-root-of-time rule. Assuming that each return has zero mean:

\begin{equation} \hat{\sigma}_{it}^2 = T^{-1} \sum_{k=1}^{T} r_{i, t-k}^2 \end{equation}

Having obtained an unbiased estimate of the variance, we can use the square-root-of-time rule to convert it into volatility as:

\begin{equation} Equally\ Weighted\ Volatility\ Estimate = \hat{\sigma}_{it} \sqrt{250} \end{equation}

In [3]:

```
df = pd.read_excel(r"data/Examples_II.3.xls", sheet_name="EX_II.3.5&8").iloc[:11, :2].set_index("Date")
df.index = df.index.date
df["LogRet"] = np.log(df/df.shift())[1:]
```

$$\begin{array}{lrr}
\hline
& FTSE & LogRet \\
\hline
2007-08-10 & 6038.30 & NaN \\
2007-08-13 & 6219.00 & 0.03 \\
2007-08-14 & 6143.50 & -0.01 \\
2007-08-15 & 6109.30 & -0.01 \\
2007-08-16 & 5858.90 & -0.04 \\
2007-08-17 & 6064.20 & 0.03 \\
2007-08-20 & 6078.70 & 0.00 \\
2007-08-21 & 6086.10 & 0.00 \\
2007-08-22 & 6196.00 & 0.02 \\
2007-08-23 & 6196.90 & 0.00 \\
2007-08-24 & 6220.10 & 0.00 \\
\hline
\end{array}
$$

In [4]:

```
df["SqRet"] = df["LogRet"]**2
var_FTSE = df["SqRet"].mean()
vol_FTSE = np.sqrt(var_FTSE*250)
print("Variance: {}".format(round(var_FTSE, 6)))
print("Volatility: {}".format(round(vol_FTSE, 3)))
```

Variance: 0.000433 Volatility: 0.329

The reasons why we use log returns rather than ordinary returns are the following:

- at daily frequency there is very little difference between the two; at lower frequencies, e.g. weekly, monthly, etc., it is conventional to use ordinary returns instead
- geometric Brownian motion assumes a price process whose log returns are normally distributed
- unlike ordinary returns, the h-period log return is the sum of h consecutive one-period log returns.
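The additivity property in the last bullet is easy to verify numerically; a minimal sketch using a synthetic price series (not the FTSE data above):

```python
import numpy as np

# Synthetic daily prices over h = 5 periods
prices = np.array([100.0, 102.0, 101.0, 103.5, 105.0, 104.0])

# One-period log returns
one_period = np.diff(np.log(prices))

# The 5-period log return equals the sum of the five one-period log returns
h_period = np.log(prices[-1] / prices[0])
assert np.isclose(h_period, one_period.sum())

# The same does NOT hold for ordinary returns: they compound, not add
ordinary = prices[1:] / prices[:-1] - 1
assert not np.isclose(prices[-1] / prices[0] - 1, ordinary.sum())
```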

If the expected return is assumed to be zero, then the variance estimator in Equation (1) is unbiased, i.e. $E(\hat{\sigma}^2) = \sigma^2$, and so $\sqrt{E(\hat{\sigma}^2)} = \sigma$.
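The unbiasedness claim can be checked with a quick Monte Carlo sketch on simulated zero-mean normal returns (synthetic data, not from the workbook):

```python
import numpy as np

rng = np.random.default_rng(42)
true_sigma = 0.01        # true daily volatility
T, n_trials = 10, 100_000

# Simulate zero-mean returns and estimate the variance as the
# equally weighted average of squared returns (divide by T, not T - 1)
returns = rng.normal(0.0, true_sigma, size=(n_trials, T))
var_hats = (returns**2).mean(axis=1)

# The average of the variance estimates is very close to the true variance
print(var_hats.mean(), true_sigma**2)
```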

Otherwise, the expected return has to be calculated from the sample, and an unbiased estimate of the sample variance is derived as:

\begin{equation} \hat{s}_{it}^2 = (T-1)^{-1} \sum_{k=1}^{T}(r_{i, t-k} - \bar{r}_{i})^2 \end{equation}

In [5]:

```
avg_log_ret = df["LogRet"].mean()
df["SqMeanDev"] = (df["LogRet"] - avg_log_ret)**2
variance = sum(df["SqMeanDev"][1:])/(len(df["SqMeanDev"][1:]) - 1)
volatility = np.sqrt(variance*250)
print("Variance: {}".format(round(variance, 6)))
print("Volatility: {}".format(round(volatility, 3)))
```

Variance: 0.000471 Volatility: 0.343

Equally weighted estimate of covariance can be computed as:

\begin{equation} \hat{\sigma}_{ijt} = T^{-1} \sum_{k=1}^{T} r_{i, t-k} r_{j, t-k} \end{equation}

However, this only works for high-frequency data and under the assumption of zero-mean returns. When these conditions are not met, it is recommended to:

- use ordinary returns instead of log returns
- take the sum of the cross product of the mean deviations of returns
- use T-1 instead of T in the denominator
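Those three adjustments amount to the standard sample covariance; a minimal sketch with two hypothetical return series (the column names and values are illustrative, not from the workbook):

```python
import numpy as np
import pandas as pd

# Two hypothetical daily return series
rets = pd.DataFrame({
    "asset_i": [0.010, -0.005, 0.003, 0.008, -0.002],
    "asset_j": [0.007, -0.004, 0.001, 0.006, -0.001],
})

# Cross product of mean deviations, summed and divided by T - 1
dev = rets - rets.mean()
cov_hat = (dev["asset_i"] * dev["asset_j"]).sum() / (len(rets) - 1)

# This matches pandas' built-in sample covariance (ddof=1 by default)
assert np.isclose(cov_hat, rets["asset_i"].cov(rets["asset_j"]))
```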

Once the two variances and the covariance are calculated, a correlation estimate can be derived as:

\begin{equation} Equally\ Weighted\ Correlation\ Estimate = \hat{\rho}_{ijt} = \frac{\hat{\sigma}_{ijt}}{\hat{\sigma}_{it}\hat{\sigma}_{jt}} \end{equation}

In [6]:

```
df1 = pd.read_excel(r"data/Examples_II.3.xls", sheet_name="EX_II.3.7").iloc[:11, :2].set_index("Date").astype(float)
df1.index = df1.index.date
df1["LogRet"] = np.log(df1["SP500"]/df1["SP500"].shift())
df1["SqRet"] = df1["LogRet"]**2
# Cross product of the two log-return series (FTSE and S&P 500)
df1["CrossProduct"] = df["LogRet"]*df1["LogRet"]
```
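Putting the pieces together, the correlation estimate is simply the averaged cross product divided by the product of the two volatility estimates. A self-contained sketch with hypothetical zero-mean return series (not the FTSE/S&P 500 data, which live in the workbook):

```python
import numpy as np

# Hypothetical zero-mean daily log returns for two indices
r_i = np.array([0.030, -0.012, -0.006, -0.041, 0.034])
r_j = np.array([0.025, -0.010, -0.004, -0.035, 0.030])

# Equally weighted variance and covariance estimates (divide by T)
var_i = np.mean(r_i**2)
var_j = np.mean(r_j**2)
cov_ij = np.mean(r_i * r_j)

# Correlation estimate per the formula above
rho = cov_ij / np.sqrt(var_i * var_j)
print(round(rho, 3))
```

By the Cauchy-Schwarz inequality the resulting estimate always lies in $[-1, 1]$, so no extra clipping is needed.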

In [7]:

```
fig = go.Figure()
fig.add_trace(go.Scatter(x=df.index, y=(100*df["FTSE"]/df["FTSE"].values[0]),
                         mode='lines',
                         name='FTSE'))
fig.add_trace(go.Scatter(x=df1.index, y=(100*df1["SP500"]/df1["SP500"].values[0]),
                         mode='lines',
                         name='SP500'))
fig.update_layout(title_text="FTSE100 and S&P500")
fig.show()
```