- Published on

# Deep-Time Neural Networks

# Hierarchical Risk Parity: A Modern Portfolio Construction Method

We present a cutting-edge method for constructing financial portfolios, called Hierarchical Risk Parity (HRP). This method overcomes challenges posed by traditional techniques, by using modern mathematics and machine learning. HRP works in three steps: Tree Clustering, Quasi-Diagonalization, and Recursive Bisection. The approach is based on the information in the covariance matrix without requiring its inversion, and it works even with singular matrices.

## Tree Clustering

Given a matrix of observations $X$, we first generate a correlation matrix. A distance measure is defined, and an $N \times N$ distance matrix $D$ is created.

We use tree structures to hierarchically cluster financial assets. The goal is to group together the closest financial assets in terms of their distance measure, providing a hierarchical structure that represents the similarities between the assets.

In mathematical terms:

The algorithm performs clustering based on a new distance measure $\tilde{d}$, which computes the distance between column vectors of $D$. Eventually, we create clusters by minimizing $\tilde{d}_{i, j}$.

```
def distance_corr(
corr_matrix: np.ndarray
) -> np.ndarray:
distance_matrix = ((1 - corr_matrix) / 2.0) ** 0.5
return distance_matrix
```

## Quasi-Diagonalization

The next stage rearranges the covariance matrix to group similar investments together along the diagonal. The result is a quasi-diagonal matrix that simplifies the allocation of investments.

```
def quasi_diagonal(link: np.ndarray) -> list:
sorted_items = [link[-1, 0], link[-1, 1]]
num_items = link[-1, -2]
while sorted_items[-1] >= num_items:
index = int(sorted_items[-1] - num_items)
sorted_items.extend([link[index, 0], link[index, 1]])
return sorted_items
```

## Recursive Bisection

The last stage is the allocation of assets, leveraging the quasi-diagonal matrix. The algorithm recursively bisects the sorted list of original items into subsets and allocates investments in inverse proportion to their aggregated variances.

The algorithm follows:

- Initialize list of items $L$ and assign a unit weight to all items $w_n=1, \forall n$.
- For each subset $L_i$ with more than one item:

- Bisect $L_i$ into two subsets $L_i^{(1)}$ and $L_i^{(2)}$.
- Define the variance of each subset $\tilde{V}_{i}^{(j)}$ based on the covariance matrix of its constituents.

And so, the portfolio is constructed.

```
def recursive_bisection(cov: pd.DataFrame, sorted_items: list) -> pd.Series:
if len(sorted_items) > 1:
cluster = [sorted_items]
while len(cluster) > 0:
cluster_ = cluster.pop()
if len(cluster_) > 1:
var_covar = cov.loc[cluster_, cluster_]
e_val, e_vec = np.linalg.eigh(var_covar)
index = e_val.argsort()[::-1]
e_val, e_vec = e_val[index], e_vec[:, index]
w = np.zeros((len(cluster_)))
w[-1] = 1
cluster0, cluster1 = [], []
for i in range(len(cluster_) - 1):
d = np.sqrt((w * e_vec[:, i]).T @ e_val[i] @ (w * e_vec[:, i]))
u = ((w * e_vec[:, i]) / d).reshape(-1, 1)
cluster0.append(cluster_[np.dot(var_covar, u).flatten() <= 0])
cluster1.append(cluster_[np.dot(var_covar, u).flatten() > 0])
cluster.extend([sorted(cluster0), sorted(cluster1)])
else:
break
else:
cluster = [sorted_items]
weights = pd.Series(1, index=[i for i in cluster[0]])
return weights
```

### Comparing Portfolio Risk Strategies: HRP vs. CLA vs. IVP

This blog presents a Monte Carlo simulation study comparing three portfolio allocation strategies: Hierarchical Risk Parity (HRP), Critical Line Algorithm (CLA), and Inverse Variance Portfolio (IVP). We find that while CLA aims for the lowest in-sample risk, it performs the worst in out-of-sample tests. Specifically, HRP yields the lowest out-of-sample variance, making it preferable for risk parity investors who often employ leverage.

**Key Findings**:

- HRP portfolios have 72.47% lower variance than CLA portfolios and 38.24% lower variance than IVP portfolios.
- HRP improves the out-of-sample Sharpe ratio of a CLA strategy by about 31.3%.

### Methodology

Three key steps are followed:

- Generate 10 series of random Gaussian returns for 520 observations (equivalent to 2 years of daily history), with random shocks and correlation structure.
- Compute HRP, CLA, and IVP portfolios looking back at 260 observations (a year of daily history).
- Rebalance portfolios every 22 observations (equivalent to a monthly frequency).

This procedure is repeated 10,000 times.

```
def random_data(
number_observations: int,
length_sample: int,
size_uncorrelated: int,
size_correlated: int,
mu_uncorrelated: float,
sigma_uncorrelated: float,
sigma_correlated: float
) -> (np.ndarray, list):
# Generate random uncorrelated data
data1 = np.random.normal(mu_uncorrelated, sigma_uncorrelated, size=(number_observations, size_uncorrelated))
# Create correlation between the variables
columns = [random.randint(0, size_uncorrelated - 1) for i in range(size_correlated)] # randomly select columns
data2 = data1[:, columns] + np.random.normal(0, sigma_uncorrelated * sigma_correlated, size=(number_observations, len(columns))) # correlated data
# Merge data
data = np.append(data1, data2, axis=1)
# Add common random shock
points = np.random.randint(length_sample, number_observations - 1, size=2) # select random observations
data[np.ix_(points, [columns[0], size_uncorrelated])] = np.array([[-0.5, -0.5], [2, 2]])
# Add specific random shock
points = np.random.randint(length_sample, number_observations - 1, size=2) # select random observations
data[points, columns[-1]] = np.array([-0.5, 2])
return data, columns
```

### Insights and Interpretations

- CLA's optimization program aims for the lowest variance but ends up with the highest out-of-sample variance.
- IVP disregards the correlation structure, which affects its performance.
- HRP combines diversification across all investments and clusters of investments, making it more robust to both common and idiosyncratic shocks.

### Flexibility and Scalability

The HRP method is flexible and allows for variations. It can incorporate forecasted returns and other econometric methods.

```
def hrp_mc(
number_iterations: int = 5000,
number_observations: int = 520,
size_uncorrelated: int = 5,
size_correlated: int = 5,
mu_uncorrelated: float = 0,
sigma_uncorrelated: float = 0.01,
sigma_correlated: float = 0.25,
length_sample: int = 260,
test_size: int = 22
) -> None:
methods = [hrp] # methods
results, iteration_counter = {i.__name__: pd.Series() for i in methods}, 0 # initialize results and iteration counter
pointers = range(length_sample, number_observations, test_size) # pointers for in-sample and out-sample
while iteration_counter < number_iterations:
data, columns = random_data(number_observations, length_sample, size_uncorrelated, size_correlated, mu_uncorrelated, sigma_uncorrelated, sigma_correlated)
returns = {i.__name__: pd.Series() for i in methods} # initialize returns
for pointer in pointers:
in_sample = data[pointer - length_sample: pointer] # in sample
cov_, corr_ = np.cov(in_sample, rowvar=0), np.corrcoef(in_sample, rowvar=0) # cov and corr
out_sample = data[pointer: pointer + test_size] # out of sample
for func in methods:
weight = func(cov=cov_, corr=corr_) # call methods
ret = pd.Series(out_sample @ (weight.transpose())) # return
returns[func.__name__] = returns[func.__name__].append(ret) # update returns
for func in methods:
ret = returns[func.__name__].reset_index(drop=True) # return column of each method
cumprod_return = (1 + ret).cumprod() # cumprod of returns
results[func.__name__].loc[iteration_counter] = cumprod_return.iloc[-1] - 1 # update results
iteration_counter += 1 # next iteration
results_df = pd.DataFrame.from_dict(results, orient="columns") # dataframe of results
results_df.to_csv("results.csv") # csv file
std_results, var_results = results_df.std(), results_df.var() # std and var for each method
print(pd.concat([std_results, var_results, var_results / var_results["hrp"] - 1], axis=1))
```

### Mathematical Insight

In simpler terms, let's consider two vectors $X$ and $Y$. The correlation variable $\rho[x, y]$ is used to derive a distance measure $d[x, y]$ defined as:

This distance measure is proven to be a true metric, providing a robust way to measure how similar or different two portfolios are.

### References

- De Prado, M. L. (2018). Advances in financial machine learning. John Wiley & Sons.
- De Prado, M. M. L. (2020). Machine learning for asset managers. Cambridge University Press.