Matrix Chain Multiplication¶
For normal multiplication the commutative law indicates that $(AB)C = A(BC)$. This is not true from a computational perspective. Depending on the number of calculations it takes more or less time.
The Matrix Multiplication is defined as follows:
A matrix with the size $4\times 2$ and $2 \times 3$ will result in a matrix with $4x3$ elements. In general $m\times n * n\times p \rightarrow m\times p$
$$ \overset{4\times 2 \text{ matrix}}{\begin{bmatrix} {a_{11}} & {a_{12}} \\ {a_{21}} & {a_{22}} \\ {a_{31}} & {a_{32}} \\ {a_{41}} & {a_{42}} \\ \end{bmatrix}} \overset{2\times 3\text{ matrix}}{\begin{bmatrix} {b_{11}} & {b_{12}} & {b_{13}} \\ {b_{21}} & {b_{22}} & {b_{23}} \\ \end{bmatrix}} = \overset{4\times 3\text{ matrix}}{\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \\ c_{41} & c_{41} & c_{41} \\ \end{bmatrix}} $$
$$c_{12} = a_{11}b_{12} + a_{12}b_{22}$$ $$c_{33} = a_{31}b_{13} + a_{32}b_{23}$$
In total there are $4*3=12$ operations needed$
Example¶
$$ \overset{1 \text{ operation}}{(\overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 1 & 2 \\ \end{bmatrix}} \overset{2\times 1 \text{ matrix}}{\begin{bmatrix} 3 \\ 4 \\ \end{bmatrix}})} \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 5 & 6 \\ \end{bmatrix}} = \overset{2 \text{ operations}}{ \overset{1\times 1 \text{ matrix}}{\begin{bmatrix} 1*3+2*4 \\ \end{bmatrix}} \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 5 & 6 \\ \end{bmatrix}} } = \overset{1\times 1 \text{ matrix}}{\begin{bmatrix} 11 \\ \end{bmatrix}} \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 5 & 6 \\ \end{bmatrix}} = \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 11*5 & 11*6 \\ \end{bmatrix}} = \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 55 & 66 \\ \end{bmatrix}} $$
$$1 \times 2 \text{ matrix} * 2 \times 1 \text{ matrix} = 2 \times 1 \text{ matrix} \rightarrow 1 \text{ operation}$$ $$1 \times 1 \text{ matrix} * 1 \times 2 \text{ matrix} = 1 \times 2 \text{ matrix} \rightarrow 2 \text{ operations}$$ $$\text{In Total } = (1*1) + (2*1) = 1+2 = 3 \text{ operations}$$
$$ \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 1 & 2 \\ \end{bmatrix}} \overset{4 \text{ operations}}{ (\overset{2\times 1 \text{ matrix}}{\begin{bmatrix} 3 \\ 4 \\ \end{bmatrix}} \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 5 & 6 \\ \end{bmatrix}}) } = \overset{2 \text{ operations}}{ \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 1 & 2 \\ \end{bmatrix}} \overset{2\times 2 \text{ matrix}}{\begin{bmatrix} 3*5 & 3*6 \\ 4*5 & 4*6 \\ \end{bmatrix}} } = \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 1 & 2 \\ \end{bmatrix}} \overset{2\times 2 \text{ matrix}}{\begin{bmatrix} 15 & 18 \\ 20 & 24 \\ \end{bmatrix}} = \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 1*15+2*20 & 1*18+2*24 \end{bmatrix}} = \overset{1\times 2 \text{ matrix}}{\begin{bmatrix} 55 & 66 \end{bmatrix}} $$
$$2 \times 1 \text{ matrix} * 1 \times 2 \text{ matrix} = 2 \times 2 \text{ matrix} \rightarrow 4 \text{ operations} $$ $$1 \times 2 \text{ matrix} * 2 \times 2 \text{ matrix} = 1 \times 2 \text{ matrix} \rightarrow 2 \text{ operations} $$ $$\text{In Total } = (2*2) + (1*2) = 4+2 = 6 \text{ operations}$$
Example 2¶
$A$ is a $10 \times 20$ matrix, $B$ is a $30 \times 40$ matrix, $C$ is a $50 \times 60$ matrix
All possible combinations: $$(AB)C = A(BC)$$
Number of operations: $$ (AB)C = (10*20*40) + (10*40*60) = 32000 \text{ operations}$$ $$ A(BC) = (30*40*60) + (30*60*20) = 108000 \text{ operations}$$
Solution¶
The best ordering can be found using recursive relationship. Let Matrix Chain Multiplication (MCM) denotes a function that returns a minimum number of scalar multiplications. Then MCM can be defined as the best split among all possible choices.
$$MCM(A_1, \dots, A_n) = min_i MCM(A_1, \dots, A_i) \bigotimes MCM(A_{i+1}, \dots, A_n)$$
Using dynamic programming and memoization, the problem can be solved in $O(n^3)$ time.
Imports¶
Programatical reflections¶
def display(matrix_dict: dict, result_size: list, cost: int, order: list):
"""displays the calculation in a nice fashion
Args:
matrix_dict (dict): input matrix sizes
result_size (list): result matrix size
cost (int): calculation cost
order (list): calculation order
"""
_sizes = ""
for i in matrix_dict:
_sizes += "{}: [{} x {}] ".format(i, matrix_dict[i][0], matrix_dict[i][1])
print("{} Matrixes: {}".format(len(matrix_dict), _sizes))
_order = ""
for item in order:
if len(item) > 1:
_order = "{}({})".format(_order, item)
else:
if _order == "":
_order = "{}".format(item)
else:
_order = "({}{})".format(_order, item)
print(" Result Matrix [{} x {}]".format(result_size[0], result_size[1]))
print(" Calculation order: {}".format(_order))
print(" Cost: {}".format(cost))
def calculation(matrix_dict: list, order: list):
"""Doing the inverse of what we should do, calculate the matrix costs and it's resulting size
Args:
matrix_dict (list): dict of input matrixes
order (list): calculation order
"""
_cost = 0
_size = [0, 0]
# loop through matrixes
for elem in order:
# not a single element
if len(elem) > 1:
# first element ever len 2
if _size == [0, 0]:
_size = [matrix_dict[elem[0]][0], matrix_dict[elem[1]][1]]
else:
_size = [matrix_dict[elem[0]][0], matrix_dict[elem[1]][1]]
_cost += matrix_dict[elem[0]][0]*matrix_dict[elem[0]][1]*matrix_dict[elem[1]][1]
else:
# first element ever len 1
if _size == [0, 0]:
_size = [matrix_dict[elem][0], matrix_dict[elem][1]]
else:
_size = [_size[0], matrix_dict[elem][1]]
_cost += _size[0]*_size[1]*matrix_dict[elem][1]
return _size, _cost
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60],
'D': [70, 80],
'E': [90, 100]}
cost = 0
cols = 0
rows = 0
order = ['AB', 'C']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC']
display(matrix_dict, [cols, rows], cost, order)
order = ['AB', 'C', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'B', 'CD']
display(matrix_dict, [cols, rows], cost, order)
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: ((AB)C) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: A(BC) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (((AB)C)D) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (A(BC)D) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (AB)(CD) Cost: 0
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60],
'D': [70, 80],
'E': [90, 100]}
cost = 0
cols = 0
rows = 0
order = ['AB', 'C']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC']
display(matrix_dict, [cols, rows], cost, order)
order = ['AB', 'C', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'B', 'CD']
display(matrix_dict, [cols, rows], cost, order)
# (AB)C
order = ['AB', 'C']
cost = matrix_dict['A'][0]*matrix_dict['A'][1]*matrix_dict['B'][1]+matrix_dict['A'][0]*matrix_dict['B'][1]*matrix_dict['C'][1]
cols = matrix_dict['A'][0]
rows = matrix_dict['C'][1]
display(matrix_dict, [cols, rows], cost, order)
# A(BC)
order = ['A', 'BC']
cost = matrix_dict['B'][0]*matrix_dict['B'][1]*matrix_dict['C'][1]+matrix_dict['B'][0]*matrix_dict['C'][1]*matrix_dict['A'][1]
cols = matrix_dict['A'][0]
rows = matrix_dict['C'][1]
display(matrix_dict, [cols, rows], cost, order)
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: ((AB)C) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: A(BC) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (((AB)C)D) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (A(BC)D) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [0 x 0] Calculation order: (AB)(CD) Cost: 0 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [10 x 60] Calculation order: ((AB)C) Cost: 32000 5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100] Result Matrix [10 x 60] Calculation order: A(BC) Cost: 108000
# Calculate the same with the function
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60]}
# (AB)C
order = ['AB', 'C']
calculation(matrix_dict, order)
display(matrix_dict, [cols, rows], cost, order)
# A(BC)
order = ['A', 'BC']
calculation(matrix_dict, order)
display(matrix_dict, [cols, rows], cost, order)
3 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] Result Matrix [10 x 60] Calculation order: ((AB)C) Cost: 108000 3 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] Result Matrix [10 x 60] Calculation order: A(BC) Cost: 108000
Algorithm¶
def minimal_mcm(chain: list):
"""Calculation of the matrix chain mulltiplication variant with the least cost
Args:
chain (list): list of list of matrix sizes [['Name', row-size, col-size],...]
Returns:
dict: of {result-row-size. result-col-size, cost, order}
"""
n = len(chain)
# single matrix chain has zero cost
aux = {(i, i): (0,) + chain[i] for i in range(n)}
# i: length of subchain
for i in range(1, n):
# j: starting index of subchain
for j in range(0, n - i):
best = float('inf')
# k: splitting point of subchain
for k in range(j, j + i):
# multiply subchains at splitting point
lcost, lname, lrow, lcol = aux[j, k]
rcost, rname, rrow, rcol = aux[k + 1, j + i]
cost = lcost + rcost + lrow * lcol * rcol
name = '(%s%s)' % (lname, rname)
# pick the best one
if cost < best:
best = cost
aux[j, j + i] = cost, name, lrow, rcol
return dict(zip(['cost', 'name', 'row-size', 'col-size'], aux[0, n - 1]))
Test¶
print(minimal_mcm([('A', 10, 20), ('B', 30, 40), ('C', 50, 60)]))
print(minimal_mcm([('A', 10, 20), ('B', 30, 40), ('C', 50, 60), ('D', 70, 80), ('E', 90, 100)]))
{'cost': 32000, 'name': '((AB)C)', 'row-size': 10, 'col-size': 60} {'cost': 160000, 'name': '((((AB)C)D)E)', 'row-size': 10, 'col-size': 100}