Matrix Chain Multiplication
Contents
Matrix Chain Multiplication¶
For normal multiplication the commutative law indicates that \((AB)C = A(BC)\). This is not true from a computational perspective. Depending on the number of calculations it takes more or less time.
The Matrix Multiplication is defined as follows:
A matrix with the size \(4\times 2\) and \(2 \times 3\) will result in a matrix with \(4x3\) elements. In general \(m\times n * n\times p \rightarrow m\times p\)
In total there are \(4*3=12\) operations needed$
Example¶
Example 2¶
\(A\) is a \(10 \times 20\) matrix, \(B\) is a \(30 \times 40\) matrix, \(C\) is a \(50 \times 60\) matrix
All possible combinations: $\((AB)C = A(BC)\)$
Number of operations: $\( (AB)C = (10*20*40) + (10*40*60) = 32000 \text{ operations}\)\( \)\( A(BC) = (30*40*60) + (30*60*20) = 108000 \text{ operations}\)$
Solution¶
The best ordering can be found using recursive relationship. Let Matrix Chain Multiplication (MCM) denotes a function that returns a minimum number of scalar multiplications. Then MCM can be defined as the best split among all possible choices.
Using dynamic programming and memoization, the problem can be solved in \(O(n^3)\) time.
Imports¶
Programatical reflections¶
def display(matrix_dict: dict, result_size: list, cost: int, order: list):
"""displays the calculation in a nice fashion
Args:
matrix_dict (dict): input matrix sizes
result_size (list): result matrix size
cost (int): calculation cost
order (list): calculation order
"""
_sizes = ""
for i in matrix_dict:
_sizes += "{}: [{} x {}] ".format(i, matrix_dict[i][0], matrix_dict[i][1])
print("{} Matrixes: {}".format(len(matrix_dict), _sizes))
_order = ""
for item in order:
if len(item) > 1:
_order = "{}({})".format(_order, item)
else:
if _order == "":
_order = "{}".format(item)
else:
_order = "({}{})".format(_order, item)
print(" Result Matrix [{} x {}]".format(result_size[0], result_size[1]))
print(" Calculation order: {}".format(_order))
print(" Cost: {}".format(cost))
def calculation(matrix_dict: list, order: list):
"""Doing the inverse of what we should do, calculate the matrix costs and it's resulting size
Args:
matrix_dict (list): dict of input matrixes
order (list): calculation order
"""
_cost = 0
_size = [0, 0]
# loop through matrixes
for elem in order:
# not a single element
if len(elem) > 1:
# first element ever len 2
if _size == [0, 0]:
_size = [matrix_dict[elem[0]][0], matrix_dict[elem[1]][1]]
else:
_size = [matrix_dict[elem[0]][0], matrix_dict[elem[1]][1]]
_cost += matrix_dict[elem[0]][0]*matrix_dict[elem[0]][1]*matrix_dict[elem[1]][1]
else:
# first element ever len 1
if _size == [0, 0]:
_size = [matrix_dict[elem][0], matrix_dict[elem][1]]
else:
_size = [_size[0], matrix_dict[elem][1]]
_cost += _size[0]*_size[1]*matrix_dict[elem][1]
return _size, _cost
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60],
'D': [70, 80],
'E': [90, 100]}
cost = 0
cols = 0
rows = 0
order = ['AB', 'C']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC']
display(matrix_dict, [cols, rows], cost, order)
order = ['AB', 'C', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'B', 'CD']
display(matrix_dict, [cols, rows], cost, order)
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: ((AB)C)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: A(BC)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (((AB)C)D)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (A(BC)D)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (AB)(CD)
Cost: 0
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60],
'D': [70, 80],
'E': [90, 100]}
cost = 0
cols = 0
rows = 0
order = ['AB', 'C']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC']
display(matrix_dict, [cols, rows], cost, order)
order = ['AB', 'C', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'BC', 'D']
display(matrix_dict, [cols, rows], cost, order)
order = ['A', 'B', 'CD']
display(matrix_dict, [cols, rows], cost, order)
# (AB)C
order = ['AB', 'C']
cost = matrix_dict['A'][0]*matrix_dict['A'][1]*matrix_dict['B'][1]+matrix_dict['A'][0]*matrix_dict['B'][1]*matrix_dict['C'][1]
cols = matrix_dict['A'][0]
rows = matrix_dict['C'][1]
display(matrix_dict, [cols, rows], cost, order)
# A(BC)
order = ['A', 'BC']
cost = matrix_dict['B'][0]*matrix_dict['B'][1]*matrix_dict['C'][1]+matrix_dict['B'][0]*matrix_dict['C'][1]*matrix_dict['A'][1]
cols = matrix_dict['A'][0]
rows = matrix_dict['C'][1]
display(matrix_dict, [cols, rows], cost, order)
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: ((AB)C)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: A(BC)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (((AB)C)D)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (A(BC)D)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [0 x 0]
Calculation order: (AB)(CD)
Cost: 0
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [10 x 60]
Calculation order: ((AB)C)
Cost: 32000
5 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60] D: [70 x 80] E: [90 x 100]
Result Matrix [10 x 60]
Calculation order: A(BC)
Cost: 108000
# Calculate the same with the function
matrix_dict = {'A': [10, 20],
'B': [30, 40],
'C': [50, 60]}
# (AB)C
order = ['AB', 'C']
calculation(matrix_dict, order)
display(matrix_dict, [cols, rows], cost, order)
# A(BC)
order = ['A', 'BC']
calculation(matrix_dict, order)
display(matrix_dict, [cols, rows], cost, order)
3 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60]
Result Matrix [10 x 60]
Calculation order: ((AB)C)
Cost: 108000
3 Matrixes: A: [10 x 20] B: [30 x 40] C: [50 x 60]
Result Matrix [10 x 60]
Calculation order: A(BC)
Cost: 108000
Algorithm¶
def minimal_mcm(chain: list):
"""Calculation of the matrix chain mulltiplication variant with the least cost
Args:
chain (list): list of list of matrix sizes [['Name', row-size, col-size],...]
Returns:
dict: of {result-row-size. result-col-size, cost, order}
"""
n = len(chain)
# single matrix chain has zero cost
aux = {(i, i): (0,) + chain[i] for i in range(n)}
# i: length of subchain
for i in range(1, n):
# j: starting index of subchain
for j in range(0, n - i):
best = float('inf')
# k: splitting point of subchain
for k in range(j, j + i):
# multiply subchains at splitting point
lcost, lname, lrow, lcol = aux[j, k]
rcost, rname, rrow, rcol = aux[k + 1, j + i]
cost = lcost + rcost + lrow * lcol * rcol
name = '(%s%s)' % (lname, rname)
# pick the best one
if cost < best:
best = cost
aux[j, j + i] = cost, name, lrow, rcol
return dict(zip(['cost', 'name', 'row-size', 'col-size'], aux[0, n - 1]))
Test¶
print(minimal_mcm([('A', 10, 20), ('B', 30, 40), ('C', 50, 60)]))
print(minimal_mcm([('A', 10, 20), ('B', 30, 40), ('C', 50, 60), ('D', 70, 80), ('E', 90, 100)]))
{'cost': 32000, 'name': '((AB)C)', 'row-size': 10, 'col-size': 60}
{'cost': 160000, 'name': '((((AB)C)D)E)', 'row-size': 10, 'col-size': 100}