Matrix Multiplication With Rust and Go (Filipino)

🚀 Introduction
Honestly, I’m not even sure how I ended up here, pero napag-tripan ko lang ngayong araw mag-review ng Go (medyo Rust-y na ako sa language na yan, LOL).
Naisip kong gawin itong matrix multiplication exercise para ma-review kung paano ito gumagana under the hood at ano ang mga pwedeng approaches.
Matrix multiplication shows up everywhere:
- Artificial Intelligence → Neural networks basically do
Inputs × Weights = Outputs
. - Computer Graphics → 3D transformations like rotation, scaling, and projection all use matrices.
Sa post na to, I'll show three ways that I know, para mag matrix multiply.
⚠️ Requirements for Matrix Multiplication
Bago mag-multiply ng two matrices:
- Number of columns ng Matrix A dapat equal sa number of rows ng Matrix B.
- Kung hindi, hindi possible ang multiplication.
- Ang output matrix C ay magkakaroon ng rows equal sa Matrix A at columns equal sa Matrix B.
C
dimensions:(rows of A) × (columns of B)
.
🟢 Naive Approach (Triple Nested Loop)
Gamit ang triple nested loop, ito yung pinaka basic na algorithm na kung saan very straightforward so mabilis ma ge-gets.
Pseudocode
for i in rows of A:
for j in cols of B:
C[i][j] = 0
for k in shared dimension:
C[i][j] += A[i][k] * B[k][j]
Step-by-Step Explanation
1. Pick a row from Matrix A
i
represents the row index sa Matrix A.- Sa bawat
i
, tinitingnan natin ang buong row ng A.
2. Pick a column from Matrix B
j
represents the column index sa Matrix B.- For each
i
, pinapasadahan natin ang bawat column ng B para makuha ang tamang value saC[i][j]
.
3. Initialize C[i][j] to 0
- Bago i-start ang dot product, siguraduhin na zero ang initial value ng
C[i][j]
.
4. Compute the dot product of row A and column B
k
loops through the shared dimension (number of columns sa A or number of rows sa B).- Multiply
A[i][k] × B[k][j]
at i-add saC[i][j]
.
5. Repeat for all rows and columns
- Kapag tapos na ang lahat ng
k
, dapat may value na angC[i][j]
. - Move to next column
j
. Kapag done lahat ng columns, move sa next rowi
.
6. Result
- Pagkatapos ng lahat ng loops, may result na sa matrix C.
Video Demo
- Visual explanation ng mga sinabi ko sa taas na steps:
Golang Example
package main
import (
"fmt"
"log"
)
func multiplyMatrices(A, B [][]int) ([][]int, error) {
if len(A) == 0 || len(B) == 0 || len(A[0]) != len(B) {
return nil, fmt.Errorf(
"cannot multiply: columns of A (%d) != rows of B (%d)",
len(A[0]), len(B),
)
}
n, m, p := len(A), len(B[0]), len(B)
// create a slice with n rows (number of rows sa matrix A)
C := make([][]int, n)
// for each row, gawa ka ng columns based sa number of m (number of columns ng matrix B)
for i := range C {
C[i] = make([]int, m)
}
for i := 0; i < n; i++ {
for j := 0; j < m; j++ {
for k := 0; k < p; k++ {
C[i][j] += A[i][k] * B[k][j]
}
}
}
return C, nil
}
func main() {
A := [][]int{{1, 2}, {3, 4}}
B := [][]int{{5, 6}, {7, 8}}
C, err := multiplyMatrices(A, B)
if err != nil {
log.Fatal(err)
}
fmt.Println("Result:", C)
}
Output:
Result: [[19 22] [43 50]]
Rust Example
fn multiply_matrices(a: &Vec<Vec<i32>>, b: &Vec<Vec<i32>>) -> Vec<Vec<i32>> {
let n = a.len(); // number of rows ng matrix A
let m = b[0].len(); // number of columns ng matrix B
let p = b.len();
// ez NxM matrix using vec! macro sa Rust
let mut c = vec![vec![0; m]; n];
for i in 0..n {
for j in 0..m {
for k in 0..p {
c[i][j] += a[i][k] * b[k][j];
}
}
}
c
}
fn main() {
let a = vec![vec![1, 2], vec![3, 4]];
let b = vec![vec![5, 6], vec![7, 8]];
let c = multiply_matrices(&a, &b);
println!("Result: {:?}", c);
}
Output:
Result: [[19, 22], [43, 50]]
🔵 Strassen's Algorithm (Divide and Conquer)
80-20 ng Strassen’s Algorithm: Ang 20% na Dapat Mong Malaman
Divide-and-Conquer
Hatiin lang yung
bawat matrix sa apat na equal na quadrants:
A → [A11, A12; A21, A22]
B → [B11, B12; B21, B22]
Itong recursive style yung
pinaka-importante sa algorithm.
Bawasan ang Multiplications
Instead na 8 multiplications (A11B11, A12B21, etc.), 7 "well-chosen" multiplications (P1–P7) lang ang kailangan sa Strassen, plus ilang additions/subtractions. Dahil dito, bumababa yung
complexity from O(n³)
to O(n^2.81)
.
Pagsamahin ang Submatrices
Yung
7 products, pagsasamahin lang ulit gamit ang basic addition/subtraction para makuha yung
final quadrants: C11, C12, C21, C22.
Base Case
Pagdating sa maliliit na matrices (like 2×2 or 1×1), i-normal
multiplication mo na lang.
Trade-offs
- Mas maraming additions/subtractions → so medyo mas malakas sa memory at mas maraming kailangang i-track.
- May konting numerical instability.
- Para sa maliliit na
n
, mas mabilis pa rinyung
standard na multiplication. - Hindi gumagana sa odd-sized matrices: Itong implementation na to ay para lang sa matrices na may powers of two.
Ang Strassen's algorithm ay mas pina improve at recursive way to matrix multiply. Gumagamit to ng divide-and-conquer na diskarte para bawasan yung bilang ng mga multiplication na kailangan, kaya mas mabilis to para sa malalaking matrices.
How It Works
Instead na karaniwang 8 multiplications para sa isang 2x2 matrix, 7 lang ang ginagamit ng Strassen's algorithm.
Pwede ko i-summarize tong Strassen ito 3 basic steps:
- Divide: Hatiin ang mga input na matrices A at B sa apat na quadrants
- Conquer: I-compute ang 7 matrix products nang recursive (ang "Strassen products").
- Combine: Pagsamahin ang mga resulta ng 7 products para mabuo yung resulting matrix C.
Pseudocode
FUNCTION Strassen(A, B):
n ← laki ng matrix A
IF n ≤ 2 THEN
RETURN standard_matrix_multiplication(A, B)
END IF
mid ← n / 2
// Hatiin ang A sa submatrices
A11 ← top-left submatrix ng A
A12 ← top-right submatrix ng A
A21 ← bottom-left submatrix ng A
A22 ← bottom-right submatrix ng A
// Hatiin ang B sa submatrices
B11 ← top-left submatrix ng B
B12 ← top-right submatrix ng B
B21 ← bottom-left submatrix ng B
B22 ← bottom-right submatrix ng B
// I-compute ang 7 Strassen products (recursive calls)
P1 ← Strassen(A11, B12 - B22) // P1 = A11 * (B12 - B22)
P2 ← Strassen(A11 + A12, B22) // P2 = (A11 + A12) * B22
P3 ← Strassen(A21 + A22, B11) // P3 = (A21 + A22) * B11
P4 ← Strassen(A22, B21 - B11) // P4 = A22 * (B21 - B11)
P5 ← Strassen(A11 + A22, B11 + B22) // P5 = (A11 + A22) * (B11 + B22)
P6 ← Strassen(A12 - A22, B21 + B22) // P6 = (A12 - A22) * (B21 + B22)
P7 ← Strassen(A11 - A21, B11 + B12) // P7 = (A11 - A21) * (B11 + B12)
// I-compute ang mga resultang quadrants ng C
C11 ← P5 + P4 - P2 + P6 // top-left quadrant
C12 ← P1 + P2 // top-right quadrant
C21 ← P3 + P4 // bottom-left quadrant
C22 ← P5 + P1 - P3 - P7 // bottom-right quadrant
// Pagsamahin ang mga quadrants sa buong matrix
C ← [ [C11, C12],
[C21, C22] ]
RETURN C
END FUNCTION
Golang Example
package main
import "fmt"
// Function para i-add ang dalawang matrices
func add(A, B [][]int) [][]int {
n := len(A)
C := make([][]int, n)
for i := 0; i < n; i++ {
C[i] = make([]int, n)
for j := 0; j < n; j++ {
C[i][j] = A[i][j] + B[i][j]
}
}
return C
}
// Function para i-subtract ang dalawang matrices
func subtract(A, B [][]int) [][]int {
n := len(A)
C := make([][]int, n)
for i := 0; i < n; i++ {
C[i] = make([]int, n)
for j := 0; j < n; j++ {
C[i][j] = A[i][j] - B[i][j]
}
}
return C
}
// Strassen's algorithm
func strassen(A, B [][]int) [][]int {
n := len(A)
// Base case
if n == 1 {
C := make([][]int, 1)
C[0] = make([]int, 1)
C[0][0] = A[0][0] * B[0][0]
return C
}
// New size para sa submatrices
newSize := n / 2
A11 := make([][]int, newSize)
A12 := make([][]int, newSize)
A21 := make([][]int, newSize)
A22 := make([][]int, newSize)
B11 := make([][]int, newSize)
B12 := make([][]int, newSize)
B21 := make([][]int, newSize)
B22 := make([][]int, newSize)
for i := 0; i < newSize; i++ {
A11[i] = make([]int, newSize)
A12[i] = make([]int, newSize)
A21[i] = make([]int, newSize)
A22[i] = make([]int, newSize)
B11[i] = make([]int, newSize)
B12[i] = make([]int, newSize)
B21[i] = make([]int, newSize)
B22[i] = make([]int, newSize)
for j := 0; j < newSize; j++ {
A11[i][j] = A[i][j]
A12[i][j] = A[i][j+newSize]
A21[i][j] = A[i+newSize][j]
A22[i][j] = A[i+newSize][j+newSize]
B11[i][j] = B[i][j]
B12[i][j] = B[i][j+newSize]
B21[i][j] = B[i+newSize][j]
B22[i][j] = B[i+newSize][j+newSize]
}
}
// Recursive calls para sa Strassen's formulas
P1 := strassen(A11, subtract(B12, B22))
P2 := strassen(add(A11, A12), B22)
P3 := strassen(add(A21, A22), B11)
P4 := strassen(A22, subtract(B21, B11))
P5 := strassen(add(A11, A22), add(B11, B22))
P6 := strassen(subtract(A12, A22), add(B21, B22))
P7 := strassen(subtract(A11, A21), add(B11, B12))
// Combine results
C11 := add(subtract(add(P5, P4), P2), P6)
C12 := add(P1, P2)
C21 := add(P3, P4)
C22 := subtract(subtract(add(P5, P1), P3), P7)
// Final result matrix
C := make([][]int, n)
for i := 0; i < newSize; i++ {
C[i] = make([]int, n)
C[i+newSize] = make([]int, n)
for j := 0; j < newSize; j++ {
C[i][j] = C11[i][j]
C[i][j+newSize] = C12[i][j]
C[i+newSize][j] = C21[i][j]
C[i+newSize][j+newSize] = C22[i][j]
}
}
return C
}
func main() {
A := [][]int{{1, 2}, {3, 4}}
B := [][]int{{5, 6}, {7, 8}}
C := strassen(A, B)
fmt.Println("Result:", C)
}
Rust Example
fn add(a: &Vec<Vec<i32>>, b: &Vec<Vec<i32>>) -> Vec<Vec<i32>> {
let n = a.len();
let mut c = vec![vec![0; n]; n];
for i in 0..n {
for j in 0..n {
c[i][j] = a[i][j] + b[i][j];
}
}
c
}
fn subtract(a: &Vec<Vec<i32>>, b: &Vec<Vec<i32>>) -> Vec<Vec<i32>> {
let n = a.len();
let mut c = vec![vec![0; n]; n];
for i in 0..n {
for j in 0..n {
c[i][j] = a[i][j] - b[i][j];
}
}
c
}
fn strassen(a: &Vec<Vec<i32>>, b: &Vec<Vec<i32>>) -> Vec<Vec<i32>> {
let n = a.len();
if n == 1 {
return vec![vec![a[0][0] * b[0][0]]];
}
let mid = n / 2;
let mut a11 = vec![vec![0; mid]; mid];
let mut a12 = vec![vec![0; mid]; mid];
let mut a21 = vec![vec![0; mid]; mid];
let mut a22 = vec![vec![0; mid]; mid];
let mut b11 = vec![vec![0; mid]; mid];
let mut b12 = vec![vec![0; mid]; mid];
let mut b21 = vec![vec![0; mid]; mid];
let mut b22 = vec![vec![0; mid]; mid];
for i in 0..mid {
for j in 0..mid {
a11[i][j] = a[i][j];
a12[i][j] = a[i][j + mid];
a21[i][j] = a[i + mid][j];
a22[i][j] = a[i + mid][j + mid];
b11[i][j] = b[i][j];
b12[i][j] = b[i][j + mid];
b21[i][j] = b[i + mid][j];
b22[i][j] = b[i + mid][j + mid];
}
}
let p1 = strassen(&a11, &subtract(&b12, &b22));
let p2 = strassen(&add(&a11, &a12), &b22);
let p3 = strassen(&add(&a21, &a22), &b11);
let p4 = strassen(&a22, &subtract(&b21, &b11));
let p5 = strassen(&add(&a11, &a22), &add(&b11, &b22));
let p6 = strassen(&subtract(&a12, &a22), &add(&b21, &b22));
let p7 = strassen(&subtract(&a11, &a21), &add(&b11, &b12));
let c11 = add(&subtract(&add(&p5, &p4), &p2), &p6);
let c12 = add(&p1, &p2);
let c21 = add(&p3, &p4);
let c22 = subtract(&subtract(&add(&p5, &p1), &p3), &p7);
let mut c = vec![vec![0; n]; n];
for i in 0..mid {
for j in 0..mid {
c[i][j] = c11[i][j];
c[i][j + mid] = c12[i][j];
c[i + mid][j] = c21[i][j];
c[i + mid][j + mid] = c22[i][j];
}
}
c
}
fn main() {
let a = vec![vec![1, 2], vec![3, 4]];
let b = vec![vec![5, 6], vec![7, 8]];
let c = strassen(&a, &b);
println!("Result: {:?}", c);
}
🟡 Parallelization
Ang matrix multiplication ay sobrang dali i-parallelize. Bawat isang cell sa output matrix C
pwedeng i-compute nang hiwalay. So, pwede tayong gumamit ng maraming threads (o goroutines sa Go) para sabay-sabay i-compute 'yung iba't ibang parts ng matrix. Malaking tulong 'to para bumilis 'yung code natin, lalo na sa mga multi-core na computer.
Paano to gumagana?
Simple lang yung idea: hatiin 'yung trabaho. For example, pwede nating sabihin sa Thread 1, "ikaw sa row 1-10," sa Thread 2, "ikaw sa row 11-20," and so on. Bawat thread, gagawin 'yung part niya, tapos pagsasama-samahin na lang sa dulo.
Golang Example (gamit ang Goroutines)
package main
import (
"fmt"
"sync"
)
func multiplyMatricesParallel(A, B [][]int) ([][]int, error) {
if len(A) == 0 || len(B) == 0 || len(A[0]) != len(B) {
return nil, fmt.Errorf(
"cannot multiply: columns of A (%d) != rows of B (%d)",
len(A[0]), len(B),
)
}
n, m, p := len(A), len(B[0]), len(B)
C := make([][]int, n)
for i := range C {
C[i] = make([]int, m)
}
var wg sync.WaitGroup
wg.Add(n)
for i := 0; i < n; i++ {
go func(row int) {
defer wg.Done()
for j := 0; j < m; j++ {
for k := 0; k < p; k++ {
C[row][j] += A[row][k] * B[k][j]
}
}
}(i)
}
wg.Wait()
return C, nil
}
Rust Example (gamit ang Rayon)
Para gumana to, kailangan mo munang i-add 'yung rayon
crate sa Cargo.toml
mo:
[dependencies]
rayon = "1.5"
use rayon::prelude::*;
fn multiply_matrices_parallel(a: &Vec<Vec<i32>>, b: &Vec<Vec<i32>>) -> Vec<Vec<i32>> {
let n = a.len();
let m = b[0].len();
let p = b.len();
let mut c = vec![vec![0; m]; n];
c.par_iter_mut().enumerate().for_each(|(i, row)| {
for j in 0..m {
for k in 0..p {
row[j] += a[i][k] * b[k][j];
}
}
});
c
}
🤔 What's Next?
Ngayon na na cover na natin ang naive approach, Strassen's algorithm, at basic parallelization, here are some directions you and I can explore to level up our matrix multiplication game:
1. Benchmarking & Performance Profiling
- I-try to measure kung gaano kabilis ang naive vs Strassen vs parallel versions.
- Sa Rust, pwede nating gamitin ang
criterion
crate para sa precise benchmarking. - Sa Go, pwede nating gamitin ang
testing.B
para sa benchmark tests.
2. GPU Acceleration
- Yung current approach ngayon sa code is gumagamit ng CPU, kung ikukumpara mo yan sa production level na matrix multiplication na gamit sa NumPy malayong malayo yan dahil yung mga ganun is gumagamit ng GPU, since they're perfect for this type of problem. Well for now, hindi ko pa sila nasubukan pero ang idea ko for now is pwedeng gumamit ng CUDA.
- Rust →
wgpu
orcuda
bindings. - Go →
gorgonia
or custom CUDA/OpenCL bindings.