time complexity

big O notation

Big O notation measures the asymptotic growth of a function: $f(n) = O(g(n))$ if for all sufficiently large $n$, $f(n)$ is at most a constant factor larger than $g(n)$.

Ω and Θ notation

We say $f(n) = \Omega(g(n))$ if $g(n) = O(f(n))$.
We say $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$.
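As a small worked example of these definitions (my illustration, not from the source), take $f(n) = 3n^2 + 5n$:

```latex
3n^2 + 5n \;\le\; 8n^2 \quad \text{for all } n \ge 1
  \;\Longrightarrow\; f(n) = O(n^2),
\qquad
3n^2 + 5n \;\ge\; 3n^2
  \;\Longrightarrow\; f(n) = \Omega(n^2),
```

so $f(n) = \Theta(n^2)$.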

types of complexity

Worst-case complexity: what is the largest possible running time on any input of size n?
Average-case complexity: what is the average running time on a random input of size n?
Best-case complexity: what is the smallest possible running time on any input of size n?

Graph algorithms

ways of representing graphs

adjacency matrix

For a graph with n vertices this is an n × n matrix A, where $A_{ij} = 1$ if there is an edge from node i to node j, and $A_{ij} = 0$ otherwise.
If the graph is undirected, the matrix is symmetric.

adjacency lists

For each vertex, keep a list of its neighbors.

incidence matrix

The incidence matrix of an undirected graph with n vertices and m edges is an n × m matrix B where $B_{ij} = 1$ if the i’th vertex is part of the j’th edge, and $B_{ij} = 0$ otherwise.
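The three representations can be sketched in Python for a small undirected graph (the edge list and variable names are my own illustration):

```python
# Build all three representations of a small undirected graph
# from an edge list. Vertices are numbered 0..n-1.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n, m = 4, len(edges)

# Adjacency matrix: A[i][j] = 1 iff there is an edge between i and j.
A = [[0] * n for _ in range(n)]
for (i, j) in edges:
    A[i][j] = A[j][i] = 1  # symmetric, since the graph is undirected

# Adjacency lists: for each vertex, the list of its neighbours.
adj = [[] for _ in range(n)]
for (i, j) in edges:
    adj[i].append(j)
    adj[j].append(i)

# Incidence matrix: B[i][k] = 1 iff vertex i is an endpoint of edge k.
B = [[0] * m for _ in range(n)]
for k, (i, j) in enumerate(edges):
    B[i][k] = B[j][k] = 1
```

The usual trade-off: the adjacency matrix answers "is (i, j) an edge?" in O(1) but uses $O(n^2)$ space, while adjacency lists use O(n + m) space and are the natural input for DFS/BFS.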

Two fundamental graph exploration algorithms

Depth First Search (DFS)

Breadth First Search (BFS)

The BFS tree gives the shortest paths (fewest number of edges) from the start vertex s to all other reachable vertices.
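A minimal BFS sketch in Python, assuming the adjacency-list representation; the helper name `bfs_distances` and the example graph are my own:

```python
from collections import deque

def bfs_distances(adj, s):
    """Breadth-first search from s; returns the number of edges on a
    shortest path from s to every vertex (-1 if unreachable)."""
    dist = [-1] * len(adj)
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == -1:          # v seen for the first time
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Path graph 0-1-2-3 plus a shortcut edge 0-2
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
```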

Greedy Algorithms

Greedy algorithms are algorithms that build a solution step by step by always choosing the currently best option.

Interval Scheduling

Input: A list of intervals
Output: Maximum number of these intervals that can be chosen without getting any overlaps.
Solution: Repeatedly pick the interval that ends first among those that do not overlap the intervals already chosen.
To prove correctness of such a greedy algorithm, a common strategy is to show that the algorithm always “stays ahead” of the optimal solution.
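The earliest-ending-first rule can be sketched in Python (interval encoding and helper name are my own illustration):

```python
def max_non_overlapping(intervals):
    """Greedy interval scheduling: repeatedly take the interval that
    ends first among those compatible with what is already chosen."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:      # compatible with the previous pick
            chosen.append((start, end))
            last_end = end
    return chosen
```

Here intervals that merely touch (start == previous end) are treated as compatible; flip the comparison to `>` for the other convention.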

Job Scheduling With Minimum Lateness

Input: A list of jobs, where job i has a deadline $d_i$ and a duration $t_i$ (how long it takes to finish the job).
Output: Smallest possible maximum lateness in a schedule for doing all jobs.
Solution: Always pick the remaining job with the smallest deadline $d_i$ (earliest deadline first).
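A sketch of the earliest-deadline-first rule in Python (the job encoding and helper name are my own):

```python
def min_max_lateness(jobs):
    """jobs: list of (duration t_i, deadline d_i).  Schedule in order of
    increasing deadline (earliest deadline first) and return the
    resulting maximum lateness (0 if no job is late)."""
    time, max_late = 0, 0
    for t, d in sorted(jobs, key=lambda job: job[1]):
        time += t                          # this job finishes at `time`
        max_late = max(max_late, time - d)
    return max_late
```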

Shortest path

It is helpful to instead consider a more general problem: finding the shortest paths from s to all other vertices.
Dijkstra’s algorithm: maintain a set D of vertices for which the shortest path has been found. In each step, add to D the vertex outside D that is closest to s when using only vertices in D as intermediate vertices.
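A standard heap-based sketch of Dijkstra’s algorithm in Python (weighted adjacency list and names are my own illustration; edge weights must be non-negative):

```python
import heapq

def dijkstra(adj_w, s):
    """adj_w[u] = list of (v, w) edges with non-negative weights w.
    Returns shortest distances from s (inf if unreachable)."""
    dist = [float("inf")] * len(adj_w)
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                # stale entry; u already finalised
            continue
        for v, w in adj_w[u]:
            if d + w < dist[v]:        # found a shorter path to v
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

adj_w = [[(1, 4), (2, 1)], [(3, 1)], [(1, 2), (3, 5)], []]
```

With a binary heap this runs in $O((n + m) \log n)$ time.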

Divide & Conquer

Algorithms that split the input into significantly smaller parts, recursively solve each part, and then combine the subresults (somehow).

Merge sort

O(n log n).
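A minimal merge sort sketch in Python, illustrating the divide & conquer pattern (split, recurse, merge):

```python
def merge_sort(a):
    """Divide & conquer sort: T(n) = 2T(n/2) + O(n) = O(n log n)."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the sorted halves
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]      # append the leftover tail
```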

Polynomial multiplication

$T(n) = O(n^{1.59})$.
Using the FFT, polynomial multiplication can be done in time $O(n \log n)$.

Unit cost model and Bit cost model

Unit cost model: assume all numbers fit in machine registers so that basic arithmetic operations take constant time.
Bit cost model: account for size of numbers and the time it takes to manipulate them.

Integer multiplication

Karatsuba’s algorithm: $T(n) = O(n^{1.59})$.
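Karatsuba’s trick, three recursive multiplications instead of four, giving the recurrence $T(n) = 3T(n/2) + O(n) = O(n^{1.59})$, can be sketched in Python; this decimal-digit version is my own illustration, not an efficient implementation:

```python
def karatsuba(x, y):
    """Multiply non-negative integers using three recursive
    multiplications instead of four (Karatsuba's algorithm)."""
    if x < 10 or y < 10:                   # base case: single digits
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    xh, xl = divmod(x, base)               # split x = xh*base + xl
    yh, yl = divmod(y, base)
    hh = karatsuba(xh, yh)
    ll = karatsuba(xl, yl)
    # the cross terms xh*yl + xl*yh via a single extra product
    mid = karatsuba(xh + xl, yh + yl) - hh - ll
    return hh * base * base + mid * base + ll
```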

Master Theorem

Dynamic Programming

Split a problem into smaller subproblems such that results from one
subproblem can be reused when solving others

Fibonacci numbers

The Fibonacci numbers are a classic number sequence in
mathematics, defined by the linear recurrence
$f_0 = 0$, $f_1 = 1$, and $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$.
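A bottom-up dynamic programming sketch in Python: computing each value once and reusing it gives O(n) time, instead of the exponential naive recursion:

```python
def fib(n):
    """Bottom-up DP for Fibonacci: keep only the last two values."""
    if n < 2:
        return n
    prev, cur = 0, 1                  # f_0 and f_1
    for _ in range(n - 1):
        prev, cur = cur, prev + cur   # slide the window one step
    return cur
```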

Weighted Interval Scheduling

Input: A list of intervals $[s_1, t_1], [s_2, t_2], \dots, [s_n, t_n]$, where each interval $[s_i, t_i]$ has a weight $w_i$.
Output: The maximum total weight of a subset of these intervals that can be chosen without getting any overlaps.

Knapsack

Input: A capacity C and a list of objects, where object i has a value $v_i$ and a weight $w_i$.
Output: A subset S of the objects such that $\sum_{i \in S} w_i \le C$ and $\sum_{i \in S} v_i$ is maximized.
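A sketch of the classic $O(nC)$ knapsack DP in Python (unit cost model; the item tuples are my own illustration). For brevity it returns only the optimal value; the subset itself can be recovered by additionally recording which choice was made at each cell:

```python
def knapsack(capacity, items):
    """items: list of (value v_i, weight w_i).
    best[c] = max total value achievable with capacity c."""
    best = [0] * (capacity + 1)
    for value, weight in items:
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]
```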

top-down and bottom-up fashion

top-down fashion: we start at the end result and
recursively compute results for relevant subproblems.

bottom-up fashion: we iteratively compute results for larger and larger subproblems.

Characteristics of dynamic programming

A problem is amenable to dynamic programming if we can define a set
of subproblems such that:

  1. The number of different subproblems is as small as possible.
  2. There is some ordering of subproblems from “small” to “large”
  3. The value of a subproblem can be efficiently computed given
    the values of some set of smaller subproblems.

Sequence Alignment

Input: Strings x and y of lengths m and n, and parameters $\sigma$ and $\alpha$.
Output: Minimum cost of an alignment of x and y with parameters $\sigma$ and $\alpha$.
$\alpha$ is the cost of aligning two different characters with each other.
$\sigma$ is the cost of not aligning a character.
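The $O(mn)$ alignment dynamic program can be sketched in Python, with `sigma` the gap cost and `alpha` the mismatch cost as above (the helper name is my own):

```python
def alignment_cost(x, y, sigma, alpha):
    """Minimum alignment cost via the standard O(mn) DP table:
    D[i][j] = min cost of aligning x[:i] with y[:j]."""
    m, n = len(x), len(y)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i * sigma            # align x[:i] against nothing
    for j in range(n + 1):
        D[0][j] = j * sigma
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0 if x[i - 1] == y[j - 1] else alpha
            D[i][j] = min(D[i - 1][j - 1] + match,   # align x_i with y_j
                          D[i - 1][j] + sigma,       # leave x_i unaligned
                          D[i][j - 1] + sigma)       # leave y_j unaligned
    return D[m][n]
```

With `sigma = alpha = 1` this is exactly edit distance.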

Matrix Chain Multiplication

Network Flow

The Max-Flow problem

Input: Flow network G.
Output: Flow f maximizing the value v(f ).
Solution: The Ford-Fulkerson algorithm with $O(C(m + n))$, the scaling algorithm with $O(m^2 \log C)$, or the Edmonds-Karp algorithm with $O(nm(n + m))$.

Edge Cuts

An edge cut of a graph is a set of edges such that their removal would disconnect the graph.

Minimum s-t-Cut

Input: A flow network G with source s and sink t.
Output: An s-t cut (A, B) of G minimizing the capacity c(A, B).

The Max-Flow-Min-Cut Theorem

For every flow network G, the maximum flow from s to t equals the
minimum capacity of an s-t cut in G.

Vertex Cuts

A vertex cut in a graph is a set of vertices such that if we remove
them, the graph splits into more than one connected component.

Matchings

A matching in a graph is a set M of edges such that no vertex appears in more than one edge of M. Of particular interest to us are bipartite graphs.

Maximum Bipartite Matching

Input: A bipartite graph G
Output: A matching M in G of maximum possible size.

Edge-Disjoint Paths

Given a directed graph with source s and sink t, what is the maximum number of edge-disjoint paths from s to t?
(Edge-disjoint: no edge is used by more than one path.)

Project Selection (solvable via a reduction to Minimum s-t-Cut)

Randomized Algorithms

Randomized Algorithms and Approximation Algorithms

A randomized algorithm is an algorithm that uses randomness to
make some of its choices.

Las Vegas Algorithms

A Las Vegas algorithm is a randomized algorithm that always finds
the correct answer, but the running time of the algorithm might
vary significantly depending on the random choices made.

Monte Carlo Algorithms

A Monte Carlo algorithm is a randomized algorithm where the
output of the algorithm may be incorrect, but we have a guarantee
that this only happens with small probability.

Random Number Generator

True RNG: gets randomness from measuring physical phenomena (e.g. background radiation) that we believe are sufficiently random.
Pseudo-RNG: starting from a small random seed, new numbers are
generated in a completely deterministic way.

Randomized Min-Cut Algorithm

While G has more than 2 vertices: pick a uniformly random edge of G and contract it.
Repeating this contraction algorithm enough times to find a minimum cut with high probability gives a total runtime of $O(n^2 m)$. The algorithm can be refined to run in $O(n^2 \log n)$ (a relatively simple algorithm) or $O(m \log^3 n)$ (a more complicated algorithm).

approximation algorithm

The Maximum Cut Problem

Input: A graph G.
Output: A partition of the vertices of G into two non-empty sets A and B such that the number of edges between A and B is maximized.
This is an NP-hard problem, so let us lower our ambitions and look for a fast algorithm that finds a “reasonably good” cut.
Randomized algorithm: put each vertex in either A or B independently with probability 1/2. In expectation, this random assignment cuts at least half of all the edges in the graph. This is an example of an approximation algorithm.
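The random-assignment algorithm is a few lines in Python (the graph encoding and seed handling are my own illustration):

```python
import random

def random_cut(n, edges, seed=0):
    """Put each of the n vertices on side 0 or 1 independently with
    probability 1/2; return the number of edges crossing the cut.
    Each edge is cut with probability 1/2, so E[cut] = m/2."""
    rng = random.Random(seed)
    side = [rng.randrange(2) for _ in range(n)]
    return sum(1 for u, v in edges if side[u] != side[v])
```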

finding “good but maybe not optimal” solutions

Heuristic Algorithms

Work well in some or even many cases, but with no guarantees
about how well they perform.

Approximation Algorithms

Algorithms with provable guarantee that the solution found is relatively good compared to the optimum solution, for all instances.
For a minimization problem, an algorithm has approximation ratio $\alpha \ge 1$ if for every instance it holds that $\mathrm{Alg} \le \alpha \cdot \mathrm{Opt}$.
For a maximization problem, the inequality goes the other way: the algorithm has approximation ratio $\alpha \le 1$ if $\mathrm{Alg} \ge \alpha \cdot \mathrm{Opt}$.

alpha-approximation algorithm

An algorithm with approximation ratio $\alpha$.

Minimum Load Balancing

An NP-hard problem.
Input: Lengths $t_1, \dots, t_n$ of n jobs to be run on m machines.
Output: The minimum possible makespan (maximum machine load) of a schedule for the n jobs.
Approximation algorithm for Load Balancing: assign each job to the machine that currently has the smallest load.
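The greedy rule is naturally implemented with a min-heap of machine loads (a sketch; the names are my own, and the classic analysis of this list-scheduling rule gives a factor of at most $2 - 1/m$):

```python
import heapq

def greedy_makespan(job_lengths, m):
    """Greedy load balancing: assign each job in turn to the machine
    with the smallest current load; return the resulting makespan."""
    loads = [0] * m                       # min-heap of machine loads
    heapq.heapify(loads)
    for t in job_lengths:
        least = heapq.heappop(loads)      # least loaded machine
        heapq.heappush(loads, least + t)  # give it the next job
    return max(loads)
```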

how to prove it

The main difficulty in analyzing approximation algorithms is getting a handle on the value of Opt: computing Opt exactly is NP-hard, so we instead compare the algorithm against lower bounds on Opt.

Minimum Vertex Cover

A vertex cover in a graph G is a set of vertices that “touches” every edge of G. What is the size of a minimum vertex cover of G?
Approximation algorithm for Minimum Vertex Cover: while there exists an edge e = (u, v) such that u ∉ S and v ∉ S, add both u and v to S. This is a 2-approximation algorithm.
Assuming the “Unique Games Conjecture”, Vertex Cover cannot be approximated to within a factor better than 2.
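A sketch of the 2-approximation in Python; the picked edges form a matching, and any cover must contain at least one endpoint of each matching edge, which is where the factor 2 comes from:

```python
def vertex_cover_2approx(edges):
    """While some edge (u, v) has neither endpoint in S, add both u
    and v.  |S| = 2 * (size of a matching) <= 2 * Opt."""
    S = set()
    for u, v in edges:
        if u not in S and v not in S:   # edge not yet covered
            S.update((u, v))            # take both endpoints
    return S
```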

Minimum Set Cover

Input: A collection of sets $S_1, \dots, S_m \subseteq U$ over some universe U.
Output: The minimum number of $S_i$’s whose union equals U.
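The standard greedy algorithm for Set Cover can be sketched in Python (assuming the sets together cover U; names are my own):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most still-uncovered
    elements; return the indices of the chosen sets.
    Assumes the union of all sets equals the universe."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # index of the set with the largest overlap with `uncovered`
        i = max(range(len(sets)), key=lambda k: len(sets[k] & uncovered))
        chosen.append(i)
        uncovered -= sets[i]
    return chosen
```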

Analysis of Greedy Set Cover

Greedy algorithm: repeatedly pick the set that covers the largest number of still-uncovered elements. This achieves an approximation ratio of $O(\ln n)$.