Derandomization
Randomness can be viewed as a resource, like space and time. Derandomization is then the process of removing randomness (or using as little of it as possible).[1][2] It is not currently known if all algorithms can be derandomized without significantly increasing their running time.[3] For instance, in computational complexity, it is unknown whether P = BPP,[3] i.e., we do not know whether we can take an arbitrary randomized algorithm that runs in polynomial time with a small error probability and derandomize it to run in polynomial time without using randomness.
There are specific methods that can be employed to derandomize particular randomized algorithms:
- the method of conditional probabilities, and its generalization, pessimistic estimators
- discrepancy theory (which is used to derandomize geometric algorithms)
- the exploitation of limited independence in the random variables used by the algorithm, such as the pairwise independence used in universal hashing
- the use of expander graphs (or dispersers in general) to amplify a limited amount of initial randomness (this last approach is also referred to as generating pseudorandom bits from a random source, and leads to the related topic of pseudorandomness)
- changing the randomized algorithm to use a hash function as a source of randomness for the algorithm's tasks, and then derandomizing the algorithm by brute-forcing all possible parameters (seeds) of the hash function. This technique is usually used to exhaustively search a sample space and making the algorithm deterministic (e.g. randomized graph algorithms)
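The limited-independence and seed-enumeration ideas above can be illustrated concretely. The sketch below (a standard inner-product construction over GF(2), given here as an illustration rather than taken from any particular algorithm) stretches a k-bit truly random seed into 2^k − 1 pairwise independent bits; enumerating all 2^k seeds then replaces random sampling with a deterministic search.

```python
from itertools import product

def pairwise_independent_bits(seed):
    """Stretch a k-bit seed into 2**k - 1 pairwise independent bits:
    bit number x is the inner product <x, seed> mod 2, for x = 1 .. 2**k - 1."""
    k = len(seed)
    bits = []
    for x in range(1, 2 ** k):
        inner = sum(((x >> i) & 1) * seed[i] for i in range(k)) % 2
        bits.append(inner)
    return bits

# Enumerating all 2**k seeds replaces sampling: any property that holds in
# expectation over a random seed must hold for at least one explicit seed.
all_outputs = [pairwise_independent_bits(list(s)) for s in product([0, 1], repeat=3)]
```

Because any two distinct nonzero indices are linearly independent over GF(2), every pair of output bits is uniformly distributed over the four bit patterns when the seed is uniform.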
Examples
Max Cut Problem
The Max-Cut problem involves partitioning the vertices of an undirected graph into two disjoint subsets such that the number of edges between the subsets is maximized. The decision version of this problem is NP-complete, making exact solutions computationally challenging for large graphs.
Consider an undirected graph G = (V, E) with n = |V| vertices and m = |E| edges, and a cut represented by a partition (S, T) of V.
Algorithm
Given G = (V, E), a straightforward randomized algorithm to approximate a solution operates as follows:
- Initialization: For each vertex in the graph, flip a fair coin.
- Assignment:
- If the coin lands heads, assign the vertex to subset S.
- If tails, assign it to subset T.
- Cut Evaluation: After all vertices are assigned, count the number of edges that have one endpoint in S and the other in T. These are the edges crossing the cut.
This algorithm runs in time linear in the number of vertices and edges and, on average, produces a cut that includes at least half of the total edges in the graph. By linearity of expectation, treating each edge as either a cut edge or not (each edge crosses the cut with probability 1/2), it can be shown that: E[cut size] = |E| / 2.
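The coin-flipping procedure above can be sketched as follows (a minimal self-contained illustration; the vertex-list and edge-list representation is an assumption, not part of the original):

```python
import random

def max_cut_randomized(vertices, edges, rng=None):
    """Assign each vertex to S or T by a fair coin flip, then count crossing edges."""
    rng = rng or random.Random()
    S, T = set(), set()
    for v in vertices:
        # Fair coin: heads -> S, tails -> T
        (S if rng.random() < 0.5 else T).add(v)
    cut_size = sum(1 for u, v in edges if (u in S) != (v in S))
    return (S, T), cut_size
```

Averaged over many runs, the cut size concentrates around |E| / 2 (for the complete graph K4 below, exactly 3 of its 6 edges in expectation).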
Derandomizing the randomized Max-Cut algorithm
1. Brute Force
Given a graph G = (V, E), one can simply try all 2^n possible bitstrings r (i.e., all partitions), compute the cut for each, and return any partition with cut size at least |E| / 2.
Since the expected cut size over all random bitstrings r is |E| / 2, there must exist at least one r whose partition has cut size at least |E| / 2. So, this brute-force search will succeed.
# Algorithm : Brute Force Max-Cut
# Input : Undirected graph G = (V, E)
# Output : Best partition and its cut size
from itertools import product

def max_cut_brute_force_unique(vertices, edges):
    n = len(vertices)
    if n == 0:
        return (set(), set()), 0
    max_cut_size = 0
    best_partition = None
    # Fix the first vertex in S to avoid symmetric duplicates
    for r in product([0, 1], repeat=n - 1):
        S = {vertices[0]}  # Always put first vertex in S
        T = set()
        # Assign the rest of the vertices
        for i in range(1, n):
            if r[i - 1] == 0:
                S.add(vertices[i])
            else:
                T.add(vertices[i])
        # Count the cut size
        cut_size = 0
        for u, v in edges:
            if (u in S and v in T) or (u in T and v in S):
                cut_size += 1
        # Update the best partition (the None check ensures a
        # partition is returned even when every cut has size 0)
        if best_partition is None or cut_size > max_cut_size:
            max_cut_size = cut_size
            best_partition = (S.copy(), T.copy())
    return best_partition, max_cut_size
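As a quick check, the same search can be restated compactly so the snippet runs on its own (returning only the cut size); on a 4-cycle it finds the bipartition {0, 2} / {1, 3}, which cuts all 4 edges:

```python
from itertools import product

def brute_force_cut(vertices, edges):
    """Enumerate all 2**(n-1) partitions (vertices[0] fixed on side 0)
    and return the maximum cut size found."""
    best = 0
    n = len(vertices)
    for r in product([0, 1], repeat=n - 1):
        side = {vertices[0]: 0}
        for i in range(1, n):
            side[vertices[i]] = r[i - 1]
        best = max(best, sum(1 for u, v in edges if side[u] != side[v]))
    return best

print(brute_force_cut([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))  # → 4
```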
Analysis of Algorithm
Time Complexity
- Number of Partitions: Each vertex can be assigned to one of two subsets, leading to 2^n possible partitions for a graph with n vertices. However, since swapping the subsets does not change the cut, the number of distinct cuts is 2^(n-1).
- Cut Evaluation: For each partition, we need to evaluate the number of edges crossing the cut. This involves checking each edge to see if its endpoints are in different subsets, which takes O(m) time for m edges.
- Total Time: O(2^(n-1) · m)
This is exponential in the number of vertices, making it impractical for large graphs.
Space Complexity
- Current Partition: At any point, we need to store the current partition, which requires O(n) space.
- Best Partition: To keep track of the best partition found so far, we also need O(n) space.
- Total Space: O(n)
2. Pairwise Independence
Instead of using a large number of random bits (one per vertex), the algorithm can be derandomized using a small family of pairwise independent hash functions, resulting in a deterministic algorithm with the same guarantee.
Hash Function
We define a pairwise independent[4] family of hash functions, indexed by a seed r:
import math

def h(n, r, x):
    # Inner product (mod 2) of the seed r with the binary representation of x
    r_len = math.ceil(math.log2(n + 1))
    if (n < 0 or not all(b in {0, 1} for b in r) or x < 1 or x > n or len(r) != r_len):
        return None
    x_bin = bin(x)[2:].zfill(r_len)
    total = sum(int(x_bin[i]) for i in range(r_len) if r[i] == 1)
    return total % 2
This function maps an index x ∈ {1, …, n} to either 0 or 1, based on a binary seed r of length ⌈log₂(n + 1)⌉.
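For instance, enumerating all seeds for n = 3 (the hash function is restated verbatim so the snippet runs on its own) shows that every pair of outputs is uniformly distributed over the four bit patterns, which is exactly the pairwise independence property:

```python
import math
from itertools import product

def h(n, r, x):
    # Inner product (mod 2) of the seed r with the binary representation of x
    r_len = math.ceil(math.log2(n + 1))
    if (n < 0 or not all(b in {0, 1} for b in r) or x < 1 or x > n or len(r) != r_len):
        return None
    x_bin = bin(x)[2:].zfill(r_len)
    return sum(int(x_bin[i]) for i in range(r_len) if r[i] == 1) % 2

n = 3  # seed length ceil(log2(4)) = 2, so only 4 seeds to enumerate
outputs = [[h(n, list(r), x) for x in range(1, n + 1)]
           for r in product([0, 1], repeat=2)]
```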
Algorithm
To derandomize Max-Cut, we iterate over all seeds in the hash function family and evaluate the cut value generated by each.
import math
from itertools import product

# Assumes a Graph class exposing _vertices, list_vertices() and list_edges(),
# where each edge stores its endpoints _u and _v, and each vertex carries a
# mutable _element field used here to hold its side of the cut
def maxcut_hashfunctions(g: Graph):
    n = len(g._vertices)
    r_bits = math.ceil(math.log2(n + 1))
    best_cut_value = 0
    for r in product([0, 1], repeat=r_bits):
        vertex_list = g.list_vertices()
        for x, vtx in enumerate(vertex_list):
            flip = h(n, list(r), x + 1)
            vtx._element = flip  # 0 or 1
        current_cut_value = 0
        for edge in g.list_edges():
            if edge._u._element != edge._v._element:
                current_cut_value += 1
        best_cut_value = max(best_cut_value, current_cut_value)
    return best_cut_value
Analysis of Algorithm
Time Complexity
- The hash function uses a seed of length ⌈log₂(n + 1)⌉ bits, where n is the number of vertices. Therefore, the total number of seeds is 2^⌈log₂(n + 1)⌉ = O(n).
- For each seed:
- Assign values to n vertices: O(n log n), since each hash evaluation performs an O(log n) binary conversion.
- Each edge is checked to see if its endpoints fall into opposite partitions: O(m)
Total Time: O(n · (n log n + m)) = O(n² log n + nm)
This is polynomial, and significantly faster than the exponential brute-force approach.
Space Complexity
- Only one partition is stored at a time: O(n)
- Cut value tracking: O(1)
Total Space: O(n)
3. Method of Conditional Expectations
A more efficient method to derandomize the randomized Max-Cut algorithm uses the method of conditional expectations.[5] This avoids brute force or enumerating hash functions, and deterministically constructs a partition that achieves at least the expected value of the randomized algorithm (i.e., at least half the total number of edges).
The idea is to simulate the randomized coin-flip assignment, but make each decision greedily, fixing each vertex one-by-one in a way that preserves or increases the expected size of the final cut.
The algorithm traverses each vertex and decides whether to assign it to subset S (say 1) or T (say 0), based on which choice results in a larger expected cut given the current partial assignment.
The method of conditional expectations deterministically simulates this process: instead of sampling all coin flips, it incrementally chooses the next bit (assignment of a vertex) that maximizes the conditional expected cut, conditioned on prior choices.
Since each choice preserves or improves the conditional expectation, the final cut size is at least the expectation: cut(S, T) ≥ |E| / 2.
def maxcut_conditionalexpect(g: Graph):
    for v in g._vertices:
        v._element = None  # Reset all assignments
    cut_value = 0
    for vtx in g._vertices:
        incident_edges = vtx.get_incident_edges_list()
        neighbors = [g.opposite(vtx, edge) for edge in incident_edges]
        # Edges gained if vtx is assigned to S (label 1)
        delta_S = sum(1 for neighbor in neighbors
                      if neighbor._element is not None and neighbor._element != 1)
        # Edges gained if vtx is assigned to T (label 0)
        delta_T = sum(1 for neighbor in neighbors
                      if neighbor._element is not None and neighbor._element != 0)
        # Choose the assignment that contributes more to the cut
        if delta_S >= delta_T:
            vtx._element = 1
            cut_value += delta_S
        else:
            vtx._element = 0
            cut_value += delta_T
    return cut_value
Analysis of Algorithm
Time Complexity
- Each vertex is processed once: O(n)
- For each vertex, we look at its incident edges and already-assigned neighbors: O(deg(v))
- Total time over all vertices is: O(n + m)
Space Complexity
- Each vertex stores an assignment (0, 1, or None): O(n)
- No additional space is needed for copies or recursion.
Total Space: O(n)
This method is a polynomial-time deterministic algorithm that matches the performance guarantee of the simple randomized Max-Cut algorithm, with much better efficiency than brute-force or hash-function-based derandomization.
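The same greedy rule can also be written without the Graph class assumed in the code above, using only a vertex list and an edge list (a self-contained sketch of the method of conditional expectations):

```python
def maxcut_greedy(vertices, edges):
    """Fix each vertex's side greedily so the number of already-cut edges
    never drops below the conditional expectation of the random assignment."""
    side = {}  # vertex -> 0 or 1, assigned one vertex at a time
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    cut = 0
    for v in vertices:
        assigned = [w for w in neighbors[v] if w in side]
        # Edges gained by placing v on side 1 versus side 0
        gain1 = sum(1 for w in assigned if side[w] == 0)
        gain0 = sum(1 for w in assigned if side[w] == 1)
        side[v] = 1 if gain1 >= gain0 else 0
        cut += max(gain1, gain0)
    return side, cut
```

On a triangle this returns a cut of 2 (out of 3 edges), and on a 4-cycle it recovers the full cut of 4, both at least half the edge count as guaranteed.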
Notes
- ^ "6.046J Lecture 22: Derandomization | Design and Analysis of Algorithms | Electrical Engineering and Computer Science". MIT OpenCourseWare. Retrieved 2024-12-27.
- ^ Luby, Michael; Wigderson, Avi (July 1995). Pairwise Independence and Derandomization (Report). USA: University of California at Berkeley.
- ^ a b "Lecture Notes, Chapter 3. Basic Derandomization Techniques". people.seas.harvard.edu. Retrieved 2024-12-27.
- ^ "Pairwise independence", Wikipedia, 2024-03-09, retrieved 2025-05-22
- ^ "Conditional probability", Wikipedia, 2025-03-06, retrieved 2025-05-22