Derandomization
Randomness can be viewed as a resource, like space and time. Derandomization is then the process of removing randomness (or using as little of it as possible).[1][2] It is not currently known if all algorithms can be derandomized without significantly increasing their running time.[3] For instance, in computational complexity, it is unknown whether P = BPP,[3] i.e., we do not know whether we can take an arbitrary randomized algorithm that runs in polynomial time with a small error probability and derandomize it to run in polynomial time without using randomness.
There are specific methods that can be employed to derandomize particular randomized algorithms:
- the method of conditional probabilities, and its generalization, pessimistic estimators
- discrepancy theory (which is used to derandomize geometric algorithms)
- the exploitation of limited independence in the random variables used by the algorithm, such as the pairwise independence used in universal hashing
- the use of expander graphs (or dispersers in general) to amplify a limited amount of initial randomness (this last approach is also referred to as generating pseudorandom bits from a random source, and leads to the related topic of pseudorandomness)
- changing the randomized algorithm to use a hash function as a source of randomness for the algorithm's tasks, and then derandomizing the algorithm by brute-forcing all possible parameters (seeds) of the hash function. This technique is usually used to exhaustively search a sample space and making the algorithm deterministic (e.g. randomized graph algorithms)
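The limited-independence and seed-enumeration ideas above can be illustrated concretely. The sketch below (a standard inner-product construction over GF(2), given here as an illustration rather than taken from any particular algorithm) stretches a k-bit truly random seed into 2^k − 1 pairwise independent bits; enumerating all 2^k seeds then replaces random sampling with a deterministic search.

```python
from itertools import product

def pairwise_independent_bits(seed):
    """Stretch a k-bit seed into 2**k - 1 pairwise independent bits:
    bit number x is the inner product <x, seed> mod 2, for x = 1 .. 2**k - 1."""
    k = len(seed)
    bits = []
    for x in range(1, 2 ** k):
        inner = sum(((x >> i) & 1) * seed[i] for i in range(k)) % 2
        bits.append(inner)
    return bits

# Enumerating all 2**k seeds replaces sampling: any property that holds in
# expectation over a random seed must hold for at least one explicit seed.
all_outputs = [pairwise_independent_bits(list(s)) for s in product([0, 1], repeat=3)]
```

Because any two distinct nonzero indices are linearly independent over GF(2), every pair of output bits is uniformly distributed over the four bit patterns when the seed is uniform.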
Examples
Max Cut Problem
The Max-Cut problem involves partitioning the vertices of an undirected graph into two disjoint subsets such that the number of edges between the subsets is maximized. The decision version of this problem is NP-complete, making exact solutions computationally challenging for large graphs.
Consider an undirected graph G = (V, E) with n = |V| vertices and m = |E| edges, and a cut represented by a partition (S, T) of V.
Algorithm
Given G = (V, E), a straightforward randomized algorithm to approximate a solution operates as follows:
- Initialization: For each vertex in the graph, flip a fair coin.
- Assignment:
- If the coin lands heads, assign the vertex to subset S.
- If tails, assign it to subset T.
- Cut Evaluation: After all vertices are assigned, count the number of edges that have one endpoint in S and the other in T. These are the edges crossing the cut.
This algorithm runs in time linear in the number of vertices and edges and, on average, produces a cut that includes at least half of the total edges in the graph. By linearity of expectation, treating each edge as either a cut edge or not (each edge crosses the cut with probability 1/2), it can be shown that: E[cut size] = |E| / 2.
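The coin-flipping procedure above can be sketched as follows (a minimal self-contained illustration; the vertex-list and edge-list representation is an assumption, not part of the original):

```python
import random

def max_cut_randomized(vertices, edges, rng=None):
    """Assign each vertex to S or T by a fair coin flip, then count crossing edges."""
    rng = rng or random.Random()
    S, T = set(), set()
    for v in vertices:
        # Fair coin: heads -> S, tails -> T
        (S if rng.random() < 0.5 else T).add(v)
    cut_size = sum(1 for u, v in edges if (u in S) != (v in S))
    return (S, T), cut_size
```

Averaged over many runs, the cut size concentrates around |E| / 2 (for the complete graph K4 below, exactly 3 of its 6 edges in expectation).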
Derandomizing the randomized Max-Cut algorithm
1. Brute Force
Given a graph G = (V, E), one can simply try all 2^n possible bitstrings r (i.e., all partitions), compute the cut for each, and return any partition with cut size at least |E| / 2.
Since the expected cut size over all random bitstrings r is |E| / 2, there must exist at least one r whose partition has cut size at least |E| / 2. So, this brute-force search will succeed.
# Algorithm : Brute Force Max-Cut
# Input : Undirected graph G = (V, E)
# Output : Best partition and its cut size
from itertools import product

def max_cut_brute_force_unique(vertices, edges):
    n = len(vertices)
    if n == 0:
        return (set(), set()), 0
    max_cut_size = 0
    best_partition = None
    # Fix the first vertex in S to avoid symmetric duplicates
    for r in product([0, 1], repeat=n - 1):
        S = {vertices[0]}  # Always put first vertex in S
        T = set()
        # Assign the rest of the vertices
        for i in range(1, n):
            if r[i - 1] == 0:
                S.add(vertices[i])
            else:
                T.add(vertices[i])
        # Count the cut size
        cut_size = 0
        for u, v in edges:
            if (u in S and v in T) or (u in T and v in S):
                cut_size += 1
        # Update the best partition (the None check ensures a
        # partition is returned even when every cut has size 0)
        if best_partition is None or cut_size > max_cut_size:
            max_cut_size = cut_size
            best_partition = (S.copy(), T.copy())
    return best_partition, max_cut_size
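As a quick check, the same search can be restated compactly so the snippet runs on its own (returning only the cut size); on a 4-cycle it finds the bipartition {0, 2} / {1, 3}, which cuts all 4 edges:

```python
from itertools import product

def brute_force_cut(vertices, edges):
    """Enumerate all 2**(n-1) partitions (vertices[0] fixed on side 0)
    and return the maximum cut size found."""
    best = 0
    n = len(vertices)
    for r in product([0, 1], repeat=n - 1):
        side = {vertices[0]: 0}
        for i in range(1, n):
            side[vertices[i]] = r[i - 1]
        best = max(best, sum(1 for u, v in edges if side[u] != side[v]))
    return best

print(brute_force_cut([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))  # → 4
```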
Analysis of Algorithm
Time Complexity
- Number of Partitions: Each vertex can be assigned to one of two subsets, leading to 2^n possible partitions for a graph with n vertices. However, since swapping the subsets does not change the cut, the number of distinct cuts is 2^(n-1).
- Cut Evaluation: For each partition, we need to evaluate the number of edges crossing the cut. This involves checking each edge to see if its endpoints are in different subsets, which takes O(m) time for m edges.
- Total Time: O(2^(n-1) · m)
This is exponential in the number of vertices, making it impractical for large graphs.
Space Complexity
- Current Partition: At any point, we need to store the current partition, which requires O(n) space.
- Best Partition: To keep track of the best partition found so far, we also need O(n) space.
- Total Space: O(n)
2. Pairwise Independence
Instead of using a large number of random bits (one per vertex), the algorithm can be derandomized using a small family of pairwise independent hash functions, resulting in a deterministic algorithm with the same guarantee.
Hash Function
We define a pairwise independent[4] family of hash functions, indexed by a seed r:
import math

def h(n, r, x):
    # Inner product (mod 2) of the seed r with the binary representation of x
    r_len = math.ceil(math.log2(n + 1))
    if (n < 0 or not all(b in {0, 1} for b in r) or x < 1 or x > n or len(r) != r_len):
        return None
    x_bin = bin(x)[2:].zfill(r_len)
    total = sum(int(x_bin[i]) for i in range(r_len) if r[i] == 1)
    return total % 2
This function maps an index x ∈ {1, …, n} to either 0 or 1, based on a binary seed r of length ⌈log₂(n + 1)⌉.
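For instance, enumerating all seeds for n = 3 (the hash function is restated verbatim so the snippet runs on its own) shows that every pair of outputs is uniformly distributed over the four bit patterns, which is exactly the pairwise independence property:

```python
import math
from itertools import product

def h(n, r, x):
    # Inner product (mod 2) of the seed r with the binary representation of x
    r_len = math.ceil(math.log2(n + 1))
    if (n < 0 or not all(b in {0, 1} for b in r) or x < 1 or x > n or len(r) != r_len):
        return None
    x_bin = bin(x)[2:].zfill(r_len)
    return sum(int(x_bin[i]) for i in range(r_len) if r[i] == 1) % 2

n = 3  # seed length ceil(log2(4)) = 2, so only 4 seeds to enumerate
outputs = [[h(n, list(r), x) for x in range(1, n + 1)]
           for r in product([0, 1], repeat=2)]
```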
Algorithm
To derandomize Max-Cut, we iterate over all seeds in the hash function family and evaluate the cut value generated by each.
import math
from itertools import product

# Assumes a Graph class exposing _vertices, list_vertices() and list_edges(),
# where each edge stores its endpoints _u and _v, and each vertex carries a
# mutable _element field used here to hold its side of the cut
def maxcut_hashfunctions(g: Graph):
    n = len(g._vertices)
    r_bits = math.ceil(math.log2(n + 1))
    best_cut_value = 0
    for r in product([0, 1], repeat=r_bits):
        vertex_list = g.list_vertices()
        for x, vtx in enumerate(vertex_list):
            flip = h(n, list(r), x + 1)
            vtx._element = flip  # 0 or 1
        current_cut_value = 0
        for edge in g.list_edges():
            if edge._u._element != edge._v._element:
                current_cut_value += 1
        best_cut_value = max(best_cut_value, current_cut_value)
    return best_cut_value
Analysis of Algorithm
Time Complexity
- The hash function uses a seed of length ⌈log₂(n + 1)⌉ bits, where n is the number of vertices. Therefore, the total number of seeds is 2^⌈log₂(n + 1)⌉ = O(n).
- For each seed:
- Assign values to n vertices: O(n log n), since each hash evaluation performs an O(log n) binary conversion.
- Each edge is checked to see if its endpoints fall into opposite partitions: O(m)
Total Time: O(n · (n log n + m)) = O(n² log n + nm)
This is polynomial, and significantly faster than the exponential brute-force approach.
Space Complexity
- Only one partition is stored at a time: O(n)
- Cut value tracking: O(1)
Total Space: O(n)
3. Method of Conditional Expectations
A more efficient method to derandomize the randomized Max-Cut algorithm uses the method of conditional expectations.[5] This avoids brute force or enumerating hash functions, and deterministically constructs a partition that achieves at least the expected value of the randomized algorithm (i.e., at least half the total number of edges).
The idea is to simulate the randomized coin-flip assignment, but make each decision greedily, fixing each vertex one-by-one in a way that preserves or increases the expected size of the final cut.
The algorithm traverses each vertex and decides whether to assign it to subset S (say 1) or T (say 0), based on which choice results in a larger expected cut given the current partial assignment.
The method of conditional expectations deterministically simulates this process: instead of sampling all coin flips, it incrementally chooses the next bit (assignment of a vertex) that maximizes the conditional expected cut, conditioned on prior choices.
Since each choice preserves or improves the conditional expectation, the final cut size is at least the expectation: cut(S, T) ≥ |E| / 2.
def maxcut_conditionalexpect(g: Graph):
    for v in g._vertices:
        v._element = None  # Reset all assignments
    cut_value = 0
    for vtx in g._vertices:
        incident_edges = vtx.get_incident_edges_list()
        neighbors = [g.opposite(vtx, edge) for edge in incident_edges]
        # Edges gained if vtx is assigned to S (label 1)
        delta_S = sum(1 for neighbor in neighbors
                      if neighbor._element is not None and neighbor._element != 1)
        # Edges gained if vtx is assigned to T (label 0)
        delta_T = sum(1 for neighbor in neighbors
                      if neighbor._element is not None and neighbor._element != 0)
        # Choose the assignment that contributes more to the cut
        if delta_S >= delta_T:
            vtx._element = 1
            cut_value += delta_S
        else:
            vtx._element = 0
            cut_value += delta_T
    return cut_value
Analysis of Algorithm
Time Complexity
- Each vertex is processed once: O(n)
- For each vertex, we look at its incident edges and already-assigned neighbors: O(deg(v))
- Total time over all vertices is: O(n + m)
Space Complexity
- Each vertex stores an assignment (0, 1, or None): O(n)
- No additional space is needed for copies or recursion.
Total Space: O(n)
This method is a polynomial-time deterministic algorithm that matches the performance guarantee of the simple randomized Max-Cut algorithm, with much better efficiency than brute-force or hash-function-based derandomization.
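The same greedy rule can also be written without the Graph class assumed in the code above, using only a vertex list and an edge list (a self-contained sketch of the method of conditional expectations):

```python
def maxcut_greedy(vertices, edges):
    """Fix each vertex's side greedily so the number of already-cut edges
    never drops below the conditional expectation of the random assignment."""
    side = {}  # vertex -> 0 or 1, assigned one vertex at a time
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    cut = 0
    for v in vertices:
        assigned = [w for w in neighbors[v] if w in side]
        # Edges gained by placing v on side 1 versus side 0
        gain1 = sum(1 for w in assigned if side[w] == 0)
        gain0 = sum(1 for w in assigned if side[w] == 1)
        side[v] = 1 if gain1 >= gain0 else 0
        cut += max(gain1, gain0)
    return side, cut
```

On a triangle this returns a cut of 2 (out of 3 edges), and on a 4-cycle it recovers the full cut of 4, both at least half the edge count as guaranteed.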
Notes
- ^ "6.046J Lecture 22: Derandomization | Design and Analysis of Algorithms | Electrical Engineering and Computer Science". MIT OpenCourseWare. Retrieved 2024-12-27.
- ^ Luby, Michael; Wigderson, Avi (July 1995). Pairwise Independence and Derandomization (Report). USA: University of California at Berkeley.
- ^ a b "Lecture Notes, Chapter 3. Basic Derandomization Techniques". people.seas.harvard.edu. Retrieved 2024-12-27.
- ^ "Pairwise independence", Wikipedia, 2024-03-09, retrieved 2025-05-22
- ^ "Conditional probability", Wikipedia, 2025-03-06, retrieved 2025-05-22