What is HEAP?


Start from one Leetcode problem

Below is the problem “Remove Stones to Minimize the Total” from Leetcode weekly contest 254:

You are given a 0-indexed integer array piles, where piles[i] represents the number of stones in the ith pile, and an integer k. You should apply the following operation exactly k times:

Choose any piles[i] and remove $floor(piles[i] / 2)$ stones from it.

  • Notice that you can apply the operation on the same pile more than once.

  • Return the minimum possible total number of stones remaining after applying the k operations.

  • floor(x) is the greatest integer that is smaller than or equal to x.

Constraints:

  1. 1 <= piles.length <= $10^5$
  2. 1 <= piles[i] <= $10^4$
  3. 1 <= k <= $10^5$

First glance of the problem

As the problem mentions, we need to minimized the total piles after we remove corresponding piles. So everytime we should choose the maximum piles among the candidates. Upon this, the first thought would be to sort the piles list after each operation, however, the time complexity of the fastest sorting algorithm, such as Quicksort and Heapsort, the time complexity is $O(nlog(n))$. After we combine the k times operation, the optimal time complexity is $O(nklog(n))$, because n and k are comparable. And this way is not quite good for this scenario.

Then, we realize that we do not need to sort the list everytime, because the only element we need to operate with is the maximum piles. But we need to be concerned about how to manage the piles list after each operation to fetch the maximum element accurately. Therefore, the heap data structure comes to help us with this problem.

Heap

Heap is a general term for one special data structure in computer science, which is usually regarded as a complete binary tree. Given any node P and C in the heap, if P is the parent node of C, then the value of P will be less than or equal to $($or greater than or equal to$)$ the value of C. The top node in the stack is called the root node, and the root node itself has no parent node. There are two common heaps: the Min-heap and the Max-heap. In the Min-heap, the value of the parent node is always less than or equal to the value of the child node, while in the Max-heap, the value of the parent node is always greater than or equal to the value of the child node.

Two common Heaps: Min-heap $($left$)$, Max-heap $($right$)$  

Deletion and Insertion values

  1. Deletion process: Since deleting any element in the middle of the heap is expensive, we can simply replace the element to be deleted with the last element and delete the last element in the heap.
  • Replace the root or element to be deleted with the last element.
  • Remove the last element from the heap.
  • The last element is now placed at the position of the root node. Therefore, it may not obey the heap properties. Then, heapify the last node placed at the position of root.
  1. Insertion process: Inserting elements into the heap is a similar way to the deletion process above.
  • First increase the heap size by 1 so that it can store new elements.
  • Insert new elements at the end of the heap.
  • This new element may distort the Heap property of its parent element. Therefore, in order to maintain the properties of Heap, the newly inserted element is piled up in a bottom-up method.

Implementation of Min-heap

class MinHeap:
    def __init__(self): # The heap list is initialized with a value
        self.heap_list = [0]
        self.current_size = 0
 
    def sift_up(self, i): # Moves the value up in the tree to maintain the heap property.
        while i // 2 > 0: # While the element is not the root or the left element
            
            if self.heap_list[i] < self.heap_list[i // 2]: # If the element is less than its parent swap the elements
                self.heap_list[i], self.heap_list[i // 2] = self.heap_list[i // 2], self.heap_list[i] # Move the index to the parent to keep the properties
            i = i // 2
 
    def insert(self, k): # Inserts a value into the heap       
        self.heap_list.append(k) # Append the element to the heap       
        self.current_size += 1 # Increase the size of the heap     
        self.sift_up(self.current_size) # Move the element to its position from bottom to the top
 
    def sift_down(self, i):
        
        while (i * 2) <= self.current_size: # if the current node has at least one child
            mc = self.min_child(i) # Get the index of the min child of the current node
            if self.heap_list[i] > self.heap_list[mc]: # Swap the values of the current element is greater than its min child
                self.heap_list[i], self.heap_list[mc] = self.heap_list[mc], self.heap_list[i]
            i = mc
 
    def min_child(self, i):
        if (i * 2)+1 > self.current_size: # If the current node has only one child, return the index of the unique child
            return i * 2
        else: 
            if self.heap_list[i*2] < self.heap_list[(i*2)+1]: # Herein the current node has two children
                return i * 2 # Return the index of the min child according to their values
            else:
                return (i * 2) + 1
 
    def delete_min(self):
        if len(self.heap_list) == 1: # Equal to 1 since the heap list was initialized with a value
            return 'Empty heap'
        root = self.heap_list[1] # Get root of the heap (The min value of the heap)       
        self.heap_list[1] = self.heap_list[self.current_size] # Move the last value of the heap to the root       
        self.heap_list, _ = self.heap_list # Pop the last value since a copy was set on the root 
        self.current_size -= 1 # Decrease the size of the heap        
        self.sift_down(1) # Move down the root (value at index 1) to keep the heap property
        return root # Return the min value of the heap

Heap queue algorithm in Python

Details seen in heapq module page.

To create a heap, use a list initialized to [], or transform a populated list into a heap via function heapify.

  • heapq.heappush((heap, item)): Push the value item onto the heap, maintaining the heap invariant.
  • heapq.heappop((heap)): Pop and return the smallest item from the heap, maintaining the heap invariant. To access the smallest item without popping it, use heap[0].
  • heapq.heappushpop((heap, item)): Push item on the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush followed by a separate call to heappop.
  • heapq.heapify((x)): Transform list x into a heap, in-place, in linear time.
  • heapq.heapreplace((heap, item)): Pop and return the smallest item from the heap, and also push the new item. The heap size doesn’t change.
  • heapq.merge((*iterables, key=None, reverse=False)): Merge multiple sorted inputs into a single sorted output, for example, merge timestamped entries from multiple log files. Returns an iterator over the sorted values.

Final solution in Python

First, we need to transform our piles list into a heap queue, where we use heap = [-i for i in piles]. The reason we assign -i for each element in the piles is that we hope to update the maximum value in the piles in a Min-heao by heapq.heapify. So the reversion of the sign is necessary.

Then, for each loop, we fetch the “minimum” value by heapq.heappop, which is the root element and update it by the rules: remove $floor(piles[i] / 2)$ stones from it.

Lastly, use heapq.heappush to restore the updated element back to the heap queue.

class Solution:
    def minStoneSum(self, piles: List[int], k: int) -> int:
        heap = [-i for i in piles]
        heapq.heapify(heap)
        for i in range(k):
            num = -heapq.heappop(heap)
            num -= num//2
            heapq.heappush(heap,-num)
        return -sum(heap)

 
The time complexity of heappop and heappush are both $O(log(n))$, so the total time complexity is $O(klog(n))$, which is the optimal one.

Reference

Heap introduction: https://en.wikipedia.org/wiki/Heap_$($data_structure$)$
heapq — Heap queue algorithm: https://docs.python.org/3/library/heapq.html
Implementation of Min-heap: https://www.educative.io/edpresso/heap-implementation-in-python


Author: Zhenchen Hong
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Zhenchen Hong !
  Table of Contents