Full-Powered Binary Search with `bisect` in Python 3.10

FanchenBao
6 min readJun 6, 2022

--

No need to roll your own binary search from scratch anymore

Introduction

Binary search is one of the most classic algorithms taught in schools and tested in interviews. It’s a very powerful tool and the concept is not difficult to grasp. However, rolling your binary search from scratch can be tricky due to many design choices (e.g. are duplicates allowed? Do we use while lo < hi or while lo <= hi? Shall we opt for lo = mid + 1 or hi = mid — 1? etc.) and edge cases. Hence, it is desirable to have a pre-written and well-tested binary search function readily available in your toolbox.

In Python, binary search can be done using the bisect module, which offers two handy functions that are guaranteed to be correct: bisect_right and bisect_left. Both functions are able to efficiently find the index to insert a target value in a sorted list. The difference is how they handle the case where the target value already exists in the list.

Although we can do binary search using the bisect module, it is not a full-powered binary search tool, because it only allows direct comparison between an element in the list and the target value. This makes bisect ill-equipped to solve optimum-searching problems. For instance, suppose we want to find the minimum budget to complete a task. The premise is that it is difficult to search for the minimum directly, but it is easy to tell whether the task can be completed within a given budget (e.g. pass the budget to an evaluation function that returns a boolean value). For an optimum-searching problem like this, binary search is usually the best option. We set up a range of budgets, pick the middle one and see if the task can be completed within that budget. If it can, we might reduce the budget even more by moving the upper bound to the middle; otherwise, we move the lower bound to the middle. Repeat this until we converge on the minimum budget.

bisect is helpless here, because in this problem, the “comparison” is not made directly between the list element and the target, but by passing the list element to an evaluation function. In fact, there is no target in the first place. Due to this drawback, bisect is not a full-powered binary search tool.

Enlightenment

I have made my peace with bisect being a super useful but not perfect tool for quite a while, until I saw this post by the legendary lee215 a few days ago, where he used bisect_left with a key keyword argument (kwarg) and a cryptic syntax to perform optimum-searching binary search. An excerpt of his code looks like this.

bisect_left(range(10**9 + 1), True, key=die) 

What the heck is that? I have never seen a key kwarg in bisect before, nor does the syntax make much sense. But after I went through the documentation and the source code, I was enlightened.

Starting from Python 3.10, a new key kwarg is added to all the functions in the bisect module. It works the same as the key kwarg in other Python sorting functions: pass a function object (or lambda) to key, and it will transform the list elements before comparison is made. The benefit of the key kwarg is that we now have the ability to introduce an external function to bisect, which may serve as the evaluation function in an optimum-searching problem. But there is still the problem of comparison. What should bisect compare to when there is no target? That’s where the cryptic syntax comes into play, where the target value is set to boolean literal True. To decode all this, let’s walk through a simple optimum-searching problem and see how we can solve it using bisect in Python 3.10.

Example 1: Square Root of An Integer

The problem wants to find the integer part of the square root of a given integer x without using math.sqrt(x) or x**0.5 . We can rephrase the problem like this: find the largest integer whose square is smaller or equal to x. This is an optimum-searching problem and can be solved using binary search. A sample solution where binary search is rolled from scratch is shown below.

def mySqrt(x):
lo, hi = 1, (1 << 31) - 1
while lo < hi:
mid = (lo + hi) // 2
if mid * mid <= x:
lo = mid + 1
else:
hi = mid
return lo - 1

Let’s translate it with bisect_left using the key kwarg. The source code of bisect_left is given below. We can see that it is already very similar to the sample solution.

# bisect_left abridged source code
def bisect_left(a, x, lo=0, hi=None, *, key=None):
# the irrelevant code is omitted
while lo < hi:
mid = (lo + hi) // 2
if key(a[mid]) < x:
lo = mid + 1
else:
hi = mid
return lo

The contribution of key is that instead of comparing a[mid] to x, we are now comparing key(a[mid]) to x. If we can set x to a value such that the output of key(a[mid]) < x is solely determined by the output of key(a[mid]), we have injected a valid evaluation function into bisect_left.

Notice these two evaluations in Python

True < True evaluates to False
False < True evaluates to True

This tells us that if we set x = True, and if key(a[mid]) evaluates to boolean, then the output of key(a[mid]) < x is always the opposite of the output of key(a[mid]). In other words, if we want key(a[mid]) < x to evaluate to true, we must make key(a[mid]) evaluate to false, and vice versa. In the current case, we want key(a[mid]) < x to be true when mid * mid <= x, so the key function must be the opposite, which is mid * mid > x.

Now we can write the solution with bisect_left. We pass three arguments.

  • The list must contain all the values from 1 to 2³¹ - 1, sorted. We use range(1 << 31). Technically, this is not a list, but it works as well.
  • The target is True.
  • The key kwarg is key=lambda mid: mid * mid > x

The solution becomes a one-liner:

def mySqrt(self, x: int) -> int:
return bisect_left(range(1 << 31), True, key=lambda mid: mid * mid > x) - 1

Example 2: First Bad Version

This is another optimum-searching problem that can be solved with binary search. Given a list of sorted version numbers, the problem wants to find the smallest version number that represents a bad version. The premise is that once a version is bad, all the versions that follow are also bad.

Binary search from scratch produces a solution like this.

class Solution:
def firstBadVersion(self, n: int) -> int:
lo, hi = 1, n
while lo < hi:
mid = (lo + hi) // 2
if not isBadVersion(mid):
lo = mid + 1
else:
hi = mid
return lo

Using bisect_left, we can produce the three arguments as follows

  • The list is range(n + 1)
  • The target is True
  • The key kwarg is ??

I will not spoil the fun of finding the key kwarg on your own. Yet, if you have trouble coming up with an appropriate key kwarg, compare this problem to Example 1 and see the similarity. These two problems are essentially the same, just worded differently.

Test Your Understanding

If you’ve come this far, maybe you’re willing to come a little further

I hope you have gained a good grasp of why bisect_left can be leveraged as a full-powered binary search tool. To further test your understanding, here are two more challenges.

  1. Notice that in Examples 1 and 2, both solutions that roll binary search from scratch use lo = 1, yet in the bisect_left solution, we use range(1 << 31) and range(n + 1), respectively, both of which start from 0. Can we use range(1, 1 << 31) and range(1, n + 1) instead? Why?
  2. Can we replace bisect_left with bisect_right? If yes, solve both problems above using bisect_right; if no, explain why.

--

--

FanchenBao
FanchenBao

Written by FanchenBao

Hi, I am from the Earth. And you?

No responses yet