k-Smooth Numbers and Their Generalization

June 28, 2011

1. Introduction

A positive integer is called {k}-smooth if none of its prime factors are greater than {k}. Let {P(k)} be the set of prime numbers not greater than {k}, i.e.,

\displaystyle P(k)=\{n\in\mathbb{N}:n\text{ is prime number and }n\leq k\}.
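The two definitions above can be illustrated with a minimal Python sketch. The function names `primes_up_to` and `is_k_smooth` are chosen here for illustration only, and the sieve is just one simple way to build {P(k)}.

```python
def primes_up_to(k):
    """Return P(k): the sorted list of primes not greater than k."""
    sieve = [True] * (k + 1)
    primes = []
    for n in range(2, k + 1):
        if sieve[n]:
            primes.append(n)
            # Cross out the multiples of the prime n.
            for multiple in range(n * n, k + 1, n):
                sieve[multiple] = False
    return primes


def is_k_smooth(n, k):
    """Return True if the positive integer n has no prime factor greater than k."""
    for p in primes_up_to(k):
        # Divide out every factor p <= k; what remains has only larger prime factors.
        while n % p == 0:
            n //= p
    return n == 1


print(primes_up_to(5))     # [2, 3, 5]
print(is_k_smooth(60, 5))  # True, since 60 = 2^2 * 3 * 5
print(is_k_smooth(14, 5))  # False, since 7 divides 14
```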

In the classic Hamming problem, we are asked to print the first {n} {5}-smooth numbers (also called Hamming numbers) in increasing order. Dijkstra [1] proposed the algorithm below in 1981 to solve this problem.

Dijkstra’s Algorithm


{H\leftarrow \{1\}}, {k\leftarrow 0}, {R\leftarrow\emptyset}
while {k<n}
      {h\leftarrow\text{the minimum of }H}
      {H\leftarrow 2H\cup3H\cup5H} ({aH} means the set {\{ax:x\in H\}})
      Put {h} into {R}
      {k\leftarrow k+1}
endwhile
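For comparison, here is a minimal Python sketch of the classic problem. It is not Dijkstra's SASL program: it swaps the lazy merge for a binary heap plus a set that discards duplicates, so it runs in {O(n\log n)} time rather than {O(n)}. The name `hamming_numbers` is an assumption made for this example.

```python
import heapq


def hamming_numbers(n):
    """Return the first n 5-smooth numbers in increasing order."""
    heap = [1]   # candidates not yet output
    seen = {1}   # every candidate ever pushed, used to skip duplicates such as 2*3 = 3*2
    result = []
    while len(result) < n:
        h = heapq.heappop(heap)  # the minimum of the candidate set
        result.append(h)
        for factor in (2, 3, 5):
            candidate = h * factor
            if candidate not in seen:
                seen.add(candidate)
                heapq.heappush(heap, candidate)
    return result


print(hamming_numbers(10))  # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```

The `seen` set is exactly the bookkeeping that a duplicate-free assignment of candidates would make unnecessary.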

Although Dijkstra’s algorithm is designed to solve the classic Hamming problem, it is straightforward to extend it to the general Hamming problem, i.e., printing the first {n} {k}-smooth numbers in increasing order. However, in a programming language without lazy evaluation, it is hard to make it run in {O(n)} time, even assuming that multiplying two integers takes {O(1)} time. In this paper, we propose a new algorithm that solves the Hamming problem in the more general setting described below.

Let {B} be a finite set of {m} positive integers. We arrange the elements of {B} so that {b_{1}<b_{2}<\ldots<b_{m}}. We say that {B} is a smooth base if each element {b_{k}} has a prime factor {p_{k}} that does not divide any other element of {B}. Formally, {B} is a smooth base if and only if

\displaystyle \forall k\,\exists p_{k}:\,(p_{k}\mid b_{k})\wedge(\forall q\neq k:\, p_{k}\nmid b_{q}),

where {p\mid b_{k}} means that there is an integer {t} such that {b_{k}=pt}, and {p\nmid b_{k}} means that no such integer {t} exists. Such a prime factor {p_{k}} is called a key factor of {b_{k}}. In other words, {B} is a smooth base if and only if every element has at least one key factor. For example, {\{6,35\}} is a smooth base, whereas {\{2,4\}} is not, because the only prime factor of {2} also divides {4}.
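The definition of a smooth base can be checked mechanically. The following Python sketch uses trial division, which is adequate only for small bases; the helper names `prime_factors`, `key_factors` and `is_smooth_base` are assumptions introduced for this example.

```python
def prime_factors(n):
    """Return the set of prime factors of a positive integer n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)  # whatever remains is itself prime
    return factors


def key_factors(base):
    """Map each element b of base to the set of its key factors."""
    factor_sets = {b: prime_factors(b) for b in base}
    result = {}
    for b in base:
        # Primes dividing some other element of the base.
        others = set()
        for c in base:
            if c != b:
                others |= factor_sets[c]
        result[b] = factor_sets[b] - others
    return result


def is_smooth_base(base):
    """Return True if every element of base has at least one key factor."""
    return all(key_factors(base).values())


print(is_smooth_base([2, 3, 5]))    # True
print(is_smooth_base([6, 35]))      # True
print(is_smooth_base([6, 10, 15]))  # False: each prime divides two elements
```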

Given a smooth base {B=\{b_{1},b_{2},\ldots,b_{m}\}}, we use {H(B)} to denote the set containing all the integers in the form

\displaystyle b_{1}^{x_{1}}b_{2}^{x_{2}}\cdots b_{m}^{x_{m}},

where {x_{1},x_{2},\ldots,x_{m}} are nonnegative integers. We also say that {H(B)} is generated by {B}, and the numbers in {H(B)} are called generalized Hamming numbers. Every number in {H(B)} has a unique representation of this form, i.e., it corresponds to a unique tuple {(x_{1},x_{2},\ldots,x_{m})}: a key factor {p_{k}} of {b_{k}} divides no other base element, so the exponent of {p_{k}} in the product determines {x_{k}}. Thus, once the smooth base is fixed, we also simply use {(x_{1},x_{2},\ldots,x_{m})} to denote {b_{1}^{x_{1}}b_{2}^{x_{2}}\cdots b_{m}^{x_{m}}}.
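The correspondence between numbers and exponent tuples can be made concrete. In the sketch below, the caller supplies one key factor per base element in the list `keys`; that parameter and the function names are assumptions of this example. The exponent {x_{k}} is recovered by comparing how often the key factor divides {n} and {b_{k}}.

```python
from math import prod


def value(base, exponents):
    """Return b_1^x_1 * b_2^x_2 * ... * b_m^x_m."""
    return prod(b ** x for b, x in zip(base, exponents))


def valuation(n, p):
    """Return the exponent of the prime p in the positive integer n."""
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count


def exponents_of(base, keys, n):
    """Recover the tuple (x_1, ..., x_m) of a positive integer n.

    keys[k] must be a key factor of base[k]. Returns None if n is not in H(B).
    """
    exponents = []
    for b, p in zip(base, keys):
        x, remainder = divmod(valuation(n, p), valuation(b, p))
        if remainder:
            return None
        exponents.append(x)
    # The key factors fix the exponents; the final check rejects numbers outside H(B).
    return tuple(exponents) if value(base, exponents) == n else None


base, keys = [6, 35], [3, 7]
print(value(base, (2, 1)))             # 1260
print(exponents_of(base, keys, 1260))  # (2, 1)
print(exponents_of(base, keys, 12))    # None, since 12 is not in H(B)
```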

Note that {P(k)} is a smooth base for all {k\geq2}. For example, if {B=P(5)=\{2,3,5\}}, then {H(B)} is the collection of all {5}-smooth numbers. Thus, computing the first {n} {k}-smooth numbers is just a special case of computing the first {n} generalized Hamming numbers generated by a given smooth base.

2. Algorithm

Examining Dijkstra’s algorithm closely, we can see that there are two essential operations: finding the minimum of {H}, and merging {2H}, {3H} and {5H}. The merging operation would be very easy if the three sets being merged were disjoint. Unfortunately, they are not. For example, {2\times3} belongs to both {2H} and {3H} for {H=\{2,3,5\}}. To avoid such duplicates, one possible way is to resolve the ambiguity: assign {2\times3} to either {2H} or {3H}, but not both. This can be done by adopting a kind of “Maximum Principle”: if {a\in sH} and {a\in tH} at the same time, and {s<t}, we assign {a} only to {tH}. If {a} belongs to more than two such sets {iH}, we assign {a} to the one with the largest {i}. Inspired by this idea, we have the following algorithm to compute the first {n} generalized Hamming numbers generated by a smooth base {B=\{b_{1},b_{2},\ldots,b_{m}\}}.

k-Smooth Algorithm


Initialize queues {Q_{1},Q_{2},\ldots,Q_{m}} to be empty
{R\leftarrow\emptyset}
for {t} from {1} to {m}
      push {b_{t}} into queue {Q_{t}}
endfor
{k\leftarrow0}
while {k<n} do
      Let {h} be the minimum among the front elements of the queues {Q_{i}\,(1\leq i\leq m)}, and let {Q_{j}} be the queue containing {h}
      for {t} from {j} to {m}
            push {h\cdot b_{t}} into queue {Q_{t}}
      endfor
      Remove {h} from {Q_{j}}.
      Put {h} into output sequence {R}
      {k\leftarrow k+1}
endwhile
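The pseudocode above translates almost line by line into Python. The following is a sketch under the assumption that `base` is a nonempty smooth base; the function name `generalized_hamming` is chosen for this example. Note that, exactly as in the pseudocode, the queues are seeded with {b_{1},\ldots,b_{m}}, so the output starts at {b_{1}} and the number {1} (all exponents zero) is not emitted.

```python
from collections import deque


def generalized_hamming(base, n):
    """Return the first n numbers produced by the queue-based algorithm.

    base is assumed to be a nonempty smooth base; it is sorted so b_1 < ... < b_m.
    """
    base = sorted(base)
    queues = [deque([b]) for b in base]  # Q_t initially holds only b_t
    result = []
    while len(result) < n:
        # Index j of the queue whose front element is the overall minimum.
        j = min(range(len(base)), key=lambda i: queues[i][0])
        h = queues[j][0]
        # Maximum Principle: h is multiplied only by b_j, ..., b_m,
        # so each product lands in exactly one queue.
        for t in range(j, len(base)):
            queues[t].append(h * base[t])
        queues[j].popleft()  # remove h from Q_j
        result.append(h)
    return result


print(generalized_hamming([2, 3, 5], 10))  # [2, 3, 4, 5, 6, 8, 9, 10, 12, 15]
print(generalized_hamming([6, 35], 5))     # [6, 35, 36, 210, 216]
```

Each output element costs one scan over the {m} queue fronts and at most {m} pushes, so for a fixed base the loop does a constant amount of work per element. No queue ever becomes empty, because removing {h} from {Q_{j}} is always accompanied by pushing {h\cdot b_{j}} into it.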

Theorem 1: When k-Smooth Algorithm terminates, {R} is the sequence of the first {n} generalized Hamming numbers generated by {B}.

To prove the theorem, we first establish two important facts. The first one shows that {Q_{i}} and {Q_{j}} are disjoint if {i\neq j}.

Fact 1: If {i\neq j}, then {Q_{i}\cap Q_{j}=\emptyset}.

Proof: We first show by induction that no element of {Q_{1}} contains a key factor of {b_{2},b_{3},\ldots,b_{m}}. At the beginning, {Q_{1}=\{b_{1}\}}, and by the definition of a smooth base, {b_{1}} contains no key factor of {b_{2},b_{3},\ldots,b_{m}}. Now assume that after the first {p} iterations of the while loop, no element of {Q_{1}} contains a key factor of {b_{2},b_{3},\ldots,b_{m}}. If a new element {h\cdot b_{1}} is pushed into {Q_{1}} in iteration {p+1}, then according to the algorithm {h} must come from {Q_{1}}, and hence {h} contains no key factor of {b_{2},b_{3},\ldots,b_{m}}. Neither does {b_{1}}, and since key factors are prime, {h\cdot b_{1}} contains no key factor of {b_{2},b_{3},\ldots,b_{m}} either. Thus no element of {Q_{1}} contains a key factor of {b_{2},b_{3},\ldots,b_{m}} at the end of iteration {p+1}. The statement is trivially true if no element is pushed into {Q_{1}} in iteration {p+1}.

Again by induction, we show that the elements of {Q_{i}} ({1\leq i<m}) do not contain key factors of {b_{i+1},b_{i+2},\ldots,b_{m}}. For {i=1}, the statement holds by the argument above. Now assume that the statement holds for all {i} with {1\leq i<p<m}. By an argument similar to the one used for {Q_{1}}, we can show that the statement also holds for {Q_{p}}. Therefore, the statement holds for all {1\leq i<m}.

By the algorithm, we also know that every element of {Q_{i}} {(1\leq i\leq m)} contains a key factor of {b_{i}}, since it is either {b_{i}} itself or a multiple of {b_{i}}. Now let {1\leq i<j\leq m}. Since no element of {Q_{i}} contains a key factor of {b_{j}} while every element of {Q_{j}} does, no element of {Q_{i}} can belong to {Q_{j}}, and hence {Q_{i}\cap Q_{j}=\emptyset}. \Box

We now know that elements from different queues are distinct. To demonstrate that no duplicate numbers are added to {R}, we need to show that elements in the same queue are also distinct. In fact, we show a stronger conclusion: elements are pushed into each queue in strictly increasing order. At the same time we obtain another important fact, namely that every generalized Hamming number is pushed into some queue. This guarantees that no generalized Hamming numbers are skipped by the algorithm.

Before stating the next fact, we define the followers of a generalized Hamming number {(x_{1},x_{2},\ldots,x_{m})} as the {m} numbers {(x_{1}+1,x_{2},\ldots,x_{m}),\,(x_{1},x_{2}+1,\ldots,x_{m}),\ldots,(x_{1},x_{2},\ldots,x_{m}+1)}.
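In terms of exponent tuples, taking a follower means incrementing a single coordinate, i.e., multiplying the number by one base element. A small Python sketch, with the function name `followers` chosen for illustration:

```python
def followers(exponents):
    """Return the m followers of an exponent tuple, one per incremented coordinate."""
    return [
        exponents[:i] + (x + 1,) + exponents[i + 1:]
        for i, x in enumerate(exponents)
    ]


# With B = {2, 3, 5}, the tuple (1, 0, 2) denotes 2 * 5^2 = 50;
# its followers denote 100, 150 and 250.
print(followers((1, 0, 2)))  # [(2, 0, 2), (1, 1, 2), (1, 0, 3)]
```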

Fact 2: At any time, the elements in {Q_{i}} ({1\leq i\leq m}) are strictly increasing and hence distinct. Also, if at step {p} the number {(x_{1},x_{2},\ldots,x_{m})} is removed from some queue, then each of its {m} followers either has already been pushed into some queue at some step {q} {(q<p)}, or is pushed into some queue at step {p}.

Proof: Again, we use induction. Obviously, the statement holds at the very beginning of the first iteration of the while loop. Assume that the statement is correct at step {p-1}, and suppose that at step {p}, {g=(x_{1},x_{2},\ldots,x_{m})} is removed from queue {Q_{j}}.

For {1\leq i<j}, the number {(x_{1},\ldots,x_{i}+1,x_{i+1},\ldots,x_{j}-1,x_{j+1},\ldots x_{m})} is smaller than {g} because {b_{i}<b_{j}} (note that {x_{j}\geq1}, since {g\in Q_{j}}), so by the induction assumption it was removed at some step {l<p}. Therefore, again by the assumption, its follower {(x_{1},\ldots,x_{i}+1,x_{i+1},\ldots,x_{m})}, which is also a follower of {g}, was pushed into some queue before step {p}. For {i\geq j}, the follower {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})} is pushed into {Q_{i}} at step {p}. Therefore, the second half of the statement still holds at step {p}.

For each {i} such that {j\leq i\leq m}, a new element {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})} is pushed into {Q_{i}}. If this is the only element of {Q_{i}}, then {Q_{i}} is trivially increasing. Now assume that {Q_{i}} contains more than one element, and let {(y_{1},\ldots,y_{i},\ldots,y_{m})} be any element of {Q_{i}} other than {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})}. According to the algorithm, {(y_{1},\ldots,y_{i},\ldots,y_{m})} was pushed into {Q_{i}} earlier because the element {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})} was removed from some queue before {g}. Hence {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})<g}, and further {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})\times b_{i}<g\times b_{i}}. That is, {(y_{1},\ldots,y_{i},\ldots,y_{m})<(x_{1},\ldots,x_{i}+1,\ldots,x_{m})}, and {Q_{i}} is strictly increasing. \Box

With these two facts established, the correctness of Theorem 1 follows easily.

Proof: Since the numbers in each queue are strictly increasing and the numbers in all queues are distinct, the output sequence is strictly increasing and hence has no duplicates. Also, since no number is skipped, the output sequence must contain the first {n} generalized Hamming numbers generated by the base {B} when the algorithm terminates. \Box
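As an empirical sanity check (not a substitute for the proof), the sketch below runs the queue-based algorithm while asserting the invariants of Fact 1 and Fact 2 after every step, and compares the output with a brute-force enumeration of {H(B)}. The names `run_with_checks` and `brute_force` are assumptions of this example, and the brute-force list excludes {1} because the algorithm seeds its queues with {b_{1},\ldots,b_{m}}.

```python
from collections import deque
from itertools import product
from math import prod


def run_with_checks(base, n):
    """Run the queue-based algorithm, asserting Facts 1 and 2 after each step."""
    base = sorted(base)
    queues = [deque([b]) for b in base]
    result = []
    while len(result) < n:
        j = min(range(len(base)), key=lambda i: queues[i][0])
        h = queues[j][0]
        for t in range(j, len(base)):
            queues[t].append(h * base[t])
        queues[j].popleft()
        result.append(h)
        # Fact 2 (first half): every queue is strictly increasing.
        for q in queues:
            items = list(q)
            assert all(a < b for a, b in zip(items, items[1:]))
        # Fact 1: no number appears in two different queues.
        all_items = [x for q in queues for x in q]
        assert len(all_items) == len(set(all_items))
    return result


def brute_force(base, n):
    """Return the n smallest elements of H(B) greater than 1 by exhaustive search."""
    # b_1, b_1^2, ..., b_1^n are n elements of H(B), so the n-th smallest
    # is at most b_1^n and no exponent larger than n is needed.
    limit = min(base) ** n
    values = set()
    for exps in product(range(n + 1), repeat=len(base)):
        v = prod(b ** x for b, x in zip(base, exps))
        if 1 < v <= limit:
            values.add(v)
    return sorted(values)[:n]


print(run_with_checks([2, 3, 5], 20) == brute_force([2, 3, 5], 20))  # True
print(run_with_checks([6, 35], 8) == brute_force([6, 35], 8))        # True
```

Running `run_with_checks` on a set that is not a smooth base, such as `[2, 4]`, trips the Fact 1 assertion in the first step, because {4} ends up in both queues.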

A PDF version of this article can be found here.

References
[1] Edsger W. Dijkstra. Hamming’s exercise in SASL. 1981.
