Skip to content

k-Smooth Numbers and its Generalization

June 28, 2011

1. Introduction

A positive integer is called {k}-smooth if none of its prime factors are greater than {k}. Let {P(k)} be the set of prime numbers not greater than {k}, i.e.,

\displaystyle P(k)=\{n\in\mathbb{N}:n\text{ is prime number and }n\leq k\}.

In the classic Hamming problem, we are asked to print the first {n} {5}-smooth numbers (they are also called Hamming numbers) in the increasing order. Dijkstra [1] proposed the algorithm below in 1981 to solve this problem.

Dijstra’s Algorithm


{H\leftarrow \{1\}}, {k\leftarrow 0}, {R\leftarrow\emptyset}
while {k<n}
      {h\leftarrow\text{the minimum of }H}.
      {H\leftarrow 2H\cup3H\cup5H} ({aH} means the set \{ ah:a\in H\})
      Put {h} into {R}
      {k\leftarrow k+1}
endwhile

Although Dijkstra’s algorithm is designed to solve the classic Hamming problem, it is quite straightforward to extend it to solve general Hamming problem, i.e., print the first {n} {k}-smooth number in the increasing order. However, the programming language without lazy evaluation feature, it is very hard to manage it to run in {O(n)} time, provided that computing multiplication of two integers takes {O(1)} time. In this paper, we propose a new algorithm to solve Hamming problem in a more general setting described below.

Let {B} be a finite set of {m} positive integers. We arrange the elements in {B} so that {b_{1}<b_{2}<\ldots<b_{m}}. We say that {B} is a smooth base if for each element {b_{k}}, it has a prime factor {p_{k}} whose is not included in any other element. Formally, {B} is a smooth base if and only if

\displaystyle \forall k\,\exists p_{k}:\,(p_{k}\mid b_{k})\wedge(\forall q\neq k:\, p\nmid b_{q}),

where {p\mid b_{k}} means there is a integer {t} such that {b_{k}=pt}, and {p\nmid b_{k}} means no such integer {t} exsits. Such a prime factor {p_{k}} is called the key factor of {b_{k}}. In other words, {B} is a smooth base if and only if every element has at least one key factor.

Given a smooth base {B=\{b_{1},b_{2},\ldots,b_{m}\}}, we use {H(B)} to denote the set containing all the integers in the form

\displaystyle b_{1}^{x_{1}}b_{2}^{x_{2}}\cdots b_{m}^{x_{m}},

where {x_{1},x_{2},\ldots,x_{m}} are nonnegative integers. We also say that {H(B)} is generated by {B}, and numbers in {B} are called generalized Hamming numbers. It is easy to see that every number in {H(B)} can be uniquely represented by this way, i.e., every number in {H(B)} correpsonds to a unique tupple {(x_{1},x_{2},\ldots,x_{m})}. Thus, once the smooth base is fixed, we also simply use {(x_{1},x_{2},\ldots,x_{m})} to denote {b_{1}^{x_{1}}b_{2}^{x_{2}}\cdots b_{m}^{x_{m}}}.

Note that {P(k)} is a smooth base for all {k\geq2}. For example, if {B=P(5)=\{2,3,5\}}, then {H(B)} is the collection of all {5}-smooth numbers. Thus, computing the first {n} {k}-smooth numbers is just a special case of computing first {n} generalized Hamming numbers generated by a smooth given smooth base.

2. Algorithm

Examinating the Dijkstra’s algorithm closely, we can see that there are two essential operations: finding the minimum of {H} and merging {2H}, {3H} and {5H}. For the merging operation, if the three components merged are disjoint, then it is very easy. Unfortunately, they are not. For example, {2\times3} belong to {2H} and {3H} for {H=\{2,3,5\}}. To avoid such duplicating, one possible way is to resolve the ambiguity: assign {2\times3} to either {2H} or {3H}, but not both. This can be done by adopting a kind of “Maximum Principle”: if {a\in sH} and {a\in tH} at the same time, and {s<t}, we only assign {a} to {tH}. For {a} belongs to more than two such sets {iH}, we assign {a} to the one with largest {i}. Inspired by this idea, we have following algorithm to compute first {n} generalized Hamming numbers generated by a smooth base {B=\{b_{1},b_{2},\ldots,b_{m}\}}.

k-Smooth Algorithm


Initialize queues {Q_{1},Q_{2},\ldots,Q_{m}} to be empty
{R\leftarrow\emptyset}
for {t} from {1} to {m}
      push {b_{t}} into queue {Q_{t}}
endfor
{k\leftarrow0}
while {k<n} do
      Let {h} be the minimum element in the front of each queue {Q_{i}\,(1\leq i\leq m)} and assume {h\in Q_{j}}
      for {t} from {j} to {m}
            push {h\cdot b_{t}} into queue {Q_{t}}
      endfor
      Remove {h} from {Q_{j}}.
      Put {h} into output sequence {R}
      {k\leftarrow k+1}
endwhile

Theorem 1: When k-Smooth Algorithm terminates, {R} is the sequence of the first {n} generalized Hamming numbers generated by {B}.

To prove the theorem, we shall firstly establish two impartant facts. The first one shows that {Q_{i}} and {Q_{j}} are disjoint if {i\neq j}.

Fact 1: If {i\neq j}, then {Q_{i}\cap Q_{j}=\emptyset}.

Proof: We first show by induction that any element in {Q_{1}} does not contain any key factors of {b_{2},b_{3},\ldots,b_{m}}. At the beginning, {Q_{1}=\{b_{1}\}} and by the definition of smooth base {B}, {b_{1}} does not have key factors of {b_{2},b_{3},\ldots,b_{m}}. Now assume after the first {p} loops of the while block, elements in {Q_{1}} do not contain key factors of {b_{2},b_{3},\ldots,b_{m}}. If at the {p+1}st loop of the while block, a new element {h\cdot b_{1}} is pushed into {Q_{1}}, then according to the algorithm, {h} must come from {Q_{1}} and hence {h} does not contain key factors of {b_{2},b_{3},\ldots,b_{m}}. This also implies that {h\cdot b_{1}} does not contain key factors of {b_{2},b_{3},\ldots,b_{m}} and all members of {Q_{1}} do not contain key factors of {b_{2},b_{3},\ldots,b_{m}} in the end of the {p+1}st loop. The statement is trivially true if there is no element pushed into {Q_{1}} in {p+1}st loop.

Again, by induction we show that members in {Q_{i}} ({1\leq i<m}) do not contain key factors in {b_{i+1},b_{i+2},\ldots,b_{m}}. For {i=1}, the statement holds by the argument above. Now assume the statement holds for all {1\leq i<p<m}. By the similar argument used to prove the statement for {Q_{1}}, we can show that the statement also holds for {Q_{p}}. Therefore, the statement holds for {1\leq i<m}.

By the algorithm, we also know that elements in {Q_{i}} {(1\leq i\leq m)} includes key factor of {b_{i}}. Now let {1\leq i<j\leq m}. Since members in {Q_{i}} do not contain any key factor of {b_{j}} while all members in {Q_{j}} have a key factor of {b_{j}}, no element in {Q_{i}} can belong to {Q_{j}} and hence {Q_{i}\cap Q_{j}=\emptyset}. \Box

Now we have known that elements from different queues are distinct. To demonstrate that no duplicated numbers will be added to {R}, we need to show that elements in the same queue is also distinct. Actually, we manage to show a stronger conclusion: elements in the same queue are pushed into the queue by the strictly increasing order. We also obtain an important fact at the same time, which shows every generalized Hamming number will be pushed into some queue. This guranttees than no generalized Hamming numbers are skipped by the algorithm.

Before starting the next fact, we define followers of a generalized Hamming number {(x_{1},x_{2},\ldots,x_{m})} as the {m} numbers {(x_{1}+1,x_{2},\ldots,x_{m}),\,(x_{1},x_{2}+1,\ldots,x_{m}),\ldots,(x_{1},x_{2},\ldots,x_{m}+1)} .

Fact 2: At any time, elements in {Q_{i}} ({1\leq i\leq m}) are strictly increasing and hence distinct. Also, if at step {p}, (x_{1},x_{2},\ldots,x_{m}) is removed from some queue, then each of its {m} followers, either has been already pushed into some queue at some step {q} {(q<p}), or will be pushed into some queue at the step {p}.

Proof: Again, we prove it by induction. Obviously, at the very beginning of the first loop of the while block, the statement above holds. Assume the statement is correct at the step {p-1}. Suppose at the step {p}, {g=(x_{1},x_{2},\ldots,x_{m})} is removed from queue {Q_{j}}.

For {1\leq i<j}, since {(x_{1},\ldots,x_{i}+1,x_{i+1},\ldots,x_{j}-1,x_{j+1},\ldots x_{m})} is smaller than {g}, it was removed at some step {l<p}, according to the induction assumption. Therefore, by the assumption, its follower {(x_{1},\ldots,x_{i}+1,x_{i+1},\ldots,x_{m})}, which is also a follower of {g}, was pushed into some queue before the step {p}. For {i\geq j}, the follower {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})} is pushed into {Q_{i}} at the step {p}. Therefore, the second half of the statement still holds for the step {p}.

For each {i} such that {j\leq i\leq m}, there is a new element {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})} pushed into {Q_{i}}. If {Q_{i}} contains only one element {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})} , then {Q_{i}} is increasing trivially. Now assume {Q_{i}} contains more than one element. Let {(y_{1},\ldots,y_{i}+1,\ldots,y_{m})} be any one in {Q_{i}} other than {(x_{1},\ldots,x_{i}+1,\ldots,x_{m})}. According to the algorithm, {(y_{1},\ldots,y_{i},\ldots,y_{m})} was pushed into {Q_{i}} before {g} because the element {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})} was removed from some queue before {g}. Hence {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})<g}, and further {(y_{1},\ldots,y_{i}-1,\ldots,y_{m})\times b_{i}<g\times b_{i}}. That is, {(y_{1},\ldots,y_{i},\ldots,y_{m})<(x_{1},\ldots,x_{i}+1,\ldots,x_{m})}, and {Q_{i}} is strictly increasing. \Box

Given these two facts established, it is quite straightforward to see the correctness of the statement in Theorem 1.

Proof: Since numbers in each queue is strictly increasing and numbers in all queues are distinct, the outputed sequence is strictly increasing and hence has no duplicates. Also, since no number will be skipped, the output sequence must contain the first {n} generalized Hamming numbers generated by the base {B} when the algorithm terminates. \Box

A PDF version of this article can be found here.

References
[1] Edsger W. Dijkstra. Hamming’s exercise in sasl. 1981.

An Integer Decomposition Problem

October 12, 2010

There are a lot of integer decomposition problems. Here is one of them: decompose an integer N into k different integers so that the sum of them is N and the multiplication of them is maximized. A vivid version of this problem described below.

Parliament (source: Northeastern Europe 1998): New convocation of The Fool Land’s Parliament consists of {N} delegates. According to the present regulation delegates should be divided into disjoint groups of different sizes and every day each group has to send one delegate to the conciliatory committee. The composition of the conciliatory committee should be different each day. The Parliament works only while this can be accomplished. You are to write a program that will determine how many delegates should contain each group in order for Parliament to work as long as possible.

Input:  The input file contains a single integer {N} ({5<=N<=1000} ).

Output: Write to the output file the sizes of groups that allow the Parliament to work for the maximal possible time. These sizes should be printed on a single line in ascending order and should be separated by spaces.

Sample Input

7

Sample Output

3 4

Before introduce the solution, let us review an inequality and the multiplication principle described below.

Theorem 1 (Inequality of Arithmetric and Geometric Means) For any list of {n} nonnegative numbers {x_1,x_2,\ldots,x_n}, we have\displaystyle x_1x_2\ldots x_n {\leq \left(\frac{x_1+x_2,\ldots,x_n}{n}\right)}^n

and the equality holds if and only if {x_1=x_2=\ldots=x_n}.

Theorem 2 (Multiplication Principle) If a task consists of {k} different operations {o_1,o_2,\ldots,o_k}, and each operation {o_i} can be done by {m_i} ways. Then, there are in total {m_1m_2\ldots m_k} different ways to complete the task.

Back to our problem on hand. Assume that all delegates are separated into {k} disjoint groups, each of which has {x_i} delegates. Then, according to multiplication principle, there are {M=x_1x_2\ldots x_k} different ways to compose a conciliatory committee. According to the inequality of arithmetric and geometric means, {M} reaches its maximal value when {x_1=x_2=\ldots =x_k}.

Unfortunately, for any {x_i} and {x_j}, if {i \neq j}, then {x_i \neq x_j}, and {x_i} should be an integer. However, the inequality of arithmetric and geometric means does give us some intuition: {M} should be as larger as {x_1,x_2,\ldots,x_k} get close to each other. The following lemmas confirms this intuition. Without loss of generality, assume {x_1 <x_2 < \ldots < x_k}. Define the {gap} of two integers {x_1} and {x_2} as the number of integers between them. Denote it by {gap(x_1,x_2)}. For example, {gap(5,8)=2} since there are 6 and 7 between them. If {gap(x_1,x_2)=0}, then we say that {x_1} and {x_2} has {no} gap.

Lemma 3 Let {t=x_1+x_2} when {t}, {x_1} and {x_2} are nonnegative integers and {x_1 \neq x_2}. {x_1x_2} has the maximal value if and only if {gap(x_1,x_2) \leq 1}. Particularly, when {t\geq 5}, the maximum of {x_1x_2} is larger than {t} strictly.

Given a increasing sequence of integers {X_k = x_1,x_2,\ldots,x_k}, define the number of gaps in {X_k} as the number of pairs ({x_i},{x_{i+1}}) which {gap(x_i,x_{i+1})>0}. For example, sequence {1,2,5,6,8} has 2 gaps: one between 2 and 5, and the other between 6 and 8.

Lemma 4 {M} reaches the maximal value if and only if there is at most one gap in {X_k} and that gap is at most 1 if any.

The lemma above tells us that to make {M} as large as possible, the sequence {X_k} should composed by a list of continous integers {x+1,x+2,\ldots,x+k}, or two segments of continous integers {x+1,x+2,\ldots,x+u} and {(x+u+1)+1,(x+u+1)+2,\ldots,(x+u+1)+v} where {u+v = k}. Example: {N=15}, {k=3}. Then {M=4\times 5\times 6=120}. If {k=4}, then {M=2\times 3\time 4\times 6=144}. For {k=5}, {M=1\times 2\times 3\times 4\times 5 = 120}.

One natural question is that, given {k} is fixed, how to look for valid sequence {X_k} which maximize {M}? Well, by lemmas above, we can assume that {X_k= x+1,x+2,\ldots,x+u, (x+u+1)+1,(x+u+1)+2,\ldots,(x+u+1)+v} where {u + v = k} and {v < k}. So

\displaystyle (\sum_{i=1}^u x+i) + (\sum_{j=1}^v (x+u+1)+j) = N

Or

\displaystyle x=\frac{N}{k}-\frac{v}{k}-\frac{k+1}{2}

by assuming {N=mk+t} where {t < k},

\displaystyle x=m-\frac{t-v}{k}-\frac{k+1}{2}

Since {x} is integer and {v<k}, when {k} is odd, {v=t}. When {k} is even, {v=t\pm \frac{k}{2}}, depending on if {t} is larger than half of {k}. By this way, for any {N} and valid {k}, we compute the unique pair of {x} and {v}.

Lemma 5 Given {N} and {k}, the sequence {X_k} maximizing {M} is unique.

At this point, we can solve the problem as follows: For all possible {k}, compute the {x} and {v} by above way. Then compute their corresponding {M} values, between which we pick the largest one. The algorithm is inefficient since we involve computing the pretty number of many factors (imagine {4\times5\times\ldots\times 50} !!). Java and some programming language do provide BigInteger class, but it is time consuming.

The question following is, can we void manipulation on big integers? Yes. To make it, we need some insightful thoughts. From the example given above, it seems that {M} increases and {x_1} decreases when {k} get larger, but not too large. At least, {x_1 = 1} is not expected and should be voided. In a word, we try to get an {x_1\geq 2} as small as possible. The perfect value for {x_1} would be 2, of course. However, not always we can make it. The good news is, {x_1} should be 2 or 3.

Lemma 6 Let {X_k} be the sequence maximinze {M} among all valid sequences. Then {x_1} is 2 or 3.

Proof: We prove it by contradiction. Assume {x_1 > 3}. Consider the case {x_1=4}. By Lemma 4, {x_2=5} or {6}. If {x_2=5}, we construct a new sequence {X^\prime} by replacing {x_2} with {x_1^\prime=2} and {x_2^\prime=3} and keeping others intacted. Note that {X^\prime} is a valid sequence, meaning that all elements in it are different and sum to {N}. Then {\frac{X_k}{X^\prime} = \frac{5}{6}}. That’s {X_k < X^\prime}, which is a contradiction. If {x_2=6}, we construct a new valid {X^\prime} by replacing {x_1} and {x_2} by 2,3 and 5. Since {2\times 3\times 5 = 30 > 4\times 6=24}, it contradicts the optimality of {X_k}. Consider {x_1 > 4}. In this case, we construct the new valid {X^\prime} by replacing {x_1} with two small factors {y_1} and {y_2} such that {gap(y_1,y_2)\leq 1}. According to Lemma 5, {X^\prime > X_k}, yet another contradiction. So {x_1 \leq 3}. Obviously, if {x_1=1}, we can easily construct an valid {X^\prime > X_k}.
Now, we narrow the search scope to those {X_k} starting with 2 or 3. Furthermore,

Lemma 7 Let {X=x+1,x+2,\ldots,x+u_1, (x+u_1+1)+1,\ldots,(x+u_1+1)+v_1} be the sequence maximizing {M} starting from 2 ({x=1}) and {Y=y+1,y+2,\ldots,y+u_1,(y+u_2+1)+1,\ldots,(y+u_2+1)+v_2} starting from 3 ({y=2}), where {u_1,u_2 \geq 1} and {v_1,v_2 \geq 0}. Let {i} be the length of {X} and {j} be the length of {Y}. Then, {i>j} and {v_1 < v_2}.

Proof: Obviously, {v_1 < i}. {\sum_{a\in X} a = N \Rightarrow (2i+3)^2=8N-8v_1+9} and {\sum_{a\in Y} a = N \Rightarrow (2j+5)^2=8N-8v_2+25}. Combining them, we have\displaystyle (j-i+1)(i+j+4) = 2(v_1-v_2)+4

Assume {i\leq j}. Then {2(v_1-v_2)+4 \geq i+j+4 \geq 2i+4}, or {v_1-v_2 \geq i}. However, since {v_1 < i} and {v_2 \geq 0}, it’s impossible. So {i > j} and then {v_1 < v_2}.

The above lemma simply states that if we get two valid sequences, namely, one starting from 2 with length {i} which maximize {M} when {k=i}, and the other starting from 3 with length {j} which maximize {M} when {k=j}, and each of them consists of two segments of continguous integers, then latter segment of the one starting from 2 is short than that of the one starting from 3.

This fact leads to following important lemma.

Lemma 8 Let valid {X=x+1,x+2,\ldots,x+u,(x+u+1)+1,\ldots,(x+u+1)+v} start from 3 ({x=2}). Then {X} maximizes {M} among all valid sequences if and only if {v=0} or {v=1}.

Proof: In either case of {v=0} or {v=1}, no valid sequence starting from 2 can be constructed according Lemma 7. According to Lemma 6, {X} maximizes {M}. Now assume {X} maximize {M} but {v>1}. Then, we construct a new valid {Y} by replacing {(x+u+1)+2} with 2 and {(x+u+1)}. Since {u>0} and {x=2}, {2(x+u+1) > (x+u+1)+2}. So {Y>X}. Contradiction.

Now we come to the crux of the problem. We construct a valid sequence {X} starting from {x+1} where {x=1} by the following way: start from {x+1}, we keep adding {x+2,\ldots,x+i} until adding {x+i+1} will make the sum of elements in {X} larger than {N}. Then, we increases each element by 1 in the order of {x+i,x+i-1,\ldots,x+1} and repeat the process until it sums to {N}. The sequence we finally get is the answer.

{S \gets 0}
{k \gets 0}
array {X} stores the sequence
while {S + (k+2) \leq N}

      {X[k+1] \gets k+2}
      {k\gets k+1}
      {S \gets S + X[k]}
end while
{j \gets k}
{m \gets N-S}

while {m > 0}
      {X[j] \gets X[j]+1}
      {m \gets m-1}
      {j \gets j-1}
      if {j < 1}
            {j = k}
      end if
end while

The next step is to prove the algorithm is correct. Given above lemmas, it’s an easy task.

Proof: Firstly, {m \leq k+1}. Otherwise, the first while loop won’t terminate. So the body of second while loop will be executed at most {k+1} times. If {m=k} or {m=k+1}, then {X} starts from 3 with the second segment of length 0 or 1, respectively. According to Lemma 8, {X} is optimal. If {m = 0}, we are unable to construct a sequence {Y} such that {Y} starts from 3 and will the second segment of length at most 1. If {0<m<k}, {X} has second segment of length at least 1. According to Lemma 7, we also cann’t construct {Y} maximing {M} and starting from 3 with second segment of length at most 1. Combining Lemma 6, the {X} is optimal.

The PDF version of this solution is available here.

Yet Another Note On Skewness And Kurtosis

October 10, 2010

Due to a project I recently work on, I would like to know the relationship between skewness and kurtosis. After some Google search, I am directed to Wilkins’s paper : A Note On Skewness And Kurtosis [PDF] in which he gave an new proof of the following inequality:

\displaystyle kurtosis \geq (skewness)^2+1.

However, he only proved it for random variables with finite values. It’s quite natural to extend his proof to any real-valued random variable. Here is the proof I give only involving  fundamental concept and definition from probability theory and quadratic form which is exactly the remarkable idea from Wilkins’s original proof.

Let X be a real-valued random variable defined on probability space (\Omega,\mathcal{E},P). Let \mu be the mean of X. Then,

\displaystyle \mu\equiv\int_\Omega X\, \text{d}P. \ \ \ \ \ (1)

Also, denote the i^{th} central moment of X by \upsilon_i. Then,

\displaystyle \upsilon_i\equiv\int_\Omega(X-\mu)^i\, \text{d}P. \ \ \ \ \ (2)

And, the standard deviation \sigma of X is defined as

\displaystyle \sigma \equiv \sqrt{\upsilon_2}. \ \ \ \ \ (3)

Define the i^{th} standard moment \lambda_i of X as

\displaystyle \lambda_i\equiv\frac{\upsilon_i}{\sigma^i}. \ \ \ \ \ (4)

Here, \lambda_3  is called skewness and \lambda_4 is called kurtosis. Note that

\upsilon_1=0, \lambda_1=0

and

\displaystyle \lambda_2=1. \ \ \ \ \ (5)

Now, consider the quadratic form

\displaystyle G(a,b,c)\equiv \int_{\Omega}\big(a+(X-\mu)b+(X-\mu)^2c\big)^2\text{d}P.

\displaystyle=\int_{\Omega}\bigg(a^2 + (X-\mu)^2b^2+(X-\mu)^4c^2+2(X-\mu)ab+2(X-\mu)^2ac+2(X-\mu)^3bc\bigg)\text{d}P
\displaystyle = a^2\int_\Omega\,\text{d}P + b^2\int_\Omega(X-\mu)^2\,\text{d}P+c^2\int_\Omega(X-\mu)^4\,\text{d}P
\displaystyle + 2ab\int_\Omega(X-\mu)\,\text{d}P + 2ac\int_\Omega(X-\mu)^2\,\text{d}P + 2bc\int_\Omega(X-\mu)^3\, \text{d}P

\displaystyle = a^2+\upsilon_2b^2+\upsilon_4c^2+2\upsilon_1ab+2\upsilon_2ac+2\upsilon_3bc

Since G(a,b,c)\geq 0 for all a,b,c (G is semi-definite), we shall have its discriminant larger than or equal to 0. That is,

\begin{vmatrix}1&\upsilon_1&\upsilon_2\\ \upsilon_1&\upsilon_2&\upsilon_3\\ \upsilon_2&\upsilon_3&\upsilon_4\end{vmatrix}=\begin{vmatrix}1&0&\sigma^2\\ 0&\sigma^2&\sigma^3\lambda_3\\ \sigma^2&\sigma^3\lambda_3&\sigma^4\lambda_4\end{vmatrix}=\sigma^6\lambda_4 - \sigma^6-\sigma^6\lambda_3^2=\sigma^6(\lambda_4-\lambda_3^2-1) \geq 0.

For any real-valued random variable whose standard deviation is not zero, we have

\displaystyle \lambda_4 \geq \lambda_3^2+1. \ \ \ \ \ (6)

Note that standard deviation \sigma=0 if and only if X is constant almost everywhere.