𝑷𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍 𝒔𝒊𝒆𝒗𝒆
The following text was an early attempt at solving the problem. A more
sophisticated mechanism is already available, but its description
omits much of the broader introduction to the whole subject. It is therefore
worth reviewing the explanations below, remembering that they are by now of
mostly archival value.
The 𝒑𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍 𝒔𝒊𝒆𝒗𝒆 (abbreviated 𝑷𝒓𝒊𝑺) presented here is an algorithm for decomposing into
prime factors numbers written with tens of thousands of digits, and even
larger ones. It seems to me that this is a completely new, previously unknown
method, but there are so many mathematical publications that I could have
missed it...
𝑷𝒓𝒊𝑺 is a COMPLETE algorithm. What does that mean? Namely, when an attempt to factor a
number does not lead to its decomposition into at least two prime factors,
we have a hundred percent certainty that the number itself is prime.
The algorithm has the lowest possible computational
complexity; its only inconvenience is that factoring a really large number
requires a COMPLETE database of numbered, consecutive primes, any of which
can be a large divisor of the pseudoprime. A composite number that is the
product of large primes resists factoring attempts most strongly. For this
reason such numbers are called strong pseudoprimes: they are often resistant
to attacks even by the best SINGLE methods used so far. Therefore one has
to use SEVERAL of them, and this increases the total computational complexity,
because the calculations often duplicate one another...
Composite numbers that are the product of a large prime (sometimes two or
even several such numbers) and a small one are easy to factor, at least
partially: we are often satisfied with merely establishing that such a number
is composite, and to prove it, it is enough to find at least one of its
divisors, and a small one is the easiest to find...
Of course, the fewer the components (large prime numbers of similar size)
that make up such a number, the harder it is to factor, and for a given size
of number (counted in the number of its digits) the hardest to factor is the
square of a large prime, and similarly the product of twin primes, or of
primes at distance four or six (where p₁ = p₀ + 6) - or, generally,
nearby ones...
Factoring a relatively small number can be done with the so-called
naive, or primitive, method (which resembles the ingenious sieve of
Eratosthenes the way an ordinary pyramid resembles a pyramid standing on its
apex): simply dividing (or rather trying to divide) the number being tested by
successive prime numbers. When we succeed, we try to divide the result of the
division once more by the last prime used, because the number being tested
may be divisible by it several times.
Each prime number by which the tested number is divisible (in the
appropriate power, of course) is a component of the product of primes that
constitutes the tested number.
In the next step, we treat the obtained quotient as the new number being
tested, and divide it by successive primes from the table, repeating the
procedure (gradually climbing up the series of primes) until the obtained
result is smaller than the successive (trial) primes used in the procedure.
Trying to divide by them then no longer makes sense, because we are dealing
with integers, not fractions.
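A minimal sketch of this naive method in Python might look like this (the table of primes is assumed to be precomputed; the function name and loop shape are mine, not part of the original description):

    def trial_division(n, primes):
        """Naive factorization by successive trial primes, as described
        above: divide by each prime as many times as it goes, and stop
        once the remaining quotient drops below the next trial prime
        (it is then 1 or itself prime)."""
        factors = []
        for p in primes:
            if p > n:            # nothing above the quotient can divide it
                break
            while n % p == 0:    # the same prime may divide several times
                factors.append(p)
                n //= p
        if n > 1:
            factors.append(n)    # leftover quotient (prime, if the table was long enough)
        return factors

    # e.g. trial_division(360, [2, 3, 5, 7, 11]) gives [2, 2, 2, 3, 3, 5]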
And now, after this digression, I will point out what is different about
the 𝒑𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍
𝒔𝒊𝒆𝒗𝒆 method, which is not merely a method for detecting fairly small
composite numbers, but an extremely efficient way of unmasking
large, even huge ones. It does have something in common with
the naive method described above, but there is an important difference. The
point is that a computer, like a human, performs multiplication noticeably
faster than division: the difference reaches several dozen percent, measured
either by execution time or by the number of processor
cycles. In the 𝑷𝒓𝒊𝑺 method, in the first step, we multiply several consecutive primes
together (because we will use the primorial function, and later its nephew,
the 𝑽𝒂𝒏 function), and then divide (or try to divide) the number being
tested by their product. It is worth considering how much faster this is
than dividing by each of these numbers separately: in the method
proposed here we perform the multiplication of n primes once (the primorial),
and then repeat the division x times, once per number under test. The number
of operations is therefore x + n: n multiplications and x divisions. In the
naive method, even omitting the initial multiplication of primes, we have to
perform n⋅x division operations. And when both x and n are large, this
difference is really big! But that is not the most important thing. Much more
important is a clever and, so far, overlooked (we postulate it here)
examination of the remainders obtained from the division, and determining
which of them are prime and which are not: the second case conclusively
indicates that the number we are testing is composite, because it is
divisible by at least one of the factors of the composite remainder.
Symbolically, we can write it like this:
a = p₀# ⋅ b + r
where r is the remainder of the division. And if r is a composite
number, then it has at least one common divisor, either with the number b or
with the number p₀#, which follows from the adopted
methodology. And if so, then the entire right-hand side of the equation is
divisible by this common divisor. This means that the number a is composite,
and one of its prime factors is the common divisor just found. But the
remainder can also itself be a prime, if it is not greater
than p₀ (and then it, too, divides a!). So we divide the number under study
by it, and the quotient obtained we subject to further
factorization, proceeding in this way until we obtain the complete
factorization of the initial number. As you can see, this is an INTEGER
method.
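As a hedged illustration of a single 𝑷𝒓𝒊𝑺 step (the helper names are mine; the key identity, which lets the remainder unmask a divisor, is that gcd(a, p₀#) = gcd(r, p₀#) because a ≡ r (mod p₀#)):

    from math import gcd

    def primorial(p0, primes):
        """p0# : the product of all primes <= p0 from the table."""
        q = 1
        for p in primes:
            if p > p0:
                break
            q *= p
        return q

    def pris_step(a, p0, primes):
        """One PriS step: r = a mod p0#. Any common divisor of the
        remainder and the primorial is a divisor of a itself."""
        q = primorial(p0, primes)
        r = a % q
        d = gcd(r, q)        # r == 0 gives d == q, also a valid divisor
        return d if d > 1 else None

    # e.g. pris_step(91, 7, [2, 3, 5, 7]) returns 7, since 91 = 7 * 13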
The 𝒑𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍 𝒔𝒊𝒆𝒗𝒆 algorithm is dedicated to the study of huge numbers, consisting of tens
of thousands of digits, and even larger. It is known that (as already
mentioned) the greatest interest is aroused not by numbers divisible by,
among others, relatively small primes, but by numbers that are the product
(especially: of exactly two!) of large primes of similar size. Such
numbers will of course not be detected in the first, preliminary step of
the 𝑷𝒓𝒊𝑺 procedure (the method has a multi-level structure to facilitate
calculations; more on that later), nor even in the subsequent, still not
very high steps. To facilitate calculations and avoid using excessively
large numbers as divisors, it will (maybe) be necessary to use a new
function, presented here, called Vanguard, in short 𝑽𝒂𝒏.
Symbolically, we write it as a two-argument function:
#(p₁ ; p₀) = p₁# / p₀# where p₁ > p₀
and Δ(p₁ ; p₀) ≡ Δ₀, where this notation does NOT mean subtracting the
numbers themselves: we are dealing with huge numbers, so it is worth using
a certain trick. Namely, instead of writing them out in full, with all their
digits, which run into the thousands, we can take advantage of the fact that
the primes are already (and certainly will be!) numbered (and the gaps,
currently numerous, can be filled in quite easily using the method presented
here), so when recalling a specific prime we can simply give its index, i.e.
its ordinal number in the table of primes. Thus Δ₀ denotes the difference of
the ordinal numbers of p₀ and p₁, and further on there are Δ₁ and Δ₂, up to
Δₙ... (as follows from the above, the numbers p₀;
p₁; p₂; p₃... pₙ
are not at all consecutive according to their indices, but are certain
primes singled out by us for practical use).
We will also adopt the convention that the unary notation Δp
denotes the index of the number p, while the binary notation Δ(p₀ ; p₁) denotes the
difference of the indices of these numbers.
#(p₁ ; p₀) = p₁# / p₀#, which for the domain of prime
numbers is analogous to the operation p₁! / p₀!,
i.e. the division of two factorials in the domain of natural numbers.
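A sketch of the 𝑽𝒂𝒏 function under this definition (using sympy's primerange for brevity; computing the product of the primes in the gap directly avoids ever forming the two huge primorials and dividing them):

    from math import prod
    from sympy import primerange

    def van(p1, p0):
        """#(p1 ; p0) = p1# / p0# : the product of all primes p
        with p0 < p <= p1 (computed directly, without division)."""
        assert p1 > p0
        return prod(primerange(p0 + 1, p1 + 1))

    # Example: van(13, 5) == 7 * 11 * 13 == 1001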
What does it mean that the "method has a multi-story
structure"? Namely, if the number under study resisted attempts to factor
it in the first stage, then in the next and subsequent stages we will use
the 𝑽𝒂𝒏 function, choosing its Δ so that it meets certain
conditions, described below. Why is it worth using successive
floors, instead of immediately taking a giant primorial? Well, that would
work out "in the wash", but it will often happen that the number under
study has some rather small divisor, which will then be detected with a
not-too-large primorial, so it is not worth "shooting a sparrow with a
cannon". In addition, it is known that large collections of numbers under
study will often be processed, one by one, in their entire set. Division by
numbers with a very large number of digits is not very fast, so (probably!)
it is better to break the work into stages (floors). First, on the ground
floor, we test a given number using the 𝒑𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍 q₀; then we go up to the first floor and test it using q₁,
which is no longer a 𝒑𝒓𝒊𝒎𝒐𝒓𝒊𝒂𝒍 but a number calculated with the 𝑽𝒂𝒏-guard function, where q₁ = #(p₁ ; p₀).
When that does not work either, we go up to the second floor, where 𝑽𝒂𝒏 gives the value q₂ = #(p₂ ; p₁). And so on.
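Putting the floors together, a hedged sketch of the climb (the list of bounds p₀ < p₁ < p₂ < ... is a free parameter of the method, and van() is as in the sketch above):

    from math import gcd, prod
    from sympy import primerange

    def van(p1, p0):
        # #(p1 ; p0): product of all primes p with p0 < p <= p1
        return prod(primerange(p0 + 1, p1 + 1))

    def pris_floors(a, bounds):
        """Ground floor uses q0 = p0# = van(p0, 1); floor n uses
        qn = van(pn, p(n-1)). Returns a nontrivial divisor of a, or
        None when no prime <= bounds[-1] divides a. (If all prime
        factors of a fall inside one floor, the gcd may be a itself.)"""
        prev = 1
        for p in bounds:
            q = van(p, prev)       # the block of primes for this floor
            d = gcd(a % q, q)      # gcd(a, q) == gcd(a mod q, q)
            if d > 1:
                return d
            prev = p
        return None

    # e.g. pris_floors(7 * 1009, [10, 50]) returns 7 on the ground floor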
The successive values of
the numbers q₀; q₁; q₂; q₃
are similar in size, although they are built from larger and larger
primes (the qₙ themselves being, of course, composite). But we consistently
use fewer and fewer of them, so that no qₙ exceeds the threshold set
by the "capacity" of a quad-integer variable or similar...
As we climb the successive "floors" of this procedure, we will accumulate a
fairly large number of remainders from division (and this will cause
noticeable problems); again, as in the previous stage, those of them that
are composite unambiguously signal that the number being tested is also
composite and is divisible by at least one of the factors making up the
given remainder. On the other hand, remainders that are primes,
contained in the interval from (qₙ₊₁ + 1)
to qₙ₊₂, raise the hope that the number being tested may be prime. Remainders
contained in the interval from qₙ to qₙ₊₁ unambiguously indicate that the
number being tested is composite, even
if they are primes. Finally, the number one and all primes
less than qₙ may signal that the
number being tested may be prime after all. Of course, this does not apply
to remainders that are composite.
STOP! An error: what matters is only that the remainder of the division has
no common divisor with either b or qₙ,
i.e. that it is relatively prime to both of them. So even composite
remainders can still signal that the number being tested is prime after all.
Let us recall the relevant equation:
a = b ⋅ qₙ + r
However, we are now in the realm of relatively small numbers,
so checking this will not be difficult, though it will require an appropriate
procedure, i.e. several dozen lines of code...
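The corrected check itself is short in Python, even if a full procedure around it would indeed grow (the verdict strings are mine; the logic rests on the fact that any common divisor of r with qₙ, or of r with b, also divides a = b⋅qₙ + r):

    from math import gcd

    def remainder_verdict(a, qn):
        """Corrected criterion: a is composite if r shares a divisor
        with qn or with the quotient b; otherwise no verdict here."""
        b, r = divmod(a, qn)
        if r == 0:
            return "composite: divisible by a prime factor of qn"
        d = gcd(r, qn)
        if d > 1:
            return f"composite: divisor {d} shared with qn"
        d = gcd(r, b)
        if d > 1:
            return f"composite: divisor {d} shared with the quotient b"
        return "no verdict at this floor"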
Such errors sometimes happen when you don't fully understand the
algorithm you've invented...
©
𝑁𝑖𝑐𝑜𝑙𝑎𝑠 𝑅𝑎𝐵𝑎𝑟𝑏𝑎𝑟(𝑠)𝑘𝑖
(𝐢𝐧 𝐋𝐯𝐢𝐯-𝐒𝐜𝐨𝐭𝐭𝐢𝐬𝐡
𝐂𝐨𝐨𝐤𝐁𝐨𝐨𝐤 prime floor: 𝐬𝐞𝐥𝐞𝐫,
𝐩𝐚𝐫𝐬𝐥𝐞𝐲 & 𝐩𝐨𝐨𝐫)