The Collatz Amphora

The Collatz Amphora - Mathematical Background

Introduction

The Collatz conjecture, posed by Lothar Collatz in 1937, remains unresolved. Its rules are elementary, yet they produce intricate behavior that mathematicians have studied extensively for decades.

The present visualization examines the conjecture through the lens of a novel algebraic framework, the "Tuple-based Transform".

The amphora metaphor is apt: ancient Greek amphorae were used to store valuable items, and likewise this mathematical container holds the rules that govern the Collatz behavior.

The Collatz Amphora visualization provides an intuitive representation of our fundamental algebraic results: cycle uniqueness, parameter repetition necessity and guaranteed convergence. While previous approaches relied on statistical methods or partial bounds, our framework demonstrates these properties through direct algebraic proof.

The 3D visualization simply makes these abstract structural constraints tangible and comprehensible.

This proof of concept contains a concise synopsis of the ideas outlined in our article, which will incorporate the forthcoming corrections and enhancements in the subsequent version.

The Collatz Conjecture

The Collatz conjecture, also known as the $3n+1$ problem, states that for any positive integer $a$, repeatedly applying the function $f(n)$ starting from $a$ eventually reaches $1$.

$$f(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even} \\ 3n + 1 & \text{if } n \text{ is odd} \end{cases}$$

For instance, starting from $\textbf{n=9}$, $\textbf{n=28}$, $\textbf{n=37}$ and $\textbf{n=86}$, and including one additional iteration of the trivial cycle, the following sequences are obtained:

$$\begin{align*} \{9, \;&28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1\} \\ \{&28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1\} \\ \{37, 112, 56, \;&28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1\} \\ \{86, 43, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56, \;&28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1\} \end{align*}$$
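The four sequences above can be reproduced with a few lines of Python. This is a minimal sketch: the names `collatz_step` and `trajectory` and the `extra_cycle_laps` parameter are illustrative and do not come from the article.

```python
def collatz_step(n: int) -> int:
    """Apply f once: halve an even n, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def trajectory(n: int, extra_cycle_laps: int = 1) -> list[int]:
    """Return the sequence from n down to 1, followed by extra laps of 4, 2, 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    seq = [n]
    while seq[-1] != 1:
        seq.append(collatz_step(seq[-1]))
    for _ in range(extra_cycle_laps):
        seq.extend([4, 2, 1])
    return seq


for start in (9, 28, 37, 86):
    print(start, trajectory(start))
```

Because the function is deterministic, the sequences for 28, 37 and 86 all share the tail that begins at 28, which is why the rows above line up on that value.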

The Tuple-Based Transform

The tuple-based transform is a computational technique that converts Collatz sequences into structured tuples, enabling deeper mathematical analysis. This transformation establishes a complete isomorphism between the original dynamic problem and a static algebraic structure.

$$\phi_q: \mathcal{C} \rightarrow \mathcal{T}_q$$ $$[p, f(p), m, q] \text{ where } c_i = 2qm + p$$

The following properties are fundamental to this representation:

Isomorphism: Establishes a one-to-one correspondence between Collatz sequences and tuple sequences, preserving all structural relationships
Homomorphism: Preserves fundamental operations since the transform of a sequence operation equals the operation of transformed sequences
Reversibility: Any tuple sequence can be perfectly reconstructed back to its original Collatz sequence without information loss

The numbers $\textbf{n=9}$, $\textbf{n=28}$, $\textbf{n=37}$ and $\textbf{n=86}$ are represented using the simplified Tuple-based Transform as:

$$\begin{align*} \{4, \;&13, 6, 3, 10, 5, 16, 8, 25, 12, 6, 19, 9, 4, 2, 7, 3, 1, 0, 0, 1, 0\} \\ \{&13, 6, 3, 10, 5, 16, 8, 25, 12, 6, 19, 9, 4, 2, 7, 3, 1, 0, 0, 1, 0\} \\ \{18, 55, 27, \;&13, 6, 3, 10, 5, 16, 8, 25, 12, 6, 19, 9, 4, 2, 7, 3, 1, 0, 0, 1, 0\} \\ \{ 42, 21, 64, 32, 97, 48, 24, 73, 36, 18, 55, 27, \;&13, 6, 3, 10, 5, 16, 8, 25, 12, 6, 19, 9, 4, 2, 7, 3, 1, 0, 0, 1, 0\} \end{align*}$$
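The listed values are consistent with reading $c_i = 2qm + p$ at $q = 1$, with $p = 1$ for odd $c_i$ and $p = 2$ for even $c_i$, so that $m = (c_i - p)/2$. That reading is inferred from the examples rather than stated explicitly, so the sketch below should be taken as an assumption; the function names are hypothetical.

```python
def to_tuple(c: int) -> tuple[int, int, int, int]:
    """Map a Collatz value c to [p, f(p), m, q] with q = 1 (assumed reading)."""
    p = 1 if c % 2 == 1 else 2      # residue: odd -> 1, even -> 2
    f_p = 4 if p == 1 else 1        # f(1) = 3*1 + 1 = 4, f(2) = 2/2 = 1
    m = (c - p) // 2                # multiplicity, from c = 2*q*m + p
    return (p, f_p, m, 1)


def from_tuple(t: tuple[int, int, int, int]) -> int:
    """Invert the transform: c = 2*q*m + p."""
    p, _, m, q = t
    return 2 * q * m + p


def simplified(sequence: list[int]) -> list[int]:
    """Keep only the multiplicity m of each value."""
    return [to_tuple(c)[2] for c in sequence]


print(simplified([9, 28, 14, 7, 22, 11]))  # [4, 13, 6, 3, 10, 5]
```

Under this reading the round trip `from_tuple(to_tuple(c))` returns `c` for every positive integer, which is the reversibility property listed above.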

This algebraic framework transforms the conjecture from a dynamical systems problem into a static structural analysis, where convergence properties become algebraic invariants that can be systematically studied.

Cycle Uniqueness

The Tuple-based Transform demonstrates the uniqueness of the trivial cycle $\{4, 2, 1\}$, because it is the only case in which consecutive values of the multiplicity $m$ can be identical.

The algebraic proof of the unique multiplicity property is as follows:

$$\begin{align*} \text{Case 1: } [1, 4, m_i, 1] \text{ followed by } [1, 4, m_{i+1}, 1] \\ \quad 6m_i &+ 4 = 2m_{i+1} + 1 \longrightarrow m_i = m_{i+1} = - 3/4 \\ \text{Case 2: } [1, 4, m_i, 1] \text{ followed by } [2, 1, m_{i+1}, 1] \\ \quad 6m_i &+ 4 = 2m_{i+1} + 2 \longrightarrow m_i = m_{i+1} = - 1/2 \\ \text{Case 3: } [2, 1, m_i, 1] \text{ followed by } [1, 4, m_{i+1}, 1] \\ \quad m_i &+ 1 = 2m_{i+1} + 1 \longrightarrow m_i = m_{i+1} = 0 \\ \text{Case 4: } [2, 1, m_i, 1] \text{ followed by } [2, 1, m_{i+1}, 1] \\ \quad m_i &+ 1 = 2m_{i+1} + 2 \longrightarrow m_i = m_{i+1} = - 1 \end{align*}$$

Only Case 3 yields a non-negative integer. Its consecutive values $m_{i} = 0 = m_{i+1}$ map back to Collatz values as follows:

$$\begin{align*} \{[2,1,0,1], [1,4,0,1]\} \longleftrightarrow \{2, 1, 4\} \end{align*}$$
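The four cases can be checked mechanically with exact rational arithmetic. The sketch below assumes $q = 1$, writes the successor of $c = 2m + p$ as a linear expression $am + b$, and solves $am + b = 2m + p'$ under the condition $m_i = m_{i+1} = m$. The names `SUCCESSOR` and `fixed_multiplicity` are illustrative.

```python
from fractions import Fraction

# Successor of c = 2m + p, written as a*m + b:
#   p = 1 (odd):  3(2m + 1) + 1 = 6m + 4
#   p = 2 (even): (2m + 2) / 2  = m + 1
SUCCESSOR = {1: (6, 4), 2: (1, 1)}


def fixed_multiplicity(p_now: int, p_next: int) -> Fraction:
    """Solve a*m + b = 2*m + p_next for m, i.e. the case m_i = m_{i+1} = m."""
    a, b = SUCCESSOR[p_now]
    return Fraction(p_next - b, a - 2)


for p_now in (1, 2):
    for p_next in (1, 2):
        m = fixed_multiplicity(p_now, p_next)
        valid = m >= 0 and m.denominator == 1
        print(f"p={p_now} -> p={p_next}: m = {m} (valid: {valid})")
```

Only the transition from $p = 2$ to $p = 1$ produces a non-negative integer, matching Case 3.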

Universal Convergence

The tuple-based transform establishes a complete proof of the Collatz conjecture through two fundamental results.

First: Any sequence containing repeated values of the multiplicity parameter must be bounded, whether or not the repeated values are consecutive. This follows from the deterministic nature of the Collatz function: once a parameter value $m$ repeats, the sequence enters a predictable trajectory with an upper bound on all subsequent values.
Second: Every Collatz sequence must contain repeated $m$ values due to structural constraints. The parameter $m$ is held in a delicate balance: odd transformations multiply it by approximately $3$, while even transformations divide it by approximately $2$. Writing $\alpha$ for the proportion of odd steps in a sequence:

$$\frac{1}{1 + \log_2(3)} \leq \alpha \leq \frac{1}{2}$$

For $m$ to remain bounded away from zero, the proportion of odd numbers must be at least $0.3869$, but structural constraints limit this proportion to at most $0.5$.
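One heuristic way to read the two bounds, assuming each odd step multiplies the value by roughly $3$ and each even step divides it by $2$ ($N$, the number of steps, is notation introduced here): over $N$ steps with a fraction $\alpha$ of odd steps, the value does not shrink only if

$$\begin{align*} 3^{\alpha N} \cdot 2^{-(1-\alpha) N} \geq 1 &\iff \alpha \log_2(3) \geq 1 - \alpha \\ &\iff \alpha \geq \frac{1}{1 + \log_2(3)} \approx 0.3869 \end{align*}$$

The upper bound $\alpha \leq \frac{1}{2}$ follows because $3n + 1$ is even whenever $n$ is odd, so two odd steps can never be adjacent.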

These competing forces drive $m$ back toward zero again and again. By the Pigeonhole Principle, repetition then becomes inevitable. The resulting chain of implications is:

$$\begin{align*} \text{Structural constraints} &\Rightarrow \text{Repeated } m \text{ values} \\ \text{Repeated } m \text{ values} &\Rightarrow \text{Bounded sequences} \\ \text{Bounded + deterministic function} &\Rightarrow \text{Periodic pseudocycles} \\ \text{Unique cycle } \{4,2,1\} &\Rightarrow \text{Universal convergence} \end{align*}$$

In the Tuple-based Transform sequences below, bold marks the first repetition of a nonconsecutive value $m_r$, which delimits a pseudocycle, and the subsequent repetitions on the trivial cycle.

$$\begin{align*} \{4, &13, \textbf{6}, 3, 10, 5, 16, 8, 25, 12, \textbf{6}, 19, 9, 4, 2, 7, 3, 1, \textbf{0}, \textbf{0}, 1, 0\} \\ \{&13, \textbf{6}, 3, 10, 5, 16, 8, 25, 12, \textbf{6}, 19, 9, 4, 2, 7, 3, 1, \textbf{0}, \textbf{0}, 1, 0\} \\ \{18, 55, 27, &13, \textbf{6}, 3, 10, 5, 16, 8, 25, 12, \textbf{6}, 19, 9, 4, 2, 7, 3, 1, \textbf{0}, \textbf{0}, 1, 0\} \\ \{42, 21, 64, 32, 97, 48, 24, 73, 36, 18, 55, 27, &13, \textbf{6}, 3, 10, 5, 16, 8, 25, 12, \textbf{6}, 19, 9, 4, 2, 7, 3, 1, \textbf{0}, \textbf{0}, 1, 0\} \end{align*}$$
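The bold values above can be located programmatically. The sketch below rests on two assumptions: the simplified transform is $m = \lfloor (n-1)/2 \rfloor$ (inferred from the examples), and "first repeated" means the earliest point at which a value of $m$ is seen a second time. The function name is hypothetical, and the loop terminates only for starting values whose trajectory reaches the trivial cycle.

```python
def first_repeated_m(n: int) -> tuple[int, int, int]:
    """Return (m_r, first_index, second_index) for the sequence starting at n."""
    first_seen: dict[int, int] = {}
    index = 0
    while True:
        m = (n - 1) // 2                           # simplified transform (assumed)
        if m in first_seen:
            return m, first_seen[m], index
        first_seen[m] = index
        n = n // 2 if n % 2 == 0 else 3 * n + 1    # one Collatz step
        index += 1


for start in (9, 28, 37, 86):
    print(start, first_repeated_m(start))
```

Note that for $n = 9$ the value $m = 4$ also occurs twice (at 9 and at 10), but its second occurrence comes later than that of $m = 6$, so the scan reports $m_r = 6$.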

The 42 Structural Classes

A remarkable discovery about the Tuple-based Transform is that, despite the infinite variety of starting values, all Collatz sequences belong to exactly 42 distinct structural classes, which are determined by their first repeated parameter value, $m_r$.

These $m_r$ classes represent the complete, finite set:

$$\begin{align*} \{0, 1, 2, 3, 6, 7, 8, 9, 12, 16, 19, 25, 45, 53, 60, 79, 91, 121, 125, 141, 166, 188, 205, 243, 250, \\ 324, 333, 432, 444, 487, 576, 592, 649, 667, 683, 865, 889, 1153, 1214, 1821, 2428, 3643\} \end{align*}$$

Because the Tuple-based Transform is reversible, these 42 $m_r$ values map back to 42 pairs of Collatz values (note that, unlike the $m$ values, the values within a native Collatz sequence do not repeat).

$$\begin{align*} \{ &[1,1], [3,4], [6,5], [7,8], [14,13], [15,16], [18,17],[19,20],[25,26],[33,34],[39,40],[51,52],[91,92],[108,107], \\ &[121,122],[159,160],[183,184],[243,244],[252,251],[284,283],[333,334],[378,377],[411,412],[487,488], \\ &[501,502],[649,650],[667,668],[865,866],[889,890],[975,976],[1153,1154],[1185,1186],[1299,1300], \\ &[1335,1336],[1368,1367],[1731,1732],[1779,1780],[2307,2308],[2430,2429],[3643,3644],[4857,4858], \\ &[7287,7288]\} \end{align*}$$
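A pair can be recovered from its $m_r$ value under the assumption that the two Collatz values sharing multiplicity $m_r$ are $2m_r + 1$ and $2m_r + 2$, listed in the order in which a trajectory visits them. The sketch below is intended for $m_r \geq 1$; `class_pair` is a hypothetical name, and the loop assumes the trajectories involved reach $1$.

```python
def collatz_step(n: int) -> int:
    """Apply f once: halve an even n, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def class_pair(m_r: int) -> list[int]:
    """Return the two Collatz values with multiplicity m_r, in trajectory order."""
    odd, even = 2 * m_r + 1, 2 * m_r + 2
    for start, target in ((odd, even), (even, odd)):
        n = start
        while n != 1:                  # follow the trajectory of `start`
            n = collatz_step(n)
            if n == target:
                return [start, target]
    raise ValueError(f"neither value of class {m_r} reaches the other")


for m_r in (1, 2, 3, 6, 53, 60):
    print(m_r, class_pair(m_r))
```

Trying both orders is needed because the even member comes first in some classes, such as $[14,13]$, and the odd member in others, such as $[3,4]$.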

The Invariance Property

This fundamental invariance property states that all sequences with the same value of $m_r$ have the same distance between the first and second occurrences of that parameter.

The constant distance, $D(m_r)$, remains unaltered irrespective of the initial value, $n$, thereby delineating deterministic pathways within a dynamic system that appears to be chaotic.

$$\begin{align*} \forall n_1, n_2 \in S(m_r): &\quad D(m_r, n_1) = D(m_r, n_2) = D(m_r) \\ \text{Distance is invariant} &\text{ across all sequences in class } S(m_r) \end{align*}$$

Returning to the preceding example, the sequences starting at $\textbf{n=9}$, $\textbf{n=28}$, $\textbf{n=37}$ and $\textbf{n=86}$ all belong to the class $m_r = 6$ of the Tuple-based Transform, which is characterized by a fixed distance.

$$\begin{align*} \{4, &13, \textbf{6, 3, 10, 5, 16, 8, 25, 12, 6}, 19, 9, 4, 2, 7, 3, 1, 0\} \\ \{&13, \textbf{6, 3, 10, 5, 16, 8, 25, 12, 6}, 19, 9, 4, 2, 7, 3, 1, 0\} \\ \{18, 55, 27, &13, \textbf{6, 3, 10, 5, 16, 8, 25, 12, 6}, 19, 9, 4, 2, 7, 3, 1, 0\} \\ \{42, 21, 64, 32, 97, 48, 24, 73, 36, 18, 55, 27, &13, \textbf{6, 3, 10, 5, 16, 8, 25, 12, 6}, 19, 9, 4, 2, 7, 3, 1, 0\} \end{align*}$$

In the Collatz plane, the $m_r=6$ class is represented by $[14,13]$:

$$\begin{align*} \{9, &28, \textbf{14, 7, 22, 11, 34, 17, 52, 26, 13}, 40, 20, 10, 5, 16, 8, 4, 2, 1\} \\ \{&28, \textbf{14, 7, 22, 11, 34, 17, 52, 26, 13}, 40, 20, 10, 5, 16, 8, 4, 2, 1\} \\ \{37, 112, 56, &28, \textbf{14, 7, 22, 11, 34, 17, 52, 26, 13}, 40, 20, 10, 5, 16, 8, 4, 2, 1\} \\ \{86, 43, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56, &28, \textbf{14, 7, 22, 11, 34, 17, 52, 26, 13}, 40, 20, 10, 5, 16, 8, 4, 2, 1\} \end{align*}$$

A total of $9$ elements ($8$ steps) can be saved, because the distance between repetitions is constant. As will be shown, this makes it possible to avoid further computation.
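The constant distance can be observed directly on the four example values. This is a minimal sketch assuming the simplified transform $m = \lfloor (n-1)/2 \rfloor$ inferred from the examples; `repetition_distance` is an illustrative name, and $D$ is counted in steps between the two occurrences of $m_r$.

```python
def repetition_distance(n: int) -> tuple[int, int]:
    """Return (m_r, D), where D is the step count between the two occurrences."""
    first_seen: dict[int, int] = {}
    step = 0
    while True:
        m = (n - 1) // 2                           # simplified transform (assumed)
        if m in first_seen:
            return m, step - first_seen[m]
        first_seen[m] = step
        n = n // 2 if n % 2 == 0 else 3 * n + 1    # one Collatz step
        step += 1


for start in (9, 28, 37, 86):
    print(start, repetition_distance(start))       # each prints (6, 8)
```

The same check can be run on members of other classes, for example 27 and 831 from the class $m_r = 60$ listed under Values for Testing.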

A Taxonomy for Collatz sequences: Types A, B and C

In addition to the structural classification provided by these 42 $m_r$ classes, sequences manifest a natural geometric taxonomy based on the location of their maximum value in relation to parameter repetitions. This process gives rise to three exhaustive types:

Type A (Pre-Repetition Maximum) where the maximum occurs before the first $m_r$ repetition.
Type B (Inter-Repetition Maximum) where it occurs between the first and second $m_r$ repetition.
Type C (Post-Repetition Maximum) where it occurs after the second $m_r$ repetition.

The distribution of the data reveals a significant degree of asymmetry. Type A comprises an infinite set representing the vast majority of sequences, while Types B and C form finite exceptional sets of exactly $1156$ and $73$ sequences, respectively. The maximum initial values for these sets are $n=29148$ for Type B and $n=9088$ for Type C.

This geometric classification is mathematically distinct from structural classes, resulting in a two-dimensional framework where each sequence is uniquely characterized by the pair $(m_r, \text{Type})$.
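The pair $(m_r, \text{Type})$ can be computed by comparing the position of the trajectory maximum with the two occurrences of $m_r$. The sketch below assumes the simplified transform $m = \lfloor (n-1)/2 \rfloor$, and it makes one further assumption the text does not settle: a maximum sitting exactly on an occurrence is assigned to the earlier type. The names are hypothetical.

```python
def collatz_step(n: int) -> int:
    """Apply f once: halve an even n, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def classify(n: int) -> tuple[int, str]:
    """Return (m_r, type) with type in {'A', 'B', 'C'}."""
    seq = [n]
    first_seen: dict[int, int] = {}
    m_r = first = second = None
    while second is None:                      # walk until some m repeats
        idx = len(seq) - 1
        m = (seq[-1] - 1) // 2
        if m in first_seen:
            m_r, first, second = m, first_seen[m], idx
        else:
            first_seen[m] = idx
            seq.append(collatz_step(seq[-1]))
    while seq[-1] != 1:                        # finish the trajectory
        seq.append(collatz_step(seq[-1]))
    peak = seq.index(max(seq))                 # position of the maximum
    if peak <= first:                          # tie-breaking is an assumption
        return m_r, "A"
    if peak <= second:
        return m_r, "B"
    return m_r, "C"


print(classify(27), classify(831))
```

For $n = 27$ the maximum $9232$ lies between the occurrences at 121 and 122, while for $n = 831$ the maximum $18952$ is reached before 121 appears.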

Natural Limit of $m_r$ Values

The presence of precisely 42 values of $m_r$ is indicative of a fundamental self-limiting mechanism inherent in the dynamics of the Collatz sequence. This finite bound is not arbitrary; rather, it arises from the mathematical impossibility of generating new repeated parameter values beyond a certain threshold. As the initial value, denoted by $n$, increases, two competing dynamical processes establish a natural boundary:

Maximum Value Migration: Larger maximum values $M(n)$ tend to appear progressively earlier in the sequence trajectory, moving toward the initial positions
Parameter Bound Stability: The possible $m_r$ values remain algebraically constrained and cannot grow indefinitely due to the inherent structure of the tuple-based transform

Therefore, for sufficiently large $n$, the system reaches a critical threshold at which the maximum value $M(n)$ occurs so early in the trajectory that it prevents the formation of new parameter repetitions. This produces an asymptotic separation that fundamentally changes the taxonomic distribution: Type B sequences transition to Type A, because the maximum moves ahead of the first occurrence of $m_r$.

Concurrently, Type C sequences also adopt a Type A configuration, a transition that occurs once the maximum number of repetitions has been reached. Consequently, while the sets of Type B and Type C sequences remain finite ($1156$ and $73$ sequences, respectively), Type A grows to encompass an infinite set. All sufficiently large starting values inevitably belong to Type A.

$$\begin{align*} \lim_{n \to \infty} \text{pos}(M(n)) &= 1 \text{ (maximum moves to start)} \\ \lim_{n \to \infty} P(\text{Type A}|n) &= 1 \text{ (all large } n \text{ become Type A)} \\ |\text{Type B}| &= 1156 \text{ (finite set)} \\ |\text{Type C}| &= 73 \text{ (finite set)} \end{align*}$$

Self-Return Transformations: $m_r \mapsto m_r$

Each of the 42 structural classes is defined by a unique transformation sequence that maps the parameter value back to itself: $\omega(m_r) = m_r$.

These sequences consist of $T_1$ operations ($m \mapsto 3m + 1$) representing odd Collatz steps and $T_2$ operations ($m \mapsto \lfloor m/2 \rfloor$) representing even steps. The order matters critically, since sequences with identical counts of $T_1$ and $T_2$ operations but different arrangements produce entirely different self-return values.

$$\begin{align*} \omega(m_r) = (T_{p_{k}} \circ T_{p_{k-1}} \circ \cdots \circ T_{p_1})(m_r) = m_r \end{align*}$$

Continuing with the numbers $\textbf{n=9}$, $\textbf{n=28}$, $\textbf{n=37}$ and $\textbf{n=86}$, and using the $p$ subsequence $[2], 1, 2, 1, 2, 1, 2, 2, [1]$:

$$\begin{align*} [6] = \\ &\textbf{T2}(6) = \lfloor\frac{6}{2}\rfloor = 3\;(p=2),\;\textbf{T1}(3) = 3\times3+1 = 10\;(p=1),\;\textbf{T2}(10) = \lfloor\frac{10}{2}\rfloor = 5\;(p=2), \\ &\textbf{T1}(5) = 3\times5+1 = 16\;(p=1),\;\textbf{T2}(16) = \lfloor\frac{16}{2}\rfloor = 8\;(p=2),\;\textbf{T1}(8) = 3\times8+1 = 25\;(p=1), \\ &\textbf{T2}(25) = \lfloor\frac{25}{2}\rfloor = 12\;(p=2),\;\textbf{T2}(12) = \lfloor\frac{12}{2}\rfloor = 6\;(p=2) \\ = [6] \end{align*}$$

This scenario must be inherently symmetrical in the context of Collatz sequences. Consequently, we map the values back to that domain, where $T_1$ is $n \mapsto 3n + 1$ and $T_2$ is $n \mapsto n/2$:

$$\begin{align*} [14] = \\ &\textbf{T2}(14) = \frac{14}{2} = 7,\;\textbf{T1}(7) = 3\times7+1 = 22,\;\textbf{T2}(22) = \frac{22}{2} = 11,\;\textbf{T1}(11) = 3\times11+1 = 34,\; \\ &\textbf{T2}(34) = \frac{34}{2} = 17,\;\textbf{T1}(17) = 3\times17+1 = 52,\;\textbf{T2}(52) = \frac{52}{2} = 26,\;\textbf{T2}(26) = \frac{26}{2} = 13\; \\ = [13] \end{align*}$$
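Both computations above can be replayed from the same list of operation indices. In this sketch `replay` and `OPS` are illustrative names; floor division serves as $T_2$ in both domains, since it is exact whenever the Collatz value is even.

```python
def replay(start: int, ops: list[int]) -> list[int]:
    """Apply T1 (x -> 3x + 1) for p = 1 and T2 (x -> floor(x / 2)) for p = 2."""
    path = [start]
    for p in ops:
        x = path[-1]
        path.append(3 * x + 1 if p == 1 else x // 2)
    return path


OPS = [2, 1, 2, 1, 2, 1, 2, 2]    # the eight operations used in the example

print(replay(6, OPS))             # tuple domain:   6 returns to 6
print(replay(14, OPS))            # Collatz domain: 14 arrives at 13
```

Reordering the same operations, for example `[1, 2, 1, 2, 1, 2, 2, 2]`, no longer returns 6 to itself, which illustrates why the order of $T_1$ and $T_2$ matters.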

Wormhole Sequences: Computational Shortcuts

The notion of wormholes emerges naturally from the structural properties of the 42 classes of pseudocycles and their associated 42-dimensional manifolds. It has been established that each pseudocycle delineates a computational shortcut.

A wormhole sequence begins when a sequence arrives at an entry point, from which a precomputed trajectory leads directly to the trivial cycle $\{4, 2, 1\}$. Similarly, in the Tuple-based Transform, a wormhole leads from the first occurrence of any repeated parameter $m_r$ directly to $m = 0$, tracing the algebraic pathway to convergence without intermediate computations.

Because these wormhole paths are precalculated and stored in both domains, the remaining steps from an entry point to convergence can be looked up instantly. This is "wormhole travel" through the Collatz space: sequences enter known tunnels that guide them to final convergence, and the rest of the journey is predetermined algebraically rather than computed iteratively.
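The lookup idea can be sketched as a table of remaining step counts keyed by entry point. The two entry points used here (14 and 121, the first members of the pairs for $m_r = 6$ and $m_r = 60$) are chosen purely for illustration, and all names are hypothetical; the article's actual table is not reproduced.

```python
def collatz_step(n: int) -> int:
    """Apply f once: halve an even n, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def steps_to_one(n: int) -> int:
    """Count steps to 1 by plain iteration (used to build the table)."""
    steps = 0
    while n != 1:
        n = collatz_step(n)
        steps += 1
    return steps


ENTRY_POINTS = (14, 121)   # illustrative subset, not the full set of 42
WORMHOLES = {entry: steps_to_one(entry) for entry in ENTRY_POINTS}


def stopping_time(n: int) -> int:
    """Iterate until an entry point is hit, then add the stored remainder."""
    steps = 0
    while n != 1:
        if n in WORMHOLES:
            return steps + WORMHOLES[n]
        n = collatz_step(n)
        steps += 1
    return steps


print(stopping_time(9))    # 2 iterated steps to reach 14, then a table lookup
```

The shortcut changes only how the count is obtained, not its value, so the result can always be checked against plain iteration.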

Stopping Time Prediction and Computational Efficiency

The framework introduces a fundamental computational dichotomy into stopping time analysis. Approximately 65% of initial values $n$ are predictable cases, whose pathways include a pseudocycle. The remaining 35% are unpredictable cases, in which parameter repetition occurs only on the trivial cycle.

The computational savings vary widely. Sequences that require complete trajectory computation yield negligible savings, since the entire trajectory must be computed. Conversely, sequences whose starting value is itself an entry point achieve savings of 100%, because the prediction is immediate. In general, the reduction in steps depends on how early the wormhole entry point appears in the trajectory.

$$\begin{align*} \text{Unpredictable (35\%):} &\quad m_r = 0 \Rightarrow O(\tau(n)) \text{ complexity} \\ \text{Predictable (65\%):} &\quad m_r > 0 \Rightarrow O(\log n) \text{ complexity} \\ \text{Savings range:} &\quad \text{From } 0\% \text{ to } 100\% \end{align*}$$
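The split between the two cases can be measured empirically for any range of starting values. The sketch below assumes the simplified transform $m = \lfloor (n-1)/2 \rfloor$ and counts a start as predictable when its first repeated parameter is nonzero. The names are hypothetical, and no claim is made here about which range reproduces the 65% figure.

```python
def first_repeated_m(n: int) -> int:
    """Return m_r, the first value of m seen twice in the sequence from n."""
    seen: set[int] = set()
    while True:
        m = (n - 1) // 2                           # simplified transform (assumed)
        if m in seen:
            return m
        seen.add(m)
        n = n // 2 if n % 2 == 0 else 3 * n + 1    # one Collatz step


def predictable_fraction(limit: int) -> float:
    """Fraction of starting values 1..limit with m_r > 0."""
    hits = sum(1 for n in range(1, limit + 1) if first_repeated_m(n) > 0)
    return hits / limit


print(predictable_fraction(10))   # 0.4: only 3, 6, 7 and 9 have m_r > 0
```

For very small ranges the fraction is dominated by values close to the trivial cycle, so a larger `limit` is needed before comparing against the percentages quoted above.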

Comparing The Collatz Amphora with Existing Models

Previous approaches to the Collatz problem fall into distinct paradigms, each of which captures only a partial aspect of the problem. Tao's probabilistic model shows, through a sophisticated statistical analysis, that almost all orbits attain almost bounded values; however, it cannot identify specific exceptional cases. The Krasikov-Lagarias difference inequalities provide analytical limits that constrain trajectory behavior and establish growth bounds and density estimates.

Michel's geometric approach reveals patterns in residue classes modulo powers of 2, providing insight into the local trajectory structure. The Simons-de Weger cycle analysis eliminates potential non-trivial cycles through length and value constraints, reducing the space of possible counterexamples.

The Amphora offers a comprehensive structural visualization of the partial information captured by each approach, thereby unifying these perspectives. Where Tao shows statistical convergence, the Amphora unveils the finite wormhole network that ensures it. Where Krasikov and Lagarias demonstrate the existence of bounds, the Amphora elucidates the topological constraints that give rise to them. Where Michel identifies local patterns, the Amphora highlights the global structure that underlies them.

Rather than competing with these approaches, the Amphora complements them by transforming abstract mathematical constraints into intuitive structural understanding, making the fundamental convergence mechanism visible and comprehensible.

Values for Testing

In consideration of the significant interest that $n=27$ has elicited, we append a few values from its class $m_r=60$ for the purpose of evaluation.

$$\begin{align*} \text{Type B} = \{&27,31,41,47,54,55,62,71,73,82,83,94,97,107,109,110,121,124,125,129,142,145,146,147,161, \\ &164,165,166,171,188,193,194,195,214,218,220,221,231,242,248,250,257,258,259,285,290,291, \\ &292,293,294,322,327,328,330,332,342,343,345,347,376,386,387,388,389,390,391,415,428,429, \\ &436,437,440,442,457,459,462,463,484,491,496,500\} \\ \text{Type A} = \{&831,1071,1087,1247,1263,1351,1449,1455,1463,1519,1527,1551,1607,1623,1631,1647,1662,1695, \\ &1711,1719,1801,1839,1851,1871,1895,1935,2023,2025,2027,2043,2067,2071,2083,2142,2143, \\ &2163,2167,2174,2183,2195,2279,2281,2291,2299,2311,2327,2401,2411,2435,2439,2447,2451, \\ &2455,2467,2471,2494,2526,2527,2539,2543,2567,2575\} \end{align*}$$

Should further examples be required for processing, they can be found here. The data set includes all values of $n$ less than $2^{20}$, classified into their $m_r$ classes and the A, B and C taxonomy.