integrating ton into a multi-vm swap model
we're integrating ton into mobula's data layer. unifying it with EVM, SVM, TRON, SUI and our perps schemas means everything has to collapse into the same swap / transfer / token primitives, across vms that share almost nothing at the protocol level. ton is the chain that bent our assumptions the most. a few things i learned over the last two days.
first wall: a swap is not a transaction.
on EVM, a uniswap swap is one transaction. one block, one log line, one Swap(amount0, amount1, ...) event you grep. atomic, all-or-nothing. on solana an aggregator swap is one transaction with a list of inner instructions, parsable from a known IDL. atomic too.
on ton, a single user-facing swap is a DAG of transactions distributed across multiple shards and almost always multiple block heights:
user wallet tx (external-in)
→ user jetton wallet tx (transfer, op 0x0f8a7ea5)
→ router jetton wallet tx (internal_transfer 0x178d4519 + transfer_notification 0x7362d09c)
→ router tx (parse forward_payload, swap → pool)
→ pool tx (compute output, pay_to → router)
→ router tx
→ router output jetton wallet tx (internal_transfer)
→ user output jetton wallet tx
→ optional final transfer_notification
8+ transactions, up to 7 distinct contracts (user wallet, four jetton wallets, router, pool), sometimes 2–3 shards. classifying "did this swap succeed" requires reconstructing the full trace tree first. you can't process tx-by-tx and emit swap events, because you'll attribute amounts to the wrong leg, miss bounces, and double-count failed executions.
our cross-vm swap pipeline assumes "one swap = one normalized event with input / output amounts and a timestamp." on EVM that maps 1:1 to a log line. on TON we now build trace trees first (joining each transaction's in_msg hash to the matching out_msg hash of its parent, across shards), classify the trace as a known dex pattern, then emit the unified record. completely different ingestion path from anything else we support.
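a minimal sketch of that join, assuming transactions are already persisted as plain dicts with "hash", "in_msg" and "out_msgs" keys (hypothetical field names, not our actual schema). it only shows the hash-linking idea, not the dex classification:

```python
from collections import deque


def build_traces(txs):
    """Group transactions into trace trees by joining out_msg -> in_msg hashes.

    txs: iterable of dicts with "hash", "in_msg" (a message hash) and
         "out_msgs" (list of message hashes). Field names are illustrative.
    Returns {root_tx_hash: [tx hashes in breadth-first order]}.
    """
    txs = list(txs)
    # every message is consumed by exactly one transaction
    consumer = {tx["in_msg"]: tx for tx in txs}
    produced = {msg for tx in txs for msg in tx["out_msgs"]}
    # a root is a tx whose inbound message was not produced by any tx we hold
    roots = [tx for tx in txs if tx["in_msg"] not in produced]

    traces = {}
    for root in roots:
        order, queue = [], deque([root])
        while queue:
            tx = queue.popleft()
            order.append(tx["hash"])
            for msg in tx["out_msgs"]:
                child = consumer.get(msg)
                if child is not None:  # the child may not be ingested yet
                    queue.append(child)
        traces[root["hash"]] = order
    return traces
```

a trace whose last transactions still have unconsumed out_msgs is simply incomplete at this height; the caller decides whether to wait or classify.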
second wall: there are (almost) no events.
EVM has LOG opcodes. solana programs emit logs subscribers can stream. ton has nothing equivalent in the general case. interfaces work by convention: the first 32 bits of every internal message body are an unsigned op-code, the rest of the cell is the typed payload. no ABI registry, no event topics — just hardcoded constants:
transfer 0x0f8a7ea5 (jetton owner → jetton wallet)
internal_transfer 0x178d4519 (jetton wallet → jetton wallet)
transfer_notification 0x7362d09c (jetton wallet → recipient)
excesses 0xd53276db
ston.fi v1 swap 0x25938561
ston.fi v2 swap 0x6664de2a
ston.fi v2 pay_to 0x657b54f5 (success signal)
dedust swap (native) 0xea06185d
dedust swap (jetton, in forward_payload) 0xe3a0d482
dedust swap event 0x9c610de3 (rare: actual ext_out emission)
detecting a ston.fi v2 swap is: index every message by (dest_address, op_code), find a transfer_notification to the v2 router whose forward_payload starts with 0x6664de2a, walk the resulting trace, and look for a downstream pay_to to confirm success. there is no "Swap" event being emitted anywhere in this flow — you reverse-engineer it from message shapes.
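a rough sketch of that detection path. it assumes an upstream parser has already split each message into a dict with "dest", "body" (raw bytes starting at the op-code) and "forward_payload" (raw bytes or None); those field names and the router address in the usage are hypothetical. only the op-code constants come from the list above:

```python
from collections import defaultdict

OP_TRANSFER_NOTIFICATION = 0x7362D09C
OP_STONFI_V2_SWAP = 0x6664DE2A
OP_STONFI_V2_PAY_TO = 0x657B54F5


def read_op(payload):
    """Return the 32-bit big-endian op-code at the start of a payload, or None."""
    if payload is None or len(payload) < 4:
        return None
    return int.from_bytes(payload[:4], "big")


def index_messages(messages):
    """Index messages by (dest_address, op_code)."""
    index = defaultdict(list)
    for msg in messages:
        index[(msg["dest"], read_op(msg["body"]))].append(msg)
    return index


def stonfi_v2_swap_starts(index, router):
    """transfer_notifications to the router whose forward_payload is a v2 swap."""
    candidates = index.get((router, OP_TRANSFER_NOTIFICATION), [])
    return [
        msg for msg in candidates
        if read_op(msg.get("forward_payload")) == OP_STONFI_V2_SWAP
    ]


def is_confirmed(trace_messages):
    """True if any message in the trace carries the pay_to op-code."""
    return any(read_op(m["body"]) == OP_STONFI_V2_PAY_TO for m in trace_messages)
```

usage is two steps: `stonfi_v2_swap_starts(index_messages(msgs), router)` finds the entry points, then `is_confirmed(...)` runs over the messages of each reconstructed trace.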
dedust is the outlier: pools emit ext_out_msg_info messages with op 0x9c610de3 carrying asset_in / asset_out / amount_in / amount_out / sender. that's basically an EVM event in spirit, shipped by one major dex. the rest of the ecosystem doesn't follow the pattern, but for dedust specifically the indexer path collapses to "match every external-out from a known pool address."
third wall: account ≡ contract, and tokens are not what you think.
on EVM there's a single ERC20 contract holding mapping(address => uint256) balances. on ton, jettons (TEP-74, the ton ERC20-equivalent) are split: a "jetton master" contract holds metadata and total supply, and every (master, owner) pair has its own dedicated "jetton wallet" smart contract holding that one user's balance. one wallet contract per user per token.
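addresses are derivable, not registered. an address on ton is literally (workchain, hash(StateInit)), where StateInit contains code + initial data and the hash is the sha256-based representation hash of the StateInit cell. for jetton wallets built from the reference TEP-74 implementation, code is the master's wallet code and initial data is (0, owner_address, master_address, wallet_code). to find any user's USDt balance, you compute their jetton wallet address client-side from these inputs (or, when the jetton does not use the reference data layout, ask the master's get_wallet_address(owner) get-method), then call the get_wallet_data() get-method on that derived address.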
implications for indexing:
• no central transfer log on the master to subscribe to. you watch every internal_transfer between jetton wallets and reconstruct flows yourself.
• a user holding 100 different jettons has 100 different wallet contract addresses on chain. listing a portfolio means computing 100 deterministic addresses, then querying 100 get-methods.
• get-method calls are free (read-only via liteserver) but per-call rpc. caching and batching matter at scale.
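one way to keep the per-call cost under control is to memoize the derived wallet addresses and group balance reads into batches. a small sketch, where `derive_wallet_address` and `fetch_balances` are hypothetical callables you would back with your own address derivation and rpc client, and the batch size is an arbitrary placeholder:

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


class PortfolioReader:
    def __init__(self, derive_wallet_address, fetch_balances, batch_size=25):
        self._derive = derive_wallet_address  # (master, owner) -> wallet address
        self._fetch = fetch_balances          # [wallet address] -> {address: balance}
        self._batch_size = batch_size
        self._addr_cache = {}

    def wallet_address(self, master, owner):
        """Derived addresses are deterministic, so they can be cached forever."""
        key = (master, owner)
        if key not in self._addr_cache:
            self._addr_cache[key] = self._derive(master, owner)
        return self._addr_cache[key]

    def balances(self, owner, masters):
        """Return {master: balance} using one fetch call per batch of wallets."""
        wallets = {self.wallet_address(m, owner): m for m in masters}
        result = {}
        for batch in chunked(list(wallets), self._batch_size):
            for address, balance in self._fetch(batch).items():
                result[wallets[address]] = balance
        return result
```

balances themselves are not cached here on purpose: addresses never change, balances do.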
fourth wall: sharding is real, and it reshapes the ingestion topology.
ton uses what they call the infinite sharding paradigm: conceptually every account lives in its own shard. in practice the basechain (workchain 0) splits and merges shards based on load. the shard count changes over time (1, 2, 4, 8, or an uneven number when only some shards have split), and each shard is identified by a binary prefix.
the masterchain (workchain -1, single shard) is the source of truth: each masterchain block references the head of every shard chain at that height. ingestion loop:
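loop:
    mc_block = liteserver.lookup(workchain=-1, seqno=last+1)
    # shards = the current head of every shard chain, as referenced by mc_block
    shards = liteserver.getAllShardsInfo(mc_block)
    for shard_block in shards:
        # a shard can produce several blocks between two masterchain blocks;
        # walk back through prev refs to pick up the intermediate ones
        for tx in listBlockTransactions(shard_block):
            persist(tx, tx.in_msg, tx.out_msgs)
    last = mc_block.seqno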
shard splits/merges break naive "follow shard X" loops. you have to rediscover the shard set from each masterchain block, which is the opposite of solana (single global stream) or EVM (one block = one ordered list).
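a sketch of rediscovering the shard set per masterchain block, assuming block ids are (workchain, shard_prefix, seqno) tuples and `get_prev_blocks` is a hypothetical callable returning a block's parent ids (one normally, two after a merge; after a split both children point at the same parent). it walks back from each shard head until it hits blocks that were already ingested, so intermediate shard blocks are not skipped:

```python
def collect_shard_blocks(heads, get_prev_blocks, seen):
    """Return shard blocks not ingested yet, ordered by seqno.

    heads:           shard block ids referenced by the current masterchain block
    get_prev_blocks: hypothetical callable, block id -> list of parent block ids
    seen:            set of already ingested block ids (updated in place)
    """
    new_blocks = []
    stack = list(heads)
    while stack:
        block = stack.pop()
        if block in seen:
            continue  # reached the part of the chain we already have
        seen.add(block)
        new_blocks.append(block)
        stack.extend(get_prev_blocks(block))
    # parents always carry a lower seqno than their children
    return sorted(new_blocks, key=lambda block: block[2])
```

`seen` has to be seeded with the shard heads of the starting masterchain block, otherwise the first call walks all the way back to genesis.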
inter-shard messaging is also specific. a message produced in shard X is guaranteed to be delivered to shard Y (hypercube routing), but never in the same block: it lands in a later block, the next one at the earliest. practically: a multi-hop swap traverses several shards and surfaces across several block heights. our trace reconstructor has to be height-tolerant: a swap initiated at masterchain block 12000 may not have its pay_to confirmation until block 12005.
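height tolerance can be sketched as a buffer of open traces that is swept once per masterchain block, after all shard blocks under it are ingested. this is an illustration, not our reconstructor: the trace ids, the 16-block lag budget, and the assumption that `out_msgs` only lists internal messages (ext-out messages are never consumed by another tx) are all placeholders:

```python
class PendingTraces:
    def __init__(self, max_lag_blocks=16):
        self.max_lag_blocks = max_lag_blocks
        self._open = {}  # trace_id -> first seqno, produced and consumed msg hashes

    def on_tx(self, trace_id, mc_seqno, in_msg, out_msgs):
        """Record one transaction of a trace at a given masterchain height."""
        trace = self._open.setdefault(
            trace_id,
            {"first_seqno": mc_seqno, "produced": set(), "consumed": set()},
        )
        trace["first_seqno"] = min(trace["first_seqno"], mc_seqno)
        trace["consumed"].add(in_msg)
        trace["produced"].update(out_msgs)

    def sweep(self, mc_seqno):
        """Return (complete, stale) trace ids and drop them from the buffer.

        complete: every produced message has been consumed by a later tx
        stale:    still waiting after more than max_lag_blocks heights
        """
        complete, stale = [], []
        for trace_id, trace in list(self._open.items()):
            if trace["produced"] <= trace["consumed"]:
                complete.append(trace_id)
            elif mc_seqno - trace["first_seqno"] > self.max_lag_blocks:
                stale.append(trace_id)
            else:
                continue  # keep waiting for downstream transactions
            del self._open[trace_id]
        return complete, stale
```

stale traces are worth surfacing separately rather than dropping: they are either a missed shard block or a genuinely stuck message.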
fifth wall: failed transactions look successful.
every internal message has a bounce flag. if the destination contract errors during compute phase and bounce=1, the runtime sends a bounce message back with bounced=1 and a body made of 0xffffffff followed by the start of the original body, so the original op-code sits right after the prefix. the bounce looks like a normal message in the log: you have to filter bounced=1 everywhere or double-count failed swaps.
the gotcha: a bounced jetton transfer does NOT unwind the original debit. the failed internal_transfer comes back the other way as a bounced message. the sender's jetton wallet was already debited and is only re-credited when it processes that bounce. between the two, state is partially mutated and visible on chain. cross-contract atomicity doesn't exist: there is a real intermediate window where money has moved one direction but not the other.
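a small sketch of bounce-aware accounting, assuming each message is a dict with a "bounced" flag, a raw "body" (bytes) and an "amount" already decoded by an upstream parser (hypothetical field names). the only facts taken from the text are the 0xffffffff prefix and the internal_transfer op-code:

```python
OP_INTERNAL_TRANSFER = 0x178D4519
BOUNCE_PREFIX = 0xFFFFFFFF


def bounced_op(body):
    """Original op-code of a bounced body, or None without the 0xffffffff prefix."""
    if len(body) >= 8 and int.from_bytes(body[:4], "big") == BOUNCE_PREFIX:
        return int.from_bytes(body[4:8], "big")
    return None


def net_transferred(messages):
    """Sum internal_transfer amounts, subtracting the ones that bounced back."""
    total = 0
    for msg in messages:
        body = msg["body"]
        if msg["bounced"]:
            # the bounce is the return leg of a failed transfer, not a new transfer
            if bounced_op(body) == OP_INTERNAL_TRANSFER:
                total -= msg["amount"]
        elif len(body) >= 4 and int.from_bytes(body[:4], "big") == OP_INTERNAL_TRANSFER:
            total += msg["amount"]
    return total
```

if only the outbound leg has been ingested so far, `net_transferred` reports the full amount; that is the intermediate window showing up in the data.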
sixth wall: prices come from get-methods, not state diffs.
on EVM you derive a pool's reserves from event logs (Sync on uniswap v2) or from storage slots. on ton, a pool emits no useful event for its state, and reading "storage slots" doesn't translate cleanly because contract storage is a cell tree, not a key/value array.
instead, you call the contract's get_pool_data() get-method, parameterized at a specific BlockIdExt = (workchain, shard, seqno, root_hash, file_hash). archival liteservers support querying historical state (a non-archival node only keeps recent state). so price-at-block becomes "compute the right BlockIdExt for the trace's pool tx, then call get_pool_data() against it." our pricing layer needed to learn an entirely new pattern for one chain.
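a sketch of the price-at-block step under several assumptions: the pool is constant-product, the get-method result has already been decoded into two integer reserves, fees are ignored, and `run_get_method(address, method, block_id)` is a hypothetical wrapper around whatever liteserver client you use. the "account" and "block_id" fields on the pool tx are illustrative names:

```python
from fractions import Fraction


def spot_price(reserve0, reserve1, decimals0, decimals1):
    """Price of token0 quoted in token1, from raw integer reserves."""
    if reserve0 <= 0:
        raise ValueError("reserve0 must be positive")
    return Fraction(reserve1, 10 ** decimals1) / Fraction(reserve0, 10 ** decimals0)


def price_at_pool_tx(pool_tx, run_get_method, decimals0, decimals1):
    """Read reserves at the block that contains the pool tx, then price them."""
    reserve0, reserve1 = run_get_method(
        pool_tx["account"], "get_pool_data", pool_tx["block_id"]
    )
    return spot_price(reserve0, reserve1, decimals0, decimals1)
```

exact fractions avoid float drift when token decimals differ (9 vs 6 is common); convert to float only at the edge.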
how it lands in the unified model.
mobula stores a normalized swap record across every chain: input/output token, input/output amount, sender, dex, pool, timestamp, fees. on EVM, one log produces one record. on TON, one trace tree produces one record, and the mapping work is non-trivial:
• input/output token: derived from jetton wallet → jetton master backwards lookup, then resolved to canonical metadata via get_jetton_data() cached by master address.
• input/output amount: parsed from transfer and pay_to payloads, with bounce filtering.
• timestamp: the masterchain block's now field for the trace's earliest tx.
• fees: sum of total_fees across all txs in the trace, not just the user-facing one. this is the hidden cost of message-passing — every hop pays gas.
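the mapping above, sketched as one function. everything about the input shape is an assumption for illustration: `trace_txs` are dicts with "lt", "mc_now" and "total_fees", `legs` is the already-parsed transfer / pay_to data, and `wallet_to_master` is the cached jetton wallet → jetton master lookup:

```python
def to_swap_record(trace_txs, legs, wallet_to_master, dex, pool):
    """Collapse one classified trace into a single normalized swap record."""
    # the earliest tx by logical time anchors the timestamp
    earliest = min(trace_txs, key=lambda tx: tx["lt"])
    return {
        "dex": dex,
        "pool": pool,
        "sender": legs["sender"],
        "token_in": wallet_to_master[legs["wallet_in"]],
        "token_out": wallet_to_master[legs["wallet_out"]],
        "amount_in": legs["amount_in"],
        "amount_out": legs["amount_out"],
        "timestamp": earliest["mc_now"],
        # every hop pays gas, so fees are summed over the whole trace
        "fees": sum(tx["total_fees"] for tx in trace_txs),
    }
```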
the integration also forced our perps store to make a sharper distinction between "atomic vm" and "trace vm" venues. settlement semantics differ. on EVM/SVM, a swap fill is a single state transition. on ton, the equivalent of a "fill" is a confirmed pay_to, and there is a multi-second window where the swap is initiated, partially routed, and not yet confirmed. our ws stream had to grow a swap.status: initiated | confirmed | bounced shape that no other chain needs.
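a toy version of that status shape, emitting an update only when a trace's status changes. it assumes messages arrive as dicts with a decoded "op" and a "bounced" flag (hypothetical field names), treats pay_to as the confirmation signal as described above, and lets a bounce win over a pay_to, which is a simplification of my own here:

```python
OP_STONFI_V2_PAY_TO = 0x657B54F5


def swap_status(trace_messages):
    """Derive initiated | confirmed | bounced from the messages seen so far."""
    if any(msg["bounced"] for msg in trace_messages):
        return "bounced"
    if any(msg["op"] == OP_STONFI_V2_PAY_TO for msg in trace_messages):
        return "confirmed"
    return "initiated"


class StatusStream:
    """Remembers the last status per trace and reports only transitions."""

    def __init__(self):
        self._last = {}

    def update(self, trace_id, trace_messages):
        status = swap_status(trace_messages)
        if self._last.get(trace_id) == status:
            return None  # nothing new to push to subscribers
        self._last[trace_id] = status
        return {"trace_id": trace_id, "swap.status": status}
```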
takeaway.
ton is a clean counterexample to the assumption that "all chains are basically the same with different bytecode." async message-passing, deterministic per-pair token wallets, op-codes instead of events, sharded ingestion, bounces that look like normal messages, prices read via get-methods at historical BlockIdExt — the EVM intuition doesn't carry. the unified data layer is harder when the underlying primitives don't match. but that's exactly what the unified layer is for.