Everything Will Be Tokenised
The next decade is not about which AI model you use. It is about which tokens you control. From memory to identity to expert knowledge, every business asset is becoming machine-readable. The organisations that understand this will compound. Those that do not will become raw material for someone else's token factory.
Not Just Words on a Page
When people talk about tokens and AI, they usually mean one thing: the chunks of text a language model processes when predicting the next word. The Declaration of Independence is 1,627 tokens. GPT-4 processes roughly 10 trillion tokens a month. Pay per token. Think in tokens. Bill in tokens.
That framing is too narrow, and the organisations that mistake it for the full picture will pay for it.
Tokenisation is the process of rendering the world for computers. When you tokenise data, identity, money, or knowledge, you make what people know and do across human systems (finance, commerce, healthcare, logistics) safely accessible to software, including increasingly autonomous software. Tokens are the DNA of AI. They are its feedstock and its fuel.
Not all tokens are equal. There are seven distinct token categories emerging that will define which organisations win the next decade. Each has a different supply curve, a different competitive dynamic, and a radically different value trajectory. Understanding which ones your business generates, which ones it protects, and which ones it is quietly giving away is the most important strategic question in AI right now.
Seven Tokens That Define the Next Decade
Three broad groupings are worth holding in your head: tokens used for value transfer, tokens used for building expertise, and tokens used for personalisation. Within those three sits the taxonomy that matters in practice.
Identity tokens
Cryptographically verifiable proof of who a person, organisation, or agent is. As machines proliferate and deepfakes become indistinguishable from the real thing, identity tokens are the foundation of any trusted interaction. Biometrics, verified credentials, Know Your Customer (KYC) records: these are identity tokens. Businesses that hold them and issue them sit at the chokepoint of the entire agent economy.
Context tokens
How you behave, not how you describe yourself. Every click, transaction, navigation path, and revealed preference is a context token. These are not self-reported; they are inferred from real behaviour. A retailer with 40 million customers generating daily context tokens through purchases, browsing, and returns holds something no competitor can buy off the shelf.
Access tokens
Authorisation credentials that unlock identity and context data for trusted third parties. Stripe's payment tokens, Plaid's account connectivity tokens, Persona's reusable identity tokens: all access tokens. They are the hardest to build, requiring years of trust-building, regulatory compliance, and scale, and the most defensible once established.
Memory tokens
Context shared directly and persistently with AI systems. When you tell an AI your preferences, your goals, and your history, you are generating memory tokens. The organisation that holds your memory tokens has an informational advantage over every other organisation you interact with. This race has barely started, and the eventual winner may be sitting on the most valuable dataset in commercial history.
Expert tokens
Refined representations of specialist knowledge, built from human expert feedback on AI outputs. A medical AI trained on annotated clinical decisions generates expert tokens. A financial analysis tool that learns from experienced analysts correcting its outputs generates expert tokens. These compound through use: the more decisions an expert reviews, the more valuable the resulting tokens become.
Knowledge tokens
The broad training corpus that gives AI systems general capability: web text, research papers, code repositories. Unlike the categories above, knowledge tokens are largely commoditised. Any foundation model can access them. The value has already shifted up the stack to the specialised tokens that sit on top.
Asset tokens
Digital representations of financial and physical assets: stablecoins, tokenised securities, programmable payment credentials. As agents gain the ability to transact autonomously, asset tokens become the fuel for the machine economy. Launching a programmable asset on Solana or Base now costs under ten dollars. The infrastructure is ready.
The Revolut Proof
You can talk about token taxonomy in the abstract for a long time. Revolut's April 2026 research paper, PRAGMA: Revolut Foundation Model, is what it looks like in practice when an organisation commits to its own token inventory rather than borrowing someone else's.
Revolut did not license a foundation model from OpenAI, Anthropic, or Google and fine-tune it on a subset of customer data. They built their own.
PRAGMA is a family of Transformer-based models pre-trained on 26 million user records spanning 111 countries, covering 24 billion behavioural events totalling 207 billion tokens. Those tokens are not words. They are card transactions, app navigation paths, trading activity, communications, account milestones, and profile states: every interaction a Revolut customer has ever had with the platform, encoded and made machine-readable.
The point of PRAGMA is not that Revolut built a big model. The point is what they built it from. Every time a Revolut customer makes a payment, opens the app, or buys a stock, they generate a context token. Aggregated across the 26 million user records in the training set, those tokens form a representation of financial behaviour that no general-purpose model trained on internet text can replicate. You cannot buy that at an API endpoint.
Standard language models process financial data as text. They tokenise a transaction amount as digit characters. They process a merchant category code as subword fragments. They have no native representation for temporal distance between events, cyclical patterns in behaviour, or the structural relationship between a payment type and its fields. PRAGMA was built to reject this entirely. Its tokenisation scheme is native to banking data: numerical values become percentile buckets, categorical values map to single tokens, and time is encoded both as log-seconds to the last event and as cyclical calendar features. The model understands the difference between a card payment and a transfer not because it has read text about the difference, but because it has processed hundreds of millions of examples of each, structured as the actual events they are.
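The encoding choices described above can be illustrated with a minimal sketch. This is not Revolut's code: the function names, the `AMT_P` and `FIELD=value` token formats, the ten-bucket default, and the use of `log1p` (to avoid taking the log of a zero-second gap) are all assumptions made for illustration.

```python
import bisect
import math
from datetime import datetime, timezone


def percentile_edges(values, n_buckets=10):
    """Cut points that split the reference values into n_buckets percentile bands."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * i) // n_buckets] for i in range(1, n_buckets)]


def amount_token(amount, edges):
    """Numerical value -> one token naming its percentile bucket (hypothetical format)."""
    return f"AMT_P{bisect.bisect_right(edges, amount)}"


def category_token(field, value):
    """Categorical value -> a single token, never split into subword fragments."""
    return f"{field.upper()}={value}"


def time_features(event_time, previous_time):
    """Log-scaled gap since the previous event plus cyclical calendar features."""
    gap_seconds = max((event_time - previous_time).total_seconds(), 0.0)
    hour_angle = 2 * math.pi * (event_time.hour + event_time.minute / 60) / 24
    day_angle = 2 * math.pi * event_time.weekday() / 7
    return {
        "log_gap": math.log1p(gap_seconds),  # assumption: log1p rather than plain log
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "dow_sin": math.sin(day_angle),
        "dow_cos": math.cos(day_angle),
    }


def encode_event(event, previous_time, edges):
    """Turn one structured event into discrete tokens plus time features."""
    tokens = [
        category_token("type", event["type"]),
        category_token("mcc", event["mcc"]),
        amount_token(event["amount"], edges),
    ]
    return tokens, time_features(event["time"], previous_time)


# Illustrative data only: bucket edges taken from a made-up amount history.
edges = percentile_edges(range(1, 101))
event = {
    "type": "card_payment",
    "mcc": "5411",
    "amount": 42.5,
    "time": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
}
tokens, features = encode_event(
    event, datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc), edges
)
print(tokens, features)
```

The sine and cosine pair matters: it places 23:00 next to 01:00, where a raw hour number would put them at opposite ends of the scale.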
Revolut is not offering PRAGMA to anyone else. They are not sharing the architecture with competitors. This is a proprietary token factory built from proprietary context tokens, producing domain-specific intelligence that compounds with every new customer interaction. The moat is not the model. The moat is the 24 billion events that trained it.
The Honest Questions Every Organisation Needs to Ask
Revolut had an unusual head start: a decade of behavioural data from tens of millions of customers, collected on a platform they own. Most organisations are not starting from that position. But every organisation is generating tokens right now, whether they know it or not. The question is whether those tokens are being captured, protected, and compounded, or whether they are flowing quietly to someone else.
Which context tokens are you generating?
Every customer transaction, service interaction, and purchase pattern is a context token. Retailers generating 40 million customer purchase histories, logistics firms accumulating 15 million routing decisions, healthcare providers capturing clinical pathways: all proprietary context token inventories. Most organisations have not mapped them. Fewer still have made them machine-readable in a way that compounds over time.
Which expert tokens are you building?
Every time a human expert reviews an AI output and corrects it, they generate an expert token. Every annotated decision, every quality-checked output, every calibrated model improvement represents accumulated specialist knowledge that becomes harder for competitors to replicate. The organisations building human-in-the-loop feedback systems are not just improving their AI; they are building a proprietary expert token supply that grows with every interaction.
Which tokens are you handing to someone else?
Every third-party AI platform you connect to receives some combination of your context, memory, and expert tokens in exchange for inference capability. Where the provider's terms permit it to learn from that data, the platform improves, and those improvements are distributed to every customer of that platform, including your direct competitors. The short-term productivity gain is real. So is the long-term competitive cost, and most organisations have not done that calculation.
What You Already Have That You Do Not Know About
The challenge for most businesses is not that they lack tokens. It is that their tokens are buried in formats machines cannot use, or held in systems that were never designed to generate a compounding advantage from them. The average large enterprise has petabytes of data and a clear AI strategy on the analyst call. In practice, that data sits in poorly structured databases and unstructured documents. The average knowledge worker still performs over a thousand copy-paste actions a week.
| Token type | Where it lives in most businesses | What it could become |
|---|---|---|
| Context tokens | CRM records, transaction histories, support logs | Personalised agent behaviour, predictive models, customer-level intelligence |
| Expert tokens | Analyst decisions, underwriter judgements, clinical annotations | Domain-specific models that outperform general-purpose AI on specialist tasks |
| Identity tokens | KYC records, biometric data, verified credentials | Trusted access infrastructure for agent-to-agent transactions |
| Memory tokens | Customer preferences, service history, stated goals | Persistent agent context that improves every subsequent interaction |
| Access tokens | API credentials, OAuth tokens, integration keys | Verified agent authorisation infrastructure for trusted multi-agent commerce |
| Asset tokens | Payment rails, inventory records, financial accounts | Programmable settlement infrastructure for autonomous agent transactions |
The gap between the data a business holds and the tokens it can deploy is where most of the value is being left. Not because the data does not exist, but because nobody has mapped it, structured it, or protected it as a strategic asset.
Why Picking the Best Model Is the Wrong Game
The cost of running large language models has dropped by a factor of one thousand over three years. Access to frontier model capability is approaching commodity pricing. Every organisation can afford GPT-4. Every organisation can afford Claude. The model is no longer the moat.
What Revolut built with PRAGMA demonstrates this cleanly. A standard language model fed Revolut's transaction data would treat it as text: amounts as digit sequences, merchant codes as subword fragments, event timing as a number with no structural meaning. PRAGMA treats the same data as what it is: a structured sequence of financial events with semantic types, value distributions, and temporal relationships. The underlying model architecture matters far less than the decision to build a representation of your data that only you can build.
This does not mean every organisation needs to build a billion-parameter foundation model. It means every organisation needs to understand which of its data assets are genuinely proprietary, which are being captured in a form that compounds, and which are being handed to third-party platforms that distribute the resulting intelligence to the entire market.
The Access Token Layer: Why Verification Is the Starting Point
There is one token category that sits underneath all the others: access tokens. Before context can be shared, before memory can be accessed, before a transaction can be initiated, something has to answer two questions: is the requesting agent who it claims to be, and is it authorised to act?
This is not a theoretical concern. An agent can claim to represent your brand today, query your systems, and return responses under your name, without any cryptographic proof that it is authorised to do so. As agent-to-agent (A2A) commerce scales from experimental to operational, the absence of a verified agent endpoint will be the single most exploitable gap in the commercial internet.
Fetch.ai's Agentverse and the Almanac registry are built around this problem. Every agent registered on the Almanac carries a cryptographically signed identity, verified on-chain. Before any token exchange happens, before context is shared or a transaction is initiated, both parties confirm cryptographically, rather than by assumption, that they are communicating with a verified, authenticated agent. This is what Know Your Agent (KYA) looks like in practice: not a policy, but a protocol.
In the same way that KYC verification became a compliance requirement as financial fraud scaled, KYA will become a commercial requirement as agentic interactions scale. The organisations that build this infrastructure now will not need to retrofit it when retrofitting means interrupting live commercial flows.
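The verify-before-exchange pattern behind KYA can be shown in a few lines. This sketch does not use the Agentverse or Almanac APIs. The in-memory `REGISTRY`, the agent address, and every function name are hypothetical, and an HMAC over a random challenge stands in for the public-key signature and on-chain lookup a real deployment would rely on.

```python
import hashlib
import hmac
import secrets

# Hypothetical registry: agent address -> key recorded when the agent enrolled.
# A production registry would hold public keys, not shared secrets.
REGISTRY = {"agent-retailer-01": secrets.token_bytes(32)}


def issue_challenge():
    """Fresh random challenge so an old response cannot be replayed."""
    return secrets.token_bytes(16)


def sign_challenge(key, challenge):
    """What the requesting agent computes to prove it holds the enrolled key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_agent(address, challenge, response):
    """True only if the address is registered and the response matches."""
    key = REGISTRY.get(address)
    if key is None:
        return False
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


def share_context(address, challenge, response, context):
    """Release context only after verification succeeds."""
    if not verify_agent(address, challenge, response):
        raise PermissionError(f"unverified agent: {address}")
    return context


challenge = issue_challenge()
response = sign_challenge(REGISTRY["agent-retailer-01"], challenge)
print(share_context("agent-retailer-01", challenge, response, {"basket": ["milk"]}))
```

The design point is the ordering: nothing is shared until the check passes, and an unregistered or incorrectly signed request fails closed.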
Where to Start
Revolut required a decade of platform data and a dedicated research organisation to build PRAGMA. Most businesses are not starting from that position, and they do not need to be.
The practical starting point is a token audit: an honest inventory of which tokens your organisation is already generating, which are being captured in machine-readable form, which are being given away to third-party platforms, and which represent a genuine proprietary advantage if protected and compounded.
Most organisations that do this exercise find the same three things. They are generating far more context tokens than they have made machine-readable. They are sharing more expert tokens with third-party platforms than they realised. And they have no verified access token infrastructure, which means any agent they deploy, or any agent that claims to represent them, is operating on assumption rather than verification.
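A token audit can start as something very small. The sketch below is one possible shape for the inventory, not a prescribed method: the `TokenSource` fields, the category labels, and the example entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenSource:
    name: str                      # where the data lives, e.g. a system or log
    category: str                  # "context", "expert", "access", ...
    machine_readable: bool         # captured in a structured, usable form?
    shared_with_third_party: bool  # flows to an external platform?


def audit(sources):
    """Count, per token category, what exists, what is usable, and what leaks."""
    summary = {}
    for source in sources:
        row = summary.setdefault(
            source.category, {"total": 0, "machine_readable": 0, "shared": 0}
        )
        row["total"] += 1
        row["machine_readable"] += source.machine_readable
        row["shared"] += source.shared_with_third_party
    return summary


# Illustrative inventory only.
inventory = [
    TokenSource("CRM records", "context", True, False),
    TokenSource("Support call notes", "context", False, True),
    TokenSource("Underwriter corrections", "expert", False, True),
]
for category, row in audit(inventory).items():
    print(category, row)
```

Even at this level of detail, the gaps between the three counts show where tokens exist but are unusable, and where they are leaving the building.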
The businesses that move from that audit to a strategy, even a narrow one focused on a single high-value token category, will be compounding while their competitors are still evaluating vendor proposals.
Revolut did not start by building a one-billion-parameter model. They started by capturing every banking event in a structured, machine-readable format. Everything else followed from that discipline. The same logic applies whether you are a retailer, a logistics operator, a professional services firm, or a manufacturer. Your context tokens already exist. The question is whether you own them.
Ready to map your token inventory?
The organisations that understand their own token types, and which ones they are building versus giving away, are the ones that will compound in the agent economy. We would welcome a conversation about where your organisation sits and what a token strategy might look like in practice.
Sources
- Ostroukhov, M., Mikhailov, R., Iashin, V., Sokolov, A., et al. "PRAGMA: Revolut Foundation Model." arXiv:2604.08649, Revolut Research and NVIDIA (April 2026)
- Ribbit Capital. "The Token Revolution." Annual Partner Letter, June 2025. Source of the token taxonomy and strategic framing used here.
- Huang, J. NVIDIA CES 2025 Keynote. Source of the "token factory" framing.
- Revolut PRAGMA pre-training dataset: 26M user records, 111 countries, 24B events, 207B tokens, 25-month temporal range 2023-2025
- OpenRouter: 5 trillion tokens processed monthly (2025)
Joe Hurst - Chief Revenue Officer
Joe.Hurst@fetch.ai