
@AI-ForceField: Two recent incidents left a deep impression on me.
The first is Justin Sun. He launched a project called "Boring Cat," without even a technical white paper, and its revenue exceeded one million in the first week. He then pushed out DeepPal and BAI Law, all relying on the same underlying trick: flipping tokens.
The second is Sheng Fu. Cheetah Mobile officially launched its "Model Gateway" service, allowing users to call mainstream global large models with a single API Key. He directly revealed the pricing, claiming his service costs only one-seventeenth of his competitors'.
One is the most controversial operator in the crypto space; the other is an internet veteran. Their styles are poles apart, yet they have both rushed into the exact same track.
This track is called the "AI Relay Station."
It doesn't build models, hoard computing power, or touch training data. Sounds like a middleman? But look closely: this track is frantically sucking up money, talent, and attention. What exactly is it? Why did it make these two pivot at the same time?
1. What Exactly is an AI Relay Station?
Let's recreate a real-world scenario first.
You are a developer who wants to integrate AI dialogue into your product. How many things do you have to do?
Register an overseas account, bind an overseas credit card (domestic cards will most likely be rejected), and set up a proxy yourself to solve network issues. Then comes the interface problem: every model has its own format. OpenAI has one, Claude has another, and Google Gemini has yet another. Want users to freely choose models? Write adaptation code for each one. Finally, you bear the API bills; a single GPT-4 call can cost tens of cents, and as users grow, the bills stack up.
In short: annoying, expensive, difficult.
The AI relay station exists to solve this pain point. It only does three things.
First, it purchases on your behalf. It buys tokens in bulk from manufacturers like OpenAI and Anthropic; large volumes mean cheaper prices and discounts. Second, it builds the road for you. Domestic servers connect directly without needing a VPN, and payments are also sorted out for you. Third, it unifies the interfaces. All differences between models are smoothed out; you only need one API Key to call all mainstream models.
To put it bluntly, it turns expensive, scattered, and hard-to-use AI capabilities into plug-and-play tools. You don't need to care about how it connects in the background; you just focus on using it.
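Concretely, "one API Key to call all mainstream models" usually means every request has the same shape no matter which model it targets. A minimal sketch in Python, assuming a hypothetical OpenAI-compatible gateway endpoint (the URL, key, and model names here are made up for illustration):

```python
import json

def build_chat_request(model: str, prompt: str, api_key: str) -> dict:
    """Build one uniform request; the gateway translates it per provider."""
    return {
        "url": "https://relay.example.com/v1/chat/completions",  # hypothetical gateway
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,  # "gpt-4", "claude-3-opus", "gemini-pro": same shape
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

# The same function works for every provider; only the model string changes.
req_a = build_chat_request("gpt-4", "Hello", "sk-relay-123")
req_b = build_chat_request("claude-3-opus", "Hello", "sk-relay-123")
```

This is why switching models becomes a one-line change for the developer: the gateway absorbs all the per-provider differences behind that single endpoint.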
It should be noted that this is a B2B business. Relay stations sell API calling capability; they are essentially developer tools. Their customers are not ordinary users chatting with ChatGPT, but teams or individuals with development capabilities: SMEs, independent developers, and the tech departments of large enterprises. Ordinary consumers can simply use ready-made chat products; they never need to touch an API.
2. Underlying Logic: A Smart Dispatch System
Many people get a headache when they hear "underlying architecture." Let me explain it differently and you'll understand.
You want to send a package from Guangzhou to New York. You don't need to arrange the plane or handle customs yourself; just hand it to Cainiao. The AI relay station is that "Cainiao system," except it transports not packages, but your data requests.
It consists of four layers.
The first layer is the unified entry point. When you send a request, it first identifies who you are and checks your balance, then hands it over to the internal system.
The second layer is intelligent routing. This is the brain of the entire system. It makes a few judgments: if your budget is limited, it routes to a cheaper model; if the task is complex, it uses GPT-4; if the official API suddenly slows down, it automatically switches to a backup channel; if one API Key gets rate-limited, it switches to the next. You remain completely oblivious to all this; it does it all automatically for you.
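That routing brain is essentially a few ordered rules. A toy version in Python, with the priorities as described above (all model and channel names are illustrative, not any real station's configuration):

```python
def route_request(budget_limited: bool, task_complex: bool, primary_up: bool) -> str:
    """Pick a channel/model: fall back when the primary degrades,
    go cheap when budget-bound, go strong when the task is complex."""
    if not primary_up:
        return "backup-channel/gpt-4"   # official API slow or down: switch channels
    if budget_limited:
        return "primary/cheap-model"    # budget first: route to a cheaper model
    if task_complex:
        return "primary/gpt-4"          # complex task: use the strong model
    return "primary/default-model"

# The caller never sees any of this; it just gets an answer back.
choice = route_request(budget_limited=False, task_complex=True, primary_up=True)
```

A real router would also weigh latency, per-key rate limits, and live pricing, but the shape is the same: an ordered decision list evaluated per request.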
The third layer is protocol conversion. Each model's interface looks different. This layer handles translation; you use only one set of parameters, and it converts them into the format recognized by each model.
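A sketch of what that translation layer does, using simplified versions of the three providers' request formats (the real APIs require more fields, so treat these shapes as approximations):

```python
def to_provider_format(provider: str, prompt: str) -> dict:
    """Translate one uniform prompt into each provider's (simplified) wire format."""
    if provider == "openai":
        # OpenAI-style: a flat list of role/content messages
        return {"messages": [{"role": "user", "content": prompt}]}
    if provider == "anthropic":
        # Anthropic-style: similar messages list, but max_tokens is required
        return {"max_tokens": 1024,
                "messages": [{"role": "user", "content": prompt}]}
    if provider == "gemini":
        # Gemini-style: "contents" with nested "parts"
        return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    raise ValueError(f"unknown provider: {provider}")
```

The user only ever writes the uniform shape from the previous layer; this function is the "translator" sitting between that shape and each vendor's dialect.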
The fourth layer is billing and caching. It accurately records how many tokens you've used and deducts fees on a pay-as-you-go basis. If someone else has asked the same question, the cache returns the result directly, saving tokens and money.
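The caching-plus-metering idea fits in a few lines. A toy sketch, with a stand-in for the real model call and a hypothetical token price:

```python
import hashlib

class Gateway:
    """Billing and caching layer: meter tokens, serve repeats from cache."""
    PRICE_PER_1K_TOKENS = 0.01  # hypothetical rate, for illustration only

    def __init__(self):
        self.cache = {}
        self.spent = 0.0

    def ask(self, question: str) -> str:
        key = hashlib.sha256(question.encode()).hexdigest()
        if key in self.cache:                 # cache hit: no tokens billed
            return self.cache[key]
        answer = f"answer to: {question}"     # stand-in for a real model call
        tokens = len(question.split()) + len(answer.split())
        self.spent += tokens / 1000 * self.PRICE_PER_1K_TOKENS  # pay-as-you-go
        self.cache[key] = answer
        return answer

gw = Gateway()
gw.ask("What is a relay station?")
cost_after_first = gw.spent
gw.ask("What is a relay station?")   # identical question: served from cache
assert gw.spent == cost_after_first  # no extra charge for the repeat
```

Real stations would count tokens with the provider's tokenizer and expire cache entries, but the economics are exactly this: every cache hit is a token the station never has to buy.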
In terms of technical implementation, the mainstream approach is modifying Envoy or Cloudflare Gateway. The real challenge isn't writing those few hundred lines of code, but keeping it running stably: withstanding sudden traffic spikes, fending off malicious attacks, and handling rate limits and account bans from upstream providers. Many think this is just putting a shell on something, but those who have actually done it know: keeping a gateway 99.9% available is harder than writing a model.
3. Business Model: Three Layers of Profitability, Each Deeper Than the Last
How to make money? There are three layers, from shallow to deep.
The first layer is the wholesale-retail price difference. Large customers can get 20% to 50% discounts through bulk purchasing. The relay station pools the needs of dozens of small and medium customers to negotiate with manufacturers. After getting the discount, it retails at 70% to 90% of the official price. Let's do the math: the purchasing cost is 50% of the official price, selling at 80%, yielding a 30% gross margin. With a monthly revenue of one million, the gross profit is 300,000. Deducting server and payment costs, the net profit is 150,000 to 200,000. This is the basic gameplay; the barrier to entry is low, so competition is the fiercest.
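The back-of-envelope math above can be made explicit. One nuance: the 30-point spread is quoted against the official list price, so as a share of actual revenue the gross margin comes out slightly higher (all numbers are the article's illustrative figures):

```python
# Prices normalized to the official list price = 1.0
official = 1.00
purchase_cost = 0.50 * official   # bulk discount: pay 50% of list
retail = 0.80 * official          # resell at 80% of list

spread = retail - purchase_cost       # 0.30: the "30%" quoted against list price
margin_on_revenue = spread / retail   # 0.375: the margin as a share of revenue

monthly_revenue = 1_000_000
gross_profit = monthly_revenue * margin_on_revenue  # ~375,000 before overheads
```

Either way the conclusion holds: after servers and payment fees, a station at this scale nets a comfortable six figures a month, which is why the entry-level version of this business is so crowded.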
The second layer is the toll fee. Some users don't want to buy tokens from the relay station; they have their own official keys but want unified interfaces and automatic failover. The relay station doesn't sell tokens, only services, taking a 5% to 10% cut per call. This model is lighter, requiring no upfront capital for purchasing, but user stickiness is low, as they can switch away at any time.
The third layer is value-added services, which is also the deepest moat. Large enterprise customers are willing to pay higher prices for "stability" and "security." For example, a hot standby pool of multiple accounts—official account bans are commonplace, so the relay station maintains a pool of hundreds of keys; if one gets banned, it automatically switches to the next, completely seamlessly for the user. Another example is intelligent routing strategies: using cheap models for writing emails and powerful models for writing code, saving users money while the relay station earns the difference. There's also data caching, compliance auditing, and so on. The profit margin here is far higher than selling tokens, and once customers start using it, they can't be bothered to migrate.
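The hot-standby key pool described above is, at its core, rotate-on-failure. A minimal sketch, using `PermissionError` as a stand-in for an upstream account ban (key names and the failure signal are illustrative):

```python
class KeyPool:
    """Hot-standby pool: skip banned keys, fail over to the next automatically."""
    def __init__(self, keys):
        self.keys = list(keys)
        self.banned = set()

    def call(self, request_fn):
        # Try keys in order, skipping ones already marked banned;
        # the caller never sees the failover happen.
        for key in self.keys:
            if key in self.banned:
                continue
            try:
                return request_fn(key)
            except PermissionError:       # stand-in for "account banned" upstream
                self.banned.add(key)
        raise RuntimeError("all keys exhausted")

pool = KeyPool(["key-1", "key-2", "key-3"])

def fake_upstream(key):
    if key == "key-1":
        raise PermissionError("banned")   # simulate the first key getting banned
    return f"ok via {key}"

result = pool.call(fake_upstream)   # transparently fails over to key-2
```

A production pool would also track per-key rate limits and re-test banned keys over time, but seamlessness for the user comes from exactly this loop.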
The essential characteristics of this business are very obvious: extremely low marginal costs—adding a new user barely increases server costs; excellent cash flow—users top up in advance, so the platform always has a pool of money on hand; and obvious economies of scale—the more users, the larger the purchasing discounts, and the thicker the profits. A classic winner-takes-all scenario.
4. Why Did Justin Sun and Sheng Fu Rush In at the Same Time?
The two have completely different motives, but they spotted the same underlying trend.
Let's talk about Sheng Fu first. He once said, "We will not develop our own large models; we will only be the porters of large models." At the time, many thought this was modesty or a sign of lacking technical prowess. Looking back now, he was placing an early bet on a judgment: large models will rapidly become commoditized. Prices will get lower and lower, and performance across companies will increasingly converge. What will truly be scarce is no longer the models themselves, but the low-cost, low-barrier connectivity. Cheetah Mobile happens to have overseas payment channels, a global server network, and large-scale operations experience. So, building a model gateway isn't crossing over for him; it's a natural extension of his capabilities.
Now for Justin Sun. His approach is more aggressive. The underlying structure of his projects is the same AI relay station, but payments are made directly in USDT, and the crypto discounts are even larger than the fiat ones. Why? First, the startup cost is extremely low; there is no need to develop models or buy GPUs. Second, the top-up model naturally accumulates capital, and if that money sits in stablecoins, it can be put straight into DeFi to earn yield. Third, registering offshore bypasses domestic regulation. For Justin Sun, the AI relay station itself is not the end goal, but a tool to acquire real-world application scenarios and cash flow for his crypto empire.
The two share the exact same judgment: large models are becoming public infrastructure, like electricity and bandwidth. Profits at the model layer will be squeezed thinner by competition, while the connection layer—that is, how to make it convenient and cheap for users to access these models—will take the lion's share of the industry's profits.
5. The Gray Areas Behind the Glamour
A fast-growing track inevitably has a side that cannot see the light of day.
The first is black market tokens. Some small relay stations use stolen credit cards to buy API quotas, making the cost almost zero. But the risk is extremely high; once discovered by the manufacturers, all associated accounts are banned, the users' pre-loaded money goes to zero, and data could be leaked.
The second is model downgrade fraud. Users think they are calling GPT-4, but the backend secretly swaps it for the free Llama 2. The output quality drops, and users might even think it's their own fault for not writing a good prompt.
The third is data abuse. Certain platforms log every user question and answer in plaintext, then use it to train their own models or sell it to third parties. This is explicitly prohibited in OpenAI's terms of service.
This is why the "regular army" of legitimate players is racing to build moats: obtaining official partner qualifications, promising not to mine user data, and introducing third-party audits. Sheng Fu is playing the "listed-company compliance and reliability" card. The gray market feeds on short-term windfalls; the regular army feeds on long-term trust.
6. Who is Making Money?
Training large models requires billions of dollars; fewer than five companies in China can afford to play that game. Building AI relay stations has a much lower barrier to entry, generates money much faster, and is much closer to the user.
Justin Sun sees it, Sheng Fu sees it, and capital is pouring in.
The next time you use an AI application, it's very likely not connecting directly to OpenAI's data centers, but routing through an intermediate layer called a "relay station." The name may sound modest today, but it is fast becoming ground that every strategist wants to hold.
Gold miners always take risks, while the people selling tools next to them make the steadiest profits.
