Can tech companies learn to love cheaper AI models?

3:32 AM IST · June 10, 2026

The AI boom has been built on a basic assumption: Bigger models are more powerful, and the most powerful models win. Now, the industry is about to learn what happens if that assumption starts to break.

Mounting costs have already pressured users to give smaller and cheaper models a second look. This cost-conscious model-shopping is new and it’s unclear how it will affect the industry, but the impact is likely to be significant.

One prediction, laid out best by Coinbase co-founder Brian Armstrong, is that it will result in the vast majority of tasks shifting to cheaper models. “[D]emand for intelligence is near infinite, but 80% of workloads will be running on 99% cheaper models within 12-18 months,” Armstrong wrote on X. “20% of workloads will still run on latest gen models where IQ maxing is important.”

It’s hard to overstate what a significant shift it will be for the AI industry if Armstrong’s prediction comes true. Before now, most AI companies have competed on quality, which has meant defaulting to the most advanced available model. If those same jobs can be handled by cheaper models without affecting quality, it would mean a massive shift in the economics of AI. And critically, much of the savings would be coming out of the pockets of the big labs, dealing a financial blow to OpenAI and Anthropic just as they’re heading for their IPOs.

It’s a potentially seismic change in the industry, resting on one basic question: Are companies ready to switch to smaller models?

Initial tests suggest that, when the system is arranged right, cheaper models could sub in without any sacrifice in quality. In a recent test by the legal AI tool Harvey, the company was able to reduce inference costs by 3x without reducing quality. The test, performed in partnership with the inference platform Fireworks AI, combined Claude Opus and Fireworks’ GLM 5.1, and shifted to Opus for the most intensive tasks. The result was a significantly lower load in terms of server time and overall cost.

“Quality comes first, and in legal it always will,” Harvey co-founder Gabe Pereyra told TechCrunch, referring to the AI legal services his startup provides. “However, the definition of quality is evolving from simply using the most powerful model for everything, to using the best model that gets the right answer most efficiently.”

This trend is often framed in terms of major labs versus Chinese models or open-weight ones, but that misses the bigger point. The real divide isn’t between proprietary and open models; it’s between large models and small ones. You can save money by switching from GPT-5.5 to DeepSeek’s V4 Flash, but switching to GPT-5.4-mini works just as well. There’s an active price war going on between in-house inference from the big labs and independently served open-weight models. For the bigger question of small versus large, it doesn’t really matter which kind of small model wins out.

All of this might seem obvious (of course you shouldn’t use more compute than necessary), but it runs counter to the scaling-first approach that has dominated the industry until now. Inspired by the bitter lesson, labs have leaned hard into training the most compute-intensive models possible, pushing the frontier of what AI models can do. With prices heavily subsidized by investors, clients had no reason to choose anything but the most advanced option. With token prices rising and subsidies slowing down, users are facing cost pressure for the first time.

We don’t know whether the new cost pressure will actually drive enterprise users to smaller models. They could just as easily economize by making fewer calls, using less context, or simply giving up on the least promising deployments. But if it turns out that most deployments can be run just as well on a smaller model, it could put a serious damper on the growing demand for inference, and raise new questions about how to justify the cost of training a frontier model.
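
For a rough sense of scale, Armstrong’s 80%/99% figures can be turned into a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes every workload costs the same on a frontier model and that spend scales linearly with price, neither of which the article states, and the function name `blended_cost` is made up for this example.

```python
def blended_cost(cheap_share: float, cheap_discount: float) -> float:
    """Return total spend as a fraction of an all-frontier baseline.

    cheap_share: fraction of workloads moved to cheaper models (0..1).
    cheap_discount: how much cheaper those models are (0.99 = 99% cheaper).
    Assumption (not from the article): every workload costs the same
    when run on the frontier model.
    """
    if not 0.0 <= cheap_share <= 1.0 or not 0.0 <= cheap_discount <= 1.0:
        raise ValueError("shares and discounts must be between 0 and 1")
    frontier_share = 1.0 - cheap_share
    # Frontier workloads keep full price; cheap workloads pay the remainder.
    return frontier_share + cheap_share * (1.0 - cheap_discount)


# Armstrong's figures: 80% of workloads on models that are 99% cheaper.
relative_spend = blended_cost(cheap_share=0.80, cheap_discount=0.99)
print(f"spend vs. baseline: {relative_spend:.1%}")
print(f"cost reduction factor: {1 / relative_spend:.1f}x")
```

Under these simplified assumptions, total spend drops to about a fifth of the baseline, and almost all of what remains is the 20% of workloads still running on frontier models.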

Latest AI News

Almost half of U.S. singles feel negatively about AI in dating, Match says

Dating app giant Match Group, which owns apps like Tinder, Hinge, and OkCupid, conducted a study to determine how U.S. singles really feel about the relationship between AI and dating. Turns out, people don’t want AI messing with every aspect of human life.

Across the industry, dating apps are experimenting with AI. Bumble introduced a dating assistant named Bee, and Tinder is spending so much on AI tools that it’s slowed its hiring process. Meanwhile, Hinge’s CEO stepped down last year to launch a more AI-focused dating app altogether.

But according to Match’s survey of 1,000 people aged 18 to 39, 47% of singles have a negative view of AI’s use in romantic contexts. This perspective varies depending on what the AI is being used for. About 40% of singles say they would refuse to date someone who uses an AI companion app, and that figure rises to 51% among women ages 18 to 24. However, only 12% of 18- to 24-year-olds said that they had used a companion app over the last three months, and only about a third of those users said they were seeking genuine connections with those chatbots.

While Match says that people harbor a “near-universal” disapproval of actually dating an AI, like in the movie “Her,” that doesn’t mean that respondents are wholly opposed to AI features within apps. Some 64% of respondents said they could see how AI might help them in their dating journey.

If we’re being pedantic, technically, every major dating app has already used some form of matching algorithm since before we knew what a GPT was. This survey refers to the new crop of AI features that basically every app is introducing, which help users punch up their profiles, choose photos, and keep conversations flowing.

What dating app developers should take away from this survey is that people are not entirely closed off to AI; they just don’t want to be in a relationship with a robot, nor do they want to feel as though their dating experiences are overly inundated with technology that feels inauthentic.

“Ask singles what they want from AI in dating, and the answer is pretty consistent: help with the hard parts, but hands off for the human parts,” Match wrote in a blog post. “Yes, they’ll use it to help them punch up a profile or for help figuring out what to say when a conversation goes quiet, but the actual connection is still theirs to create.”

Hopefully, this message reaches dating entrepreneurs like Bumble founder Whitney Wolfe Herd, who suggested that dating app users could have personal bots that date other users’ bots. It’s pretty normal nowadays to say you met your partner online, but “his bot asked my bot out, and our bots hit it off” will never be a socially acceptable meet-cute.

OpenAI is bringing on some big guns in the lead-up to its IPO

OpenAI is bringing on some big names to the team in the lead-up to its public debut: Google DeepMind AI legend Noam Shazeer and former Trump White House AI policy official Dean Ball.

Shazeer, a co-lead at Gemini and the founder of AI role-playing startup Character AI, announced his departure on Wednesday. He had been at Google since 2000, leaving only for a three-year period to co-found Character AI. Two years ago, Google re-hired Shazeer in a $2.7 billion deal that gave the tech giant access to the startup’s technology. The move is the latest in a series of shufflings between the top AI labs, including Google, OpenAI, Anthropic, and Meta.

Shazeer is credited as one of the foundational minds behind modern generative AI. He co-authored the seminal 2017 paper “Attention Is All You Need,” which introduced the Transformer architecture.

Before leaving Google, Shazeer had also reportedly been stirring the pot when it came to political issues. According to The Information, Shazeer voiced opinions on internal messaging boards on transgender identity and Israel’s war in Gaza that resulted in management deleting his posts. Whether those controversies will follow him to his new employer remains to be seen.

In the meantime, OpenAI is also shoring up its policy credentials by bringing Ball to the team. Ball had a brief stint last year in the White House, where he helped publish America’s AI Action Plan before stepping down to rejoin the techno-libertarian think tank the Foundation for American Innovation as a senior fellow.

“I am pleased and honored to announce that, on July 6, I’ll be joining OpenAI as leader of a new team called Strategic Futures,” Ball wrote on X on Thursday. “Our mandate will be to help the company’s leadership shape frontier AI policy.” Ball will report directly to Chief Strategy Officer Jason Kwon.

The “small, high-agency team” will focus on “matters pertaining to: catastrophic risk, recursive self-improvement, labor market impact, and the relationship between the frontier labs, governments (particularly the U.S. Federal Government), and society,” Ball wrote in a blog post. The Strategic Futures team will cover both public-facing policy and internal governance, he added.

That last point is important: Ball noted that “almost by necessity,” AI labs will have to lead on AI governance decisions. “In other words, internal governance will be more central to the future of AI than most people realize,” Ball wrote.

Ball’s decision to join OpenAI, arguably an AI favorite in the administration, comes as Anthropic battles once again with the U.S. government. Late last week, President Donald Trump ordered an export control ban on Anthropic’s latest models, Fable 5 and Mythos 5, leading to the AI firm being forced to take the models down entirely to avoid noncompliance. For anyone who had “government interference” on their S-1 risk factor bingo card, Ball is what it looks like when a company locks in its insider status while a rival is squeezed.

TechCrunch has reached out to OpenAI for more information.

Snap spins off AI video team into new company, Dotmo, due to costs

Snap will be spinning off an internal generative AI video team into a separate company. The new company, dubbed Dotmo, will focus on developing AI models that can create interactive gaming experiences, Snap told TechCrunch. Snap cited the high costs of conducting such work internally as one of the reasons for the spinoff.

While technically a separate company, Dotmo will retain its close ties to the Snapchat creator. For one thing, Snap will provide Dotmo with a license to adapt its technology for gaming and interactive entertainment platforms. At the same time, the initial Dotmo team will consist of a group of current Snap staff who are leaving Snap to launch the new venture.

Additionally, while Dotmo won’t be funded by Snap directly, the company says that Bobby Murphy, its chief technology officer, will act as lead investor and will have a significant personal stake in the new firm. Though a financial backer, Murphy will continue to work for Snap full-time as its CTO and continue to lead its GenAI research and development initiatives.

In exchange for the talent and the technology license, Snap will get a large equity stake in Dotmo, the company said, a position that could prove rewarding if the company prospers in the future. Dotmo may also eventually seek outside funding, Snap said.

The move marks Snap’s second major spinoff effort this year. Earlier in 2026, Snap spun off Specs into a new company to focus exclusively on the development of its smart glasses line. (Snap’s recent unveiling of Specs wasn’t exactly a home run for the company. Snap’s stock tanked after concerns were raised about the hefty price tag attached to the new smart glasses, which is around $2,200.) Snap also underwent a round of layoffs earlier this year, during which some 1,000 jobs were cut.

Dotmo represents a different kind of spinoff than the Specs operation, in that its team will be focused on developing digital experiences that aren’t currently a part of Snap’s core business priorities, a Snap representative said. However, it could still be considered a partner in the future if the fit seems right, they added.

Spin-offs can be a cost savings strategy for companies, although they can serve a variety of other purposes, like showing off a particular asset, generating investor attention, or providing operational flexibility to the team involved. In spinning out Dotmo, Snap may be reducing the financial burden associated with its AI efforts, while still maintaining exposure to any potential upside through its equity stake.

AI inference startup Baseten reportedly raising $1.5B months after its last mega-round

AI inference company Baseten is close to finalizing a stunning $1.5 billion funding round at a $13 billion valuation, the Wall Street Journal reports. Just five months ago, the startup announced that it had raised a $300 million Series E at a $5 billion valuation. And that round was just nine months after raising a $150 million Series D. If finalized, this latest round would represent a 160% increase in valuation in less than half a year.

However, the WSJ reports that this is a split-priced round, a tactic startups are using to boost their headline valuation and make lead investors look good on paper. Some investors in this latest funding round are reportedly coming in at a $13 billion valuation, while others are coming in at $11 billion, sources told the Journal. This deal is said to be co-led by Spark Capital, Sands Capital, Altimeter Capital, and Wellington Management.

Launched in 2019, Baseten is a startup benefiting from what The Next Wave hailed as the “inference gold rush,” in which VCs are pouring enormous amounts of money into companies building the inference layer. Inference is what the model does after a user submits a prompt. Baseten promises to handle inference quickly while controlling costs by routing requests to the best-for-task model, especially to competent, less-expensive open source alternatives.
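
The valuation arithmetic can be checked in a few lines. This is only a sketch using the figures reported above; the helper name `pct_increase` is made up for illustration, and because the split between the two price tiers has not been reported, no blended valuation is computed.

```python
def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return (new - old) / old * 100.0


# Valuations in billions of dollars, as reported in the article.
SERIES_E_VALUATION = 5.0
HEADLINE_VALUATION = 13.0
LOWER_TIER_VALUATION = 11.0

headline_step_up = pct_increase(SERIES_E_VALUATION, HEADLINE_VALUATION)
lower_tier_step_up = pct_increase(SERIES_E_VALUATION, LOWER_TIER_VALUATION)

print(f"headline tier: +{headline_step_up:.0f}%")
print(f"lower tier:    +{lower_tier_step_up:.0f}%")
```

The 160% figure corresponds to the $13 billion headline tier; investors entering at $11 billion are paying a step-up of 120% over the Series E valuation.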
