Category: AI and ML


    Ensuring effective AI in insurance operations

    Artificial intelligence has been part of the insurance sector for years – the Finance function in many businesses is often the first to automate. But what’s remarkable in the instance of AI is how directly the technology is woven into day-to-day operational work. Not sitting in the background as a niche modelling capability, AI is […]

    The post Ensuring effective AI in insurance operations appeared first on AI News.


    AstraZeneca leads big pharma’s AI clinical trials revolution with real-world patient impact

    Big Pharma’s AI race extends across drug discovery, development, and clinical trials—but AstraZeneca has distinguished itself by deploying AI clinical trials technology at an unprecedented public health scale.  While competitors optimise internal R&D pipelines, AstraZeneca’s AI is already embedded in national healthcare systems, screening hundreds of thousands of patients and demonstrating what happens when AI […]

    The post AstraZeneca leads big pharma’s AI clinical trials revolution with real-world patient impact appeared first on AI News.


    Penguin Ai and FTI Unite Expertise for Next-Gen Revenue Cycle Performance

    Collaboration to help providers improve performance in critical RCM functions Penguin Ai, a health care artificial intelligence (AI) company, announced today a collaboration with FTI Consulting, Inc. (NYSE: FCN), a global business advisory firm, to help healthcare providers strengthen revenue cycle performance and reduce administrative burden. By integrating Penguin Ai’s platform...

    The post Penguin Ai and FTI Unite Expertise for Next-Gen Revenue Cycle Performance first appeared on AI-Tech Park.


    Bespin Global US Achieves AWS AI Services Competency

    Milestone Marks Significant Step in Bespin Global’s AI Growth Strategy Bespin Global, a leader in cloud and AI solutions, is excited to share that it has earned the AWS AI Services Competency. This achievement highlights Bespin Global’s dedication to delivering practical AI innovation for customers using AWS. The AWS AI Services Competency...

    The post Bespin Global US Achieves AWS AI Services Competency first appeared on AI-Tech Park.


    Patronus AI Introduces Generative Simulators

    New research describes how simulations can generate fresh tasks, rules, and grading on the fly, enabling rich, adaptive RL environments for today’s agents Patronus AI today announced “Generative Simulators,” adaptive simulation environments that can continually create new tasks and scenarios, update the rules of the world in a simulation environment, and...

    The post Patronus AI Introduces Generative Simulators first appeared on AI-Tech Park.


    Gemini 3 Flash arrives with reduced costs and latency — a powerful combo for enterprises

    Enterprises can now harness a large language model whose performance approaches that of Google’s state-of-the-art Gemini 3 Pro, but at a fraction of the cost and with greater speed, thanks to the newly released Gemini 3 Flash.

    The model joins the flagship Gemini 3 Pro, Gemini 3 Deep Think, and Gemini Agent, all of which were announced and released last month.

    Gemini 3 Flash, now available on Gemini Enterprise, Google Antigravity, Gemini CLI, and AI Studio, and in preview on Vertex AI, processes information in near real time and helps developers build quick, responsive agentic applications.

    The company said in a blog post that Gemini 3 Flash “builds on the model series that developers and enterprises already love, optimized for high-frequency workflows that demand speed, without sacrificing quality.”

    The model is also the default for AI Mode on Google Search and the Gemini application. 

    Tulsee Doshi, senior director, product management on the Gemini team, said in a separate blog post that the model “demonstrates that speed and scale don’t have to come at the cost of intelligence.”

    “Gemini 3 Flash is made for iterative development, offering Gemini 3’s Pro-grade coding performance with low latency — it’s able to reason and solve tasks quickly in high-frequency workflows,” Doshi said. “It strikes an ideal balance for agentic coding, production-ready systems and responsive interactive applications.”

    Early adoption by specialized firms points to the model's reliability in high-stakes fields. Harvey, an AI platform for law firms, reported a 7% jump in reasoning on its internal 'BigLaw Bench,' while Resemble AI found that Gemini 3 Flash could process complex forensic data for deepfake detection 4x faster than Gemini 2.5 Pro. These aren't just speed gains; they enable 'near real-time' workflows that were previously impossible.

    More efficient at a lower cost

    Enterprise AI builders have become more aware of the cost of running AI models, especially as they try to convince stakeholders to put more budget into agentic workflows that run on expensive models. Organizations have turned to smaller or distilled models, open models, and prompting techniques to help manage bloated AI costs.

    For enterprises, the biggest value proposition for Gemini 3 Flash is that it offers the same level of advanced multimodal capabilities, such as complex video analysis and data extraction, as its larger Gemini counterparts, but is far faster and cheaper. 

    While Google’s internal materials highlight a 3x speed increase over the 2.5 Pro series, data from independent benchmarking firm Artificial Analysis adds a layer of crucial nuance.

    In the latter organization's pre-release testing, Gemini 3 Flash Preview recorded a raw throughput of 218 output tokens per second. This makes it 22% slower than the previous 'non-reasoning' Gemini 2.5 Flash, but it is still significantly faster than rivals including OpenAI's GPT-5.1 high (125 t/s) and DeepSeek V3.2 reasoning (30 t/s).

    Most notably, Artificial Analysis crowned Gemini 3 Flash as the new leader in their AA-Omniscience knowledge benchmark, where it achieved the highest knowledge accuracy of any model tested to date. However, this intelligence comes with a 'reasoning tax': the model more than doubles its token usage compared to the 2.5 Flash series when tackling complex indexes.

    This high token density is offset by Google's aggressive pricing. Via the Gemini API, Gemini 3 Flash costs $0.50 per 1 million input tokens and $3 per 1 million output tokens, compared to $1.25/1M input and $10/1M output for Gemini 2.5 Pro. That lets Gemini 3 Flash claim the title of most cost-efficient model in its intelligence tier, despite being one of the most 'talkative' models in raw token volume. Here's how it stacks up against rival LLM offerings:

    Model                          Input (/1M)   Output (/1M)   Total Cost   Source
    Qwen 3 Turbo                   $0.05         $0.20          $0.25        Alibaba Cloud
    Grok 4.1 Fast (reasoning)      $0.20         $0.50          $0.70        xAI
    Grok 4.1 Fast (non-reasoning)  $0.20         $0.50          $0.70        xAI
    deepseek-chat (V3.2-Exp)       $0.28         $0.42          $0.70        DeepSeek
    deepseek-reasoner (V3.2-Exp)   $0.28         $0.42          $0.70        DeepSeek
    Qwen 3 Plus                    $0.40         $1.20          $1.60        Alibaba Cloud
    Gemini 3 Flash Preview         $0.50         $3.00          $3.50        Google
    ERNIE 5.0                      $0.85         $3.40          $4.25        Qianfan
    Claude Haiku 4.5               $1.00         $5.00          $6.00        Anthropic
    Qwen-Max                       $1.60         $6.40          $8.00        Alibaba Cloud
    Gemini 3 Pro (≤200K)           $2.00         $12.00         $14.00       Google
    GPT-5.2                        $1.75         $14.00         $15.75       OpenAI
    Claude Sonnet 4.5              $3.00         $15.00         $18.00       Anthropic
    Gemini 3 Pro (>200K)           $4.00         $18.00         $22.00       Google
    Claude Opus 4.5                $5.00         $25.00         $30.00       Anthropic
    GPT-5.2 Pro                    $21.00        $168.00        $189.00      OpenAI
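    Taking the list prices quoted above, the per-call economics can be sketched in a few lines of Python. The prices come from the article; the request sizes in the example are purely illustrative:

```python
# Illustrative cost calculator using list prices quoted in the article
# (USD per 1 million tokens). Request sizes below are made-up examples.

PRICES = {
    "Gemini 3 Flash Preview": (0.50, 3.00),
    "Gemini 2.5 Pro": (1.25, 10.00),  # quoted earlier in the article
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single call at list price."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens per request
flash = call_cost("Gemini 3 Flash Preview", 10_000, 2_000)
pro25 = call_cost("Gemini 2.5 Pro", 10_000, 2_000)
print(f"Flash:   ${flash:.4f} per call")   # $0.0110
print(f"2.5 Pro: ${pro25:.4f} per call")   # $0.0325
```

    Note that, per the benchmark data above, Flash can emit more than twice the tokens of the 2.5 series on complex tasks, so a fair comparison should scale the output-token count accordingly.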

    More ways to save

    But enterprise developers and users can cut costs further, because the model avoids the excessive deliberation of larger models, which racks up token usage. Google said the model “is able to modulate how much it thinks,” spending more thinking tokens on complex tasks and fewer on quick prompts. The company noted Gemini 3 Flash uses 30% fewer tokens than Gemini 2.5 Pro.

    To balance this new reasoning power with strict corporate latency requirements, Google has introduced a 'Thinking Level' parameter. Developers can toggle between 'Low'—to minimize cost and latency for simple chat tasks—and 'High'—to maximize reasoning depth for complex data extraction. This granular control allows teams to build 'variable-speed' applications that only consume expensive 'thinking tokens' when a problem actually demands PhD-level logic.
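    One way a team might exploit that toggle is a simple dispatcher that routes requests to a thinking level before calling the API. The heuristics, keyword list, and threshold below are hypothetical illustrations, not Google's API or recommendations:

```python
# Sketch of a "variable-speed" dispatcher: send quick chat turns at a
# low thinking level and reserve "high" for heavyweight extraction work.
# The keyword hints and length threshold are illustrative assumptions.

COMPLEX_HINTS = ("extract", "analyze", "reconcile", "audit")

def pick_thinking_level(prompt: str) -> str:
    """Return 'low' for simple prompts, 'high' for complex tasks."""
    lowered = prompt.lower()
    if len(prompt) > 2_000 or any(hint in lowered for hint in COMPLEX_HINTS):
        return "high"
    return "low"

print(pick_thinking_level("What are your opening hours?"))               # low
print(pick_thinking_level("Extract every line item from this invoice"))  # high
```

    The chosen level would then be passed to the model call, so only the genuinely hard requests pay the 'thinking token' premium.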

    The economic story extends beyond simple token prices. With the standard inclusion of Context Caching, enterprises processing massive, static datasets—such as entire legal libraries or codebase repositories—can see a 90% reduction in costs for repeated queries. When combined with the Batch API’s 50% discount, the total cost of ownership for a Gemini-powered agent drops significantly below that of competing frontier models.
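    To see how those discounts compound, here is a rough back-of-the-envelope model. The 90% and 50% figures come from the paragraph above; modeling the caching saving as a 90% discount on cached input tokens, and the token volumes themselves, are illustrative assumptions:

```python
# Rough effective-cost model for a repeated-query workload that combines
# context caching (modeled as 90% off cached input tokens, an assumption)
# with the Batch API's 50% discount. Token volumes are illustrative.

INPUT_RATE = 0.50 / 1_000_000   # Gemini 3 Flash list price, USD per input token
OUTPUT_RATE = 3.00 / 1_000_000  # USD per output token

def effective_cost(cached_in: int, fresh_in: int, out: int, batch: bool = False) -> float:
    """USD cost of one request; cached input billed at 10% of list price."""
    cost = (cached_in * INPUT_RATE * 0.10
            + fresh_in * INPUT_RATE
            + out * OUTPUT_RATE)
    return cost * 0.5 if batch else cost

# 500K tokens of a static legal library (cached), 5K fresh tokens, 1K out
naive = (500_000 + 5_000) * INPUT_RATE + 1_000 * OUTPUT_RATE
smart = effective_cost(500_000, 5_000, 1_000, batch=True)
print(f"no discounts:    ${naive:.5f} per query")
print(f"cached + batch:  ${smart:.5f} per query")
```

    On these assumed volumes the combined discounts cut the per-query cost by more than an order of magnitude, which is why cached, batched workloads dominate the total-cost-of-ownership argument.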

    “Gemini 3 Flash delivers exceptional performance on coding and agentic tasks combined with a lower price point, allowing teams to deploy sophisticated reasoning across high-volume processes without hitting cost barriers,” Google said.

    By offering a model that delivers strong multimodal performance at a more affordable price, Google is making the case that enterprises concerned with controlling their AI spend should choose its models, especially Gemini 3 Flash. 

    Strong benchmark performance 

    But how does Gemini 3 Flash stack up against other models in terms of its performance? 

    Doshi said the model scored 78% on SWE-Bench Verified, the benchmark for coding agents, outperforming both the preceding Gemini 2.5 family and even the newer Gemini 3 Pro.

    For enterprises, this means high-volume software maintenance and bug-fixing tasks can now be offloaded to a model that is both faster and cheaper than previous flagship models, without a degradation in code quality.

    The model also performed strongly on other benchmarks, scoring 81.2% on the MMMU Pro benchmark, comparable to Gemini 3 Pro. 

    While most Flash-type models are explicitly optimized for short, quick tasks like generating code, Google claims Gemini 3 Flash’s performance “in reasoning, tool use and multimodal capabilities is ideal for developers looking to do more complex video analysis, data extraction and visual Q&A, which means it can enable more intelligent applications — like in-game assistants or A/B test experiments — that demand both quick answers and deep reasoning.”

    First impressions from early users

    So far, early users have been largely impressed with the model, particularly its benchmark performance. 

    What It Means for Enterprise AI Usage

    With Gemini 3 Flash now serving as the default engine across Google Search and the Gemini app, we are witnessing the "Flash-ification" of frontier intelligence. By making Pro-level reasoning the new baseline, Google is setting a trap for slower incumbents.

    The integration into platforms like Google Antigravity suggests that Google isn't just selling a model; it's selling the infrastructure for the autonomous enterprise.

    As developers hit the ground running with 3x faster speeds and a 90% discount on context caching, the "Gemini-first" strategy becomes a compelling financial argument. In the high-velocity race for AI dominance, Gemini 3 Flash may be the model that finally turns "vibe coding" from an experimental hobby into a production-ready reality.