When “AI-Powered” Hides Domain Incompetence

by Altravista, AI

Why Many Chatbots Fail in Complex B2B Ecommerce


The problem: when “AI-powered” does not mean “fit for purpose”

An ecommerce manager at an industrial distributor is facing the following situation:

* 40% of customer support requests are about technical compatibility between components

* The newly implemented “AI-powered” chatbot answers only 15% of those questions correctly

The vendor sold the solution as *“GPT-4-based”* and therefore, at least on paper, technically advanced.

So what went wrong?

The problem is not the technology, but a fundamental misunderstanding:

the ability to generate coherent text is not the same as the ability to understand and apply complex technical documentation.

This is a common mistake in B2B ecommerce: confusing the sophistication of the underlying model with the suitability of the architecture built around it.

Capability classification: where we really are

AI literature typically distinguishes three conceptual levels.

ANI (Artificial Narrow Intelligence)

Status: available and in production today

Characteristic: excels at specific tasks, fails outside its training domain

This is what we use every day: email classification, image recognition, text analysis, generative LLMs.

All commercially available AI systems today fall into this category.

Practical implication:

a model trained on generic conversations knows nothing about your product codes, compatibility rules, or business logic.

Without targeted integration, it simply cannot answer correctly.


AGI (Artificial General Intelligence)

Status: hypothetical

Characteristic: human-level general cognitive capabilities

Despite some marketing claims, no current system qualifies as AGI, not even the most advanced models.


ASI (Artificial Superintelligence)

Status: theoretical

Characteristic: intelligence beyond human capabilities in every domain

Relevant for philosophical debate, not for operational decisions.

Operational conclusion:

any solution promising “general intelligence” is selling a concept that does not exist today.

You are buying an ANI. Period.


Functional classification: what actually matters

Within ANI, there are fundamentally different architectures.

Understanding them is essential to evaluate whether a solution can solve *your* specific problem.


1. Pure generative LLMs

Function: text generation based on statistical patterns

Examples: base ChatGPT, Claude, Gemini (direct interfaces)

What they do well

* Content rephrasing

* Generic copy generation

* Non-critical conversations

What they do NOT do

* Access company-specific data

* Understand proprietary business logic

* Guarantee accuracy for technical information

B2B ecommerce:

useful for product descriptions, inadequate for technical support without data integration.


2. RAG agents (Retrieval-Augmented Generation)

Function: LLM + semantic search over proprietary documentation

Simplified architecture

1. User query → semantic embedding

2. Vector search over indexed documents

3. Retrieval of relevant content

4. Answer generation grounded in real sources
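The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the datasheet snippets and file names are hypothetical, and a bag-of-words vector stands in for a real semantic embedding model so that the example runs with the standard library only. The final step stops at building the grounded prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical datasheet snippets standing in for indexed documentation.
DOCUMENTS = {
    "relay-r24.pdf": "Relay R24 coil voltage 24 V DC, compatible with DIN rail socket S10.",
    "socket-s10.pdf": "Socket S10 fits DIN rail, accepts relays with 24 V DC coil.",
    "sensor-p5.pdf": "Proximity sensor P5 supply 12 V DC, M12 connector.",
}

def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a semantic embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Indexing happens once, ahead of query time.
INDEX = {name: embed(text) for name, text in DOCUMENTS.items()}

def retrieve(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Steps 1-3: embed the query, score every document, keep the top k."""
    q = embed(query)
    scored = [(name, cosine(q, vec)) for name, vec in INDEX.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def build_prompt(query: str, hits: list[tuple[str, float]]) -> str:
    """Step 4 (input side): ground the generation in the retrieved sources."""
    context = "\n".join(f"[{name}] {DOCUMENTS[name]}" for name, _ in hits)
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

if __name__ == "__main__":
    question = "Is relay R24 compatible with socket S10?"
    print(build_prompt(question, retrieve(question)))
```

Because each retrieved passage carries its file name into the prompt, the generated answer can cite its sources. The same mechanics also show why inconsistent terminology hurts: if one document says "relay" and another says "contactor" for the same part, the similarity score drops and the relevant passage may never be retrieved.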

What they do well

* Use actual technical documentation

* Reduce hallucinations

* Provide traceable sources

What they require

* Well-structured documentation

* Consistent terminology

* A carefully designed indexing process

B2B ecommerce:

technical support for compatibility questions, when answers explicitly exist in manuals or datasheets.

Typical failure:

fragmented or inconsistent documentation → poor retrieval → incorrect answers.


3. Specialized agents (fine-tuned models)

Function: models trained on domain-specific datasets

Critical difference:

the model learns technical patterns *during training*, not only at query time.

What they do well

* Understand niche technical terminology

* Recognize complex recurring patterns

* Reduce errors in repetitive domain tasks

What they require

* Labeled training datasets

* ML expertise

* Rigorous validation processes

B2B ecommerce:

automatic classification of large catalogs where deterministic rules are insufficient.


4. Multi-agent systems

Function: orchestration of agents with clearly defined responsibilities

Typical setup

* Technical compatibility agent

* Pricing agent

* Availability agent

* Synthesis agent

Strengths

* Multi-step decision processes

* Better controllability

* Higher traceability

B2B ecommerce:

complex configurators combining technical and commercial rules.
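As a rough sketch of this kind of orchestration, the Python below wires four agents together, each with a single responsibility. Everything here is an assumption made for illustration: the SKUs, prices, stock levels, and compatibility table are invented, and each "agent" is a plain function, whereas in a real system some of them would wrap an LLM or an external service.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    agent: str    # which agent produced the finding (keeps the result traceable)
    ok: bool      # whether this aspect of the configuration passes
    detail: str   # human-readable explanation

# Hypothetical data that each agent would normally fetch from its own source.
COMPATIBLE = {("R24", "S10")}
PRICES = {"R24": 12.50, "S10": 4.20}
STOCK = {"R24": 140, "S10": 0}

def compatibility_agent(part: str, host: str) -> Finding:
    ok = (part, host) in COMPATIBLE
    status = "compatible" if ok else "no known match"
    return Finding("compatibility", ok, f"{part} on {host}: {status}")

def pricing_agent(skus: list[str]) -> Finding:
    total = sum(PRICES[s] for s in skus)
    return Finding("pricing", True, f"total {total:.2f} EUR")

def availability_agent(skus: list[str]) -> Finding:
    missing = [s for s in skus if STOCK.get(s, 0) == 0]
    detail = "all in stock" if not missing else f"out of stock: {', '.join(missing)}"
    return Finding("availability", not missing, detail)

def synthesis_agent(findings: list[Finding]) -> str:
    """Combine the individual findings into one answer, one line per agent."""
    header = "Configuration valid" if all(f.ok for f in findings) else "Configuration needs review"
    return header + "\n" + "\n".join(f"- {f.agent}: {f.detail}" for f in findings)

def configure(part: str, host: str) -> str:
    skus = [part, host]
    findings = [
        compatibility_agent(part, host),
        pricing_agent(skus),
        availability_agent(skus),
    ]
    return synthesis_agent(findings)

if __name__ == "__main__":
    print(configure("R24", "S10"))
```

Keeping one finding per agent is what gives the controllability and traceability listed above: when an answer is wrong, it is clear which agent produced the faulty piece.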


5. Tool-enabled agents (function calling)

Function: LLMs that invoke APIs and external systems

Mechanism

1. Request analysis

2. API call (ERP, PIM, WMS)

3. Structured data retrieval

4. Answer generation
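The mechanism can be illustrated with a small Python sketch. The assumptions are significant: the ERP and WMS are replaced by in-memory dictionaries, the tool names `get_price` and `get_stock` are hypothetical, and `plan_tool_call` is a keyword-based stand-in for the LLM, which in a real deployment would decide which function to call and with which arguments.

```python
import json
from typing import Optional

# Hypothetical in-memory stand-ins for live ERP and WMS systems.
ERP_PRICES = {"R24": 12.50}
WMS_STOCK = {"R24": 140}

def get_price(sku: str) -> dict:
    return {"sku": sku, "unit_price_eur": ERP_PRICES[sku]}

def get_stock(sku: str) -> dict:
    return {"sku": sku, "units_available": WMS_STOCK[sku]}

# Registry of the tools the agent is allowed to invoke.
TOOLS = {"get_price": get_price, "get_stock": get_stock}

def plan_tool_call(request: str) -> Optional[str]:
    """Stand-in for the LLM: emits a JSON tool call, or None if no known SKU is mentioned."""
    tokens = [tok.strip("?.,") for tok in request.split()]
    sku = next((tok for tok in tokens if tok in ERP_PRICES), None)
    if sku is None:
        return None
    name = "get_stock" if "stock" in request.lower() else "get_price"
    return json.dumps({"name": name, "arguments": {"sku": sku}})

def handle(request: str) -> str:
    raw_call = plan_tool_call(request)                 # 1. request analysis
    if raw_call is None:
        return "I could not identify a product in this request."
    call = json.loads(raw_call)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return "This request needs a system I cannot access."
    data = tool(**call["arguments"])                   # 2-3. API call, structured data
    return f"Live data for {data['sku']}: {json.dumps(data)}"  # 4. answer generation

if __name__ == "__main__":
    print(handle("How many R24 are in stock?"))
    print(handle("What is the price of R24?"))
```

The model never invents the figure it reports: the number in the answer comes from the system of record at the moment of the request, and only the tools listed in the registry can be invoked.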

Key difference vs RAG:

RAG searches static documents; tool-enabled agents query live systems.

B2B ecommerce:

real-time pricing, stock, lead times, and commercial conditions.

How to evaluate an AI solution: the right questions

When a vendor proposes “AI for ecommerce”, ask:

1. Which architecture are you using?

2. Which data does the system actually access?

3. How is accuracy measured?

4. What happens when a query is out of domain?

Red flag:

“It's based on GPT-4” as the only explanation.
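Questions 3 and 4 can be made concrete with a small evaluation harness. The sketch below is one possible way to do it, under stated assumptions: the test cases are invented, an expected answer of `None` marks an out-of-domain question where the desired behaviour is to abstain, and `always_yes_bot` is a deliberately naive placeholder for the system under test.

```python
from typing import Callable, Optional

# Hypothetical labeled test set. None means "the bot should abstain or escalate".
TEST_SET = [
    ("Is R24 compatible with S10?", "yes"),
    ("Is P5 compatible with S10?", "no"),
    ("Can you write me a poem?", None),
]

Bot = Callable[[str], Optional[str]]

def evaluate(bot: Bot, cases: list[tuple[str, Optional[str]]]) -> dict:
    """Report in-domain accuracy and out-of-domain abstention as separate metrics."""
    in_domain = [(q, a) for q, a in cases if a is not None]
    out_of_domain = [q for q, a in cases if a is None]
    correct = sum(bot(q) == a for q, a in in_domain)
    abstained = sum(bot(q) is None for q in out_of_domain)
    return {
        "accuracy": correct / len(in_domain) if in_domain else 0.0,
        "abstention_rate": abstained / len(out_of_domain) if out_of_domain else 0.0,
    }

def always_yes_bot(question: str) -> Optional[str]:
    """Naive placeholder bot: fluent, confident, and frequently wrong."""
    return "yes"

if __name__ == "__main__":
    print(evaluate(always_yes_bot, TEST_SET))
```

Reporting the two numbers separately matters: a bot that answers everything can look acceptable on accuracy alone while never admitting that a question is outside its domain. A vendor should be able to run this kind of evaluation on a test set built from your own support tickets.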


Practical case: B2B technical support

Scenario:

industrial electronics distributor, 50,000 SKUs, B2B customers.

Typical question:

“Is component X compatible with system Y from manufacturer Z?”

Generic LLM

* No access to compatibility data

* Invented or evasive answers

* Escalation to human support

RAG on datasheets

* 60–70% accuracy

* Fails on implicit or cross-document compatibility

Multi-agent + tool-enabled system

1. Extract technical specifications

2. Query compatibility database

3. Perform technical reasoning

4. Generate answer with confidence level

Result: ~90% accuracy and a significant reduction in support tickets.
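To show how a confidence level can be attached to the answer, here is a minimal Python sketch of the four steps. It is an illustration under assumptions, not the system described above: the SKU pattern, the specification table, the compatibility records, and the single voltage-matching rule are all hypothetical, and the "technical reasoning" step is reduced to one deterministic check.

```python
import re

# Hypothetical specification table and explicit compatibility records.
SPECS = {
    "R24": {"voltage": 24},
    "S10": {"voltage": 24},
    "P5": {"voltage": 12},
}
COMPAT_DB = {("R24", "S10"): True}

def extract_skus(question: str) -> list[str]:
    """Step 1: pull SKU-like tokens (one capital letter followed by digits) from the question."""
    return re.findall(r"\b[A-Z]\d+\b", question)

def check_compatibility(question: str) -> dict:
    skus = extract_skus(question)
    if len(skus) != 2 or any(s not in SPECS for s in skus):
        # Unknown or ambiguous products: do not guess, hand over to a human.
        return {"answer": "unknown", "confidence": "low"}
    part, host = skus
    if (part, host) in COMPAT_DB:
        # Step 2: an explicit record in the compatibility database.
        return {"answer": "yes" if COMPAT_DB[(part, host)] else "no", "confidence": "high"}
    # Step 3: fall back to reasoning over specifications (here, a single rule).
    same_voltage = SPECS[part]["voltage"] == SPECS[host]["voltage"]
    # Step 4: an inferred answer carries lower confidence than a recorded one.
    return {"answer": "likely yes" if same_voltage else "no", "confidence": "medium"}

if __name__ == "__main__":
    print(check_compatibility("Is component R24 compatible with system S10?"))
    print(check_compatibility("Is P5 compatible with S10?"))
```

The confidence field is what makes selective escalation possible: high-confidence answers go straight to the customer, while low-confidence ones are routed to human support instead of being presented as fact.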


Operational conclusion

“AI” is an empty container.

The real question is:

Which architecture, trained on which data, integrated with which systems, solves my operational problem?

Three guiding principles

1. Start from the process, not the technology

2. Be skeptical of generic solutions

3. Measure everything with clear KPIs

The value is not in AI itself, but in the ability to integrate advanced language processing with your data, your processes, and your domain expertise.

If a vendor cannot show you architecture, data sources, and accuracy metrics on *your* specific use case, they are not selling a solution.

They are selling hype.
