When “AI-Powered” Hides Domain Incompetence
by Altravista, AI

Why Many Chatbots Fail in Complex B2B Ecommerce
The problem: when “AI-powered” does not mean “fit for purpose”
An ecommerce manager at an industrial distributor is facing the following situation:
* 40% of customer support requests are about technical compatibility between components
* The newly implemented “AI-powered” chatbot correctly answers only 15% of those questions
The vendor sold the solution as *“GPT-4-based”* and therefore, at least on paper, technically advanced.
So what went wrong?
The problem is not the technology, but a fundamental misunderstanding:
the ability to generate coherent text is not the same as the ability to understand and apply complex technical documentation.
This is a common mistake in B2B ecommerce: confusing the sophistication of the underlying model with the suitability of the architecture built around it.

Capability classification: where we really are
AI literature typically distinguishes three conceptual levels.
ANI (Artificial Narrow Intelligence)
Status: available and in production today
Characteristic: excels at specific tasks, fails outside its training domain
This is what we use every day: email classification, image recognition, text analysis, generative LLMs.
All commercially available AI systems today fall into this category.
Practical implication:
a model trained on generic conversations knows nothing about your product codes, compatibility rules, or business logic.
Without targeted integration, it simply cannot answer correctly.
AGI (Artificial General Intelligence)
Status: hypothetical
Characteristic: human-level general cognitive capabilities
Despite some marketing claims, no current system qualifies as AGI, not even the most advanced models.
ASI (Artificial Superintelligence)
Status: theoretical
Characteristic: intelligence beyond human capabilities in every domain
Relevant for philosophical debate, not for operational decisions.
Operational conclusion:
any solution promising “general intelligence” is selling a concept that does not exist today.
You are buying an ANI. Period.
Functional classification: what actually matters
Within ANI, there are fundamentally different architectures.
Understanding them is essential to evaluate whether a solution can solve *your* specific problem.
1. Pure generative LLMs
Function: text generation based on statistical patterns
Examples: base ChatGPT, Claude, Gemini (direct interfaces)
What they do well
* Content rephrasing
* Generic copy generation
* Non-critical conversations
What they do NOT do
* Access company-specific data
* Understand proprietary business logic
* Guarantee accuracy for technical information
B2B ecommerce:
useful for product descriptions, inadequate for technical support without data integration.
2. RAG agents (Retrieval-Augmented Generation)
Function: LLM + semantic search over proprietary documentation
Simplified architecture
1. User query → semantic embedding
2. Vector search over indexed documents
3. Retrieval of relevant content
4. Answer generation grounded in real sources
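The four steps above can be sketched in a few lines. This is a minimal illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the `DOCUMENTS` snippets and their ids are invented for the example, and the final LLM call is left out, so the sketch stops at building a grounded prompt.

```python
import math
import re
from collections import Counter

# Hypothetical datasheet snippets; a real system would index full documents.
DOCUMENTS = {
    "ds-relay-r12": "Relay R12 is compatible with controller C300 at 24 V DC.",
    "ds-sensor-s7": "Sensor S7 requires a 5 V supply and an I2C bus.",
    "ds-cable-k2": "Cable K2 is rated for outdoor use up to 60 degrees Celsius.",
}


def embed(text: str) -> Counter:
    """Step 1 (toy version): bag-of-words counts instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, top_k: int = 1) -> list[tuple[str, float]]:
    """Steps 2-3: score every document against the query and keep the best ones."""
    q = embed(query)
    scored = [(doc_id, cosine(q, embed(text))) for doc_id, text in DOCUMENTS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def build_prompt(query: str, top_k: int = 1) -> str:
    """Step 4 (input side): a prompt that ties the answer to cited sources."""
    hits = retrieve(query, top_k)
    context = "\n".join(f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id, _ in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {query}"
    )


print(build_prompt("Is relay R12 compatible with controller C300?"))
```

Even in this toy form, retrieval quality depends entirely on the query and the documents sharing vocabulary, which is why consistent terminology appears among the requirements below.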
What they do well
* Use actual technical documentation
* Reduce hallucinations
* Provide traceable sources
What they require
* Well-structured documentation
* Consistent terminology
* A carefully designed indexing process
B2B ecommerce:
technical support for compatibility questions, when answers explicitly exist in manuals or datasheets.
Typical failure:
fragmented or inconsistent documentation → poor retrieval → incorrect answers.
3. Specialized agents (fine-tuned models)
Function: models trained on domain-specific datasets
Critical difference:
the model learns technical patterns *during training*, not only at query time.
What they do well
* Understand niche technical terminology
* Recognize complex recurring patterns
* Reduce errors in repetitive domain tasks
What they require
* Labeled training datasets
* ML expertise
* Rigorous validation processes
B2B ecommerce:
automatic classification of large catalogs where deterministic rules are insufficient.
4. Multi-agent systems
Function: orchestration of agents with clearly defined responsibilities
Typical setup
* Technical compatibility agent
* Pricing agent
* Availability agent
* Synthesis agent
Strengths
* Multi-step decision processes
* Better controllability
* Higher traceability
B2B ecommerce:
complex configurators combining technical and commercial rules.
5. Tool-enabled agents (function calling)
Function: LLMs that invoke APIs and external systems
Mechanism
1. Request analysis
2. API call (ERP, PIM, WMS)
3. Structured data retrieval
4. Answer generation
Key difference vs RAG:
RAG searches static documents; tool-enabled agents query live systems.
B2B ecommerce:
real-time pricing, stock, lead times, and commercial conditions.
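As a rough sketch of the mechanism, the code below shows only the dispatch side: the model is assumed to emit a tool call as JSON, and the application executes it against a live system. The `_FAKE_ERP` dictionary, the `get_stock` tool, and the JSON shape with `name` and `arguments` are assumptions made for illustration; real vendor APIs differ in their exact format.

```python
import json

# Hypothetical in-memory stand-in for a live ERP/WMS; a real tool would call an API.
_FAKE_ERP = {"SKU-1001": {"stock": 42, "lead_time_days": 3, "price_eur": 18.50}}


def get_stock(sku: str) -> dict:
    """Return structured, current data for one SKU, or an explicit error."""
    record = _FAKE_ERP.get(sku)
    if record is None:
        return {"error": f"unknown SKU {sku}"}
    return {"sku": sku, **record}


# Registry of tools the model is allowed to invoke.
TOOLS = {"get_stock": get_stock}


def dispatch(tool_call_json: str) -> dict:
    """Execute a tool call emitted by the model as JSON text."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        # Never execute something outside the registry.
        return {"error": f"unknown tool {call['name']}"}
    return tool(**call["arguments"])


print(dispatch('{"name": "get_stock", "arguments": {"sku": "SKU-1001"}}'))
```

The structured result is then passed back to the model for answer generation. Keeping an explicit registry means the model can only reach systems you have deliberately exposed.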

How to evaluate an AI solution: the right questions
When a vendor proposes “AI for ecommerce”, ask:
1. Which architecture are you using?
2. Which data does the system actually access?
3. How is accuracy measured?
4. What happens when a query is out of domain?
Red flag:
“It's based on GPT-4” as the only explanation.
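Questions 3 and 4 can be made concrete with a small evaluation harness run on your own questions. The sketch below is one possible setup, not a standard: the `Case` structure, the `"REFUSE"` label for declined answers, and exact-match scoring are all simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Case:
    question: str
    expected: str    # ground-truth label, e.g. "compatible"
    in_domain: bool  # False for questions the system should decline


def evaluate(answer_fn, cases):
    """Accuracy on in-domain cases, refusal rate on out-of-domain ones."""
    in_dom = [c for c in cases if c.in_domain]
    out_dom = [c for c in cases if not c.in_domain]
    correct = sum(answer_fn(c.question) == c.expected for c in in_dom)
    refused = sum(answer_fn(c.question) == "REFUSE" for c in out_dom)
    return {
        "accuracy": correct / len(in_dom) if in_dom else None,
        "refusal_rate": refused / len(out_dom) if out_dom else None,
    }
```

Here `answer_fn` is whatever wraps the vendor's system and maps its reply to a label. A vendor who can run something like this on a sample of your real support tickets is answering the questions above with evidence rather than with a model name.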
Practical case: B2B technical support
Scenario:
industrial electronics distributor, 50,000 SKUs, B2B customers.
Typical question:
“Is component X compatible with system Y from manufacturer Z?”
Generic LLM
* No access to compatibility data
* Invented or evasive answers
* Escalation to human support
RAG on datasheets
* 60–70% accuracy
* Fails on implicit or cross-document compatibility
Multi-agent + tool-enabled system
1. Extract technical specifications
2. Query compatibility database
3. Perform technical reasoning
4. Generate answer with confidence level
Result: ~90% accuracy and a significant reduction in support tickets.
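The four-step pipeline might look like the following in its simplest form. Everything here is illustrative: the part-code pattern, the `COMPATIBILITY` table, and the all-or-nothing confidence values are assumptions, and the technical reasoning step is reduced to a database lookup with escalation when no record exists.

```python
import re

# Hypothetical compatibility table keyed by (component, system).
COMPATIBILITY = {("R12", "C300"): True, ("R12", "C100"): False}


def extract_parts(question: str) -> list[str]:
    """Step 1: pull part codes such as 'R12' or 'C300' out of the question."""
    return re.findall(r"\b[A-Z]\d{2,4}\b", question)


def check_compatibility(component: str, system: str):
    """Step 2: look up the pair; None means the database has no record."""
    return COMPATIBILITY.get((component, system))


def answer(question: str) -> dict:
    parts = extract_parts(question)
    if len(parts) != 2:
        # Cannot identify exactly one component and one system.
        return {"answer": "escalate", "confidence": 0.0}
    verdict = check_compatibility(parts[0], parts[1])
    if verdict is None:
        # Step 3 placeholder: no record, so hand over to a human rather than guess.
        return {"answer": "escalate", "confidence": 0.0}
    # Step 4: a recorded pair gets full confidence in this toy version.
    label = "compatible" if verdict else "incompatible"
    return {"answer": label, "confidence": 1.0}


print(answer("Is R12 compatible with C300?"))
```

The design point is the explicit escalation path: when the data is missing, the system says so instead of generating a plausible-sounding answer.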
Operational conclusion
“AI” is an empty container.
The real question is:
Which architecture, trained on which data, integrated with which systems, solves my operational problem?
Three guiding principles
1. Start from the process, not the technology
2. Be skeptical of generic solutions
3. Measure everything with clear KPIs
The value is not in AI itself, but in the ability to integrate advanced language processing with your data, your processes, and your domain expertise.
If a vendor cannot show you architecture, data sources, and accuracy metrics on *your* specific use case, they are not selling a solution.
They are selling hype.
