This research is the result of a collaboration between Ali Can Kocabiyikoglu, Pascal Zaragoza, and Vincent Laval, part of the AI Laboratory | BL Research Team.
The Challenge of Modern Maintenance Support
A maintenance technician faces an unfamiliar “Error 20” on a medical device. Their hands are busy with tools, eyes on the equipment, yet they need to access repair documentation. This scenario highlights a fundamental problem: traditional maintenance support—paper manuals, tablets, phone calls—requires technicians to interrupt their workflow and divide their attention.
The cognitive overhead of switching between physical repairs and information retrieval creates inefficiencies and increases error risks. Technicians need both hands free for repairs while simultaneously accessing maintenance histories, procedures, and parts information.
The Coruscant Project addresses this challenge through a voice-first AI assistant designed for industrial maintenance. Leveraging OpenAI’s Realtime API and function-calling architecture, technicians can query maintenance databases, retrieve repair procedures, check inventory, and generate reports—all through natural voice interaction while keeping their focus on the equipment.
The Technical Challenge

Industrial maintenance demands more than simple chatbot interactions. Unlike consumer voice assistants that handle isolated queries, maintenance support requires real-time, contextual, multimodal interaction. While the industry explores full-duplex conversation systems, practical field applications often benefit from well-implemented half-duplex designs that balance conversational fluidity with system reliability. Four requirements shaped the design:
- Real-time voice interaction: Using OpenAI’s Realtime API, the system provides low-latency responses while maintaining clear turn-taking patterns. This half-duplex approach ensures reliable communication in noisy industrial environments while still feeling natural and responsive.
- Function calling architecture: The system must access multiple data sources—maintenance histories, repair procedures, inventory databases—and intelligently route queries to appropriate backends. This goes beyond simple RAG implementations to orchestrated agent workflows.
- Dynamic context switching: Technicians move between different tasks—diagnosing issues, checking parts availability, documenting repairs. The AI must maintain context across these workflow transitions while adapting its responses appropriately.
- Mobile-first design: Field technicians work in diverse environments with varying network conditions. The solution must be responsive, reliable, and optimized for mobile devices while maintaining voice interaction quality.
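The half-duplex turn-taking described in the first requirement can be pictured as a small state machine. The sketch below is a minimal illustration, not the project's implementation: the class, state, and method names are hypothetical, and real audio capture and playback are left out. It only shows the core rule that microphone input is dropped while the assistant is thinking or speaking.

```python
from enum import Enum, auto


class Turn(Enum):
    LISTENING = auto()  # microphone open, assistant silent
    THINKING = auto()   # utterance sent, waiting for the model
    SPEAKING = auto()   # assistant audio playing, microphone gated


class HalfDuplexSession:
    """Hypothetical turn-taking controller: only one side talks at a time."""

    def __init__(self) -> None:
        self.turn = Turn.LISTENING
        self.captured: list[bytes] = []

    def on_mic_chunk(self, chunk: bytes) -> bool:
        """Keep audio only while listening; drop it otherwise."""
        if self.turn is not Turn.LISTENING:
            return False
        self.captured.append(chunk)
        return True

    def on_user_finished(self) -> bytes:
        """End of the user's turn: hand over the utterance, stop listening."""
        utterance = b"".join(self.captured)
        self.captured.clear()
        self.turn = Turn.THINKING
        return utterance

    def on_response_started(self) -> None:
        self.turn = Turn.SPEAKING

    def on_response_finished(self) -> None:
        # The assistant is done; reopen the microphone for the next turn.
        self.turn = Turn.LISTENING
```

Gating the microphone during playback is what keeps workshop noise from being interpreted as an interruption, at the cost of not letting the technician barge in mid-answer.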
Architecture & Approach
The Coruscant system leverages a modern technology stack designed for reliability and scalability in industrial settings, following the principle of using the right LLM for the right task. For speech-to-speech interaction, the system employs an event-based realtime API from OpenAI, providing natural voice conversations with minimal latency. This is wrapped in a Flutter mobile application, ensuring consistent performance across iOS and Android devices used by field technicians.
The underlying agent architecture selects LLMs of different sizes according to workflow requirements. Simple queries and routine maintenance lookups use smaller, faster models for rapid response times. Complex diagnostic reasoning uses larger models with stronger analytical capabilities. For image processing and OCR tasks within the RAG pipeline, the system uses Pixtral, enabling technicians to extract information from equipment manuals, warning labels, and technical diagrams. This multi-model approach lets the system balance speed, accuracy, and cost according to the specific maintenance task at hand.
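One way to picture this model selection is a small routing table keyed by task type. The sketch below is an assumption-laden illustration: the task kinds and the "small-fast-model" and "large-reasoning-model" identifiers are placeholders, and the fallback rule is a design guess rather than something the project documents. Only the use of Pixtral for image and OCR work comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str
    reason: str


# Placeholder identifiers, not the project's actual configuration.
ROUTES = {
    "lookup": ModelChoice("small-fast-model", "routine lookup, latency matters"),
    "diagnosis": ModelChoice("large-reasoning-model", "multi-step diagnostic reasoning"),
    "document_image": ModelChoice("pixtral", "OCR and image understanding in the RAG pipeline"),
}


def choose_model(task_kind: str) -> ModelChoice:
    """Pick a model for a task kind.

    Assumption: unknown task kinds fall back to the larger model,
    preferring accuracy over speed when the task is not recognized.
    """
    return ROUTES.get(task_kind, ROUTES["diagnosis"])
```

For example, `choose_model("lookup")` returns the small model entry, while an unrecognized kind such as `"unknown"` falls back to the diagnosis entry.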
The backend architecture employs an agent-based design that connects seamlessly to Carl Source systems—the company’s existing maintenance infrastructure. This integration approach preserves existing investments while adding intelligent voice capabilities.
Workflow orchestration is achieved through specialized agents, each handling specific maintenance tasks:
- WF1 – Historical Intervention Search: Implements RAG (Retrieval-Augmented Generation) to query past maintenance records. When a technician encounters an issue, this workflow searches through thousands of intervention reports to find similar problems and their solutions.
- WF2 – Procedure & Inventory Management: Simultaneously queries repair procedure databases and inventory systems. This parallel processing ensures technicians receive both the repair instructions and parts availability in a single interaction.
- WF3 – Automated Report Generation: Captures intervention details through conversation and generates standardized maintenance reports, eliminating post-repair paperwork.
A key element of the design is implicit intent detection through function calling. Rather than requiring technicians to explicitly state their intent or navigate menus, the system interprets natural language and automatically routes to the appropriate workflow. This approach significantly reduces friction in high-stress maintenance scenarios.
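As a rough illustration of function-calling-based routing, the sketch below declares one tool per workflow and dispatches a model-selected call to a local handler. It assumes the model returns a function name plus JSON-encoded arguments, which is the general shape of function calling. The tool names, parameters, and handler bodies are hypothetical and do not reflect the project's actual schemas.

```python
import json


def tool(name: str, description: str, *params: str) -> dict:
    """Build a JSON-Schema style tool description with string parameters."""
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {p: {"type": "string"} for p in params},
            "required": list(params),
        },
    }


# Hypothetical tool declarations, one per workflow.
TOOLS = [
    tool("search_interventions", "Find similar past interventions (WF1).",
         "equipment_id", "symptom"),
    tool("get_procedure_and_stock", "Fetch repair steps and part availability (WF2).",
         "equipment_id", "part_number"),
    tool("draft_report", "Draft the intervention report (WF3).",
         "equipment_id", "summary"),
]


def search_interventions(equipment_id: str, symptom: str) -> str:
    return f"WF1: searching past interventions on {equipment_id} for '{symptom}'"


def get_procedure_and_stock(equipment_id: str, part_number: str) -> str:
    return f"WF2: fetching procedure for {equipment_id} and stock for {part_number}"


def draft_report(equipment_id: str, summary: str) -> str:
    return f"WF3: drafting report for {equipment_id}: {summary}"


HANDLERS = {
    "search_interventions": search_interventions,
    "get_procedure_and_stock": get_procedure_and_stock,
    "draft_report": draft_report,
}


def dispatch(name: str, arguments_json: str) -> str:
    """Route a model-selected function call to the matching workflow handler."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(**json.loads(arguments_json))
```

The technician never names a workflow. The model chooses the tool from the descriptions, so the wording of each `description` field carries most of the routing logic.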
The underlying orchestration leverages LangGraph for building sophisticated agentic workflows. Unlike simple linear pipelines, LangGraph enables the creation of stateful, graph-based agent systems that can handle complex decision trees and conditional logic. For maintenance scenarios, this means agents can dynamically adapt their behavior based on available information, system responses, and technician feedback.
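To make the idea of a stateful graph with conditional edges concrete, here is a minimal plain-Python sketch. It deliberately does not use LangGraph's API. The node names, the state keys, and the tiny in-memory history are all hypothetical, chosen only to show how the next node is picked from the current state instead of from a fixed sequence.

```python
from typing import Callable, Dict, Optional

State = dict
Node = Callable[[State], State]
Router = Callable[[State], Optional[str]]

# Toy stand-in for historical intervention data.
KNOWN_FAULTS = {"20": "system overheating fault"}


def diagnose(state: State) -> State:
    return {**state, "diagnosis": KNOWN_FAULTS.get(state["error_code"])}


def propose_fix(state: State) -> State:
    return {**state, "reply": f"Known issue: {state['diagnosis']}"}


def ask_technician(state: State) -> State:
    return {**state, "reply": "Can you describe any unusual sounds or smells?"}


NODES: Dict[str, Node] = {
    "diagnose": diagnose,
    "propose_fix": propose_fix,
    "ask_technician": ask_technician,
}


def route_after_diagnose(state: State) -> Optional[str]:
    # Conditional edge: the next node depends on what the state contains.
    return "propose_fix" if state.get("diagnosis") else "ask_technician"


EDGES: Dict[str, Router] = {
    "diagnose": route_after_diagnose,
    "propose_fix": lambda state: None,     # terminal node
    "ask_technician": lambda state: None,  # terminal node
}


def run(entry: str, state: State) -> State:
    """Walk the graph from `entry` until a router returns None."""
    current: Optional[str] = entry
    while current is not None:
        state = NODES[current](state)
        current = EDGES[current](state)
    return state
```

Running `run("diagnose", {"error_code": "20"})` follows the `propose_fix` branch, while an unknown code such as `"99"` ends at `ask_technician`.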
Human-in-the-loop workflows are essential in industrial maintenance. The system recognizes when it lacks sufficient information and actively requests technician input. For instance, if historical data is inconclusive, Coruscant might ask: “Can you describe any unusual sounds or smells?” Future iterations will support image uploads, allowing technicians to share visual information for complex diagnostics. LangGraph’s architecture elegantly handles these interruptions, maintaining conversation state while incorporating new human-provided context into the decision process.
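The pause-and-resume behaviour can be illustrated with a Python generator: the workflow yields a question, stays suspended with its local state intact, and continues once the answer is sent back in. This is only an analogy under stated assumptions, not LangGraph's interruption mechanism. The function names and the summary format are hypothetical.

```python
from typing import Generator, Tuple


def diagnostic_workflow(error_code: str) -> Generator[str, str, str]:
    """Hypothetical workflow that needs one piece of technician input."""
    context = [f"error code {error_code}"]
    # Pause here: the question goes out, the answer comes back via send().
    answer = yield "Can you describe any unusual sounds or smells?"
    context.append(f"technician reports: {answer}")
    return "; ".join(context)


def resume(workflow: Generator[str, str, str], answer: str) -> Tuple[bool, str]:
    """Feed an answer back in; return (finished, next question or final summary)."""
    try:
        return False, workflow.send(answer)
    except StopIteration as finished:
        return True, finished.value
```

A caller would start the workflow with `next(workflow)` to obtain the first question, speak it to the technician, and later call `resume(workflow, answer)` with the transcribed reply. The `context` list survives the pause, which is the property that matters for keeping conversation state across interruptions.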
Real-World Impact
The system’s effectiveness is best illustrated through a complete maintenance scenario. Jean-Michel, facing an unfamiliar “Error 20” on medical equipment, simply speaks to Coruscant: “I’m in front of the puncture machine and Error 20 is displayed. Can you help?”
The system immediately recognizes the context—equipment ID from the scheduled intervention, error code from the query—and initiates WF1. Within seconds, it retrieves relevant historical data: “This error corresponds to a system overheating fault. A colleague resolved this on March 12, 2023, by replacing the air filter at the rear of the main unit.”
When Jean-Michel requests guidance, WF2 activates, providing step-by-step instructions while checking inventory: “You’ll need a TYU-987 filter—58 available in building B43—and four new 8mm CORX screws from the second-floor workshop.”
Key benefits include:
- Hands-free operation allows technicians to maintain focus on equipment while accessing information
- Institutional knowledge becomes instantly accessible through voice, preserving expertise from senior technicians
- Seamless workflow transitions between diagnosis, repair, and documentation without context loss
- Reduced cognitive load as technicians no longer juggle physical tasks with information retrieval
Lessons Learned & Future Directions
Developing Coruscant revealed critical insights about voice-first industrial applications. Voice UI design for professional contexts differs significantly from consumer applications—technicians need precise, actionable information rather than conversational pleasantries. Managing context across extended interactions proved challenging, requiring sophisticated state management to handle interruptions and task switching.
Implicit intent detection, while powerful, has limitations. Ambiguous queries like “it’s making a weird noise” cannot be routed reliably without a disambiguation step, such as a clarifying question. On the tool-integration side, the Model Context Protocol (MCP) is an interesting evolution: it could standardize how AI systems access and interact with external tools and databases, which would simplify the integration of new data sources and maintenance systems without rebuilding agent architectures.
Task planning represents another frontier. Current implementations face tradeoffs between Chain-of-Thought (CoT) reasoning—which provides transparency and adaptability—and traditional fixed task planning that offers predictability and efficiency. CoT excels at handling novel situations but can be computationally expensive and occasionally unpredictable. Fixed planning ensures consistent execution but lacks flexibility when encountering edge cases.
Interestingly, modern LLMs enable a renaissance of goal-oriented dialogue systems. Classical approaches required extensive rule engineering for context management and repair mechanisms. Now, LLMs naturally handle context switching and error recovery while maintaining the structured workflows necessary for industrial applications. This hybrid approach allows rich conversational features—technicians can chitchat about the weather while waiting for results—without sacrificing task completion reliability.
The primary challenge remains guardrails and safety constraints. Industrial environments demand strict boundaries: the system must never suggest unsafe procedures or exceed technician certifications. Balancing conversational flexibility with operational safety requires sophisticated prompt engineering and real-time monitoring—an ongoing area of research and refinement.
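One simple form such a boundary could take is a deterministic check that runs outside the model, before any procedure is spoken aloud. The sketch below is a hypothetical example of a certification gate. The `Procedure` fields, certification labels, and refusal wording are assumptions for illustration, and a check like this would complement prompt-level guardrails, not replace them.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Procedure:
    title: str
    required_certification: str  # hypothetical label, e.g. "electrical-level-2"


def allowed_procedures(
    procedures: Iterable[Procedure], certifications: Set[str]
) -> List[Procedure]:
    """Keep only procedures the technician is certified to perform."""
    return [p for p in procedures if p.required_certification in certifications]


def guard_reply(procedure: Procedure, certifications: Set[str]) -> str:
    """Refuse explicitly instead of letting the model improvise."""
    if procedure.required_certification not in certifications:
        return (
            f"I can't guide you through '{procedure.title}': "
            f"it requires {procedure.required_certification} certification."
        )
    return f"Starting procedure: {procedure.title}."
```

Because the check is plain code, its behaviour stays the same however the conversation is phrased.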