• Golden-Retriever enhances Retrieval-Augmented Generation (RAG) for industrial knowledge bases. It addresses challenges with domain-specific jargon and context interpretation.
• Results: Golden-Retriever improves the total score of Meta-Llama-3-70B by 79.2% over the vanilla LLM and 40.7% over standard RAG. Average improvement across three LLMs: 57.3% over the vanilla LLM, 35.0% over RAG.
• Introduces reflection-based question augmentation before document retrieval: identifies jargon, clarifies its meaning based on context, and augments the question accordingly.
• Offline process: OCR extracts text from various document formats; LLMs summarize and contextualize the text to enhance the document database.
• Online process: an LLM identifies jargon and context in the user query, queries a jargon dictionary for accurate definitions, and augments the original question with clear context and resolved ambiguities.
• Jargon identification uses an LLM instead of exact string matching, so it adapts to new terms and misspellings. Outputs a structured list of identified terms.
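The jargon-identification step can be sketched as prompting an LLM for a structured term list and parsing it. This is a minimal sketch: `call_llm` is a hypothetical stub standing in for a real model endpoint, and the prompt wording and JSON schema are assumptions, not the paper's exact prompt.

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stub; a real deployment would call an actual LLM API.
    # The canned reply illustrates the expected structured output format.
    return '{"jargon": ["MACS", "RTOS"]}'

# Doubled braces escape the literal JSON example inside str.format.
JARGON_PROMPT = (
    "Identify all domain-specific jargon terms in the question below, "
    "including new terms and likely misspellings of known terms.\n"
    'Respond with JSON: {{"jargon": ["term1", "term2"]}}\n\n'
    "Question: {question}"
)

def identify_jargon(question: str) -> list[str]:
    """Ask the LLM for jargon terms and parse the structured reply."""
    raw = call_llm(JARGON_PROMPT.format(question=question))
    return json.loads(raw)["jargon"]
```

Using the LLM rather than exact string matching is what lets the step tolerate spelling variants; the structured JSON output keeps downstream parsing trivial.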
• Context identification uses pre-specified context names and descriptions. The LLM identifies the context using few-shot examples with Chain-of-Thought prompting.
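Assembling that prompt might look like the following sketch. The context names, descriptions, and the single few-shot example are illustrative placeholders, not the paper's actual prompt content.

```python
# Pre-specified context names and descriptions (placeholders; a real
# deployment would list its own document domains).
CONTEXTS = {
    "firmware": "Questions about embedded firmware and device drivers.",
    "safety": "Questions about workplace safety procedures.",
}

# One illustrative few-shot example with a Chain-of-Thought rationale.
FEW_SHOT = (
    "Question: How do I flash the bootloader?\n"
    "Reasoning: Flashing a bootloader concerns embedded devices.\n"
    "Context: firmware\n"
)

def build_context_prompt(question: str) -> str:
    """Compose a few-shot CoT prompt for picking the question's context."""
    listing = "\n".join(f"- {name}: {desc}" for name, desc in CONTEXTS.items())
    return (
        "Pick the most likely context for the question. "
        "Think step by step, then answer with one context name.\n\n"
        f"Available contexts:\n{listing}\n\n"
        f"{FEW_SHOT}\n"
        f"Question: {question}\nReasoning:"
    )
```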
• The jargon dictionary is queried using SQL, retrieving extended definitions, descriptions, and notes for the identified terms.
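The dictionary lookup reduces to a parameterized `SELECT`. The sketch below uses an in-memory SQLite table; the schema (term, definition, note) is an assumption based on the fields described above.

```python
import sqlite3

# Minimal in-memory jargon dictionary (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jargon (term TEXT PRIMARY KEY, definition TEXT, note TEXT)"
)
conn.execute(
    "INSERT INTO jargon VALUES (?, ?, ?)",
    ("RTOS", "Real-Time Operating System", "Used in firmware docs"),
)

def lookup(terms: list[str]) -> dict[str, tuple[str, str]]:
    """Return {term: (definition, note)} for terms found in the dictionary."""
    if not terms:
        return {}
    placeholders = ",".join("?" for _ in terms)  # parameterized IN clause
    rows = conn.execute(
        f"SELECT term, definition, note FROM jargon WHERE term IN ({placeholders})",
        terms,
    )
    return {t: (d, n) for t, d, n in rows}
```

Terms absent from the dictionary simply do not appear in the result, which is what the fallback step checks for.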
• The augmented question integrates the original query, context information, and detailed jargon definitions. It explicitly states the context and clarifies ambiguous terms.
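The augmentation step is essentially template assembly. The template below is an assumed layout, not the paper's exact format; the point is that context and definitions are stated explicitly before the original question.

```python
def augment_question(question: str, context: str, definitions: dict[str, str]) -> str:
    """Combine the original query, identified context, and jargon
    definitions into one retrieval-friendly augmented question."""
    defs = "\n".join(f"- {term}: {definition}" for term, definition in definitions.items())
    return (
        f"Context: {context}\n"
        f"Jargon definitions:\n{defs}\n"
        f"Question: {question}"
    )
```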
• Fallback mechanism for unidentified jargon: synthesizes a response indicating the missing information and instructs the user to check spelling or contact the knowledge base manager.
• Evaluation: a question-answering experiment using multiple-choice questions from new-hire training documents, covering six domains with 9-10 questions each. Compared against vanilla LLM and standard RAG baselines.