Day 24 - Google 5 Day Course continued, Options to add your own data to an LLM, Evaluating your Responses before presenting
Dec 9 2024
- Continuing Kaggle/Google day 1 paper - Foundational LLM and Text Generation
- Prompt Engineering
- Try using verbs that describe action in your prompts:
- Act, Analyze, Categorize, Classify, Contrast, Compare, Create, Describe, Define,
- Evaluate, Extract, Find, Generate, Identify, List, Measure, Organize, Parse, Pick, Predict, Provide, Rank, Recommend, Return, Retrieve, Rewrite, Select, Show, Sort,
- Summarize, Translate, Write.
- Types of Prompts
- System / Role / Contextual prompting - who are you, what is your goal, what’s your style
- Step-back prompting - first ask a broader, more general question, then use that answer as context for the specific one
- Chain of thought - think through each step, don’t jump to the answer
- https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/prompts/examples/chain_of_thought_react.ipynb
- Self-consistency - ask the same question multiple times (e.g., 3x) and go with the most frequent answer (a minimal sketch follows this list)
- Tree of thoughts - like self-consistency, but branching at each reasoning step rather than only at the final answer
- ReAct - interleave reasoning with actions, such as calling external tools
- Also try to be clear with your instructions by providing both a “DO” and a “DO NOT” instruction set.
- Keep track of your prompts and test them each time, recording fields like these (a logging sketch follows the list):
- Name/version
- Goal
- Model
- Temperature/Top-K, Top-P
- Token Limit
- Prompt
- Output
Ways to add new data to your LLM
Discussion with ChatGPT on different options for incorporating your own knowledge into an LLM, summarized in the table below (a minimal RAG sketch follows the table):
| Method | Best For | Pros | Cons |
| --- | --- | --- | --- |
| Soft Prompting | Small FAQ sets, low-cost setups | No training required; flexible and easy to update | Limited by token size; FAQs must be in every prompt |
| RAG (Retrieval-Augmented Generation) | Large or dynamic FAQ datasets | Scalable; FAQs dynamically retrieved; no fine-tuning required | Complex setup; dependent on retrieval quality |
| Fine-Tuning | Highly repetitive or static FAQs | Efficient inference; consistent answers; tailored behavior | High initial cost; requires re-training for updates |
| Middleware System | Handling simple FAQ queries | Fast responses; hybrid with GPT-4 | Dual-system complexity; threshold tuning |
| Tool-Augmented Models | Structured FAQ storage | Accurate; real-time updates | Development overhead |
| Knowledge Distillation | Reducing API costs | Efficient for repetitive FAQs; cost-effective inference | Training and maintenance overhead |
| Student/Teacher | Replicating GPT-4 quality on FAQs at scale | High-quality outputs; cost-effective once trained; reduces dependence on GPT-4 | Training requires GPT-4 API calls; retraining for FAQ updates |
| Adaptive Prompt Chaining | Ambiguous or multi-part queries | Improves query understanding; interactive experience | Slower interactions; may require multiple exchanges |
| Knowledge Graph | Complex, interrelated FAQs | Rich context; supports reasoning | Complex to build and maintain |
| Memory-Based Agents | Session-specific FAQ refinement | Personalized experience; avoids repetition | Requires session infrastructure |
Post-Hoc Grounding
Post-hoc grounding refers to the process of aligning model outputs with external, factual, or contextually relevant knowledge after the initial response is generated. This ensures the response is consistent, accurate, or relevant to specific requirements.
- For example:
- A model generates a generic response (e.g., “The capital of France is Paris”).
- Post-hoc grounding checks this response against a knowledge base or external source for accuracy.
- The output is corrected or refined if necessary before delivery to the user.
- This technique is common in tasks requiring high factual accuracy, like customer support or scientific reasoning; a minimal sketch follows.