---
license: llama3.1
language:
- zh
- en
base_model:
- meta-llama/Llama-3.1-70B-Instruct
extra_gated_fields:
  Full name: text
  Email: text
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
    - Research
    - Education
    - label: Other
      value: other
  I agree to use this model for non-commercial use ONLY: checkbox
extra_gated_prompt: >-
  The information you provide will be collected, stored, processed, and shared
  in accordance with the Hon Hai privacy policy
  (https://www.honhai.com/zh-tw/privacy-and-policy).
---

To request access to the Llama_3.1-FoxBrain-70B-V1.2 model weights, please contact us by email first.

**Only requests from users who have contacted us in advance will be considered.**

When sending your email, **make sure to use the same email address that you will enter in the application form.** This helps us verify your identity and approval status.

You may contact us at: `harry.sh.liu@foxconn.com`

## 📄 Llama_3.1-FoxBrain-70B-V1.2 Model Usage and License Agreement

At this time, access to the FoxBrain model is granted **exclusively to academic institutions and research organizations**. We do not authorize commercial usage under the current release.

For commercial or enterprise applications, Llama_3.1-FoxBrain-70B-V1.2 will be made available in the future through authorized channels, such as **deployment on AWS**, accompanied by a separate licensing framework. Please stay tuned for updates regarding commercial availability.

Welcome to the Llama_3.1-FoxBrain-70B-V1.2 model. Llama_3.1-FoxBrain-70B-V1.2 (hereinafter referred to as "FoxBrain") is a large language model developed by Hon Hai Research Institute based on the Meta Llama 3.1 architecture. It has been optimized using Traditional Chinese corpora from Taiwan and supports a wide range of inference tasks and application scenarios.
This Agreement sets forth the terms and conditions that users (hereinafter "You") must adhere to when using the FoxBrain model, including its weights, source code, APIs, and any derivative works.

## 1. Definitions

1. **License Agreement**: Subject to the terms herein, Hon Hai Research Institute grants You the rights to use, reproduce, modify, and distribute the FoxBrain model.
2. **Licensor**: Refers to Hon Hai Research Institute or the authorized owner of intellectual property rights to the FoxBrain model.
3. **You**: Any individual or entity authorized to use the model under this Agreement.
4. **FoxBrain Model**: The collection of training parameters, weights, source code, and related components.
5. **Derivative Models**: Models built upon FoxBrain's parameters, outputs, or modifications.

## 2. Usage Principles and Academic Orientation

- FoxBrain is primarily intended for academic research, education, and technical exchange. Commercial use is strictly prohibited unless explicitly authorized.
- Users must comply with the laws of the Republic of China (Taiwan) and the Meta Llama 3.1 license terms.
- Any illegal, harmful, or rights-infringing usage is strictly forbidden.
- Do not interfere with, disrupt, or compromise the integrity of the system or other users.
- Users should promptly report any security vulnerabilities or anomalies to Hon Hai Research Institute.

## 3. User Responsibility and Disclaimer

- If You violate any laws, resulting in damages to Hon Hai Research Institute or third parties, You shall bear full responsibility.
- Hon Hai Research Institute shall not be held liable for any misuse, including distribution of illegal content or unauthorized data access.
- FoxBrain is provided "as is" for research purposes. Outputs may be inaccurate, biased, or controversial. Users shall independently assess and accept the relevant risks.

## 4. Summary of Meta Llama 3.1 License Terms

This model is built upon Meta's Llama 3.1 architecture.
Users must comply with Meta's licensing restrictions, which include (but are not limited to):

- Non-exclusive, worldwide, royalty-free usage rights
- Prohibition of using the model to improve other LLMs (except for Llama-derived works)
- If monthly active users exceed 700 million, a separate commercial license must be obtained
- Proper attribution is required: "This model is licensed under the Llama 3.1 Community License. © Meta Platforms, Inc. All rights reserved."

🔗 [Meta License Terms](https://llama.meta.com/llama3/license)
🔗 [Meta Usage Policy](https://llama.meta.com/llama3/use-policy)

## 5. Prohibited Uses

### 5.1 Illegal or Infringing Activities

- Violence, terrorism, discrimination, exploitation, deepfake technology, and unauthorized surveillance
- Medical, legal, or financial services without authorization
- Unlawful access to, use of, or inference from personal data

### 5.2 High-Risk Applications

- Military, weapons manufacturing, heavy industry control, or critical infrastructure operations
- Self-harm, suicide, or any activity that endangers personal safety

### 5.3 Deception and Abuse

- Fraud, forgery, impersonation, or generating AI content without proper labeling

The above list is not exhaustive. Any activity that violates laws, endangers human safety, or poses significant societal risks is strictly forbidden.

## 6. Miscellaneous

- "FoxBrain" is a registered trademark of Hon Hai Research Institute. Use of the name, logos, or identifiers must comply with applicable trademark laws and this Agreement.
- This Agreement does not constitute a commercial warranty or endorsement by Hon Hai Research Institute.
- Hon Hai Research Institute reserves the right to modify, suspend, or terminate this Agreement at any time.
- Use of the model by legal entities implies duly authorized representation.

## 7. Jurisdiction

This Agreement is governed by the laws of the Republic of China (Taiwan).
Any disputes shall be under the jurisdiction of the Taipei District Court as the court of first instance.

---

# FoxBrain v1.2: Advanced Reasoning LLM with Dual Thinking Modes

**FoxBrain** is a large language model (LLM) **independently developed by Foxconn**, representing a major milestone in the company's long-term strategy to create AI that deeply understands **industrial knowledge and high-reliability domains**. Currently at version **1.2**, FoxBrain delivers exceptional performance in **Chinese language understanding and generation**, and introduces **dual thinking modes** for enhanced reasoning in **complex industrial contexts**.

👉 Official GitHub: [FoxBrain_LLMs](https://github.com/TranNhiem/FoxBrain_LLMs?tab=readme-ov-file)

---

## 🆕 What's New in Version 1.2

### 🧠 **Dual Thinking Modes**

FoxBrain v1.2 introduces two distinct reasoning approaches:

- **🎯 Budget_Thinking Mode**: Step-by-step reasoning with resource management
  - Allocates a computational "budget" based on problem complexity (1-9 steps)
  - Provides structured output with reasoning steps, reflections, and quality scores
  - Ideal for systematic problem-solving and transparent decision-making
- **💭 Extend_Thinking Mode**: Deep analytical reasoning with an extended thought process
  - Uses dedicated thinking tags to delimit comprehensive internal reasoning
  - Allows for more flexible and creative problem exploration
  - Perfect for complex analysis and open-ended challenges
  - ⚠️ **Important**: This mode is sensitive to the `presence_penalty` parameter; we recommend setting it to `1.5` for optimal performance

### ⚙️ **Enhanced Chat Template System**

- **Flexible Mode Switching**: Seamlessly switch between thinking modes
- **Custom System Prompts**: Support for user-defined system instructions
- **Priority-Based Selection**: Custom prompts override default modes
- **Backward Compatibility**: Maintains compatibility with existing implementations

---

## 🔍 Key Features

- 🧠 **Dual Reasoning Architecture**
  Two specialized thinking modes for different problem types and complexity levels.
- 🏭 **Industrial-Grade Performance**
  Built for the precision, consistency, and robustness required in mission-critical industrial applications.
- 📘 **Optimized for Traditional Chinese**
  Fine-tuned on high-quality Taiwanese Traditional Chinese datasets for superior linguistic alignment.
- 💡 **Structured & Transparent Output**
  Budget mode provides step-by-step reasoning with quality assessments and resource tracking.
- ⚙️ **Fast Inference with vLLM**
  Easily deployable on 2-8 H100 GPUs with ultra-low latency and flexible configuration.
- 🔧 **Developer-Friendly Integration**
  Simple mode switching via the `thinking_mode` parameter.

---

## 🚀 Quickstart: Inference with vLLM

### 🖥️ Environment Requirements

- Python 3.8+
- CUDA-compatible environment
- 2 to 8 × H100 GPUs (4 GPUs recommended for optimal performance)
- [`vllm`](https://github.com/vllm-project/vllm) installed

### 📦 Install vLLM

```bash
pip install vllm
```

### 🧠 Launch Inference API

```bash
vllm serve FoxBrain_v1.2_70B \
    --api-key foxbrain-cit \
    --port 8800 \
    --max-model-len 32768 \
    --tensor-parallel-size 4 \
    --gpu-memory-utilization 0.85 \
    --enforce-eager
```

### 💻 Python Usage Examples

#### **Budget_Thinking Mode Example**

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Load model and tokenizer
llm = LLM(model="FoxBrain_v1.2_70B", tensor_parallel_size=4)
tokenizer = AutoTokenizer.from_pretrained("FoxBrain_v1.2_70B")

messages = [
    {"role": "user", "content": "Solve this complex engineering problem: How would you optimize a manufacturing assembly line with 3 bottlenecks?"}
]

# Use Budget_Thinking mode
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    thinking_mode="Budget_Thinking"
)

# Generate with structured reasoning
sampling_params = SamplingParams(temperature=0.3, max_tokens=2048)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

**Expected Output Structure:**

```
6    # Initial budget for a complex problem
First, I need to identify the three bottlenecks...
5    # Remaining budget
Next, I'll analyze the throughput capacity...
4    # Remaining budget
My analysis is on track, need to consider dependencies   # Reflection
0.7  # Quality score
To optimize the assembly line with 3 bottlenecks:
1) Implement parallel processing at bottleneck A,
2) Add buffer stations before bottleneck B,
3) Upgrade equipment at bottleneck C.
Expected 25% throughput improvement.
Comprehensive solution addressing all bottlenecks with quantified benefits   # Reflection
0.9  # Final quality score
```

#### **Extend_Thinking Mode Example**

```python
# Use Extend_Thinking mode with recommended parameters
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    thinking_mode="Extend_Thinking"
)

# Important: use presence_penalty=1.5 for Extend_Thinking mode
sampling_params = SamplingParams(
    temperature=0.3,
    presence_penalty=1.5,
    max_tokens=2048
)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

**Expected Output Structure:**

```
This is a complex manufacturing optimization problem. Let me think through this systematically...

First, I should understand what constitutes a bottleneck in manufacturing:
- Limited capacity point in the process
- Determines overall system throughput
- Can be equipment, labor, or process-related

For the three bottlenecks, I need to consider:
1. Root cause analysis for each bottleneck
2. Interdependencies between bottlenecks
3. Cost-benefit analysis of solutions
4. Implementation timeline and resource requirements
...
Based on my analysis of manufacturing assembly line optimization, here's a comprehensive approach to address the three bottlenecks:

[Final detailed answer follows]
```

#### **Custom System Prompt Example**

```python
# Use a custom system prompt (overrides thinking modes)
messages = [
    {"role": "system", "content": "You are a specialized manufacturing engineer focused on lean principles."},
    {"role": "user", "content": "Analyze this production issue..."}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    thinking_mode="Budget_Thinking"  # Ignored because a custom system prompt is present
)
```

### 🎮 Interactive Terminal Interface

```python
# Complete interactive example with mode switching
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Load model
llm = LLM(model="FoxBrain_v1.2_70B", tensor_parallel_size=4, gpu_memory_utilization=0.85)
tokenizer = AutoTokenizer.from_pretrained("FoxBrain_v1.2_70B")

current_mode = 'Budget_Thinking'
messages = []

print("FoxBrain v1.2 Interactive Terminal")
print("Commands: 'mode1' (Budget_Thinking), 'mode2' (Extend_Thinking), 'custom' (custom prompt), 'reset', 'quit'")

while True:
    user_input = input("User: ").strip()

    if user_input.lower() == 'quit':
        break
    elif user_input.lower() == 'reset':
        messages = []
        print("Conversation history cleared!")
        continue
    elif user_input.lower() == 'mode1':
        current_mode = 'Budget_Thinking'
        messages = []
        print("Switched to Budget_Thinking mode!")
        continue
    elif user_input.lower() == 'mode2':
        current_mode = 'Extend_Thinking'
        messages = []
        print("Switched to Extend_Thinking mode!")
        continue
    elif user_input.lower() == 'custom':
        # A custom system prompt takes precedence over the thinking modes
        system_text = input("System prompt: ").strip()
        messages = [{"role": "system", "content": system_text}]
        print("Custom system prompt set!")
        continue

    messages.append({"role": "user", "content": user_input})

    # Apply chat template with the selected thinking mode
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        thinking_mode=current_mode
    )

    # Generate response
    sampling_params = SamplingParams(temperature=0.7, max_tokens=2048)
    outputs = llm.generate([prompt], sampling_params)
    response = outputs[0].outputs[0].text.strip()

    print(f"Assistant: {response}")
    messages.append({"role": "assistant", "content": response})
```

---

## 📊 Academic & Human Evaluation Benchmarks

### 🎓 Taiwan MMLU+ (Academic Benchmark)

FoxBrain v1.2 was evaluated on **Taiwan MMLU+** with both thinking modes, showing improved performance in complex reasoning tasks.

### 🧠 Reasoning Capability Analysis

**Budget_Thinking Mode Performance:**

- ✅ **Structured Problem Solving**: +15% improvement in multi-step reasoning
- ✅ **Resource Efficiency**: Optimal performance within the allocated computational budget
- ✅ **Transparency**: Clear reasoning trace for audit and debugging

**Extend_Thinking Mode Performance:**

- ✅ **Deep Analysis**: +22% improvement in complex analytical tasks
- ✅ **Creative Solutions**: Enhanced performance on open-ended problems
- ✅ **Comprehensive Coverage**: Better handling of nuanced, multi-faceted challenges

### 👥 MT-Bench (Human Preference Evaluation)

Updated MT-Bench results for v1.2 with thinking mode comparisons:

> 🏅 FoxBrain v1.2 demonstrated significant improvements in reasoning tasks, with Budget_Thinking mode excelling in systematic problems and Extend_Thinking mode leading in creative tasks.
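For deployments that use the `vllm serve` command from the Quickstart, the per-mode decoding recommendations on this card can be bundled into a small helper so the critical `presence_penalty=1.5` is never forgotten. This is an illustrative sketch, not part of the FoxBrain API: the helper name is hypothetical, and passing `thinking_mode` through a `chat_template_kwargs` field of the request body is an assumption about the OpenAI-compatible server that you should verify against your vLLM version.

```python
import json

# Hypothetical helper: bundles the decoding parameters this card recommends
# for each thinking mode. presence_penalty=1.5 is critical for
# Extend_Thinking; Budget_Thinking works with standard sampling.
def recommended_sampling_kwargs(thinking_mode: str, max_tokens: int = 2048) -> dict:
    if thinking_mode == "Budget_Thinking":
        return {"temperature": 0.3, "max_tokens": max_tokens}
    if thinking_mode == "Extend_Thinking":
        return {"temperature": 0.3, "presence_penalty": 1.5, "max_tokens": max_tokens}
    raise ValueError(f"unknown thinking_mode: {thinking_mode!r}")

# Assemble a request body for the OpenAI-compatible endpoint that `vllm serve`
# exposes (here: http://localhost:8800/v1/chat/completions, authorized with the
# api-key from the Quickstart). Only the payload is built; sending it requires
# a running server.
payload = {
    "model": "FoxBrain_v1.2_70B",
    "messages": [{"role": "user", "content": "How would you optimize an assembly line with 3 bottlenecks?"}],
    # Assumption: the server forwards chat_template_kwargs to the chat template.
    "chat_template_kwargs": {"thinking_mode": "Extend_Thinking"},
    **recommended_sampling_kwargs("Extend_Thinking"),
}
body = json.dumps(payload).encode("utf-8")
print(json.loads(body)["presence_penalty"])  # -> 1.5
```

Centralizing the mode-to-parameters mapping in one place keeps client code consistent with the Parameter Recommendations section when defaults change between versions.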
---

## 🤖 Suggested Use Cases by Mode

### 🎯 **Budget_Thinking Mode - Best For:**

- 🏭 **Manufacturing Process Optimization**: Step-by-step analysis with resource constraints
- 📊 **Quality Control Procedures**: Systematic inspection and validation workflows
- 🔧 **Troubleshooting Protocols**: Structured diagnostic procedures with clear steps
- 📈 **Performance Analysis**: Quantified assessments with measurable outcomes
- 🎯 **Project Planning**: Resource-aware task breakdown and timeline estimation

### 💭 **Extend_Thinking Mode - Best For:**

- 🧪 **Research & Development**: Deep analysis of complex technical problems
- 🎨 **Creative Problem Solving**: Innovative approaches to engineering challenges
- 📝 **Technical Documentation**: Comprehensive analysis and explanation
- 🤔 **Strategic Planning**: Long-term thinking and scenario analysis
- 🔍 **Root Cause Analysis**: In-depth investigation of complex system failures

### 🎛️ **Custom System Prompts - Best For:**

- 🏢 **Domain-Specific Applications**: Tailored behavior for specific industries
- 👥 **Role-Specific Interactions**: Customized persona for different use cases
- 🔒 **Compliance Requirements**: Specific guidelines and constraints
- 🎯 **Specialized Workflows**: Custom instructions for unique business processes

---

## 🚧 Roadmap & Version History

### Version History

- 📌 **Version 1.0**: Foundation model with strong Chinese language proficiency
- 🔄 **Version 1.1**: Enhanced reasoning capabilities and improved efficiency
- 🆕 **Version 1.2**: **Dual thinking modes with structured reasoning architecture**
- 🔜 **Version 2.0**: Advanced industrial knowledge integration and domain expertise
- 🌆 **Long-Term Vision**: Comprehensive smart manufacturing and industrial AI platform

---

## ⚠️ Important Notes for v1.2

### 🔧 **Migration from v1.0/v1.1**

- The chat template has been updated; ensure you are using the latest tokenizer
- The default mode is `Budget_Thinking` if no `thinking_mode` is specified
- Custom system prompts take precedence over thinking modes

### 💾 **Memory Requirements**

- Budget_Thinking mode: standard memory usage
- Extend_Thinking mode: may require additional memory for extended reasoning
- A multi-GPU setup (4+ GPUs) is recommended for optimal performance

### 🎛️ **Parameter Recommendations**

- **Budget_Thinking mode**:
  - `temperature=0.3-0.7`
  - Standard sampling parameters work well
- **Extend_Thinking mode**:
  - `temperature=0.3-0.5`
  - **⚠️ Critical**: `presence_penalty=1.5` (the model may generate unexpected results without this setting)
- **General settings**:
  - `max_tokens=2048-4096` depending on problem complexity

---

## 📄 License

This model is released under the **Llama 3.1 Community License Agreement**.

---

## 🙌 Contributors

- AI Research Center of Hon Hai Research Institute (model training, deployment & evaluation)
- Meta-Llama (base model)

---

## 📫 Contact

For support or partnership inquiries:
📧 harry.sh.liu@foxconn.com

---

**FoxBrain v1.2** - Where structured reasoning meets industrial intelligence. 🚀