SiliconCloud · AI Models · LobeChat

Model List

46

Hunyuan A52B Instruct

Tencent/Hunyuan-A52B-Instruct

Hunyuan-Large is the industry's largest open-source Transformer-based MoE model, with 389 billion total parameters and 52 billion active parameters.
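To put the two parameter counts in perspective, the short sketch below computes the share of weights that are active for each token, using only the figures quoted in the description. The function name `active_fraction` is illustrative and not part of any library.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters that are active per token in an MoE model."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b


# Figures from the Hunyuan-Large description: 389B total, 52B active.
hunyuan_share = active_fraction(389, 52)
print(f"{hunyuan_share:.1%} of parameters are active per token")
```

Roughly one parameter in seven or eight takes part in each forward pass, which is why an MoE model of this size can be cheaper to run than a dense model with the same total parameter count.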

DeepSeek V2.5

deepseek-ai/DeepSeek-V2.5

DeepSeek V2.5 combines the excellent features of previous versions, enhancing general and coding capabilities.

DeepSeek V2 Chat

deepseek-ai/DeepSeek-V2-Chat

DeepSeek-V2 is a powerful and cost-effective mixture-of-experts (MoE) language model. It was pre-trained on a high-quality corpus of 8.1 trillion tokens and further enhanced through supervised fine-tuning (SFT) and reinforcement learning (RL). Compared to DeepSeek 67B, DeepSeek-V2 offers stronger performance while saving 42.5% in training costs, reducing the KV cache by 93.3%, and raising maximum generation throughput to 5.76 times. The model supports a context length of 128k and performs excellently in standard benchmark tests and open-ended generation evaluations.

QwQ 32B Preview

Qwen/QwQ-32B-Preview

QwQ-32B-Preview is Qwen's latest experimental research model, focused on enhancing AI reasoning capabilities. Its main strengths are analytical reasoning, mathematics, and programming. It also has known limitations, including language mixing and unexpected language switching, recursive reasoning loops, safety considerations, and gaps in other capabilities.

Qwen2.5 7B Instruct (Free)

Qwen/Qwen2.5-7B-Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.

Qwen2.5 7B Instruct (LoRA)

LoRA/Qwen/Qwen2.5-7B-Instruct

Qwen2.5-7B-Instruct is one of the latest large language models released by Alibaba Cloud. This 7B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.

Qwen2.5 7B Instruct (Pro)

Pro/Qwen/Qwen2.5-7B-Instruct

Qwen2.5-7B-Instruct is one of the latest large language models released by Alibaba Cloud. This 7B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.
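The model IDs in this list follow a visible pattern: an organization and a model name separated by `/`, optionally preceded by a prefix such as `Pro/` or `LoRA/`. The sketch below splits an ID into those parts. The names `ModelId`, `parse_model_id`, and `KNOWN_PREFIXES` are hypothetical, and what each prefix means on the platform is not stated in this list, so the code only separates the parts and does not interpret them.

```python
from typing import NamedTuple, Optional

# Prefixes that appear in this model list. Their meaning is an assumption
# left to the reader; the parser only recognizes and separates them.
KNOWN_PREFIXES = {"Pro", "LoRA", "Vendor-A"}


class ModelId(NamedTuple):
    prefix: Optional[str]  # e.g. "Pro", or None when the ID has no prefix
    org: str               # e.g. "Qwen"
    name: str              # e.g. "Qwen2.5-7B-Instruct"


def parse_model_id(model_id: str) -> ModelId:
    """Split 'Prefix/Org/Name' or 'Org/Name' into its components."""
    parts = model_id.split("/")
    if len(parts) == 3 and parts[0] in KNOWN_PREFIXES:
        return ModelId(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return ModelId(None, parts[0], parts[1])
    raise ValueError(f"unrecognized model id: {model_id!r}")


for raw in ("Qwen/Qwen2.5-7B-Instruct", "Pro/Qwen/Qwen2.5-7B-Instruct"):
    print(parse_model_id(raw))
```

Keeping the prefix separate from the organization and model name makes it easy to group the variants of one underlying model, which share the same `org` and `name`.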

Qwen2.5 14B Instruct

Qwen/Qwen2.5-14B-Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.

Qwen2.5 32B Instruct

Qwen/Qwen2.5-32B-Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.

Qwen2.5 72B Instruct

Qwen/Qwen2.5-72B-Instruct

A large language model developed by the Alibaba Cloud Tongyi Qianwen team.

Qwen2.5 72B Instruct (LoRA)

LoRA/Qwen/Qwen2.5-72B-Instruct

Qwen2.5-72B-Instruct is one of the latest large language models released by Alibaba Cloud. This 72B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.

Qwen2.5 72B Instruct (Vendor-A)

Vendor-A/Qwen/Qwen2.5-72B-Instruct

Qwen2.5-72B-Instruct is one of the latest large language models released by Alibaba Cloud. This 72B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.

Qwen2.5 72B Instruct 128K

Qwen/Qwen2.5-72B-Instruct-128K

Qwen2.5 is a new large language model series with enhanced understanding and generation capabilities.

Qwen2.5 Coder 7B Instruct (Free)

Qwen/Qwen2.5-Coder-7B-Instruct

Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. Built on Qwen2.5 and trained on 5.5 trillion tokens, it significantly improves code generation, code reasoning, and code repair. It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents.

Qwen2.5 Coder 7B Instruct (Pro)

Pro/Qwen/Qwen2.5-Coder-7B-Instruct

Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. Built on Qwen2.5 and trained on 5.5 trillion tokens, it significantly improves code generation, code reasoning, and code repair. It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents.

Qwen2.5 Coder 32B Instruct

Qwen/Qwen2.5-Coder-32B-Instruct

Qwen2.5-Coder focuses on code writing.

Qwen2.5 Math 72B Instruct

Qwen/Qwen2.5-Math-72B-Instruct

Qwen2.5-Math focuses on problem-solving in the field of mathematics, providing expert solutions for challenging problems.

Qwen2 1.5B Instruct (Free)

Qwen/Qwen2-1.5B-Instruct

Qwen2-1.5B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 1.5B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models. Compared to Qwen1.5-1.8B-Chat, Qwen2-1.5B-Instruct shows significant performance improvements in tests such as MMLU, HumanEval, GSM8K, C-Eval, and IFEval, despite having slightly fewer parameters.

Qwen2 1.5B Instruct (Pro)

Pro/Qwen/Qwen2-1.5B-Instruct

Qwen2-1.5B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 1.5B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models. Compared to Qwen1.5-1.8B-Chat, Qwen2-1.5B-Instruct shows significant performance improvements in tests such as MMLU, HumanEval, GSM8K, C-Eval, and IFEval, despite having slightly fewer parameters.

Qwen2 7B Instruct (Free)

Qwen/Qwen2-7B-Instruct

Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 7B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks.

Qwen2 7B Instruct (Pro)

Pro/Qwen/Qwen2-7B-Instruct

Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 7B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks. Qwen2-7B-Instruct outperforms Qwen1.5-7B-Chat in multiple evaluations, showing significant performance improvements.

Qwen2 72B Instruct

Qwen/Qwen2-72B-Instruct

Qwen2-72B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 72B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks.

Qwen2 72B Instruct (Vendor-A)

Vendor-A/Qwen/Qwen2-72B-Instruct

Qwen2-72B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 72B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and grouped-query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks.

Qwen2 VL 7B Instruct (Pro)

Pro/Qwen/Qwen2-VL-7B-Instruct

Qwen2-VL is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks.