Gemini API Text Chat Models Guide

Introduction

The Gemini API offers a suite of text chat models for a wide range of natural language processing tasks. This guide lists the available models, summarizes their capabilities, and shows how to call them from your applications through the chat completions endpoint.

Available Models

Gemini API offers several model families with different capabilities, performance characteristics, and use cases:

Gemini 2.5 Pro Models

  • gemini-2.5-pro-preview-06-05 (Latest)

  • gemini-2.5-pro-preview-05-06

  • gemini-2.5-pro-preview-03-25

  • gemini-2.5-pro-exp-03-25

The Gemini 2.5 Pro models are the most capable in the lineup and are suited to complex tasks that call for strong reasoning, context understanding, and nuanced responses.

Gemini 2.5 Flash Models

  • gemini-2.5-flash-preview-05-20-thinking

  • gemini-2.5-flash-preview-05-20-nothinking

  • gemini-2.5-flash-preview-05-20

  • gemini-2.5-flash-preview-04-17-thinking

  • gemini-2.5-flash-preview-04-17-nothinking

  • gemini-2.5-flash-preview-04-17

The Flash variants provide faster response times while maintaining high quality, making them ideal for applications that require quick interactions.

Gemini 2.0 Models

  • gemini-2.0-pro-exp-02-05

  • gemini-2.0-flash-thinking-exp-1219

  • gemini-2.0-flash-thinking-exp-01-21

  • gemini-2.0-flash-lite-preview-02-05

  • gemini-2.0-flash-lite-001

  • gemini-2.0-flash-lite

  • gemini-2.0-flash-exp

  • gemini-2.0-flash-001

  • gemini-2.0-flash

These models offer a balance of performance and efficiency, with specialized variants for different use cases.

Making API Requests

The Gemini API uses a REST interface for chat completions. Here's how to structure your requests:

Endpoint

POST https://ai.burncloud.com/v1/chat/completions

Headers

  • Content-Type: application/json

  • Authorization: Bearer YOUR_API_KEY

Request Body Parameters

Parameter         Type      Description
messages          array     Array of message objects with role and content
model             string    The specific model to use (from the list above)
group             string    Optional grouping parameter (e.g., "default")
stream            boolean   Whether to stream the response (default: false)
stream_options    object    Options for streaming (e.g., include_usage)
return_reasoning  boolean   Whether to include reasoning in response

Example Request

curl --location 'https://ai.burncloud.com/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
  "messages": [
    {
      "role": "user",
      "content": "Some people say pigs can climb, others say they cannot. What philosophical question does this reflect? Please analyze it in detail."
    }
  ],
  "model": "gemini-2.5-pro-preview-05-06",
  "group": "default",
  "stream": false,
  "stream_options": {"include_usage": true},
  "return_reasoning": true
}'
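
For readers calling the endpoint from code rather than the shell, the curl example above might translate to Python roughly as follows. This is a minimal sketch using only the standard library. The function names build_chat_request and send_chat_request are illustrative and not part of any official SDK; the URL, headers, and body fields are the ones shown in this guide.

```python
import json
import urllib.request

API_URL = "https://ai.burncloud.com/v1/chat/completions"  # endpoint from this guide


def build_chat_request(api_key, model, user_text, return_reasoning=False):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "group": "default",
        "stream": False,
        "return_reasoning": return_reasoning,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.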

Response Format

The API returns a JSON response with the following structure:

{
  "id": "chat-xxxxxxxx",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemini-2.5-pro-preview-05-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The model's response text..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 456,
    "total_tokens": 579
  },
  "reasoning": "The model's reasoning process..."
}

The reasoning field is included only when return_reasoning is true.
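
Once the response is decoded, the fields most applications need can be pulled out in a few lines. The sketch below assumes the response shape shown above and treats reasoning as optional; summarize_completion is an illustrative helper name, not part of the API.

```python
import json


def summarize_completion(body):
    """Extract reply text, finish reason, token usage, and optional reasoning."""
    choice = body["choices"][0]
    usage = body.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
        # Absent unless the request set return_reasoning to true.
        "reasoning": body.get("reasoning"),
    }


# Sample body mirroring the structure documented in this guide.
sample = json.loads("""
{
  "id": "chat-xxxxxxxx",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemini-2.5-pro-preview-05-06",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The model's response text..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 123, "completion_tokens": 456, "total_tokens": 579}
}
""")

summary = summarize_completion(sample)
```

Logging total_tokens from each call is one simple way to monitor token usage over time.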

Advanced Features

Streaming Responses

For applications requiring real-time interactions, enable the stream parameter to receive chunks of the response as they're generated:

{
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
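
This guide does not document the wire format of streamed chunks. The sketch below assumes the convention used by OpenAI-compatible chat completions endpoints: server-sent event lines of the form "data: {json}", text carried in choices[0].delta.content, and a final "data: [DONE]" marker. Treat every one of those details as an assumption to verify against real responses; iter_stream_text is an illustrative name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes).

    Assumes OpenAI-style chunks; this format is not confirmed by the guide.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(data)
        # A usage-only chunk (from include_usage) may carry an empty choices list.
        for choice in chunk.get("choices") or []:
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                yield piece
```

An HTTP response object returned by urllib.request.urlopen can be iterated line by line, so it could be passed directly as the lines argument.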

Reasoning Traces

The return_reasoning parameter provides insight into the model's thought process:

{
  "return_reasoning": true
}

This is particularly useful for:

  • Debugging model responses

  • Educational applications

  • Transparency in decision-making processes

  • Fine-tuning prompts based on model reasoning

Model Selection Guide

  • For complex reasoning tasks: Use the latest Gemini 2.5 Pro models

  • For quick responses with good quality: Choose Gemini 2.5 Flash models

  • For cost-efficient operations: Consider Gemini 2.0 Flash Lite variants

  • For explicit reasoning traces: Use models with the "-thinking" suffix
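
The selection guide above could be encoded as a small lookup so the choice lives in one place. The task labels and the default below are hypothetical and chosen for illustration; the model IDs are taken from the lists earlier in this guide.

```python
# Hypothetical task labels mapped to model IDs listed in this guide.
MODEL_BY_TASK = {
    "complex_reasoning": "gemini-2.5-pro-preview-06-05",
    "fast_response": "gemini-2.5-flash-preview-05-20",
    "low_cost": "gemini-2.0-flash-lite",
    "reasoning_trace": "gemini-2.5-flash-preview-05-20-thinking",
}


def pick_model(task, default="gemini-2.0-flash"):
    """Return the model ID for a task label, falling back to a default."""
    return MODEL_BY_TASK.get(task, default)
```

Centralizing the mapping makes it straightforward to swap in a newer preview model after testing it against your own workload.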

Best Practices

  1. Start with clear instructions in your prompts to guide the model's responses

  2. Test different models to find the best fit for your specific use case

  3. Use streaming for interactive applications to improve perceived responsiveness

  4. Monitor token usage to optimize costs and performance

  5. Implement retry logic for handling rate limits and temporary errors
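
The retry advice in item 5 can be sketched as exponential backoff with jitter. The set of retryable status codes is an assumption, since this guide does not list the error codes the service returns. call_with_retry is an illustrative helper that wraps any zero-argument callable performing the request with urllib.

```python
import random
import time
import urllib.error

# Assumed retryable statuses; the guide does not specify them.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry retryable HTTP errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Give up on non-retryable errors or once attempts are exhausted.
            if err.code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from many clients.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter is injectable so the backoff schedule can be exercised in tests without actually waiting.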

Conclusion

The Gemini API text chat models cover a range of needs, from complex reasoning to fast, low-cost responses. Choosing the right model and structuring your requests as described above lets you build these capabilities into your applications.


For more information, refer to the official documentation or contact support for specific implementation questions.

Support Days: Monday - Friday
  • CA, USA: 9:00 - 17:00
  • HK, China: 9:00 - 17:00
WhatsApp: +1(302)275-2870
Telegram: @BurncloudCharlie
Email: Contact@burncloud.com