Media AI Models

A curated catalog of Google Gemini generative media models, built into your FlyMyAI account: Nano Banana 2 (gemini-3.1-flash-image-preview) for images and Veo 3.1 for video. The generation and editing methods return an agent_file with an id and a public_url that you can feed straight into the next step (e.g. send_photo, send_video, or a Media Editor pipeline).
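To make the chaining concrete, here is a minimal sketch of reading an agent_file and handing its public_url to a follow-up step. Only the id and public_url fields come from the description above; the dict shape of the result, the helper names, and the "photo" / "caption" argument keys are assumptions for illustration, so check the target tool's schema before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFile:
    """Minimal view of an agent_file: just the id and public_url fields."""
    id: str
    public_url: str


def parse_agent_file(result: dict) -> AgentFile:
    """Extract id and public_url from a method result, failing loudly if absent."""
    missing = [key for key in ("id", "public_url") if not result.get(key)]
    if missing:
        raise ValueError(f"agent_file is missing: {', '.join(missing)}")
    return AgentFile(id=str(result["id"]), public_url=str(result["public_url"]))


def next_step_args(agent_file: AgentFile, caption: str = "") -> dict:
    """Build arguments for a follow-up step such as send_photo.

    The "photo" and "caption" keys are hypothetical placeholders.
    """
    args = {"photo": agent_file.public_url}
    if caption:
        args["caption"] = caption
    return args


# Made-up result for illustration; the URL is a placeholder, not a real file.
result = {"id": "file_123", "public_url": "https://example.com/files/file_123.png"}
image = parse_agent_file(result)
print(next_step_args(image, caption="Draft 1"))
```

Parsing the result once and passing a small typed object around keeps a missing public_url from surfacing only at the last step of a pipeline.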

What it can do

| Method | What it does |
| --- | --- |
| generate_image | Text-to-image with Nano Banana 2 → a .png agent_file. Optional aspect_ratio (14 values, incl. extremes like 1:4, 4:1, 1:8, 8:1), image_size (512, 1K, 2K, 4K), enable_web_search / enable_image_search grounding, and thinking_level (Low / High) for complex prompts. |
| edit_image | Image-to-image editing with Nano Banana 2. Pass the source image (HTTPS URL or base64) plus up to 13 more reference images (image1..image13); same aspect_ratio, image_size, search-grounding, and thinking_level options. Returns the edited image as an agent_file. |
| generate_video | Text-to-video or image-to-video with Veo 3.1 → an .mp4 agent_file. Supports a starting image, a last_frame to interpolate to, up to 3 reference_images for visual consistency, aspect_ratio (16:9 / 9:16), duration_seconds (4, 6, or 8), resolution (720p / 1080p / 4k), number_of_videos, generate_audio, and negative_prompt. Polls up to ~6 minutes. |
| analyze_image | Gemini vision (VLM) analysis of one or more public image URLs. Returns a text description (item, lighting, suggested context), with an optional prompt to focus the analysis. |
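As an illustration of the generate_video options listed above, the sketch below checks a set of arguments against the documented values before a call is made. The allowed values come from the table; the helper name, the "prompt" key, and the idea of passing arguments as a plain dict are assumptions, not a documented FlyMyAI API. No defaults are invented: options you do not pass are simply left out.

```python
from typing import Optional, Sequence

# Allowed values as listed in the generate_video row above.
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS = {4, 6, 8}
RESOLUTIONS = {"720p", "1080p", "4k"}
MAX_REFERENCE_IMAGES = 3


def build_generate_video_args(
    prompt: str,
    *,
    aspect_ratio: Optional[str] = None,
    duration_seconds: Optional[int] = None,
    resolution: Optional[str] = None,
    reference_images: Sequence[str] = (),
    generate_audio: Optional[bool] = None,
    negative_prompt: Optional[str] = None,
) -> dict:
    """Validate options and return an argument dict (hypothetical shape)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if duration_seconds is not None and duration_seconds not in DURATIONS:
        raise ValueError(f"duration_seconds must be one of {sorted(DURATIONS)}")
    if resolution is not None and resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference_images allowed")

    args: dict = {"prompt": prompt}  # "prompt" key is an assumption
    optional = {
        "aspect_ratio": aspect_ratio,
        "duration_seconds": duration_seconds,
        "resolution": resolution,
        "generate_audio": generate_audio,
        "negative_prompt": negative_prompt,
    }
    # Only include options the caller actually set.
    args.update({key: value for key, value in optional.items() if value is not None})
    if reference_images:
        args["reference_images"] = list(reference_images)
    return args


print(build_generate_video_args(
    "A paper boat drifting down a rainy street",
    aspect_ratio="9:16",
    duration_seconds=6,
    generate_audio=False,
))
```

Catching an unsupported duration or a fourth reference image locally is cheaper than discovering it after a generation request has been submitted.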

generate_video defaults to veo-3.1-generate-preview (best quality); you can also pick veo-3.1-fast-generate-preview (lower latency), veo-3.1-lite-generate-preview (fastest), or the Veo 3.0 / 2.0 models.
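The trade-off between the Veo 3.1 variants can be captured in a small lookup, sketched here. The model identifiers are the ones named above; the priority labels ("quality", "latency", "fastest") and the helper name are made up for this example, and how the chosen identifier is passed to generate_video is not specified here.

```python
# Veo 3.1 model identifiers from the paragraph above, keyed by hypothetical labels.
VEO_31_MODELS = {
    "quality": "veo-3.1-generate-preview",       # default, best quality
    "latency": "veo-3.1-fast-generate-preview",  # lower latency
    "fastest": "veo-3.1-lite-generate-preview",  # fastest
}


def pick_veo_model(priority: str = "quality") -> str:
    """Return the Veo 3.1 model identifier for a given priority label."""
    try:
        return VEO_31_MODELS[priority]
    except KeyError:
        options = ", ".join(sorted(VEO_31_MODELS))
        raise ValueError(f"unknown priority {priority!r}; choose one of: {options}") from None


print(pick_veo_model("latency"))
```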

How to get credentials

None - Media AI Models is a built-in tool. The Gemini API key lives on FlyMyAI infrastructure, so there is nothing to create or paste. Just enable it.

Fields to fill in FlyMyAI

None.

Troubleshooting

  • Image input not used - edit_image and analyze_image need direct public HTTPS URLs (or base64). Pre-upload via download_link if the source needs auth.
  • Video takes a while - Veo 3.1 polls up to ~6 minutes. Generate the video first, then chain its public_url into later steps.
  • Silent video wanted - set generate_audio to false on generate_video when you plan to add an ElevenLabs voiceover afterwards.
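
For the first item above, a quick local check can tell you whether an image input is at least shaped like an HTTPS URL or a base64 string before you pass it to edit_image or analyze_image. This is a rough sketch with a hypothetical helper name: it cannot tell whether a URL is actually public or points directly at an image, and it does not handle data: URIs.

```python
import base64
import binascii
from urllib.parse import urlparse


def classify_image_input(value: str) -> str:
    """Return "https_url", "base64", or "unusable" for an image input string."""
    candidate = value.strip()
    if not candidate:
        return "unusable"

    parsed = urlparse(candidate)
    if parsed.scheme == "https" and parsed.netloc:
        return "https_url"

    # Not an HTTPS URL: accept it only if it decodes as strict base64.
    try:
        base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return "unusable"
    return "base64"


encoded = base64.b64encode(b"\x89PNG").decode("ascii")
for sample in ("https://example.com/cat.png", "http://example.com/cat.png", encoded):
    print(sample, "->", classify_image_input(sample))
```

An input classified as "unusable" (for example a plain http:// link) is a candidate for pre-uploading via download_link, as described above.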
Built with care by FlyMy.AI.