Media AI Models
A curated catalog of Google Gemini generative media models, built into your
FlyMyAI account: Nano Banana 2 (gemini-3.1-flash-image-preview) for images and
Veo 3.1 for video. Every method returns an agent_file with an id and a
public_url you can feed straight into the next step (e.g. send_photo,
send_video, or a Media Editor pipeline).
What it can do
| Method | What it does |
|---|---|
| generate_image | Text-to-image with Nano Banana 2 → a .png agent_file. Optional aspect_ratio (14 values, incl. extremes like 1:4, 4:1, 1:8, 8:1), image_size (512, 1K, 2K, 4K), enable_web_search / enable_image_search grounding, and thinking_level (Low / High) for complex prompts. |
| edit_image | Image-to-image editing with Nano Banana 2. Pass the source image (HTTPS URL or base64) plus up to 13 more reference images (image1..image13); same aspect_ratio, image_size, search-grounding, and thinking_level options. Returns the edited image as an agent_file. |
| generate_video | Text-to-video or image-to-video with Veo 3.1 → an .mp4 agent_file. Supports a starting image, a last_frame to interpolate to, up to 3 reference_images for visual consistency, aspect_ratio (16:9 / 9:16), duration_seconds (4, 6, or 8), resolution (720p / 1080p / 4k), number_of_videos, generate_audio, and negative_prompt. Polls up to ~6 minutes. |
| analyze_image | Gemini vision (VLM) analysis of one or more public image URLs. Returns a text description - item, lighting, suggested context - with an optional prompt to focus the analysis. |
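
As a rough illustration of the edit_image row above, the sketch below assembles an argument dictionary with numbered reference images. How the tool is actually invoked is not described here, so the helper name build_edit_image_args and the prompt and image keys are assumptions made for illustration; only the image1..image13 names, the limit of 13 references, and the option names come from the table.

```python
MAX_REFERENCE_IMAGES = 13  # edit_image accepts image1..image13 per the table


def build_edit_image_args(prompt, image, references=(), **options):
    """Assemble a plain dict of edit_image arguments (hypothetical helper).

    The "prompt" and "image" keys are assumed names; the numbered
    image1..image13 keys follow the table above.
    """
    references = list(references)
    if len(references) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"edit_image takes at most {MAX_REFERENCE_IMAGES} reference images, "
            f"got {len(references)}"
        )
    args = {"prompt": prompt, "image": image}
    # Number the reference images starting at 1: image1, image2, ...
    for index, reference in enumerate(references, start=1):
        args[f"image{index}"] = reference
    # Pass through optional settings such as aspect_ratio or image_size.
    args.update(options)
    return args


example = build_edit_image_args(
    "Replace the background with a beach at sunset",
    "https://example.com/product.png",
    references=["https://example.com/beach.png"],
    image_size="2K",
)
```

The resulting dictionary carries the source image, one image1 reference, and the image_size option; passing more than 13 references raises a ValueError before anything is sent.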
generate_video defaults to veo-3.1-generate-preview (best quality); you can
also pick veo-3.1-fast-generate-preview (lower latency),
veo-3.1-lite-generate-preview (fastest), or the Veo 3.0 / 2.0 models.
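
Because a generate_video call can poll for several minutes, it may be worth checking the options before submitting. The following is a minimal sketch that validates a dictionary of options against the values listed above; the helper name check_video_options is hypothetical, and whether the tool itself rejects other values is not stated here.

```python
# Allowed values as listed for generate_video in the table above.
ALLOWED_VALUES = {
    "aspect_ratio": {"16:9", "9:16"},
    "duration_seconds": {4, 6, 8},
    "resolution": {"720p", "1080p", "4k"},
}
MAX_REFERENCE_IMAGES = 3


def check_video_options(options):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical pre-flight check, not part of the tool itself.
    """
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        if name in options and options[name] not in allowed:
            choices = ", ".join(str(value) for value in sorted(allowed, key=str))
            problems.append(f"{name}={options[name]!r} is not one of: {choices}")
    reference_count = len(options.get("reference_images", ()))
    if reference_count > MAX_REFERENCE_IMAGES:
        problems.append(
            f"reference_images has {reference_count} entries, "
            f"maximum is {MAX_REFERENCE_IMAGES}"
        )
    return problems


options = {
    "model": "veo-3.1-fast-generate-preview",  # key name "model" is an assumption
    "aspect_ratio": "9:16",
    "duration_seconds": 8,
    "resolution": "1080p",
    "generate_audio": False,
}
issues = check_video_options(options)
```

The model value is passed through unchecked on purpose, since the exact identifiers of the Veo 3.0 / 2.0 models are not listed here.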
How to get credentials
None - Media AI Models is a built-in tool. The Gemini API key lives on FlyMyAI infrastructure, so there is nothing to create or paste. Just enable it.
Fields to fill in FlyMyAI
None.
Troubleshooting
- Image input not used - edit_image and analyze_image need direct public HTTPS URLs (or base64). Pre-upload via download_link if the source needs auth.
- Video takes a while - Veo 3.1 polls up to ~6 minutes. Generate the video first, then chain its public_url into later steps.
- Silent video wanted - set generate_audio to false on generate_video when you plan to add an ElevenLabs voiceover afterwards.
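
The first troubleshooting item can be approximated with a small local check before calling edit_image or analyze_image. This is only a sketch: classify_image_input is a hypothetical helper, it cannot tell whether an HTTPS URL is actually public or sits behind auth, and the base64 test is a simple format check rather than proof that the payload is an image.

```python
import base64
from urllib.parse import urlparse


def classify_image_input(value):
    """Classify an image input as "url", "base64", or "pre-upload".

    Hypothetical helper: "pre-upload" means the value is neither an HTTPS
    URL nor valid base64, so it likely needs download_link first.
    """
    parsed = urlparse(value)
    if parsed.scheme == "https" and parsed.netloc:
        return "url"
    try:
        # validate=True rejects characters outside the base64 alphabet.
        decoded = base64.b64decode(value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return "pre-upload"
    return "base64" if decoded else "pre-upload"


encoded = base64.b64encode(b"fake image bytes").decode("ascii")
kinds = [
    classify_image_input("https://example.com/photo.png"),
    classify_image_input(encoded),
    classify_image_input("http://intranet.local/photo.png"),
]
```

Plain http links and local paths fall into the pre-upload bucket here, which matches the advice to route such sources through download_link first.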
Built with care by FlyMy.AI.