Convert PDF to text using OCR
Prompt with images in flux[dev]
Generate depth map from images
Generate depth video from input video
Extract text from images using OCR
Separate vocals from instrumental music in audio files
Interactive demo for the Cube 3D model
Track, rank and evaluate open Arabic LLMs and chatbots
Generate subtitles for YouTube videos
Generate 3D room layouts from RGB panoramas
Image captioning and visual question answering (VQA)
Generate voice from text using reference audio
Generate a video from a prompt and an image
Media understanding
Answer questions about images by chatting
Qwen2.5-VL 7B & 3B
Explore BERT model interactions
Generate images from text prompts
Replace characters in a video with characters in photos
Generate summaries for long-form text