StreamingMeme

The streaming technology industry news aggregator.

Jina extends text embeddings to image, audio, and video

Source: arXiv

Researchers at Jina by Elastic have introduced jina-embeddings-v5-omni, a suite of multimodal embedding models that extends existing text embedding models to image, audio, and video inputs. The models use a "frozen-encoder model composition" approach: pre-trained modality-specific encoders are connected to a frozen text embedding model through compact projectors, so the models reach competitive performance while training only a small share of their weights. The jina-embeddings-v5-omni-small model achieves strong text-only performance and competitive scores on image and audio tasks compared to other open-weight multimodal embedding models.
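The composition idea can be illustrated with a toy sketch. This is not the paper's code: the `Linear` class, the dimensions, and the random weights are all hypothetical stand-ins, chosen only to show how a frozen modality encoder, a trainable projector, and a frozen text backbone chain together, and why the text path is untouched when the projector changes.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights (toy initialisation)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class Linear:
    """Toy stand-in for a model component; `trainable` marks what training may update."""

    def __init__(self, out_dim, in_dim, rng, trainable):
        self.weights = random_matrix(out_dim, in_dim, rng)
        self.trainable = trainable

    def __call__(self, vector):
        return matvec(self.weights, vector)

    @property
    def num_params(self):
        return len(self.weights) * len(self.weights[0])


rng = random.Random(0)
# Hypothetical toy sizes; real encoders are far larger.
IMAGE_DIM, ENCODER_DIM, TEXT_DIM, EMBED_DIM = 12, 8, 6, 4

vision_encoder = Linear(ENCODER_DIM, IMAGE_DIM, rng, trainable=False)  # frozen
projector = Linear(TEXT_DIM, ENCODER_DIM, rng, trainable=True)         # only trained part
text_backbone = Linear(EMBED_DIM, TEXT_DIM, rng, trainable=False)      # frozen
modules = [vision_encoder, projector, text_backbone]


def embed_text(token_features):
    """Text path: goes straight through the frozen backbone."""
    return text_backbone(token_features)


def embed_image(pixels):
    """Image path: frozen encoder, then projector, then the same frozen backbone."""
    return text_backbone(projector(vision_encoder(pixels)))


def trainable_fraction(components):
    """Share of all parameters that training is allowed to update."""
    total = sum(c.num_params for c in components)
    trainable = sum(c.num_params for c in components if c.trainable)
    return trainable / total


print(len(embed_image([0.5] * IMAGE_DIM)))  # 4: images land in the text embedding space
print(f"{trainable_fraction(modules):.1%}")  # toy value, far above the paper's 0.35%
```

In this toy the projector is a large share of the parameters because every component is tiny; the point is only the wiring, in which updating the projector cannot change what `embed_text` returns.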

Key Takeaways

  • jina-embeddings-v5-omni comes in two base models: nano at 0.95B parameters and small at 1.57B parameters.
  • The training recipe freezes the text backbone, vision encoder, and audio encoder, and updates only fc_vision_2, fc_audio, and modality delimiter embeddings.
  • The paper says the trainable components are 0.35% of the joint model’s total weights.
  • On the open-weight benchmark table, jina-embeddings-v5-omni-small scores 67.00 on text, 56.05 on image, 41.20 on video, and 51.46 on audio, for a 53.93 average.
  • For visual document retrieval on ViDoRe-in-MIEB, jina-embeddings-v5-omni-small scores 79.08, while jina-embeddings-v5-omni-nano scores 70.05.
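The figures in the takeaways can be sanity-checked with a few lines of arithmetic. The sketch below assumes the reported average is an unweighted mean of the four modality scores; the summary does not say how it was computed, but the numbers are consistent with that reading.

```python
# Scores for jina-embeddings-v5-omni-small as quoted in the takeaways above.
scores = {"text": 67.00, "image": 56.05, "video": 41.20, "audio": 51.46}

# Assumption: the reported average is a plain mean over the four modalities.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 53.93, matching the reported average

# The lowest-scoring modality in the table.
weakest = min(scores, key=scores.get)
print(weakest)  # prints video

# Gap between small and nano on ViDoRe-in-MIEB.
vidore_gap = 79.08 - 70.05
print(f"{vidore_gap:.2f}")  # prints 9.03
```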

Why It Matters

This is a practical way to extend a text embedding stack into multimodal retrieval without retraining the full model. The paper keeps text embeddings identical to Jina Embeddings v5 Text, which matters for existing retrieval and RAG pipelines that depend on stable vector geometry. The models post their strongest results on visual document retrieval, while video remains the weak spot in the benchmark tables. Watch the release’s task-specific variants and the gap between image/audio scores and MMEB-Video performance, since those are the clearest signs of where the recipe holds up and where it doesn’t.
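For a team with an existing text index, "stable vector geometry" is something that can be verified rather than taken on trust. The following is a minimal sketch of such a check, assuming you have stored vectors from the old model and can re-embed the same texts with the new one. The function names, the tolerance, and the sample vectors are hypothetical and do not come from the paper or from any Jina API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def index_is_reusable(old_vectors, new_vectors, tolerance=1e-6):
    """Return True if every re-embedded text points the same way as its stored vector."""
    if len(old_vectors) != len(new_vectors):
        return False
    return all(
        cosine_similarity(old, new) >= 1.0 - tolerance
        for old, new in zip(old_vectors, new_vectors)
    )


# Hypothetical stored vectors from an existing text index.
stored = [[0.1, 0.3, 0.5], [0.2, -0.4, 0.1]]

# Case 1: the new model reproduces the old text embeddings.
identical = [list(vector) for vector in stored]
print(index_is_reusable(stored, identical))  # prints True

# Case 2: one embedding has drifted, so the index would need rebuilding.
drifted = [[0.5, 0.3, 0.1], [0.2, -0.4, 0.1]]
print(index_is_reusable(stored, drifted))  # prints False
```

If the check passes on a representative sample, existing text vectors could stay in place while image, audio, and video vectors are added alongside them.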


Read full article at arxiv.org
