
Qasar Media Server

Enterprise media management solution

Qasar Media Server - Features Overview

Qasar Media Server provides scalable upload, processing, adaptive playback, and automatic labeling for media. This repository includes a FastAPI backend server, iOS packages for integration, sample applications, and on-device AI technologies for content labeling and moderation.

Server Features

FastAPI Backend

The server is built on FastAPI with PostgreSQL and Redis, providing a robust, scalable foundation for media management.

Core Capabilities

  • RESTful API with OpenAPI/Swagger documentation
  • Multi-tenant architecture with tenant and user isolation
  • Asynchronous processing using background workers
  • Database migrations via Alembic
  • Redis-based job queue for media processing tasks

Chunked Upload System

Efficient large-file upload handling with resume support:

  • Session-based uploads: Create upload sessions with configurable chunk sizes
  • Resumable uploads: Track chunk metadata in Redis for reliable resume capability
  • Progress tracking: Real-time upload progress monitoring
  • Multi-chunk support: Upload files in configurable chunk sizes
  • Automatic validation: File integrity checks during upload

Upload Flow:
1. Create upload session with metadata (filename, total size, chunk size)
2. Upload chunks sequentially or in parallel
3. Complete session to trigger processing
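The upload flow above depends on client and server agreeing on how a file splits into chunks. The following is a minimal sketch of that arithmetic, not the server's actual implementation; `Chunk`, `plan_chunks`, and `missing_chunks` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    index: int   # zero-based chunk number
    offset: int  # byte offset of the chunk within the file
    length: int  # number of bytes in the chunk


def plan_chunks(total_size: int, chunk_size: int) -> list[Chunk]:
    """Split total_size bytes into fixed-size chunks; the last one may be shorter."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    count = (total_size + chunk_size - 1) // chunk_size  # integer ceiling division
    return [
        Chunk(i, i * chunk_size, min(chunk_size, total_size - i * chunk_size))
        for i in range(count)
    ]


def missing_chunks(plan: list[Chunk], received: set[int]) -> list[Chunk]:
    """Chunks still to send when resuming, given the indexes already stored."""
    return [c for c in plan if c.index not in received]


# Hypothetical example: a 10.5 MB file in 4 MB chunks, with chunk 0 already uploaded.
plan = plan_chunks(total_size=10_500_000, chunk_size=4_000_000)
todo = missing_chunks(plan, received={0})
```

Because every chunk carries its own offset and length, the chunks in `todo` can be sent sequentially or in parallel, in any order.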

Media Processing Pipeline

Automated transcoding and packaging for multiple formats:

  • Video Processing:
      • Multi-resolution HLS transcoding (1080p, 720p, 480p)
      • DASH manifest generation
      • WebM/VP9 encoding
      • Thumbnail generation (up to 25 thumbnails per video)
      • Poster image extraction
      • Duration and dimension extraction (width, height, SAR, DAR)

  • Audio Processing:
      • WebM/Opus encoding
      • Spectrogram generation
      • Poster image from spectrogram

  • Image Processing:
      • WebP conversion
      • Thumbnail generation
      • Poster image creation

  • Non-packaged Types: Images, audio files, and documents are served directly, without HLS or DASH packaging
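The SAR and DAR values extracted for video are related by DAR = (width / height) × SAR. A minimal sketch of that relationship, using exact fractions to avoid rounding drift; the function names are hypothetical and not part of the server's API.

```python
from fractions import Fraction


def display_aspect_ratio(width: int, height: int, sar: Fraction = Fraction(1)) -> Fraction:
    """DAR = (width / height) * SAR, returned in lowest terms."""
    if width <= 0 or height <= 0 or sar <= 0:
        raise ValueError("width, height, and sar must be positive")
    return Fraction(width, height) * sar


def display_width(width: int, sar: Fraction) -> int:
    """Width in square pixels at which the frame shows its intended shape."""
    return round(width * sar)


# Anamorphic example: 1440x1080 storage pixels with a 4:3 sample aspect ratio.
dar = display_aspect_ratio(1440, 1080, Fraction(4, 3))
shown_width = display_width(1440, Fraction(4, 3))
```

Here `dar` is 16/9 and `shown_width` is 1920, so a player that ignored the SAR would show the video horizontally squeezed.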

Media Management API

Comprehensive media asset management:

  • Media Retrieval: Get media by ID with full metadata
  • Search: Full-text search across labels and captions
  • Recommendations: Stub for recommendation engine integration
  • Feed Endpoints: New media, recommended media, search results
  • Metadata Management: Captions, user tags, labels
  • Status Tracking: Processing status (queued, processing, ready, error)

Label Management System

Flexible labeling infrastructure:

  • Label Banks: Tenant-specific label collections
  • General Labels: Content categorization labels
  • Moderation Labels: Content moderation and safety labels
  • Label Statistics: Track label usage, confidence scores, and match counts
  • Label Associations: Link labels to media assets with confidence scores
  • Hierarchical Labels: Support for parent-child label relationships
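Parent-child label relationships can be modeled with a parent reference on each label. The sketch below assumes that shape; the `Label` fields and the `ancestry` helper are hypothetical and do not reflect the server's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Label:
    id: str
    name: str
    parent_id: Optional[str] = None  # None marks a root label


def ancestry(labels: dict[str, Label], label_id: str) -> list[str]:
    """Return label names from the root down to label_id."""
    path: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = label_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at label {current!r}")
        seen.add(current)
        label = labels[current]
        path.append(label.name)
        current = label.parent_id
    return list(reversed(path))


# Hypothetical label bank with a three-level hierarchy.
bank = {
    "1": Label("1", "animal"),
    "2": Label("2", "dog", parent_id="1"),
    "3": Label("3", "puppy", parent_id="2"),
}
lineage = ancestry(bank, "3")
```

Walking up the hierarchy this way lets a match on a specific label also count toward its broader categories.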

Storage Architecture

Organized storage structure:

/assets/
  /<tenant_id>/
    /<user_id>/
      /<media_id>/
        /media/          # Original uploaded file
        /hls/            # HLS playlists and segments
        /dash/           # DASH manifest and segments
        /webm/           # WebM encoded files
        /webp/           # WebP converted images
        /thumbnails/     # Thumbnail images
        /posters/        # Poster images
        /transcripts/    # Transcript files
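A path helper for this layout might look like the following sketch. The `/assets` root and subdirectory names come from the tree above; the function name and the validation rules are assumptions.

```python
from pathlib import PurePosixPath

ASSET_ROOT = PurePosixPath("/assets")
SUBDIRS = {"media", "hls", "dash", "webm", "webp", "thumbnails", "posters", "transcripts"}


def asset_dir(tenant_id: str, user_id: str, media_id: str, kind: str) -> PurePosixPath:
    """Build /assets/<tenant_id>/<user_id>/<media_id>/<kind>."""
    if kind not in SUBDIRS:
        raise ValueError(f"unknown asset kind: {kind!r}")
    for part in (tenant_id, user_id, media_id):
        # Reject empty IDs and anything that could escape the tenant directory.
        if not part or "/" in part or part in {".", ".."}:
            raise ValueError(f"invalid path component: {part!r}")
    return ASSET_ROOT / tenant_id / user_id / media_id / kind


# Hypothetical IDs for illustration.
hls_dir = asset_dir("tenant-1", "user-7", "media-42", "hls")
```

Validating each component before joining keeps one tenant's request from resolving to a path under another tenant's directory.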

Sample Applications

iOS Sample Application (QasarMedia)

A complete SwiftUI iOS application demonstrating integration with the media server.

Features

  • Vertical Video Feed: TikTok-style scrollable feed with autoplay
  • Smart Playback: Autoplay when video reaches top 25% of screen
  • Infinite Scroll: Pagination for seamless content browsing
  • Media Upload: Chunked upload with progress tracking
  • Media Download: Save videos to Photos or Files app
  • Settings Management: Configure server URL, tenant ID, and user ID
  • HLS Playback: Native AVPlayer integration with adaptive streaming
  • Label Management: View and manage labels for uploaded content
  • Creator View: Upload and label media content

Technical Stack

  • SwiftUI for modern UI
  • AVFoundation for media playback
  • Combine for reactive programming
  • CoreML for on-device ML inference
  • Vision Framework for computer vision tasks

Web Application (Flask)

A Flask-based web application for media management and moderation.

Features

  • Media Feed: Browse uploaded media with pagination
  • Creator Interface: Upload and manage media content
  • Label Management: Create and manage label banks
  • Moderation Interface: Review and moderate content using moderation labels
  • Settings: Configure server and tenant settings
  • Tailwind CSS: Modern, responsive UI styling

Blueprints

  • feed: Media browsing and viewing
  • creator: Media upload and creation
  • labels: General label management
  • moderation_labels: Content moderation labels
  • settings: Application configuration

iOS Packages

Two Swift Package Manager packages for easy integration into new or existing iOS applications.

MediaUpload Package

A complete solution for uploading media to the Qasar Media Server.

Components

  • MediaUploadService: Main service for coordinating uploads
  • UploadCoordinator: Manages upload sessions and chunk coordination
  • UploadAPIClient: REST API client for server communication
  • UploadProgressView: SwiftUI view for displaying upload progress
  • VideoLabeler: Protocol for on-device video labeling (optional)
  • UploadTypes: Type definitions for upload operations

Features

  • Chunked Upload: Automatic chunking and upload coordination
  • Resume Support: Resume interrupted uploads
  • Progress Tracking: Real-time upload progress callbacks
  • Error Handling: Comprehensive error handling and retry logic
  • Label Integration: Optional on-device labeling before upload
  • Background Upload: Support for background upload tasks

Usage

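// Create the upload service. The labeler is optional.
let uploadService = MediaUploadService(
    apiClient: UploadAPIClient(baseURL: serverURL),
    labeler: yourLabelingService // Optional
)

// Call from an async context; upload errors are thrown to the caller.
try await uploadService.uploadVideo(
    url: videoURL,
    tenantId: tenantUUID,
    userId: userUUID
)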

VideoPlayer Package

A high-performance video player component with adaptive streaming support.

Components

  • VideoPlayerView: SwiftUI view for video playback
  • VideoPlayerManager: Centralized player management and prewarming
  • VideoPlayerItem: Protocol for media items
  • ThumbnailCache: Efficient thumbnail caching system

Features

  • Adaptive Streaming: HLS and DASH support with automatic quality selection
  • Player Prewarming: Preload next videos for seamless playback
  • Thumbnail Scrubbing: Preview thumbnails during scrubbing
  • Aspect Ratio Handling: Correct SAR/DAR calculation for proper display
  • Poster Images: Display poster images before playback starts
  • Memory Management: Efficient player lifecycle management
  • Tap Gestures: Customizable tap handling for play/pause

Usage

VideoPlayerView(
    item: mediaItem,
    onTap: { /* Handle tap */ },
    onScrubStart: { /* Handle scrub start */ },
    onScrubEnd: { time in /* Handle scrub end */ }
)

AI Technologies for Labeling and Moderation

Advanced on-device AI/ML technologies for automatic content labeling and moderation.

MobileCLIP Integration

Semantic understanding using Apple's MobileCLIP models:

  • Image Embeddings: Extract semantic embeddings from video frames
  • Text Embeddings: Generate embeddings for label phrases using multiple templates
  • Similarity Matching: Match video frames to labels using cosine similarity
  • Template-based Prompts: Multiple prompt templates for improved accuracy
      • "a photo of {label}"
      • "a video frame of {label}"
      • "a close-up photo of {label}"
      • "a scene of {label}"
      • "a photo of the {label}"
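One common way to combine several prompt templates is to average their text embeddings into a single label embedding and compare it to each frame embedding by cosine similarity. The sketch below illustrates that idea with toy vectors standing in for MobileCLIP outputs. The averaging step is an assumption about how the templates are combined, and all function names are hypothetical.

```python
import math

TEMPLATES = [
    "a photo of {label}",
    "a video frame of {label}",
    "a close-up photo of {label}",
    "a scene of {label}",
    "a photo of the {label}",
]


def render_prompts(label: str) -> list[str]:
    """Fill every template with the label phrase."""
    return [t.format(label=label) for t in TEMPLATES]


def normalize(v: list[float]) -> list[float]:
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def mean_embedding(vectors: list[list[float]]) -> list[float]:
    """Average unit-length prompt embeddings into one unit-length label embedding."""
    unit = [normalize(v) for v in vectors]
    dim = len(unit[0])
    return normalize([sum(v[i] for v in unit) / len(unit) for i in range(dim)])


def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def rank_labels(frame: list[float], labels: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Score each label against a frame embedding, best match first."""
    scores = {name: cosine(frame, emb) for name, emb in labels.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 3-dimensional embeddings; a real model produces much longer vectors.
label_bank = {
    "dog": mean_embedding([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    "cat": mean_embedding([[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]),
}
ranked = rank_labels([0.8, 0.2, 0.0], label_bank)
```

With these toy vectors, "dog" ranks first because the frame embedding points mostly along the same axis as the dog prompts.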

YOLO Object Detection

Real-time object detection using YOLOv8:

  • CoreML Integration: Native CoreML model support
  • Vision Framework: Integration with Apple's Vision framework
  • Custom Label Mapping: Map detected objects to label bank entries
  • Confidence Scoring: Track detection confidence for moderation decisions
  • Batch Processing: Efficient frame-by-frame detection

Vision ROI Analyzer

Intelligent region-of-interest analysis:

  • Face Detection: Detect faces in video frames
  • Landmark Detection: Identify facial landmarks for selfie detection
  • Animal Detection: Recognize animals in content
  • Object Tracking: Track objects across frames

Labeling Service

Comprehensive labeling pipeline:

  • Frame Sampling: Intelligent frame sampling at configurable FPS
  • Keyframe Detection: Prioritize keyframes for efficient processing
  • Multi-model Fusion: Combine CLIP and YOLO results
  • Confidence Thresholding: Filter labels by confidence scores
  • Temporal Aggregation: Aggregate labels across video duration
  • Label Bank Integration: Match against tenant-specific label banks
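The fusion, thresholding, and temporal aggregation steps can be sketched as follows. This is an illustrative assumption rather than the pipeline's actual logic: per-frame scores are fused by taking the higher of the CLIP and YOLO confidences, and a label is kept only if it clears the threshold in a minimum fraction of sampled frames. The function names and the `threshold` and `min_fraction` parameters are hypothetical.

```python
from collections import defaultdict

Scores = dict[str, float]  # label name -> confidence for one sampled frame


def fuse(clip_scores: Scores, yolo_scores: Scores) -> Scores:
    """Combine two models' scores for one frame by keeping the higher confidence."""
    labels = clip_scores.keys() | yolo_scores.keys()
    return {name: max(clip_scores.get(name, 0.0), yolo_scores.get(name, 0.0)) for name in labels}


def aggregate(frames: list[Scores], threshold: float = 0.5, min_fraction: float = 0.4) -> Scores:
    """Keep labels that clear the threshold in enough frames; report their mean confidence."""
    if not frames:
        return {}
    hits: dict[str, list[float]] = defaultdict(list)
    for scores in frames:
        for name, confidence in scores.items():
            if confidence >= threshold:
                hits[name].append(confidence)
    return {
        name: sum(values) / len(values)
        for name, values in hits.items()
        if len(values) / len(frames) >= min_fraction
    }


# Hypothetical per-frame outputs for five sampled frames.
clip = [{"dog": 0.9, "ball": 0.4}, {"dog": 0.6}, {"dog": 0.8, "ball": 0.6}, {"cat": 0.55}, {}]
yolo = [{"dog": 0.7}, {"dog": 0.7}, {}, {}, {}]
video_labels = aggregate([fuse(c, y) for c, y in zip(clip, yolo)])
```

In this example only "dog" survives: it clears the threshold in three of five frames, while "ball" and "cat" each appear confidently in just one.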

Label Banks

Flexible label management system:

  • General Labels: Content categorization (objects, scenes, activities)
  • Moderation Labels: Safety and content moderation labels
  • Phrase Variants: Multiple phrases per label for improved matching
  • Statistics Tracking: Track label usage and performance metrics
  • Tenant Isolation: Separate label banks per tenant

Smart Reframe Transforms

Intelligent video composition:

  • ROI-based Cropping: Crop videos based on detected regions of interest
  • Aspect Ratio Adaptation: Adapt content for different display formats
  • Composition Service: Generate optimal video compositions
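ROI-based cropping amounts to choosing the largest crop of the target aspect ratio that fits in the frame, centering it on the region of interest, and clamping it to the frame edges. The sketch below shows that geometry under those assumptions; `crop_for_roi` is a hypothetical helper, not the composition service's API.

```python
def crop_for_roi(
    frame_w: int,
    frame_h: int,
    roi: tuple[int, int, int, int],
    aspect_w: int,
    aspect_h: int,
) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of a crop with ratio aspect_w:aspect_h around roi.

    roi is (x, y, width, height) in frame pixels.
    """
    roi_x, roi_y, roi_w, roi_h = roi
    # Pick the largest crop of the target ratio that fits inside the frame.
    if frame_w * aspect_h > frame_h * aspect_w:
        crop_h = frame_h                         # frame is wider than target: use full height
        crop_w = frame_h * aspect_w // aspect_h
    else:
        crop_w = frame_w                         # frame is taller than target: use full width
        crop_h = frame_w * aspect_h // aspect_w
    center_x = roi_x + roi_w // 2
    center_y = roi_y + roi_h // 2
    # Center on the ROI, then clamp so the crop stays inside the frame.
    x = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    y = min(max(center_y - crop_h // 2, 0), frame_h - crop_h)
    return x, y, crop_w, crop_h


# Reframe a 1920x1080 landscape frame to 9:16 portrait around a detected region.
crop = crop_for_roi(1920, 1080, (1500, 300, 200, 200), 9, 16)
```

Some encoders require even dimensions, so a production version would likely round the crop width and height down to a multiple of two.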

Adaptive Playback and Packaging

High-quality adaptive streaming with multiple format support.

HLS (HTTP Live Streaming)

Multi-resolution adaptive streaming:

  • Multi-resolution Support: 1080p, 720p, and 480p variants
  • 5-second Segments: Balances startup and quality-switch latency against per-segment request overhead
  • Master Playlist: Automatic playlist generation with bandwidth hints
  • H.264 Encoding: Broad device compatibility
  • AAC Audio: High-quality audio encoding
  • Independent Segments: Enable efficient seeking and caching

Encoding Profiles:
- 1080p: 5000k video bitrate, 128k audio
- 720p: 3000k video bitrate, 128k audio
- 480p: 1500k video bitrate, 128k audio
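A master playlist advertising these three profiles could be generated as in the sketch below. The bitrates come from the list above. The pixel dimensions assume 16:9 sources, the variant playlist paths are hypothetical, and `BANDWIDTH` is approximated as the nominal video plus audio bitrate rather than a measured peak.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    width: int       # assumed 16:9 dimensions, not stated in the profile list
    height: int
    video_kbps: int
    audio_kbps: int


PROFILES = [
    Profile("1080p", 1920, 1080, 5000, 128),
    Profile("720p", 1280, 720, 3000, 128),
    Profile("480p", 854, 480, 1500, 128),
]


def master_playlist(profiles: list[Profile]) -> str:
    """Render an HLS master playlist with one variant stream per profile."""
    lines = ["#EXTM3U", "#EXT-X-INDEPENDENT-SEGMENTS"]
    for p in profiles:
        bandwidth = (p.video_kbps + p.audio_kbps) * 1000  # bits per second
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={p.width}x{p.height}"
        )
        lines.append(f"{p.name}/playlist.m3u8")  # hypothetical variant playlist path
    return "\n".join(lines) + "\n"


playlist_text = master_playlist(PROFILES)
```

Players use the `BANDWIDTH` attribute to pick a variant for the current network conditions, so the 1080p entry here advertises 5,128,000 bits per second.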

DASH (Dynamic Adaptive Streaming over HTTP)

Alternative adaptive streaming format:

  • MPD Manifest: Media Presentation Description generation
  • H.264 Encoding: Consistent codec support
  • Multi-bitrate Support: Automatic quality selection

WebM Support

Modern web format encoding:

  • VP9 Video: Efficient video compression
  • Opus Audio: High-quality audio codec
  • WebP Images: Modern image format with superior compression

Thumbnail Generation

Comprehensive thumbnail support:

  • Video Thumbnails: Up to 25 thumbnails per video (configurable interval)
  • Spectrogram Thumbnails: For audio files
  • Image Thumbnails: Scaled versions of images
  • WebP Format: Efficient thumbnail storage
  • Timestamp Association: Each thumbnail linked to video timestamp
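One way to honor both a configurable interval and the 25-thumbnail cap is to widen the interval whenever the video is too long. The sketch below assumes that policy; the function name and parameters are hypothetical, and the server may select timestamps differently.

```python
import math


def thumbnail_timestamps(duration: float, interval: float, max_count: int = 25) -> list[float]:
    """Timestamps in seconds, spaced by interval, widened if needed to stay within max_count."""
    if duration <= 0 or interval <= 0 or max_count <= 0:
        raise ValueError("duration, interval, and max_count must be positive")
    count = math.ceil(duration / interval)  # timestamps 0, interval, ... all before the end
    if count > max_count:
        interval = duration / max_count     # spread max_count thumbnails evenly instead
        count = max_count
    return [round(i * interval, 3) for i in range(count)]


short_clip = thumbnail_timestamps(duration=60, interval=10)
long_video = thumbnail_timestamps(duration=600, interval=10)
```

A one-minute clip keeps the requested 10-second spacing, while a ten-minute video falls back to 25 thumbnails spaced 24 seconds apart.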

Poster Images

High-quality poster generation:

  • Video Posters: Extracted from video content using thumbnail algorithm
  • Audio Posters: Spectrogram-based posters
  • Image Posters: Scaled versions of images
  • Optimal Sizing: 1280px width for video posters

Transcript Support

Text content extraction:

  • Transcript Storage: Organized transcript file management
  • API Integration: Ready for transcription service integration
  • Structured Format: Plain text transcripts with metadata

Processing Features

Advanced processing capabilities:

  • Asynchronous Processing: Background worker processing
  • Error Handling: Comprehensive error tracking and reporting
  • Status Tracking: Real-time processing status updates
  • Requeue Support: Retry failed processing jobs
  • Metadata Extraction: Duration, dimensions, aspect ratios
  • Format Detection: Automatic media type detection
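Status tracking and requeue support together imply a small state machine over the four documented statuses (queued, processing, ready, error). The transition table below is an assumption made for illustration; the server's actual rules may differ.

```python
from enum import Enum


class Status(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    READY = "ready"
    ERROR = "error"


# Assumed transitions: a failed job may be requeued; a ready asset is final.
ALLOWED = {
    Status.QUEUED: {Status.PROCESSING},
    Status.PROCESSING: {Status.READY, Status.ERROR},
    Status.ERROR: {Status.QUEUED},
    Status.READY: set(),
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


# A job that fails once, is requeued, and is picked up again.
state = Status.QUEUED
for step in (Status.PROCESSING, Status.ERROR, Status.QUEUED, Status.PROCESSING):
    state = transition(state, step)
```

Rejecting illegal moves in one place keeps workers from, for example, overwriting a ready asset with a stale queued status.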

API Endpoints

RESTful API for accessing processed media:

  • Media Metadata: /api/v1/media/{media_id}
  • HLS Playback: /api/v1/media/{media_id}/hls/{file_path}
  • DASH Playback: /api/v1/media/{media_id}/dash/{file_path}
  • Thumbnail Access: /api/v1/media/{media_id}/serve/thumbnail/{file_path}
  • Poster Access: /api/v1/media/{media_id}/serve/poster/{file_path}
  • Transcript Access: /api/v1/media/{media_id}/serve/transcript/{file_path}
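A client might assemble these URLs with a small helper like the sketch below. The route templates are copied from the list above; the helper name, the example host, and the example file name are hypothetical.

```python
from urllib.parse import quote

ROUTES = {
    "metadata": "/api/v1/media/{media_id}",
    "hls": "/api/v1/media/{media_id}/hls/{file_path}",
    "dash": "/api/v1/media/{media_id}/dash/{file_path}",
    "thumbnail": "/api/v1/media/{media_id}/serve/thumbnail/{file_path}",
    "poster": "/api/v1/media/{media_id}/serve/poster/{file_path}",
    "transcript": "/api/v1/media/{media_id}/serve/transcript/{file_path}",
}


def media_url(base_url: str, media_id: str, kind: str = "metadata", file_path: str = "") -> str:
    """Build the URL for a media resource, percent-encoding the variable parts."""
    if kind not in ROUTES:
        raise ValueError(f"unknown resource kind: {kind!r}")
    if kind != "metadata" and not file_path:
        raise ValueError(f"{kind!r} requires a file_path")
    path = ROUTES[kind].format(
        media_id=quote(media_id, safe=""),
        file_path=quote(file_path),  # keeps "/" so nested segment paths still work
    )
    return base_url.rstrip("/") + path


# Hypothetical host, media ID, and playlist name.
hls_url = media_url("https://media.example.com/", "abc123", "hls", "master.m3u8")
meta_url = media_url("https://media.example.com", "abc123")
```

The resulting `hls_url` can be handed directly to a player, while `meta_url` returns the asset's metadata.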

Technology Stack

Backend

  • FastAPI: Modern Python web framework
  • PostgreSQL: Relational database with JSONB support
  • Redis: Job queue and caching
  • Alembic: Database migrations
  • FFmpeg: Media transcoding
  • HlsKit-Py: Advanced HLS processing (optional)

iOS

  • Swift 5.9+: Modern Swift language features
  • SwiftUI: Declarative UI framework
  • AVFoundation: Media playback and processing
  • CoreML: On-device machine learning
  • Vision Framework: Computer vision tasks
  • Combine: Reactive programming

Web

  • Flask: Python web framework
  • Tailwind CSS: Utility-first CSS framework
  • JavaScript: Client-side interactivity

Deployment

The repository includes comprehensive deployment support:

  • Docker: Multi-stage Dockerfiles for different components
  • Docker Compose: Local development environment
  • Nginx Configuration: Production-ready reverse proxy setup
  • Deployment Scripts: Automated deployment workflows
  • Multi-architecture Support: ARM64 and x86_64 support

Summary

The Qasar Media Server provides a complete solution for:

  1. Scalable Media Upload: Chunked uploads with resume support
  2. Intelligent Processing: Automated transcoding to multiple adaptive formats
  3. High-Quality Playback: HLS/DASH adaptive streaming with multi-resolution support
  4. AI-Powered Labeling: On-device CLIP and YOLO integration for automatic labeling
  5. Content Moderation: Label bank system for safety and compliance
  6. Easy Integration: Swift packages for seamless iOS integration
  7. Sample Applications: Complete reference implementations

It is suited to building media applications that need on-device AI labeling and adaptive streaming.