GPU Rendering

Hydra uses Apple’s Metal graphics API to render Nintendo Switch games on macOS. The rendering system translates Switch shaders into a form Metal can load and submits GPU work through Metal command buffers.

Metal Rendering Pipeline

The Metal renderer handles all GPU operations, from shader compilation to final frame presentation.

Architecture Overview

class Renderer : public RendererBase {
  public:
    // Surface
    void SetSurface(void* surface) override;
    ISurfaceCompositor* AcquireNextSurface() override;

    // Buffer management
    BufferBase* CreateBuffer(u64 size) override;
    BufferBase* AllocateTemporaryBuffer(const u64 size) override;
    
    // Texture operations
    TextureBase* CreateTexture(const TextureDescriptor& descriptor) override;
    
    // Shader compilation
    ShaderBase* CreateShader(const ShaderDescriptor& descriptor) override;
    
    // Pipeline state
    PipelineBase* CreatePipeline(const PipelineDescriptor& descriptor) override;
    void BindPipeline(const PipelineBase* pipeline) override;
    
    // Draw commands
    void Draw(ICommandBuffer* command_buffer,
              const engines::PrimitiveType primitive_type,
              const u32 start, const u32 count,
              const u32 base_instance, const u32 instance_count) override;
              
  private:
    MTL::Device* device;
    MTL::CommandQueue* command_queue;
    CA::MetalLayer* layer;
};

Key Components

Metal Device & Command Queue

The renderer maintains a Metal device and command queue for GPU operations:

MTL::Device* device;
MTL::CommandQueue* command_queue;

All rendering commands are submitted through the command queue to the GPU.

State Management

The renderer tracks the current rendering state (render pass, viewports, scissors, pipeline, and bound resources) in a single struct:

struct State {
    const RenderPass* render_pass;
    Viewport viewports[VIEWPORT_COUNT];
    Scissor scissors[VIEWPORT_COUNT];
    const Pipeline* pipeline;
    BufferView index_buffer;
    std::array<BufferView, VERTEX_ARRAY_COUNT> vertex_buffers;
    std::array<std::array<BufferView, CONST_BUFFER_BINDING_COUNT>,
               usize(ShaderType::Count)> uniform_buffers;
    std::array<std::array<CombinedTextureSampler, TEXTURE_BINDING_COUNT>,
               usize(ShaderType::Count)> textures;
};
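Tracked state of this kind is commonly used to skip redundant binds. The following is a minimal sketch of that idea, not Hydra's actual code: `FakePipeline` and `TrackedState` are hypothetical names, and the counter stands in for calls that would reach a Metal encoder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a pipeline object; not a Hydra type.
struct FakePipeline {
    std::uint32_t id;
};

// Remembers the last bound pipeline and counts how many binds
// would actually be forwarded to the GPU encoder.
class TrackedState {
  public:
    void BindPipeline(const FakePipeline* pipeline) {
        assert(pipeline != nullptr);
        if (pipeline == current_) {
            return; // Already bound: skip the redundant encoder call.
        }
        current_ = pipeline;
        ++encoder_binds_;
    }

    // Called when a new encoder starts, since bindings do not carry over.
    void Invalidate() { current_ = nullptr; }

    std::uint32_t encoder_binds() const { return encoder_binds_; }

  private:
    const FakePipeline* current_ = nullptr;
    std::uint32_t encoder_binds_ = 0;
};
```

Binding the same pipeline twice in a row increments the counter once; after `Invalidate()`, the next bind is forwarded again.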

Pipeline Caches

Several caches avoid recreating depth-stencil state and internal blit and clear pipelines each time they are needed:

DepthStencilStateCache* depth_stencil_state_cache;
BlitPipelineCache* blit_pipeline_cache;
ClearColorPipelineCache* clear_color_pipeline_cache;
ClearDepthPipelineCache* clear_depth_pipeline_cache;
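The internals of these caches are not shown here. A typical create-on-miss cache, keyed by a hashed descriptor, might look like the sketch below. `DepthStencilKey`, `DepthStencilKeyHash`, and `StateCache` are hypothetical names, and the key holds far fewer fields than a real depth-stencil descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical, reduced descriptor used as the cache key.
struct DepthStencilKey {
    bool depth_test;
    bool depth_write;
    std::uint8_t compare_op; // Assumed to be in the range 0..7.

    bool operator==(const DepthStencilKey& o) const {
        return depth_test == o.depth_test && depth_write == o.depth_write &&
               compare_op == o.compare_op;
    }
};

struct DepthStencilKeyHash {
    std::size_t operator()(const DepthStencilKey& k) const {
        assert(k.compare_op < 8); // The packing below reserves 3 bits for it.
        const std::uint32_t packed = (std::uint32_t(k.depth_test) << 4) |
                                     (std::uint32_t(k.depth_write) << 3) |
                                     k.compare_op;
        return std::hash<std::uint32_t>{}(packed);
    }
};

// Generic cache: builds the value with `create` only on the first lookup.
template <typename Key, typename Value, typename Hash>
class StateCache {
  public:
    template <typename Create>
    const Value& Find(const Key& key, Create&& create) {
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            it = entries_.emplace(key, create(key)).first;
            ++misses_;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }
    std::size_t size() const { return entries_.size(); }

  private:
    std::unordered_map<Key, Value, Hash> entries_;
    std::size_t misses_ = 0;
};
```

In a real renderer the cached value would be a Metal state object rather than a plain value, and `create` would call into the device.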

Shader Compilation

Hydra supports two shader backends for compiling Nintendo Switch shaders to Metal:

MSL Backend

Translates shaders to Metal Shading Language source code

AIR Backend

Loads precompiled Apple Intermediate Representation (AIR) binaries, avoiding runtime source compilation

Metal Shading Language (MSL)

The MSL backend converts Nintendo Switch GPU shaders to Metal Shading Language source code, which is then compiled at runtime.

Shader::Shader(const ShaderDescriptor& descriptor) : ShaderBase(descriptor) {
    // Compile options
    MTL::CompileOptions* options = MTL::CompileOptions::alloc()->init();
    // Always enabled in this snippet; intended to be configurable
    options->setPreserveInvariance(true);

    // MSL compilation
    switch (descriptor.backend) {
    case ShaderBackend::Msl: {
        // Convert the shader code bytes to a string
        std::string source;
        source.assign(descriptor.code.begin(), descriptor.code.end());

        // Compile MSL source to a Metal library
        NS::Error* error = nullptr; // Must be initialized: only set on failure
        library = METAL_RENDERER_INSTANCE.GetDevice()->newLibrary(
            ToNSString(source), options, &error);
        if (error) {
            LOG_ERROR(MetalRenderer, "Failed to create Metal library: {}",
                      error->localizedDescription()->utf8String());
            options->release();
            return;
        }
        break;
    }
    }

    // The options object is no longer needed once the library exists
    options->release();

    // Extract the entry point, then drop the library reference
    function = library->newFunction(ToNSString("main_"));
    library->release();
}

MSL Features

Runtime Compilation

MSL shaders are compiled at runtime, allowing for:

Dynamic shader modifications
Easier debugging with readable source code
Shader hot-reloading during development

Invariance Preservation

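The compiler is asked to preserve invariance:

options->setPreserveInvariance(true);

With this option set, Metal compiles vertex shaders conservatively so that position calculations stay consistent. This helps avoid artifacts such as depth mismatches when the same geometry is drawn by different shaders or in multiple passes.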

Fast Math (Optional)

Fast math optimizations can be enabled for performance, at the cost of strict IEEE 754 floating-point behavior:

options->setFastMathEnabled(true); // Configurable

Apple Intermediate Representation (AIR)

The AIR backend uses Apple’s intermediate representation format for potentially better performance.

case ShaderBackend::Air: {
    // Create dispatch data from AIR binary
    auto dispatch_data =
        dispatch_data_create(descriptor.code.data(), 
                           descriptor.code.size(),
                           dispatch_get_global_queue(0, 0),
                           ^{});
    
    // Load AIR library
    NS::Error* error = nullptr; // Must be initialized: only set on failure
    library = METAL_RENDERER_INSTANCE.GetDevice()->newLibrary(
        dispatch_data, &error);
    if (error) {
        LOG_ERROR(MetalRenderer, "Failed to create Metal library: {}",
                  error->localizedDescription()->utf8String());
        return;
    }
    break;
}

AIR Features

Pre-compiled Shaders

AIR shaders are pre-compiled to an intermediate format:

Faster load times (no runtime compilation)
Reduced CPU overhead during shader loading
More consistent performance

Binary Format

AIR uses a binary format that’s loaded directly:

auto dispatch_data = dispatch_data_create(
    descriptor.code.data(), descriptor.code.size(), ...);

Metal Optimization

AIR benefits from Apple’s shader compiler optimizations:

Better instruction scheduling
Improved register allocation
GPU-specific optimizations

Shader Backend Comparison

MSL backend, pros:

Easier to debug (human-readable)
Better for development
Supports runtime modifications
No pre-compilation step needed

MSL backend, cons:

Runtime compilation overhead
Slightly slower initial load
More CPU usage during shader creation

Render Pass System

The renderer organizes rendering operations into render passes, which are created from a descriptor and then bound:

void BindRenderPass(const RenderPassBase* render_pass) override;
RenderPassBase* CreateRenderPass(const RenderPassDescriptor& descriptor) override;

Render passes group related drawing operations together for optimal GPU performance.
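One way to get this grouping is to defer encoder creation until a draw is issued, so consecutive draws into the same render pass share one encoder. The sketch below illustrates the idea only. `FakeRenderPass` and `EncoderTracker` are hypothetical names, and whether Hydra defers encoder creation this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a render pass object; not a Hydra type.
struct FakeRenderPass {
    std::uint32_t id;
};

// Starts a new (simulated) encoder only when a draw targets a
// different render pass than the previous draw did.
class EncoderTracker {
  public:
    void BindRenderPass(const FakeRenderPass* render_pass) {
        assert(render_pass != nullptr);
        pending_ = render_pass; // Cheap: nothing is encoded yet.
    }

    void Draw() {
        assert(pending_ != nullptr); // A render pass must be bound first.
        if (pending_ != active_) {
            active_ = pending_;
            ++encoders_started_;
        }
        ++draws_;
    }

    std::uint32_t encoders_started() const { return encoders_started_; }
    std::uint32_t draws() const { return draws_; }

  private:
    const FakeRenderPass* pending_ = nullptr;
    const FakeRenderPass* active_ = nullptr;
    std::uint32_t encoders_started_ = 0;
    std::uint32_t draws_ = 0;
};
```

Binding pass B and then pass A again without drawing in between starts no new encoder, because the switch only takes effect at the next draw.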

Resource Binding

Resources are bound to numbered slots. For example, vertex buffers are bound by index:

void BindVertexBuffer(const BufferView& buffer, u32 index);

Drawing Operations

The renderer supports both indexed and non-indexed drawing:

// Non-indexed drawing
void Draw(ICommandBuffer* command_buffer,
          const engines::PrimitiveType primitive_type,
          const u32 start, const u32 count,
          const u32 base_instance, const u32 instance_count);

// Indexed drawing
void DrawIndexed(ICommandBuffer* command_buffer,
                 const engines::PrimitiveType primitive_type,
                 const u32 start, const u32 count,
                 const u32 base_vertex, const u32 base_instance,
                 const u32 instance_count);
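To make the indexed draw parameters concrete, the sketch below expands a draw into the vertex and instance IDs a vertex shader would see, following the usual Metal convention (index value plus base vertex, instance number plus base instance). `FetchedVertex` and `ExpandIndexedDraw` are hypothetical helpers, and treating `start` as an element offset into the index buffer is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// IDs delivered to the vertex shader for one invocation.
struct FetchedVertex {
    std::uint32_t vertex_id;
    std::uint32_t instance_id;
};

// Expands an indexed, instanced draw on the CPU for illustration.
// Assumes `start` counts index elements, not bytes.
std::vector<FetchedVertex> ExpandIndexedDraw(
    const std::vector<std::uint32_t>& indices, std::uint32_t start,
    std::uint32_t count, std::uint32_t base_vertex,
    std::uint32_t base_instance, std::uint32_t instance_count) {
    assert(std::size_t(start) + count <= indices.size());
    std::vector<FetchedVertex> out;
    out.reserve(std::size_t(count) * instance_count);
    for (std::uint32_t i = 0; i < instance_count; ++i) {
        for (std::uint32_t k = 0; k < count; ++k) {
            // The index value is offset by base_vertex; instances start
            // at base_instance.
            out.push_back({indices[start + k] + base_vertex,
                           base_instance + i});
        }
    }
    return out;
}
```

For a non-indexed draw the index lookup disappears and the vertex IDs simply run from `start` to `start + count - 1`.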

Performance Considerations

Shader Caching

Compiled shaders are cached to avoid recompilation

Pipeline Caching

Pipeline states are cached for fast switching

Command Buffers

Commands are batched for efficient GPU submission

Resource Pooling

Temporary resources are pooled and reused
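As an illustration of resource pooling, here is a minimal sketch of a size-keyed pool for temporary buffers. It is not Hydra's implementation of `AllocateTemporaryBuffer`. `PooledBuffer` and `TemporaryBufferPool` are hypothetical names, and plain host memory stands in for a Metal buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical buffer; host memory stands in for GPU memory.
struct PooledBuffer {
    explicit PooledBuffer(std::size_t size) : storage(size) {}
    std::vector<std::uint8_t> storage;
};

class TemporaryBufferPool {
  public:
    // Returns the smallest free buffer that fits, or allocates a new one.
    std::unique_ptr<PooledBuffer> Acquire(std::size_t size) {
        assert(size > 0);
        auto it = free_.lower_bound(size);
        if (it != free_.end()) {
            auto buffer = std::move(it->second);
            free_.erase(it);
            return buffer;
        }
        ++allocations_;
        return std::make_unique<PooledBuffer>(size);
    }

    // Hands a buffer back to the pool for later reuse.
    void Release(std::unique_ptr<PooledBuffer> buffer) {
        assert(buffer != nullptr);
        const std::size_t size = buffer->storage.size();
        free_.emplace(size, std::move(buffer));
    }

    std::size_t allocations() const { return allocations_; }
    std::size_t free_count() const { return free_.size(); }

  private:
    std::multimap<std::size_t, std::unique_ptr<PooledBuffer>> free_;
    std::size_t allocations_ = 0;
};
```

In a GPU setting, a buffer should only be released back to the pool once the command buffer that used it has completed; this sketch leaves that synchronization out.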

Graphics Capture Support

The renderer supports Metal frame capture for debugging:

void BeginCapture() override;
void EndCapture() override;

Use Xcode’s Metal debugger to capture and analyze frames rendered by Hydra.

Configuration

The shader backend is configured in the settings:

MSL: Best for development and debugging
AIR: Best for performance and release builds

See the Configuration page for details on the shader_backend option.
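As a rough sketch of how a `shader_backend` setting could map onto the `ShaderBackend` enum used in the snippets above: the string values "msl" and "air" and the helper names `ParseShaderBackend` and `ToString` are assumptions for illustration, so check the Configuration page for the values Hydra accepts.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Mirrors the two enumerators that appear in the shader snippets.
enum class ShaderBackend { Msl, Air };

// Hypothetical parser; the accepted spellings are assumed.
std::optional<ShaderBackend> ParseShaderBackend(std::string_view value) {
    if (value == "msl") return ShaderBackend::Msl;
    if (value == "air") return ShaderBackend::Air;
    return std::nullopt; // Unknown value: let the caller pick a default.
}

// Inverse mapping, useful when writing the setting back out.
std::string_view ToString(ShaderBackend backend) {
    switch (backend) {
    case ShaderBackend::Msl: return "msl";
    case ShaderBackend::Air: return "air";
    }
    assert(false && "unhandled ShaderBackend");
    return "";
}
```

Returning `std::optional` keeps the fallback policy (for example, defaulting to MSL during development) in the caller rather than in the parser.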
