Hydra uses Apple’s Metal graphics API to render Nintendo Switch games on macOS. The rendering system translates Switch GPU shaders into a form Metal can load and submits work to the GPU through Metal command buffers.

Metal Rendering Pipeline

The Metal renderer handles all GPU operations, from shader compilation to final frame presentation.

Architecture Overview

class Renderer : public RendererBase {
  public:
    // Surface
    void SetSurface(void* surface) override;
    ISurfaceCompositor* AcquireNextSurface() override;

    // Buffer management
    BufferBase* CreateBuffer(u64 size) override;
    BufferBase* AllocateTemporaryBuffer(const u64 size) override;
    
    // Texture operations
    TextureBase* CreateTexture(const TextureDescriptor& descriptor) override;
    
    // Shader compilation
    ShaderBase* CreateShader(const ShaderDescriptor& descriptor) override;
    
    // Pipeline state
    PipelineBase* CreatePipeline(const PipelineDescriptor& descriptor) override;
    void BindPipeline(const PipelineBase* pipeline) override;
    
    // Draw commands
    void Draw(ICommandBuffer* command_buffer,
              const engines::PrimitiveType primitive_type,
              const u32 start, const u32 count,
              const u32 base_instance, const u32 instance_count) override;
              
  private:
    MTL::Device* device;
    MTL::CommandQueue* command_queue;
    CA::MetalLayer* layer;
};

Key Components

The renderer maintains a Metal device and command queue for GPU operations:
MTL::Device* device;
MTL::CommandQueue* command_queue;
All rendering commands are submitted through the command queue to the GPU.

The renderer tracks the current rendering state, including the bound render pass, viewports, scissors, pipeline, buffers, and textures:
struct State {
    const RenderPass* render_pass;
    Viewport viewports[VIEWPORT_COUNT];
    Scissor scissors[VIEWPORT_COUNT];
    const Pipeline* pipeline;
    BufferView index_buffer;
    std::array<BufferView, VERTEX_ARRAY_COUNT> vertex_buffers;
    std::array<std::array<BufferView, CONST_BUFFER_BINDING_COUNT>,
               usize(ShaderType::Count)> uniform_buffers;
    std::array<std::array<CombinedTextureSampler, TEXTURE_BINDING_COUNT>,
               usize(ShaderType::Count)> textures;
};

Several caches avoid rebuilding identical state objects:
DepthStencilStateCache* depth_stencil_state_cache;
BlitPipelineCache* blit_pipeline_cache;
ClearColorPipelineCache* clear_color_pipeline_cache;
ClearDepthPipelineCache* clear_depth_pipeline_cache;
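
The caches listed above are only named here, not shown. As a rough illustration of the idea, the following sketch keys a cache on a descriptor so that an identical request returns the already created object. `DepthStencilDescriptor`, `DepthStencilState`, and `StateCacheSketch` are hypothetical stand-ins and are not Hydra's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical descriptor; a real cache would key on Metal state descriptors.
struct DepthStencilDescriptor {
    bool depth_test_enabled = false;
    bool depth_write_enabled = false;
    std::uint8_t compare_op = 0;

    bool operator==(const DepthStencilDescriptor& other) const {
        return depth_test_enabled == other.depth_test_enabled &&
               depth_write_enabled == other.depth_write_enabled &&
               compare_op == other.compare_op;
    }
};

// Packs the fields into distinct bit ranges, so equal descriptors hash equally.
struct DepthStencilDescriptorHash {
    std::size_t operator()(const DepthStencilDescriptor& d) const {
        return static_cast<std::size_t>(d.depth_test_enabled) |
               (static_cast<std::size_t>(d.depth_write_enabled) << 1) |
               (static_cast<std::size_t>(d.compare_op) << 2);
    }
};

// Stand-in for an expensive GPU state object.
struct DepthStencilState {
    DepthStencilDescriptor descriptor;
};

class StateCacheSketch {
  public:
    // Returns the cached state, creating it only on the first request.
    const DepthStencilState& Find(const DepthStencilDescriptor& descriptor) {
        auto it = states_.find(descriptor);
        if (it == states_.end()) {
            ++creation_count_;
            it = states_.emplace(descriptor, DepthStencilState{descriptor}).first;
        }
        return it->second;
    }

    std::size_t CreationCount() const { return creation_count_; }

  private:
    std::unordered_map<DepthStencilDescriptor, DepthStencilState,
                       DepthStencilDescriptorHash> states_;
    std::size_t creation_count_ = 0;
};
```

Requesting the same descriptor twice yields the same object and a creation count of one, which is the saving a state cache is meant to provide.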

Shader Compilation

Hydra supports two shader backends for compiling Nintendo Switch shaders to Metal:

MSL Backend

Translates shaders to Metal Shading Language source code

AIR Backend

Uses Apple Intermediate Representation for optimized performance

Metal Shading Language (MSL)

The MSL backend converts Nintendo Switch GPU shaders to Metal Shading Language source code, which is then compiled at runtime.
Shader::Shader(const ShaderDescriptor& descriptor) : ShaderBase(descriptor) {
    // Compile options
    MTL::CompileOptions* options = MTL::CompileOptions::alloc()->init();
    options->setPreserveInvariance(true); // Configurable

    // MSL compilation
    switch (descriptor.backend) {
    case ShaderBackend::Msl: {
        // Convert shader code to string
        std::string source(descriptor.code.begin(), descriptor.code.end());

        // Compile MSL source to Metal library
        NS::Error* error = nullptr;
        library = METAL_RENDERER_INSTANCE.GetDevice()->newLibrary(
            ToNSString(source), options, &error);
        // Check the library rather than the error: Metal may also report
        // warnings through the error object when compilation succeeds
        if (!library) {
            LOG_ERROR(MetalRenderer, "Failed to create Metal library: {}",
                      error ? error->localizedDescription()->utf8String()
                            : "unknown error");
            options->release();
            return;
        }
        break;
    }
    // Other backends omitted in this excerpt
    }
    options->release();

    // Extract main function
    function = library->newFunction(ToNSString("main_"));
    library->release();
}

MSL Features

MSL shaders are compiled at runtime, allowing for:
  • Dynamic shader modifications
  • Easier debugging with readable source code
  • Shader hot-reloading during development

The compiler preserves invariance so that position calculations produce identical results across shaders:
options->setPreserveInvariance(true);
This avoids depth mismatches, such as z-fighting, when the same geometry is drawn by more than one pipeline.

Fast math optimizations can be enabled for performance, at the cost of strict IEEE 754 conformance:
options->setFastMathEnabled(true); // Configurable

Apple Intermediate Representation (AIR)

The AIR backend uses Apple’s intermediate representation format for potentially better performance.
case ShaderBackend::Air: {
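    // Create dispatch data from AIR binary. Because a custom (empty)
    // destructor block is passed, the bytes are not copied, so
    // descriptor.code must stay alive while the data object is in use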
    auto dispatch_data =
        dispatch_data_create(descriptor.code.data(), 
                           descriptor.code.size(),
                           dispatch_get_global_queue(0, 0),
                           ^{});
    
    // Load AIR library
    NS::Error* error = nullptr;
    library = METAL_RENDERER_INSTANCE.GetDevice()->newLibrary(
        dispatch_data, &error);
    if (error) {
        LOG_ERROR(MetalRenderer, "Failed to create Metal library: {}",
                  error->localizedDescription()->utf8String());
        return;
    }
    break;
}

AIR Features

AIR shaders are pre-compiled to an intermediate format:
  • Faster load times (no runtime compilation)
  • Reduced CPU overhead during shader loading
  • More consistent performance

AIR uses a binary format that’s loaded directly:
auto dispatch_data = dispatch_data_create(
    descriptor.code.data(), descriptor.code.size(), ...);

AIR benefits from Apple’s shader compiler optimizations:
  • Better instruction scheduling
  • Improved register allocation
  • GPU-specific optimizations

Shader Backend Comparison

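The points below apply to the MSL backend; the AIR backend's characteristics are listed under AIR Features above.
Pros: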
  • Easier to debug (human-readable)
  • Better for development
  • Supports runtime modifications
  • No pre-compilation step needed
Cons:
  • Runtime compilation overhead
  • Slightly slower initial load
  • More CPU usage during shader creation

Render Pass System

The renderer groups drawing operations into render passes, which are created from a descriptor and then bound before drawing:
void BindRenderPass(const RenderPassBase* render_pass) override;
RenderPassBase* CreateRenderPass(const RenderPassDescriptor& descriptor) override;
Render passes group related drawing operations together for optimal GPU performance.
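
To make the grouping concrete, here is a minimal sketch, assuming that a new encoder is only started when the bound render pass actually changes between draws. `RenderPassSketch` and `EncoderTracker` are hypothetical names used for illustration; the sketch models the bookkeeping only and does not touch Metal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a render pass object.
struct RenderPassSketch {
    int id;
};

class EncoderTracker {
  public:
    // Binding is cheap: it only records which pass the next draw should use.
    void BindRenderPass(const RenderPassSketch* render_pass) {
        pending_ = render_pass;
    }

    void Draw() {
        assert(pending_ != nullptr && "a render pass must be bound before drawing");
        if (pending_ != active_) {
            // Models ending the current encoder and starting a new one.
            active_ = pending_;
            draws_per_encoder_.push_back(0);
        }
        ++draws_per_encoder_.back();
    }

    // One entry per encoder, holding the number of draws it received.
    const std::vector<std::size_t>& DrawsPerEncoder() const {
        return draws_per_encoder_;
    }

  private:
    const RenderPassSketch* pending_ = nullptr;
    const RenderPassSketch* active_ = nullptr;
    std::vector<std::size_t> draws_per_encoder_;
};
```

Rebinding the pass that is already active does not split the group, so consecutive draws into the same targets share one encoder.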

Resource Binding

Resources are bound to numbered slots. For example, a vertex buffer is bound by index:
void BindVertexBuffer(const BufferView& buffer, u32 index);
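
One common way to keep binding cheap is to compare each request against the tracked state and skip it when nothing changed. The sketch below illustrates that idea under stated assumptions: this `BufferView` layout, the slot count, and the `bool` return value are all hypothetical and differ from the signature shown above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in; Hydra's BufferView may carry different fields.
struct BufferView {
    const void* buffer = nullptr;
    std::uint64_t offset = 0;
    std::uint64_t size = 0;

    bool operator==(const BufferView& other) const {
        return buffer == other.buffer && offset == other.offset &&
               size == other.size;
    }
};

constexpr std::size_t kVertexArrayCount = 16; // assumed slot count

class BindingTracker {
  public:
    // Returns true when the binding changed and would need to be re-encoded.
    bool BindVertexBuffer(const BufferView& view, std::size_t index) {
        assert(index < kVertexArrayCount);
        if (vertex_buffers_[index] == view)
            return false; // already bound, skip the redundant call
        vertex_buffers_[index] = view;
        return true;
    }

  private:
    std::array<BufferView, kVertexArrayCount> vertex_buffers_{};
};
```

Binding the same view to the same slot twice reports a change only the first time; changing the offset counts as a new binding.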

Drawing Operations

The renderer supports both indexed and non-indexed drawing:
// Non-indexed drawing
void Draw(ICommandBuffer* command_buffer,
          const engines::PrimitiveType primitive_type,
          const u32 start, const u32 count,
          const u32 base_instance, const u32 instance_count);

// Indexed drawing
void DrawIndexed(ICommandBuffer* command_buffer,
                 const engines::PrimitiveType primitive_type,
                 const u32 start, const u32 count,
                 const u32 base_vertex, const u32 base_instance,
                 const u32 instance_count);
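
The difference between the two calls lies in how vertex IDs are produced. The following sketch lists the vertex IDs one instance would visit, assuming that `start` counts vertices for `Draw` and index-buffer elements for `DrawIndexed`, and that `base_vertex` is added to each fetched index. The helper names are hypothetical and the code runs on the CPU purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Vertex IDs visited by one instance of a non-indexed draw:
// a contiguous range beginning at `start`.
std::vector<std::uint32_t> NonIndexedVertexIds(std::uint32_t start,
                                               std::uint32_t count) {
    std::vector<std::uint32_t> ids;
    ids.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i)
        ids.push_back(start + i);
    return ids;
}

// Vertex IDs visited by one instance of an indexed draw:
// each index is read from the index buffer, then offset by `base_vertex`.
std::vector<std::uint32_t> IndexedVertexIds(
    const std::vector<std::uint32_t>& index_buffer, std::uint32_t start,
    std::uint32_t count, std::uint32_t base_vertex) {
    assert(static_cast<std::uint64_t>(start) + count <= index_buffer.size());
    std::vector<std::uint32_t> ids;
    ids.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i)
        ids.push_back(index_buffer[start + i] + base_vertex);
    return ids;
}
```

Indexed drawing lets several primitives share a vertex, which is why the same index can appear more than once in the buffer.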

Performance Considerations

Shader Caching

Compiled shaders are cached to avoid recompilation

Pipeline Caching

Pipeline states are cached for fast switching

Command Buffers

Commands are batched for efficient GPU submission

Resource Pooling

Temporary resources are pooled and reused
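
As an illustration of resource pooling, here is a minimal sketch, assuming temporary buffers are handed out per frame and returned to a free list once the GPU no longer needs them. `PooledBuffer` and `TemporaryBufferPool` are hypothetical; host memory stands in for GPU buffers, and this is not how `AllocateTemporaryBuffer` is necessarily implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a GPU buffer; host memory is used for illustration.
struct PooledBuffer {
    std::vector<std::uint8_t> storage;
};

class TemporaryBufferPool {
  public:
    PooledBuffer* Allocate(std::size_t size) {
        assert(size > 0);
        // Reuse the first free buffer that is large enough.
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i]->storage.size() >= size) {
                in_use_.push_back(std::move(free_[i]));
                free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
                return in_use_.back().get();
            }
        }
        // Nothing suitable: create a new buffer.
        ++allocation_count_;
        in_use_.push_back(std::make_unique<PooledBuffer>());
        in_use_.back()->storage.resize(size);
        return in_use_.back().get();
    }

    // Call once the work that used the buffers has completed.
    void ReleaseAll() {
        for (auto& buffer : in_use_)
            free_.push_back(std::move(buffer));
        in_use_.clear();
    }

    std::size_t AllocationCount() const { return allocation_count_; }

  private:
    std::vector<std::unique_ptr<PooledBuffer>> free_;
    std::vector<std::unique_ptr<PooledBuffer>> in_use_;
    std::size_t allocation_count_ = 0;
};
```

A first-fit search keeps the sketch short; a production pool would more likely bucket buffers by size to limit wasted memory.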

Graphics Capture Support

The renderer supports Metal frame capture for debugging:
void BeginCapture() override;
void EndCapture() override;
Use Xcode’s Metal debugger to capture and analyze frames rendered by Hydra.
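
Because the two calls must be paired, a scope guard is one way to make sure a capture is always ended, even on early return. The sketch below assumes only the two method names shown above; `ICapturable`, `ScopedCapture`, and `CountingRenderer` are hypothetical helpers and not part of Hydra.

```cpp
#include <cassert>

// Hypothetical minimal interface exposing the two capture calls.
class ICapturable {
  public:
    virtual ~ICapturable() = default;
    virtual void BeginCapture() = 0;
    virtual void EndCapture() = 0;
};

// Begins a capture on construction and ends it on destruction.
class ScopedCapture {
  public:
    explicit ScopedCapture(ICapturable& renderer) : renderer_(renderer) {
        renderer_.BeginCapture();
    }
    ~ScopedCapture() { renderer_.EndCapture(); }

    ScopedCapture(const ScopedCapture&) = delete;
    ScopedCapture& operator=(const ScopedCapture&) = delete;

  private:
    ICapturable& renderer_;
};

// Test double that counts calls instead of talking to Metal.
class CountingRenderer : public ICapturable {
  public:
    void BeginCapture() override { ++begins; }
    void EndCapture() override { ++ends; }
    int begins = 0;
    int ends = 0;
};
```

Wrapping the frame of interest in a `ScopedCapture` keeps the begin and end calls balanced without manual cleanup.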

Configuration

The shader backend is configured in the settings:
  • MSL: Best for development and debugging
  • AIR: Best for performance and release builds
See the Configuration page for details on the shader_backend option.
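
For illustration only, the sketch below shows how a textual setting could be mapped onto the two backends. The spellings "msl" and "air" are assumptions made for this example, as is the `ParseShaderBackend` helper; the values actually accepted by shader_backend are defined on the Configuration page.

```cpp
#include <cassert>
#include <string>

enum class ShaderBackend { Msl, Air };

// Hypothetical parser: writes `out` and returns true only for a known value.
// The accepted strings are placeholders, not Hydra's documented values.
bool ParseShaderBackend(const std::string& value, ShaderBackend& out) {
    if (value == "msl") {
        out = ShaderBackend::Msl;
        return true;
    }
    if (value == "air") {
        out = ShaderBackend::Air;
        return true;
    }
    return false; // unknown value: leave `out` untouched
}
```

Returning a flag instead of silently falling back lets the caller decide whether an unknown value should be reported or replaced with a default.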