Overview
RAVE (Realtime Audio Variational autoEncoder) is a variational autoencoder designed for fast, high-quality neural audio synthesis. It was developed by Antoine Caillon and Philippe Esling, and the official implementation targets realtime audio applications. Datasets can be prepared with either regular or lazy preprocessing; the lazy mode allows training directly on raw audio files. Training supports several configurations, including v1, v2, discrete, and causal models, and data augmentations are available to improve model generalization. RAVE uses non-causal convolutions by default but can be configured for causal mode to lower latency. Trained models can be exported to TorchScript files for realtime processing. RAVE is used in music performance, installations, and research, and should be cited when used.
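To illustrate why causal mode lowers latency, here is a minimal sketch of the difference between a centered (non-causal) and a causal 1D convolution. This is not RAVE's code: the function `conv1d`, the toy kernel, and the impulse signal are hypothetical names chosen for illustration, and the example uses plain Python lists rather than any audio library.

```python
def conv1d(signal, kernel, causal=False):
    """Same-length 1D convolution (cross-correlation form) with zero padding.

    Hypothetical helper for illustration only, not part of RAVE.
    causal=False: padding is split across both sides, so output[t]
                  depends on samples after t (lookahead is required).
    causal=True:  all padding goes on the left, so output[t] depends
                  only on signal[0..t].
    """
    k = len(kernel)
    if causal:
        left, right = k - 1, 0
    else:
        left = (k - 1) // 2
        right = k - 1 - left
    padded = [0.0] * left + list(signal) + [0.0] * right
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(signal))
    ]


# A single impulse at index 3 makes the timing of each response visible.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]

centered = conv1d(impulse, kernel, causal=False)
causal = conv1d(impulse, kernel, causal=True)

print(centered)  # response starts at index 2, one sample before the impulse
print(causal)    # response starts at index 3, never before the impulse
```

In the centered case the output reacts one sample before the impulse arrives, which in a realtime setting means waiting for future input. The causal variant never looks ahead, which is the property that lets a streaming model respond with less delay.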
