RAVE

Overview

RAVE (Realtime Audio Variational autoEncoder) is a variational autoencoder designed for fast, high-quality neural audio synthesis. It was developed by Antoine Caillon and Philippe Esling, and the project is the official implementation of the model, aimed at realtime audio applications. Datasets can be prepared with either regular or lazy preprocessing; the lazy method allows training directly on raw audio files. Training supports several configurations, including v1, v2, discrete, and causal models, and data augmentation is available to improve model generalization. RAVE is built with non-causal convolutions by default but can be switched to causal mode to lower latency. Trained models can be exported to TorchScript files for realtime processing. RAVE is used in music performance, installations, and research, and the authors ask that it be cited when used.
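
As a rough illustration of what working with an exported file might look like, the sketch below loads a TorchScript file and passes audio through it. It is a minimal sketch under stated assumptions, not the project's documented workflow: it assumes PyTorch is installed, that the exported model exposes `encode` and `decode` methods, and that audio is shaped (batch, channels, samples). The helper names and the file name used in the note afterwards are hypothetical.

```python
def load_exported_model(path):
    """Load a TorchScript file and switch it to inference mode.

    Assumption: `path` points at a file produced by RAVE's export step.
    """
    import torch  # imported lazily so reconstruct() stays usable without PyTorch

    return torch.jit.load(path).eval()


def reconstruct(model, audio):
    """Encode audio to a latent representation, then decode it back.

    Assumption: `model` exposes encode() and decode(), and `audio` is a
    float tensor shaped (batch, channels, samples).
    """
    latents = model.encode(audio)
    return model.decode(latents)
```

With PyTorch available, calling `reconstruct(load_exported_model("exported_model.ts"), torch.zeros(1, 1, 65536))` would push a block of silence through the model. Keeping loading and processing in separate functions makes it possible to reuse one loaded model across many audio buffers.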

Common tasks

Audio Synthesis, Neural Audio Encoding, Realtime Audio Processing

FAQ

What is RAVE?

RAVE (Realtime Audio Variational autoEncoder) is a variational autoencoder for fast and high-quality neural audio synthesis.

How do I install RAVE?

Install RAVE using pip: `pip install acids-rave`. Ensure you install torch and torchaudio beforehand.
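
Because the install order matters, it can help to confirm that the prerequisites are present before running pip. The following is a small sketch using only the Python standard library; the function names `missing_prerequisites` and `install_hint` are hypothetical and not part of RAVE.

```python
import importlib.util


def missing_prerequisites(packages=("torch", "torchaudio")):
    """Return the names in `packages` that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def install_hint(missing):
    """Build a short message describing the next installation step."""
    if missing:
        return "Install first: " + ", ".join(missing)
    return "Prerequisites found; run: pip install acids-rave"
```

Running `print(install_hint(missing_prerequisites()))` in the target environment reports whether torch and torchaudio still need to be installed.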

Can RAVE process audio in realtime?

Yes, RAVE is designed for realtime audio processing and can be used with Max/MSP or PureData.

What are the hardware requirements for RAVE?

Training RAVE requires a GPU. The minimum GPU memory depends on the model configuration (e.g., 8 GB for v1, 16 GB for v2).
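
The memory figures quoted above can be turned into a quick pre-flight check when choosing a configuration. This is only a sketch: the table holds just the two values given here, and the names `MIN_GPU_MEMORY_GB` and `configs_that_fit` are hypothetical.

```python
# Minimum GPU memory in GB per configuration, using only the figures quoted above.
# Other configurations are left out because no figures are given for them here.
MIN_GPU_MEMORY_GB = {"v1": 8, "v2": 16}


def configs_that_fit(available_gb, requirements=MIN_GPU_MEMORY_GB):
    """Return the configurations whose minimum memory fits in `available_gb`."""
    return sorted(name for name, need in requirements.items() if need <= available_gb)
```

On a machine with PyTorch and a CUDA device, `torch.cuda.get_device_properties(0).total_memory / 1024**3` gives a value that could be passed as `available_gb`.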
