AudioFlamingo

An implementation of the AudioFlamingo model from the paper "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities".

Install

pip3 install audio-flamingo

Usage

import torch
from audio_flamingo.model import AudioFlamingo

# Generate random inputs: a batch of token ids and a raw audio tensor
text = torch.randint(0, 256, (1, 1024))
audio = torch.randn(1, 16000)

# Initialize AudioFlamingo model
model = AudioFlamingo(
    dim=512,
    num_tokens=256,
    max_seq_len=1024,
    heads=8,
    depth=6,
    dim_head=64,
    dropout=0.1,
    context_dim=512,
)

# Pass the input sequence through the model
output = model(text, audio)  # (1, 1024, 256)

# Print the output shape
print(output.shape)
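
The usage example above ends with a tensor of shape (1, 1024, 256), where the last dimension matches num_tokens. A minimal sketch of turning such a tensor into token ids follows. It assumes the output holds unnormalized logits over the vocabulary, which the README does not state, and it uses a random stand-in tensor rather than a real model output so that it runs with only PyTorch installed.

```python
import torch

# Stand-in for the model output: random values with the shape noted in the
# usage example, (batch, seq_len, num_tokens). Treating these as logits is an
# assumption; the real tensor would come from `model(text, audio)`.
torch.manual_seed(0)
logits = torch.randn(1, 1024, 256)

# Convert logits to per-position probabilities over the vocabulary.
probs = torch.softmax(logits, dim=-1)

# Greedy decoding: pick the most likely token id at each position.
predicted_tokens = probs.argmax(dim=-1)

print(predicted_tokens.shape)  # torch.Size([1, 1024])
```

Greedy argmax is only one option here; sampling from probs would work on the same tensor if more varied output is wanted.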

License

MIT
