Summarize text from English audio

ASR -> Summarization Workflow

This workflow takes audio as input, runs an automatic speech recognition (ASR) model on it, and passes the resulting transcript to a summarization model. It outputs both the transcript from the ASR model and the summary from the summarization model.

The models are as follows:

asr-wav2vec2-large-robust-ft-swbd-300h-english -> text-summarization-english-distilbart-cnn-12-6
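
The chaining of the two models can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: the names run_workflow, WorkflowOutput, transcribe, and summarize are hypothetical, and the two callables stand in for whatever mechanism invokes the models listed above.

```python
from dataclasses import dataclass
from typing import Callable

# Model IDs as listed above. How they are loaded and invoked is not shown;
# the callables passed to run_workflow are assumed to wrap them.
ASR_MODEL_ID = "asr-wav2vec2-large-robust-ft-swbd-300h-english"
SUMMARIZER_MODEL_ID = "text-summarization-english-distilbart-cnn-12-6"


@dataclass
class WorkflowOutput:
    transcript: str  # text produced by the ASR model
    summary: str     # text produced by the summarization model


def run_workflow(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    summarize: Callable[[str], str],
) -> WorkflowOutput:
    """Run ASR first, then summarize the transcript; return both outputs."""
    transcript = transcribe(audio)
    summary = summarize(transcript)
    return WorkflowOutput(transcript=transcript, summary=summary)


if __name__ == "__main__":
    # Hypothetical stand-ins for the two models, for illustration only.
    def fake_transcribe(audio: bytes) -> str:
        return "the council met on tuesday and approved the new budget"

    def fake_summarize(text: str) -> str:
        return "council approved the new budget"

    result = run_workflow(b"raw audio bytes", fake_transcribe, fake_summarize)
    print(result.transcript)
    print(result.summary)
```

Passing the models in as callables keeps the sketch independent of any particular serving API; the only fixed part is the order of the two steps and the fact that both outputs are returned.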

Limitations: The summarization model is trained on CNN news text, so it does not summarize content outside the news domain well.

