What is Audio Event Detection & Clustering?

The Audio Event Detection (AED) and Clustering analyses automatically detect and categorize sounds in large audio datasets without supervision, so no labeled examples are required. Our AED-C pipeline uses a density-based clustering algorithm (DBSCAN), which performs better than other methods (e.g., k-means).
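
To illustrate how DBSCAN can group sounds without being told how many categories exist, here is a minimal sketch in plain Python. It is a toy implementation for illustration only, not the pipeline's actual code. The feature values (peak frequency in kHz, duration in seconds) and the `eps` and `min_samples` settings are assumptions chosen for the example.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Toy DBSCAN: return one cluster label per point (NOISE for outliers)."""
    n = len(points)
    # Precompute eps-neighborhoods; each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        labels[i] = cluster_id
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is also a core point
        cluster_id += 1
    return labels


# Hypothetical features for nine detected sounds: (peak kHz, duration s).
features = [
    (2.0, 0.50), (2.1, 0.52), (1.9, 0.48), (2.05, 0.51),  # one call type
    (6.0, 0.10), (6.1, 0.12), (5.9, 0.11), (6.05, 0.09),  # another call type
    (9.5, 1.50),                                          # isolated sound
]
labels = dbscan(features, eps=0.3, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not specified in advance, and the isolated detection is labeled as noise rather than forced into a cluster.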


This analysis provides an efficient way to summarize and explore the sound categories in an audio dataset. Potential uses include:

  • Quickly identifying communities of species
  • Estimating species richness and composition
  • Discovering unknown sound categories
  • Quickly searching for examples of a desired signal/call, without the need for any existing examples
  • Collecting training data for supervised audio recognition models
    • Pattern Matching can be used after AED & Clustering to efficiently detect more examples of a desired sound

The pipeline consists of two main steps:

  1. Audio Event Detection (AED): automatically detect relevant sounds in raw field recordings (learn how to run AED here)
  2. Clustering (C): cluster these sounds based on feature similarities (learn how to run clustering here)
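
As a rough illustration of the first step, the sketch below marks stretches of a recording whose energy rises above a threshold. This is a deliberately simplified stand-in, not the detection method used by the pipeline. The function name `detect_events`, the frame length, the threshold, and the synthetic test signal are all assumptions made for the example.

```python
import math


def detect_events(samples, sample_rate, frame_len, threshold, min_frames=2):
    """Return (start_s, end_s) spans where frame RMS energy exceeds threshold."""
    events = []
    start = None  # index of the first frame of the event in progress
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms > threshold and start is None:
            start = f  # event begins
        elif rms <= threshold and start is not None:
            if f - start >= min_frames:  # ignore very short blips
                events.append((start * frame_len / sample_rate,
                               f * frame_len / sample_rate))
            start = None  # event ends
    # Close an event that runs to the end of the recording.
    if start is not None and n_frames - start >= min_frames:
        events.append((start * frame_len / sample_rate,
                       n_frames * frame_len / sample_rate))
    return events


# Synthetic 1-second recording at 8 kHz: silence with a 1 kHz tone
# between 0.25 s and 0.5 s.
sample_rate = 8000
samples = [
    0.5 * math.sin(2 * math.pi * 1000 * t / sample_rate) if 2000 <= t < 4000 else 0.0
    for t in range(sample_rate)
]
events = detect_events(samples, sample_rate, frame_len=400, threshold=0.1)
print(events)  # [(0.25, 0.5)]
```

Each detected span would then be described by a feature vector (for example, its frequency content and duration) and passed to the clustering step.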

Learn more about the AED-C pipeline in our white paper.