Omny Studio offers a range of analytics reports for publishers to understand what, when, where, and how their audio content is being downloaded and played. The analytics system measures content downloads, podcast RSS subscribers, and content consumption over a wide range of publishing endpoints.
Raw podcast download requests may not accurately reflect the number of unique people who have played the content. Popular podcast players such as Apple Podcasts use background caching and progressive downloads and expose only limited listener identifiers, so the request data requires processing before it can be reported.
To ensure metrics can be consistently defined and measured across the industry, the IAB has released guidelines on how podcast analytics should be filtered and measured. The actual implementation of these guidelines will vary from provider to provider due to practical technical or design differences. This document outlines the various filtering techniques Omny Studio employs.
Note: Because we continuously refine our filtering parameters as well as experiment with methodologies to improve the accuracy of our metrics, this document may be amended from time to time to reflect any changes in our approach.
Download analytics filtering
Download analytics tracks downloads of any published audio file, from sources including but not limited to podcast players, embed players, and third-party apps and websites.
Ignore non-GET HTTP requests
We do not count HTTP requests with a method other than “GET” when calculating download requests.
We do not count these requests as downloads because we observed that some podcast players use the HTTP request method “HEAD” to retrieve file metadata without downloading any audio content.
Ignore non-playback HTTP range requests
We do not count HTTP range requests with a range of 0-1, a range of 0 bytes, or an invalid range. (We do count requests that do not specify a range.)
We do not count these requests as downloads because we observed that some podcast players use a 0-1 request to check whether the server supports range requests without downloading any audio content.
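The range rule above can be sketched as a small check on the `Range` request header. This is an illustrative sketch only, not Omny Studio's implementation: the function name `is_countable_range` is hypothetical, and treating multi-range or malformed headers as "invalid" is an assumption made here for simplicity.

```python
import re
from typing import Optional

# Matches a single byte range such as "bytes=0-1", "bytes=500-" or "bytes=-200".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def is_countable_range(range_header: Optional[str]) -> bool:
    """Return True if a request with this Range header may count as a download."""
    if range_header is None:
        return True  # no range specified: counted
    match = _RANGE_RE.match(range_header.strip())
    if match is None:
        return False  # malformed or multi-range header: treated as invalid here
    first, last = match.groups()
    if first == "" and last == "":
        return False  # "bytes=-" is not a valid range
    if first == "":
        return int(last) > 0  # suffix range; "bytes=-0" covers 0 bytes
    if last == "":
        return True  # open-ended range such as "bytes=500-"
    start, end = int(first), int(last)
    if end < start:
        return False  # invalid: end before start
    if (start, end) == (0, 1):
        return False  # the 0-1 probe used to test range support
    return True
```

For example, `is_countable_range("bytes=0-1")` returns `False`, while `is_countable_range(None)` and `is_countable_range("bytes=0-")` return `True`.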
Ignore known bots and spiders
We do not count download requests with an application user-agent that is identified to be a known bot or spider application, or an empty user-agent.
We do not consider these downloads because bots and spiders regularly download files for indexing purposes and do not correlate with people listening.
We utilize the open-source “UA-Parser” user-agent database enhanced with additional proprietary data to parse user-agents. This database is regularly updated to detect new podcast applications and bots as they are documented.
Ignore blacklisted user-agents
We do not count download requests from a list of application user-agents we’ve identified to be intentionally or unintentionally problematic applications.
We do not count these requests as downloads because we observed that some mobile players and applications generate an excessive number (e.g. hundreds) of download requests that do not correlate with people listening.
Ignore blacklisted and cloud service provider IPs
We do not count download requests from a list of IPs we’ve identified as belonging to cloud service providers or to servers of third-party services that cache content.
We do not consider these downloads because third-party services are downloading files for caching and mirroring purposes and do not correlate with people listening.
Our database of cloud service provider IP ranges includes:
- Amazon AWS
- Google Cloud
- Microsoft Azure
This database of IP ranges is regularly updated with the official lists published by the providers and IP ranges registered by the provider on the American Registry for Internet Numbers (ARIN).
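As a rough illustration of how a request IP could be checked against a list of ranges, the sketch below uses Python's standard `ipaddress` module. The CIDR blocks shown are placeholders taken from address space reserved for documentation, not real provider ranges, and the names `EXAMPLE_BLOCKED_RANGES` and `is_blocked_ip` are hypothetical.

```python
import ipaddress

# Placeholder ranges from documentation-reserved address space.
# A real list would be loaded from the providers' published ranges.
EXAMPLE_BLOCKED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]

_NETWORKS = [ipaddress.ip_network(cidr) for cidr in EXAMPLE_BLOCKED_RANGES]


def is_blocked_ip(ip: str) -> bool:
    """Return True if the IP falls inside any listed IPv4 or IPv6 range."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        return False  # unparseable address: left to other filters in this sketch
    # Membership tests between IPv4 and IPv6 objects simply evaluate to False.
    return any(address in network for network in _NETWORKS)
```

With the placeholder list, `is_blocked_ip("192.0.2.15")` is `True` and `is_blocked_ip("198.51.100.7")` is `False`.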
Ignore duplicate downloads in a 30-minute rolling window
We do not count duplicate download requests from an identifiable unique session (defined below) within a 30-minute rolling window. Any duplicate download inside the rolling window extends the window.
Due to the limited listener identifiers available from podcast apps, we use the following data to identify unique sessions at best effort:
- Episode ID
- IP address (IPv4 or IPv6)
- Referral URL (available only for web-based players)
We believe this rolling window approach and timeframe strike a fair balance between over-counting multiple downloads from a single listener and under-counting multiple downloads from multiple listeners in a shared IP environment such as offices, schools, and homes.
We also believe a rolling window rather than a fixed time window is more appropriate to international publishers and publishers who operate across large regions because it is not artificially limited to a specific timezone such as UTC.
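The rolling window can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the production system: the class name `RollingWindowDeduplicator` is hypothetical, timestamps are assumed to arrive in chronological order as seconds, and a request arriving exactly 30 minutes after the previous one is assumed to count as a new download.

```python
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = 30 * 60

# Session key: (episode ID, IP address, referral URL if available).
SessionKey = Tuple[str, str, Optional[str]]


class RollingWindowDeduplicator:
    """Counts one download per session until the session is quiet for a full window."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self._last_seen: Dict[SessionKey, float] = {}

    def should_count(
        self, episode_id: str, ip: str, referrer: Optional[str], timestamp: float
    ) -> bool:
        key = (episode_id, ip, referrer)
        last = self._last_seen.get(key)
        # Every request, duplicate or not, restarts the window for its session.
        self._last_seen[key] = timestamp
        if last is None:
            return True
        return timestamp - last >= self.window_seconds
```

Because each duplicate restarts the window, a client that re-requests the same episode every 25 minutes is counted once, however long it keeps doing so.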
Ignore partial downloads with less than a minute of audio content
We do not count download requests where the amount of data served in an identifiable unique session (defined above) is less than a minute's worth of audio content.
We calculate the one-minute threshold by multiplying the MP3 bitrate by 60 seconds, then adding the size of any metadata and ID3 headers.
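As a worked example of this arithmetic, the sketch below computes the threshold in bytes. It assumes a constant-bitrate file with the bitrate given in kilobits per second (1 kbps = 1000 bits per second); the function name and parameters are hypothetical.

```python
def minimum_bytes_for_download(
    bitrate_kbps: int, header_bytes: int = 0, seconds: int = 60
) -> int:
    """Bytes needed to cover `seconds` of audio plus metadata and ID3 headers."""
    bytes_per_second = bitrate_kbps * 1000 // 8  # kilobits/s -> bytes/s
    return bytes_per_second * seconds + header_bytes
```

For a 128 kbps MP3 with 4,096 bytes of headers, this gives 16,000 bytes per second × 60 seconds + 4,096 = 964,096 bytes.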
Downloads are marked as pending validation on the initial request and can be observed as "pending downloads" in our analytics dashboards.
We periodically perform log-based validation by analyzing web server logs from our content delivery network (CDN) and summing the total number of bytes successfully delivered (status code of 200 or 206) from server to client.
Downloads that meet the minimum threshold are counted. Downloads that do not meet the threshold within 24 hours of the initial request are filtered.
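The validation step might look roughly like the sketch below. The `LogEntry` shape, the function name, and the returned state names are all hypothetical, and the sketch simply sums delivered bytes without detecting overlapping byte ranges, which a real system may need to handle.

```python
from dataclasses import dataclass
from typing import Iterable

VALIDATION_WINDOW_SECONDS = 24 * 60 * 60
DELIVERED_STATUSES = {200, 206}


@dataclass(frozen=True)
class LogEntry:
    status: int        # HTTP status code returned by the CDN
    bytes_sent: int    # bytes delivered for this request
    timestamp: float   # seconds, on the same clock as first_request_at


def validate_download(
    entries: Iterable[LogEntry],
    first_request_at: float,
    threshold_bytes: int,
    now: float,
) -> str:
    """Return "counted", "pending" or "filtered" for one session's log entries."""
    delivered = sum(
        e.bytes_sent
        for e in entries
        if e.status in DELIVERED_STATUSES
        and 0 <= e.timestamp - first_request_at <= VALIDATION_WINDOW_SECONDS
    )
    if delivered >= threshold_bytes:
        return "counted"
    if now - first_request_at >= VALIDATION_WINDOW_SECONDS:
        return "filtered"  # threshold not met within 24 hours
    return "pending"
```

A session stays "pending" until either enough bytes have been delivered or the 24-hour window has elapsed.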
Note: In some scenarios where we redirect downloads to a third-party ad server, we do not have access to logs that indicate how much data was transferred. We cannot validate or filter those downloads using this technique, so all of them are counted. We recommend comparing against the analytics provided by the third-party ad system for additional information.
Cached files on other platforms
Some syndicated platforms and third-party services (e.g. Google Play Podcast and Spotify) may cache files on their own platforms. Plays on these platforms will not register a download on our server.
Where possible we will ingest and display these metrics separately to downloads on our analytics dashboards.
Disable preloading in Omny Studio players
We do not preload audio content in the Omny Studio web or embed player until playback has been initiated by the user or by autoplay (where the device permits it).
Furthermore, our embed player uses an explicit tracking system that only counts downloads when audio is playing. This prevents unintentional tracking from any advanced prefetching behaviour in browsers.
Podcast RSS subscribers filtering
Playlist subscribers analytics tracks downloads of the RSS feed of Omny Studio playlists by podcast players to provide an indication of how many unique users are subscribed to a podcast.
Identifying unique users
Due to the limited listener identifiers available from podcast apps, we use the following data to identify unique subscribers at best effort:
- IP address (IPv4 or IPv6)
We are planning to utilise user-agent data in the near future to identify unique subscribers.
Ignore known bots and spiders
We do not count RSS download requests with an application user-agent that is identified to be a known bot or spider application.
We do not consider these subscribers because bots and spiders regularly download RSS files for indexing purposes and do not correlate with people subscribing.
Ignore cloud service provider IPs
We do not count download requests from a list of IPs we’ve identified as belonging to cloud service providers or to servers of third-party services that cache content. (The list of providers is given above.)
We do not consider these subscribers because third-party services are downloading files for caching and mirroring purposes and do not correlate with people subscribing.
Consumption analytics filtering
Consumption analytics tracks playback behaviour of audio content in Omny Studio embed players and in third-party players that have implemented our consumption analytics player API.
We utilize client-side tracking of player events such as play, pause and seek to generate behavioural reports such as how many people played, how long they played and which parts of the content they played.
Identifying playback sessions
We use a globally unique identifier (GUID) to identify unique playback sessions.
Repeated pausing and seeking inside the same content will not count as a new playback session. However, reloading the player or changing content (in a list of content) will be considered a new session, even if the user has already played the content before.
Ignore sessions shorter than 10 seconds
We ignore any playback sessions with a total duration of less than 10 seconds.
The total duration is summed across all played segments, including non-consecutive ones. For example, a session with two segments of 0:00-0:05 and 0:30-0:36 is counted because the total duration of all segments is 11 seconds.
We do not count sessions shorter than 10 seconds because we believe these short plays may be unintentional or accidental plays.
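The duration rule can be illustrated with a short sketch. Segments are represented here as hypothetical `(start, end)` pairs in seconds; how overlapping segments are treated is not specified above, so this sketch simply sums each segment's length.

```python
from typing import Iterable, Tuple

MIN_SESSION_SECONDS = 10

Segment = Tuple[float, float]  # (start, end) positions in seconds


def total_played_seconds(segments: Iterable[Segment]) -> float:
    """Sum the length of every played segment, consecutive or not."""
    return sum(max(0.0, end - start) for start, end in segments)


def is_countable_session(segments: Iterable[Segment]) -> bool:
    """A session counts when its total played time is at least 10 seconds."""
    return total_played_seconds(segments) >= MIN_SESSION_SECONDS
```

For the example above, `is_countable_session([(0, 5), (30, 36)])` returns `True` because 5 + 6 = 11 seconds.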