Parquet Schema Reference

The open-source snapshots contain Apache Parquet files organized in two categories: per-country files (one set per country) and global metadata files (shared across all countries).

release/
  country=US/
    links.parquet
    rankings.parquet
    trending.parquet
  metadata/core/
    media_details.parquet
    movies.parquet
    tv_shows.parquet
    ...
  metadata/translations/
    lang_fr.parquet
    ...
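As a rough sketch of how this layout maps to file paths, the helpers below build paths for the three kinds of files. The function names are hypothetical, and the sketch assumes that `metadata/` sits under the same `release/` root as the `country=` directories.

```python
from pathlib import PurePosixPath

# Assumed snapshot root, taken from the listing above.
ROOT = PurePosixPath("release")


def country_file(country: str, name: str) -> PurePosixPath:
    """Path of a per-country file, e.g. links.parquet for "US"."""
    return ROOT / f"country={country.upper()}" / f"{name}.parquet"


def core_file(name: str) -> PurePosixPath:
    """Path of a global metadata file under metadata/core/."""
    return ROOT / "metadata" / "core" / f"{name}.parquet"


def translation_file(language: str) -> PurePosixPath:
    """Path of a per-language translation file, e.g. lang_fr.parquet."""
    return ROOT / "metadata" / "translations" / f"lang_{language.lower()}.parquet"
```

For example, `country_file("US", "links")` yields `release/country=US/links.parquet`.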

links.parquet

Streaming availability links for one country. One row per unique (media, provider, offer) combination. Rows are sorted by `media_id, season_number, episode_number, provider_id, addon_provider_id, link_token` for deterministic output and efficient predicate pushdown.

Sorted by: media_id, season_number, episode_number, provider_id, addon_provider_id, link_token

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | No | Internal Popcorn Time media identifier. Alphanumeric string, e.g. "mv00012345" (movie) or "tv00067890" (TV show). |
| season_number | INT16 | Yes | Season number for TV show season-level or episode-level links. NULL for movie links and show-level links. |
| episode_number | INT16 | Yes | Episode number within the season. NULL for movie, show-level, and season-level links. |
| provider_id | UTF8 | No | Streaming provider identifier, e.g. "netflix", "hulu", "disney_plus". Matches the provider registry in providers.json. |
| addon_provider_id | UTF8 | Yes | Add-on channel provider identifier (e.g. Showtime on Prime Video). NULL for direct offers. |
| link_token | UTF8 | No | AES-256-SIV encrypted, bs58-encoded web URL. Decrypt via go.popcorntime.app/go/{link_token}. Format: bs58([key_version:u8] ++ aes_siv_encrypt(url)). |
| available_from | DATE | Yes | Provider-reported availability start date. NULL if not reported. |
| available_to | DATE | Yes | Provider-reported availability expiry date. NULL if not reported or no expiry. |
| first_seen_at | DATE | Yes | Date the link was first discovered by the spider. NULL for links imported from legacy data. |
| price_types | LIST<UTF8> | Yes | Offer price types, sorted alphabetically. Values: flatrate, rent, buy, free, cinema, ads, fast, flatrate_and_buy. |
| formats | LIST<UTF8> | Yes | Available video formats, sorted alphabetically. Values: sd, hd, uhd, 4k, 3d. |
| audio_languages | LIST<UTF8> | Yes | Available audio language codes (ISO 639-1), sorted alphabetically. Example: `["en","es","fr"]`. |
| subtitle_languages | LIST<UTF8> | Yes | Available subtitle language codes (ISO 639-1), sorted alphabetically. |
| platforms | LIST<UTF8> | Yes | Platform identifiers that have deep links for this offer, sorted alphabetically. Values: android_tv, fire_tv, ios, roku, webos. Actual URLs are in the private platforms.parquet. |
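The NULL rules for `season_number` and `episode_number`, and the `link_token` layout, can be illustrated with a small standard-library sketch. Several points here are assumptions rather than documented behaviour: the "mv"/"tv" prefix test is inferred from the example IDs, the Base58 alphabet is assumed to be the common Bitcoin one, and the `https://` scheme on the redirect URL is a guess. The ciphertext itself is not decrypted here, since the key is not public.

```python
from typing import Optional, Tuple

# Assumption: the common Bitcoin Base58 alphabet.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def link_level(media_id: str, season_number: Optional[int],
               episode_number: Optional[int]) -> str:
    """Classify a links row by which of the nullable columns are set."""
    if episode_number is not None:
        return "episode"
    if season_number is not None:
        return "season"
    # Assumption: movie IDs start with "mv", as in the example IDs.
    return "movie" if media_id.startswith("mv") else "show"


def b58decode(text: str) -> bytes:
    """Decode Base58 text to bytes (raises ValueError on invalid characters)."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))  # each "1" is a zero byte
    return b"\x00" * leading_zeros + body


def split_token(link_token: str) -> Tuple[int, bytes]:
    """Return (key_version, ciphertext) per the documented token layout."""
    raw = b58decode(link_token)
    if not raw:
        raise ValueError("empty link_token")
    return raw[0], raw[1:]


def redirect_url(link_token: str) -> str:
    """URL of the decrypting redirect endpoint (scheme is an assumption)."""
    return f"https://go.popcorntime.app/go/{link_token}"
```

Reading the key version this way only inspects the first decoded byte; resolving the actual destination still goes through the redirect endpoint.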

rankings.parquet

Weekly popularity rankings per country. Denormalized with media fields so the API can serve rankings without joins. Rows are pre-sorted by `position ASC`.

Sorted by: position

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | No | Internal Popcorn Time media identifier. |
| score | INT32 | No | Composite popularity score used for ranking (higher = more popular). |
| position | INT32 | No | Rank position in this country, starting at 1 (1 = most popular). |
| points | INT32 | No | Raw point count contributing to the score. |
| slug | UTF8 | Yes | URL-safe media slug, e.g. "the-dark-knight-2008". NULL if media record not found. |
| title | UTF8 | Yes | Primary display title. NULL if media record not found. |
| original_title | UTF8 | Yes | Original-language title. NULL if same as title or not available. |
| year | UTF8 | Yes | Release year as a string. NULL if not available. |
| content_type | UTF8 | Yes | Content type: "movie" or "tv_show". NULL if not available. |
| poster | UTF8 | Yes | Poster image path (TMDB CDN relative path). NULL if not available. |
| backdrop | UTF8 | Yes | Backdrop image path (TMDB CDN relative path). NULL if not available. |
| popularity | UTF8 | Yes | TMDB popularity score as a string. NULL if not available. |
| tmdb_rating | UTF8 | Yes | TMDB average rating as a string (0–10). NULL if not available. |
| genres | UTF8 | Yes | JSON array of genre slugs sorted alphabetically, e.g. `["action","drama"]`. NULL if no genres. |
| providers | UTF8 | Yes | JSON array of provider IDs available in this country, e.g. `["netflix","hulu"]`. NULL if no providers. |
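Because several rankings columns carry numbers and JSON arrays as strings, a consumer usually decodes them after reading a row. The following is a minimal sketch, assuming each row has already been loaded into a plain dict keyed by column name; `decode_ranking` is a hypothetical helper, and mapping NULL arrays to empty lists is a convenience choice rather than documented behaviour.

```python
import json
from typing import Any, Callable, Dict, Optional


def _number(value: Optional[str], cast: Callable[[str], Any]) -> Any:
    """Convert a string-encoded number, passing NULL (None) through."""
    return cast(value) if value not in (None, "") else None


def decode_ranking(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a rankings row with string columns decoded."""
    out = dict(row)
    out["year"] = _number(row.get("year"), int)
    out["popularity"] = _number(row.get("popularity"), float)
    out["tmdb_rating"] = _number(row.get("tmdb_rating"), float)
    for column in ("genres", "providers"):
        raw = row.get(column)
        # NULL means "no genres" / "no providers"; use an empty list instead.
        out[column] = json.loads(raw) if raw is not None else []
    return out
```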

trending.parquet

Daily trending positions per country. `provider_id` is set to `"__all__"` for aggregate trending (not provider-specific). Rows are pre-sorted by `(source, position)`.

Sorted by: source, position

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | No | Internal Popcorn Time media identifier. |
| provider_id | UTF8 | No | Provider identifier, or "__all__" for aggregate trending across all providers. |
| position | INT32 | No | Trending rank position, starting at 1 (1 = most trending). |
| source | UTF8 | No | Trending source: "tmdb_day" or "tmdb_week". |
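To make the `"__all__"` sentinel concrete, here is a small sketch that picks the aggregate trending list for one source. It assumes rows are plain dicts keyed by column name; `aggregate_trending` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, List

AGGREGATE = "__all__"  # sentinel provider_id documented for aggregate trending


def aggregate_trending(rows: Iterable[Dict[str, Any]],
                       source: str = "tmdb_day") -> List[Dict[str, Any]]:
    """Return aggregate (non-provider-specific) rows for one trending source."""
    picked = [
        row for row in rows
        if row["provider_id"] == AGGREGATE and row["source"] == source
    ]
    # The file is pre-sorted, but sorting again keeps the helper safe
    # for rows that arrive in arbitrary order.
    return sorted(picked, key=lambda row: row["position"])
```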

Global metadata

Shared across all countries, under metadata/core/.

media_details.parquet

Core media metadata (movies and TV shows), denormalized. Exported from the local metadata staging database. All columns are nullable UTF8 strings. Sorted by `id` with small row groups (2K rows) for efficient predicate pushdown in the Cloudflare Worker.

Sorted by: id

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Internal Popcorn Time media identifier, e.g. "mv00012345" or "tv00067890". |
| tmdb_id | UTF8 | Yes | TMDB numeric ID as a string. |
| slug | UTF8 | Yes | URL-safe slug, e.g. "the-dark-knight-2008". |
| title | UTF8 | Yes | Primary display title. |
| original_title | UTF8 | Yes | Original-language title. |
| homepage | UTF8 | Yes | Official homepage URL. |
| year | UTF8 | Yes | Release year as a string. |
| country | UTF8 | Yes | Country of origin (ISO 3166-1 alpha-2). |
| budget | UTF8 | Yes | Production budget in USD as a string. |
| revenue | UTF8 | Yes | Box office revenue in USD as a string. |
| released | UTF8 | Yes | Release date (YYYY-MM-DD). |
| content_type | UTF8 | Yes | Content type: "movie" or "tv_show". |
| tagline | UTF8 | Yes | Tagline. |
| overview | UTF8 | Yes | Plot overview. |
| classification | UTF8 | Yes | Age classification, e.g. "PG-13". Country-neutral; see content_ratings for per-country values. |
| poster | UTF8 | Yes | Poster image path (TMDB CDN relative path). |
| backdrop | UTF8 | Yes | Backdrop image path (TMDB CDN relative path). |
| popularity | UTF8 | Yes | TMDB popularity score as a string. |
| tmdb_rating | UTF8 | Yes | TMDB average rating as a string (0–10). |
| vote_count | UTF8 | Yes | TMDB vote count as a string. |
| runtime | UTF8 | Yes | Movie runtime in minutes as a string. NULL for TV shows. |
| in_production | UTF8 | Yes | Whether the TV show is still in production. NULL for movies. |
| last_air_date | UTF8 | Yes | Last air date for TV shows (YYYY-MM-DD). NULL for movies. |
| genres | UTF8 | Yes | JSON array of genre objects: [{"id":28,"slug":"action","name":"Action"},...]. |
| ratings | UTF8 | Yes | JSON array of rating objects: [{"source":"imdb","rating":"8.5"},...]. |
| external_ids | UTF8 | Yes | JSON array of external ID objects: [{"source":"imdb","id":"tt0468569"},...]. |
| videos | UTF8 | Yes | JSON array of video objects: [{"source":"youtube","id":"EXeTwQWrcwY"},...]. |
| content_ratings | UTF8 | Yes | JSON object of per-country content ratings: {"US":"PG-13","GB":"12A"}. |
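The reason sorting by `id` with small row groups helps predicate pushdown can be shown with a toy model. In a sorted file, each row group covers a contiguous ID range, so a reader that knows the per-group minimum and maximum `id` can skip almost every group for a point lookup. The sketch below works on a hand-written list of (min, max) pairs; it is an illustration of the idea, not the Worker's actual code, and `candidate_row_groups` is a hypothetical name.

```python
from bisect import bisect_left
from typing import List, Tuple


def candidate_row_groups(stats: List[Tuple[str, str]], wanted_id: str) -> List[int]:
    """Indexes of row groups whose [min, max] id range may contain wanted_id.

    stats[i] is the (min id, max id) pair of row group i, in file order.
    Because the file is sorted by id, the max values are non-decreasing,
    which allows a binary search instead of a full scan.
    """
    maxes = [max_id for _, max_id in stats]
    index = bisect_left(maxes, wanted_id)  # first group with max >= wanted_id
    hits = []
    while index < len(stats) and stats[index][0] <= wanted_id:
        hits.append(index)
        index += 1
    return hits
```

With 2K-row groups, a lookup for a single `id` typically touches one group, so only a small slice of the file needs to be fetched and decoded.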

movies.parquet

Movie-specific metadata fields.

Sorted by: id

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Internal media ID (join with media_details.parquet). |
| runtime | UTF8 | Yes | Runtime in minutes. |

tv_shows.parquet

TV show-specific metadata fields.

Sorted by: id

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Internal media ID (join with media_details.parquet). |
| in_production | UTF8 | Yes | Whether the show is still producing new episodes. |
| last_air_date | UTF8 | Yes | Last episode air date (ISO 8601). |

seasons.parquet

TV show season metadata.

Sorted by: media_id, season_number

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Parent TV show media ID. |
| season_number | UTF8 | Yes | Season number. |
| title | UTF8 | Yes | Season title. |
| overview | UTF8 | Yes | Season overview/description. |
| poster | UTF8 | Yes | Season poster image path. |
| air_date | UTF8 | Yes | First air date of the season (ISO 8601). |

episodes.parquet

TV show episode metadata.

Sorted by: media_id, season_number, episode_number

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Parent TV show media ID. |
| season_number | UTF8 | Yes | Season number. |
| episode_number | UTF8 | Yes | Episode number within the season. |
| title | UTF8 | Yes | Episode title. |
| overview | UTF8 | Yes | Episode overview/description. |
| backdrop | UTF8 | Yes | Episode still/backdrop image path. |
| air_date | UTF8 | Yes | Air date (ISO 8601). |

genres.parquet

Genre definitions.

Sorted by: id

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Genre ID. |
| slug | UTF8 | Yes | URL-safe slug (e.g. "action", "sci-fi"). |
| name | UTF8 | Yes | Display name (e.g. "Action", "Science Fiction"). |

media_genres.parquet

Media-to-genre associations.

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| genre_id | UTF8 | Yes | Genre ID (join with genres.parquet). |

media_ids.parquet

External ID mappings (IMDB, TMDB, TVDB).

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| source | UTF8 | Yes | Source: "imdb", "tmdb", or "tvdb". |
| external_id | UTF8 | Yes | External ID value (e.g. "tt0903747" for IMDB). |

media_ratings.parquet

Ratings from external sources.

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| source | UTF8 | Yes | Rating source (e.g. "imdb", "tmdb"). |
| external_rating | UTF8 | Yes | Rating value as string (e.g. "8.5"). |

media_rankings.parquet

Per-country ranking scores (aggregated).

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| country | UTF8 | Yes | ISO 3166-1 alpha-2 country code. |
| score | UTF8 | Yes | Composite quality score. |
| position | UTF8 | Yes | Rank position (1 = most popular). |
| points | UTF8 | Yes | Raw points. |

media_content_ratings.parquet

Per-country content ratings (e.g. "TV-MA", "PG-13").

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| country | UTF8 | Yes | ISO 3166-1 alpha-2 country code. |
| rating | UTF8 | Yes | Content rating string (e.g. "TV-MA", "PG-13", "15"). |

media_release_dates.parquet

Per-country release dates.

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| country | UTF8 | Yes | ISO 3166-1 alpha-2 country code. |
| release_date | UTF8 | Yes | Release date (ISO 8601). |

media_talents.parquet

Cast and crew associations.

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| people_id | UTF8 | Yes | Person ID (join with peoples.parquet). |
| role | UTF8 | Yes | Character name (cast) or job title (crew). |
| role_type | UTF8 | Yes | "cast" or "crew". |
| rank | UTF8 | Yes | Billing order (lower = higher billing). |
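Since `rank` is stored as a string, sorting it directly gives lexicographic order ("10" sorts before "2"). A minimal sketch of ordering the cast of one title by billing, assuming rows are plain dicts and that `rank` holds base-10 integers (the schema only says it is a string); `top_billed` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def top_billed(rows: Iterable[Dict[str, Any]], media_id: str,
               limit: int = 5) -> List[Dict[str, Any]]:
    """Return the first `limit` cast rows of one title in billing order."""
    cast = [
        row for row in rows
        if row["media_id"] == media_id
        and row["role_type"] == "cast"
        and row["rank"] is not None  # rows without a rank cannot be ordered
    ]
    # Convert to int so that "2" sorts before "10".
    cast.sort(key=lambda row: int(row["rank"]))
    return cast[:limit]
```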

peoples.parquet

Person records (actors, directors, writers, etc.).

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Person ID. |
| name | UTF8 | Yes | Full name. |

media_videos.parquet

Trailers and video clips.

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal media ID. |
| source | UTF8 | Yes | Video source ("youtube" or "rumble"). |
| video_id | UTF8 | Yes | Video ID on the source platform. |

providers.parquet

Provider registry with per-country weights, denormalized. One row per (provider, country) combination. Exported from the local metadata staging database.

Sorted by: country, weight

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Provider identifier, e.g. "netflix", "disney_plus". |
| parent_id | UTF8 | Yes | Parent provider ID for add-on channels (e.g. Prime Video for Showtime). NULL for direct providers. |
| name | UTF8 | Yes | Display name, e.g. "Netflix", "Disney+". |
| logo | UTF8 | Yes | Logo image URL. |
| short_id | UTF8 | Yes | Short identifier used in display contexts. |
| country | UTF8 | Yes | Country code (ISO 3166-1 alpha-2) for this weight entry. |
| weight | UTF8 | Yes | Display order weight within the country (higher = shown first). |
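As an illustration of the "higher = shown first" rule, the sketch below orders the providers of one country for display. It assumes rows are plain dicts, that `weight` parses as a number, and that a NULL weight should sort last; those last two points are assumptions, and `providers_for` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def providers_for(rows: Iterable[Dict[str, Any]], country: str) -> List[Dict[str, Any]]:
    """Return one country's provider rows, highest display weight first."""
    picked = [row for row in rows if row["country"] == country]

    def weight(row: Dict[str, Any]) -> float:
        # weight is a string column; NULL is treated as the lowest weight.
        return float(row["weight"]) if row["weight"] is not None else float("-inf")

    return sorted(picked, key=weight, reverse=True)
```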

provider_weights.parquet

Per-country provider popularity weights.

| Column | Type | Null | Description |
|---|---|---|---|
| provider_id | UTF8 | Yes | Provider ID (join with providers.parquet). |
| country | UTF8 | Yes | ISO 3166-1 alpha-2 country code. |
| weight | UTF8 | Yes | Popularity weight (higher = more popular in this country). |

collections.parquet

Curated collections metadata. Collections can be editorial (hand-picked), dynamic (rule-based, evaluated at snapshot time), or thematic (curated around a theme). Only public collections are exported.

Sorted by: position, id

| Column | Type | Null | Description |
|---|---|---|---|
| id | UTF8 | Yes | Collection identifier. |
| name | UTF8 | Yes | Display name. |
| slug | UTF8 | Yes | URL-safe slug. |
| description | UTF8 | Yes | Description text. |
| country | UTF8 | Yes | Country code this collection is curated for (ISO 3166-1 alpha-2). NULL = global. |
| language | UTF8 | Yes | Primary language code for the collection (ISO 639-1). |
| category | UTF8 | Yes | Collection category: "editorial", "dynamic", or "thematic". |
| position | UTF8 | Yes | Display order (lower = shown first). |
| cover_media_id | UTF8 | Yes | Internal media ID used as the collection cover/hero image. NULL if none. |
| countries | UTF8 | Yes | JSON array of ISO 3166-1 alpha-2 country codes this collection targets. NULL = all countries. |
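The "NULL = all countries" rule on `countries` can be sketched as a small filter. This example only consults the `countries` JSON array and ignores the separate `country` column, which is a simplification; it also assumes `position` parses as an integer and places rows with a NULL position last. `collections_for` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Iterable, List


def collections_for(rows: Iterable[Dict[str, Any]], country: str) -> List[Dict[str, Any]]:
    """Return collections targeting `country`, in display order."""
    visible = []
    for row in rows:
        raw = row["countries"]
        # NULL means the collection targets every country.
        if raw is None or country in json.loads(raw):
            visible.append(row)

    def order(row: Dict[str, Any]):
        position = row["position"]
        # position is a string column; NULL positions sort after all others.
        return (float(position) if position is not None else float("inf"), row["id"])

    return sorted(visible, key=order)
```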

media_collections.parquet

Media collection memberships, denormalized with media fields.

Sorted by: collection_id, position

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal Popcorn Time media identifier. |
| collection_id | UTF8 | Yes | Collection identifier. |
| position | UTF8 | Yes | Display position within the collection (lower = shown first). |
| slug | UTF8 | Yes | Media URL-safe slug. |
| title | UTF8 | Yes | Media primary display title. |
| original_title | UTF8 | Yes | Media original-language title. |
| year | UTF8 | Yes | Media release year as a string. |
| content_type | UTF8 | Yes | Media content type: "movie" or "tv_show". |
| poster | UTF8 | Yes | Media poster image path (TMDB CDN relative path). |
| backdrop | UTF8 | Yes | Media backdrop image path (TMDB CDN relative path). |

Featured homepage content: editorially curated hero carousel items, denormalized with media or collection fields. Entries can be individual media or collections. They are managed via the team portal and synced through the pipeline.

Sorted by: country, feature_kind, rank

| Column | Type | Null | Description |
|---|---|---|---|
| country | UTF8 | Yes | ISO 3166-1 alpha-2 country code. |
| r#type | UTF8 | Yes | Featured item type: "media" or "collection". |
| media_id | UTF8 | Yes | Internal media identifier. Set when type = "media". |
| collection_id | UTF8 | Yes | Collection identifier. Set when type = "collection". |
| rank | UTF8 | Yes | Display rank within the feature kind (lower = shown first). |
| feature_kind | UTF8 | Yes | Feature period: "day" (refreshed daily) or "week" (refreshed weekly). |
| featured_from | UTF8 | Yes | Date from which this featured entry is active (YYYY-MM-DD). |
| slug | UTF8 | Yes | Media or collection URL-safe slug. |
| title | UTF8 | Yes | Media title or collection name. |
| original_title | UTF8 | Yes | Media original-language title. NULL for collections. |
| year | UTF8 | Yes | Media release year. NULL for collections. |
| content_type | UTF8 | Yes | Content type: "movie", "tv_show", or "collection". |
| tagline | UTF8 | Yes | Editorial tagline override (e.g. "Just landed on Netflix") or media's default tagline. |
| overview | UTF8 | Yes | Collection description or media plot overview. |
| poster | UTF8 | Yes | Poster image path (TMDB CDN relative path). NULL for collections. |
| backdrop | UTF8 | Yes | Backdrop image path (TMDB CDN relative path). NULL for collections. |
| popularity | UTF8 | Yes | TMDB popularity score as a string. |
| tmdb_rating | UTF8 | Yes | TMDB average rating as a string (0–10). NULL for collections. |

lang_{language}.parquet

Per-language media translations. One file per language, e.g. `translations/lang_en.parquet`. Contains localized title, poster, backdrop, tagline, and overview.

Sorted by: media_id

| Column | Type | Null | Description |
|---|---|---|---|
| media_id | UTF8 | Yes | Internal Popcorn Time media identifier. |
| title | UTF8 | Yes | Localized title. |
| poster | UTF8 | Yes | Localized poster image path (TMDB CDN relative path). |
| backdrop | UTF8 | Yes | Localized backdrop image path (TMDB CDN relative path). |
| tagline | UTF8 | Yes | Localized tagline. |
| overview | UTF8 | Yes | Localized plot overview. |
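One plausible way to consume a translation file is to overlay its non-NULL fields on the matching media_details row. The fallback to the base value when a localized field is NULL is an assumption made for this sketch, not documented behaviour, and `localize` is a hypothetical helper working on plain dicts.

```python
from typing import Any, Dict, Optional

# Columns that a lang_{language}.parquet row can override.
LOCALIZED_FIELDS = ("title", "poster", "backdrop", "tagline", "overview")


def localize(media: Dict[str, Any],
             translation: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of a media row with localized fields applied.

    Fields that are NULL (None) in the translation keep the base value;
    a missing translation row leaves the media row unchanged.
    """
    out = dict(media)
    if translation is not None:
        for field in LOCALIZED_FIELDS:
            value = translation.get(field)
            if value is not None:
                out[field] = value
    return out
```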