"A media market, broadcast market, media region, designated market area (DMA), television market area, or simply market is a region where the population can receive the same (or similar) television and radio station offerings." --- Wikipedia
In the Internet era, with an ever-increasing proportion of households that have cut the cord, media markets mean less than ever. But local television news continues to draw older viewers, and local network affiliates remain an important part of the media diets of older Americans.
Using FCC DTV Maps, we first create a comprehensive database of all local channels per zip code. Then, we cluster zip codes in two different ways: 1. using k-means based on the overlap between TV stations, and 2. using Manhattan distance.
We iterate over the list of zip codes, query FCC DTV Maps for each one, and produce a CSV with the following columns (each zip code has multiple rows, one row per channel):
zipcode, callsign, network, channel_number, band, ia, signal_strength (strong/moderate, weak, no signal), facility_id, city_of_license, rf_channel, rx_strength, tower_distance, repacked_channel, repacking_dates
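As a rough illustration of how the one-row-per-channel layout can be turned into a list of stations per zip code, here is a minimal sketch. The sample rows, the callsigns, the helper name `stations_by_zip`, and the assumption that `signal_strength` holds the literal values `strong`, `moderate`, `weak`, or `no signal` are all hypothetical; only the column names come from the list above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. Only a subset of the columns is shown, and the
# zip codes and callsigns are made up for illustration.
SAMPLE_CSV = """zipcode,callsign,network,signal_strength
00001,WAAA,ABC,strong
00001,WBBB,NBC,weak
00002,WAAA,ABC,moderate
00002,WCCC,CBS,strong
"""


def stations_by_zip(csv_file, keep=("strong", "moderate")):
    """Map each zip code to the set of callsigns whose signal strength is in `keep`."""
    stations = defaultdict(set)
    for row in csv.DictReader(csv_file):
        # Zip codes stay strings so that leading zeros are preserved.
        if row["signal_strength"].strip().lower() in keep:
            stations[row["zipcode"]].add(row["callsign"])
    return dict(stations)


zip_stations = stations_by_zip(io.StringIO(SAMPLE_CSV))
print(zip_stations)
```

With a real file, the `io.StringIO(...)` argument would be replaced by an open file handle; which signal strengths to keep is a choice left to the analyst.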
For about 2,000 zip codes, the search came back empty. Here's the log file.
- group_zips: clusters zip codes based on the overlap between their lists of TV stations (with a certain signal strength) and appends the grouping variable. The clustering is deterministic (zip codes are grouped when they are within a certain Manhattan distance of each other) and multi-assignment (each zip code can be part of multiple clusters). We run it for diff = 0, 1, and 2 and save the outputs in new_diff_0_group_hash.pkl, new_diff_1_group_hash.pkl, and new_diff_2_group_hash.pkl respectively (all compressed).
- k_means: uses k-means to cluster zip codes based on the overlap between their lists of TV stations (with a certain signal strength) and appends the grouping variable. We run it for k = 200 and save the output in k_means_200.pkl (compressed).
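The two approaches can be sketched on toy data as follows. This is one possible reading of the descriptions above, not the scripts themselves: it assumes each zip code is represented by the set of stations it receives, that Manhattan distance is computed on the corresponding 0/1 vectors (which equals the number of stations received by exactly one of the two zip codes), and that each zip code's group is every zip code within `diff` of it. The k-means part is a small pure-Python version with a farthest-point initialization chosen here only to keep the example deterministic. All function names, zip codes, and callsigns are hypothetical.

```python
# Toy data: zip code -> set of callsigns received (all values are made up).
zip_stations = {
    "00001": {"WAAA"},
    "00002": {"WAAA", "WBBB"},
    "00003": {"WAAA", "WBBB", "WCCC"},
    "00004": {"WDDD", "WEEE"},
    "00005": {"WDDD"},
}


def group_by_distance(zip_stations, diff=0):
    """For each zip code, list every zip code within Manhattan distance `diff`.

    On 0/1 station vectors the Manhattan distance is the size of the
    symmetric difference of the two station sets.
    """
    return {
        z: sorted(o for o, other in zip_stations.items()
                  if len(stations ^ other) <= diff)
        for z, stations in zip_stations.items()
    }


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def k_means(points, k, n_iter=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Manhattan-distance grouping for diff = 0 and diff = 1.
groups_0 = group_by_distance(zip_stations, diff=0)
groups_1 = group_by_distance(zip_stations, diff=1)

# k-means on 0/1 station vectors (k = 2 here; the text above uses k = 200).
zips = sorted(zip_stations)
vocabulary = sorted(set().union(*zip_stations.values()))
points = [[1.0 if s in zip_stations[z] else 0.0 for s in vocabulary] for z in zips]
kmeans_labels = dict(zip(zips, k_means(points, k=2)))

print(groups_1)
print(kmeans_labels)
```

In the diff = 1 output, zip code 00002 appears in the groups of 00001, 00002, and 00003, which is what multi-assignment means here; k-means, by contrast, gives each zip code exactly one label. The pairwise comparison is quadratic in the number of zip codes, so a full run would likely need something more efficient.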
import pandas as pd

# Load the diff = 0 groupings from the xz-compressed pickle.
groups = pd.read_pickle("new_diff_0_group_hash.pkl", compression="xz")
Suriyan Laohaprapanon and Gaurav Sood