---
name: kingsets-bigquery
description: Use this skill when the user asks to query Kalshi or Polymarket prediction market data in BigQuery
---

# KingSets BigQuery Skill

Your job is to craft — but not execute — BigQuery SQL for fetching Kalshi/Polymarket prediction market data from the KingSets dataset.

**BigQuery location:** project `pmdsproject`, dataset `pmdsbeta`. All tables must be referenced as `pmdsproject.pmdsbeta.<TABLE_NAME>`. There is no `kingsets` project — do not query from it.

---

## Hard Rules

- **Never use JOIN for trades queries.** Use a `WITH` CTE to find market IDs/tickers first, then select trades for those in the next step.
- **Always partition trades queries by date.** `kalshi_trades` is partitioned on `DT` (TIMESTAMP, DAY). `polymarket_block_trades` is partitioned on `BLOCK_DT` (TIMESTAMP, DAY). Unpartitioned queries on these tables are very expensive. Always ask the user for a date range before writing a trades query. Always filter using `DATE(field) IN (...)` or `DATE(field) BETWEEN ... AND ...` — never use raw TIMESTAMP comparisons like `BLOCK_DT >= '...'` as these do not properly leverage the DAY partition.
- **Always use case-insensitive LIKE for text filtering.** Text fields are not guaranteed to follow consistent casing. Use `LOWER(field) LIKE '%value%'` or `UPPER(field) LIKE '%VALUE%'`. Never apply `LOWER()`/`UPPER()` to numeric fields like `EVENT_ID` or `MARKET_ID` (type `INTEGER`), which will cause a type error.
- **All timestamps in the DB are UTC.** Many market text fields reference ET times — be careful when filtering by date.
- **Never use `START_DT` / `END_DT` to find markets for a particular day.** These fields represent when the market was opened/closed for trading, not when a sports match took place or when a crypto period (5m, 15m, 1h, etc.) started. Instead filter on text fields using date patterns -- formats vary, so use multiple LIKE conditions:
  - **Kalshi:** check `LOWER(EVENT_TICKER)` for `YYMONDD` format, e.g. `%26mar01%` for 1 March 2026
  - **Polymarket:** check `LOWER(EVENT_TICKER)` and `LOWER(MARKET_SLUG)` for formats like `%2026-03-01%` or `%-march-1-%`. Additionally, always include `OR DATE(END_DT) = 'YYYY-MM-DD'` as a fallback, because the date embedded in tickers and slugs can sometimes be inaccurate.
- **Never JOIN trades tables with markets tables just to get `ASSET_LABEL`.** Both `kalshi_trades` and `polymarket_block_trades` already contain `ASSET_LABEL` directly. Only use a CTE/subquery on the markets table when you need to look up market IDs or tickers — never when the user has already provided them directly.
- **Never filter `kalshi_markets` by `SERIES_TICKER`** — that field does not exist in the table. To scope by series, filter on `EVENT_TICKER` using a LIKE pattern (e.g. `LOWER(EVENT_TICKER) LIKE 'kxbtc%'`).
- **Never filter `polymarket_markets` by `SERIES_ID`** — that field does not exist in the table. To scope by series, join or filter via `EVENT_ID` from `polymarket_events`.
- **When filtering trades for markets on a specific date, ask user if the query should widen the date range by one day on each side.** For example, for hourly markets on 1 March 2026, filter `DT` / `BLOCK_DT` from 28 Feb to 2 Mar. This would account for UTC/ET offsets and ensures no trades are missed at day boundaries.
- **Default sort order for trades:** `kalshi_trades` should be ordered by `DT`. `polymarket_block_trades` should be ordered by `BLOCK_NUMBER, TX_INDEX, LOG_INDEX` — trades from the same block share the same `BLOCK_DT` so intra-block ordering requires these fields.
- **Discovery first.** When the user describes a market they saw on the Kalshi/Polymarket website, always write a discovery query against events/markets tables before attempting a trades query. Naming conventions are inconsistent and must be confirmed before assuming.

---

## Data Hierarchy

```
Series → Events → Markets → Trades
```

- **Kalshi:** joined by `SERIES_TICKER` → `EVENT_TICKER` → `MARKET_TICKER`
- **Polymarket:** joined by `SERIES_ID` → `EVENT_ID` → `MARKET_ID`

---

## Schemas

### Kalshi

**kalshi_series**
| Field | Type |
|---|---|
| SERIES_TICKER | STRING |
| SERIES_TITLE | STRING |
| CATEGORY | STRING |
| FREQUENCY | STRING |
| SERIES_TAGS | STRING |
| SETTLEMENT_SOURCES | STRING |
| INGESTION_DT | TIMESTAMP |

**kalshi_events**
| Field | Type |
|---|---|
| SERIES_TICKER | STRING |
| EVENT_TICKER | STRING |
| EVENT_TITLE | STRING |
| EVENT_SUBTITLE | STRING |
| CATEGORY | STRING |
| INGESTION_DT | TIMESTAMP |

**kalshi_markets**
| Field | Type |
|---|---|
| EVENT_TICKER | STRING |
| MARKET_TICKER | STRING |
| MARKET_TITLE | STRING |
| MARKET_SUBTITLE | STRING |
| YES_SUBTITLE | STRING |
| NO_SUBTITLE | STRING |
| RESULT | STRING |
| START_DT | TIMESTAMP |
| END_DT | TIMESTAMP |
| CLOSED_DT | TIMESTAMP |
| INGESTION_DT | TIMESTAMP |

**kalshi_trades** *(partitioned by DT, DAY)*
| Field | Type |
|---|---|
| DT | TIMESTAMP |
| SERIES_TICKER | STRING |
| EVENT_TICKER | STRING |
| MARKET_TICKER | STRING |
| ASSET_LABEL | STRING |
| ASSET_SIDE | STRING |
| SIDE | STRING |
| SIZE | BIGNUMERIC |
| PRICE | BIGNUMERIC |
| TRADE_ID | STRING |

### Polymarket

**polymarket_series**
| Field | Type |
|---|---|
| SERIES_ID | INTEGER |
| SERIES_TICKER | STRING |
| SERIES_TITLE | STRING |
| SERIES_TYPE | STRING |
| RECURRENCE | STRING |
| INGESTION_DT | TIMESTAMP |

**polymarket_events**
| Field | Type |
|---|---|
| SERIES_ID | INTEGER |
| EVENT_ID | INTEGER |
| EVENT_TICKER | STRING |
| EVENT_TITLE | STRING |
| START_DT | TIMESTAMP |
| END_DT | TIMESTAMP |
| CLOSED_DT | TIMESTAMP |
| RESOLUTION_SOURCE | STRING |
| EVENT_TAGS | STRING |
| INGESTION_DT | TIMESTAMP |

**polymarket_markets**
| Field | Type |
|---|---|
| EVENT_ID | INTEGER |
| MARKET_ID | INTEGER |
| MARKET_SLUG | STRING |
| QUESTION | STRING |
| START_DT | TIMESTAMP |
| END_DT | TIMESTAMP |
| CLOSED_DT | TIMESTAMP |
| RESOLUTION_SOURCE | STRING |
| ASSETA_LABEL | STRING |
| ASSETA_ID | STRING |
| ASSETB_LABEL | STRING |
| ASSETB_ID | STRING |
| RESULT | STRING |
| MARKET_TAGS | STRING |
| CONDITION_ID | STRING |
| INGESTION_DT | TIMESTAMP |

**polymarket_block_trades** *(partitioned by BLOCK_DT, DAY)*
| Field | Type |
|---|---|
| BLOCK_DT | TIMESTAMP |
| BLOCK_NUMBER | STRING |
| TX_INDEX | INTEGER |
| LOG_INDEX | INTEGER |
| SERIES_ID | INTEGER |
| EVENT_ID | INTEGER |
| MARKET_ID | INTEGER |
| ASSET_LABEL | STRING |
| SIDE | STRING |
| SIZE | BIGNUMERIC |
| PRICE | BIGNUMERIC |
| TAKER_ADDRESS | STRING |
| ASSET_ID | STRING |
| TX_HASH | STRING |

---

## Trade Price Format

Prices are decimal. `0.58` = 58 cents as shown on the website.

---

## Market Results & Winner Detection

### Kalshi

`kalshi_markets.RESULT` is either `"yes"` or `"no"`, indicating which binary outcome was settled as the winner. When joining with `kalshi_trades`, compare `kalshi_trades.ASSET_SIDE = kalshi_markets.RESULT` to determine whether the asset of that trade was the winning outcome.

### Polymarket

`polymarket_markets.RESULT` contains the asset ID of the winning outcome, equal to either `ASSETA_ID` or `ASSETB_ID`. When joining with `polymarket_block_trades`, compare `polymarket_block_trades.ASSET_ID = polymarket_markets.RESULT` to determine whether the asset of that trade was the winning outcome.

---

## Kalshi Naming Patterns & Quirks

- **Event tickers** follow `{SERIES_TICKER}-{SUFFIX}`. The series ticker is the root (e.g. `KXNBAGAME`, `KXBTC15M`).
- **Sports event tickers** encode the game date and teams: `KXNBAGAME-26MAR01CLEBKN` = NBA game on 2026-03-01, Cleveland vs Brooklyn. The date portion (`26MAR01`) is likely an ET date, not UTC.
- **Sports secondary markets** (points, assists, rebounds, spread, etc.) use different series tickers with the same date/team suffix: `KXNBAPTS-26MAR01CLEBKN`, `KXNBASPREAD-26MAR01CLEBKN`, etc. To find all markets for a game, filter `EVENT_TICKER LIKE '%26MAR01CLEBKN%'` and exclude/include `LIKE '%GAME-%'` to separate win/loss from stats markets.
- **League codes in tickers are not standardised across all leagues.** NBA/NHL/NFL/MLB/MLS are known patterns (`KXNBA`, `KXNHL`, etc.), but other leagues (PGA, Premier League, etc.) need to be discovered by LIKE filtering on `EVENT_TICKER` or `EVENT_TITLE`.
- **Kalshi markets are always binary (YES/NO).** Even in a sports match, each team has its own market record (e.g. `KXNBAGAME-26MAR01CLEBKN-CLE` and `KXNBAGAME-26MAR01CLEBKN-BKN`), and a trade's `ASSET_SIDE` is always `yes` or `no`.
- **Crypto event ticker patterns:**
  - 15-min up/down: `KXBTC15M-{YYMMDD}{HHMM}` (e.g. `KXBTC15M-26FEB280030`)
  - Hourly anchor price: `KXBTCD-{YYMMDD}{HH}` (e.g. `KXBTCD-26FEB2810`)
  - Hourly range: `KXBTC-{YYMMDD}{HH}` (e.g. `KXBTC-26FEB2823`)
  - Monthly high/low: `KXBTCMAXMON-BTC-{YYMMDD}` / `KXBTCMINMON-BTC-{YYMMDD}`
  - The hour in the ticker is ET, not UTC.
- **Useful discovery fields on kalshi_events:** `EVENT_TITLE`, `EVENT_SUBTITLE`, `CATEGORY` (`Sports`, `Crypto`, etc.)

---

## Polymarket Naming Patterns & Quirks

- **Tags are the primary filter mechanism.** `EVENT_TAGS` and `MARKET_TAGS` contain bracket-wrapped labels like `[nba]`, `[games]`, `[soccer]`, `[crypto]`, `[up-or-down]`, `[weekly]`, `[5m]`, etc. Always filter with `LOWER(field) LIKE '%[tag]%'`.
- **`CLOSED_DT IS NULL`** means the event/market is ongoing or upcoming.

### ⚠️ Sports: draws vs no-draws — critical difference

This is the most important structural quirk in Polymarket:

**Sports with no draws (NBA, NHL, NFL, MLB, etc.):**
- One `polymarket_events` record per match.
- Multiple `polymarket_markets` records per event (match winner, spreads, over/under, etc.).
- The main winner market is identified by: `polymarket_markets.QUESTION = polymarket_events.EVENT_TITLE`.
- `ASSETA_LABEL` / `ASSETB_LABEL` contain the two team names.
- Filter events by: `LOWER(EVENT_TAGS) LIKE '%[nba]%' AND LOWER(EVENT_TAGS) LIKE '%[games]%'`

**Sports with draws (soccer, etc.):**
- One match spans **multiple** `polymarket_events` records — one per match winner (home win, away win, draw) plus additional events for sub-markets (halftime result, more markets, etc.).
- The main match event is the one where `EVENT_TITLE` does not contain a dash followed by a description suffix (e.g. `"FC Internazionale Milano vs. Genoa CFC"` is the main event; `"FC Internazionale Milano vs. Genoa CFC - Halftime Result"` is a sub-event).
- Each outcome (home/away/draw) is a separate `polymarket_markets` record under the main event, with `QUESTION` describing that specific outcome (e.g. `"Will FC Internazionale Milano win on 2026-02-28?"`).
- `ASSETA_LABEL` / `ASSETB_LABEL` are `Yes` / `No` in this case (not team names).
- Filter events by sport: `LOWER(EVENT_TAGS) LIKE '%[soccer]%' AND LOWER(EVENT_TAGS) LIKE '%[games]%'`; league codes vary (e.g. `[ucl]`, `[sea]`) and should be discovered first.

### Polymarket crypto patterns

- **Up/Down markets:** filter `LOWER(MARKET_TAGS) LIKE '%[up-or-down]%'`. Time granularity tags: `[5m]`, `[15m]`, `[1h]`, `[daily]`. Coin tags: `[bitcoin]`, `[ethereum]`, `[solana]`, `[xrp]`.
- **Target price markets** (above/below a specific price): filter `LOWER(MARKET_TAGS) LIKE '%[crypto]%'` and `LOWER(MARKET_TAGS) NOT LIKE '%[up-or-down]%'`. Granularity: `[weekly]`, `[monthly]`, `[daily]`.
- The database does not contain the anchor/reference price for up/down markets.
- `ASSETA_LABEL` / `ASSETB_LABEL` for crypto up/down are `Up` / `Down`; for target price markets they are `Yes` / `No`.

---

## Trade Asset & Side Fields

Understanding these fields is essential for correct aggregations (min/max price, volume, etc.):

**Polymarket (`polymarket_block_trades`):**
- Each `MARKET_ID` has exactly 2 tradable outcomes, stored as `ASSET_LABEL` (e.g. `Yes`/`No`, `Up`/`Down`, `Penguins`/`Rangers`).
- Always `GROUP BY ASSET_LABEL` when aggregating over a market.

**Kalshi (`kalshi_trades`):**
- `ASSET_LABEL` identifies the market's subject (e.g. `Cleveland`) and is the same for all trades of a `MARKET_TICKER`.
- `ASSET_SIDE` is the binary outcome: `yes` or `no`.
- Always `GROUP BY ASSET_LABEL, ASSET_SIDE` when aggregating over a market.

**Both exchanges:**
- `SIDE` (`BUY`/`SELL`) is the trader's action, independent of the outcome. It is not an outcome field and should not be confused with `ASSET_SIDE`.

---

## Discovery-First Workflow

When a user asks for trades data, follow this order:

1. **Identify the exchange** (Kalshi or Polymarket).
2. **Write a discovery query** against events (and markets if needed) using LIKE filters on title/ticker/tags fields to locate the relevant records. Ask the user to confirm the results look right.
3. **Extract the market ID(s)/ticker(s)** from confirmed results.
4. **Ask for a date range** for the trades query.
5. **Write the trades query** using a `WITH` CTE for market IDs, then select trades with a date partition filter.

If the user doesn't know which league code or series ticker to use, provide a broad discovery query (e.g. filter by `EVENT_TITLE LIKE '%keyword%'`) and explain they may need to adjust the filter after seeing results.
