Tayyar dataset

A position dataset for MENA political actors: 98 parties and 274 politicians across 20 countries, scored on 16 axes. Verified facts carry an external source citation (190 of 372 entities so far); the entire dataset is exported in full as CSV and JSON, and the code that produces it will be open-sourced under MIT. This page is the canonical citable surface: the cards below summarize what's in the dataset, and the methodology links below jump to the deep dives.

Snapshot 2026-06-21 · v0.2 · CC BY-NC-SA 4.0

Total entities: 372 (98 parties · 274 politicians)
Verified: 51% (190 / 372 with source citations)
Countries: 20 (MENA states with seeded party and politician data)
Axes: 16 (2 compass + 14 issue, each with a formal rubric)
Position scores: 1195 (including composite + lens-divergent: declared / behavioral)
Events tracked: 98 (/pulse, semi-live with confidence tier)
Source documents: 291 (/documents, verbatim primary-source corpus)
Verified quotes: 101 (/quotes, every row with at least one URL citation)

What's included

Field-level source citations. Each fact about each entity links back to where it came from — founding year cites a different source than current leader, which cites a different source than legal status. The coverage page tracks the rollup.
16 calibrated axes. Economic, social, state-religion, democracy, west-alignment, regional-stance, Palestinian question, civil liberties, regime stance, pan-Arab, federalism, modernization, gender, iran-posture, press-freedom, sectarianism. Each one comes with a scoring rubric and concrete MENA examples anchored along the scale.
Richer status than yes/no. Parties carry a government role (lead, coalition major / minor, confidence-and-supply, opposition major / minor, extra-parliamentary, banned) and a legal status (legal, restricted, outlawed, dissolved, merged away). Opposition and independent flags sit alongside.
Declared vs. behavioral on key cases. Parties like Hezbollah and Hamas read as more committed to democracy in what they declare than in what they do; 20 parties carry a rhetoric-vs-record gap that is itself the finding. The home compass has a lens toggle to switch between views.
291 primary-source documents and 101 verified quotes. The document corpus carries verbatim manifestos, charters, parliamentary speeches, and UN addresses with country / party / politician attribution. The quote corpus drives Who-said-it and the "On the record" sections on every party / politician page.
A semi-live event feed. Pulse tracks recent political shifts with a confidence rating — confirmed, reported, rumored, speculative — so in-flux developments (party formations, merger talks) can sit alongside confirmed events without being conflated. Subscribable as RSS.
Free downloads. Every table is available as CSV or JSON at /data. No API key, no rate limit.
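
The status vocabularies, event confidence tiers, and declared-versus-behavioral lenses described above could be modelled roughly as follows. This is a minimal sketch, not the dataset's actual schema: the class names, field names, the gap threshold, the score values, and the example party are all assumptions made for illustration. Only the enumerated role, status, and confidence values come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class GovernmentRole(Enum):
    LEAD = "lead"
    COALITION_MAJOR = "coalition major"
    COALITION_MINOR = "coalition minor"
    CONFIDENCE_AND_SUPPLY = "confidence-and-supply"
    OPPOSITION_MAJOR = "opposition major"
    OPPOSITION_MINOR = "opposition minor"
    EXTRA_PARLIAMENTARY = "extra-parliamentary"
    BANNED = "banned"


class LegalStatus(Enum):
    LEGAL = "legal"
    RESTRICTED = "restricted"
    OUTLAWED = "outlawed"
    DISSOLVED = "dissolved"
    MERGED_AWAY = "merged away"


class Confidence(Enum):
    """Confidence tiers used by the event feed."""
    CONFIRMED = "confirmed"
    REPORTED = "reported"
    RUMORED = "rumored"
    SPECULATIVE = "speculative"


@dataclass(frozen=True)
class LensScore:
    """One axis scored under two lenses (hypothetical layout)."""
    axis: str
    declared: float
    behavioral: float

    @property
    def gap(self) -> float:
        # Positive gap: rhetoric scores higher than record on this axis.
        return self.declared - self.behavioral


@dataclass(frozen=True)
class Party:
    name: str
    government_role: GovernmentRole
    legal_status: LegalStatus
    lens_scores: Tuple[LensScore, ...] = ()

    def divergent_axes(self, threshold: float = 2.0) -> List[str]:
        """Axes whose declared/behavioral gap meets an assumed threshold."""
        return [s.axis for s in self.lens_scores if abs(s.gap) >= threshold]


# Illustrative values only; not taken from the dataset.
example = Party(
    name="Example Party",
    government_role=GovernmentRole.OPPOSITION_MINOR,
    legal_status=LegalStatus.RESTRICTED,
    lens_scores=(
        LensScore("democracy", declared=4.0, behavioral=-3.0),
        LensScore("economic", declared=1.0, behavioral=0.5),
    ),
)
print(example.divergent_axes())
```

Keeping government role and legal status as separate enumerations mirrors the point made above: a party can be banned from government yet still have a distinct legal standing, so one yes/no flag would lose information.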

Cite as

Cite the paper. The dataset is in active development; a tagged snapshot pins a citation that won't shift as it evolves, and a public repository and permanent DOI are in preparation.

Paper: Gara, T. (2026). The Model as One Rater Among Several: Measuring Political Positions in Data-Sparse Regions with a Language-Model Panel. Preprint; arXiv ID forthcoming.

BibTeX:

@unpublished{gara_tayyar_2026,
  author = {Gara, Tarek},
  title  = {The Model as One Rater Among Several: Measuring Political Positions in Data-Sparse Regions with a Language-Model Panel},
  year   = {2026},
  note   = {Preprint; arXiv ID forthcoming},
  url    = {https://tarekgara.com/tayyar/paper}
}

APA: Gara, T. (2026). The Model as One Rater Among Several: Measuring Political Positions in Data-Sparse Regions with a Language-Model Panel [Preprint]. https://tarekgara.com/tayyar/paper

Where to read more

Methodology — how the dataset got built and where it falls short
Findings — the structural patterns the data shows
Coverage — verification status, country breakdowns, special-status leaderboards
Axes catalog — all 16 axes with correlations and per-axis stats

Downloads

Every table is exportable as CSV or JSON at /data. No authentication required.
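
As a rough illustration of working with a downloaded CSV export, the sketch below parses score rows and groups them by entity. The column names (`entity_id`, `axis`, `lens`, `score`) and the sample rows are hypothetical; the actual table layouts are not described on this page, so check the real export before relying on any of these names. The sample is inlined so that the example runs without network access.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a file saved from the downloads page.
SAMPLE_CSV = """entity_id,axis,lens,score
party-a,democracy,declared,6.0
party-a,democracy,behavioral,-2.0
party-b,economic,composite,1.5
"""


def parse_scores(csv_text):
    """Parse CSV text into a list of dicts with a numeric score field."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["score"] = float(row["score"])
        rows.append(row)
    return rows


def scores_by_entity(rows):
    """Group scores as {entity_id: {(axis, lens): score}}."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["entity_id"]][(row["axis"], row["lens"])] = row["score"]
    return dict(grouped)


rows = parse_scores(SAMPLE_CSV)
grouped = scores_by_entity(rows)
print(grouped["party-a"][("democracy", "declared")])
```

For a real file, the same `parse_scores` function could be fed the contents of the saved export, for example via `open(path, newline="", encoding="utf-8").read()`.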

What's coming next

Changes are tracked as versioned snapshots; the repository will be open-sourced under MIT on publication. The roadmap, in the order it'll land:

Document-grounded scoring. Positions derived from reading party platforms, speeches, and voting records — with the specific passages cited. The hand-coded scores stay as the baseline; the document-grounded ones replace them as they're produced.
Inter-rater agreement. Cohen's κ between hand-coded and document-grounded scores will be reported on the Methodology page. Where they agree, the rubric is doing its job; where they don't, that's a finding worth writing up.
Lens system at scale. Declared / behavioral / perceived rows generated for every party, not just the hand-coded marquee cases. The compass lens toggle then carries information across the whole dataset.
Second-pass verification. Each fact and each position score reviewed against primary sources by someone other than the author.
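
The inter-rater agreement item above refers to Cohen's κ, which compares the observed agreement between two raters with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes the unweighted form on categorical labels. It is an illustration under assumptions, not the project's evaluation code: the category labels and sample ratings are invented, and numeric axis scores would first need binning into categories (or a weighted κ variant) before this applies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented labels standing in for binned hand-coded vs. document-grounded scores.
hand_coded = ["left", "left", "center", "right", "right", "center"]
doc_grounded = ["left", "center", "center", "right", "right", "right"]
kappa = cohens_kappa(hand_coded, doc_grounded)
print(round(kappa, 3))
```

In this invented example the raters agree on 4 of 6 items (p_o = 2/3) while chance agreement is p_e = 1/3, so κ comes out at about 0.5.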