TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend

About TxT360

TL;DR We introduce TxT360 (Trillion eXtracted Text), the first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources from diverse domains (e.g., FreeLaw, PG-19). The large-scale deduplication process and the rich metadata we store enable precise control over data distribution. We demonstrate a simple but effective upsampling recipe that creates a 15+ trillion-token corpus, outperforming FineWeb 15T on several key metrics. With this information, TxT360 empowers pre-trainers to explore more advanced weighting techniques, a feature not commonly available in previous pre-training datasets. Our findings highlight the importance of both high-quality data sources and appropriate weighting for optimal blending in LLM training.

In line with our 360° open source spirit, we document all detailed steps, the reasons for our decisions, detailed statistics, our code (stay tuned!), analysis results, and more, in addition to the dataset itself. We hope this can serve as a useful resource for future developers.

Building on prior studies of pre-training data, TxT360 carefully implements data processing steps including extraction, filtering, deduplication, personally identifiable information removal, and other steps. Unlike DCLM and RedPajama V2, we also hope to provide a dataset at this scale that is ready to go, without requiring further filtering.

How to Read this Blog Post?

This document contains all the details and is lengthy. We recommend using the Table of Contents to jump to the appropriate sections. The post might also be slightly too long for mobile reading (sorry!). At the start of each top-level section, we provide a quick guide to its content. We also recommend treating this post as a reference for high-level statistics related to pre-training datasets.


Why TxT360

This year we have seen excellent datasets released by the community. Most of these datasets focus on a single source (e.g., crawled websites, code bases, papers). However, combining these sources is not trivial because of the potential duplication across them. TxT360 is the first dataset to combine most of the sources commonly used in pretraining.

Data Source	TxT360	FineWeb	RefinedWeb	RedPajamaV2	C4	Dolma	RedPajamaV1	The Pile
CommonCrawl Snapshots	99	96	90	84	1	24	5	0.6% of 74
Papers	5 Sources	-	-	-	-	1 Source	1 Source	4 Sources
Wikipedia	310+ Languages	-	-	-	-	Included	Included	English Only
FreeLaw	Included	-	-	-	-	-		Included
DM Math	Included	-	-	-	-	-		Included
USPTO	Included	-	-	-	-	-		Included
PG-19	Included	-	-	-	-	Included	Included	Included
HackerNews	Included	-	-	-	-	-	-	Included
Ubuntu IRC	Included	-	-	-	-	-	-	Included
EuroParl	Included	-	-	-	-	-	-	Included
StackExchange	Included	-	-	-	-	-	Included	Included
Code	**	-	-	-	-	Included	Included	Included

In LLM pretraining, it is common to combine all possible text sources due to the Scaling Law. Crawled web pages are included to provide a vast quantity of data which can cover long tail and diverse information, while curated datasets such as Wikipedia are also used, which often provide the 'deep-dive' domain information. By integrating the reach of web data with the quality of curated sources, TxT360 meets and surpasses the rigorous standards required for state-of-the-art LLM pre-training.

** TxT360 does not include very specific domains such as code and math. This decision was made because of the perceived low duplication of code with other sources and the different logic required to build those datasets. We leave that to future work and recommend that users refer to existing projects such as Stack V2.

Our Approach

To produce TxT360, a comprehensive data processing pipeline was designed to account for the nuances of both web and curated datasets. The pipeline presents a unified framework for processing both data types, making it convenient and easy for users to adapt and fine-tune for their own use cases.

Web datasets are inherently noisy and varied. The TxT360 pipeline implements sophisticated filtering and deduplication techniques to clean and remove redundancies while preserving data integrity.

Curated datasets are typically structured and consistently formatted, but their source-specific formatting conventions can also cause trouble. TxT360 filters these sources with selective steps to maintain their integrity while providing seamless integration into the larger dataset. Both data source types are globally deduplicated together, resulting in ~5T tokens of high-quality data. The table below shows the source distribution of TxT360 tokens. Note that we do not recommend using the raw distribution of the deduplicated dataset; a simple recipe is provided in the studies section.

Data Source	Raw Data Size	Token Count	Information Cut-Off Date
CommonCrawl	9.2 TB	4.83T	2024-30
Papers	712 GB	154.96B	Q4 2023
Wikipedia	199 GB	35.97B	-
FreeLaw	71 GB	16.7B	Q1 2024
DM Math	22 GB	5.23B	-
USPTO	45 GB	4.95B	Q3 2024
PG-19	11 GB	2.63B	-
HackerNews	4.1 GB	1.08B	Q4 2023
Ubuntu IRC	4.7 GB	1.54B	Q3 2024
EuroParl	6.1 GB	1.96B	-
StackExchange	79 GB	27B	Q4 2023
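
To make the note about the raw distribution concrete, the short sketch below computes each source's share of the deduplicated tokens. The counts are copied from the table above (with 4.83T written as 4830B); the function name `raw_shares` is ours and purely illustrative.

```python
# Token counts in billions, copied from the table above (4.83T = 4830B).
TOKENS_B = {
    "CommonCrawl": 4830.0,
    "Papers": 154.96,
    "Wikipedia": 35.97,
    "FreeLaw": 16.7,
    "DM Math": 5.23,
    "USPTO": 4.95,
    "PG-19": 2.63,
    "HackerNews": 1.08,
    "Ubuntu IRC": 1.54,
    "EuroParl": 1.96,
    "StackExchange": 27.0,
}


def raw_shares(tokens):
    """Return each source's fraction of the total token count."""
    total = sum(tokens.values())
    return {name: count / total for name, count in tokens.items()}


if __name__ == "__main__":
    shares = raw_shares(TOKENS_B)
    # Print sources from largest to smallest share.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:<14} {share:7.2%}")
```

By these counts, CommonCrawl accounts for roughly 95% of the deduplicated tokens, so a model trained on the raw mix would see comparatively little of the curated sources.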

We provide details and context for the choices behind TxT360 in the respective Common Crawl Data Processing and Curated Source Processing sections. A deep dive describing the deduplication process can be found in the Shared Processing Steps section.

Common Crawl Snapshot Processing

What This Section Contains

This section provides a complete discussion on the filtering applied to the 99 Common Crawl snapshots that comprise the web data section of TxT360. The section is split into the following topic areas:

Web Data Processing Summary
Document Preparation
Line-Level Filtering
Local Deduplication
Each section is complete with code and comparisons to Dolma, DataTrove, and/or RedPajama V2
Estimated Reading Time: 31 minutes

To generate a high-quality dataset from large-scale webpages, we investigated the processing steps used by the community and made our choices based on careful manual inspection. Below is a list of the datasets we reviewed and a comparison of the filters we applied.

TxT360 CommonCrawl Filtering vs Other Pretraining Datasets

The following section provides explicit details covering the reasoning and decisions behind each of the filters we applied. The table below provides a high-level comparison of TxT360's filtering compared to other commonly used pretraining datasets.

Dataset	Data Reading	Text Extraction	URL Filtering	Language Identification	Line Removal	PII Filtering	Exact Deduplication	Fuzzy Deduplication
TxT360	warc	trafilatura	Yes	fastText	Yes	Yes	Bloom Filter	Global
FineWeb	warc	trafilatura	Yes	fastText	Yes	Yes	n/a	Local
RefinedWeb	warc	trafilatura	Yes	fastText	Yes	No	ExactSubStr	Local
RedPajamaV2	wet	n/a	Yes	fastText	Yes	No	Bloom Filter	Local
C4	wet	n/a	No	langdetect	Yes	No	n/a	Local
Dolma	warc	?	No	fastText	Yes	Yes	Bloom Filter	Local
RedPajamaV1	wet	n/a	No	fastText	No	No	n/a	Local
The Pile	warc	jusText	No	pycld2	No	No	n/a	Global
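
For readers unfamiliar with the "Bloom Filter" entry in the Exact Deduplication column, here is a minimal sketch of how a Bloom filter can drop exact duplicates in a single streaming pass. It is not the TxT360 implementation: the class, the bit-array size, the number of hash functions, and the choice to hash whole documents are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter (sizes are arbitrary assumptions)."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        """Insert item; return True if it was (probably) already present."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def exact_dedup(docs, bloom):
    """Yield only documents the filter has not seen before."""
    for doc in docs:
        if not bloom.add(doc):
            yield doc
```

A Bloom filter never misses a true duplicate but can occasionally flag a new item as already seen, so a small fraction of unique documents may be dropped. In exchange, memory stays fixed regardless of how many documents pass through.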

The table below compares the quality filters applied to each dataset. Of note, TxT360 does not use any machine learning (ML) based filters. ML filters are a useful and efficient filtering method that should be considered for any filtering project; however, we leave this to future work.

Dataset	QF: ML-based	QF: Repetition-based	QF: Correction-based	QF: Gopher Rules	QF: C4 Rules
TxT360	No	Yes	Yes	Yes	Yes
FineWeb	No	Yes	Yes	Yes	Yes
RefinedWeb	No	Yes	Yes	Yes	Yes
RedPajamaV2	Yes	Yes	No	Yes	Yes
C4	No	No	No	No	Yes
Dolma	No	Yes	No	Yes	Yes
RedPajamaV1	Yes	No	No	No	No
The Pile	Yes	No	No	No	No

Our filtering rate is illustrated below. Before deduplication, our filtering rate is comparable to RefinedWeb. During global deduplication, we removed approximately 85.89% of the data, significantly higher than previous works, indicating a large number of duplicates across snapshots.

A significant portion of the documents is filtered after the whole process. This figure illustrates the percentage of documents filtered at each step. The grey bars represent the filtered documents. The statistics are largely consistent with prior work (e.g., RefinedWeb) across most steps, though we have incorporated some custom filtering steps.
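
The gap between local and global deduplication can be illustrated with a toy sketch. Exact hashing stands in here for the real exact and fuzzy matching, and the snapshot names and helper functions are hypothetical. The only point is that a global pass shares one "seen" set across snapshots, so documents repeated in later snapshots are removed as well.

```python
import hashlib


def doc_key(text):
    """Hash a document so that identical texts map to the same key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _keep_unseen(docs, seen):
    """Keep documents whose key is not in `seen`, updating `seen` in place."""
    kept = []
    for text in docs:
        key = doc_key(text)
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


def local_dedup(snapshots):
    """Deduplicate each snapshot independently (fresh `seen` set per snapshot)."""
    return {name: _keep_unseen(docs, set()) for name, docs in snapshots.items()}


def global_dedup(snapshots):
    """Deduplicate across all snapshots (one shared `seen` set)."""
    seen = set()
    return {name: _keep_unseen(docs, seen) for name, docs in snapshots.items()}


def removal_rate(before, after):
    """Fraction of documents removed by a step."""
    return 1 - after / before
```

With `{"snapshot_a": ["x", "y", "x"], "snapshot_b": ["x", "z"]}`, the local pass keeps four documents while the global pass keeps three, because the copy of `"x"` in the second snapshot is only caught globally.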

Document Preparation

Text Extraction: Common Crawl provides webpage text in two formats: WARC (Web ARChive format) and WET (WARC Encapsulated Text). WARC files contain the raw crawl data, storing the full HTTP response and request metadata. WET files contain plain text extracted by Common Crawl. In line with previous works, we found that WET files include boilerplate content such as navigation menus, ads, and other irrelevant text.

We read WARC files directly with the warcio library, rather than using WET files, and extract text using Trafilatura. Similar to RefinedWeb, we avoid using machine learning (ML)-based metrics for filtering documents to prevent bias introduced by ML models. Importantly, we apply global deduplication across the entire dataset, whereas previous works only use local deduplication. Note that although The Pile also employed global deduplication on its web data (Pile-CC), this accounted for just 0.6% of 74 snapshots.
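
As a rough sketch of the WARC reading step described above, the snippet below iterates over one WARC file with warcio and passes each HTML response to Trafilatura. The Content-Type check, the UTF-8 decoding with replacement, and the function names are our assumptions; the actual TxT360 code may handle encodings and record selection differently.

```python
from typing import Iterator, Optional, Tuple


def looks_like_html(content_type: Optional[str]) -> bool:
    """Return True for Content-Type values that indicate an HTML page."""
    if not content_type:
        return False
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in {"text/html", "application/xhtml+xml"}


def extract_documents(warc_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (url, extracted_text) pairs for HTML responses in one WARC file."""
    # Third-party imports are local so the helper above works without them.
    from warcio.archiveiterator import ArchiveIterator
    import trafilatura

    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            # Only HTTP responses carry page content.
            if record.rec_type != "response" or record.http_headers is None:
                continue
            if not looks_like_html(record.http_headers.get_header("Content-Type")):
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            # Assumption: decode as UTF-8 and replace undecodable bytes.
            html = record.content_stream().read().decode("utf-8", errors="replace")
            text = trafilatura.extract(html)  # returns None if nothing is extracted
            if text:
                yield url, text
```

Working from the raw HTML lets the extractor decide what counts as main content, which is how navigation menus and similar boilerplate present in WET text can be left out.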

Text Extraction Examples


Raw format

{
    "text": "Thai Culinary Experience\nEscorted Tours\nBrazil Three Jewels\nChina Vacations\nGolden Route of China\nEuropean Highlights\nHighlights of Israel\nItaly's Great Cities\nJapan Vacations\nJapan Sampler\nHeart of Japan\nInland Sea of Japan\nGems of Central Europe\nThe Best of Italy\nSplendors of Sicily\nThailand & Golden Triangle\nGems of Turkey\nHosted Tours\nAustralian Express\nChina's Classic Cities\nJapan (Tokyo & Kyoto)\nMalaysia Truly Asia\nRome, Florence & Venice\nSingapore & Bangkok\nSingapore & Tokyo\nSafari\nKenya Explorer\nTanzania\nCulinary\nLe Cordon Bleu\nHong Kong Culinary\nProvence, France\nSingapore Culinary\nFlavors of Singapore\nThai Culinary\nTuscany Classic\nTuscany Overture\n18-35 year olds\nEuropean Magic\nHoneymoon\nBora Bora & Tahiti\nCruises\nCity Stay\nBangkok\nBeijing & Shanghai\nLondon, Paris & Rome\nLondon & Paris\nParis-Bordeaux\nParis-Loire Valley\nUSA East\nUSA West\nRail\nRailpass\nRome, Florence & Venice\n18-35 year olds\nEuropean Magic\nTravel Guides\n| China Vacations | Yangtze Cruises | Escorted Tours | Main Index |\nThai Culinary Experience\nFloating Market Discover the exotic blend of Eastern and Western culinary influences harmoniously combined into something distinctively Thai.\n5 Night from $1,625*\nCulinary Holiday includes:--\nRound trip airfare from the West Coast and New York on United Airlines†\n5 nights at Shangri-La Hotel Bangkok\nDaily breakfast\nHalf day Royal Grand Palace Tour\nHalf day Cooking School at the Shangri-La with lunch\nEvening Dinner and show at the famous Sala Rim Nam riverside restaurant\nAll service charges and hotel taxes\nRound trip airport transfers by private car\nThai Cooking Class with a Twist: You will depart the hotel for Or-Tor-Kor Market. Here you will explore with a guided tour of this traditional Thai vegetable market. 
Upon returning to the hotel and entering the Salathip Restaurant you can relax and enjoy a fun, hands-on cooking class and a delicious Thai luncheon with desssert in souvenir apron to remind you of your very special cooking class.\n* Prices are per person, double occupancy for travel October 1 - 31, 2004. Prices are subject to change, availability, holiday/seasonal supplements, blackout dates and any restrictions that apply. Prices do not include international departure and immigration taxes of up to $103. Weekend surcharges apply.\n† Departure cities include: SFO, OAK, SJC, LAX, ONT, SNA, SEA, HNL, JFK, or EWR.\nCall (800) 990-3454 for reservation\n",
    "url": "http://1vacation.com/thaifood.html"
}

Extracted format

{
    "text": "Thai Culinary Experience\n|\nFloating Market\n|Discover the exotic blend of Eastern and Western culinary influences harmoniously combined into something distinctively Thai.\n|\n5 Night from $1,625*\n|Culinary Holiday includes:--\nThai Cooking Class with a Twist: You will depart the hotel for Or-Tor-Kor Market. Here you will explore with a guided tour of this traditional Thai vegetable market. Upon returning to the hotel and entering the Salathip Restaurant you can relax and enjoy a fun, hands-on cooking class and a delicious Thai luncheon with desssert in souvenir apron to remind you of your very special cooking class.\n|\n* Prices are per person, double occupancy for travel October 1 - 31, 2004. Prices are subject to change, availability, holiday/seasonal supplements, blackout dates and any restrictions that apply. Prices do not include international departure and immigration taxes of up to $103. Weekend surcharges apply.\n† Departure cities include: SFO, OAK, SJC, LAX, ONT, SNA, SEA, HNL, JFK, or EWR.\nCall for reservation",
    "url": "http://1vacation.com/thaifood.html"
}

Language Identification: After text extraction, non-English texts are filtered out using the fastText language identifier with a threshold of 0.65. This step removes over 60% of the data.
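
The thresholding logic can be sketched as follows. To keep the example self-contained, the language identifier is passed in as a function that returns a `(label, confidence)` pair. In a real pipeline this would wrap a fastText language-identification model, whereas `toy_predict` below is a crude stand-in and not a real classifier. Keeping documents whose score is greater than or equal to 0.65 (rather than strictly greater) is also an assumption.

```python
from typing import Callable, Iterable, Iterator, Tuple

# A predictor maps text to (language label, confidence in [0, 1]).
Predictor = Callable[[str], Tuple[str, float]]


def keep_english(
    docs: Iterable[str], predict: Predictor, threshold: float = 0.65
) -> Iterator[str]:
    """Yield documents predicted as English with confidence >= threshold."""
    for text in docs:
        label, confidence = predict(text)
        if label == "en" and confidence >= threshold:
            yield text


def toy_predict(text: str) -> Tuple[str, float]:
    """Stand-in predictor: uses the share of ASCII letters as 'confidence'."""
    if not text:
        return ("unknown", 0.0)
    ascii_letters = sum(ch.isascii() and ch.isalpha() for ch in text)
    return ("en", ascii_letters / len(text))
```

For example, `list(keep_english(["Hello world", "联系目录 Phone"], toy_predict))` keeps only the first string, since the second scores 0.5 under the toy predictor. Mixed-language pages like the sample below are the kind of document a threshold of this sort is meant to catch.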

Non-English Document Examples

Sample documents that are classified as non-English


{
    "text": "联系目录\n|First Name\n|Last Name\n|Phone\n|Department\n|Office\n|Joshua\n|Myer\n|(电子邮件保护)\n|918-343-7650\n|学生的成功\n|马卡姆大厅，244室\n|Meghan\n|Dodson\n|(电子邮件保护)\n|918-343-7525\n|咨询服务\n|Dr. 卡罗琳·泰勒中心，103A室\n|Justin\n|Noble\n|(电子邮件保护)\n|918-343-7989\n|Athletics\n|Bushyhead Fieldhouse, 201A室\n|Rebecca\n|Krouse\n|(电子邮件保护)\n|918-343-7511\n|学生的成功\n|马卡姆大厅，242室\n|James\n|Epperson\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Cameron\n|Margaris\n|(电子邮件保护)\n|918-343-6815\n|Athletics\n|棒球/垒球复杂\n|Taylor\n|Good\n|(电子邮件保护)\n|918-343-6866\n|问责与学术\n|Tai\n|Blevins\n|(电子邮件保护)\n|918-343-7540\n|Admissions\n|Markham Hall\n|Nena\n|Roberts\n|(电子邮件保护)\n|918-825-6117\n|Pryor Campus\n|Pryor Campus\n|Tammy\n|Space\n|(电子邮件保护)\n|918-825-6014\n|Pryor Campus\n|Pryor Campus\n|Moriah\n|Rake\n|(电子邮件保护)\n|918-343-7893\n|Athletics\n|Bushyhead Fieldhouse, 102A室\n|Pepper\n|Ilyushenko\n|(电子邮件保护)\n|918-343-8391\n|Administration\n|迈耶大厅，117房间\n|Teilor\n|Hubbard\n|(电子邮件保护)\n|918-343-7757\n|Athletics\n|布什黑德运动场，205室\n|Regina\n|Terherst\n|(电子邮件保护)\n|918-343-6889\n|OMA Alumni\n|迈耶大厅，200房间\n|Joshua\n|Boyles\n|(电子邮件保护)\n|918-343-6826\n|TRiO Program\n|预备大厅，109A室\n|Nathan\n|Davenport\n|(电子邮件保护)\n|918-343-7509\n|Financial Aid\n|马卡姆大厅，122室\n|Amanda\n|Calhoun\n|(电子邮件保护)\n|918-343-7980\n|Admissions\n|马卡姆大厅，119室\n|Abhilash\n|Minukuri\n|(电子邮件保护)\n|918-343-7529\n|Technology & 司法研究\n|赫灵顿大厅，255室\n|Emily\n|Mealin\n|(电子邮件保护)\n|918-343-7645\n|护理与卫生专业学院\n|健康科学，161A室\n|Clay\n|McIntosh\n|(电子邮件保护)\n|918-343-7744\n|Fine Arts\n|贝尔德大厅，217F室\n|Dr. Nitindra\n|Pavuluri\n|(电子邮件保护)\n|918-343-7653\n|Technology & 司法研究\n|赫灵顿厅163室\n|Logan\n|Blunt\n|(电子邮件保护)\n|918-343-7565\n|Admissions\n|马卡姆大厅，118室\n|Fred\n|Dietz\n|(电子邮件保护)\n|918-343-7761\n|Admissions\n|马卡姆大厅，112室\n|Dr. 亚历山德拉(Ali)\n|Teel\n|(电子邮件保护)\n|918-343-7636\n|健康科学\n|健康科学，116室\n|Charlsie\n|Smith\n|(电子邮件保护)\n|918-343-7725\n|健康科学\n|健康科学，118室\n|Cody\n|Johnson\n|(电子邮件保护)\n|918-343-6864\n|Athletics\n|Bushyhead Fieldhouse, B005室\n|Dr. 
Donna\n|Sharp\n|(电子邮件保护)\n|918-343-7568\n|Psychology & Sociology\n|预备大厅203J室\n|Heather\n|Davis\n|(电子邮件保护)\n|918-343-7883\n|Athletics\n|Bushyhead Fieldhouse, 207室\n|Cameron\n|Kujath\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Raenell\n|Wilson\n|(电子邮件保护)\n|918-343-6803\n|Admissions\n|马卡姆大厅，118室\n|Jamen\n|Helton\n|(电子邮件保护)\n|918-343-7288\n|Athletics\n|足球场，104D\n|Kira\n|Wicker\n|(电子邮件保护)\n|918-343-5201\n|Athletics\n|足球场，104J\n|Benjamin\n|Donals\n|(电子邮件保护)\n|918-343-7549\n|学生成功bb0问责制和学术\n|马卡姆大厅，238室\n|Jana\n|Tresher\n|(电子邮件保护)\n|918-338-8012\n|RSU巴特尔斯维尔\n|巴特尔斯维尔校区\n|Ashlyn\n|Brown\n|(电子邮件保护)\n|918-343-6827\n|TRIO Programs\n|预备大厅，109C室\n|Dr. Connie\n|Nelson\n|(电子邮件保护)\n|918-343-7620\n|健康科学\n|健康科学，161C室\n|Dr. Cassandra\n|Barrow\n|(电子邮件保护)\n|918-343-7597\n|健康科学\n|健康科学，161B室\n|Jimmy\n|Boone\n|(电子邮件保护)\n|918-343-7678\n|Athletics\n|Bushyhead Fieldhouse, B005室\n|Tiffany\n|Kelly\n|(电子邮件保护)\n|918-343-7727\n|Admissions\n|马卡姆大厅，116室\n|Jennifer\n|Wilson\n|(电子邮件保护)\n|918-343-7556\n|Admissions\n|马卡姆大厅，212室\n|Mitchell\n|Hubbard\n|(电子邮件保护)\n|918-343-7547\n|TRIO\n|预备大厅\n|Abigail\n|Clark\n|(电子邮件保护)\n|918-343-7573\n|Financial Aid\n|马卡姆大厅，128室\n|Michael\n|Davis\n|(电子邮件保护)\n|918-343-7534\n|行政计算服务\n|行政服务中心\n|Tanner\n|Tatro\n|(电子邮件保护)\n|918-343-7558\n|Bursar\n|马卡姆大厅，204室\n|Brian\n|Voris\n|(电子邮件保护)\n|918-343-7972\n|RSUTV\n|马卡姆大厅，144E室\n|John\n|Carle\n|(电子邮件保护)\n|918-343-6828\n|Accessibility & 残疾的资源\n|Dr. 卡洛琳·泰勒中心201J室\n|Jamie\n|Frederick\n|(电子邮件保护)\n|918-343-7612\n|总统办公室\n|迈耶大厅，106室\n|Hawken\n|Grubbs\n|(电子邮件保护)\n|918-343-7513\n|Technology & 司法研究\n|赫灵顿大厅，114室\n|Hayden\n|Purdum\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，212室\n|Dr. 
Tetyana\n|Kyrylova\n|(电子邮件保护)\n|918-343-7507\n|Technology & 司法研究\n|赫灵顿大厅，116室\n|Natalia\n|Sumner\n|(电子邮件保护)\n|918-343-7751\n|Admissions\n|马卡姆大厅108室\n|Kenny\n|Day\n|(电子邮件保护)\n|918-343-7771\n|通信 & Marketing\n|赫灵顿大厅，205B室\n|Tracey\n|Wallis\n|(电子邮件保护)\n|918-343-7823\n|三人组/人才搜寻\n|预备大厅209A室\n|Sam\n|McCombs\n|(电子邮件保护)\n|918-343-7849\n|RSUTV\n|马卡姆大厅，136室\n|Gerald\n|Bender\n|Technology & 司法研究\n|Herrington大厅\n|Tom\n|Fink\n|(电子邮件保护)\n|918-343-7570\n|通信 & Marketing\n|赫灵顿大厅，206室\n|Morgan\n|Anderson\n|(电子邮件保护)\n|918-343-7555\n|Financial Aid\n|马卡姆大厅126室\n|Sam\n|McCombs\n|RSU Public TV\n|Markham Hall\n|Teri\n|Bowers\n|(电子邮件保护)\n|918-343-7657\n|RSU Public TV\n|马卡姆大厅，226室\n|Rebekah\n|Warren\n|(电子邮件保护)\n|918-343-7857\n|English & Humanities\n|贝尔德大厅，215B室\n|Jeanice\n|Davis\n|(电子邮件保护)\n|918-343-6810\n|English & Humanities\n|Baird Hall\n|Mary\n|Utsler\n|(电子邮件保护)\n|918-343-7648\n|健康科学\n|健康科学大楼123室\n|RSU Libraries\n|918-343-7770\n|斯特拉顿泰勒图书馆\n|Dr. Kathy\n|Hoppe\n|(电子邮件保护)\n|918-343-7703\n|Psychology & Sociology\n|贝尔德大厅，202室\n|Laura\n|Paisley\n|(电子邮件保护)\n|918-343-6881\n|行政服务，邮件室\n|潘兴大厅，103室\n|Barry\n|Clark\n|(电子邮件保护)\n|918-343-6835\n|Admissions\n|(电子邮件保护)\n|Benjamin\n|Wilson\n|(电子邮件保护)\n|918-343-6837\n|Financial Aid\n|马卡姆大厅，121室\n|Julie\n|Fleetwood\n|(电子邮件保护)\n|918-343-7525\n|咨询服务\n|Dr. 卡罗琳泰勒中心\n|Sarah\n|Garrison\n|(电子邮件保护)\n|918-343-7581\n|Admissions\n|马卡姆大厅208室\n|Brianna\n|Stimson\n|(电子邮件保护)\n|918-343-7552\n|Registrar\n|马卡姆大厅，249室\n|Tosha\n|Hayes\n|(电子邮件保护)\n|918-343-7599\n|学生事务\n|Dr. 
卡洛琳·泰勒中心201C室\n|College of\n|专业的研究\n|918-343-7608\n|专业进修学院\n|赫灵顿大厅，110室\n|餐饮服务\n|918-343-7843\n|餐饮服务\n|Matthew\n|Howard\n|(电子邮件保护)\n|918-343-7575\n|TRIO Programs\n|预备大厅\n|Nicholas\n|Dobbs\n|Campus Police\n|Pryor Campus\n|Robert \"Bob\"\n|Bates\n|918-343-7624\n|Campus Police\n|Campus Police\n|Shaylene\n|Chatham\n|(电子邮件保护)\n|918-343-7710\n|健康科学\n|健康科学160A室\n|Chris\n|Jones\n|(电子邮件保护)\n|918-343-7996\n|Athletics\n|足球场，104A室\n|Kimberly\n|Moody\n|(电子邮件保护)\n|918-343-7520\n|专业进修学院\n|健康科学，110A室\n|Austin\n|Clinton\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，212室\n|Olivia\n|Woody\n|(电子邮件保护)\n|918-343-8357\n|Development & Foundation\n|基金会校友中心\n|Esports\n|(电子邮件保护)\n|918-343-7970\n|学生事务\n|赫灵顿厅，149室\n|Dr. Tom\n|Carment\n|(电子邮件保护)\n|918-343-7619\n|Business\n|赫灵顿大厅，166室\n|Julia\n|Poole\n|(电子邮件保护)\n|918-343-7615\n|学术事务\n|Meyer Hall\n|Dorothy\n|Mullis\n|(电子邮件保护)\n|918-343-7702\n|TRiO\n|预备大厅，104室\n|Dr. Emily\n|Dial-Driver\n|(电子邮件保护)\n|918-343-7747\n|English & Humanities\n|贝尔德大厅215J室\n|Lisa\n|Rogers\n|(电子邮件保护)\n|918-343-7770\n|Library\n|图书馆，200室\n|Rance\n|Kingfisher\n|(电子邮件保护)\n|918-343-6867\n|Biology\n|洛什堡楼，103室\n|Jake\n|Hudspeth\n|(电子邮件保护)\n|918-343-7839\n|Athletics\n|Bushyhead Fieldhouse, 201A室\n|Dr. Sara K.\n|Moon-Seo\n|(电子邮件保护)\n|918-343-7813\n|Psychology & Sociology\n|预备大厅203J室\n|Barbara\n|Evans\n|(电子邮件保护)\n|918-343-7554\n|Financial Aid\n|马卡姆大厅，128室\n|Dr. Hannah\n|King\n|(电子邮件保护)\n|918-343-7697\n|Biology\n|洛什堡楼101室\n|Troy\n|Gerard\n|(电子邮件保护)\n|918-343-7519\n|专业研究学院\n|赫灵顿大厅，229B室\n|Dr. Chris\n|Shelton\n|(电子邮件保护)\n|918-343-7814\n|Mathematics & 物理科学\n|洛什堡楼，105室\n|Hannah\n|Xiong\n|(电子邮件保护)\n|918-343-7538\n|行政计算服务\n|行政服务中心126室\n|Bridgette\n|Nichols\n|(电子邮件保护)\n|918-343-7882\n|RSUTV\n|马卡姆大厅，132室\n|Brett\n|Rowh\n|(电子邮件保护)\n|918-825-6021\n|Pryor Campus\n|Pryor校区，113B室\n|Jason\n|Reavis\n|(电子邮件保护)\n|918-343-7542\n|Financial Aid\n|马卡姆大厅，130室\n|Bradley\n|Hammond\n|(电子邮件保护)\n|918-343-7852\n|Registrar\n|马卡姆大厅，236室\n|Dr. 
Michelle\n|Taylor\n|(电子邮件保护)\n|918-343-7835\n|Psychology & Sociology\n|贝尔德大厅，202室\n|Tonya\n|Ballone\n|(电子邮件保护)\n|918-343-7633\n|护理与卫生专业学院\n|健康科学，121室\n|Samantha\n|Rhea\n|(电子邮件保护)\n|918-343-7561\n|健康科学\n|健康科学，118室\n|Holden\n|Craig\n|(电子邮件保护)\n|918-343-7970\n|学生事务\n|赫灵顿厅，149室\n|咨询服务\n|(电子邮件保护)\n|918-343-7845\n|咨询服务\n|Dr. 卡罗琳·泰勒中心，103室\n|Sheila\n|Parker\n|(电子邮件保护)\n|918-343-7741\n|Biology | Math & 物理科学\n|斯特拉顿泰勒图书馆，101室\n|Noel\n|Cleveland\n|(电子邮件保护)\n|918-343-8345\n|学生的成功\n|马卡姆大厅，239室\n|Jeana Rae\n|Conn\n|(电子邮件保护)\n|918-343-7707\n|学生事务\n|Dr. 卡罗琳·泰勒中心，201G室\n|Katie\n|Anderson\n|(电子邮件保护)\n|918-343-7755\n|学生事务\n|Dr. 卡罗琳·泰勒中心，201K室\n|Laci\n|Henegar\n|(电子邮件保护)\n|918-343-5204\n|学生的成功\n|马卡姆大厅，246室\n|Dr. Amy\n|Evans\n|(电子邮件保护)\n|918-343-7953\n|Business\n|赫灵顿大厅，230B室\n|Donna\n|Asauskas\n|(电子邮件保护)\n|918-343-7730\n|测试中心\n|马卡姆大厅，223室\n|Rebekah\n|Inman\n|(电子邮件保护)\n|918-343-7812\n|健康科学\n|健康科学，123室\n|Michelle\n|Bussell\n|(电子邮件保护)\n|918-343-7826\n|健康科学\n|健康科学，215室\n|Sheryl\n|Klenovich\n|(电子邮件保护)\n|918-343-7643\n|健康科学\n|健康科学160B室\n|Tracy\n|Thrun\n|(电子邮件保护)\n|918-343-7740\n|美术/传播\n|贝尔德大厅217E室\n|Lisa\n|Bailey\n|(电子邮件保护)\n|918-343-7546\n|Admissions\n|万锦堂大堂\n|School of\n|Nursing & 健康的职业\n|918-343-7631\n|Nursing & 健康的职业\n|健康科学，106室\n|Jason\n|Czapansky\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|杰瑞·布朗森”\n|Smith\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Ronnie\n|Roden\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Rick\n|Parsley\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Mark\n|Cleveland\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|查尔斯“大卫”\n|Sandusky\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Michael\n|Osborn\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Tommy\n|Terneus\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Steve\n|Massey\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Joe\n|Batt\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus 
Police\n|Hayden\n|Bozarth\n|(电子邮件保护)\n|918-343-6865\n|English & Humanities\n|贝尔德大厅，207室\n|Travis\n|Peck\n|(电子邮件保护)\n|918-343-6816\n|RSU Alumni\n|基金会校友中心\n|Jordan\n|Brown\n|(电子邮件保护)\n|918-343-7548\n|Admissions\n|马卡姆大厅，114室\n|Michael\n|Allgood\n|(电子邮件保护)\n|918-343-7860\n|Budget & Accounting\n|行政服务中心103室\n|职业服务\n|(电子邮件保护)\n|918-343-6835\n|职业服务\n|Dr. 卡洛琳·泰勒中心2楼\n|Dr. Mark\n|Rasor\n|(电子邮件保护)\n|918-343-7836\n|Administration\n|迈耶大厅，121室\n|Matt\n|Kennedy\n|(电子邮件保护)\n|918-343-7689\n|Athletics\n|足球场，104N室\n|Steve\n|Valencia\n|(电子邮件保护)\n|918-343-7780\n|Development\n|基金会校友中心\n|Office of\n|第一年和转学经验\n|918-343-7583\n|学术事务\n|Kenneth\n|Neal\n|(电子邮件保护)\n|918-343-7647\n|RSU Public TV\n|马卡姆大厅，144B室\n|Savannah\n|Hayman\n|(电子邮件保护)\n|918-338-8016\n|巴特尔斯维尔校区\n|巴特尔斯维尔校区\n|Dr. Tom\n|Gerard\n|(电子邮件保护)\n|918-343-6805\n|Business\n|赫灵顿大厅，229A室\n|Amber\n|Sanchez\n|(电子邮件保护)\n|918-343-7644\n|健康科学\n|健康科学，210室\n|Bruce\n|Richardson\n|(电子邮件保护)\n|918-343-8390\n|Business\n|赫灵顿大厅，216室\n|Randall\n|Keirsey\n|(电子邮件保护)\n|918-343-7820\n|物理设施\n|物理设施\n|Dr. Mark\n|Peaden\n|(电子邮件保护)\n|918-343-7701\n|Biology\n|洛什堡楼，202室\n|Debbi\n|Oestmann\n|(电子邮件保护)\n|918-343-7668\n|Accountability & Academics\n|梅耶大厅，109室\n|Ethan\n|Williams\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，210室\n|Brian\n|Coley\n|(电子邮件保护)\n|918-343-7585\n|健康科学\n|健康科学，140室\n|Dr. Carla\n|Lynch\n|(电子邮件保护)\n|918-343-7735\n|健康科学\n|健康科学，108室\n|Erika\n|Nebergall\n|(电子邮件保护)\n|918-343-7731\n|Bursar\n|马卡姆大厅，204室\n|CPT Riannon\n|Frebel\n|(电子邮件保护)\n|918-343-7829\n|RSU金牌计划\n|学生退伍军人中心\n|1LT Logan\n|Gear\n|(电子邮件保护)\n|918-645-5433\n|RSU金牌计划\n|学生退伍军人中心\n|Dr. Jaeman\n|Son\n|(电子邮件保护)\n|918-343-7666\n|Business\n|赫灵顿厅211室\n|Kyle\n|Bent\n|(电子邮件保护)\n|918-343-7883\n|Athletics\n|Bushyhead Fieldhouse, 102C房间\n|Sarah\n|Adcock\n|(电子邮件保护)\n|918-343-7769\n|学生事务\n|预备大厅，107室\n|Dr. Cheyanne\n|Olson\n|(电子邮件保护)\n|918-343-7766\n|Biology\n|Loshbaugh大厅，208房间\n|Dr. 
David\n|Bath\n|(电子邮件保护)\n|918-343-6820\n|History & 政治科学\n|贝尔德大厅219J室\n|Anita\n|Horton\n|(电子邮件保护)\n|918-343-7560\n|Bursar\n|马卡姆大厅，202室\n|Bobbi\n|Gill\n|(电子邮件保护)\n|918-343-7590\n|健康科学\n|健康科学，109室\n|Angela\n|Coats\n|(电子邮件保护)\n|918-343-7649\n|RSUTV\n|马卡姆大厅，136室\n|Kendall\n|Ragsdale\n|(电子邮件保护)\n|918-343-7631\n|健康科学\n|健康科学，106室\n|Dawn\n|Childress\n|(电子邮件保护)\n|918-343-7862\n|物理设施\n|大学村A, 319室\n|Jacob\n|Skidmore\n|(电子邮件保护)\n|918-343-8394\n|餐饮服务\n|Dr. 卡罗琳·泰勒中心，114E室\n|Luz\n|Garrett\n|(电子邮件保护)\n|918-343-6810\n|English & Humanities\n|贝尔德大厅，215E室\n|Jamil\n|Haynes\n|(电子邮件保护)\n|918-343-7728\n|人力资源\n|行政服务中心122室\n|Cecily\n|Tubbs\n|(电子邮件保护)\n|918-343-7719\n|Library\n|斯特拉顿泰勒图书馆，306室\n|Susan\n|Hammons\n|(电子邮件保护)\n|918-343-6868\n|Dining & Food Services\n|Dr. 卡罗琳·泰勒中心，114A室\n|Chris\n|Fairchild\n|(电子邮件保护)\n|物理设施\n|巴特尔斯维尔测试\n|Center\n|(电子邮件保护)\n|918-338-8000\n|巴特尔斯维尔校区\n|1楼102套房\n|Claremore测试\n|Center\n|(电子邮件保护)\n|918-343-7730\n|Claremore校园\n|马卡姆大厅，223室\n|Pryor Testing\n|Center\n|(电子邮件保护)\n|918-825-6117\n|Pryor Campus\n|Pryor招生办公室\n|Rebekah\n|Chamberlain\n|(电子邮件保护)\n|918-343-7683\n|Psychology & Sociology\n|预备大厅，202室\n|Alaina\n|Sprague\n|(电子邮件保护)\n|918-343-5202\n|Registrar\n|马卡姆大厅，234室\n|Kristina\n|Long\n|(电子邮件保护)\n|918-343-7696\n|Budget & Accounting\n|行政服务中心，106室\n|Heather\n|Andris\n|(电子邮件保护)\n|918-343-7506\n|Budget & Accounting\n|行政服务中心102室\n|Stephen\n|Brown\n|(电子邮件保护)\n|918-343-7988\n|Athletics\n|布什黑德运动场200C室\n|Dan\n|Frick\n|(电子邮件保护)\n|918-343-7524\n|Technology & 司法研究\n|赫灵顿大厅，124室\n|Dr. Joshua\n|Ang\n|(电子邮件保护)\n|918-343-7810\n|Business\n|赫灵顿厅，250室\n|Accessibility & 残疾的资源\n|(电子邮件保护)\n|918-343-6828\n|Accessibility & 残疾的资源\n|Dr. 
卡洛琳·泰勒中心2楼\n|Bruce\n|Hartley\n|(电子邮件保护)\n|918-343-7742\n|通信\n|贝尔德大厅，221E室\n|Firelei\n|Edmonds\n|(电子邮件保护)\n|918-343-8341\n|Admissions\n|马卡姆大厅，122室\n|Ashlee\n|Pitts\n|(电子邮件保护)\n|918-343-7881\n|Athletics\n|Bushyhead Fieldhouse, 203室\n|Junmo\n|Sung\n|(电子邮件保护)\n|918-343-7526\n|业务部\n|赫灵顿大厅，209室\n|Whitney\n|Hocutt\n|(电子邮件保护)\n|918-343-7894\n|运动/发展 & Foundation\n|Bushyhead Fieldhouse\n|司法行政 & Studies\n|918-343-7608\n|科技与司法研究\n|赫灵顿厅108号\n|Robert\n|Bass\n|(电子邮件保护)\n|918-343-6834\n|物理设施\n|物理设施\n|Jayne\n|McLoughlin\n|(电子邮件保护)\n|918-343-7971\n|Cameron\n|预备大厅，204室\n|Office of\n|Technology & 司法研究\n|918-343-7608\n|Main Contact\n|赫灵顿大厅，110室\n|Office of\n|学生健康中心\n|(电子邮件保护)\n|918-343-7614\n|学生健康中心\n|健康科学，154B室\n|Office of\n|公共关系\n|918-343-7771\n|Main Contact\n|赫灵顿大厅，206室\n|Office of\n|Psychology & Sociology\n|918-343-7683\n|Main Contact\n|预备大厅，202室\n|Office of\n|物理设备\n|(电子邮件保护)\n|918-343-7818\n|Main Contact\n|物理设施\n|Office of\n|Mathematics & 物理科学\n|918-343-6812\n|Main Contact\n|Office of\n|History & 政治科学\n|918-343-6811\n|Main Contact\n|贝尔德大厅219J室\n|Office of\n|Fine Arts\n|918-343-7740\n|Main Contact\n|贝尔德大厅，217室\n|Office of\n|登记管理\n|918-343-7552\n|Main Contact\n|马卡姆大厅，232室\n|Office of\n|English & Humanities\n|918-343-6810\n|Main Contact\n|贝尔德大厅，215E室\n|Office of\n|Development & Foundation\n|918-343-8357\n|Main Contact\n|基金会校友中心，106室\n|Office of\n|通信\n|918-343-6825\n|Main Contact\n|贝尔德大厅221室\n|业务部\n|918-343-7608\n|业务部\n|赫灵顿大厅，110室\n|Office of\n|OMA Alumni\n|918-343-6889\n|OMA Alumni\n|迈耶大厅，200房间\n|Office of\n|RSU Alumni\n|918-343-6816\n|RSU Alumni\n|基金会校友中心，104B室\n|Office of\n|Academics & Accountability\n|918-343-6866\n|Main Contact\n|迈耶大厅，114室\n|Kelly\n|Williams\n|(电子邮件保护)\n|918-343-7831\n|Bursar\n|马卡姆大厅，204室\n|Dr. 
Chris\n|Ratcliff\n|(电子邮件保护)\n|918-343-7984\n|Athletics\n|布什黑德运动场，202室\n|Kelly\n|Holmes\n|(电子邮件保护)\n|918-343-7853\n|TRiO EOC\n|预备大厅，102C室\n|Malori\n|Moss\n|(电子邮件保护)\n|918-343-6813\n|Athletics\n|棒垒球大楼，202室\n|Kim\n|Bagwell\n|(电子邮件保护)\n|918-343-7782\n|Athletics\n|足球Fieldhouse\n|Steven\n|Rosser\n|(电子邮件保护)\n|918-343-7686\n|Fine Arts\n|贝尔德大厅，217室\n|Rob\n|Turner\n|(电子邮件保护)\n|918-343-7739\n|Technology & 司法研究\n|赫灵顿大厅，214室\n|Diana\n|Trammell\n|(电子邮件保护)\n|TRiO Programs\n|Off Campus\n|Ram\n|Adhikari\n|(电子邮件保护)\n|918-343-7516\n|Mathematics & 物理科学\n|Loshbaugh大厅，100房间\n|Terry Sue\n|Laymon-Barnett\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Bryan\n|Weygand\n|(电子邮件保护)\n|918-343-7995\n|Athletics\n|Robert (Mike)\n|Brower\n|(电子邮件保护)\n|918-343-7651\n|RSUTV\n|马卡姆大厅，134室\n|Office of\n|人力资源\n|(电子邮件保护)\n|918-343-7796\n|Main Contact\n|行政服务中心，120室\n|Bridget\n|Williams\n|(电子邮件保护)\n|918-343-7843\n|餐饮服务/索迪斯\n|Dr. 卡罗琳·泰勒中心，114C室\n|Stan\n|Peters\n|(电子邮件保护)\n|918-343-7624\n|Campus Police\n|Campus Police\n|Don\n|Arent\n|(电子邮件保护)\n|918-825-6034\n|Campus Police\n|巴特尔斯维尔校区\n|Dr. Brook\n|Purdum\n|(电子邮件保护)\n|918-343-7805\n|Business\n|赫灵顿大厅，230A室\n|Dr. Lori\n|O'Malley\n|(电子邮件保护)\n|918-343-7652\n|Psychology & Sociology\n|预备大厅203J室\n|Daniel\n|Wells\n|(电子邮件保护)\n|918-343-7844\n|Food Services\n|Dr. 卡罗琳·泰勒中心，114B室\n|Kaitlin\n|Crotty\n|(电子邮件保护)\n|918-343-7717\n|Library\n|斯特拉顿·泰勒图书馆，204楼\n|Jeffrey\n|Paden\n|(电子邮件保护)\n|918-343-6808\n|Athletics\n|布什黑德运动场，111室\n|Kevin\n|Shoemaker\n|(电子邮件保护)\n|918-343-7658\n|RSUTV\n|马卡姆大厅，132室\n|Daniel\n|Murphree\n|(电子邮件保护)\n|918-343-7655\n|RSUTV\n|马卡姆大厅，144D室\n|RSU\n|Bookstore\n|(电子邮件保护)\n|918-343-7847\n|Bookstore\n|Bookstore\n|Pryor\n|Campus Police\n|(电子邮件保护)\n|918-825-6034\n|Main Contact\n|普赖尔校园104室\n|Bartlesville\n|Campus Police\n|(电子邮件保护)\n|918-338-8020\n|Main Contact\n|招生处对面的一楼\n|James\n|Maltby\n|(电子邮件保护)\n|918-343-6809\n|RSUTV\n|马卡姆大厅，B08A室\n|Dr. 
Sonya\n|Munsell\n|(电子邮件保护)\n|918-343-7688\n|Psychology & Sociology\n|预备大厅203K室\n|Chris\n|Ruhl\n|(电子邮件保护)\n|918-343-7815\n|Technology & 司法研究\n|赫灵顿大厅，119室\n|Office of\n|学生事务\n|(电子邮件保护)\n|918-343-7579\n|Main Contact\n|Dr. 卡罗琳·泰勒中心201室\n|Writing\n|Center\n|(电子邮件保护)\n|918-343-7838\n|English & Humanities\n|贝尔德大厅，206室\n|Karl\n|Reynolds\n|(电子邮件保护)\n|918-343-7819\n|物理设施\n|物理厂房，102室\n|David\n|Brixey\n|(电子邮件保护)\n|918-343-7887\n|住宅生活\n|大学村会所，C107\n|Dr. Masoud\n|Saffarian\n|(电子邮件保护)\n|918-343-7969\n|Business\n|赫灵顿大厅，248室\n|Help\n|Desk\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，210室\n|Office of\n|Admissions\n|(电子邮件保护)\n|918-343-7546\n|Main Contact\n|马卡姆大厅，100房间\n|Office of\n|Mail Room & Print Shop\n|(电子邮件保护)\n|918-343-7788\n|Main Contact\n|Pershing Hall\n|Office of\n|学术事务\n|918-343-7615\n|Main Contact\n|迈耶大厅，110室\n|Dr. Susan\n|Willis\n|(电子邮件保护)\n|918-343-6802\n|文学院 & Sciences and 专业进修学院\n|赫灵顿大厅，109室\n|Dr. Keith\n|Martin\n|(电子邮件保护)\n|918-343-7706\n|Biology\n|Loshbaugh大厅，100房间\n|Eileen\n|Richardson\n|(电子邮件保护)\n|918-343-6885\n|卡梅隆大学\n|预备大厅，204D室\n|Dr. Frank\n|Elwell\n|(电子邮件保护)\n|918-343-7851\n|Psychology & Sociology\n|预备大厅，203D室\n|Office of\n|卡梅隆大学\n|918-343-7971\n|Main Contact\n|预备大厅，204室\n|Ronna\n|Hatley\n|(电子邮件保护)\n|918-343-6819\n|专业研究学院\n|赫灵顿大厅，107室\n|Kyla\n|Short\n|(电子邮件保护)\n|918-343-7792\n|住宅生活\n|大学村B俱乐部会所\n|Office of\n|住宅生活\n|(电子邮件保护)\n|918-343-7789\n|Main Contact\n|大学村B俱乐部会所\n|Office\n|书记官长\n|(电子邮件保护)\n|918-343-7552\n|Main Contact\n|马卡姆大厅，249室\n|Veronica\n|Rackley\n|(电子邮件保护)\n|918-343-7726\n|Registrar\n|马卡姆大厅，228室\n|James \"Randy\"\n|Riggs\n|(电子邮件保护)\n|918-343-7754\n|公共关系\n|赫灵顿大厅，206D室\n|Kelli\n|Fields\n|(电子邮件保护)\n|918-343-7994\n|公共关系\n|赫灵顿大厅，205C室\n|Christi\n|Mackey\n|(电子邮件保护)\n|918-343-7752\n|Psychology & Sociology\n|预备大厅，203C室\n|Dr. 
Brian\n|Andrews\n|(电子邮件保护)\n|918-343-7684\n|Psychology & Sociology\n|预备大厅，203B室\n|Lisa\n|Ramsey\n|(电子邮件保护)\n|918-825-6003\n|Pryor Campus\n|Pryor Campus\n|Pryor\n|Campus\n|918-825-6117\n|Pryor Campus\n|Pryor Campus\n|Dana\n|Best\n|(电子邮件保护)\n|918-343-7663\n|专业进修学院\n|赫灵顿大厅，105B室\n|George\n|Proctor\n|(电子邮件保护)\n|918-343-8351\n|物理设施\n|物理厂房，116室\n|Karyn\n|Krause\n|(电子邮件保护)\n|918-343-7818\n|物理设施\n|物理设施\n|Paul\n|Dunham\n|(电子邮件保护)\n|918-343-7708\n|物理设施\n|物理厂房103\n|Tammy\n|Ryan\n|(电子邮件保护)\n|918-343-6889\n|OMA Alumni\n|迈耶大厅，200房间\n|Dr. Danette\n|Boyle\n|(电子邮件保护)\n|918-343-6888\n|OMA Alumni\n|迈耶大厅，200房间\n|Marcy\n|Cox\n|918-338-8029\n|俄克拉荷马州早期定居计划\n|巴特尔斯维尔校区2楼\n|Dr. Kirk\n|Voska\n|(电子邮件保护)\n|918-343-7762\n|Mathematics & 物理科学\n|斯特拉顿泰勒图书馆，110室\n|Dr. Sukhitha\n|Vidurupola\n|(电子邮件保护)\n|918-343-7961\n|Mathematics & 物理科学\n|斯特拉顿·泰勒图书馆，107室\n|Dr. Min\n|Soe\n|(电子邮件保护)\n|918-343-7693\n|Mathematics & 物理科学\n|斯特拉顿泰勒图书馆，114室\n|Dr. Kasia\n|Roberts\n|(电子邮件保护)\n|918-343-7638\n|Mathematics & 物理科学\n|洛什堡楼101室\n|Kelly\n|Ewing\n|(电子邮件保护)\n|918-343-7716\n|Library\n|斯特拉顿泰勒图书馆，200室\n|Library\n|借还书处\n|918-343-7716\n|Main Contact\n|斯特拉顿泰勒图书馆200\n|Audrey\n|Baker\n|(电子邮件保护)\n|918-343-7721\n|Library\n|斯特拉顿泰勒图书馆\n|Catherine\n|Heimdale\n|(电子邮件保护)\n|918-343-6811\n|History & 政治科学\n|贝尔德大厅219B室\n|Bryan\n|Crain\n|(电子邮件保护)\n|918-343-7646\n|RSUTV\n|马卡姆大厅，144C室\n|Office of\n|RSU TV\n|918-343-7649\n|Main Contact\n|Markham Hall\n|Suzanne\n|Perry\n|(电子邮件保护)\n|918-343-7886\n|人力资源\n|行政服务中心，124室\n|Dr. James\n|Ford\n|(电子邮件保护)\n|918-343-7749\n|English & Humanities\n|贝尔德大厅，101B室\n|Dr. Sigismond\n|Wilson\n|(电子邮件保护)\n|918-343-7800\n|History & 政治科学\n|贝尔德大厅219K室\n|Dr. Quentin\n|Taylor\n|(电子邮件保护)\n|918-343-7667\n|History & 政治科学\n|贝尔德大厅219H室\n|Dr. Carolyn\n|Taylor\n|(电子邮件保护)\n|918-343-7627\n|History & 政治科学\n|贝尔德大厅，219G室\n|Dr. Steven\n|Housel\n|(电子邮件保护)\n|918-343-7811\n|History & 政治科学\n|贝尔德大厅，219C室\n|Dr. Kenneth\n|Hicks\n|(电子邮件保护)\n|918-343-7687\n|History & 政治科学\n|贝尔德大厅219L室\n|Dr. 
Paul\n|Hatley\n|(电子邮件保护)\n|918-343-7682\n|History & 政治科学\n|贝尔德大厅，219D室\n|Dr. Michael\n|Beauchamp\n|(电子邮件保护)\n|918-343-7746\n|History & 政治科学\n|贝尔德大厅219B室\n|Dr. Marla\n|Smith\n|(电子邮件保护)\n|918-343-6887\n|健康科学\n|健康科学，248B室\n|Dr. Amy\n|Richards\n|(电子邮件保护)\n|918-343-7641\n|健康科学\n|健康科学，248A室\n|Hillcat Hut\n|Café\n|918-343-7846\n|餐饮服务\n|Dr. 卡罗琳泰勒中心\n|Convenience\n|Store\n|918-343-7880\n|Food Services\n|百年纪念中心，106室\n|Dr. Michael\n|McKeon\n|(电子邮件保护)\n|918-343-7594\n|Fine Arts\n|贝尔德大厅217B室\n|Heather\n|Isaacs\n|(电子邮件保护)\n|918-343-7572\n|学生的成功\n|马卡姆大厅，232室\n|Dr. Hugh\n|Foley\n|(电子邮件保护)\n|918-343-7566\n|Fine Arts\n|贝尔德大厅，217D室\n|Bryce\n|Brimer\n|(电子邮件保护)\n|918-343-7611\n|Fine Arts\n|贝尔德大厅，224A室\n|Office of\n|Financial Aid\n|(电子邮件保护)\n|918-343-7553\n|Main Contact\n|Markham Hall\n|Scott\n|Reed\n|(电子邮件保护)\n|918-343-7588\n|English & Humanities\n|贝尔德大厅215K室\n|Dr. Matthew\n|Oberrieder\n|(电子邮件保护)\n|918-343-7743\n|English & Humanities\n|贝尔德大厅，215L室\n|Dr. Gioia\n|Kerlin\n|(电子邮件保护)\n|918-343-6894\n|English & Humanities\n|贝尔德大厅，215C室\n|Dr. Laura\n|Gray\n|(电子邮件保护)\n|918-343-7593\n|English & Humanities\n|贝尔德大厅215R室\n|Dr. Frank\n|Grabowski\n|(电子邮件保护)\n|918-343-7659\n|English & Humanities\n|贝尔德大厅，215S室\n|Dr. Sally\n|Emmons\n|(电子邮件保护)\n|918-343-7976\n|English & Humanities\n|贝尔德大厅，215P室\n|Renee\n|Cox\n|(电子邮件保护)\n|918-343-7978\n|English & Humanities\n|贝尔德大厅，215楼\n|Holly\n|Clay-Buck\n|(电子邮件保护)\n|918-343-7724\n|English & Humanities\n|健康科学，244B室\n|Bayone\n|Pettis\n|(电子邮件保护)\n|918-343-7761\n|教育机会中心\n|Off Campus OKC\n|Katie\n|Navarro\n|(电子邮件保护)\n|918-343-7709\n|教育机会中心\n|预备大厅，102D室\n|Keah\n|McCutchin\n|(电子邮件保护)\n|918-343-7761\n|教育机会中心\n|Off Campus OKC\n|Elizabeth\n|Gordon\n|(电子邮件保护)\n|918-343-7756\n|TRiO Program\n|预备堂106A室\n|Dr. 
Susan\n|Bedwell\n|(电子邮件保护)\n|918-343-7824\n|TRIO -教育机会中心\n|预备大厅，108室\n|Shonna\n|Payne\n|(电子邮件保护)\n|918-343-7775\n|Development & Foundation\n|基金会校友中心，110室\n|Tonni\n|Harrald\n|(电子邮件保护)\n|918-343-7767\n|Development & Foundation\n|基金会校友中心，112室\n|Clarice\n|Doyle\n|(电子邮件保护)\n|918-343-7882\n|RSUTV\n|Markham Hall\n|Sarah\n|Fennell\n|(电子邮件保护)\n|918-343-7538\n|行政计算服务\n|行政服务中心，121室\n|Catherine\n|Burns\n|(电子邮件保护)\n|918-343-7791\n|行政计算服务\n|行政服务中心，128室\n|Dr. Holly\n|Kruse\n|(电子邮件保护)\n|918-343-7879\n|通信\n|贝尔德大厅，221D室\n|Dr. Juliet\n|Evusa\n|(电子邮件保护)\n|918-343-7677\n|通信\n|贝尔德大厅，221B室\n|Tip\n|Crowley\n|(电子邮件保护)\n|918-343-7670\n|通信 | RSU Radio\n|马卡姆大厅，140室\n|Dr. David\n|Blakely\n|(电子邮件保护)\n|918-430-4309\n|通信\n|贝尔德大厅，221C室\n|Claremore\n|Campus Police\n|(电子邮件保护)\n|918-343-7624\n|Main Contact\n|Campus Police\n|Dr. David\n|Johnk\n|(电子邮件保护)\n|918-343-8352\n|Business\n|赫灵顿厅，161室\n|Dr. Todd\n|Jackson\n|(电子邮件保护)\n|918-343-7699\n|Business\n|赫灵顿大厅，106室\n|Office of\n|Hillcat Card\n|918-343-6884\n|Main Contact\n|马卡姆大厅，206室\n|Office of\n|Bursar\n|(电子邮件保护)\n|918-343-7558\n|Main Contact\n|马卡姆大厅，204A室\n|Nicole\n|Wigginton\n|(电子邮件保护)\n|918-343-7816\n|Budget & Accounting\n|行政服务中心\n|Christie\n|Lamberson\n|(电子邮件保护)\n|918-343-7790\n|Budget & Accounting\n|行政服务中心\n|Kimberly\n|Garland\n|(电子邮件保护)\n|918-343-7803\n|Bursar\n|马卡姆大厅，204室\n|Erin\n|Portine\n|(电子邮件保护)\n|918-343-7847\n|Bookstore\n|Bookstore\n|Dr. Craig\n|Zimmermann\n|(电子邮件保护)\n|918-343-6818\n|Biology\n|Loshbaugh大厅，205A房间\n|Dr. Jin\n|Seo\n|(电子邮件保护)\n|918-343-7841\n|Biology\n|洛什堡楼，209室\n|Department of\n|Biology\n|918-343-7695\n|Biology\n|洛什堡楼，210房间\n|Dr. Jae-Ho\n|Kim\n|(电子邮件保护)\n|918-343-7714\n|Biology\n|洛什堡楼，103室\n|Dr. 
Jerry\n|Bowen\n|(电子邮件保护)\n|918-343-7574\n|Biology\n|图书馆，105室\n|Ronda\n|Riden\n|(电子邮件保护)\n|918-338-8000\n|巴特尔斯维尔校区\n|巴特尔斯维尔，100号房\n|REDA\n|Maintenance-Basement\n|918-338-8085\n|巴特尔斯维尔校区\n|Basement\n|REDA\n|Maintenance\n|巴特尔斯维尔REDA建筑维修\n|918-338-8086\n|巴特尔斯维尔校区\n|206 Main\n|Bartlesville\n|Campus\n|918-338-8000\n|巴特尔斯维尔校区\n|Frank\n|Gage\n|巴特尔斯维尔高级建筑维修技术\n|918-338-8086\n|巴特尔斯维尔校区\n|Room 206\n|Larry\n|Elzo\n|(电子邮件保护)\n|918-338-8056\n|巴特尔斯维尔校区\n|巴特尔斯维尔校区，704室\n|Stephen\n|Davis\n|(电子邮件保护)\n|918-338-8005\n|巴特尔斯维尔校区\n|Room 226B\n|Andrea\n|Vaughan\n|(电子邮件保护)\n|918-343-7562\n|Athletics\n|棒球/垒球复杂\n|Dawn\n|Tatro\n|(电子邮件保护)\n|918-343-7884\n|Athletics\n|Bushyhead运动场201C室\n|Trey\n|Robertson\n|(电子邮件保护)\n|918-343-5203\n|Athletics\n|足球场104J室\n|Derek\n|Larkin\n|(电子邮件保护)\n|918-343-7995\n|Athletics\n|足球场，104L室\n|Chris\n|Klimas\n|(电子邮件保护)\n|918-343-7787\n|Athletics\n|棒垒球综合楼203室\n|Justin\n|Barkley\n|(电子邮件保护)\n|918-343-6804\n|Athletics\n|Bushyhead Field House, 203室\n|Curtis\n|Sparling\n|(电子邮件保护)\n|918-343-7722\n|Technology & 司法研究\n|赫灵顿厅108室\n|Connie\n|Wall\n|(电子邮件保护)\n|918-343-6880\n|邮件、档案 & 打印服务\n|潘兴大厅，111室\n|Don\n|Thompson\n|(电子邮件保护)\n|918-343-7733\n|邮件、档案 & 打印服务\n|潘兴大厅，102室\n|Amy\n|Edwards\n|(电子邮件保护)\n|918-343-7796\n|人力资源\n|行政服务中心，120室\n|Dr. Larry\n|Rice\n|(电子邮件保护)\n|918-343-7612\n|Administration\n|梅耶大厅，105室\n|Dr. Richard\n|Beck\n|(电子邮件保护)\n|918-343-7615\n|学术事务\n|迈尔大厅，111室\n|Dr. Mary\n|Millikin\n|(电子邮件保护)\n|918-343-7605\n|教务、问责 & Academics\n|迈耶大厅，113室\n|Brian\n|Reeves\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，220室\n|Denton\n|Brown\n|(电子邮件保护)\n|918-343-7538\n|学术计算服务\n|预备大厅，210室",
    "meta": {
        "lang": "zh",
        "lang_score": 0.5236985087394714,
        "url": "http://0n7nq1i1.bk1988.com/about/contact-directory/",
        "timestamp": "2023-11-28T09:10:54Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz"
    }
}

English Documents Scoring Lower than 0.65 Examples

Sample documents that are classified as English but with score less than 0.65

Data sample: 3 of 99

{
    "text": "|66\n|DRASKOVICS VALÉR\n|2\n|KAJÁRI ÁDÁM GERGŐ\n|5\n|FÖLDEÁK DOMINIK\n|6\n|ALAM SHER ASIF\n|7\n|DOBOS BOTOND TAMÁS\n|8\n|HORVÁTH PÉTER\n|8'\n|9\n|DRASKOVICS VILMOS\n|10\n|ZSÁMPÁR OLIVÉR\n|18'\n|11\n|BANICZ BUDA LAJOS\n|CSERÉK\n|1\n|GAÁL PÉTER\n|13\n|SALAMON BÁLINT\n|15\n|MADÁR MARCELL\n|18\n|KARDOS DOMINIK\n|19\n|AMREIN PATRIK\n|20\n|TÓVÁRI BERNÁT\n|VEZETŐEDZŐ",
    "meta": {
        "lang": "en",
        "lang_score": 0.2852953374385834,
        "url": "http://ada1bank.mlsz.hu/match?itemId=1498129&evad=52&szervezet=19&verseny=21268&fordulo=11",
        "timestamp": "2023-11-28T09:34:11Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz"
    }
}

URL Filtering: This section details our decisions around the UT1 blocklist. We use the UT1 blocklist as a simple way to filter out potentially harmful content, such as adult content. We also exclude URLs that host the online versions of our curated data sources (e.g., wikipedia.org) to avoid duplication.

URL Blocklist: Following RefinedWeb, we manually inspected the UT1 blocklist to reduce false positives such as news articles, sex education, and technical blogs. Specifically, we randomly sampled 903M URLs and matched them against the 4.6M domain names in the UT1 blocklist. Notably, 24 domains had more than 4k matching URLs each; they are listed below.

List of 24 URLs with 4k+ Matches

24 URL domains with more than 4k matches

{
    "blog.libero.it": "some articles are erotic Poetry",
    "bust.com": "sex education, non porn type of content",
    "chicagoreader.com": "news articles",
    "discord.com": "tech blogs",
    "filedn.com": "cloud storage, web hosting type",
    "hotnessrater.com": "can't access",
    "ibb.co": "image file hosting",
    "imgur.com": "youtube shots type website",
    "izismile.com": "adult wedsite",
    "jungefreiheit.de": "german news articles",
    "marktplaza.nl": "resale marketplace",
    "pbase.com": "file hosting",
    "racejack.s40.xrea.com": "gambling",
    "rapidgator.net": "file sharing",
    "rutube.ru": "porn",
    "servimg.com": "image hsting",
    "telegra.ph": "some text hosting website",
    "thechive.com": "porn",
    "=3.com": "can't access",
    "thoughtcatalog.com": "Magazine",
    "turbobit.net": "file hosting",
    "videa.hu": "videos like youtube",
    "weheartit.com": "controversy new articles",
    "xnxx.com.se": "porn"
}

We manually removed the following 6 domains from the UT1 blocklist so that documents from these domains are kept in our dataset.

6 URLS Manually Removed from the Blocklist

6 url domains that are removed from the blocklist

{
    "bust.com": "sex education, non porn type of content",
    "chicagoreader.com": "news articles",
    "discord.com": "tech blogs",
    "jungefreiheit.de": "german news articles",
    "marktplaza.nl": "resale marketplace",
    "telegra.ph": "some text hosting website"
}
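The blocklist refinement described above can be sketched as follows. This is a minimal illustration, not the TxT360 code: the two sets are tiny hypothetical stand-ins for the 4.6M-entry UT1 list and the 6 manually removed domains, and the choice to also match parent domains of a hostname (e.g. treating www.rutube.ru as rutube.ru) is our assumption, since the matching rule is not spelled out here.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the UT1 blocklist (the real list has ~4.6M domains).
UT1_BLOCKLIST = {"rutube.ru", "xnxx.com.se", "discord.com", "bust.com"}

# The 6 domains manually removed from the blocklist after inspection.
MANUALLY_REMOVED = {
    "bust.com", "chicagoreader.com", "discord.com",
    "jungefreiheit.de", "marktplaza.nl", "telegra.ph",
}

def candidate_domains(url):
    """Return the hostname and each parent domain with at least two labels."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    return [".".join(parts[i:]) for i in range(len(parts) - 1)]

def is_blocked(url, blocklist=UT1_BLOCKLIST, removed=MANUALLY_REMOVED):
    """True if the URL matches the refined blocklist (blocklist minus removed domains)."""
    refined = blocklist - removed
    return any(domain in refined for domain in candidate_domains(url))

print(is_blocked("http://www.rutube.ru/video/1"))   # matched via parent domain
print(is_blocked("https://discord.com/blog/post"))  # kept: manually removed from the list
```

Subtracting the manually removed domains once, up front, keeps the per-URL check to a handful of set lookups, which matters when matching hundreds of millions of URLs.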

Blocked Document Examples from the URL Blocklist (WARNING: MAY CONTAIN OFFENSIVE MATERIAL)

Sample documents whose urls are blocked by the refined url blocklist

Data sample: 3 of 42

{
    "text": "Check out\nNaughty Americans\nto meet them all!\nCheck out\nNaughty Americans\nto meet them all!\nRelated Pictures\n© 2015 - 2023 www.nylonburg.com",
    "meta": {
        "lang": "en",
        "lang_score": 0.8580492734909058,
        "url": "http://nylonburg.com/galleries/dsgfsgfg-102611.html",
        "timestamp": "2023-01-26T22:20:54Z",
        "cc-path": "crawl-data/CC-MAIN-2023-06/segments/1674764494826.88/warc/CC-MAIN-20230126210844-20230127000844-00000.warc.gz"
    }
}

Excluded High Quality Sources: To avoid duplication with our high-quality curated datasets, we exclude the following domains from our dataset.

TxT360 Excluded URLs

curated url domains that are excluded from our dataset

[
    "https://stackexchange.com/",
    "https://www.ncbi.nlm.nih.gov/pmc/",
    "https://arxiv.org/",
    "https://github.com/",
    "https://storage.courtlistener.com/",
    "https://bulkdata.uspto.gov/",
    "https://pubmed.ncbi.nlm.nih.gov/",
    "https://www.opensubtitles.org/",
    "https://www.wikipedia.org/",
    "https://irclogs.ubuntu.com/",
    "https://www.statmt.org/",
    "https://news.ycombinator.com/",
    "https://www.youtube.com/",
    "https://philpapers.org/"
]

TxT360 Excluded URLs Example Documents

Sample documents whose urls are in our curated url domain list

Data sample: 0 of 1

{
    "text": "Objectives: To explore similarities and differences in challenges to maternal health and evidence implementation in general across several low- and middle-income countries (LMICs) and to identify common and unique themes representing barriers to and facilitators of evidence implementation in LMIC health care settings.\nStudy design: Secondary analysis of qualitative data.\nSetting: Meeting reports and articles describing projects undertaken by the authors in five LMICs on three continents were analyzed. Projects focused on identifying barriers to and facilitators of implementation of evidence products: five World Health Organization maternal health guidelines, and a knowledge translation strategy to improve adherence to tuberculosis treatment. Data were analyzed using thematic content analysis.\nResults: Among identified barriers to evidence implementation, a high degree of commonality was found across countries and clinical areas, with lack of financial, material, and human resources most prominent. In contrast, few facilitators were identified varied substantially across countries and evidence implementation products.\nConclusion: By identifying common barriers and areas requiring additional attention to ensure capture of unique barriers and facilitators, these findings provide a starting point for development of a framework to guide the assessment of barriers to and facilitators of maternal health and potentially to evidence implementation more generally in LMICs.\nKeywords: Barriers; Evidence implementation; Evidence tools; Facilitators; Guidelines; Knowledge products.\nCopyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.",
    "meta": {
        "lang": "en",
        "lang_score": 0.9207570552825928,
        "url": "https://pubmed.ncbi.nlm.nih.gov/26931284/",
        "timestamp": "2023-11-28T09:33:25Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz"
    }
}

Line-Level Removal

Before filtering low-quality documents, we perform line-level removal to drop low-quality lines. This ensures that the quality signals are computed on the text that is finally kept.

Terminal Punctuation: C4 and Dolma use terminal punctuation as a signal, removing lines that do not end with a terminal punctuation mark (i.e., “.”, “?”, “!”, or “"”). However, we found that removing these lines can be too aggressive, especially when the text is extracted with the “trafilatura” tool.

For instance, in the CommonCrawl file CC-MAIN-20230126210844-20230127000844-00000.warc.jsonl, the terminal punctuation rule removed 56,292 additional lines and completely excluded 2,203 of the 13,560 documents (16.25%). Accordingly, we chose not to use terminal punctuation as a signal for removing lines.
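To make the measurement above concrete, here is a minimal sketch of the terminal punctuation rule and of how the two counts (removed lines, fully excluded documents) could be computed. The function names and the toy documents are hypothetical, and we assume that trailing whitespace is ignored and that a document counts as excluded when none of its lines survive.

```python
# The four terminal punctuation marks listed above.
TERMINAL_PUNCTUATION = (".", "?", "!", '"')

def split_by_terminal_punctuation(text):
    """Split a document into (kept_lines, removed_lines) under the rule."""
    kept, removed = [], []
    for line in text.split("\n"):
        if line.rstrip().endswith(TERMINAL_PUNCTUATION):
            kept.append(line)
        else:
            removed.append(line)
    return kept, removed

def summarize(documents):
    """Count removed lines and documents left with no lines at all."""
    removed_lines = 0
    excluded_docs = 0
    for text in documents:
        kept, removed = split_by_terminal_punctuation(text)
        removed_lines += len(removed)
        if not kept:
            excluded_docs += 1
    return removed_lines, excluded_docs

docs = [
    "Financial information\nThe past year was challenging.",
    "Key data 2014\nKey objectives",
]
print(summarize(docs))
```

On real extractions, headings, list items ending in commas, and bylines all fail the rule, which is the pattern visible in the example document that follows.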

Terminal Punctuation Filtering Examples

Sample documents with lines that are removed by the rule of terminal punctuation

Data sample: 0 of 99

{
    "text": "Financial information\nThe past year ushered in a series of challenges for the companies in the fuel sector. The decisions made by us have demonstrated that we are able to take rapid steps to adapt to a demanding environment and ensure the desired profitability for our projects.\nSegment performance\nThe segmental management model we have implemented enhances management efficiency, delivering cost and revenue synergies across the organization.\nLetter from the Vice-President of the Board\n2014 ushered in a series of challenges for the companies in the fuel sector. The decisions made by the LOTOS Group have demonstrated that we are able to take rapid steps to adapt to a demanding environment and ensure the desired profitability for our projects.\nBusiness environment\nThe key factor that had a strong impact on both the global and Polish petroleum markets in 2014, with significant consequences for the LOTOS Group’s performance, was the price of crude oil, which also determined the price of petroleum products.\nStrategic objectives\nThe LOTOS Group’s Strategy is designed to strengthen our position as a strong, innovative and efficient business which plays a major role in ensuring national energy security.\nBusiness model\nOur operations consist in crude oil production and processing, as well as wholesale and retail sale of petroleum products, among which are: fuels (unleaded gasoline, diesel oil and light fuel oil), heavy fuel oil, bitumens, aviation fuel, naphtha, propane-butane LPG and base oils.\nRisk and opportunities\nAt the LOTOS Group, we identify a range of diverse risks, which may affect all areas of our business. The key risks in terms of their impact on our operations are the financial risks as well as risks affecting the exploration and production area. In the analysis of the risks, we also factor in issues related to sustainable development.\nKey data 2014\nWith revenue of ca. 
PLN 28.5bn in 2014, we rank fourth in the group of 500 largest businesses in Poland.\nIntegrated Annual Report 2014\nKey objectives\nThe concept of corporate social responsibility is a part of the vision of the LOTOS Group’s operations, and is reflected in both the business strategy and the corporate social responsibility strategy. In 2012, the Board of Grupa LOTOS approved the revised corporate social responsibility strategy for the LOTOS Group, effective – just like the business strategy – until 2015.\nThe primary goal of the LOTOS Group’s CSR strategy is to support the organization in meeting the objectives provided for in its business strategy by optimum use of the organization’s resources and capabilities to generate economic and social value for the benefit of the LOTOS Group and its environment.\nTo ensure successful delivery of that goal, the social, environmental, ethical and human rights concerns included in the CSR strategy were incorporated into the LOTOS Group’s core operations and business strategy. In this way, we have created a mechanism to:\n- Maximise the building of shared value for the shareholders, our other stakeholders and society as a whole,\n- Identify, prevent and mitigate the possible negative effects of our operations.\nThe efforts undertaken by the LOTOS Group in the social and business spheres, in our relations with key stakeholders and in corporate governance are aimed principally to:\n- Ensure compliance with the law and ethical standards,\n- Increase our positive contribution to social development,\n- Mitigate possible adverse impacts of our operations and the associated risks,\n- Maximise our chances for sustainable development over the long term.\nThe LOTOS Group’s CSR strategy until 2015 defined the key objectives to be achieved in individual areas of activity. 
For each of these objectives, a set of targets and action plans has been developed to support the achievement of the results envisaged in the strategy.\n- In the area of investment in human resources our objective is to ensure the availability of highly qualified staff required to successfully implement the business strategy and enhance the corporate culture based on adopted values.\n- As regards health and safety improvement, our priority is to increase the awareness and involvement in work safety improvement among the management staff, employees and contractors.\n- As regards integration with the local community, our principal goal is to undertake initiatives that help to ensure lasting solutions to social and environmental issues vital to our local communities.\n- In the area of management of natural resources in the production process, we seek to reduce environmental risk and continually minimise the environmental impact of the LOTOS Group’s operations.\n- In terms of ethics and the prevention of misconduct, we seek to improve our management by ensuring ethical conduct and the transparency of business processes, as well as by protecting the organization against misconduct.\n- Our strategic goal with respect to partnership relations with the market environment is to build lasting customer relationships by focusing on understanding customers’ needs and ensuring expected product quality and safety.\n- As regards energy sector security, our objective is to support initiatives designed to enhance energy sector security in a socially and environmentally responsible manner.\n- As regards communication, we aim to ensure that communication with employees is timely and appropriate to their various needs. 
We also seek to build organizational culture based on multi-directional, open communication, including through the development of a system of public consultations within the LOTOS Group.\n2014 was the third full year of implementation of the LOTOS Group’s corporate social responsibility strategy, which, just like our business strategy, spans the period to 2015.\nJowita Twardowska\nCommunication and CSR Director, Grupa LOTOS\nTherefore, we are about to update our objectives concerning social and environmental impacts, corporate governance and human rights protection. This will be an excellent opportunity for us to review our more than eight-year performance, since analyses underlying the development of our first CSR strategy were started in early 2007, as part of a series of important changes driven by the growth of the LOTOS Group as an integrated oil corporation.\nOur approach to CSR is long-term and comprehensive, since it has become an element of the management process. The synergy of its business and social aspects has been ensured through the development of detailed operational plans and measures of the CSR strategy performance against targets in all of its key areas. Performance against targets is supervised by leaders of particular areas, reporting to the Board. For the purpose of performance reporting, we have developed a method to monitor the implementation of the CSR strategy, similar to that used for analysing the effects of our business strategy. Our CSR practices, similarly to practices in other key management areas, are additionally assessed for maturity, and evaluated by the management on a regular basis during our yearly ‘CSR Day’. 
Recent analyses show a high level of achievement of the CSR targets, in excess of 90%.\nOur approach consistently strengthens the LOTOS Group’s reputation as a desirable employer, trading partner of choice, and trusted capital market participant, as well as a responsible neighbour, a business in symbiosis with its environment, committed to improving the well-being of its neighbourhood, resolving issues and responding to the challenges that emerge in the vicinity of its plants.\nWe strongly support the approach set out in the PN-ISO 26000 standard, where corporate social responsibility is defined as an element of the management process which takes into consideration a company’s responsibility for the environment in which it operates. We identify, assess, measure and mitigate our social and environmental impacts. It is worth noting here that the precautionary measures we take depend on the results of analyses of how particular risks affect the functioning of the organization and its surroundings. Therefore, we measure not only our impacts, but also the impacts of external factors on the operation and effectiveness of the LOTOS Group.\nWe perceive our involvement with the affairs of the environment in which we operate as our duty and commitment towards the stakeholders. We believe in building the LOTOS Group value while catering to the needs and expectations of local communities. In fact, one of the eight pillars of the LOTOS Group’s CSR strategy is integration with local communities. We understand it as contribution to ensuring lasting solutions to social and environmental issues vital to our local communities.\nTo translate this approach into effective action, we need to identify actual needs and expectations of our key stakeholders. And this is how the LOTOS Group works − we consult our decisions and priorities with local communities. A case in point is the consultation of our CSR strategy with the stakeholders. 
Also, everyone can share their views on our integrated annual reports or report an incident that involves a potential violation of our Code of Ethics.\nTransparent ongoing communication, being open to public dialogue and transparent reporting on the implementation of the CSR strategy are also vital to creating lasting relations with the stakeholders.\nLOTOS’ sound financial and market position proves that we successfully bring benefits to the organization and simultaneously build value for our environment.",
    "meta": {
        "lang": "en",
        "lang_score": 0.9504454135894775,
        "url": "http://2014.raportroczny.lotos.pl/en/business-strategy-and-model/key-objectives",
        "timestamp": "2023-01-26T21:40:47Z",
        "cc-path": "crawl-data/CC-MAIN-2023-06/segments/1674764494826.88/warc/CC-MAIN-20230126210844-20230127000844-00000.warc.gz"
    },
    "quality_signals": {
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.013622291021671827
            ],
            [
                3,
                0.014860681114551083
            ],
            [
                4,
                0.010526315789473684
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.057024957740877
            ],
            [
                6,
                0.04916673575988724
            ],
            [
                7,
                0.04391969985241558
            ],
            [
                8,
                0.03882981207865606
            ],
            [
                9,
                0.035149433565077694
            ],
            [
                10,
                0.031157233018185446
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.006012024048096192,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.24561403508771928,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 1488,
        "mean_word_length": 5.426747311827957,
        "num_of_sentences": 54,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 0.9791666666666666,
        "num_of_stop_words": 320,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false
    },
    "removed_lines_by_terminal_punctuation": [
        "Financial information",
        "Segment performance",
        "Letter from the Vice-President of the Board",
        "Business environment",
        "Strategic objectives",
        "Business model",
        "Risk and opportunities",
        "Key data 2014",
        "Integrated Annual Report 2014",
        "Key objectives",
        "To ensure successful delivery of that goal, the social, environmental, ethical and human rights concerns included in the CSR strategy were incorporated into the LOTOS Group’s core operations and business strategy. In this way, we have created a mechanism to:",
        "- Maximise the building of shared value for the shareholders, our other stakeholders and society as a whole,",
        "The efforts undertaken by the LOTOS Group in the social and business spheres, in our relations with key stakeholders and in corporate governance are aimed principally to:",
        "- Ensure compliance with the law and ethical standards,",
        "- Increase our positive contribution to social development,",
        "- Mitigate possible adverse impacts of our operations and the associated risks,",
        "Jowita Twardowska",
        "Communication and CSR Director, Grupa LOTOS"
    ]
}

"Word "Javascript" In C4,the authors remove any line with the word "Javascript" since they found that many of the scraped pages contained warnings stating that Javascript should be enabled. However, this filtering strategy is too strict, which will filter out many lines that are really talking about “Javascript”.

In our pipeline, we refine the strategy by requiring one more keyword alongside the word "javascript" to avoid false positives. A line is removed only if it also contains any one of “enable” / “disable” / “require” / “activate” / “browser”.
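The difference between the original C4 rule and the refined rule can be sketched in a few lines. This is an illustration under assumptions: we assume case-insensitive substring matching (so "enabled" counts as "enable"), which is not specified here, and the function names are hypothetical.

```python
# Extra keywords that must co-occur with "javascript" for a line to be removed.
EXTRA_KEYWORDS = ("enable", "disable", "require", "activate", "browser")

def c4_javascript_rule(line):
    """Original C4 rule: flag any line that mentions javascript."""
    return "javascript" in line.lower()

def refined_javascript_rule(line):
    """Refined rule: flag only if javascript co-occurs with an extra keyword."""
    lowered = line.lower()
    return "javascript" in lowered and any(k in lowered for k in EXTRA_KEYWORDS)

def remove_javascript_warnings(text):
    """Drop lines flagged by the refined rule and keep everything else."""
    return "\n".join(
        line for line in text.split("\n") if not refined_javascript_rule(line)
    )

warning = "Please enable JavaScript in your browser to view this page."
technical = "The Design Tokens are manipulated through JavaScript interop."
print(c4_javascript_rule(technical), refined_javascript_rule(technical))
print(c4_javascript_rule(warning), refined_javascript_rule(warning))
```

Substring matching is deliberately loose: it catches inflected forms such as "required" or "disabled", at the cost of occasionally flagging a technical sentence that happens to mention a browser.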

Javascript Documents Filtered by C4 but Kept in TxT360

Sample documents that are removed by original C4 javascript rule but are kept after our refinement

Data sample: 0 of 12

{
    "text": "What's new in the Microsoft FluentUI library for Blazor versions 1.3 and 1.4\nIn this post I'll give you an overview of what's new and changed in versions 1.3 and 1.4 of the Microsoft Fluent UI for Blazor library. Me and the team were so busy with adding exciting new features that I didn't have time to blog about them earlier. I'll describe the features both from an end-user as from a code perspective.\nIn short, the two big end-user additions (besides some bug fixes) are:\n- Fluent UI System Icons support\n- Design Token support\nAnd the repository and code changes:\n- Add missing xml comments\n- Add\nFocusAsyncmethods to\nFluent UI System Icons support\nThe Fluent UI System Icons are a (still growing) collection of familiar, friendly and modern icons from Microsoft. At the moment there are more than 2020 distinct icons available in both filled and outlined versions and in various sizes. In total the collections consists of well over 11k icons in SVG format. All of them have been added to the library in an easy to use way by addition of the\n<FluentIcon> component. Just putting this in your .razor page:\n<FluentIcon Name=@FluentIcons.Accessibility Size=IconSize.Size32 Filled=true />\nWill give you this:\nCouple of things to unwrap here on the parameters used:\nNameis a string. To make it easier to select an icon, all names have been added as constants. IntelliSense will help you find the right one. It is also using the new .NET 6\nEditorRequiredfeature. If you don't pass a value for the\nName, Visual Studio will point that out to you in design time and raise a compile error when building.\nSizeis using an\nenumthat holds all the possible valid sizes. Note the not all sizes are available for all icons. 
If you get an error at run-time, supply a different size or remove the parameter to fall back to the default size (\nFilledis a bool to choose a filled (true) or a regular/outlined (false) version of an icon\nOther parameters (not shown in the example above) for this component are:\nUseAccentColor (bool). This parameter defaults to\ntrueand determines if the accent color is used for the fill or the outline when rendering the icon. When setting this to false, the icon will be rendered in black.\nNeutralCultureName (string). Some icons offer alternative versions for specific languages. By supplying the two letter neutral language code, you can indicate that you would like to use that specific version of an icon. If there is no language specific version, the component will fall back to rendering the original version. Example:\nNeutral @FluentIcons.TextBold iconsFilled:Regular:French @FluentIcons.TextBold icons (\nSlot (string). With the slot parameter you can indicate where an icon needs to be rendered in the context of another component. For example when combining a <FluentButton> with a <FluentIcon>, you can use the slot to put the icon in font of the button text:\n<FluentButton @onclick=\"HandleSearch\">Search<FluentIcon Name=\"@FluentIcons.Search\" Size=\"@IconSize.Size16\" Filled=false Slot=\"start\"></FluentIcon></FluentButton>(note that the icon component is inserted after the button text) . This will render:\nThe temporary demo site has a page to search through all the available icons and sizes.\nIn earlier and other libraries you will often find icons being included by means of fonts. This comes with the disadvantage that the whole font needs to be downloaded even if you are only using 1 icon. I therefore stepped away from using that method in this library and opted for taking the SVG route. Because the icons are SVG files that are stored in the\nwwwroot folder in a RCL (Razor Class Library), they are being treated like ordinary static files on the server. 
They don’t get downloaded until requested and only the icons you are actually using will be downloaded. Sounds like a win-win to me!\nDesign Token support\nThe Fluent UI Web Components are built on FAST's Adaptive UI technology, which enables design customization and personalization, while automatically maintaining accessibility. This is accomplished through setting various \"Design Tokens\". In previous versions of this library, the only way to manipulate the design tokens was through using the\n<FluentDesignSystemProvider> component. This Blazor component (and it's underlying Web Component) exposed a little over 60 variables that could be used to change things like typography, color, sizes, UI spacing, etc. FAST has been extended a while ago and now has a much more granular way of working with individual design tokens instead of just through a design system provider model. See https://docs.microsoft.com/en-us/fluent-ui/web-components/design-system/design-tokens for more information on how Design Tokens work.\nIn total there are now over 160 distinct tokens defined in the Adaptive UI model and as of version 1.4 of this library you can use all these in Blazor as well! The implementation has been in the works for multiple months but I think the end result is quite flexible. It allows for usage both from code as in a declarative way in your .razor pages. 
The two ways of working with design tokens are described below (taken from the repository readme):\nOption 1: Using Design Tokens from C# code\nGiven the following .razor page fragment:\n<FluentButton @ref=\"ref1\" Appearance=\"Appearance.Filled\">A button</FluentButton> <FluentButton @ref=\"ref2\" Appearance=\"Appearance.Filled\">Another button</FluentButton> <FluentButton @ref=\"ref3\" Appearance=\"Appearance.Filled\">And one more</FluentButton> <FluentButton @ref=\"ref4\" Appearance=\"Appearance.Filled\" @onclick=OnClick>Last button</FluentButton>\nYou can use Design Tokens to manipulate the styles from C# code as follows:\n[Inject] private BaseLayerLuminance BaseLayerLuminance { get; set; } = default!; [Inject] private AccentBaseColor AccentBaseColor { get; set; } = default!; [Inject] private BodyFont BodyFont { get; set; } = default!; [Inject] private StrokeWidth StrokeWidth { get; set; } = default!; [Inject] private ControlCornerRadius ControlCornerRadius { get; set; } = default!; private FluentButton? ref1; private FluentButton? ref2; private FluentButton? ref3; private FluentButton? 
ref4; protected override async Task OnAfterRenderAsync(bool firstRender) { if (firstRender) { //Set to dark mode await BaseLayerLuminance.SetValueFor(ref1!.Element, (float)0.15); //Set to Excel color await AccentBaseColor.SetValueFor(ref2!.Element, \"#185ABD\".ToSwatch()); //Set the font await BodyFont.SetValueFor(ref3!.Element, \"Comic Sans MS\"); //Set 'border' width for ref4 await StrokeWidth.SetValueFor(ref4!.Element, 7); //And change conrner radius as well await ControlCornerRadius.SetValueFor(ref4!.Element, 15); StateHasChanged(); } } public async Task OnClick() { //Remove the wide border await StrokeWidth.DeleteValueFor(ref4!.Element); }\nAs can be seen in the code above (with the ref4.Element), it is posible to apply multiple tokens to the same component.\nFor Design Tokens that work with a color value, you must call the ToSwatch() extension method on a string value or use one of the Swatch constructors. This makes sure the color is using a format that Design Tokens can handle. A Swatch has a lot of commonality with the System.Drawing.Color struct. Instead of the values of the components being between 0 and 255, in a Swatch they are expressed as a value between 0 and 1.\nThe Design Tokens are manipulated through JavaScript interop working with an ElementReference. There is no JavaScript element until after the component is rendered. This means you can only work with the Design Tokens from code after the component has been rendered in OnAfterRenderAsync and not in any earlier lifecycle methods.\nOption 2: Using Design Tokens as components\nThe Design Tokens can also be used as components in a .razor page directely. 
It looks like this:\n<BaseLayerLuminance Value=\"(float?)0.15\"> <FluentCard BackReference=\"@context\"> <div class=\"contents\"> Dark <FluentButton Appearance=\"Appearance.Accent\">Accent</FluentButton> <FluentButton Appearance=\"Appearance.Stealth\">Stealth</FluentButton> <FluentButton Appearance=\"Appearance.Outline\">Outline</FluentButton> <FluentButton Appearance=\"Appearance.Lightweight\">Lightweight</FluentButton> </div> </FluentCard> </BaseLayerLuminance>\nTo make this work, a link needs to be created between the Design Token component and its child components. This is done with the BackReference=\"@context\" construct.\nOnly one Design Token component at a time can be used this way. If you need to set more tokens, use the code approach as described in Option 1 above.\nBesides these two new options, the original\n<FluentDesignSystemProvider> component is still there and can be used as always. There are no plans to remove this anytime soon.\nXML Comments\nAll components and all component parameters now have xml comments. This means that tools who support this, like Visual Studio IntelliSense, will show you information about methods and parameters when editing your razor pages and code, making it a bit easier to discover and understand functionallity:\nFocusAsync to FluentInputBase\nIt was not possible before to programmatically set focus to an\n<input> derived component like the\n<FluentTextField> or\n<FluentNumberField>. The base class\n<FluentInputBase> has now been extended to expose this method.",
    "meta": {
        "lang": "en",
        "lang_score": 0.8278772234916687,
        "url": "http://baaijte.net/blog/microsoft-fast-components-fluentui-1.4/",
        "timestamp": "2023-01-26T21:28:43Z",
        "cc-path": "crawl-data/CC-MAIN-2023-06/segments/1674764494826.88/warc/CC-MAIN-20230126210844-20230127000844-00000.warc.gz"
    },
    "quality_signals": {
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.013624321937681342
            ],
            [
                3,
                0.005676800807367226
            ],
            [
                4,
                0.006307556452630251
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.01006550163121823
            ],
            [
                6,
                0.006917204437133578
            ],
            [
                7,
                0.004486008356999439
            ],
            [
                8,
                0.0022173651366847224
            ],
            [
                9,
                0.0
            ],
            [
                10,
                0.0
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.004379562043795621,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.0625,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 1364,
        "mean_word_length": 5.81158357771261,
        "num_of_sentences": 68,
        "symbol_to_word_ratio": 0.0021994134897360706,
        "fraction_of_words_with_alpha_character": 0.9648093841642229,
        "num_of_stop_words": 187,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false
    }
}

Other Rules from RefinedWeb: We also adopt rules from RefinedWeb to remove lines if they satisfy any of the following criteria:

The line is only composed of uppercase characters,
the line is only composed of numerical characters,
the line matches the pattern r'^\d+\s+likes$',
the line only contains one word.
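
The four rules above can be sketched as a small line filter. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, and using `str.isupper()` and `str.isdigit()` to approximate "only uppercase" and "only numerical" is an assumption.

```python
import re

# Pattern quoted in the rule list above.
LIKES_PATTERN = re.compile(r"^\d+\s+likes$")


def should_remove_line(line: str) -> bool:
    """Return True if the line matches any of the four rules (illustrative)."""
    stripped = line.strip()
    if not stripped:
        return False  # assumption: blank lines are not covered by these rules
    if stripped.isupper():  # assumption: approximates "only uppercase characters"
        return True
    if stripped.isdigit():  # assumption: approximates "only numerical characters"
        return True
    if LIKES_PATTERN.match(stripped):
        return True
    return len(stripped.split()) == 1  # the line contains a single word


def filter_lines(text: str) -> tuple[str, list[str]]:
    """Split on newlines and return (kept text, removed lines)."""
    kept, removed = [], []
    for line in text.split("\n"):
        (removed if should_remove_line(line) else kept).append(line)
    return "\n".join(kept), removed
```

Applied to a fragment such as "Here it is:\nYours\nJohn\nMy translation", this sketch drops the single-word lines "Yours" and "John", which matches the kind of lines listed under removed_lines_by_refinedweb_rules in the sample document.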

Documents Filtered using RefinedWeb Rules.

Sample documents with lines that are removed by the RefinedWeb rules


{
"text": "German website for infos about \"Active Thermitic Material...\" launched\nWe launched a new website here for the german speaking part of the world with a short overview on the latest study and a easy link to keep in mind. (even if the name is not 100% scientifically correct)\nHere it is:\n- Sitting-Bull's blog\n- Login to post comments\n- Login to post comments\n- 1 vote\nhere is the translation from german into english thanks to\nJohn A MITCHELL\nHerblay FRANCE\nbonsoir ,\nhere is the translation from german into english thanks to\nhttp://translate.google.com/translate_t?hl=fr&sl=fr#\nhttp://translate.google.com/translate?js=n&prev=_t&hl=fr&ie=UTF-8&u=http...\nYours\nJohn\n- Login to post comments\n- 1 vote\nMy translation\n(Because automated translations of difficult texts are often messy :-)\n\"On September 11th, 2001, the New York skyline changed dramatically. Now it has been scientifically proven, that this happened with the aid of explosives.\"\nIndependent scientists from Copenhagen University, multiple universities in the USA and an employee of an Australian company have investigated Ground Zero dust in the form of a two-year research paper. They have, inter alia, verified the remnants of an explosive in the nano-thermite category.\nUpon looking at the collapsing buildings on TV , the Danish scientist Dr. Harrit became doubtful. He couldn't explain, how the building collapses could have occurred in this amazing symmetry. In the recent years already numerous scientists have criticized, that although skyscrapers the scale of the Twin Towers burn after an airplane impact, they do not collapse in the manner and way observed. They assume that even the impact of the planes and also the heat from the fires together do not suffice to compromise the complete high-rise. In fact, that's how, one way or another, so many people died - the people on the floors above and below were, depending on their distance to the explosion, harmed little or not at all. 
In comparable accidents it came, not lastly due to kerosine, to considerable explosions, but these buildings remained standing. You should also look at the statements of Frank DeMartini in this context. Especially the pictures of the imploding WTC 7 in the afternoon of September 11 are strongly reminiscent of controlled demolition of buildings, which you see on the TV now and then.\nThe chemist Niels H. Harrit also couldn't understand, how the third building, WTC 7, without any effect of an airplane could collapse hours later. In only 6.5 seconds the tower is officially reduced to dust. A third plane crash wasn't listed in this sector, and also nothing is known of additional terrorist attacks - WTC 7 was only hit by falling debris of other buildings. How did we get here?\nThe official investigations have taken almost seven years, and claim in effect that the failure of a sole column in WTC 7, column number 79, triggered this accident. If this is correct, all high-rises should have a security check-up.\nWhen Harrit saw the the images on TV, the scientist of the University of Copenhagen suspected criminal involvement as the cause for this occurrence. However, in contrast to many other people, he didn't blindly trust the statements of reporters. Instead, he researched on his own. And tested dust residue. In the process, a relatively new substance was discovered, which is only known since the mid-nineties. So-called nano- or super-thermite.\nThermite is the common name for a mixture of iron oxide and aluminium powder. It's regularly used in welding. While the material is stable at room temperature, one can ignite it, provided it is exposed to adequate activation energy. The ensuing chemical reaction heats all substances to 3000 centrigrade, and all components become liquid because of the temperatures reached. Because burning thermite doesn't require external oxygen, the chemical reaction can't be 'choked'. 
It can be ignited under virtually all conditions and continues to burn without oxygen. Attempts to extinguish with water lead to a worsening of the situation, amongst other things explosive hydrogen mixtures are created.\nNanotechnology played a key role in this crime, however it was executed. It demagnified the thermite aluminum particles present in thermite, so that larger reactive surface areas proportional to substance volume formed, and therefore even faster reaction speeds were accomplished. Materials such as this are only available to the universities researching them or the military. In the opinion of the researcher, such materials are never a normal building component, so under normal circumstances they should have never been found in the rubble. It's also unclear, whether the nano-thermite was primarily employed to heat the columns or also to explode the building. Both are thus possible, the substance is used to propel rockets and contains more energy than conventional dynamite. In a television interview on a Danish tv-channel (see below) Dr. Harrit said, on the basis of the residue of iron-rich microspheres, a byproduct of a thermite reaction, that also came to be after ignition of the red-grey chips, that he assumes 10 to 100 tons of explosives saw use on September 11th.\nThe international team of researchers refuses to speculate, how the discovered nano-thermite could have been put into place. This void is filled (among others) by scientists Jim Hoffman and Gordon Ross, which have postulated theories. Most likely both properties of the substance nano-thermite, which is employed for military purposes, were used, in other words pressure/volume work as well as high temperatures. The nano-thermite reaction can melt iron very quickly but can also result in explosions. Thus, the nano-thermite most probably melted respectively weakened the load-bearing columns due to the extreme heat of the chemical reaction. 
The detonations with this explosive were precisely timed, following through floor by floor from top to bottom. Some recordings of the day of the catastrophe and also various eyewitness statements already brought numerous indications for exactly this theory.\nFrom the beginning, the plane crashes themselves and thus locating the terrorist backers were focal points. How though, can one predict such a tragedy, when one doesn't plan it themselves? How did the people, who managed to accomplish the explosive demolitions, manage to take control over multiple planes and steer them into the buildings? How did it happen that a third building collapsed in record time without being affected by a plane crash? Who is behind this and why? Would it have been possible, without the terrible events of this day, to pass all these new security laws in the USA? Many new questions arise in this background.\nOn this website we want to present this research as well as German translations - as well as television and radio interviews of researchers involved. If you have questions to the authors of the paper, please use the contact form. Insofar as they are authentic, important questions, they will be forwarded to, and answered by, the team of authors.",
"meta": {
"lang": "en",
"lang_score": 0.9624165296554565,
"url": "http://911blogger.com/news/2009-05-12/german-website-infos-about-active-thermitic-material-launched",
"timestamp": "2023-11-28T11:00:38Z",
"cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz"
},
"removed_lines_by_refinedweb_rules": [
"http://translate.google.com/translate_t?hl=fr&sl=fr#",
"http://translate.google.com/translate?js=n&prev=_t&hl=fr&ie=UTF-8&u=http...",
"Yours",
"John"
]
}

Toxic Lines: When manually inspecting the data, we found adult ads at the beginning or end of some documents (a sample is shown below), which are hard to remove with document-level filtering strategies. Motivated by this, we developed line-level detoxification that uses the bad-word list from LDNOOBW together with two additional rules (word length < 10, and the line is among the first 3 or the last 3 lines of the document) to remove toxic lines. We consider bad words not only from English but also from other languages.
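
The line-level rule can be sketched as follows. This is only an illustration under stated assumptions: "word length < 10" is read here as "the line has fewer than 10 words", bad words are matched as whitespace-separated tokens (which would not work for languages written without spaces), and `remove_toxic_lines` and the placeholder word list are hypothetical rather than taken from the pipeline.

```python
def remove_toxic_lines(text: str, bad_words: set[str], edge: int = 3, max_words: int = 10) -> str:
    """Drop short lines containing a bad word when they sit near the document edges."""
    lines = text.split("\n")
    n = len(lines)
    kept = []
    for i, line in enumerate(lines):
        # The line is among the first `edge` or last `edge` lines of the document.
        near_edge = i < edge or i >= n - edge
        words = line.lower().split()
        is_short = len(words) < max_words  # assumption: "word length" = number of words
        has_bad = any(w in bad_words for w in words)  # assumption: token-level match
        if near_edge and is_short and has_bad:
            continue
        kept.append(line)
    return "\n".join(kept)
```

Restricting the rule to short lines near the edges keeps it conservative: a bad word inside a long paragraph, or in the middle of the document, does not cause the line to be dropped.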

Toxic Line Examples (WARNING: MAY CONTAIN OFFENSIVE MATERIAL)

Sample documents with toxic lines

{
    "text": "Welcome to Sylor ! this is one of the digital agency for in design street, newyork.\nWelcome to Sylor ! this is one of the digital agency for in design street, newyork.\nWelcome to Sylor ! this is one of the digital agency for in design street, newyork.\nVisual Designer\nNow were up in the big leagues getting' our We finally got a piece of the pie.\nVisual Designer\nNow were up in the big leagues getting' our We finally got a piece of the pie.\n\n男女操逼视频免费 | 2zb0dp.cha0095.cn fp0.pawmoble.cn",
    "meta": {
        "lang": "en",
        "lang_score": 0.9283909201622009,
        "url": "http://3ab.l2pvbvb.cn/portfolio.html",
        "timestamp": "2022-05-16T05:34:05Z",
        "cc-path": "crawl_data/CC-MAIN-2022-21/segments/1652662509990.19/warc/CC-MAIN-20220516041337-20220516071337-00000.warc.gz"
    }
}

Document-Level Filtering

In this section, we introduce each quality signal used to filter out low-quality documents.

Quality Signals Used For Filtering

Overview of all the quality signals that are used for filtering

[
    "fraction_of_duplicate_lines",
    "fraction_of_characters_in_duplicate_lines",
    "fraction_of_characters_in_most_common_ngram",
    "fraction_of_characters_in_duplicate_ngrams",
    "fraction_of_words_corrected_in_lines",
    "fraction_of_lines_ending_with_ellipsis",
    "fraction_of_lines_starting_with_bullet_point",
    "fraction_of_lines_with_toxic_words",
    "num_of_lines_with_toxic_words",
    "num_of_toxic_words",
    "word_count",
    "mean_word_length",
    "num_of_sentences",
    "symbol_to_word_ratio",
    "fraction_of_words_with_alpha_character",
    "num_of_stop_words",
    "num_of_paragraphs",
    "has_curly_bracket",
    "has_lorem_ipsum"
]

Similar to previous sections, we present sample documents filtered out by the given quality signals. Most quality signals were initially introduced by Gopher and subsequently adopted by later studies. However, we observed that, despite following the same descriptions, the implementation of each quality signal can vary significantly among different dataset pipelines, resulting in disparate outcomes for the same quality signals. In our pipeline, we referenced earlier implementations that were publicly available, such as Dolma, DataTrove, and RedPajama V2, and selected the most suitable method based on manual inspections.

Repetition-based Heuristics: Many documents contain repeated sequences, potentially due to crawling errors or low-quality sources. In line with previous work, we choose to remove any document with excessive line, paragraph, or n-gram repetitions.

Fraction of Characters in Repeated Lines: Following Gopher, we remove documents containing multiple, short duplicate passages, as well as those with few, but longer duplicate passages. To achieve this goal, we calculate over the document both the fraction of passages that are duplicates, and the fraction of characters contained within those duplicated passages.

Implementations from Dolma

words = text.split()
word_count = len(words)
character_count = sum(len(word) for word in words)
...
lines = text.split("\n")
line_count = len(lines)
...
line_counts = Counter(lines)
attrs.fraction_of_duplicate_lines = sum(count for line, count in line_counts.items() if count > 1) / max(
    line_count, 1
)
attrs.fraction_of_characters_in_duplicate_lines = sum(
    len(line) * count for line, count in line_counts.items() if count > 1
) / max(character_count, 1)

Implementations from DataTrove

def find_duplicates(x: list[str]) -> tuple[int, int]:
    unique_x = set()
    duplicate_chars = 0
    duplicate_elements = 0
    for element in x:
        if element in unique_x:
            duplicate_chars += len(element)
            duplicate_elements += 1
        else:
            unique_x.add(element)
    return duplicate_elements, duplicate_chars
...
self.paragraph_exp = re.compile(r"\n{2,}")
self._line_splitter = re.compile("\n+")
...
paragraphs = self.paragraph_exp.split(text.strip())
paragraphs_duplicates, char_duplicates = find_duplicates(paragraphs)
if self.dup_para_frac and paragraphs_duplicates / len(paragraphs) > self.dup_para_frac:
    return False, "dup_para_frac"
if self.dup_para_char_frac and char_duplicates / len(text) > self.dup_para_char_frac:
    return False, "dup_para_char_frac"

lines = self._line_splitter.split(text)
line_duplicates, char_duplicates = find_duplicates(lines)
if self.dup_line_frac and line_duplicates / len(lines) > self.dup_line_frac:
    return False, "dup_line_frac"
if self.dup_line_char_frac and char_duplicates / len(text) > self.dup_line_char_frac:
    return False, "dup_line_char_frac"

After evaluating the implementations of Dolma and DataTrove (note: RedPajama V2 does not implement these two quality signals), we have made the following decisions:

Passage Separation: Our manual review of the data revealed that documents extracted using trafilatura do not feature more than one newline symbol separating passages. Testing the splitting pattern "\n{2,}" on 10,000 sample documents resulted in no more than one split. Consequently, we decided to disregard the distinction between lines and paragraphs in our implementation, opting instead to use a single newline symbol to segment the text into passages.

First Occurrence: In line with DataTrove's implementation, we chose to exclude the first occurrence of each repeated line from the duplicate counts. This more conservative strategy helps retain a larger number of documents.

Character Count: We adjusted the method in Dolma for counting characters within lines by excluding whitespace. This modification ensures consistency with the overall document character count calculation.

TxT360 Implementation

words = text.split()
word_count = len(words)
character_count = sum(len(word) for word in words)
...
lines = text.split("\n")
line_count = len(lines)
line_counts = Counter(lines)
attrs.fraction_of_duplicate_lines = (
    sum((count - 1) for line, count in line_counts.items() if count > 1) / line_count
)
attrs.fraction_of_characters_in_duplicate_lines = (
    sum(sum(len(w) for w in line.split()) * (count - 1) for line, count in line_counts.items() if count > 1)
    / character_count
)
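
To make the effect of the first-occurrence and character-count decisions concrete, here is a small self-contained comparison. It is a sketch only: `dolma_style` and `txt360_style` are hypothetical wrapper names that re-express the two snippets above as plain functions returning (fraction of duplicate lines, fraction of characters in duplicate lines).

```python
from collections import Counter


def dolma_style(text: str) -> tuple[float, float]:
    """Counts every copy of a repeated line; line length includes whitespace."""
    character_count = sum(len(w) for w in text.split())
    lines = text.split("\n")
    line_counts = Counter(lines)
    frac_lines = sum(c for c in line_counts.values() if c > 1) / max(len(lines), 1)
    frac_chars = sum(len(l) * c for l, c in line_counts.items() if c > 1) / max(character_count, 1)
    return frac_lines, frac_chars


def txt360_style(text: str) -> tuple[float, float]:
    """Skips the first occurrence; counts only non-whitespace characters."""
    character_count = sum(len(w) for w in text.split())
    lines = text.split("\n")
    line_counts = Counter(lines)
    frac_lines = sum(c - 1 for c in line_counts.values() if c > 1) / max(len(lines), 1)
    frac_chars = sum(
        sum(len(w) for w in l.split()) * (c - 1) for l, c in line_counts.items() if c > 1
    ) / max(character_count, 1)
    return frac_lines, frac_chars


doc = "a b\na b"  # one line repeated once
print(dolma_style(doc))   # (1.0, 1.5)
print(txt360_style(doc))  # (0.5, 0.5)
```

On this two-line toy document the first variant reports a character fraction above 1, because the numerator counts whitespace while the denominator does not. The second variant stays at 0.5 for both signals, in line with the values shown for the two-line sample document below.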

Excessive Line and Character Repetition Filtered Examples

Sample documents filtered by excessive line repetitions / characters in repeated lines


{
    "text": "This article is made up of亚博体育 ，AI learns through the Internet and automatically writes, does not represent our position, reprinted, contact the author and indicate the source：http://www.afconthefield.com/292619lk.html\nThis article is made up of亚博体育 ，AI learns through the Internet and automatically writes, does not represent our position, reprinted, contact the author and indicate the source：http://www.afconthefield.com/292619lk.html",
    "meta": {
        "lang": "en",
        "lang_score": 0.726877748966217,
        "url": "http://afconthefield.com/292619lk.html",
        "timestamp": "2023-01-26T22:53:49Z",
        "cc-path": "crawl-data/CC-MAIN-2023-06/segments/1674764494826.88/warc/CC-MAIN-20230126210844-20230127000844-00000.warc.gz"
    },
    "quality_signals": {
        "fraction_of_duplicate_lines": 0.5,
        "fraction_of_characters_in_duplicate_lines": 0.5,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.057291666666666664
            ],
            [
                3,
                0.06770833333333333
            ],
            [
                4,
                0.08854166666666667
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.837068443367656
            ],
            [
                6,
                0.8166409861325116
            ],
            [
                7,
                0.7967770814682185
            ],
            [
                8,
                0.7765830346475507
            ],
            [
                9,
                0.7532467532467533
            ],
            [
                10,
                0.7272124627113026
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.0,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 54,
        "mean_word_length": 7.111111111111111,
        "num_of_sentences": 1,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 1.0,
        "num_of_stop_words": 16,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false,
        "orig_text_has_dup_lines": false
    }
}

Fraction of Characters in the Most Common N-grams (n=2,3,4): Following Gopher, we remove documents in which a single n-gram accounts for a high proportion of the text. For each n ∈ {2, 3, 4}, we calculate the fraction of characters contained within the most frequently occurring n-gram.

Implementations from Dolma

def all_ngram_counts(words) -> List[Tuple[int, CounterType[Tuple[str, ...]]]]:
    return [(n, Counter(list(zip(*[words[i:] for i in range(n)])))) for n in range(2, 11)]
...
all_counts = all_ngram_counts(words)
count_most_common_ngrams = (2, 3, 4)
for n, ngram_counts in all_counts:
    if not ngram_counts:
        continue
    if n in count_most_common_ngrams:
        most_common_ngram, count = ngram_counts.most_common(1)[0]
        value = count * sum(len(w) for w in most_common_ngram) / max(character_count, 1)
        attrs.fraction_of_characters_in_most_common_ngram.append((n, value))

Implementations from RedPajama-V2

class Base_RPS_Frac_Chars_In_Top_NGram(RPSBase):  # noqa
    # Base class for calculating the fraction of characters in the top N-gram.
    # This operates on the lower-cased, punctuation removed content.
    NGRAM_SIZE: int = None

    __slots__ = []

    def __call__(self, document: Document) -> SignalType:
        if self.NGRAM_SIZE is None:
            raise NotImplementedError(
                "NGRAM_SIZE must be set in the subclass"
            )

        # get the most common ngram
        most_common_ngram = Counter(
            # fetch the ngrams from the document if they exist, otherwise
            # compute them
            getattr(document, f"norm_{self.NGRAM_SIZE}grams", None)
            or form_ngrams(iter(document.normalized_words), self.NGRAM_SIZE)
        ).most_common(1)

        if len(most_common_ngram) == 0:
            return [(0, len(document), 0.0)]

        ngram, count = most_common_ngram[0]

        if count <= 1:
            return [(0, len(document), 0.0)]

        total_chars = sum(len(w) for w in document.normalized_words)
        score = sum(len(w) for w in ngram) * count / total_chars
        score = round(score, PRECISION)
        return [(0, len(document), score)]

Implementations from DataTrove

def get_n_grams(words: list[str], n: int) -> list[str]:
    return [" ".join(words[i : i + n]) for i in range(len(words) - n + 1)]

def find_top_duplicate(x: list[str]) -> int:
    counter = Counter()
    for element in x:
        counter[element] += 1
    top_n_gram = counter.most_common(1)[0]
    return len(top_n_gram[0]) * top_n_gram[1]
...
for n, n_frac in self.top_n_grams:
    n_grams = get_n_grams(words, n)
    if not n_grams:
        continue
    top_char_length = find_top_duplicate(n_grams)
    if top_char_length / len(text) > n_frac:
        return False, f"top_n_gram"

The implementations of the fraction of characters in the most common n-gram are nearly identical. Each one counts the occurrences of every n-gram and selects the most common one. The fraction is then the number of characters covered by all occurrences of that n-gram divided by the total number of characters. One minor difference is that Dolma and DataTrove calculate the fraction even if the most common n-gram appears only once, while RedPajama V2 skips this case. We choose to follow Dolma and DataTrove by not skipping cases where the most common n-gram occurs only once. In practice, documents affected by this rule (where the most common n-gram exceeds a given threshold and occurs only once) tend to be short.

TxT360 Implementation

def all_ngram_counts_new(words) -> List[Tuple[int, CounterType[Tuple[str, ...]]]]:
    return [(n, list(zip(*[words[i:] for i in range(n)]))) for n in range(2, 11)]
...
all_counts = all_ngram_counts_new(words)
count_most_common_ngrams = (2, 3, 4)
for n, ngram_counts in all_counts:
    if not ngram_counts:
        continue
    if n in count_most_common_ngrams:
        most_common_ngram, count = Counter(ngram_counts).most_common(1)[0]
        value = count * sum(len(w) for w in most_common_ngram) / character_count
        attrs.fraction_of_characters_in_most_common_ngram.append((n, value))
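
As a worked example of this signal, the following self-contained sketch computes the fraction for a single n on a short string. `most_common_ngram_fraction` is a hypothetical helper written for illustration; it follows the same arithmetic as the snippet above but is not part of the pipeline.

```python
from collections import Counter
from typing import Optional


def most_common_ngram_fraction(text: str, n: int) -> Optional[float]:
    """Characters covered by the most common word n-gram / all non-space characters."""
    words = text.split()
    character_count = sum(len(w) for w in words)
    ngrams = list(zip(*[words[i:] for i in range(n)]))
    if not ngrams:
        return None  # the document has fewer than n words
    ngram, count = Counter(ngrams).most_common(1)[0]
    return count * sum(len(w) for w in ngram) / character_count


# ("garage", "door") occurs twice: 2 * 10 characters out of 26.
print(most_common_ngram_fraction("garage door garage door repair", 2))
# A two-word document has one bigram that occurs once and covers everything.
print(most_common_ngram_fraction("hello world", 2))  # 1.0
```

The second call shows why not skipping n-grams that occur only once mostly affects short documents: with very few words, a single n-gram can cover most or all of the characters.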

Documents Filtered Using Most Common n-Grams (n=2,3,4)

Sample documents filtered by the fraction of characters in the most common n-grams (n=2,3,4)


{
    "text": "Garage Door Services Garage Door Track Replacement Residential Garage Door Garage Door Garage Door Maintenance Garage Door Repair Near me\nRecent Posts\n- Preventing Commercial Garage Door Crises With Regular Maintenance And Inspections\n- When Your Business Needs Help: A Guide To Emergency Commercial Garage Door Services\n- What kind of training and expertise should I expect from technicians providing emergency commercial garage door services?\n- What are the most common commercial garage door emergencies and how can they be addressed?\n- How Important Is It For Businesses To Have A Go-To Emergency Commercial Garage Door Service?",
    "meta": {
        "lang": "en",
        "lang_score": 0.7771062254905701,
        "url": "http://palmsgaragedoors.com/project/garage-door-track-replacement-los-gatos/attachment/garage-door-track-replacement-los-gatos-5/",
        "timestamp": "2023-11-28T10:52:42Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz",
        "url_score": 0.0
    },
    "quality_signals": {
        "url_score": 0.0,
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.20522388059701493
            ],
            [
                3,
                0.1865671641791045
            ],
            [
                4,
                0.1623134328358209
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.0
            ],
            [
                6,
                0.0
            ],
            [
                7,
                0.0
            ],
            [
                8,
                0.0
            ],
            [
                9,
                0.0
            ],
            [
                10,
                0.0
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.7142857142857143,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 97,
        "mean_word_length": 5.525773195876289,
        "num_of_sentences": 3,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 0.9484536082474226,
        "num_of_stop_words": 12,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false,
        "orig_text_has_dup_lines": false
    }
}

Fraction of Characters in Duplicated N-grams (n=5,...,10): Following Gopher, we remove documents with a high proportion of duplicated n-grams. For each n ∈ {5, ..., 10}, we calculate the fraction of characters contained within all duplicate n-grams, taking care not to count characters that occur in overlapping n-grams more than once.

Implementations from Dolma

def all_ngram_counts(words) -> List[Tuple[int, CounterType[Tuple[str, ...]]]]:
    return [(n, Counter(list(zip(*[words[i:] for i in range(n)])))) for n in range(2, 11)]
...
all_counts = all_ngram_counts(words)
for n, ngram_counts in all_counts:
    if not ngram_counts:
        continue
    if n in count_most_common_ngrams:
        ...
    else:
        ng_char_count = sum(count * sum(len(w) for w in ng) for ng, count in ngram_counts.items())
        value = sum(
            count * sum(len(w) for w in ng) for ng, count in ngram_counts.items() if count > 1
        ) / max(ng_char_count, 1)
        attrs.fraction_of_characters_in_duplicate_ngrams.append((n, value))

Implementations from RedPajama-V2

class Base_RPS_Frac_Chars_In_Dupe_NGrams(RPSBase):  # noqa
    # Base class for calculating the fraction of characters in
    # duplicate word N-grams. This operates on the lower-cased,
    # punctuation removed content. The function also ensures that
    # characters in overlapping ngrams are only counted once.
    NGRAM_SIZE: int = None

    __slots__ = []

    def __call__(self, document: Document) -> SignalType:
        if self.NGRAM_SIZE is None:
            raise NotImplementedError(
                "NGRAM_SIZE must be set in the subclass"
            )

        if len(document.normalized_words) < self.NGRAM_SIZE:
            return [(0, len(document), 0.0)]

        # fetch the ngrams from the document if they exist, otherwise
        # compute them
        doc_n_grams = (
            getattr(document, f"norm_{self.NGRAM_SIZE}grams", None)
            or tuple(form_ngrams(
                iter(document.normalized_words), self.NGRAM_SIZE
            ))
        )

        # keep only ngrams which occur at least twice
        ngram_dupes = {
            ngram for ngram, count in Counter(doc_n_grams).items() if count > 1
        }

        duplicated_grams = np.zeros(len(document.normalized_words), dtype=int)

        i = 0
        for ngram in doc_n_grams:
            if ngram in ngram_dupes:
                duplicated_grams[i: i + self.NGRAM_SIZE] = 1
            i += 1

        word_lengths = np.array(list(map(len, document.normalized_words)))
        chars_duped = np.sum(word_lengths * duplicated_grams)
        total_chars = np.sum(word_lengths)

        if total_chars == 0:
            return [(0, len(document), 0.0)]

        score = float(chars_duped / total_chars)
        score = round(score, PRECISION)
        return [(0, len(document), score)]

Implementations from DataTrove

def find_all_duplicate(words: list[str], n: int) -> int:
    n_words = len(words)
    unique = set()
    repeated_chars, idx = 0, 0
    while idx < n_words - n + 1:
        n_gram = "".join(words[idx : idx + n])
        if n_gram in unique:
            repeated_chars += len(n_gram)
            idx += n
        else:
            unique.add(n_gram)
            idx += 1
    assert repeated_chars <= len("".join(words))
    return repeated_chars
...
for n, n_frac in self.dup_n_grams:
    n_duplicates_char = find_all_duplicate(words, n)
    if n_duplicates_char / len(text) > n_frac:
        return False, f"duplicated_n_grams"

To compute the fraction of characters in duplicate n-grams, Dolma uses the number of characters in all n-grams (counting overlaps) as the denominator, and the number of characters in all duplicated n-grams (counting overlaps) as the numerator.

RedPajama V2 uses the number of characters in the words of the document (without overlaps) as the denominator, and the number of characters that are recognized as part of a duplicate n-gram as the numerator.

DataTrove uses the number of all characters in the document (including white spaces, without overlaps) as the denominator, and the number of characters that are recognized as part of a duplicate n-gram as the numerator. However, there is a mismatch in DataTrove’s calculation: the number of characters in the duplicated n-grams excludes white spaces, while the total character count of the document does not.

We decided to use the RedPajama V2 implementation but skip the 1st occurrence of the duplicate n-gram.

TxT360 Implementation

def get_dup_ngram_frac(n, doc_n_grams, text):
    # fetch the ngrams from the document if they exist, otherwise compute them
    # doc_n_grams = list(zip(*[words[i:] for i in range(n)]))
    duplicated_grams = np.zeros(len(text.split()), dtype=int)
    unique_ngrams = set()
    for i, ngram in enumerate(doc_n_grams):
        if ngram in unique_ngrams:
            duplicated_grams[i: i + n] = 1
        else:
            unique_ngrams.add(ngram)

    word_lengths = np.array(list(map(len, text.split())))
    chars_duped = np.sum(word_lengths * duplicated_grams)
    total_chars = np.sum(word_lengths)
    return float(chars_duped / total_chars)

def all_ngram_counts_new(words) -> List[Tuple[int, CounterType[Tuple[str, ...]]]]:
    return [(n, list(zip(*[words[i:] for i in range(n)]))) for n in range(2, 11)]

...

all_counts = all_ngram_counts_new(words)
count_most_common_ngrams = (2, 3, 4)
for n, ngram_counts in all_counts:
    if not ngram_counts:
        continue
    if n in count_most_common_ngrams:
        ...
    else:
        score = get_dup_ngram_frac(n, ngram_counts, text)
        attrs.fraction_of_characters_in_duplicate_ngrams.append((n, score))

Comparison of Coding Implementations

Consider n = 5 and the sample sentence:

"word_a word_b word_c word_d word_e word_f word_g word_a word_b word_c word_d word_e word_f word_g word_a word_b word_c"

In Dolma's implementation, there are 13 5-grams in total, 6 of which are duplicated. The resulting fraction of characters in duplicate 5-grams is 6/13.
In RedPajama V2's implementation, there are 17*6 characters in total and 14*6 characters that are contained in duplicate 5-grams. The fraction is 14/17.
In DataTrove's implementation, there are 17*6 + 16 (white spaces) characters in total and 10*6 characters in duplicated 5-grams after excluding the first occurrence. The resulting fraction is 10*6/(17*6+16).
In our implementation, there are 17*6 characters in total, with 10*6 characters that are duplicated after excluding the first occurrence. This results in a fraction of 10/17.
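The DataTrove and TxT360 figures for this sample can be checked with a short, dependency-free sketch. The helper names `txt360_style_fraction` and `datatrove_style_fraction` are ours, chosen for illustration; they only mirror the logic of the snippets quoted above (with plain whitespace splitting assumed in both cases) and are not the pipelines' actual code.

```python
def txt360_style_fraction(words, n):
    """Fraction of word characters inside repeated n-grams, skipping first occurrences."""
    seen = set()
    duplicated = [False] * len(words)
    for i in range(len(words) - n + 1):
        ngram = tuple(words[i:i + n])
        if ngram in seen:
            # mark every word covered by this repeated n-gram; overlaps count once
            for j in range(i, i + n):
                duplicated[j] = True
        else:
            seen.add(ngram)
    total_chars = sum(len(w) for w in words)
    if total_chars == 0:
        return 0.0
    duped_chars = sum(len(w) for w, d in zip(words, duplicated) if d)
    return duped_chars / total_chars


def datatrove_style_fraction(text, n):
    """Characters of repeated n-grams (no spaces) over all characters (with spaces)."""
    words = text.split()  # assumption: whitespace split instead of a tokenizer
    seen = set()
    repeated_chars, idx = 0, 0
    while idx < len(words) - n + 1:
        n_gram = "".join(words[idx:idx + n])
        if n_gram in seen:
            repeated_chars += len(n_gram)
            idx += n  # jump past the repeated n-gram
        else:
            seen.add(n_gram)
            idx += 1
    return repeated_chars / len(text) if text else 0.0


# the 17-word sample sentence from the comparison above
sample = " ".join(f"word_{c}" for c in "abcdefg" * 2 + "abc")
txt360_score = txt360_style_fraction(sample.split(), 5)   # 10/17
datatrove_score = datatrove_style_fraction(sample, 5)     # 60/118
print(txt360_score, datatrove_score)
```

The only difference between the two results is the denominator: the same 60 characters are flagged, but the DataTrove-style version divides by the 118 characters of the raw text, white spaces included.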

Documents Filtered by Duplicated n-Grams (n=5,...,10)

Sample documents filtered by the fraction of characters in duplicated n-grams (n=5,...,10)

Data sample: 0 of 99

{
    "text": "We use cookies to enable essential functionality on our website, and analyze website traffic. By clicking Accept you consent to our use of cookies. Read about how we use cookies.\nWe use cookies to enable essential functionality on our website, and analyze website traffic. Read about how we use cookies.\nThese cookies are strictly necessary to provide you with services available through our websites. You cannot refuse these cookies without impacting how our websites function. You can block or delete them by changing your browser settings, as described under the heading \"Managing cookies\" in the Privacy and Cookies Policy.\nThese cookies collect information that is used in aggregate form to help us understand how our websites are being used or how effective our marketing campaigns are.",
    "meta": {
        "lang": "en",
        "lang_score": 0.928011953830719,
        "url": "http://aboutsources.com/formal/",
        "timestamp": "2023-11-28T08:41:54Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz",
        "url_score": 0.0
    },
    "quality_signals": {
        "url_score": 0.0,
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.029985007496251874
            ],
            [
                3,
                0.035982008995502246
            ],
            [
                4,
                0.041979010494752625
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.15742128935532235
            ],
            [
                6,
                0.15742128935532235
            ],
            [
                7,
                0.1199400299850075
            ],
            [
                8,
                0.1199400299850075
            ],
            [
                9,
                0.1199400299850075
            ],
            [
                10,
                0.1199400299850075
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.0,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 126,
        "mean_word_length": 5.2936507936507935,
        "num_of_sentences": 9,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 1.0,
        "num_of_stop_words": 21,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false,
        "orig_text_has_dup_lines": false
    }
}

Line-wise Heuristics: Some line-wise information could also be helpful to distinguish low-quality and high-quality documents. Following RefinedWeb, we remove the document if the corrected lines represent more than 5% of words. In line with previous works, we remove the documents if more than 30% of the lines end with an ellipsis or more than 90% of lines start with a bullet point.

Ellipsis Symbol Identification Implementations

Dolma:

ELLIPSIS_SYMBOLS = ("…")

RedPajamaV2:

ELLIPSIS_SYMBOLS = ("...", "…")

DataTrove:

ELLIPSIS_SYMBOLS = ("...", "…")

TxT360:

ELLIPSIS_SYMBOLS = ("...", "…", "[...]", "[…]")

Bullet Point Identification Implementations

Dolma:

BULLET_POINTS = ("*", "-")

RedPajamaV2:

BULLET_POINT_SYMBOLS = (
    "•",  # bullet point
    "‣",  # triangular bullet point
    "▶",  # black right pointing triangle
    "◀",  # black left pointing triangle
    "◦",  # white bullet point
    "■",  # black square
    "□",  # white square
    "▪",  # black small square
    "▫",  # white small square
    "–",  # en dash
)

DataTrove:

BULLET_POINT_SYMBOLS = ("•" , "-")

TxT360:

BULLET_POINT_SYMBOLS = (
    "•",  # • bullet point
    "‣",  # ‣ triangular bullet point
    "▶",  # ▶ black right pointing triangle
    "◀",  # ◀ black left pointing triangle
    "◦",  # ◦ white bullet point
    "■",  # ■ black square
    "□",  # □ white square
    "▪",  # ▪ black small square
    "▫",  # ▫ white small square
    "-",  # - en dash
    "–",  # – dash
    "—",  # — zh dash
    "*",  # * star
)
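As a rough illustration of how these symbol lists feed the line-wise thresholds (more than 30% of lines ending with an ellipsis, more than 90% starting with a bullet point), here is a minimal sketch. The function names are hypothetical, and splitting on newlines while ignoring blank lines is our assumption rather than a documented detail of the TxT360 pipeline.

```python
ELLIPSIS_SYMBOLS = ("...", "…", "[...]", "[…]")
BULLET_POINT_SYMBOLS = ("•", "‣", "▶", "◀", "◦", "■", "□", "▪", "▫", "-", "–", "—", "*")


def line_fractions(text):
    """Return (fraction of lines ending with an ellipsis, fraction starting with a bullet)."""
    # assumption: blank lines are dropped before computing the fractions
    lines = [line.strip() for line in text.split("\n") if line.strip()]
    if not lines:
        return 0.0, 0.0
    # str.endswith / str.startswith accept a tuple of candidates
    n_ellipsis = sum(line.endswith(ELLIPSIS_SYMBOLS) for line in lines)
    n_bullets = sum(line.startswith(BULLET_POINT_SYMBOLS) for line in lines)
    return n_ellipsis / len(lines), n_bullets / len(lines)


def fails_line_wise_heuristics(text, max_ellipsis=0.3, max_bullets=0.9):
    """True if the document would be removed under the thresholds quoted above."""
    ellipsis_frac, bullet_frac = line_fractions(text)
    return ellipsis_frac > max_ellipsis or bullet_frac > max_bullets


teaser_page = "Archive\nFirst teaser…\n- a list item\nSecond teaser...\nPlain closing line"
fractions = line_fractions(teaser_page)
print(fractions, fails_line_wise_heuristics(teaser_page))
```

Pages made of truncated post teasers tend to trip the ellipsis threshold, since each teaser line is cut off with "…".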

Documents Filtered by Line-Wise Heuristics

Sample documents that are filtered out by line-wise heuristics

Data sample: 0 of 99

{
    "text": "Month: September 2019\n2 Things To Consider Before Accepting A Job Offer\nYou spend several days to write an impressive resume. You mailed it to several companies and now you are patiently waiting for their response. After…\nTips for choosing the Right School for your Child\nSo, your child has finally reached the school-going age? Well, that is good news. Did you know that the type of school you choose for…\nDoes neurofeedback help people with ADHD? Find out here\nSadly, Attention Deficit Hyperactivity Disorder or ADHD affects a lot of people around the world. They usually lose their impulse control, their attention is always…",
    "meta": {
        "lang": "en",
        "lang_score": 0.9607195258140564,
        "url": "http://cadcamperformance.com/2019/09",
        "timestamp": "2023-11-28T09:49:03Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz",
        "url_score": 0.0
    },
    "quality_signals": {
        "url_score": 0.0,
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.03435114503816794
            ],
            [
                3,
                0.03625954198473282
            ],
            [
                4,
                0.03816793893129771
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.0
            ],
            [
                6,
                0.0
            ],
            [
                7,
                0.0
            ],
            [
                8,
                0.0
            ],
            [
                9,
                0.0
            ],
            [
                10,
                0.0
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.42857142857142855,
        "fraction_of_lines_starting_with_bullet_point": 0.0,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 105,
        "mean_word_length": 4.9904761904761905,
        "num_of_sentences": 7,
        "symbol_to_word_ratio": 0.02857142857142857,
        "fraction_of_words_with_alpha_character": 0.9809523809523809,
        "num_of_stop_words": 21,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false,
        "orig_text_has_dup_lines": false
    }
}

Statistics-based Heuristics: We summarize other statistics-based rules that originated from Gopher in this section. The statistics used include:

the word count in the document
the mean word length
the number of sentences
the symbol-to-word ratio
the fraction of alphabetic words
and the number of stop words

Specifically, we remove any document which satisfies any of the following criteria:

it contains fewer than 50 words or more than 100,000 words
its mean word length is outside the range of 3 to 10
it contains fewer than 3 sentences
its symbol-to-word ratio is greater than 0.1
fewer than 80% of its words contain at least one alphabetic character
it contains fewer than two of the stop words (the, be, to, of, and, that, have, with)
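The criteria above can be read as a single predicate over the quality signals stored with each document. The sketch below assumes the signals arrive as a dictionary keyed by the field names that appear in the sample records of this post; the function name `statistics_reject_reasons` is hypothetical and this is not the pipeline's actual code.

```python
def statistics_reject_reasons(signals):
    """Return the names of the statistics-based criteria a document violates."""
    reasons = []
    if not 50 <= signals["word_count"] <= 100_000:
        reasons.append("word_count")
    if not 3 <= signals["mean_word_length"] <= 10:
        reasons.append("mean_word_length")
    if signals["num_of_sentences"] < 3:
        reasons.append("num_of_sentences")
    if signals["symbol_to_word_ratio"] > 0.1:
        reasons.append("symbol_to_word_ratio")
    if signals["fraction_of_words_with_alpha_character"] < 0.8:
        reasons.append("fraction_of_words_with_alpha_character")
    if signals["num_of_stop_words"] < 2:
        reasons.append("num_of_stop_words")
    return reasons


# signal values shaped like a short blog-comment page: only the word count fails
short_page = {
    "word_count": 35,
    "mean_word_length": 4.6571428571428575,
    "num_of_sentences": 5,
    "symbol_to_word_ratio": 0.0,
    "fraction_of_words_with_alpha_character": 0.9714285714285714,
    "num_of_stop_words": 3,
}
print(statistics_reject_reasons(short_page))
```

A document is removed as soon as the returned list is non-empty; keeping the reasons around makes it easier to audit which rule is responsible for most removals.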

Word Count Filters

Implementations from Dolma

words = text.split()
word_count = len(words)

Implementations from RedPajama-V2

# the normalized content: lowercased and punctuation removed
self._normalized_content = normalize(content)
self._normalized_words = tuple(self._normalized_content.split())
self._num_normalized_words = len(self._normalized_words)

...

def normalize(
        text: str,
        remove_punct: bool = True,
        lowercase: bool = True,
        nfd_unicode: bool = True,
        white_space: bool = True
) -> str:
    # Normalize the text by lowercasing and removing punctuation.
    # remove punctuation
    if remove_punct:
        text = text.translate(TRANSLATION_TABLE_PUNCTUATION)

    # lowercase
    if lowercase:
        text = text.lower()

    if white_space:
        text = text.strip()
        text = re.sub(r"\s+", " ", text)

    # NFD unicode normalization
    if nfd_unicode:
        text = unicodedata.normalize("NFD", text)

    return text

Implementations from DataTrove

words = self.tokenizer.word_tokenize(text)
n_words = len(words)

non_symbol_words = [w for w in words if any(ch not in PUNCTUATION_SET for ch in w)]
n_non_symbol_words_words = len(non_symbol_words)

Both Dolma and RedPajama V2 split texts into words using white spaces and newline symbols. However, DataTrove employs a tokenizer to split texts into words and ignore punctuation, resulting in a higher word count compared to simple `text.split()`. We decided to use simple `len(text.split())` to compute the word count.

Mean Word Length: There is minimal variation among existing pipeline implementations. We simply compute the mean word length as follows:

words = text.split()
word_count = len(words)
character_count = sum(len(word) for word in words)
mean_word_length = character_count / word_count

It's worth noting that Dolma used the median word length instead of the mean:

from statistics import median

median_word_length = median(len(word) for word in words)

Number of Sentences: The only publicly available implementation of this quality signal is from RedPajama V2, which uses regular expressions to split text into sentences.

Implementations from RedPajama-V2

class RPS_Doc_Num_Sentences(RPSBase):  # noqa
    # The number of sentences in the content. This is calculated using
    # the regex r'[^.!?]+[.!?]*'
    SENT_PATTERN = re.compile(r'[^.!?]+[.!?]*', flags=re.UNICODE)
    __slots__ = ()

    def __call__(self, document: Document) -> SignalType:
        # count the number of sentences in the content using regex
        score = float(len(self.SENT_PATTERN.findall(document.raw_content)))
        return [(0, len(document), score)]

However, we found that this approach can mistakenly interpret periods in URLs as sentence endings. To address this, we opted to use `nltk.tokenize.sent_tokenize` for more accurate sentence splitting.
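To see the URL problem concretely, the regular expression quoted above can be applied to a toy string. This is only a small demonstration using the standard `re` module; the example text and the helper name `regex_sentence_count` are made up for illustration.

```python
import re

# the sentence pattern from the RedPajama-V2 snippet quoted above
SENT_PATTERN = re.compile(r"[^.!?]+[.!?]*", flags=re.UNICODE)


def regex_sentence_count(text):
    """Count sentences as the number of regex matches."""
    return len(SENT_PATTERN.findall(text))


plain = "Hello there. Bye!"
with_url = "Visit https://example.com/a.b for details. Thanks!"

plain_count = regex_sentence_count(plain)        # 2 matches
with_url_count = regex_sentence_count(with_url)  # 4 matches: the URL periods split it
print(plain_count, with_url_count)
```

Both strings contain two sentences, but every period inside the URL ends a match, so the second string is counted as four.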

TxT360 Implementation

from nltk.tokenize import sent_tokenize

...

def count_sentences(text):
    sentences = sent_tokenize(text)
    return len(sentences)

...

attrs.num_of_sentences = count_sentences(text)

Symbol to Word Ratio: Following RedPajama-V2 and DataTrove, we use the symbols of ("#", "...", "…"). We calculate the ratio as the number of symbols divided by the total number of words.

Implementations from Dolma

SYMBOLS = ("#", "…")

...

attrs.symbol_to_word_ratio = sum(1 for word in words if any(s in word for s in SYMBOLS)) / max(
    word_count, 1
)

Implementations from RedPajama-V2

class RPS_Doc_Symbol_To_Word_Ratio(RPSBase):  # noqa
    # The ratio of symbols to words in the content. This is analogous to
    # the signal used in Gopher. Symbols are defined "#", "...", and "…".
    SYMBOLS = ("#", "...", "…")
    __slots__ = ()

    def __call__(self, document: Document) -> SignalType:
        num_words = document.num_raw_words

        if num_words == 0:
            return [(0, len(document), None)]

        # count the number of symbols in the content
        num_symbols = float(sum(
            document.raw_content.count(x) for x in self.SYMBOLS
        ))

        score = num_symbols / num_words
        score = round(score, PRECISION)
        return [(0, len(document), score)]

Implementations from DataTrove

if self.max_symbol_word_ratio and text.count("#") / n_words > self.max_symbol_word_ratio:
    return False, "gopher_too_many_hashes"
if self.max_symbol_word_ratio and (text.count("...") + text.count("…")) / n_words > self.max_symbol_word_ratio:
    return False, "gopher_too_many_ellipsis"

TxT360 Implementation

SYMBOLS = ("#", "...", "…")

...

symbol_pattern = re.compile("|".join(re.escape(symbol) for symbol in SYMBOLS))

...

attrs.symbol_to_word_ratio = sum(1 for word in words if symbol_pattern.search(word)) / word_count
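Reading the snippets above side by side, the RedPajama-V2 version counts every symbol occurrence in the raw text, while the Dolma and TxT360 snippets count words that contain at least one symbol. The sketch below contrasts the two on a toy string; the function names are hypothetical and whitespace splitting is assumed for the word count in both.

```python
import re

SYMBOLS = ("#", "...", "…")
symbol_pattern = re.compile("|".join(re.escape(symbol) for symbol in SYMBOLS))


def ratio_by_occurrence(text):
    """Symbol occurrences in the raw text divided by the number of words."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    return sum(text.count(symbol) for symbol in SYMBOLS) / n_words


def ratio_by_word(text):
    """Words containing at least one symbol divided by the number of words."""
    words = text.split()
    if not words:
        return 0.0
    return sum(1 for word in words if symbol_pattern.search(word)) / len(words)


toy = "## title # tag and so on..."
occurrence_ratio = ratio_by_occurrence(toy)  # 4 symbols over 7 words
word_ratio = ratio_by_word(toy)              # 3 words over 7 words
print(occurrence_ratio, word_ratio)
```

The two agree whenever each word holds at most one symbol; they diverge on tokens such as "##", which the occurrence-based count treats as two symbols.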

Fraction of Alphabetic Words

Implementations from Dolma

attrs.fraction_of_words_with_alpha_character = sum(
    1 for word in words if any(c.isalpha() for c in word)
) / max(word_count, 1)

Implementations from RedPajama-V2

class RPS_Doc_Frac_No_Alph_Words(RPSBase):  # noqa
    ALPH_REGEX = re.compile(r"[a-zA-Z]")
    __slots__ = ()

    def __call__(self, document: Document) -> SignalType:
        num_words = document.num_raw_words

        if num_words == 0:
            return [(0, len(document), None)]

        num_words_with_alpha = float(sum(
            int(self.ALPH_REGEX.search(word) is not None)
            for word in document.raw_words
        ))

        score = 1.0 - num_words_with_alpha / num_words
        score = round(score, PRECISION)
        return [(0, len(document), score)]

Implementations from DataTrove

# that 80 % of words in a document contain at least one alphabetic character
if (
    self.max_non_alpha_words_ratio
    and sum([any((c.isalpha() for c in w)) for w in words]) / n_words < self.max_non_alpha_words_ratio
):
    return False, "gopher_below_alpha_threshold"

Both Dolma and DataTrove use `char.isalpha()` to detect whether a word contains alphabetic characters while RedPajama-V2 employs regular expressions for this purpose. We opt to use regular expressions since `char.isalpha()` can also match words in other languages as long as they are not punctuations.
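The difference between the two detection styles shows up on non-Latin text. The following sketch compares them on a toy word list; the function names are hypothetical, and the `[a-zA-Z]` pattern is the one from the RedPajama-V2 snippet above.

```python
import re

ALPH_REGEX = re.compile(r"[a-zA-Z]")


def alpha_fraction_isalpha(words):
    """Fraction of words with at least one character for which str.isalpha() is true."""
    if not words:
        return 0.0
    return sum(1 for w in words if any(c.isalpha() for c in w)) / len(words)


def alpha_fraction_regex(words):
    """Fraction of words with at least one ASCII letter."""
    if not words:
        return 0.0
    return sum(1 for w in words if ALPH_REGEX.search(w)) / len(words)


toy_words = ["hello", "日本語", "1234", "naïve"]
isalpha_fraction = alpha_fraction_isalpha(toy_words)  # counts "日本語" as alphabetic
regex_fraction = alpha_fraction_regex(toy_words)      # does not
print(isalpha_fraction, regex_fraction)
```

`str.isalpha()` is true for any Unicode letter, so CJK words pass the check, whereas the ASCII-only regular expression treats them like digits or punctuation.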

Number of Stop Words: The implementations across existing pipelines are largely identical. We adopt them and apply them to our pipeline.

TxT360 Implementation

STOP_WORDS = ('the', 'be', 'to', 'of', 'and', 'that', 'have', 'with')

...

stop_words_pattern = re.compile("|".join(re.escape(symbol) for symbol in STOP_WORDS))

...

attrs.num_of_stop_words = sum(1 for word in words if stop_words_pattern.search(word))

Documents Filtered by Statistics-Based Heuristics

Sample documents that are filtered out by statistics-based heuristics

Data sample: 0 of 99

{
    "text": "You should open your own Breakfast-Diner. I would be your everyday-guest, even If I had to move away from germany for that :D\nmums mums mums!\nIvy: Hahaha! Good idea. :)Melwa: :)\nSkicka en kommentar",
    "meta": {
        "lang": "en",
        "lang_score": 0.8643032908439636,
        "url": "http://365daysofbreakfast.blogspot.com/2011/07/monday.html",
        "timestamp": "2023-11-28T09:40:02Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz",
        "url_score": 0.0
    },
    "quality_signals": {
        "url_score": 0.0,
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.05521472392638037
            ],
            [
                3,
                0.07975460122699386
            ],
            [
                4,
                0.10429447852760736
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.0
            ],
            [
                6,
                0.0
            ],
            [
                7,
                0.0
            ],
            [
                8,
                0.0
            ],
            [
                9,
                0.0
            ],
            [
                10,
                0.0
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.0,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 35,
        "mean_word_length": 4.6571428571428575,
        "num_of_sentences": 5,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 0.9714285714285714,
        "num_of_stop_words": 3,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": false,
        "orig_text_has_dup_lines": false
    }
}

Additional Filters: Following C4, we remove any page where the phrase “lorem ipsum” appeared since some pages had placeholder “lorem ipsum” text.

Documents Containing 'lorem ipsum'

Sample documents containing 'lorem ipsum'

Data sample: 0 of 10

{
    "text": "Our story\nBe your own kind of beautiful\nWe are the biggest day Spa chain in US. We are available at more than 65 areas, crosswise over 19 urban areas in US. We revive more than 1500 visitors consistently. In a limited capacity to focus 8 years, our image has turned out to be synonymous with wellbeing and salon administrations of worldwide gauges. Our group of master rejuvenators offer altered wellbeing arrangements according to the necessities of every person.\nOur Mission\nProducing the highest quality products\nStart with something pure, something good for you, and something that makes you feel pampered like a princess. We’re talking about clean beauty gift sets, of course – and we’ve got a bouquet of beauties for yourself or someone you love.\nWilliam JacobRelax Massage\nSofia JonesRelax Massage\nSome of our offers\n30% Off Holiday Gift Sets\nOur Blog\nAt vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.",
    "meta": {
        "lang": "en",
        "lang_score": 0.8810704350471497,
        "url": "https://alchemiespa.com/home-4/?add-to-cart=166&add_to_wishlist=176",
        "timestamp": "2023-11-28T10:28:53Z",
        "cc-path": "crawl-data/CC-MAIN-2023-50/segments/1700679099281.67/warc/CC-MAIN-20231128083443-20231128113443-00000.warc.gz",
        "url_score": 0.0
    },
    "quality_signals": {
        "url_score": 0.0,
        "fraction_of_duplicate_lines": 0.0,
        "fraction_of_characters_in_duplicate_lines": 0.0,
        "fraction_of_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_duplicate_paragraphs": 0.0,
        "fraction_of_characters_in_most_common_ngram": [
            [
                2,
                0.012062726176115802
            ],
            [
                3,
                0.016887816646562123
            ],
            [
                4,
                0.016887816646562123
            ]
        ],
        "fraction_of_characters_in_duplicate_ngrams": [
            [
                5,
                0.0
            ],
            [
                6,
                0.0
            ],
            [
                7,
                0.0
            ],
            [
                8,
                0.0
            ],
            [
                9,
                0.0
            ],
            [
                10,
                0.0
            ]
        ],
        "fraction_of_words_corrected_in_lines": 0.0,
        "fraction_of_lines_ending_with_ellipsis": 0.0,
        "fraction_of_lines_starting_with_bullet_point": 0.0,
        "fraction_of_lines_with_toxic_words": 0.0,
        "num_of_lines_with_toxic_words": 0,
        "num_of_toxic_words": 0,
        "word_count": 170,
        "mean_word_length": 4.876470588235295,
        "num_of_sentences": 9,
        "symbol_to_word_ratio": 0.0,
        "fraction_of_words_with_alpha_character": 0.9647058823529412,
        "num_of_stop_words": 35,
        "num_of_paragraphs": 0,
        "has_curly_bracket": false,
        "has_lorem_ipsum": true,
        "orig_text_has_dup_lines": false
    }
}

Curated Sources Processing

What This Section Contains

This section provides a complete discussion on the filtering applied to the 14 curated sources that comprise the non-Common Crawl data section of TxT360. The section is split into the following topic areas:

Curated Sources Data Processing Summary
Individual Filtering Discussion for Each Source
Estimated Reading Time: 25 minutes

Domain Specific Curated Sources

While massive amounts of data can be crawled and obtained from the Internet, certain sources contain data in additional formats (e.g. PDF documents), or are organized and published as official dumps (e.g. Wikipedia). We refer to these sources as curated sources. These datasets often comprise high-quality, domain-specific data, such as academic publications or domain-specific discussions. TxT360 was strongly influenced by The Pile regarding both the inclusion of datasets and the filtering techniques.

These sources, such as ArXiv, Wikipedia, and Stack Exchange, provide high-quality data. As mentioned above, they are excluded from the web dataset via URL matching. Details about each of the sources are provided below.

TxT360 respects the copyright of the data sources and has not included the controversial data that was used in The Pile, such as YouTube and OpenSubtitles, Reddit threads, and Books3.

Filtering Steps and Definitions

Data preprocessing is a crucial step in the data science pipeline. It involves cleaning and transforming raw data into a format that is suitable for analysis. This process includes handling missing values, normalizing data, encoding categorical variables, and more.

The Language Filter removes documents in unwanted languages. This step improves data quality by removing irrelevant documents.

The Minimum Word Count Filter sets a threshold for required words within a document. This step filters out low-quality or incomplete documents. However, this step may remove documents that contain valuable information so a proper analysis is important for each data source.

The Unigram Log Probability Filter calculates the log probability of each unigram to measure the significance of individual words. This step quantifies the importance of individual words but may not capture the semantic meaning of words. To calculate the average log word probability, we use word frequencies extracted from the 1T Web-gram corpus. Specifically, we use the available list created by Rachel Tatman.
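A minimal sketch of an average log word probability is shown below. The tiny frequency table stands in for the real word-frequency list, and several details are our assumptions rather than documented choices: natural logarithms, lowercasing, whitespace splitting, and a count of 1 for words missing from the table.

```python
import math


def average_log_word_probability(text, word_counts, unknown_count=1):
    """Mean log probability of the words in `text` under a unigram frequency table."""
    total = sum(word_counts.values())
    words = text.lower().split()
    if not words or total == 0:
        return float("-inf")
    log_probs = [
        # assumption: unseen words fall back to a small pseudo-count
        math.log(word_counts.get(word, unknown_count) / total)
        for word in words
    ]
    return sum(log_probs) / len(log_probs)


# hypothetical toy counts; the real table comes from a web-scale n-gram corpus
toy_counts = {"the": 50, "cat": 30, "sat": 20}
common_score = average_log_word_probability("The cat", toy_counts)
rare_score = average_log_word_probability("zzz qqq", toy_counts)
print(common_score, rare_score)
```

A document is then dropped when its score falls below a chosen threshold; text dominated by rare or garbled tokens scores much lower than ordinary prose.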

Data Processing for S2ORC

The formatting of the S2ORC dataset required special filters to be applied. These filters were not applied to the other data sources.

The Title and Abstract Filter extracts information from the title and abstract. This step provides additional information for analysis but may introduce bias in the analysis.

The Majority Language Filter identifies the majority language in the dataset. This step displays the distribution of languages in the dataset to enable language-specific analysis and insights.

The Paragraph Count Filter counts the number of paragraphs in each document. This step helps to analyze the structure and length of documents which can be a useful heuristic for document complexity.

The Frequency Filter calculates the frequency of each word in the dataset. This step serves to identify important words and topics in the dataset but may be sensitive to noise and outliers.

Filtering Discussion on All Curated Sources

Below is a detailed account of how each dataset was extracted and filtered. If specific challenges were found with a dataset, they are included and discussed to the best of our abilities. The figure below provides a global view of the document filtering results. ~8% of documents were removed during these three steps.

This section continues below with the specific filtering steps taken for all 14 curated datasets.

Wikipedia

Wikipedia is an encyclopedic source of high-quality text data used for language modeling. We have included filtered and deduplicated versions of the complete Wikipedia data provided directly by the Wikimedia Foundation for more than 350 languages.

Download and Extraction: The Wikimedia dataset was downloaded from the official snapshot on Hugging Face: https://huggingface.co/datasets/wikimedia/wikipedia/tree/main. The Hugging Face `dataset.to_json` function was used to convert the original Parquet format to the JSONL format.

Filtering: Manual inspection of the dataset demonstrated high-quality content. Only one filter was used, to remove articles with few words. Based on normal sentence constructs, an article was kept if it contained 10 or more words. Any article with fewer than 10 words was removed.
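The rule is simple enough to state as a one-line predicate. The sketch below assumes words are obtained by splitting on whitespace; the function name `keep_article` is hypothetical.

```python
def keep_article(text, min_words=10):
    """Keep an article only if it has at least `min_words` whitespace-separated words."""
    return len(text.split()) >= min_words


stub = "\nFaqja kryesore"  # a two-word stub page
article = "Tirana is the capital and largest city of Albania by population."
print(keep_article(stub), keep_article(article))
```

Stub pages that contain little more than a title are the typical casualties of this filter.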

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
Wikipedia	61614907	0.00%	1.86%	0.00%	0.31%	97.84%

Wikipedia Filtering Examples

Wikipedia

Data sample: 0 of 9

{
    "id": 1,
    "url": "https://sq.wikipedia.org/wiki/Faqja%20kryesore",
    "title": "Faqja kryesore",
    "text": "\nFaqja kryesore"
}

ArXiv

The ArXiv dataset is a vast collection of preprint research papers primarily in Mathematics, Computer Science, and Physics. Established in 1991, it offers high-quality text and mathematical knowledge, making it an invaluable resource for academic and scientific research. ArXiv papers are typically written in LaTeX, a popular typesetting system for these fields. We have extracted the information from the LaTeX sources and converted it into a text format.

Download and Extraction: All the data was downloaded in the original LaTeX format from the official ArXiv S3 repo: s3://arxiv/src. We aim to encode the downloaded data in UTF-8 format and, when necessary, utilize the chardet library to infer the appropriate encoding. After that, we use Pandoc to extract information from the LaTeX files into Markdown format. The command we use is `pandoc <raw_tex_path> -s -o <output_markdown_path> -f latex+raw_tex -t markdown_mmd [--lua-filter <lua_filter_path>]`. Finally, all Markdown files were combined to create JSONL files.
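For batch processing, the command above can be assembled and run from Python. This is a sketch only: the function names are hypothetical, the flags are exactly those in the command quoted above, and actually running the conversion requires a local `pandoc` installation.

```python
import subprocess


def build_pandoc_command(raw_tex_path, output_markdown_path, lua_filter_path=None):
    """Assemble the pandoc invocation as an argument list (no shell quoting needed)."""
    command = [
        "pandoc", raw_tex_path,
        "-s",
        "-o", output_markdown_path,
        "-f", "latex+raw_tex",
        "-t", "markdown_mmd",
    ]
    if lua_filter_path is not None:
        # the Lua filter is optional, as in the bracketed part of the command
        command += ["--lua-filter", lua_filter_path]
    return command


def convert_tex_to_markdown(raw_tex_path, output_markdown_path, lua_filter_path=None):
    """Run pandoc and raise CalledProcessError if the conversion fails."""
    command = build_pandoc_command(raw_tex_path, output_markdown_path, lua_filter_path)
    subprocess.run(command, check=True)


# hypothetical file names, shown only to illustrate the resulting argument list
example_command = build_pandoc_command("paper.tex", "paper.md")
print(" ".join(example_command))
```

Passing the arguments as a list avoids shell interpretation of unusual characters in file names, which are common in bulk source dumps.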

Unique Data Preparation Challenges:

When converting LaTeX files into Markdown using Pandoc, it is crucial to account for different data formats to minimize information loss while also filtering out noisy content in LaTeX. Below, we outline our considerations and methods for handling various data types during this conversion process:

Tables: The process for handling tables follows three main approaches. First, tables compatible with Pandoc’s built-in formats are directly converted into standard Markdown tables. Notably, LaTeX’s '\multicolumn' and '\multirow' commands can be successfully translated into valid Markdown tables. Second, tables unsupported by Pandoc’s native functionality, such as deluxetable or other complex LaTeX types, are preserved in their original LaTeX format to maintain the integrity of complex structures. Third, only a few remaining tables have been converted to HTML web tables.
Mathematical Expressions: Inline mathematical expressions are rendered in Markdown. More complex equations remain unchanged, e.g., presented as '\begin{aligned}' blocks, to ensure accuracy and readability.
Figures: All figures are removed during the conversion process. Placeholder figures might not contribute to the paper’s data quality and, as such, have been omitted to streamline the output.
Section Headers: Section headers are converted into markdown format, using leading '#' symbols to represent the heading levels.
References: References are removed. Although they may be informative, references often introduce formatting inconsistencies or add little value compared to the core content of the paper.

Filters Applied: Multiple filters are used here, following the peS2o dataset, after manually verifying the output of each filter.

Language Filter: documents in any language other than English are discarded
Minimum Word Count Filter: documents with fewer than 500 words are discarded
Unigram Log Probability Filter Threshold: -20
Note: the Frequency Filter was calculated but not applied. Its criterion is that the most frequent word in the paper consists of alpha characters only and appears in less than 7.5% of the document. Words are obtained by splitting the text on whitespace.
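The frequency criterion described in the note can be sketched as follows. We read "appears in less than 7.5% of the document" as the most frequent word accounting for less than 7.5% of all whitespace-separated words; that reading, and the function name, are our assumptions.

```python
from collections import Counter


def passes_frequency_check(text, max_fraction=0.075):
    """True if the most frequent word is purely alphabetic and not too dominant."""
    words = text.split()  # words are obtained by splitting on whitespace
    if not words:
        return False
    word, count = Counter(words).most_common(1)[0]
    return word.isalpha() and count / len(words) < max_fraction


# 20 distinct alphabetic words: the top word covers 5% of the document
varied = " ".join(["the"] + [chr(ord("a") + i) * 3 for i in range(19)])
repetitive = "the the the cat"   # top word covers 75%
numeric = "1 1 1 alpha beta"     # top word is not alphabetic
print(passes_frequency_check(varied), passes_frequency_check(repetitive), passes_frequency_check(numeric))
```

Documents whose most common token is a number or symbol, such as extracted tables, would fail the alphabetic part of the check even when no single token dominates.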

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
ArXiv	1911867	2.22%	5.65%	0.07%	0.00%	92.06%

ArXiv Filtering Examples

Data sample: 3 of 9

Raw format

{
    "text": "\\documentclass{amsart}\n\\newcommand\\hmmax{0}\n\\newcommand\\bmmax{0}\n\n\\usepackage[utf8]{inputenc}\n\\usepackage[top=1.25in, bottom=1.25in, left=1.1in, right=1.1in]{geometry, mathtools}\n\\usepackage[all]{xy}\n\\usepackage[toc,page]{appendix}\n\\usepackage{amsthm, amsmath, amssymb, amscd, latexsym, multicol, verbatim, enumerate, graphicx,xy, color}\n\\usepackage{mathrsfs}\n\\usepackage{tikz,stackrel}\n\\usepackage{mathdots}\n\\usepackage[all]{xy}\n\\usetikzlibrary{calc}\n\\usetikzlibrary{matrix,arrows,decorations.pathmorphing}\n\\usetikzlibrary{intersections}\n\\usetikzlibrary{decorations.markings}\n\\usetikzlibrary{external}\n%\\usepackage[external]{includetikz}\n\\tikzexternalize[\n    mode=graphics if exists,\n    prefix=Pictures/\n    ]\n\\usepackage{setspace}\n%\\linespread{1.2}\n\n\\newcommand{\\inputtikz}[1]{%\n  \\tikzsetnextfilename{#1}%\n  \\input{Pictures/#1.tikz}%\n}\n\\usepackage{tikz-cd}\n\\usetikzlibrary{decorations.markings}\n\\makeatletter\n\\tikzcdset{\nopen/.code={\\tikzcdset{hook, circled};},\nclosed/.code={\\tikzcdset{hook, slashed};},\n%open'/.code={\\tikzcdset{hook', circled};},\n%closed'/.code={\\tikzcdset{hook', slashed};},\ncircled/.code={\\tikzcdset{markwith={\\draw (0,0) circle (.375ex);}};},\nslashed/.code={\\tikzcdset{markwith={\\draw[-] (-.4ex,-.4ex) -- (.4ex,.4ex);}};},\nmarkwith/.code={\n\\pgfutil@ifundefined{tikz@library@decorations.markings@loaded}%\n{\\pgfutil@packageerror{tikz-cd}{You need to say %\n\\string\\usetikzlibrary{decorations.markings} to use arrow with markings}{}}{}%\n\\pgfkeysalso{/tikz/postaction={/tikz/decorate,\n/tikz/decoration={\nmarkings,\nmark = at position 0.5 with\n{#1}}}}},\n}\n\\makeatother\n\\makeatletter\n\\tikzset{\n%open/.code={\\tikzset{hook-latex, circled};},\n%closed/.code={\\tikzset{hook-latex, slashed};},\n%open'/.code={\\tikzset{hook', circled};},\n%closed'/.code={\\tikzset{hook', slashed};},\ncircled/.code={\\tikzset{markwith={\\draw (0,0) circle 
(.375ex);}};},\nslashed/.code={\\tikzset{markwith={\\draw[-] (-.4ex,-.4ex) -- (.4ex,.4ex);}};},\nmarkwith/.code={\n\\pgfutil@ifundefined{tikz@library@decorations.markings@loaded}%\n{\\pgfutil@packageerror{tikz-cd}{You need to say %\n\\string\\usetikzlibrary{decorations.markings} to use arrow with markings}{}}{}%\n\\pgfkeysalso{/tikz/postaction={/tikz/decorate,\n/tikz/decoration={\nmarkings,\nmark = at position 0.5 with\n{#1}}}}},\n}\n\\makeatother\n\n\\makeatletter\n\\providecommand{\\leftsquigarrow}{%\n  \\mathrel{\\mathpalette\\reflect@squig\\relax}%\n}\n\\newcommand{\\reflect@squig}[2]{%\n  \\reflectbox{$\\m@th#1\\rightsquigarrow$}%\n}\n\\makeatother\n\n\n%\\AtBeginEnvironment{tikzcd}{\\tikzexternaldisable} %%%%%%%%%%%%%%%%%%%%%%%%%%  Does not work....\n%\\AtEndEnvironment{tikzcd}{\\tikzexternalenable}\n\\usepackage[english]{babel}\n\\usepackage{stmaryrd}\n\\usepackage{mathbbol}\n\\usepackage{float}\n\\usepackage{comment}\n\\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue, breaklinks=true]{hyperref}\n%\\usepackage[usenames, dvipsnames]{xcolor}\n\\usepackage{theoremref}\n\\usepackage{caption}\n\\usepackage{youngtab}\n\\usepackage{ragged2e}\n\\usepackage{relsize}\n%\\usepackage[autoscale]{youngtab}\n\\usepackage{anyfontsize}\n\\usepackage[utf8]{inputenc}\n\\usepackage{bm}\n%\\usepackage{bbm}\n\\usepackage{scalerel}\n\\usepackage{marvosym}\n\\DeclareFontFamily{U}{mathx}{\\hyphenchar\\font45}\n\\DeclareFontShape{U}{mathx}{m}{n}{\n      <5> <6> <7> <8> <9> <10>\n      <10.95> <12> <14.4> <17.28> <20.74> <24.88>\n      mathx10\n      }{}\n\\DeclareSymbolFont{mathx}{U}{mathx}{m}{n}\n\\DeclareFontSubstitution{U}{mathx}{m}{n}\n\\DeclareMathAccent{\\widecheck}{0}{mathx}{\"71}\n\\DeclareMathAlphabet{\\mathbbm}{U}{bbm}{m}{n}\n\\usepackage{verbatim}\n\\usepackage{pdflscape}\n\n\\usepackage{array} 
\n\\newcolumntype{C}{>{$}c<{$}}\n\n\\newenvironment{pf}{\\proof}{\\endproof}\n\\newcounter{cnt}\n%\\newenvironment{enumerit}{\\begin{list}{{\\hfill\\rm(\\roman{cnt})\\hfill}}{%\n%\\settowidth{\\labelwidth}{{\\rm(iv)}}\\leftmargin=\\labelwidth%\n%\\advance\\leftmargin by \\labelsep\\rightmargin=0pt\\usecounter{cnt}}}{\\end{list}} \\makeatletter\n%\\def\\mydggeometry{\\makeatletter\\dg@YGRID=1\\dg@XGRID=20\\unitlength=0.003pt\\makeatother}\n\\makeatother\n\\theoremstyle{remark}\n\\numberwithin{equation}{section}\n\n\\setlength{\\marginparwidth}{2cm}\n\n\n\\theoremstyle{definition}\n\\newtheorem*{theorem*}{Theorem}\n\\newtheorem*{definition*}{Definition}\n\\newtheorem{theorem}{Theorem}[section]\n\\newtheorem{definition}[theorem]{Definition}\n\\newtheorem{proposition}[theorem]{Proposition}\n\\newtheorem*{proposition*}{Proposition}\n\\newtheorem{lemma}[theorem]{Lemma}\n\\newtheorem{corollary}[theorem]{Corollary}\n\\newtheorem{remark}[theorem]{Remark}\n\\newtheorem{notation}[theorem]{Notation}\n\\newtheorem{example}[theorem]{Example}\n\\newtheorem*{example*}{Example}\n\\newtheorem{conjecture}[theorem]{Conjecture}\n\\newtheorem*{thm}{Theorem}\n%\\newtheorem*{thm}[theorem]{Theorem}\n%\\newtheorem*{conj}[theorem]{Conjecture}\n\\newtheorem{assumption}{Assumption}\n\\newtheorem{condition}{Condition}\n\\newtheorem{question}[theorem]{Question}\n\n\\newcommand{\\xx}{\\mathbbm{x} 
}\n\\newcommand{\\wN}{N_\\prin}\n\\newcommand{\\wM}{M_\\prin}\n\\newcommand{\\wpp}{\\widetilde{p}}\n\\newcommand{\\tm}{\\widetilde{m}}\n\\newcommand{\\bL}{\\vb{L}}\n\\DeclareMathOperator{\\Sing}{Sing}\n\\newcommand{\\scat}{\\mathfrak{D}}\n\\newcommand{\\mono}{m}\n\\DeclareMathOperator{\\Supp}{Supp}\n\n\n\\newcommand{\\tc}[2]{\\textcolor{#1}{#2}}\n\\newcommand{\\lrp}[1]{\\left(#1\\right)}\n\\newcommand{\\lrb}[1]{\\left[#1\\right]}\n\\newcommand{\\lrm}[1]{\\left|#1\\right|}\n\\newcommand{\\lrc}[1]{\\left\\{#1\\right\\}}\n\\newcommand{\\lra}[1]{\\left\\langle{#1}\\right\\rangle}\n\\newcommand{\\lrbb}[1]{\\llbracket{ #1 }\\rrbracket }\n\\newcommand{\\Q}{\\mathbb{Q} }\n\\newcommand{\\R}{\\mathbb{R} }\n\\newcommand{\\C}{\\mathbb{C} }\n\\newcommand{\\bbS}{\\mathbb{S} }\n\\newcommand{\\A}{\\mathbb{A} }\n\\newcommand{\\PP}{\\mathbb{P} }\n\\newcommand{\\Gm}{\\mathbb{G}_m }\n\\newcommand{\\Z}{\\mathbb{Z} }\n\\newcommand{\\F}{\\mathbb{F} }\n\\newcommand{\\kk}{\\mathbbm{k} }\n\\newcommand{\\lex}{\\mathrm{lex}}\n\\newcommand{\\wall}{\\mathfrak{d}}\n%\\newcommand{\\XX}{\\mathbb{X} }  %Commenting because I think we only define the open part.  This way it will throw an error if we forget and leave off the o.\n\\newcommand{\\XXo}{\\mathbb{X}^\\circ }\n\\newcommand{\\XXot}{\\widetilde{\\mathbb{X}}^\\circ }\n%\\newcommand{\\XXd}{\\widecheck{\\mathbb{X}} }  %Commenting because I think we only define the open part.  
This way it will throw an error if we forget and leave off the o.\n\\newcommand{\\XXdo}{\\widecheck{\\mathbb{X}}^\\circ }\n\\newcommand{\\XXdot}{\\widetilde{\\widecheck{\\mathbb{X}}}^\\circ }\n\\newcommand{\\WGHKK}{\\mathcal{W}_{\\vartheta}}\n\\newcommand{\\WMR}{\\mathcal{W}_{q} }\n\\newcommand{\\WMRone}{\\mathcal{W}_{q=1} }\n\\newcommand{\\cone}{\\mathrm{cone}}\n\\newcommand{\\matching}{\\mathtt{M}}\n\\newcommand{\\rop}{\\varrho^{\\op}}\n\\newcommand{\\Lr}{{}^{L}\\varrho}\n%\\newcommand{\\orT}{\\oversetcustom{\\longrightarrow}{\\mathbb{T}^r}}\n\n\\newcommand{\\Lp}{p^\\vee}\n\\newcommand{\\LA}{{}^{L}\\cA}\n\\newcommand{\\LAp}{{}^{L}\\cAp}\n\\newcommand{\\LX}{{}^{L}\\cX}\n\\newcommand{\\LN}{{}^{L}N}\n\\newcommand{\\LM}{{}^{L}M}\n\\newcommand{\\LtM}{{}^{L}\\widetilde{M}}\n\\newcommand{\\LNc}{{}^{L}N^\\circ}\n\\newcommand{\\LMc}{{}^{L}M^\\circ}\n\\newcommand{\\LGam}{{}^{L}\\Gamma}\n\\newcommand{\\LNuf}{{}^{L}\\Nuf}\n\\newcommand{\\LNufc}{{}^{L}\\Nuf^\\circ}\n\\newcommand{\\LK}{{}^{L}K}\n\\newcommand{\\LH}{{}^{L}H}\n\\newcommand{\\Lpi}{{}^{L}\\pi}\n\\newcommand{\\Lseed}{{}^{L}\\seed}\n\n\n\\newcommand{\\cAo}{\\cA^{\\mathrm{op}}}\n\\newcommand{\\cXo}{\\cX^{\\mathrm{op}}}\n\\newcommand{\\LcAo}{{}^L\\cA^{\\mathrm{op}}}\n\\newcommand{\\LcXo}{{}^L\\cX^{\\mathrm{op}}}\n\\newcommand{\\DA}{\\mathcal{D}_{\\cA}}\n\\newcommand{\\DAo}{\\mathcal{D}_{\\cA^{\\mathrm{op}}}}\n\\newcommand{\\DAp}{\\mathcal{D}_{\\cAp}}\n\\newcommand{\\DApo}{\\mathcal{D}_{\\cAp^{\\mathrm{op}}}}\n\\newcommand{\\po}{{p^{\\mathrm{op}}}}\n\\newcommand{\\Lpo}{{\\Lp^{\\mathrm{op}}}}\n\\newcommand{\\DLA}{\\mathcal{D}_{\\LA}}\n\\newcommand{\\DLAo}{\\mathcal{D}_{\\LcAo}}\n\n\\newcommand{\\FG}{\\overset{\\rm FG}{\\vee}}\n\n\\newcommand{\\cXe}{\\cX_{\\bf 1}}\n\\newcommand{\\cXeH}{\\cX_{{\\bf 1}_{T_{H^*_{\\cX}}}}}\n\\newcommand{\\cXpeH}{\\lrp{\\cXp}_{{\\bf 
1}_{T_{H^*_{\\cXp}}}}}\n\\newcommand{\\cAm}{\\cA^{\\vee}}\n\\newcommand{\\cXm}{\\cX^{\\vee}}\n\\newcommand{\\cVm}{\\cV^{\\vee}}\n\\newcommand{\\cAHAm}{(\\cA/T_{H_{\\cA}})^{\\vee}}\n\\newcommand{\\cXeHm}{\\lrp{\\cXeH}^{\\vee}}\n\\newcommand{\\cApm}{\\cAp^{\\vee}}\n\\newcommand{\\Km}{K^{\\vee}}\n\\newcommand{\\Kcm}{(\\Km)^\\circ}\n\\newcommand{\\cXem}{\\lrp{\\cX_{\\bf 1}}^{\\vee}}\n\\DeclareMathOperator{\\net}{net}\n\\newcommand{\\Xnet}{\\cX}\n%\\makeatletter\n%\\newcommand{\\oversetcustom}[2]{%\n%  {\\mathop{#2}\\limits^{\\vbox to -2\\ex@{\\kern-\\tw@\\ex@\n%   \\hbox{\\scriptsize #1}\\vss}}}}\n%\\makeatother\n\n\\makeatletter\n\\newcommand{\\oversetcustom}[3][0ex]{%\n  \\mathrel{\\mathop{#3}\\limits^{\n    \\vbox to#1{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle#2$}\\vss}}}}\n\\makeatother\n\n\\newcommand{\\cF}{\\mathcal{F}}\n\\newcommand{\\T}{\\mathbb{T}}\n\\newcommand{\\orT}{\\oversetcustom{\\longrightarrow}{\\mathbb{T}_r}}\n\\newcommand{\\by}{\\mathbf{y}}\n\\newcommand{\\bp}{\\mathbf{p}}\n\\newcommand{\\gv}{\\mathbf{g} }\n\\newcommand{\\cv}{\\mathbf{c} }\n\\newcommand{\\cval}{\\nu^{\\Phi}_{\\seed}}\n\\newcommand{\\vb}[1]{\\mathbf{#1}}\n\\newcommand{\\cA}{\\mathcal{A} }\n\\newcommand{\\cAp}{\\mathcal{A}_{\\mathrm{prin}}}\n\\newcommand{\\cXp}{\\mathcal{X}_{\\mathrm{prin}}}\n\\newcommand{\\cAps}[1]{\\mathcal{A}_{\\mathrm{prin},{#1}}}\n\\newcommand{\\cApG}{\\mathcal{A}_{\\Gamma_{\\mathrm{prin}}}}\n\\newcommand{\\cApGdual}{\\mathcal{A}_{\\Gamma^{\\vee}_{\\mathrm{prin}}}}\n\\newcommand{\\cXG}{\\mathcal{X}_{\\Gamma}}\n\\newcommand{\\cXGdual}{\\mathcal{X}_{\\Gamma^{\\vee}} }\n\\newcommand{\\cXpG}{\\mathcal{X}_{\\Gamma_{\\mathrm{prin}}}}\n\\newcommand{\\cXpGdual}{\\mathcal{X}_{\\Gamma^{\\vee}_{\\mathrm{prin}}}}\n\n\n\n\\newcommand{\\Aprinf}{\\mathscr{A}_{\\text{prin}}}\n\\newcommand{\\Xprinf}{\\mathscr{X}_{\\text{prin}}}\n\\newcommand{\\prin}{{\\mathrm{prin}} }\n\\newcommand{\\cX}{\\mathcal{X} }\n\\newcommand{\\cV}{\\mathcal{V} }\n\\newcommand{\\cU}{\\mathcal{U} 
}\n\\newcommand{\\cVp}{\\mathcal{V}_{\\mathrm{prin}} }\n%\\newcommand{\\cXp}{\\mathcal{X}_{\\mathrm{prin}} }\n\\newcommand{\\cXps}[1]{\\mathcal{}_{\\mathrm{prin},{#1}} }\n\\newcommand{\\ssO}{\\mathcal{O} }\n\\newcommand{\\B}{\\mathcal{B} }\n\\newcommand{\\lb}{\\mathcal{L} }\n\\newcommand{\\trop}{\\mathrm{trop} }\n\\newcommand{\\Trop}{\\mathrm{Trop} }\n\\newcommand{\\tf}{\\vartheta }\n\\newcommand{\\gp}{\\mathrm{gp} }\n\\newcommand{\\Xcom}{\\overline{\\cX} }\n\\newcommand{\\Xsp}{\\widehat{\\cX} }\n\\newcommand{\\bX}{\\mathbf{X} }\n\\newcommand{\\eq}[2]{\\begin{equation}\\label{#2} \\begin{split} #1  \\end{split} \\end{equation}}\n\\newcommand{\\eqn}[1]{\\begin{equation*} \\begin{split} #1 \\end{split} \\end{equation*}}\n\\newcommand{\\cc}{{\\Delta^+} }\n\\newcommand{\\ccF}{{\\Delta^+_F} }\n\\newcommand{\\TV}[1]{ {\\tv(#1)} }\n\\newcommand{\\Xfam}{\\mathscr{X} }\n\\newcommand{\\Xfsp}{\\widehat{\\Xfam} }\n\\newcommand{\\Xfams}[1]{\\mathscr{X}_{#1} }\n\\newcommand{\\Xfsps}[1]{\\widehat{\\Xfam}_{#1} }\n\\newcommand{\\Xt}{\\widetilde{X} }\n\\newcommand{\\Gr}[2]{\\Grass_{#1}\\lrp{#2} }\n\\newcommand{\\Nuf}{N_{\\text{uf}}}\n\\newcommand{\\Iuf}{I_{\\text{uf}}}\n\\newcommand{\\Ifr}{I_{\\text{fr}}}\n\\newcommand{\\CGr}[2]{\\operatorname{CGr}_{#1}\\lrp{#2} }\n\\newcommand{\\Gkn}{\\Grass_k\\lrp{\\C^n}}\n\\newcommand{\\Gknt}{\\widetilde{\\Grass_k\\lrp{\\C^n}}}\n\\newcommand{\\Gnmkn}{\\Grass_{n-k}\\lrp{\\C^n}}\n\\newcommand{\\Gnmknt}{\\widetilde{\\Grass_{n-k}\\lrp{\\C^n}}}\n\\newcommand{\\XTZ}{\\cX^\\trop\\lrp{\\Z} }\n\\newcommand{\\XTR}{\\cX^\\trop\\lrp{\\R} }\n\\DeclareMathOperator{\\wei}{wt}\n\n\\newcommand{\\sk}[2]{\\lrc{ #1 , #2 } 
}\n\\newcommand{\\ddeg}{{\\operatorname*{deg}}}\n\\newcommand{\\lie}{\\mathfrak}\n%\\newcommand{\\Gr}{{\\operatorname*{Gr}}}\n\\newcommand{\\wt}{\\operatorname*{wt}}\n\\newcommand{\\sspan}{\\operatorname*{span}}\n\\newcommand{\\id}{\\operatorname*{id}}\n\\newcommand{\\inn}{\\operatorname*{in}}\n\\newcommand{\\End}{\\operatorname*{End}}\n\\newcommand{\\St}{\\operatorname*{St}}\n\\newcommand{\\sgn}{\\operatorname*{sgn}}\n\\newcommand{\\blambda}{\\mathbf{\\lambda}}\n\\newcommand{\\bmu}{\\mathbf{\\mu}}\n\n\\newcommand{\\cham}{\\mathcal{G}}\n\\newcommand{\\chams}[1]{\\mathcal{G}_{#1}}\n\\newcommand{\\chamdual}{\\mathcal{C}}\n\\newcommand{\\chamduals}[1]{\\mathcal{C}_{#1}}\n\n\n\\newcommand{\\dd}{\\operatorname*{d}}\n\\newcommand{\\n}{\\mathfrak{n}}\n\\newcommand{\\D}{\\displaystyle}\n\\newcommand{\\ra}{\\rightarrow}\n\\newcommand{\\ve}{\\varepsilon}\n\\newcommand{\\vp}{\\varphi}\n\n\\newcommand{\\s}{\\sigma}\n\\newcommand{\\od}{\\mathrm{od}}\n\\newcommand{\\ev}{\\mathrm{ev}}\n\n\\newcommand{\\seed}{\\textbf{s}}\n\n\\newcommand{\\seedprin}{\\seed_{\\prin}}\n%\\DeclareMathOperator{\\trop}{trop}\n\\DeclareMathOperator{\\Sym}{Sym}\n\\DeclareMathOperator{\\init}{in}\n\\DeclareMathOperator{\\mult}{mult}\n\\DeclareMathOperator{\\Proj}{Proj}\n\\DeclareMathOperator{\\val}{val}\n\n%\\newcommand{\\Bbbk}{\\mathbb{k}}\n\\newcommand{\\bt}{\\mathbf{t}}\n\\newcommand{\\bb}{\\mathbf{b}}\n\\newcommand{\\bc}{\\mathbf{c}}\n\\newcommand{\\bd}{\\mathbf{d}}\n\\newcommand{\\be}{\\mathbf{e}}\n\\newcommand{\\bbf}{\\mathbf{f}}\n\n\\newcommand{\\ml}{\\mathcal{L}}\n\\newcommand{\\ff}{\\mathcal{F}}\n\\newcommand{\\pp}{\\mathcal{P}}\n\\newcommand{\\bs}{\\mathbf{s}}\n\\newcommand{\\mc}{\\mathbb{C}}\n\\newcommand{\\mk}{\\mathbb{K}}\n\\newcommand{\\bN}{\\mathbb{N}}\n\\newcommand{\\bi}{\\hspace{0.25em} \\mathbf{i}}\n\n\n\\newcommand{\\sideways}[1]{\\rotatebox[origin=c]{90}{#1}}  %These shortcuts are being ignored by the compiler for some 
reason...\n\\newcommand{\\sidei}{\\rotatebox[origin=c]{90}{$\\vb{i}$}}\n\\newcommand{\\sidemu}{\\rotatebox[origin=c]{90}{$\\bm{\\mu}$}}\n\n\\newsavebox{\\mybox}\n\\newcommand{\\back}[1]{%\n  \\ThisStyle{\\ifmmode%\n    \\savebox{\\mybox}{$\\SavedStyle#1$}%\n    \\reflectbox{\\usebox{\\mybox}}%\n  \\else%\n    \\savebox{\\mybox}{#1}%\n    \\reflectbox{\\usebox{\\mybox}}%\n  \\fi%\n}}\n\n\n\\newcommand{\\yd}[1]{ \\scaleto{\\Yng({#1})\\mathstrut}{5pt} }\n\n\\DeclareMathOperator{\\im}{im}\n\\DeclareMathOperator{\\Img}{Im}\n\\DeclareMathOperator{\\st}{s.t.}\n%\\DeclareMathOperator{\\id}{id}\n\\DeclareMathOperator{\\Sp}{span}\n\\DeclareMathOperator{\\supp}{Supp}\n\\DeclareMathOperator{\\Ext}{Ext}\n\\DeclareMathOperator{\\Aut}{Aut}\n\\DeclareMathOperator{\\GL}{GL}\n\\DeclareMathOperator{\\SL}{SL}\n\\DeclareMathOperator{\\PGL}{PGL}\n\\DeclareMathOperator{\\PSL}{PSL}\n\\DeclareMathOperator{\\Mat}{Mat}\n\\DeclareMathOperator{\\Tor}{Tor}\n\\DeclareMathOperator{\\Hom}{Hom}\n%\\DeclareMathOperator{\\End}{End}\n\\DeclareMathOperator{\\sign}{sign}\n\\DeclareMathOperator{\\Ad}{Ad}\n\\DeclareMathOperator{\\Spec}{Spec}\n%\\DeclareMathOperator{\\Proj}{Proj}\n\\DeclareMathOperator{\\chara}{char}\n\\DeclareMathOperator{\\gr}{gr}\n\\DeclareMathOperator{\\Grass}{Gr}\n\\DeclareMathOperator{\\diag}{diag}\n\\DeclareMathOperator{\\ord}{ord}\n\\DeclareMathOperator{\\cmid}{mid}\n\\DeclareMathOperator{\\up}{up}\n\\DeclareMathOperator{\\can}{can}\n\\DeclareMathOperator{\\Conf}{Conf}\n\\DeclareMathOperator{\\vol}{vol}\n\\DeclareMathOperator{\\Cox}{Cox}\n\\DeclareMathOperator{\\Pic}{Pic}\n\\newcommand{\\UT}{\\text{UT} }\n\\DeclareMathOperator{\\tv}{TV}\n\\DeclareMathOperator{\\lcm}{lcm}\n%\\DeclareMathOperator{\\min}{min}\n%\\DeclareMathOperator{\\max}{max}\n\\DeclareMathOperator{\\flow}{Flow}\n\\newcommand{\\Yng}{\\Yboxdim4pt \\yng}\n\\newcommand{\\Yngs}{\\Yboxdim2pt 
\\yng}\n\\DeclareMathOperator{\\Star}{Star}\n\\DeclareMathOperator{\\Tr}{T}\n\\DeclareMathOperator{\\tr}{t}\n\\DeclareMathOperator{\\op}{op}\n\\DeclareMathOperator{\\Cl}{Cl}\n\n\\newcommand{\\rank}{\\operatorname{rank}}\n\\newcommand{\\uf}{\\operatorname{uf}}\n\n\\newcommand{\\Grec}[2]{G^{\\mathrm{rec}}_{#1,#2} }\n\n\\DeclareMathOperator{\\conv}{conv}\n\\DeclareMathOperator{\\bconv}{conv_{BL}}\n\\DeclareMathOperator{\\NewtT}{Newt_{\\tf}}\n\n\n\\newcommand{\\ie}{{\\em i.e. }}\n\\newcommand{\\cf}{{\\em cf.}\\ }\n\\newcommand{\\eg}{{\\em e.g.}\\ }\n\n\\DeclareFontFamily{U}{musix}{}\n\\DeclareFontShape{U}{musix}{m}{n}{<-> s*[1.01] musix11}{}\n\\newcommand{\\mathbassclef}{%\n  \\text{%\n    % as high as an uppercase letter\n    \\vphantom{O,}%\n    % the clef extends below the baseline, so we raise and smash it\n    \\raisebox{.412\\height}[0pt][0pt]{\\usefont{U}{musix}{m}{n}\\symbol{73}}%\n  }%\n}\n\\newcommand{\\base}{\\mathbassclef}\n\n\\newcommand{\\Np}{N_{\\prin}}\n\\newcommand{\\Mp}{M_{\\prin}}\n\\newcommand{\\Kp}{K_{\\prin}}\n\\newcommand{\\Npc}{N^\\circ_{\\prin}}\n\\newcommand{\\Mpc}{M^\\circ_{\\prin}}\n\\newcommand{\\Kpc}{K^\\circ_{\\prin}}\n\n\n\n\n\\tikzexternalize[prefix=Pictures/]\n\n\\newcommand\\tim[1]{{\\color{purple} \\sf \\Bicycle Tim: [#1]}}\n\\newcommand\\lara[1]{{\\color{cyan} \\sf $\\clubsuit$ Lara: [#1]}}\n\\newcommand\\alfredo[1]{{\\color{blue} \\sf $\\infty$ Alfredo: [#1]}}\n\\newcommand\\mandy[1]{{\\color{magenta} \\sf $\\nabla$ Mandy: [#1]}}\n\n\\title{Newton--Okounkov bodies and minimal models for cluster varieties}\n\n\n\n\\author{Lara Bossinger, Man-Wai Cheung, Timothy Magee and Alfredo N\\'ajera Ch\\'avez}\n\\date{\\today}\n\n\\address{\nInstituto de Matem\\'aticas Unidad Oaxaca, \nUniversidad Nacional Aut\\'onoma de M\\'exico,\nLe\\'on 2, altos, \nCentro Hist\\'orico,\n68000 Oaxaca,\nMexico}\n\\email{lara@im.unam.mx}\n\n\\address{\nSchool of Mathematics, Kavli IPMU (WPI), UTIAS, The University of Tokyo, Kashiwa, Japan, 
277-8583}\n\\email{manwai.cheung@ipmu.jp}\n\n\n\\address{Department of Mathematics, King's College London, Strand, London WC2R 2LS, UK}\n\\email{timothy.magee@kcl.ac.uk}\n\n\n\\address{\nConsejo Nacional de Ciencia y Tecnología - Instituto de Matem\\'aticas Unidad Oaxaca, \nUniversidad Nacional Aut\\'onoma de M\\'exico,\nLe\\'on 2, altos, \nCentro Hist\\'orico,\n68000 Oaxaca,\nMexico}\n\\email{najera@im.unam.mx}\n\n\\begin{document}\n\n\\maketitle\n\n\n\\begin{abstract}\nLet $Y$ be a (partial) minimal model of a scheme $V$ with a cluster structure (of type $\\cA$, $\\cX$ or of a quotient of $\\cA$ or a fibre of $\\cX$). Under natural assumptions, for every choice of seed  we associate a Newton--Okounkov body to every divisor on $Y$ supported on $Y \\setminus V$ and show that these Newton--Okounkov bodies are positive sets in the sense of Gross, Hacking, Keel and Kontsevich \\cite{GHKK}. This construction essentially reverses the procedure in loc. cit. that generalizes the polytope construction of a toric variety to the framework of cluster varieties.\n\nIn a closely related setting, we consider cases where $Y$ is a projective variety whose universal torsor $\\UT_Y$ is a partial minimal model of a scheme with a cluster structure of type $\\cA$. If the theta functions parametrized by the integral points of the associated superpotential cone form a basis of the ring of algebraic functions on $\\UT_Y$ and the action of the torus $T_{\\text{Pic}(Y)^*}$ on $\\UT_Y$ is compatible with the cluster structure, then for every choice of seed we associate a Newton--Okounkov body to every line bundle on $Y$. We prove that any such Newton--Okounkov body is a positive set and that $Y$ is a minimal model of a quotient of a cluster $\\cA$-variety by the action of a torus. \n\nOur constructions lead to the notion of the intrinsic Newton--Okounkov body associated to a boundary divisor in a partial minimal model of a scheme with a cluster structure. 
This notion is intrinsic as it relies only on the geometric input, making no reference to the auxiliary data of a valuation or a choice of seed.\nThe intrinsic Newton--Okounkov body lives in a real tropical space rather than a real vector space. \nA choice of seed gives an identification of this tropical space with a vector space, and in turn of the intrinsic Newton--Okounkov body\nwith a usual Newton--Okounkov body associated to the choice of seed.\nIn particular, the Newton--Okounkov bodies associated to seeds are related to each other by tropicalized cluster transformations providing a wide class of examples of Newton-Okoukov bodies exhibiting a wall-crossing phenomenon in the sense of Escobar--Harada \\cite{EH20}.\n\nThis approach includes the partial flag varieties that arise as minimal models of cluster varieties (for example full flag varieties and Grassmannians). For the case of Grassmannians, our approach recovers, up to interesting unimodular equivalences, the Newton--Okounkov bodies constructed by Rietsch--Williams in \\cite{RW}. \n\n\n\n\\end{abstract}\n\n\\tableofcontents\n\n\\section{Introduction}\n\n\\subsection{Overview} Cluster varieties are certain schemes constructed by gluing a (possibly infinite) collection of algebraic tori using distinguished birational maps called cluster transformations.\nThese schemes were introduced in \\cite{FG_Teich,FG_cluster_ensembles} and can be studied from many different points of view. \nThey are closely related to cluster algebras and $Y$-patterns defined by Fomin and Zelevinsky in \\cite{FZ_clustersI, FZ_clustersIV}. \nIn this paper we approach them from the perspectives of birational and toric geometry, mainly following \\cite{GHK_birational,GHKK}.\nIn \\cite{GHKK}, the authors show that certain sets called \\emph{positive polytopes} can be used to produce compactifications of cluster varieties and toric degenerations of such compactifications. 
\nIn the trivial case where the cluster variety in question is just a torus, a positive lattice polytope is simply a usual convex lattice polytope and this construction produces the toric variety associated to such a polytope.\nOne of the main goals of this paper is to reverse this construction in a systematic way and understand this process from the view-point of Newton--Okounkov bodies. \nWe also study the wall-crossing phenomenon for Newton--Okounkov bodies arising from cluster structures.\nWe treat independently the case of the Grassmannians as, in this context, we compare the Newton--Okounkov bodies we construct with those constructed in \\cite{RW} and explore some consequences.\nMoreover, throughout the text we systematically consider not only cluster varieties but also quotients and fibres associated to them (see \\S\\ref{sec:quotients-fibres} for the precise definitions of these quotients and fibres). For simplicity, in this introduction our main focus is on cluster varieties. \nWe fix once and for all an algebraically closed field $\\Bbbk$ of characteristic zero. \nUnless otherwise stated, all the schemes we consider are over $\\Bbbk$.\n\n\\subsection{The tropical spaces} Let $\\cV $ be a cluster variety. By definition, $\\cV$ is endowed with an atlas of algebraic tori of the form \n\\[\n\\cV= \\bigcup_{\\seed} T_{L;\\seed},\n\\]\nwhere $L$ is a fixed lattice, $T_{L; \\seed} $ is a copy of the algebraic torus $T_L= \\Spec(\\Bbbk[L^*])$ associated to $L$ (so $L^*=\\text{Hom}(L, \\Z)$) and the tori in the atlas are parametrized by \\emph{seeds $\\seed$ for} $\\cV$. \nWe will exploit the fact that $\\cV$ is a log-Calabi--Yau variety. \nThis property implies that $\\cV$ is endowed with a canonical up-to-scaling volume form $\\Omega$. 
\nMoreover, recall that a cluster variety is of one of the types: $\\cA$ or $\\cX$.\n\nJust like in toric geometry where one can consider the dual torus $T_L^{\\vee}:=T_{L^*}$, the \\emph{dual} of $\\cV$ is a cluster variety $\\cV^{\\vee}$ whose defining atlas consists of tori of the form $T^\\vee_L$.\nIt is well known that the ring $H^{0} (T_L,\\mathcal{O}_{T_{L}})$ of algebraic functions on $T_L$ has a distinguished basis --the set of characters of $T_L$-- parametrized by $L^*$. \nFor nearly 10 years it was conjectured that this fact can be generalized for $\\cV$ using this notion of duality.\nIn order to state such a generalization, we consider the integral tropicalization of $\\cV^{\\vee}$, which we denote by $\\Trop_{\\Z}(\\cV^{\\vee})$.\nThe precise definition of $\\Trop_{\\Z}(\\cV^{\\vee})$\n can be found in \\S\\ref{ss:tropicalization}.\n For this introduction the key fact that we need is that a prime divisor $D$ on a variety birational to $\\cV^\\vee $ determines a point of $\\Trop_{\\Z}(\\cV^{\\vee})$ if $\\Omega $ has a pole along $D$.\nIn \\cite{FG_cluster_ensembles} Fock--Goncharov conjectured that $H^{0}(\\cV, \\mathcal{O}_\\cV)$ has a canonical vector space basis parametrized by $\\Trop_{\\Z}(\\cV^{\\vee})$.\nAlthough false in general, this conjecture does hold in many of the cases of wide interest.\nIn \\cite{GHKK} the authors linked this conjecture to the log Calabi--Yau mirror symmetry conjecture \\cite[Cnjecture 0.6]{GHK_logCY}, suggesting that the canonical basis proposed by Fock--Goncharov is the \\emph{theta basis}.\nAs we would like to be as close to toric geometry as possible we systematically assume that the full Fock--Goncharov conjecture holds for the cluster variety $ \\cV$ under consideration.\nSo, under under the assumption that the full Fock--Goncharov conjecture holds for $\\cV$, one may consider $\\Trop_{\\Z}(\\cV^{\\vee})$ as replacing $L^*$ and the characters of $T_L$ are replaced by the theta functions on $\\cV$.\nMoreover, the 
real vector space $L^*\\otimes \\R$ is replaced by the real tropicalization $ \\Trop_{\\R}(\\cV^{\\vee})$ and convex polyhedra inside $L^*\\otimes \\R$\nare replaced by positive sets in the real tropical space $\\Trop_{\\R}(\\cV^{\\vee})\\supset \\Trop_{\\Z}(\\cV^{\\vee})$ (see \\S\\ref{ss:tropicalization} and Definition \\ref{def:positive_set} for the definitions of $\\Trop_{\\R}(\\cV^{\\vee})$ and of positive set, respectively). \n\n\nBesides the trivial case where $\\cV$ is just a torus (and hence $\\cV^{\\vee}$ is just the dual torus), the tropical spaces $\\Trop_{\\Z}(\\cV^{\\vee})$ and $\\Trop_{\\R}(\\cV^{\\vee})$ do not possess a linear structure (there is no natural notion of addition in these spaces and only multiplication by positive scalars makes sense).\nHowever, in certain situations these tropical spaces do contain subsets where addition and scalar multiplication make sense, which we call \\emph{linear subsets}. \nIn any case, every choice of seed $\\seed^\\vee$ for $\\cV^{\\vee}$ gives rise to a bijection $\\mathfrak{r}_{\\seed^\\vee}:\\Trop_{\\R}(\\cV^{\\vee}) \\longrightarrow \\R^d$ that restricts to a bijection $\\Trop_{\\Z}(\\cV^{\\vee}) \\overset{\\sim}{\\longrightarrow} \\Z^d$, where $d$ is the dimension of both $\\cV$ and $\\cV^\\vee$. \nIn general, different seeds lead to different bijections. \nWhen we fix one such identification $\\mathfrak{r}_{\\seed^\\vee}$ and talk about linear subsets of $\\Z^d$ and positive subsets of $\\R^d$, what we mean is that the inverse image of such a set under $\\mathfrak{r}_{\\seed^\\vee}$ has the given property.\n\n\\subsection{Positive Newton--Okounkov bodies and minimal models} Newton--Okounkov bodies are convex closed sets in real vector spaces. Their systematic study was developed by Lazarsfeld--Musta\\c{t}\\u{a} \\cite{LM09} and Kaveh--Khovanskii \\cite{KK12} based on the work of Okounkov \\cite{Oko96,Oko03}. 
\nThis concept is a far reaching generalization of both the Newton polytope of a Laurent polynomial and the polytope of a polarized projective toric variety. \nIn \\cite{KK12} the authors introduced Newton--Okounkov bodies for Cartier divisors on irreducible varieties.\nIn this paper we consider Newton--Okounkov bodies associated to Weil divisors in the setting of minimal models for cluster varieties.\nMore precisely, let $D$ be a Weil divisor on a $d$-dimensional normal variety $Y$ admitting a non-zero global section, that is, the space $H^0(Y, \\mathcal{O}(D))$ is non-zero, where $\\mathcal{O}(D)$ is the coherent sheaf associated to $D$. \nThe section ring of $D$ is a graded ring \n\\[\nR(D)=\\bigoplus_{k\\in \\Z_{\\geq 0}}{R}_k(D) \n\\] \nwhose $k$-th homogeneous component is the vector space $R_k(D)=H^0(Y, \\mathcal{O}(kD)) \\subset \\Bbbk (Y)$. \nFix a non-zero element $\\tau\\in R_1(D)$, and suppose we are given a total order on $\\Z^d$ and a valuation $\\nu: \\Bbbk(Y)^* \\to \\Z^d$. \nThen the Newton--Okounkov body associated to this data is:\n\\eqn{\n\\Delta_\\nu(D,\\tau) := \\overline{\\conv\\Bigg( \\bigcup_{k\\geq 1}  \\lrc{\\frac{\\nu\\lrp{f/\\tau^k}}{k} \\mid f\\in R_k(D)\\setminus \\{0\\} } \\Bigg) }\\subseteq \\R^d.\n}\n\nGiven a cluster variety $\\cV$, our first goal is to use its cluster structure to construct Newton--Okounkov bodies associated to divisors in compactifications of $\\cV$, generalizing the construction of the polytope of a torus invariant divisors on a toric variety.\nHence, we need to establish the class of compactifications of $\\cV $, the divisors therein and the valuations we consider.\n\nWe begin discussing valuations obtained from the cluster structure. \nIn case $\\cV$ is a cluster $\\cA$-variety, this is closely related to the work of Fujita and Oya \\cite{FO20}. 
\nHowever, our approach includes the cases where $\\cV$ is a cluster $\\cX$-variety, a quotient of a cluster $\\cA$-variety, or a fibre of a cluster $\\cX$-variety.\nIn order to be able to use the cluster structure of $\\cV$ to construct a valuation on $ \\Bbbk(\\cV)$ certain conditions (depending on whether $\\cV$ is of type $\\cA$ or of type $\\cX$) need to be fulfilled. \nFor instances, if $\\cV$ is of type $\\cA$, a sufficient condition is that the rectangular matrix $\\widetilde{B}$ determining the cluster structure of $\\cV$ has full rank\\footnote{A weaker condition is enough to construct a valuation for a given seed, see Remark \\ref{rem:dom_order}.}; if $\\cV$ is of type $\\cX$ we need that the full Fock--Goncharov conjecture holds for $\\cX$ (as we are assuming), see \\S\\ref{sec:cluster_valuations} for more details, including the cases of quotients of $\\cA$ and fibres of $\\cX$.\nIn case the necessary conditions are satisfied then for every $\\seed$ for $\\cV$ we have a cluster valuation\n\\[\n\\nu_\\seed: \\Bbbk(\\cV) \\setminus\\{0\\} \\to (\\Z^d, <_{\\seed}).\n\\]\nThe total order $<_{\\seed}$ on  $\\Z^d$ depends also on the type of $\\cV$.\nMoreover, in case $\\cV$ is of type $\\cA$ in the literature this valuation is generally denoted by $\\gv_{\\seed} $ and called a $\\gv$-\\emph{vector valuation} as it is closely related to the $\\gv$-vectors associated to cluster monomials introduced in \\cite{FZ_clustersIV}.\nIn case $\\cV$ is of type $\\cX$ the associated cluster valuation has not been systematically defined yet in the literature to the best of our knowledge. \nIn this case we also denote $\\nu_{\\seed}$ by $\\cv_{\\seed}$ and call it a $\\cv$-\\emph{vector valuation} since this valuation is closely related to the $\\cv$-vectors associated to $Y$-variables introduced in \\cite{NZ} and more generally to {\\bf c}-vectors of theta functions on $\\cX$ defined in \\cite{BFMNC}, and currently investigated in \\cite{ML23}. 
\nIn any case, for every seed $\\seed$ the theta basis of $H^0(\\cV, \\mathcal{O}_{\\cV})$ is adapted for the cluster valuation $\\nu_{\\seed}$.\nIn particular, if $ Y$ is a variety birational to $\\cV $ and $D$ is a divisor in $Y$, then, upon a choice of non-zero section $\\tau \\in R_1(D)$ and a seed $\\seed$, we can construct a Newton--Okounkov body $\\Delta_{\\nu_\\seed}(D,\\tau) $. \nWe are primarily interested in conditions ensuring that such a Newton--Okounkov body is a positive set.\nOn the one hand this is a condition that needs to be satisfied if one seeks to reverse Gross--Hacking--Keel--Kontsevich's construction of a compactification of a cluster variety from a positive set.\nOn the other hand, we are further interested in describing how the change of seed affects the Newton--Okounkov body and positivity plays the key role in understanding this. \n If $\\Delta_{\\nu_\\seed}(D,\\tau)$ is positive then any other $\\Delta_{\\nu_{\\seed'}}(D,\\tau)$ is obtained from $\\Delta_{\\nu_\\seed}(D,\\tau)$ by a composition of tropicalized cluster transformations. \n This will be discussed in more detail in the next subsection of the introduction.\nIn order to be able to show that $\\Delta_{\\nu_\\seed}(D,\\tau) $ is positive we restrict the class of compactifications of $\\cV$, the divisors we consider, and the sections we choose.\n\nOne can define a partial minimal model for $\\cV$\\footnote{Throughout the text we consider more generally (partial) minimal models for schemes $V$ with a cluster structure given by a birational map $\\cV\\dashrightarrow V$.} is an inclusion $\\cV \\subset Y$ such that $Y$ is normal and $\\Omega$ has a simple pole along every irreducible divisorial component of the boundary $D=Y \\setminus \\cV$, see \\cite[Remark~1.3]{GHK_birational}. 
It is a minimal model if $Y$ is projective over $\\Bbbk$.\nThese are the kind of (partial) compactifications of $\\cV $ we consider.\nThe main reason for this is that any prime divisor supported on $D$ determines a primitive point of $\\Trop_{\\Z}(\\cV)$.  \nLet $D'$ be a divisor supported on $D$. \nWe say that $R(D')$ has a {\\it graded theta basis} if for each $k$ \nthe set of theta functions on $\\cV$ contained in $H^0(Y,\\mathcal{O}(kD'))$ forms a basis (see Definition~\\ref{def:graded_theta_basis}).\nThen we can prove the following result.\n\n\\begin{theorem*}\n(Theorem \\ref{NO_bodies_are_positive})\nLet $D'$ be a Weil divisor supported on the boundary $D$ of the minimal model $\\cV \\subset Y $ such that $R(D')$ has a graded theta basis. Let $\\tau\\in R_1(D')$ be such that $\\nu_{\\seed}(\\tau) $ belongs to a linear subset of $ \\Z^d$. Then the Newton--Okounkov body  $\\Delta_{\\nu_{\\seed}}(D',\\tau)$ is a positive polytope.\n\\end{theorem*}\n\nIn Lemma~\\ref{lem:graded_theta_basis} we provide sufficient conditions ensuring that $R(D)$ has a graded theta basis. 
\nMoreover, the work of Mandel \\cite{Man19} provides conditions ensuring that a line bundle on a cluster $\\cX$-variety has a graded theta basis.\n\nWe further study another setting where we can use cluster structures to construct Newton--Okounkov bodies and show that they are positive polytopes:\nsuppose that $Y$ is a normal projective variety such that its Picard group is free and finitely generated.\nThe universal torsor of $Y$ is a scheme $\\UT_Y$ whose ring of algebraic  functions is isomorphic to the direct sum of all the spaces of sections associated to all (isomorphism classes of) line bundles over $Y$.\nWe assume that $\\UT_Y$ is a partial minimal model of a cluster $\\cA$-variety, which we denote by $\\cA \\subset \\UT_Y$.\nFor example, we encounter this situation frequently in the study of homogeneous spaces, where moreover the ring of global functions on $\\UT_Y$ has a representation theoretic interpretation due to the Borel--Weil--Bott Theorem (Remark~\\ref{rmk:borel weil bott}).\nThis fact is commonly used when constructing Newton--Okounkov bodies in Lie theory, see e.g. 
\\cite{FFL15} and the references therein.\n\nLet $D_1, \\dots, D_s$ be the irreducible divisorial components of $D= \\UT_Y \\setminus \\cV$ and let $\\tf^{\\cA^{\\vee}}_{i}$ be the theta function on $\\cA^{\\vee}$ parametrized by the point in $\\Trop_{\\Z}(\\cA)$ associated to $D_i$.\nThe (theta) superpotential\\footnote{If $Y$ is Fano, then this should be considered as the superpotential used for mirror symmetry purposes.} associated to the inclusion $\\cA \\subset \\UT_Y$ is\n\\[\nW_{\\UT_Y} = \\sum_{i=1}^s \\tf^{\\cA^{\\vee}}_i.\n\\]\nThe associated superpotential cone is the subset $\\Xi_{\\UT_Y}$ of $\\Trop_{\\R}(\\cV^{\\vee})$ where the tropicalized superpotential takes non-negative values.\nGiven a choice of seed $ \\seed^\\vee$ for $\\cA^\\vee$, $\\Xi_{\\UT_Y}$ is identified with a polyhedral cone $\\Xi_{\\UT_Y, \\seed^\\vee}\\subset \\R^d$.\nAs discussed in \\cite{GHKK}, in many cases the integral points of $\\Xi_{\\UT_Y}$ parametrize the set of theta functions on $\\cA$ that extend to $\\UT_Y$. 
\nThis happens for example if $\\cA$ has \\emph{theta reciprocity} (see Definition \\ref{def:theta_reciprocity}), a condition that is conjectured to be true in situations more general than ours.\nEven stronger, in many of the examples arising in nature the integral points of $\\Xi_{\\UT_Y}$ parametrize a basis of $H^0(\\UT_Y, \\mathcal{O}_{\\UT_Y})$.\nIn \\cite{GHKK}, Gross--Hacking--Keel--Kontsevich give criteria ensuring that this is satisfied.\nThese conditions hold true in many cases of interest in representation theory, as was proven in several papers including \\cite{BF, FO20,  GKS_polyhedral,GKS_typeA, GKS_string, Mag20, SW18} and \\cite[\\S9]{GHKK}.\nMoreover, for special choices of seeds $\\seed^{\\vee} $, in these cases the cone $\\Xi_{\\UT_Y, \\seed^\\vee}$ agrees with known polyhedral cones such as the Gelfand--Tsetlin cone, string cones or the Knutson--Tao hive cone.\nMuch of the inspiration for this paper is due to the representation theoretic results that precede it.\nIn the case where the integral points of $\\Xi_{\\UT_Y}$ parametrize the set of theta functions on $\\cA$ that extend to $\\UT_Y$, we can restrict a {\\bf g}-vector valuation $\\gv_\\seed $ from $\\Bbbk(\\cA)$ to $H^0(\\UT_Y, \\mathcal{O}_{\\UT_Y})$. Therefore, given a line bundle $\\lb$ on $Y$, we can construct a Newton--Okounkov body $\\Delta_{\\gv_\\seed}(\\lb)$ in a similar way as before.\nIn order to show that $\\Delta_{\\gv_{\\seed}}(\\lb)$ is a positive polytope we need to consider torus actions on $ \\cA$ and fibrations of $\\cA^{\\vee}$ over a torus as we now explain. \n\nThe universal torsor $\\UT_Y$ is endowed with the action of the torus $T_{\\text{Pic}(Y)^*}$ associated to the dual of the Picard group of $Y$. 
\nWe first need this torus action to preserve $\\cA$ \nand the induced action on $\\cA$ to be cluster in the sense of \\thref{k_and_pic} (roughly speaking, this means that the restricted action can be identified with the action induced by the choice of a sublattice of the kernel of $\\widetilde{B}$).\nIn such situations we have a cluster fibration \n\\[\nw:\\cA^{\\vee}\\to T_{\\text{Pic}(Y)}.\n\\]\nRecall that the choice of seed gives rise to the identification  $\\mathfrak{r}_{\\seed^\\vee}:\\Trop_{\\R}(\\cA^{\\vee}) \\to \\R^d$. \nThe tropicalization of $w$ expressed using such an identification is a linear map $w^T:\\R^d \\to \\text{Pic}(Y)\\otimes \\R$. \nUnder the conditions above the Newton--Okounkov body $\\Delta_{\\gv_{\\seed}}(\\lb)$ can be described as a slicing of the superpotential cone. More precisely, we have the following result (see Definition \\ref{def:quotient_fibre}).\n\\begin{theorem*}\n(\\thref{thm:k_and_pic})\nAssume that the theta functions on $\\cV$  parametrized  by the integral points of $\\Xi_{\\UT_Y}$ form a basis of $H^0(\\UT_Y, \\mathcal{O}_{\\UT_Y})$. \nIf the action of $ T_{\\text{Pic}(Y)^*}$ restricts to a cluster action of $ T_{\\text{Pic}(Y)^*}$ on $ \\cA$ then for any class $[\\lb]\\in \\text{Pic}(Y) $ the Newton--Okounkov body $\\Delta_{{\\bf g}_{\\seed}}(\\lb)$ can be described as\n\\[\n\\Delta_{{\\bf g}_{\\seed}}(\\lb)=\\Trop_{\\R}(w)^{-1}([ \\lb ])\\cap \\Xi_{\\UT_Y, \\seed}.\n\\]\nIn particular, $\\Delta_{{\\bf g}_{\\seed}}(\\lb)$ is a positive subset of $\\Trop_{\\R}(\\cV^{\\vee})$ and $Y$ is a minimal model of the quotient of $\\cA$ by the action of $T_{\\text{Pic}(Y)^*}$.\n\\end{theorem*}\n\nThe case where $Y$ is the Grassmannian $\\text{Gr}_{n-k}(\\C^n)$ fits the framework above so it is possible to use the cluster $\\cA$ structure to construct Newton--Okounkov bodies associated to arbitrary line bundles over $\\text{Gr}_{n-k}(\\C^n)$. 
\nWe show that the Newton--Okounkov bodies we construct are unimodularly equivalent to the Newton--Okounkov bodies constructed for $\\text{Gr}_{n-k}(\\C^n)$ by Rietsch and Williams in \\cite{RW} using the cluster $\\cX$ structure on Grassmannians (see Theorem~\\ref{thm: val and gv}).\nMoreover, the flow valuations of \\cite{RW} are instances of $\\bf c$-vector valuations.\n\nThis comparison result already has interesting consequences related to toric degenerations:\n\\begin{enumerate}\n    \\item Given a rational polytopal Newton--Okounkov body $\\Delta$ for a (very ample) line bundle $\\lb$ over $Y$, Anderson's main result in \\cite{An13} applies and it yields a toric degeneration of $Y$ to a toric variety (whose normalization is) defined by $\\Delta$. As the semigroup algebras of the {\\bf g}-vector valuations are saturated, no normalization is necessary.\n    \\item The construction of Gross--Hacking--Keel--Kontsevich in \\cite[\\S8]{GHKK} associates to a positive polytope $P$ a minimal model $\\cV\\subset Y$ and moreover, using Fomin--Zelevinsky's principal coefficients, a toric degeneration of $Y$ to the toric variety defined by $P$.\n    As our Newton--Okounkov bodies are positive polytopes, this construction applies in our setting.\n\\end{enumerate}\nThe identification of the Newton--Okounkov bodies constructed by Rietsch--Williams and our Newton--Okounkov bodies constructed from {\\bf g}-vectors implies the following result.\n\n\\begin{theorem*}(Theorem~\\ref{thm: val and gv} and Remark~\\ref{rmk:toric degen})\nThe toric degenerations of $\\text{Gr}_{n-k}(\\C^n)$ determined by the Newton--Okounkov polytopes constructed by Rietsch--Williams using Anderson's result coincide with the toric degenerations of $\\text{Gr}_{n-k}(\\C^n)$ given by the Gross--Hacking--Keel--Kontsevich construction using principal coefficients.\n\\end{theorem*}\n\n\\subsection{The intrinsic Newton--Okounkov body} \nUnderstanding how Newton--Okounkov bodies change upon changing the valuation is an 
interesting problem that has attracted the attention of several authors, see for example \\cite{EH20, BMNC, FH21,CHM22,HN23}.\nSo let us return to the discussion on how the Newton--Okounkov bodies constructed above transform if we change the choice of seed.\nGiven any two seeds $\\seed $ and $\\seed'$ for $ \\cV^\\vee$ there is a piecewise linear bijection $\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'}):\\R^d \\to \\R^d$ relating the identifications of $\\Trop_{\\R}(\\cV^{\\vee})$ with $\\R^d$.\nMore precisely, we have a commutative diagram\n\\[\n\\xymatrix{\n&\\Trop_{\\R}(\\cV^{\\vee})\n\\ar_{\\mathfrak{r}_{\\seed}}[dl] \\ar^{\\mathfrak{r}_{\\seed'}}[dr] & \\\\\n\\R^d \\ar^{\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'})}[rr]&   & \\R^d.\n}\n\\]\nEvery map $\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'})$ restricts to a piecewise linear bijection of $\\Z^d$ and, \nby construction, the maps $\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'})$ are compositions of tropicalized cluster transformations for $\\cV^\\vee$ (see \\S\\ref{sec:intrinsic_NOB} for a more concise description).\nFor a subset $P\\subseteq \\Trop_{\\R}(\\cV^{\\vee})$ we let $P_{\\seed}=\\mathfrak{r}_{\\seed}(P)$.\nOne of the main properties behind our interest in showing that the Newton--Okounkov bodies we have constructed are positive sets is the following:\nif $P\\subseteq \\Trop_{\\R}(\\cV^{\\vee})$ is a positive set then $\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'})(P_{\\seed})=P_{\\seed'}$ for any two seeds $\\seed$ and $\\seed'$.\nIn particular, in this situation the entire collection of sets $\\{P_\\seed\\}_{\\seed}$ parametrized by the seeds for $\\cV^{\\vee}$ may be replaced by $P$, a single intrinsic object that can be used to recover any $P_\\seed$ in the family.\n\nIn the case where a Newton--Okounkov body $\\Delta_{\\nu_{\\seed}}$ (associated to a line bundle $\\lb$ or a pair $(D',\\tau)$ as in the previous subsection) is positive, any other Newton--Okounkov body 
$\\Delta_{\\nu_{\\seed'}}$ associated to the same data is also positive. \nIn this situation there is a single intrinsic object $\\Delta_{\\mathrm{BL}} \\subset\\Trop_{\\R}(\\cV^{\\vee})$ representing the entire collection $\\{ \\Delta_{\\nu_{\\seed}}\\}_{\\seed}$. \nWe call $\\Delta_{\\mathrm{BL}}$ the \\emph{intrinsic Newton--Okounkov body} (associated to the data we begin with).\nThe subindex $\\mathrm{BL}$ in $\\Delta_{\\mathrm{BL}}$ stands for \\emph{broken line}; the choice of this notation goes back to \\cite{CMNcpt}, where the last three authors of this paper introduce \\emph{broken line convexity}, a notion of convexity defined in a tropical space that ensures positivity. \nBroken lines are pieces of tropical curves in $\\Trop_{\\R}(\\cV^{\\vee})$ used to define theta functions on $\\cV$ and describe their multiplication (see \\S\\ref{sec:tf_and_parametrizations}).\nStraight line segments defining convexity in a linear space are replaced by broken line segments in the tropical space to define broken line convexity.\nThe main result of \\cite{CMNcpt} is that a closed set is broken line convex if and only if it is positive.\n\nIn the situations where we are able to show that $\\Delta_{\\nu_{\\seed}}\\subset \\R^d$ is positive, it turns out that it is moreover polyhedral, a property that fails in general, see e.g. \\cite{LM09,KLM_NObodies_spherical}. 
\nSince $\\Delta_{\\nu_{\\seed'}}=\\Trop_{\\R}(\\mu^{\\cV^\\vee}_{\\seed,\\seed'})(\\Delta_{\\nu_{\\seed}})$ any other $\\Delta_{\\nu_{\\seed'}}$ is also polyhedral.\nThe integral points of the convex bodies we consider are naturally associated to theta functions,\nwhich suggests the following question: does there exist a finite set of theta functions such that $\\Delta_{\\nu_{\\seed'}}$ is the convex hull of their images under $ \\nu_{\\seed'}$ for any seed $\\seed'$?\nSuch a collection of points might vary as we change seeds as exhibited in the case of the Grassmannians in an example in \\cite[\\S9]{RW} and generalized to an infinite family of examples in \\cite[Theorem~3]{bossinger2019full}.\nGiven the notion of broken line convexity, a slight reformulation of the question becomes more natural: \ndoes there exist a finite set of theta functions such that the broken line convex hull of their images under $\\nu_{\\seed'}$ is $\\Delta_{\\nu_{\\seed'}}$ for some (and hence any) seed $\\seed'$?\nIn fact, from the intrinsic Newton--Okounkov body perspective, the valuation is replaced by integral tropical points parametrizing theta functions and there is no reference to a seed at all.\nUsing this perspective, $\\Delta_{\\mathrm{BL}}$ becomes a broken line convex subset of $\\Trop_{\\R}(\\cV^{\\vee})$ whose integral points parametrize the theta basis of the first graded piece $R_1$ of the corresponding graded ring.\nIn \\thref{taut} we give sufficient conditions ensuring that $\\Delta_{\\mathrm{BL}}$ can be described as the broken line convex hull of a finite collection of points and describe this collection.\nApplying this result to the setting of Grassmannians we obtain that if $\\lb_e$ is the line bundle over $\\Grass_{n-k}(\\C^n)$ obtained by pullback of $\\mathcal{O}(1)$ under the Pl\\\"ucker embedding $\\Grass_{n-k}(\\C^n)\\hookrightarrow \\mathbb P^{\\binom{n}{k}-1}$ then the intrinsic Newton--Okounkov body $\\Delta_{\\mathrm{BL}}(\\lb_e)$ is the broken line convex 
hull of the ${\\bf g}$-vectors of the Pl\\\"ucker coordinates (Corollary~\\ref{cor:intrinsicNO grassmannian}).\n\nBroken line convexity also allows us to generalize the Newton polytope of a Laurent polynomial to the world of cluster varieties.\nIn particular, in \\S\\ref{sec:intrinsic_NOB} we introduce the \\emph{theta function analog of the Newton polytope} of $f$, for any $f\\in H^0(\\cV, \\mathcal{O}_\\cV)$.\nThe intrinsic Newton--Okounkov bodies $\\Delta_{\\mathrm{BL}}$ can be described using this notion.\nThe key idea is exploiting the bijection between the theta basis (a special case of an \\emph{adapted basis}) and the integral tropical points parametrizing it.\nThis idea is explained for full rank valuations with finitely generated value semigroup in the survey \\cite{B-toric}.\nIt is therefore interesting to continue studying this new class of objects.\n\n\\subsection{Organization of the paper}\nIn \\S\\ref{sec:background} we review background material on cluster varieties, their quotients and their fibres (\\S\\ref{sec:back_ghkk}), and on tropicalization (\\S\\ref{ss:tropicalization}).\nIn \\S\\ref{sec:tf_and_parametrizations} we recall the construction of cluster scattering diagrams and the theta functions on (quotients and fibres of) cluster varieties.\nIn \\S\\ref{sec:minimal_models} we elaborate on the existence of a theta basis on the ring of regular functions on a partial minimal model of (a quotient or a fibre of) a cluster variety. This section largely follows \\cite{GHKK}.\nIn \\S\\ref{sec:cluster_valuations} we recall the {\\bf g}-vector valuations for (quotients of) $\\cA$-varieties. We introduce {\\bf c}-vector valuations for (fibres of) $\\cX$-varieties.\nThe main results of the paper are contained in \\S\\ref{sec:no}. 
The study of Newton--Okounkov bodies associated to Weil divisors on minimal models is treated in \\S\\ref{sec:NO_bodies} while the Newton--Okounkov bodies for line bundles are treated in \\S\\ref{sec:universal_torsors}.\n The intrinsic Newton--Okounkov body and the wall-crossing phenomenon for these are addressed in \\S\\ref{sec:intrinsic_NOB}.\n Finally, in \\S\\ref{sec:NO_Grass} we apply the results of the previous section to Grassmannians. \n One of the main technical conditions to be satisfied is verified in \\S\\ref{sec:Pic_property}.\n In \\S \\ref{sec:GHKK_and_RW} we prove a unimodular equivalence between the Newton--Okounkov bodies we construct and those constructed by Rietsch--Williams in \\cite{RW}. \n In \\S\\ref{sec:Grass_intrinsic} we describe the intrinsic Newton--Okounkov bodies for Grassmannians as the broken line convex hull\n of the {\\bf g}-vectors of Pl\\\"ucker coordinates (in arbitrary seeds).\n \n\n \n\\subsubsection*{Acknowledgements} \nThe authors L. Bossinger and A. Nájera Chávez were partially supported by PAPIIT project IA100122 dgapa UNAM 2022 and by CONACyT project CF-2023-G-106.\nM. Cheung was supported by World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan.\nT. Magee was supported by EPSRC grant EP/V002546/1.\n\n\\section{Preliminaries}\\label{sec:background}\n\n\n\\subsection{Cluster varieties, quotients and fibres}\\label{sec:back_ghkk}\nWe briefly recall the construction of cluster varieties, their quotients and their fibres. The reader is invited to consult \\cite{GHK_birational,GHKK} for the details we shall omit in this section. \n\nUnless otherwise stated, all tensor products are taken with respect to $\\Z$. Moreover, given a lattice $L$ we denote by $L^*:= \\Hom(L,\\Z)$ its $\\Z$-dual and let $ \\langle \\cdot , \\cdot \\rangle: L\\times L^* \\to \\Z$ be the canonical pairing given by evaluation. \nWe further denote by $L_\\R:= L \\otimes \\R$ the real vector space associated to $L$. 
We fix an algebraically closed field $\\Bbbk$ of characteristic $0$ and let $T_L:= \\text{Spec}(\\Bbbk [L^*])$ be the algebraic torus whose character lattice is $L^*$. \n\n\\subsubsection{Cluster varieties and their dualities}\n\\label{sec:cluster_var}\n\n\nThe {\\bf fixed data} $\\Gamma$ consist of the following:\n\\begin{itemize}\n    \\item a finite set $I$ of {\\bf directions} and a distinguished subset $\\Iuf \\subseteq I$ of {\\bf mutable} (or {\\bf unfrozen}) {\\bf directions}. Elements of $I \\setminus \\Iuf $ are the {\\bf frozen directions};\n    \\item a lattice $N$ of rank $|I|$ together with a saturated sublattice $N_{\\text{uf}}\\subseteq N$ of rank $|I_{\\text{uf}}|$; \n    \\item a skew-symmetric bilinear form $\\{ \\cdot , \\cdot \\} : N \\times N \\rightarrow \\Q$;\n    \\item a finite index sublattice $N^\\circ \\subseteq N$ such that $\\{ N, \\Nuf \\cap N^{\\circ}\\}\\subset \\Z$ and $\\{ \\Nuf, N^{\\circ} \\}\\subset \\Z$;\n    \\item a collection of positive integers $\\{d_i\\}_{i \\in I}$ with greatest common divisor $1$;\n    \\item the dual lattices $M = \\Hom (N, \\Z)$ and $M^{\\circ}=\\Hom(N^{\\circ},\\Z)$.\n\\end{itemize}\nA ${\\bf seed}$ for $\\Gamma$ is a tuple $\\seed := ( e_i )_{i \\in I}$ such that $\\{ e_i \\}_{i\\in I}$ is a basis for $N$, $\\{e_i\\}_{i \\in \\Iuf}$ is a basis for $\\Nuf$ and $\\{d_i e_i \\}_{i \\in I } $ is a basis for $N^{\\circ}$. \nWe let $f_i := {d_i}^{-1} e_i^*$ and observe that $\\{f_i\\}_{i\\in I}$ is a basis of $M^{\\circ}$.\nFor $i,j\\in I$ we write $\\epsilon_{ij}:= \\lbrace e_{i},d_j e_{j} \\rbrace$ and define the matrix $\\epsilon=(\\epsilon_{ij})_{i,j\\in I}$. \nWhen we work with various seeds at the same time we introduce labels of the form $e_{i;\\seed}$, $f_{i;\\seed}$, $\\epsilon_{\\seed}=(\\epsilon_{ij;\\seed})$, etc. to distinguish the data associated to $\\seed $. 
We can {\\bf mutate} a seed $\\seed=(e_i)_{i\\in I}$ in a mutable direction $k\\in \\Iuf$ to obtain a new seed $\\mu_k(\\seed)=(e'_i)_{i\\in I}$ given by\n\\begin{equation}\n\\label{e_mutation}\ne_i':=\\begin{cases} e_i+[\\epsilon_{ik}]_+e_k & i\\neq k,\\\\\n-e_k&i=k,\n\\end{cases}\n\\end{equation}\nwhere $[x]_+:= \\text{max}(0,x)$ for $x \\in \\R$. \n\nLet $r:=|\\Iuf|$ and let $\\mathbb{T}_r$ denote the $r$-regular tree whose edges are labeled by the elements of $\\Iuf$.\nWe refer to $r$ as the {\\bf rank} and fix it once and for all.\nBy a common abuse of notation, the set of vertices of this tree is also denoted by $\\mathbb T_r$. \nWe fix once and for all a distinguished vertex $v_0\\in \\T_r$ and let $\\orT$ be the unique orientation of $ \\T_r$ such that the $r$ edges incident to $v_0$ are oriented in outgoing direction from $v_0$, and every vertex different from $v_0$ has one incoming edge and $r-1$ outgoing edges.\nWe write $v\\overset{k}{\\longrightarrow}v'\\in \\orT$ to indicate that the edge between the vertices $v,v'$ of $\\orT$ is oriented from $v$ to $v'$ and is labeled by $k$.\n\nFix once and for all a seed $\\seed_0=(e_i\\mid i \\in I)$ and call it the {\\bf initial seed}.\nTo every vertex $v\\in \\T_r$ we attach a seed $\\seed_v$ as follows: \nwe let $\\seed_{v_0}=\\seed_0$, and if $v\\overset{k}{\\longrightarrow}v'\\in \\orT$ then $\\seed_{v'}=\\mu_k(\\seed_{v})$.\nFor simplicity we write $\\seed\\in \\orT$ if $\\seed=\\seed_v$ for some $v\\in \\orT$.\n\nFor every seed $\\seed=(e_{i;\\seed}\\mid i\\in I)\\in \\orT$ we introduce the {\\bf seed tori} $\\cA_{\\seed} = T_{N^{\\circ}} $ and $ \\cX_{\\seed} = T_{M}$ which are endowed with the {\\bf cluster coordinates} $\\{A_{i;\\seed} := z^{f_{i;\\seed}}\\}_{i \\in I}$ and $\\{X_{i;\\seed} := z^{e_{i;\\seed}}\\}_{i \\in I}$, respectively. 
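\n\nTo illustrate the mutation rule \\eqref{e_mutation}, here is a minimal worked example under assumptions that are ours and serve only as an illustration: take $I=\\Iuf=\\{1,2\\}$, $N=N^{\\circ}=\\Z^2$, $d_1=d_2=1$ and $\\{e_1,e_2\\}=1$, so that $\\epsilon_{12}=1$ and $\\epsilon_{21}=-1$. Mutating $\\seed=(e_1,e_2)$ in direction $k=2$ gives\n\\[\ne_1'=e_1+[\\epsilon_{12}]_+e_2=e_1+e_2, \\qquad e_2'=-e_2,\n\\]\nand the matrix of the new seed has entries\n\\[\n\\epsilon_{12}'=\\{e_1+e_2,-e_2\\}=-1, \\qquad \\epsilon_{21}'=\\{-e_2,e_1+e_2\\}=1.\n\\]\nMutating in direction $k=1$ instead gives $e_1'=-e_1$ and $e_2'=e_2+[\\epsilon_{21}]_+e_1=e_2$, since $[-1]_+=0$, and again $\\epsilon_{12}'=\\{-e_1,e_2\\}=-1$. In both cases the sign of $\\epsilon$ is reversed.\n\n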
The {\\bf $\\cA$-cluster transformation} associated to $\\seed$ and $k \\in \\Iuf$ is the birational map\n$\n\\mu^{\\cA}_{k}:\\cA_{\\seed} \\dashrightarrow \\cA_{\\mu_k(\\seed)}\n$\nspecified by the pullback formula\n\\begin{equation}\n\\label{A_mut}\n(\\mu^{\\cA}_{k})^*(z^m):=z^{m} (1+z^{v_{k;\\seed}})^{-\\langle d_k e_{k;\\seed},m\\rangle} \\ \\ \\text{ for }m\\in M^{\\circ},\n\\end{equation}\nwhere $v_{k;\\seed}:=\\{e_{k;\\seed}, \\cdot \\}\\in M^{\\circ}$. Similarly, the {\\bf $\\cX$-cluster transformation} associated to ${\\vb s}$ and $k$ is the birational map\n$\n\\mu^{\\cX}_{k}:\\cX_{\\seed} \\dashrightarrow \\cX_{\\mu_k(\\seed)} \n$\nspecified by the pull-back formula\n\\begin{equation}\n\\label{X_mut}\n(\\mu^{\\cX}_{k})^*(z^n):=z^{n} (1+z^{e_{k;\\seed}})^{-[ n,e_{k;\\seed} ]}\\ \\ \\text{ for }n\\in N,\n\\end{equation}\nwhere $[\\cdot, \\cdot]:N\\times N \\to \\Q$ is the bilinear form determined by setting $[e_i,e_j]=\\lrc{e_i, d_je_j}$. \n\n\nFor seeds $\\seed, \\seed'\\in \\orT$ connected by iterated mutation in a sequence of directions $k_1, \\dots, k_s\\in \\Iuf$, we let $\\mu^{\\cA}_{\\seed, \\seed'} $ (resp. $\\mu^{\\cX}_{\\seed, \\seed'} $) be the composition of cluster transformations in the same sequence of directions and in the same order. \nA birational transformation of the form $\\mu^{\\cA}_{\\seed, \\seed'}$ (or $\\mu^{\\cX}_{\\seed, \\seed'}$) can be used to glue its domain and range by identifying the largest open subschemes where the transformation is an isomorphism. \nWe use this kind of gluing to define cluster varieties. 
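\n\nAs a sanity check of \\eqref{A_mut} and \\eqref{X_mut}, consider the following minimal sketch (the specific data are our assumption, chosen only for illustration): $I=\\Iuf=\\{1,2\\}$, $N=N^{\\circ}=\\Z^2$, $d_1=d_2=1$, $\\{e_1,e_2\\}=1$ and $k=1$. Then $\\mu_1(\\seed)=(-e_1,e_2)$, so $f_{1;\\mu_1(\\seed)}=-f_1$ and $f_{2;\\mu_1(\\seed)}=f_2$, while $v_{1;\\seed}=\\{e_1,\\cdot\\}=f_2$. Writing $A_i=z^{f_i}$, $X_i=z^{e_i}$ for the cluster coordinates of $\\seed$ and $A_i'$, $X_i'$ for those of $\\mu_1(\\seed)$, the pullback formulas give\n\\[\n(\\mu^{\\cA}_1)^*(A_1')=z^{-f_1}(1+z^{f_2})^{\\langle e_1,f_1\\rangle}=\\frac{1+A_2}{A_1}, \\qquad (\\mu^{\\cA}_1)^*(A_2')=A_2,\n\\]\n\\[\n(\\mu^{\\cX}_1)^*(X_1')=z^{-e_1}=X_1^{-1}, \\qquad (\\mu^{\\cX}_1)^*(X_2')=z^{e_2}(1+z^{e_1})^{-[e_2,e_1]}=X_2(1+X_1),\n\\]\nwhere we used $[e_1,e_1]=0$ and $[e_2,e_1]=\\{e_2,e_1\\}=-1$. The first identity is the familiar exchange relation $A_1A_1'=1+A_2$. Birational maps of this form are the ones used to glue seed tori.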
\nMore precisely, the cluster $ \\cA$-variety associated to $\\Gamma$ and $\\seed_0$ is\n\\[\n\\cA_{\\Gamma,\\seed_0}:=\\bigcup\\limits_{\\seed\\in \\orT} \\cA_{\\seed}/ \\left( \\text{gluing by } \\mu^{\\cA}_{\\seed', \\seed''} \\right)_{\\seed',\\seed''\\in\\orT}.\n\\]\nThe cluster $ \\cX$-variety associated to $\\Gamma$ and $\\seed_0$ is\n\\[\n\\cX_{\\Gamma,\\seed_0}:=\\bigcup\\limits_{\\seed\\in \\orT} \\cX_{\\seed}/ \\left( \\text{gluing by } \\mu^{\\cX}_{\\seed', \\seed''} \\right)_{\\seed',\\seed''\\in\\orT}.\n\\]\n\nFrom now on an element $\\seed\\in \\orT$ will be referred to as a seed for $\\cA$ (or $\\cX$).\nIt is important to recall that declaring another $\\seed \\in \\orT$ as an initial seed gives rise to isomorphic cluster varieties.\nWe fix the pair $(\\Gamma,\\seed) $ once and for all and denote $\\cA_{\\Gamma, \\seed_0}$ (resp. $\\cX_{\\Gamma, \\seed_0}$) simply by $\\cA$ (resp. $\\cX$). \n\n\n\\subsubsection{Quotients of $\\cA$-varieties and fibres of $\\cX $-varieties}\\label{sec:quotients-fibres} \n\nLet $N^{\\perp}_{\\text{uf}}:= \\{ m\\in M \\mid \\langle n, m \\rangle=0 \\ \\forall \\  n\\in N_{\\text{uf}} \\} $.\nIn particular, $ M/ N^{\\perp}_{\\uf}\\cong (N_{\\text{uf}})^*$. \nBy a slight abuse of notation we also write $ M^{\\circ}/ \\Nuf^{\\perp}$. Here $\\Nuf^\\perp$ is taken in $M^\\circ$ rather than $M$, so $M^{\\circ}/ \\Nuf^{\\perp}$ is torsion free.\nSince $\\{ N_{\\text{uf}},N \\}\\subseteq \\Z$ the following homomorphisms are well defined\n\\begin{align} \\label{eq:p12star}\n  \\begin{matrix}\n    p_1^*: & N_{\\uf} & \\rightarrow & M^\\circ &\\qquad \\phantom{aaaaa} \\qquad \\qquad & p_2^* : & N & \\rightarrow& M^{\\circ}/ N^{\\perp}_{\\uf}. 
\\\\\n    & n &\\mapsto & \\{ n, \\cdot \\}\n  & \\qquad  & &  n &\\mapsto &  \\{ n, \\cdot \\} + \\Nuf^{\\perp}\n  \\end{matrix}\n\\end{align}\nThe matrix representing $p_2^*$ with respect to a seed $\\seed\\in \\orT$ is the \\emph{extended exchange matrix} $\\widetilde{B}_{\\seed}$ of \\cite{FZ_clustersIV}.\n\n\\begin{definition}\\thlabel{def:p-star}\nA {\\bf cluster ensemble lattice map} for $\\Gamma$ is a homomorphism $p^*: N \\to M^\\circ$ such that $p^*|_{N_{\\text{uf}}} = p^*_1$ and the composition $N \\overset{p^*}{\\longrightarrow} M^\\circ \\twoheadrightarrow M^{\\circ}/ N^{\\perp}_{\\uf}$ agrees with $p_2^*$, where $ M^\\circ \\twoheadrightarrow M^{\\circ}/ N^{\\perp}_{\\uf}$ denotes the canonical projection. Note that different choices of $p^*$ differ by a homomorphism $N/ N_{\\uf} \\rightarrow  N^{\\perp}_{\\uf} $.\n\\end{definition}\n\nIn other words, given a seed $\\vb s $, the $|I|\\times|I| $ square matrix $B_{p^*;\\vb{s}}$ associated to a cluster ensemble lattice map $p^*$ with respect to the bases $(e_i)_{i\\in I}$ and $(f_i)_{i\\in I}$ satisfies \n\\begin{equation}\\label{eq:Mp*}\nB_{p^*;\\vb{s}} - \\epsilon^{\\rm{tr}}_\\seed=\n\\lrb{\\begin{matrix}\n0 & 0 \\\\\n0 & \\ast\n\\end{matrix}},\n\\end{equation}\nwhere the $0$ entries represent the blocks $\\Iuf\\times\\Iuf$, $\\Iuf\\times (I\\setminus \\Iuf)$, and $(I\\setminus \\Iuf)\\times \\Iuf$,\nand the $\\ast$ entry indicates that the $(I\\setminus \\Iuf)\\times(I\\setminus \\Iuf)$ block has no constraints.\nEvery cluster ensemble lattice map $p^*:N\\to M^{\\circ}$ commutes with mutation. Therefore, $p^*$ gives rise to a {\\bf cluster ensemble map}\n\\[\np:\\cA \\to \\cX.\n\\]\n\nThe map $p:\\cA \\to \\cX$ yields both, torus actions on $\\cA$ and fibrations of $\\cX$ over a torus, as we explain subsequently.\nLet\n\\begin{equation}\\label{eq:define K}\nK=\\ker(p_2^*)=\\lrc{k\\in N\\mid \\{k,n\\}=0\\,\\forall\\, n\\in N_{\\rm uf}^\\circ} \\quad \\text{and} \\quad K^{\\circ}=K \\cap N^{\\circ}. 
\n\\end{equation}\nTo obtain an action on $\\cA$ we consider  a saturated sublattice \n\\[H_{\\cA} \\subseteq K^\\circ.\n\\]\nThe inclusion $ H_{\\cA} \\hookrightarrow N^\\circ$ gives rise to an inclusion $T_{H_{\\cA}}\\hookrightarrow T_{N^{\\circ}}$\nas a subgroup.\nSince $p^*$ commutes with mutation and $H_{\\cA}\\subseteq K $ we have a non-canonical inclusion \n\\[\nT_{H_{\\cA}}\\hookrightarrow \\cA.\n\\]\nThe action of $ T_{H_{\\cA}} $ on $T_{N^\\circ}$ given by multiplication extends to a free action of $T_{H_{\\cA}} $ on $\\cA$ and gives rise to  \na geometric quotient $\\cA \\to \\cA/T_{H_{\\cA}}$.\nThe scheme $\\cA/T_{H_{\\cA}}$ is obtained by gluing tori of the form  $T_{N^{\\circ}/H_{\\cA}}\\cong T_{N^{\\circ}}/T_{H_{\\cA}}$; the gluing is induced by the $\\cA$-mutations used to glue the seed tori for $\\cA$.\nMore precisely, for every seed $\\seed$ for $\\cA$ we let $(\\cA/T_{H_{\\cA}})_{\\seed}$ be a copy of the torus $ T_{N^{\\circ}/H_{\\cA}} $. \nFor $k\\in \\Iuf$ the mutation   $\\mu^{\\cA/T_{H_\\cA}}_{k}: (\\cA/T_{H_{\\cA}})_{\\seed}  \\dashrightarrow  (\\cA/T_{H_{\\cA}})_{\\mu_k(\\seed)}$ is given by\n\\begin{equation}\n\\label{A/T_mut}\n\\lrp{\\mu^{\\cA/T_{H_{\\cA}}}_{k}}^*(z^m):=z^{m} (1+z^{v_{k;\\seed}})^{-\\langle d_k e_{k;\\seed},m\\rangle} \\ \\ \\text{ for }m\\in H_{\\cA}^{\\perp}.\n\\end{equation}\nLet $\\mu^{\\cA/T_{H_{\\cA}}}_{\\seed, \\seed'}$ denote the composition of mutations determined by the path in $\\orT$ connecting $\\seed, \\seed'\\in \\orT$. Then \n\\[\n\\cA/T_{H_{\\cA}}:=\\bigcup\\limits_{\\seed\\in \\orT} (\\cA/T_{H_{\\cA}})_{\\seed}/ \\left( \\text{gluing by } \\mu^{\\cA/T_{H_{\\cA}}}_{\\seed', \\seed''} \\right)_{\\seed', \\seed'' \\in \\orT}.\n\\]\n\nTo obtain the fibration of $\\cX$ over a torus we consider a saturated sublattice \n\\[\nH_{\\cX} \\subseteq K.\n\\]\nThe inclusion $H_{\\cX} \\hookrightarrow N$ induces a surjection $T_M:= \\Spec(\\Bbbk[N]) \\to \\Spec(\\Bbbk[H_{\\cX}])=:T_{H_{\\cX}^*}$. 
This extends to a globally defined map \n\\begin{equation}\n\\label{eq:weight_map}\n    w_{H_{\\cX}}:\\cX\\to T_{H^*_\\cX}.\n\\end{equation}\n\n\n\\begin{remark}\nThe subindex $\\cV $ in the lattice $H_{\\cV}$ stands for the cluster variety $\\cV$ for which the choice of sublattice is relevant. \nWhen there is no risk of confusion, we drop the subindex $\\cV$ from $H_{\\cV}$ (see the end of \\S\\ref{sec:FG_dual}).\n\\end{remark}\n\nWe let $\\cX_{\\phi}$ be the fibre of the map \\eqref{eq:weight_map} over a closed point $\\phi\\in T_{H^*_{\\cX}}$.\nIn this work we mainly focus on the fibre $\\cXeH$, where ${\\bf 1}_{T_{H^*_{\\cX}}}\\in T_{H^*_{\\cX}}$ is the identity element. When there is no risk of confusion on the fibration we are considering we will denote this scheme simply by $\\cXe$.\n\nThe fibre $\\cXe$ is obtained by gluing tori isomorphic to $T_{H^\\perp_{\\cX}}$ via the restrictions of the $\\cX$-mutations used to glue the seed tori for $\\cX$ (see \\cite[\\S4]{GHK_birational} for a detailed treatment of this construction).\nAs in the previous situations, we have a description of the form \n\\[\n\\cXe:=\\bigcup\\limits_{\\seed\\in \\orT} (\\cXe)_{\\seed}/ \\left( \\text{gluing by } \\mu^{\\cXe}_{\\seed', \\seed''} \\right)_{\\seed', \\seed'' \\in \\orT},\n\\]\nwhere $(\\cXe)_{\\seed}$ is a torus isomorphic to $T_{H_{\\cX}^\\perp}$, $\\mu^{\\cXe}_{k}: (\\cXe)_{\\seed}  \\dashrightarrow  (\\cXe)_{\\mu_k(\\seed)}$ is given by  \n\\begin{equation}\n\\label{X_phi_mut}\n\\lrp{\\mu^{\\cXe}_{k}}^*(z^{n+H_{\\cX}}):=z^{n+H_{\\cX}}(1+z^{e_{k;\\seed}+H_{\\cX}})^{-[ n,e_{k;\\seed} ]}\\ \\ \\text{ for } n+H_{\\cX} \\in N/H_{\\cX}\n\\end{equation}\nand $\\mu^{\\cXe}_{\\seed,\\seed'}$ is defined as for the other varieties we have introduced so far.\n\n\\begin{definition}\n\\label{def:quotient_fibre}\nA variety of the form $\\cA/T_{H_{\\cA}} $ is referred to as a {\\bf quotient of $\\cA$}. A variety of the form $\\cXe$ is referred to as a {\\bf fibre of $\\cX$}. 
\nA {\\bf cluster action} on $\\cA$ is the action of a torus of the form $T_{H_{\\cA}}$.\n\\end{definition}\n\nLet $T$ be an algebraic torus endowed with a set of coordinates $z_1, \\dots , z_r$ and let $\\omega_T$ be its canonical bundle. \nA {\\bf volume form} on $T$ is a nowhere vanishing form in $H^0(T, \\omega_T) $.\nThe {\\bf standard volume form} on $T$ is (any non-zero scalar multiple of) \n\\[\n\\Omega_T= \\frac{dz_1 \\wedge \\dots \\wedge dz_r}{z_1 \\cdots z_r}.\n\\]\n\n\\begin{definition}\nA {\\bf log Calabi–Yau pair} $(Y, D)$ is a smooth complex projective variety $Y$ together with a reduced normal crossing divisor $D\\subset Y$ such that $K_Y+D=0$. \nWe say a scheme $V$ is log Calabi–Yau if there exists a log Calabi–Yau pair $(Y,D)$ such that $V$ is $Y \\setminus D$ up to codimension 2.\n\\end{definition}\n\nIt follows from \\cite{Iitaka} that any log Calabi--Yau variety $V$ is endowed with a holomorphic volume form $\\Omega_V$ (\\ie a nowhere vanishing holomorphic top form), unique up to scaling, which has at worst a simple pole along each component of $D$ for any such $(Y,D)$.  See \\cite{GHK_birational} for further details.\n\n\nAs explained in \\cite[\\S1]{GHK_birational} both $ \\cA$ and $\\cX$ are log Calabi--Yau, the key point being that these schemes are obtained by gluing tori via birational maps that preserve the standard volume form on each seed torus (endowed with cluster coordinates). \nFor the same reason, the schemes of the form $\\cA/T_{H_{\\cA}}$ and $\\cX_{\\phi}$ are also log Calabi--Yau.\nThe canonical volume form on $\\cA/T_{H_{\\cA}}$ (resp. $\\cX_{\\phi}$) is induced by (resp. the restriction of) the canonical volume form of $\\cA$ (resp. 
$\\cX$).\n\n\n\n\\subsubsection{Principal coefficients, $\\cX$ as a quotient of $\\cAp $ and $\\cA$ as a fibre of $\\cXp$}\n\\label{sec:principal_coefficients}\nFor the fixed data $\\Gamma=\\lrp{I, \\Iuf, N,N^{\\circ}, M, M^{\\circ}, \\{ \\cdot, \\cdot \\}, \\{d_i\\}_{i\\in I} }$, we consider its principal counterpart \n\\[\n\\Gamma_{\\prin}=\\lrp{I_{\\prin}, (I_{\\prin})_{\\text{uf}}, N_{\\prin}, N_{\\prin}^{\\circ}, M_{\\prin}, M^{\\circ}_{\\prin}, \\{ \\cdot, \\cdot \\}_{\\prin}, \\{d_i\\}_{i\\in I_{\\prin}} },\n\\]\nwhere the index set $I_\\prin$ is the disjoint union of two copies of $I$, its subset $(I_\\prin)_{\\text{uf}}$ is the set $\\Iuf$ thought of as a subset of the first copy of $I$, \n\\[\n       N_{\\prin} = N \\oplus M^\\circ, \\quad  N_{\\prin}^{\\circ}= N^{\\circ}\\oplus M, \\quad (N_{\\prin})_{\\text{uf}}=\\Nuf \\oplus 0, \\quad M_{\\prin} = M \\oplus N^\\circ, \\quad M_{\\prin}^{\\circ}=M^{\\circ}\\oplus N.\n\\]\nFor $i \\in I_{\\prin}$ belonging to either the first or second copy of $I$, the corresponding integer in the tuple $\\{d_i \\mid i\\in I_{\\prin}\\}$ is equal to the integer indexed by $i$ for $\\Gamma$, and \n\\[\n\\{(n_1,m_1),(n_2,m_2)\\}_{\\prin}= \\{n_1, n_2\\} + \\langle n_1,m_2 \\rangle - \\langle n_2,m_1 \\rangle.\n\\]\nRecall that $\\seed_0=(e_i)_{i \\in I} $ is the initial seed for $\\Gamma$. Then the initial seed for $\\Gamma_{\\prin}$ is ${\\seed_0}_{\\prin}=\\lrp{(e_i,0),(0,f_i)}_{i\\in I}$.\nSince $\\Gamma$ and $\\seed_0$ were already fixed, we denote the cluster variety $\\cA_{\\Gamma_{\\prin},{{\\seed{_0}}_{\\prin}}}$ (resp. $\\cX_{\\Gamma_{\\prin},{{\\seed{_0}}_{\\prin}}}$) simply by $\\cAp$ (resp. $\\cX_{\\prin}$). 
It is moreover worth pointing out that $\\cAp$ is in fact independent of the choice of initial seed $\\seed_0$ as explained in \\cite[Remark B.8]{GHKK}.\n\nIn \\cite{GHK_birational} the authors show that the scheme $\\cX$ can be described as a quotient of $\\cAp$ in the sense of Definition \\ref{def:quotient_fibre}. \nTo obtain such a description we need to choose a cluster ensemble lattice map $p^*:N \\to M^{\\circ}$ for $\\Gamma$. \nThis choice determines the cluster ensemble map \n\\begin{equation}\n\\label{eq:def_p_prin}\np_{\\prin}: \\cAp \\to \\cXp.\n\\end{equation}\nThe map $p_{\\prin}$ is induced by the cluster ensemble lattice map\n\\begin{align*}\np_{\\prin}^*:N_{\\prin} &\\to \\Mpc\\\\\n(n,m) &\\mapsto \\lrp{p^*(n)-m,n}\n\\end{align*}\nfor $\\Gamma_\\prin$.\nSet $K_{\\prin}:=\\ker(p_{\\prin,2}^*)$ and $ K_{\\prin}^\\circ:= K_{\\prin}\\cap N^\\circ_\\prin$, where $p_{\\prin,2}^*$ corresponds to the map $p_2^*$ in \\eqref{eq:p12star} for $\\Gamma_{\\prin}$.\nWe let\n\\begin{equation}\n\\label{eq:H_Aprin} \n  H_{\\cAp}:= \\lrc{\\left.\\lrp{n,-(p^*)^*(n)}\\in N^\\circ_\\prin \\, \\right| \\, n \\in N^\\circ}.  \n\\end{equation}\n\nIt is straightforward to verify that $H_{\\cAp}$ is a saturated sublattice of $K^\\circ_\\prin$ that is isomorphic to $N^\\circ$.\nIn particular, we have a quotient $\\cAp/ T_{H_{\\cAp}}$ endowed with an atlas of seed tori isomorphic to $T_M $ (indeed, $T_{N^\\circ_{\\prin}}/T_{H_{\\cAp}}\\cong T_{N^\\circ \\oplus M}/T_{N^\\circ}\\cong T_M$). 
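\nFor the reader's convenience we sketch the saturation check (a routine argument, spelled out here under the conventions above): if $k\\cdot(n,m)\\in H_{\\cAp}$ for some $(n,m)\\in N^\\circ_{\\prin}$ and $k\\in \\Z_{>0}$, then\n\\[\nkm=-(p^*)^*(kn)=-k\\,(p^*)^*(n),\n\\]\nand since $M$ is torsion free we get $m=-(p^*)^*(n)$, that is, $(n,m)\\in H_{\\cAp}$. Hence $H_{\\cAp}$ is saturated in $N^\\circ_{\\prin}$, and therefore in any sublattice containing it. The isomorphism $H_{\\cAp}\\cong N^\\circ$ is the projection onto the first summand.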
\nThere is an isomorphism \n\\begin{equation}\n    \\label{eq:def_chi}\n    \\chi : \\cAp/T_{H_{\\cAp}}\\overset{\\sim}{\\longrightarrow} \\cX\n\\end{equation}\nrespecting the cluster tori of domain and range.\nThe restriction of $\\chi $ to a seed torus is a monomial map whose pullback is given by\n\\eqn{\n\\chi^*: N &\\to (H_{\\cAp})^\\perp \n\\\\\nn &\\mapsto (p^*(n),n).\n}\nThere is also a surjective map \n\\begin{equation}\n\\label{eq:def_tilde_p}\n    \\tilde{p}:\\cAp \\to \\cX.\n\\end{equation}\nrespecting seed tori.\nThe restriction of $\\tilde{p} $ to a seed torus is a monomial map whose pullback is given by\n\\begin{align*}\n    \\tilde{p}^*:  N &\\to \\Mpc\\\\\n       \\ \\ n &\\mapsto  (p^*(n),n).\n\\end{align*}\nIn particular, we have $ \\tilde{p}= \\chi\\circ \\varpi$, where \n\\begin{equation}\n    \\label{eq:def_varpi}\n\\varpi: \\cAp \\to \\cAp/T_{H_{\\cAp}} \n\\end{equation}\nis the canonical projection. \n\nIt is also possible to describe $\\cA$ as a fibre of $\\cXp$. There is an injective map\n\\begin{equation}\n\\label{eq:def_xi}\n\\xi:\\cA \\to \\cXp\n\\end{equation}\nrespecting seed tori.\nThe restriction of $\\xi $ to a seed torus is a monomial map whose pullback is given by\n\\begin{align*}\n\\xi^*: N_{\\prin} &\\to M^\\circ\n\\\\\n(n,m) &\\mapsto p^*(n)-m.\n\\end{align*}\nLet\n\\begin{equation}\n    \\label{eq:H_Xprin} \nH_{\\cXp}:= \\lrc{\\lrp{n,p^*(n)}\\in N_\\prin \\mid n \\in N}.\n\\end{equation}\nIt is routine to check that $H_{\\cXp}$ is a valid choice to construct a fibration of $\\cXp $ over the torus $T_{H^*_{\\cXp}}$. \nHence, we can consider the fibre $(\\cXp)_{\\bf 1}=(\\cXp)_{{\\bf 1}_{T_{H^*_{ \\cXp}}}} $ associated to this fibration. 
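\nAs a sanity check (a short verification added for the reader's convenience), the pullback $\\xi^*$ vanishes on $H_{\\cXp}$: for $n\\in N$ we have\n\\[\n\\xi^*\\lrp{n,p^*(n)}=p^*(n)-p^*(n)=0,\n\\]\nso $\\xi^*$ descends to a map on $N_{\\prin}/H_{\\cXp}$.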
\nThere is an isomorphism \n\\begin{equation}\n\\label{eq:def_delta}\n\\delta:\n\\cA \\overset{\\sim}{\\longrightarrow} (\\cXp)_{\\bf 1}     \n\\end{equation}\nrespecting seed tori.\nThe restriction of $\\delta $ to a seed torus is a monomial map whose pullback is given by\n\\begin{align*}\n\\delta^*: N_{\\prin} /H_{\\cXp} &\\to M^\\circ\n\\\\\n(n,m) + H_{\\cXp} &\\mapsto p^*(n)-m.\n\\end{align*}\nIn particular, we have that\n\\[\n\\xi=\\iota \\circ \\delta,\n\\]\nwhere $\\iota: (\\cXp)_{\\bf 1}\\hookrightarrow \\cXp$ is the canonical inclusion.\nFor later reference we also introduce the map\n\\begin{equation}\n\\label{eq:def_rho}\n\\rho: \\cXp \\to \\cX.\n\\end{equation}\nrespecting seed tori.\nThe restriction of $\\rho $ to a seed torus is a monomial map whose pullback is given by\n\\begin{align*}\n    \\rho^*:  N &\\to \\Np \\\\\n       \\ \\ n & \\mapsto  (n,p^*(n)).\n\\end{align*}\nIn particular, $\\rho \\circ p_{\\prin}= \\tilde{p} $.\nThe maps we have considered so far fit into the following commutative diagram \\begin{equation*}\n\\xymatrix{\n(\\cXp)_{\\bf 1} \\ar@{^{(}->}^{\\ \\iota}[r] & \\cXp \\ar_{\\rho}[d] & \\cAp \\ar_{p_{\\prin}}[l] \\ar@{->>}^{\\varpi}[d] \\ar_{\\tilde{p}}[dl] \\\\\n\\cA \\ar^{\\delta}_{\\cong}[u] \\ar_{p}[r] \\ar^{\\xi}[ru] & \\cX & \\cAp/T_{H_{\\cAp}.} \\ar_{\\cong \\ \\ }^{\\chi \\ \\ }[l]\n}\n\\end{equation*}\n\n\\begin{remark}\n    \\label{rem:labels}\nThe maps introduced in this section are associated with $\\Gamma$, hence, we label the maps with the subindex $\\Gamma$ to stress the fixed data $\\Gamma$ they are associated with.\n\\end{remark}\n\n\\subsection{Tropicalization}\n\\label{ss:tropicalization}\nIn this section we discuss  tropicalizations of cluster varieties. We mainly follow \\cite[\\S1]{GHK_birational}, \\cite[\\S2]{GHKK} and \\cite[\\S1.1]{FG_cluster_ensembles}. \n\nLet $T_L$ be the torus associated to a lattice $L$. 
A rational function $f$ on $T_L$ is called positive if it can be written as a fraction $f=f_1/f_2$, where both $f_1$ and $f_2$ are linear combinations of characters of $T_L$ with coefficients in $\\Z_{>0} $.\nThe collection of positive rational functions on $T_L$ forms a semifield inside $ \\Bbbk(T_L)$ denoted by $Q_{\\rm sf}(L)$.\nA rational map $f:T_L\\dashrightarrow T_{L'}$ between two tori is a {\\bf positive rational map} if the pullback $f^*:\\Bbbk(T_{L'}) \\to \\Bbbk(T_L)$ restricts to an isomorphism $f^*:Q_{\\rm sf}(L') \\to Q_{\\rm sf}(L)$.\nIf $P$ is a semifield, then the $P$-valued points of $T_L$ form the set\n\\begin{equation}\n\\label{eq:FG_tropicalization}\nT_L(P):=\\Hom_{\\rm sf}(Q_{\\rm sf} (L), P)\n\\end{equation}\nof semifield homomorphisms from $Q_{\\rm sf} (L)$ to $ P$.\nIn particular, a positive birational isomorphism $\\mu:T\\dashrightarrow T'$ induces a bijection\n\\begin{align*}\n\\mu_*: T(P) & \\to T'(P)\\\\\n h \\ & \\mapsto \\ h \\circ \\mu^*.    \n\\end{align*}\nBy a slight but common abuse of notation the sublattice of monomials of $Q_{\\rm sf}(L)$ is denoted by $L^*$. \nConsidering $P$ just as an abelian group, the restriction of an element of $T_L(P)$ to $L^*$\ndetermines a canonical bijection $T_L(P) \\overset{\\sim}{\\longrightarrow} \\Hom_{\\rm groups} (L^*, P) $. 
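\n\nFor instance (an illustrative sketch with $L=\\Z^2$ and characters $z_1,z_2$; the choice of function is ours), let $P=\\Z^T$, so that the semifield addition is $\\max$ and the semifield multiplication is $+$. A semifield homomorphism $h\\in T_L(\\Z^T)$ sends every positive integer constant to $0$, and on the positive rational function $f=(z_1+z_2)/(1+z_1z_2)$ it takes the value\n\\[\nh(f)=\\max\\{h(z_1),h(z_2)\\}-\\max\\{0,h(z_1)+h(z_2)\\}.\n\\]\nIn particular, $h$ is determined by the pair $(h(z_1),h(z_2))\\in \\Z^2$, in agreement with the bijection $T_L(P) \\overset{\\sim}{\\longrightarrow} \\Hom_{\\rm groups} (L^*, P)$.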
\n\n\\begin{remark}\n\\label{rem:identification}\nWe systematically identify $T_L(P)$ with $L\\otimes P$ by composing the canonical bijection $T_L(P) \\overset{\\sim}{\\longrightarrow} \\Hom_{\\rm groups} (L^*, P) $ with the canonical isomorphism $\\Hom_{\\rm groups}(L^*, P) \\cong L \\otimes P$.\n\\end{remark}\n\n\nLet $ \\cV$ be a (quotient or a fibre of a) cluster variety.\nFor every $\\seed, \\seed'\\in \\orT$ the gluing map $\\mu^{\\cV}_{\\seed , \\seed'}: \\cV_\\seed\\dashrightarrow  \\cV_{\\seed'}$ is a positive rational map.\nSo we can glue $\\cV_{\\seed}(P) $ and $ \\cV_{\\seed'} (P)$ using $(\\mu^{\\cV}_{\\seed, \\seed'})_*$ and define \n\\[\n\\cV(P):= \\coprod_{\\seed \\in \\orT} \\cV_{\\seed}(P) / \\left(\\text{gluing by } (\\mu^{\\cV}_{\\seed, \\seed'})_*\\right)_{\\seed, \\seed'\\in \\orT}.\n\\]\nEvery point ${\\bf a}\\in \\cV(P)$ can be represented as a tuple $(a_{\\seed})_{\\seed\\in \\orT}$ such that $(\\mu^{\\cV}_{\\seed, \\seed'})_*(a_\\seed)=a_{\\seed'} $ for all $\\seed,\\seed'\\in \\orT$.\nSince all of the maps $(\\mu^\\cV_{\\seed,\\seed'})_*$ are bijections, the assignment \n\\eq{\n   \\mathfrak{r}_{\\seed}:\\cV(P)&\\to \\cV_{\\seed}(P)\\quad \\text{given by} \\quad {\\bf a}=(a_{\\seed})_{\\seed \\in \\orT} \\mapsto a_{\\seed}.\n}{not:tropical_space}\ndetermines an identification of $\\cV(P) $ with $ \\cV_{\\seed}(P)$. 
\nIf $S\\subset \\cV(P)$ we let \n\\begin{equation}\n\\label{eq:identification}\nS_{\\seed}(P):=\\mathfrak{r}_{\\seed} (S) \\subset \\cV_\\seed(P)\n\\end{equation}\nand write $S_{\\seed}$ instead of $S_{\\seed}(P)$ when the semifield $P$ is clear from the context.\n\nThe semifields we consider in this note are the integers, the rationals and the real numbers with their additive structure together with the semifield operation determined by taking the maximum (respectively, minimum).\nWe denote these semifields by $\\Z^T$, $\\Q^T$ and $\\R^T$ (respectively, $\\Z^t$, $\\Q^t$ and $\\R^t$).\nThe canonical inclusions $\\Z \\hookrightarrow \\Q \\hookrightarrow \\R$ give rise to canonical inclusions\n\\[\n\\cV(\\Z^T) \\hookrightarrow \\cV(\\Q^T) \\hookrightarrow \\cV(\\R^T) \\quad \\quad \\text{ and } \\quad \\quad  \\cV(\\Z^t) \\hookrightarrow \\cV(\\Q^t) \\hookrightarrow \\cV(\\R^t).\n\\]\nFor a set $S\\subseteq \\cV(\\R^T)$ (resp. $S\\subseteq \\cV(\\R^t)$) we let $\nS(\\Z):= S\\cap \\cV(\\Z^T)$ (resp. $S(\\Z):= S\\cap \\cV(\\Z^t)$). Moreover, for $G=\\Z, \\Q$ or $\\R$, the isomorphism of semifields $G^T\\to G^t$ given by $x \\mapsto -x$ induces a canonical bijection\n\\begin{align} \\label{eq:imap}\n    i: \\cV(G^T) \\rightarrow \\cV(G^t). \n\\end{align}\nSince $i$ amounts to a sign change (see Remark \\ref{rem:i_map} below), we think of $i$ as an involution and denote its inverse again by $i$.\n\n\n\n\\begin{remark}\\label{rmk:geometric trop}\nThe set $\\cV(\\Z^t)$ can be identified with the {\\bf geometric tropicalization} of $\\cV$, \ndefined as\n\\begin{equation*}\n    \\cV^{\\trop}(\\Z) \n     \\coloneqq \\{ \\text{divisorial discrete valuations } \\nu: \\Bbbk(\\cV) \\setminus \\{ 0\\} \\rightarrow \\mathbb Z \\mid \\nu (\\Omega_{\\cV}) <0 \\} \\cup \\{ 0\\}, \n\\end{equation*}\nwhere a discrete valuation is divisorial if it is given by the order of vanishing of a $\\Z_{>0}$-multiple of a prime divisor on some variety birational to $\\cV$. 
\n\\end{remark}\n\n\n\\begin{remark}\n\\label{rem:i_map}\nLet $G=\\Z, \\Q$ or $\\R$. Identifying $\\cV(G^T)$ with $\\cV_{\\seed}(G^T)$ via the bijection $\\mathfrak{r}_\\seed$ the map $i$ in \\eqref{eq:imap} can be thought of as the multiplication by $-1$ (\\cf Remark \\ref{rem:identification}).\n\\end{remark}\n\nA positive rational function $g$ on $\\cV $ is a rational function on $\\cV$ such that the restriction of $g$ to every seed torus $\\cV_{\\seed}$ is a positive rational function.\n\n\\begin{definition}\n\\label{def:trop_functions}\nThe {\\bf tropicalization} of a positive rational function $g: \\cV \\dashrightarrow \\Bbbk$ with respect to $\\R^T $ is the function $g^T:\\cV(\\R^T)\\to \\R$ given by \n\\begin{equation}\n\\label{eq:restriction}\n{\\bf a}\\mapsto a_{\\seed}(g),\n\\end{equation}\nwhere $ {\\bf a}=(a_{\\seed})_{\\seed \\in \\orT}$. The tropicalization of $g$ with respect to $\\R^t$ is the function $g^t:\\cV(\\R^t)\\to \\R$ defined as\n\\[\n{\\bf v} \\mapsto -v_{\\seed}(g),\n\\]\n\\end{definition}\nwhere $ {\\bf v}=(v_{\\seed})_{\\seed \\in \\orT}$. A direct computation shows that both $g^T$ and $g^t$ are well defined. Namely, one checks that for $\\seed,\\seed'\\in \\orT$\n\\[\na_{\\seed} (g)=a_{\\seed'}(g),\n\\]\nwhere in the left (resp. right) side of the equality we think of $g$ as a rational function on $\\cV_\\seed$ (resp. $\\cV_{\\seed'}$).\nMoreover, we have that\n\\begin{equation}\n\\label{eq:comparing_tropicalizations}\n    g^T({\\bf a})=g^t(i({\\bf a})),\n\\end{equation}\nfor all ${\\bf a} \\in \\cV(\\R^T)$.\n\\begin{remark}\nIn order to keep notation lighter we adopt the following conventions: \n\\begin{itemize}\n\\item given a positive  rational function $g\\in \\Bbbk (\\cV)=\\Bbbk(\\cV_\\seed)$ the tropicalizations of $g$ with domains $\\cV(\\R^T)$ and $\\cV_{\\seed}(\\R^T)$ are denoted by the same symbol $g^T$ for all $\\seed\\in \\orT$;\n\n\\item the restriction of $g^T$ (resp. $g^t$) to $\\cV(\\Z^T)$ (resp. 
$\\cV(\\Z^t)$) is also denoted by $g^T$ (resp. $g^t$);\n\\item when $P$ is one of $\\Z^T, \\Q^T$ or $\\R^T$ (resp. $\\Z^t, \\Q^t$ or $\\R^t$) the map $(\\mu^{\\cV}_{\\seed, \\seed'})_*$ is denoted by $(\\mu^{\\cV}_{\\seed, \\seed'})^T$ (resp. $(\\mu^{\\cV}_{\\seed, \\seed'})^t$).\n\\end{itemize}\n\\end{remark}\n\n\\begin{remark}\n    Later we will need to systematically consider $\\cV(\\R^t)$ when $\\cV$ is a variety of the form $\\cA$ or $\\cA/T_H$ and $ \\cV(\\R^T)$ when $\\cV$ is a variety of the form $\\cX$ or $\\cXe$.\n    In particular, from \\S\\ref{sec:FG_conj} on we use the notation $\\Trop_{\\R}(\\cV)$ that takes into account the different kinds of tropicalizations that we use for different kinds of varieties, see equation \\eqref{eq:unif}. \n\\end{remark}\n\n\nFor later use we record the following formulae associated to the mutations determined by $\\Gamma$:\n\\begin{equation}\n    \\label{eq:tropical_A_mutation}\n    \\lrp{\\mu^{\\cA}_{k}}^T(n)=n+[\\langle v_k,n\\rangle]_+(-d_ke_k)\n\\end{equation}\nand\n\\begin{equation}\n    \\label{eq:tropical_X_mutation}\n    \\lrp{\\mu^{\\cX}_{k}}^T(m)=m+[\\langle d_ke_k,m \\rangle]_+v_k.\n\\end{equation}\nIn case we tropicalize these mutations with respect to $\\R^t$ we replace $[\\ \\cdot\\ ]_+$ by $[\\ \\cdot\\ ]_-$.\n\nFinally, if we think of $T_L (\\R^T)$ (resp. $T_L(\\R^t)$) as a vector space (see Remark \\ref{rem:identification}), the tropicalization of a positive Laurent polynomial $g= \\sum_{\\ell\\in L^*}c_{\\ell} z^{\\ell} \\in Q_{\\rm sf}(L)$  with respect to $\\R^T$ (resp. $\\R^t$) is the function $g^T:  T_L(\\R^T) \\to \\R$ (resp. $g^t:  T_L(\\R^t) \\to \\R$) given by \n\\begin{eqnarray*}\nx &\\mapsto& - \\max \\{ \\langle \\ell , x  \\rangle \\mid \\ell\\in L^* \\text{ such that } c_{\\ell} \\neq 0 \\}\\\\\n (\\text{resp. 
} x &\\mapsto& \\min \\{ \\langle \\ell , x  \\rangle \\mid \\ell\\in L^* \\text{ such that } c_\\ell \\neq 0 \\}).\n\\end{eqnarray*}\n\n\\section{Theta functions and their labeling by tropical points}\n\\label{sec:tf_and_parametrizations}\n\n\n\\subsection{Fock--Goncharov duality}\\label{sec:FG_dual}\nFor $\\Gamma=(I, \\Iuf, N,N^{\\circ}, M, M^{\\circ}, \\{ \\cdot, \\cdot \\}, \\{d_i\\}_{i\\in I} )$ the Langlands dual fixed data\nis $\\Gamma^\\vee=(I, \\Iuf, N^\\vee, (N^\\vee)^{\\circ}, M^\\vee, (M^\\vee)^{\\circ}, \\{ \\cdot, \\cdot \\}^\\vee, \\{d^\\vee_i\\}_{i\\in I} )$, where $d:=\\text{lcm}(d_i)_{i\\in I}$,\n\\[\n     N^\\vee = N^\\circ, \\quad  (N^\\vee)^{\\circ}= d\\cdot N, \\quad  M^\\vee = M^\\circ, \\quad (M^\\vee)^{\\circ}=d^{-1}\\cdot M, \\quad \\{\\cdot, \\cdot \\}^\\vee= d^{-1}\\{\\cdot, \\cdot \\} \\quad \\text{and} \\quad d^\\vee_i:=d\\,d_i^{-1}.\n\\]\nIf $ \\seed=(e_i)_{ i\\in I}$ is a seed for $\\Gamma$ then the Langlands dual seed is $\\seed^\\vee:=(e_i^\\vee)_{i\\in I}$, where $e_i^\\vee:=d_ie_i$. We also set $v^\\vee_i:=\\{e^\\vee_i, \\cdot \\}^\\vee$. These constructions give rise to {\\bf Langlands dual cluster varieties}, which we denote as follows:\n\\begin{align*}\n\t\\begin{array}{l l l l}\n\t{}^L(\\cA_{\\Gamma;\\seed_0}) := \\cA_{\\Gamma^\\vee;\\seed_0^\\vee} \\qquad \\qquad & \t \\text{and} \\qquad \\qquad & {}^L(\\cX_{\\Gamma; \\seed_0}) := \\cX_{\\Gamma^\\vee; \\seed_0^\\vee}.\n\\end{array}\n\\end{align*}\nSince $\\Gamma $ and $\\seed_0$ were already fixed, we denote ${}^L(\\cA_{\\Gamma;\\seed_0})$ (resp. ${}^L(\\cX_{\\Gamma;\\seed_0})$) simply by $\\LA$ (resp. $\\LX$).\n\n\\begin{definition}\nThe {\\bf Fock--Goncharov dual} of $\\cA$ (resp. $\\cX$) is the cluster variety $\\cA^{\\vee}$ (resp. 
$\\cX^{\\vee}$) given by\n\\begin{equation*}\n    \\cA^{\\vee} := {}^L\\cX \\qquad \\qquad  \t \\text{and} \\qquad \\qquad  \\cX^{\\vee} := {}^L\\cA.\n\\end{equation*}\n\\end{definition}\n\nIn particular, we have that\n\\[\n \\cAp^\\vee = {}^L(\\cX_\\prin)=\\cX_{(\\Gamma_\\prin)^\\vee} \\qquad \\qquad  \t \\cXp^{\\vee}= {}^L(\\cAp)=\\cA_{(\\Gamma_\\prin)^\\vee}.\n\\]\n\n\\begin{remark}\n    \\label{rem:Lprin}\nNotice that $\\cA_{(\\Gamma_{\\prin})^\\vee}$ (resp. $\\cX_{(\\Gamma_{\\prin})^\\vee}$) is canonically isomorphic to  $\\cA_{(\\Gamma^\\vee)_{\\prin}}$ (resp. $\\cX_{(\\Gamma^\\vee)_{\\prin}}$). Hence, we frequently identify these schemes without making reference to the canonical isomorphisms between them.\n\\end{remark}\n\nIt is not hard to see that the map \n\\begin{eqnarray}\\label{eq:L p}\n    {(\\Lp)^*:= -d^{-1}(p^*)^*:N^\\vee \\to (M^\\vee)^{\\circ}}\n\\end{eqnarray}\nis well defined and is a cluster ensemble lattice map for the Langlands dual data $\\LGam$, where $(p^*)^*$ is the lattice map dual to $p^*$. \nIndeed, in the bases for $ N^\\vee $ and $ (M^\\vee)^{\\circ}$ determined by $\\seed^\\vee $, and in comparison with the matrix $B_{p^*;\\vb{s}}$ in \\eqref{eq:Mp*}, the matrix of $(\\Lp)^*$ is of the form\n\\begin{equation*}\nB_{(\\Lp)^*;\\seed^\\vee}= -B_{p^*;\\seed}^{\\rm{tr}}.\n\\end{equation*}\nIn particular, we have an associated dual cluster ensemble map\n\\[p^\\vee:\\cA^\\vee \\to \\cX^\\vee.\\]\n\nWe proceed to introduce the Fock--Goncharov dual for a quotient of $\\cA$. 
\nSo consider a cluster ensemble lattice map $p^*:N\\to M^{\\circ}$ for $\\Gamma$ and the cluster ensemble lattice map $(p^\\vee)^*:N^\\vee\\to (M^\\vee)^\\circ$ for $\\Gamma^\\vee$.\nRecall from \\eqref{eq:define K} that $K=\\ker(p_2^*)$.\nSimilarly, we set\n\n\\[\nK^\\vee=\\ker((p^\\vee)_2^*)=\\{k\\in N^\\circ\\mid \\{k,n\\}=0 \\text{ for all } n\\in d\\cdot N_{\\rm uf}\\},\n\\]\nwhere $(p^\\vee)_2^*$ is the map $p^*_2$ of \\eqref{eq:p12star} for $\\Gamma^\\vee$.\nLet $H_{\\cA}\\subseteq K^\\circ$ be a saturated sublattice and consider the quotient $\\cA/T_{H_\\cA}$. \nRecall from \\S\\ref{sec:quotients-fibres} that $\\cA/T_{H_\\cA}$ is obtained by gluing tori of the form $T_{N^{\\circ}/H_{\\cA}}$. \nSince $N^{\\circ}/H_{\\cA} $ and $H_{\\cA}^{\\perp} \\subset M^\\circ$ are dual lattices, the Fock--Goncharov dual of $\\cA/T_{H_{\\cA}}$ should be a fibre of $\\cA^\\vee$ obtained by gluing tori of the form $T_{H_{\\cA}^{\\perp}}$. In order to construct it, notice that for $n$ in $\\Nuf$ we have $\\langle k,p^*(n)\\rangle = -{d}^{-1}\\{k,dn\\}=\\langle dk,-(p^\\vee)^*(n)\\rangle$. This implies that\n\\[\nK^\\circ=p^*(N_{\\rm uf})^\\perp=K^\\vee.\n\\]\nIn particular, $H_{\\cA}$ is a saturated sublattice of $K^\\vee$ as it is saturated in $K^\\circ$.\nIt is therefore possible to find $T_{H_{\\cA}^*}$ as the base of a fibration of the form \\eqref{eq:weight_map} for $\\cA^{\\vee}$ as we are allowed to set\n\\[\nH_{\\cA^{\\vee}}=H_{\\cA}\\subseteq K^\\vee.\n\\]\nSo consider the fibration\n\\[\nw_{H_{\\cA}}:\\cAm \\to T_{H_\\cA^*}.\n\\]\nNotice that the fibre $(\\cAm)_{{\\bf 1}_{T_{H_\\cA^*}}}$ is obtained by gluing tori of the form $T_{H_{\\cA}^\\perp}$, as desired. 
\nTherefore, we define the Fock--Goncharov dual of the quotient $\\cA/T_{H_\\cA}$ as\n\\[\n\\cAHAm := (\\cAm)_{{\\bf 1}_{T_{H_\\cA^*}}}=\\lrp{{}^L\\cX}_{{\\bf 1}_{T_{H_\\cA^*}}}.\n\\]\n\nSimilarly, let $H_{\\cX}\\subseteq K$ be a saturated sublattice and let $w_{H_{\\cX}}:\\cX \\to T_{H^*_\\cX}$ be the associated fibration. \nRecall that $\\cX_{{\\bf 1}_{T_{H^*_\\cX}}}$ is obtained by gluing tori of the form $T_{H_{\\cX}^{\\perp}}$. \nIts Fock--Goncharov dual is a quotient of $\\cXm$ glued from tori of the form $T_{(H^\\perp_\\cX)^*}$ which we construct next.\nA direct computation shows that $d\\cdot H_{\\cX}$ is a saturated sublattice of $(K^\\vee)^\\circ$. \nIn particular, we are allowed to choose\n\\[\nH_{\\cXm}= d\\cdot H_{\\cX}\\subseteq (K^\\vee)^\\circ \n\\]\nas a sublattice giving rise to a quotient $ \\LA/T_{d\\cdot H_{\\cX}}$. This quotient is obtained by gluing tori of the form $T_{d\\cdot N}/T_{d\\cdot H_\\cX}\\cong T_{N/H_\\cX}\\cong T_{(H^{\\perp}_\\cX)^*}$. \nTherefore, we define the Fock--Goncharov dual of $\\cX_{{\\bf 1}_{T_{H^*_{\\cX}}}}$ as \n\\[\n\\cXeHm := \\cXm/T_{ H_{\\cX^{\\vee}}}={}^L\\cA/T_{d\\cdot H_\\cX}.\n\\]\n\nIn what follows, when we consider a saturated  sublattice $H$ of $K^\\circ$ and write expressions such as $\\cA/T_{H}$ or $w_{H}:\\cAm\\to T_{H^*}$ we will be implicitly assuming that we have set \n\\[\nH_{\\cA}= H = H_{\\cAm}.\n\\]\nSimilarly, when $H$ is a saturated sublattice of $K$ and we write expressions such as $w_{H}:\\cX \\to T_{H^*}$, $\\cXe$ or $\\cXem$ we will be implicitly assuming that we have set\n\\[\nH_{\\cX}= H = d^{-1}\\cdot H_{\\cXm},\n\\]\n\\[\n\\quad  \\cXe = \\cXeH \\quad \\quad \\text{and} \\quad \\quad \\cXem= \\cXm /T_{H_{\\cXm}}.\n\\]\n\n\\begin{remark}\nLet $\\cV$ be (a quotient of) $\\cA$ or  (a fibre of) $\\cX$. 
In the skew-symmetric case Arg\\\"uz and Bousseau \\cite{AB22} showed that $\\cV$ and $\\cV^{\\vee}$ are mirror dual schemes from the point of view of \\cite{GS22}.\nA similar result is proven for the skew-symmetrizable case when $\\cV$ has dimension $2$ in \\cite{Mandy_rank2_MS} with arguments that may be generalized to arbitrary dimension.\n\\end{remark}\n\n\\subsection{Scattering diagrams and theta functions}\n\\label{sec:scat}\nTheta functions are a particular class of global functions on (quotients and fibres of) cluster varieties introduced in \\cite{GHKK}.\nIn this subsection we outline their construction.\nThe main case to consider is that of $\\cAp$, since scattering diagrams and theta functions for (quotients of) $\\cA$ and (fibres of) $\\cX$ can be constructed from this case.\n\n\\begin{remark}\n\\label{rem:full_rank_assumption}\nFrom now on, whenever we consider  the variety $\\cA=\\cA_{\\Gamma,\\seed_0}$ we will assume $\\Gamma$ is of {\\bf full-rank}. \nBy definition this means that the map $p_1^*:\\Nuf \\to M^\\circ$ given by $n \\mapsto \\{ n , \\cdot \\}$ is injective. \nThere are various results of this article for $\\cA$ that are valid even if $\\Gamma$ is not of full-rank. \nHowever, various key results we shall use do need the full-rank condition (\\cf Remark \\ref{rem:all_from_cAp}). 
\nEven though we are imposing the full-rank assumption, we will frequently recall it in order to stress its necessity.\n\\end{remark}\n\n\\subsubsection{Theta functions on full-rank $\\cA$}\n\\label{sec:tf_A}\nThroughout this section we systematically identify $\\cA^{\\vee}_{\\seed^\\vee}(\\R^T)$ with $M^\\circ_{\\R}$, see \\S\\ref{ss:tropicalization}.\nA {\\bf wall} in $M^{\\circ}_\\R$ is a pair $(\\wall, f_{\\wall})$ where\n $\\wall \\subseteq M^{\\circ}_\\R$ is a convex rational polyhedral cone of codimension one, contained in $n^{\\perp}$ for some $n \\in N_{\\uf, \\seed}^+$, and \n $f_{\\wall}  = 1+ \\sum_{k \\geq 1} c_k z^{kp^*_1(n)}$ is called a {\\bf scattering function}, where $c_k \\in \\Bbbk$. \nA {\\bf scattering diagram} $\\scat $ in $M^{\\circ}_\\R$ is a (possibly infinite) collection of walls satisfying a certain finiteness condition (see \\cite[\\S1.1]{GHKK}). \nThe {\\bf support} and the {\\bf singular locus} of $\\scat $ are defined as\n\\[\n\\Supp(\\scat):= \\bigcup_{\\wall \\in \\scat} \\wall \\ \\ \\ \\text{and} \\ \\ \\ \\Sing(\\scat):= \\bigcup_{\\wall \\in \\scat} \\partial\\wall \\ \\cup \\bigcup_{\\overset{\\wall_1,\\wall_2 \\in \\scat}{\\text{dim}(\\wall_1 \\cap \\wall_2) = |I|-2}} \\wall_1 \\cap \\wall_2.\n\\]\n\nA wall $(\\wall, f_{\\wall})$ defines a {\\bf wall-crossing automorphism} $\\mathfrak{p}_{\\wall}$ of $\\Bbbk (M)$  \ngiven on a generator $ z^m$ by $\n\\mathfrak{p}_{\\wall}(z^m)=z^m f_{\\wall}^{\\langle n_{\\wall}, m \\rangle }$,\nwhere $n_{\\wall}$ is the primitive normal vector of the wall $\\wall$ with a choice of direction going against the flow of the path $\\gamma$. 
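\nFor example (a direct computation under the definition above), if $f_{\\wall}=1+z^{p_1^*(n)}$ and $\\langle n_{\\wall}, m \\rangle =2$, then\n\\[\n\\mathfrak{p}_{\\wall}(z^m)=z^m\\lrp{1+z^{p_1^*(n)}}^2=z^m+2z^{m+p_1^*(n)}+z^{m+2p_1^*(n)},\n\\]\nwhereas if $\\langle n_{\\wall}, m \\rangle =0$ then $\\mathfrak{p}_{\\wall}(z^m)=z^m$.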
\nIf we fix a scattering diagram $\\scat$ and a piecewise linear proper map $ \\gamma:[0,1]\\to M^\\circ_{\\R}\\setminus \\Sing(\\scat)$ intersecting $\\Supp(\\scat)$ transversely \nthen the {\\bf path ordered product} $ \\mathfrak{p}_{\\gamma , \\scat}$ is defined as the composition of automorphisms of the form $\\mathfrak{p}_{\\wall}$, where we consider the walls $\\wall$ that are transversely crossed by $\\gamma $. Observe that $\\gamma$ might cross infinitely many walls, so we are potentially composing infinitely many automorphisms; nevertheless, this infinite composition is well defined.  \nAgain, the reader is referred to \\cite[\\S 1.1]{GHKK} for a detailed discussion.\n\n\\begin{definition}\nA scattering diagram $\\scat$ is {\\bf consistent} if for all $\\gamma$ as above $\\mathfrak{p}_{\\gamma, \\scat}$ only depends on the endpoints of $\\gamma$. Two scattering diagrams $\\scat$ and $\\scat'$ are {\\bf equivalent} if $\\mathfrak{p}_{\\gamma, \\scat}= \\mathfrak{p}_{\\gamma, \\scat'}$ for all $\\gamma$.\n\\end{definition}\n\nTo define cluster scattering diagrams for  $\\cA$ one first considers\n\\[\n\\scat_{{\\rm in}, \\seed}^{\\cA} := \\lrc{\\left.\\left( e_i^{\\perp} , 1+z^{ p_{1}^*\\left( e_i \\right)}\\right) \\right| \\  i \\in \\Iuf }.\n\\]\nA {\\bf cluster scattering diagram} for $\\cA$ is a consistent scattering diagram in $M^{\\circ}_{ \\R}$ containing $\\scat_{{\\rm in}, \\seed}^{\\cA}$. By the following theorem, cluster scattering diagrams for $\\cA$ do exist (provided $\\Gamma$ is of full-rank). \n\n\\begin{theorem} \\cite[Theorem 1.12 and 1.13]{GHKK}\n\\label{thm:consistent_scattering_diagrams}\nAssume $\\Gamma$ is of full-rank. 
Then for every seed $\\seed$ there is a consistent scattering diagram $\\scat_{\\seed}^{\\cA} $ such that $\\scat_{{\\rm in}, \\seed}^{\\cA} \\subset \\scat_{\\seed}^{\\cA}$.\nFurthermore $\\scat_{\\seed}^{\\cA}$ is equivalent to a scattering diagram all of whose scattering functions are of the form ${ f_{\\wall} = (1+ z^{p_{1}^*(n)})^c}$, for some $n \\in N$, and $c$ a positive integer. \n\\end{theorem}\n\n\\begin{definition} \\thlabel{def:genbroken}\nFix a cluster scattering diagram $\\scat^{\\cA}_{\\seed}$.\nLet $\\mono \\in M^{\\circ} \\setminus \\{0\\}$ and $x_0 \\in M^{\\circ}_{\\R} \\setminus \\text{Supp}(\\scat)$. \nA (generic) {\\bf broken line} for $\\scat^{\\cA}_{\\seed}$ with initial exponent $\\mono$ and endpoint $x_0$ is a piecewise linear continuous proper path $\\gamma : ( - \\infty , 0 ] \\rightarrow M^\\circ_{\\R} \\setminus \\Sing (\\scat^{\\cA}_{\\seed})$ bending only at walls, with a finite number of domains of linearity $L$ and a monomial $c_L z^{\\mono_L} \\in \\Bbbk[M^\\circ]$ for each of these domains. 
The path $\\gamma$ and the monomials $c_L z^{\\mono_L}$ are required to satisfy the following conditions:\n\\begin{itemize}\n\\setlength\\itemsep{0em}\n    \\item $\\gamma(0) = x_0$.\n    \\item If $L$ is the unique unbounded domain of linearity of $\\gamma$, then $c_L z^{\\mono_L} = z^{\\mono}$.\n    \\item For $t$ in a domain of linearity $L$, $\\gamma'(t) = -\\mono_L$.\n    \\item If $\\gamma$ bends at a time $t$, passing from the domain of linearity $L$ to $ L'$ then $c_{L'}z^{\\mono_{L'}}$ is a term in $\\mathfrak{p}_{{\\gamma}|_{(t-\\epsilon,t+\\epsilon)},\\scat_t} (c_L z^{m_L}) $, where ${\\scat_t = \\lrc{\\left.(\\wall, f_{\\wall}) \\in \\scat^{\\cA}_{\\seed}\\right| \\gamma (t) \\in \\wall }}$.\n\\end{itemize}\nWe refer to $\\mono_L$ as the {\\bf slope} or {\\bf exponent vector} of $\\gamma$ at $L $ and set \n\\begin{itemize}\n    \\item $I(\\gamma) = \\mono$;\n    \\item $\\text{Mono} (\\gamma) = c(\\gamma)z^{F(\\gamma)}$\nto be the monomial attached to the unique domain of linearity of $\\gamma$ having $x_0 $ as an endpoint. \n\\end{itemize}\n\\end{definition}\n\n\\begin{definition}\\thlabel{def:theta}\nChoose a point $x_0$ in the interior of $\\mathcal{C}_\\seed^+:=\\{m\\in M^{\\circ}_{\\R}\\mid \\langle e_i, m \\rangle \\geq 0 \\text{ for all  } i \\in \\Iuf\\}$ and let $\\mono\\in \\cA^\\vee_{\\seed^\\vee}(\\Z^T)=M^\\circ$. \nThe {\\bf theta function} on $\\cA$ associated to $\\mono$ is \n\\begin{equation}\n\\label{eq:tf}\n    \\tf^{\\cA}_{ \\mono}:= \\sum_{\\gamma} \\text{Mono} (\\gamma),\n\\end{equation}\nwhere the sum is over all broken lines $\\gamma$ with $I(\\gamma)=\\mono$ and $\\gamma(0)=x_0$. \nFor $\\mono = 0$ we define  $\\vartheta^{\\cA}_{0} =1$. 
We say $\\tf^{\\cA}_{ \\mono}$ is {\\bf polynomial} if the sum in \\eqref{eq:tf} is finite.\n\\end{definition}\n\n\\begin{remark}\n\\label{rem:on_tf}\nIt is a nontrivial fact that $\\tf^{\\cA}_{\\mono}$ is independent of the point $x_0\\in \\mathcal{C}_{\\seed}^+$ we have chosen, see \\cite[\\S3]{GHKK}.\nMoreover, in general $\\tf^\\cA_{\\mono}$ can be an infinite sum, and in order to think of $\\tf^{\\cA}_{\\mono} $ as a function on a space one needs to work formally and consider a degeneration of $\\cA$, see \\cite[Proposition 3.4 and \\S6]{GHKK} for the details.\nHowever, if $\\tf^{\\cA}_{\\mono}$ is polynomial then $\\tf^{\\cA}_{\\mono}\\in H^0(\\cA,\\mathcal{O}_{\\cA})$, that is, $\\tf^{\\cA}_{\\mono}$ is an algebraic function on $\\cA$. \nThe definition of $\\tf^{\\cA}_{\\mono}$ in \\eqref{eq:tf} corresponds to the expression of such a function written in the coordinates of the seed torus $\\cA_{\\seed}$.\n\\end{remark}\n\n\\subsubsection{Labeling by tropical points}\n\nRecall that we are identifying $\\cA^{\\vee}_{\\seed^\\vee}(\\R^T)$ and $ M^{\\circ}_{\\R}$. \nBy construction, a theta function on $\\cA$ is labeled by a point $\\mono \\in \\cA^\\vee_{\\seed^\\vee}(\\Z^T)=  M^{\\circ}$. \nBy \\cite[Proposition 3.6]{GHKK}, this labeling upgrades to a labeling by a point in $\\cA^\\vee(\\Z^T)$. 
\nThe main point is that if we let $m'=(\\mu^{\\cA^\\vee}_{k})^T(m)\\in \\cA^{\\vee}_{\\mu_k(\\seed^\\vee)}(\\Z^T)$ for $k \\in \\Iuf$ then $\\tf^{\\cA}_\\mono$ and $\\tf^{\\cA}_{m'}$ correspond to the same function (see Remark \\ref{rem:on_tf}) expressed, however, in different cluster coordinates.\nThis fact is of great importance for this paper so we would like to highlight it:\n\n\\begin{center}\n    \\emph{every theta function on $\\cA$ is naturally labeled by a point of $\\cA^\\vee(\\Z^T)$}.\n\\end{center}\n\nIn light of the discussion just above, from now on we label theta functions on $\\cA$ either by elements of $\\cA^{\\vee}(\\Z^T)$ or of $\\cA^{\\vee}_{\\seed^\\vee}(\\Z^T)$. \nFor the sake of clarity, tropical points are denoted in bold font and as tuples. \nThat is, ${\\bf m}=(m_{\\seed^\\vee})_{\\seed^\\vee\\in \\orT}$ denotes an element of $\\cA^\\vee(\\Z^T)$ and $m_{\\seed^\\vee}=\\mathfrak{r}_{\\seed^\\vee}({\\bf m})$. \n With this notation we have the following identity\n \\[\n\\tf^{\\cA}_{\\bf m}=\\tf^{\\cA}_{m_{\\seed^\\vee}}.\n \\]\n\nEven further, we can think of $\\Supp(\\scat^{\\cA}_{\\seed})$ as a subset of $\\cA^{\\vee}_{\\seed^\\vee}(\\R^T) $. By \\cite[Theorem 1.24]{GHKK} we have that for every $k \\in \\Iuf $, $\\mu^{\\cA^\\vee}_{\\seed^\\vee,\\mu_k(\\seed^\\vee)}\\lrp{\\Supp(\\scat^{\\cA}_{\\seed})}=\\Supp(\\scat^{\\cA}_{\\mu_k(\\seed)})$ and that $\\scat^{\\cA}_{\\seed}$ and $\\scat^{\\cA}_{\\mu_k(\\seed)}$ are equivalent. Hence there is a well defined subset $\\Supp(\\scat^\\cA) \\subset \\cA^{\\vee}(\\R^T)$ such that\n\\[\n\\mathfrak{r}_{\\seed^\\vee}\\lrp{\\Supp(\\scat^\\cA) }= \\Supp(\\scat^\\cA_{\\seed})\n\\]\nfor every $\\seed\\in \\orT$. 
\nThe point here is that $\\Supp(\\scat^\\cA)$ is seed independent.\nSimilarly, the map $\\mu^{\\cA^\\vee}_{\\seed^\\vee,\\mu_k(\\seed^\\vee)} $ determines a bijection between the set of broken lines for $\\scat^{\\cA}_{\\seed}$ and the set of broken lines for $\\scat^{\\cA}_{\\mu_k(\\seed)}$ (see \\cite[Proposition 3.6]{GHKK}). \nIn particular, supports of broken lines make sense in $\\cA^\\vee(\\R^T)$.\n\n\n\\begin{remark}\n    It is possible to upgrade $\\Supp(\\scat^\\cA)$ to a scattering diagram inside $\\cA^{\\vee}(\\R^T)$. In this generality scattering functions are described using log Gromov--Witten invariants. \n    See \\cite{KY19} for details.\n\\end{remark}\n\n\\subsubsection{The middle cluster algebra}\n\nLet us recall now that broken lines also encode the multiplication of theta functions. \nThat is, given a product of arbitrary theta functions $\\tf^{\\cA}_p \\tf^{\\cA}_q$ with $p,q \\in \\cA^\\vee_{\\seed^\\vee}(\\Z^T)$,\nwe can use broken lines to express the structure constants $\\alpha\\lrp{p,q,r}$ in the expansion \n\\begin{align} \\label{eq:product}\n    \\vartheta^{\\cA}_p \\vartheta^{\\cA}_q = \\sum_{r\\in \\cA^\\vee_{\\seed^\\vee}(\\Z^T)} \\alpha(p,q,r) \\vartheta^{\\cA}_r.\n\\end{align}\nWe review the construction here.\nFirst, pick a general endpoint $z$ near $r$.\nThen define (\\cite[Definition-Lemma~6.2]{GHKK})\n\\eq{\n    \\alpha_z (p, q, r) := \\sum_{\\substack{\\lrp{\\gamma^{(1)}, \\gamma^{(2)}} \\\\ I(\\gamma^{(1)})= p,\\ I(\\gamma^{(2)})= q\\\\ \\gamma^{(1)}(0) = \\gamma^{(2)}(0) = z\\\\\n    F(\\gamma^{(1)}) + F(\\gamma^{(2)}) = r   }}   c(\\gamma^{(1)})\\ c(\\gamma^{(2)}), }{eq:multibrokenline}\nwhere the sum is over all pairs of broken lines $\\lrp{\\gamma^{(1)}, \\gamma^{(2)}}$ ending at $z$ with initial slopes $I(\\gamma^{(1)}) = p$, $I(\\gamma^{(2)}) = q$ and final slopes satisfying $F(\\gamma^{(1)})+F(\\gamma^{(2)}) =r$.\nGross--Hacking--Keel--Kontsevich show that for $z$ sufficiently close to $r$, $\\alpha_z (p, q, r)$ is 
independent of $z$ and gives the structure constant $\\alpha (p, q, r)$ (see \\cite[Proposition~6.4]{GHKK}). \n\n\\begin{definition}\n\\thlabel{def:cmid}\nLet $\\Theta(\\cA):= \\{ {\\bf m} \\in \\cA^{\\vee}(\\Z^T) \\mid \\vartheta^{\\cA}_{{\\bf m}} \\text{ is polynomial}\\}$. The {\\bf middle cluster algebra} $\\text{mid}(\\cA)$ is the $\\Bbbk$-algebra whose underlying vector space is $\\{ \\tf^{\\cA}_{{\\bf m}} \\mid {\\bf m} \\in \\Theta(\\cA) \\}$, the multiplication of the basis elements is given by \\eqref{eq:product} and extended linearly to all $\\cmid(\\cA)$.\n\\end{definition}\n\n\\subsubsection{Theta functions on $\\cAp$}\nThe data $\\Gamma_{\\prin}$ is of full-rank. Therefore, this case is a particular case of \\S\\ref{sec:tf_A}. So we can talk about scattering diagrams, broken lines and theta functions for $\\cAp $. The following result follows from Theorem \\ref{thm:consistent_scattering_diagrams} and the definition of theta functions.\n\n\n\\begin{lemma}\nFix a seed $\\widetilde{\\seed}$ for $\\cAp$ and express theta functions on the cluster coordinates determined by $\\widetilde{\\seed}$.\nFor $(m,n)\\in \\mathfrak{r}_{\\widetilde{\\seed}}(\\Theta(\\cAp))$ we have that $\\tf^{\\cAp}_{(m,n)}=\\tf^{\\cAp}_{(m,0)}\\tf^{\\cAp}_{(0,n)}$ and $\\tf^{\\cAp}_{(0,n)}$ is the Laurent monomial on the coefficients given by $n$.\n\\end{lemma}\n\n\n\nNote that, for $(m_1,n_1),(m_2,n_2) \\in M^{\\circ}_{\\rm prin}$, in general we have that $\\tf^{\\cAp}_{(m_1+m_2,n_1+n_2)} \\neq \\tf^{\\cAp}_{(m_1,n_1)} \\tf^{\\cAp}_{(m_2,n_2)}$. \nThe above lemma holds because the decomposition is only separating the unfrozen and frozen parts (\\cf \\thref{g_is_val} below). \n\n\\begin{remark}\n\\label{rem:all_from_cAp}\nScattering diagrams for $\\cAp $ can be used to define scattering diagrams, broken lines therein and theta functions on a variety $\\cV $ of form $\\cA$ (even if $\\Gamma$ is not of full-rank), $\\cX$, $\\cA/T_{H}$ and $\\cX_{{\\bf 1}}$. 
\nFurther, in each one of these cases we can define the associated middle cluster algebra $\\cmid(\\cV)$ and the set $\\Theta(\\cV)$ parametrizing its theta basis. \nIn the following subsections we explain the cases of $\\cA/T_{H}$, $\\cX$, and $\\cX_{{\\bf 1}}$ individually. \nWe do not treat the case of $ \\cA $ when $\\Gamma$ is not of full-rank since the results of \\S\\ref{sec:cluster_valuations} do not apply to this case.\n\\end{remark}\n\n\n\n\n\\subsubsection{Theta functions on $\\cA/T_{H}$}\n\\label{tf_quotient}\nSuppose that $\\Gamma $ is of full-rank (\\cf Remark \\ref{rem:all_from_cAp}). Let $H \\subset K^\\circ$ be a saturated sublattice and consider the quotient $\\cA/T_{H}$ and the fibration $w_H: \\cAm \\to H^*$ (see the end of \\S\\ref{sec:FG_dual}).\nThe next result shows that theta functions on $\\cA$ have a well defined $T_{H}$-weight.\n\n\n\\begin{proposition}\n\\thlabel{prop:dual_fibration}\nEvery polynomial theta function on $\\cA$ is an eigenfunction with respect to the $T_{H}$-action. For every ${\\bf q}\\in \\Theta(\\cA)$ the $T_H$-weight of $\\tf^{\\cA}_{\\bf q}$ is the image of ${\\bf q} \\in \\cA^{\\vee}(\\Z^T)$ under the tropicalized map $w^{T}_{H}:\\cAm(\\Z^T) \\to H^*$. Under the isomorphism $ H^* \\cong M^\\circ/H^\\perp$ and in the lattice identification $ \\cA^\\vee_{\\seed^\\vee}(\\Z^T)$ of $\\cA^\\vee (\\Z^T)$ the map $w^T_{H}$ is given by\n\\begin{align*}\n   w^{T}_{H} : \\ & \\cA^{\\vee}_{\\seed^\\vee}(\\Z^T) \\to M^\\circ/H^{\\perp},\\\\\n     & \\ \\ \\  \\ \\ \\  q \\longmapsto q + H^{\\perp}.\n\\end{align*}\n\\end{proposition}\nThe claims are essentially contained in the literature already\n(see for instance \\cite[Proposition 7.7]{GHKK}). 
\nThe differences are that we are acting by a potentially smaller torus (Gross--Hacking--Keel--Kontsevich act by $T_{K^\\circ}$ rather than $T_{H}$) and, regarding the map $w_{H}: \\cA^\\vee \\to T_{H^*}$, \nwe are including $\\Bbbk[H]$ into $\\Bbbk[N^\\vee]=\\Bbbk[N^\\circ]$ rather than including $\\Bbbk[K^\\circ]$ into $\\Bbbk[N^\\circ]$. \nFor the convenience of the reader we give a proof of the statement.\n\n\\begin{proof}[Proof of Proposition~\\ref{prop:dual_fibration}]\nBy \\cite[Theorem~1.13]{GHKK} all scattering functions may be taken to be of the form $\\lrp{1+z^{p^*(n)}}^c$ for some $n \\in \\Nuf$ and some positive integer $c$.\\footnote{That is, the equivalence class of consistent scattering diagrams for $\\cA$ contains a representative whose scattering functions are of this form.}\nFor $q\\in \\Theta(\\cA)_{\\seed^\\vee}$ we have that $\\tf^{\\cA}_q$ is a Laurent polynomial in $\\Bbbk[M^\\circ]$.\nAll monomial summands of $\\tf^{\\cA}_{q}$ have the form $c_m z^{q + m}$ for some $m \\in p^*(\\Nuf)$ and $c_m \\in \\Z_{>0}$.\nThe $T_{H}$-weight of this monomial is obtained from the map\n\\eqn{\nT_{H} \\to T_{\\Z}=\\Bbbk^*\\quad \\text{given by } \\quad z^{h} \\mapsto z^{\\lra{q+m , h}} \\quad \\text{for $h \\in {H}$}.\n}\nSince $H \\subset p^*\\lrp{\\Nuf}^\\perp$ we have that\n$z^{\\lra{q+m , h}} = z^{\\lra{q, h}}$. \nThat is, the $T_{H}$-weight of each monomial $z^{m'}$, $m'\\in M^\\circ$, is the character of $T_{H}$ given by $m' + H^\\perp  \\in M^\\circ/H^\\perp \\cong H^*$. 
\nMoreover, all monomial summands of $\\tf^{\\cA}_q$ have the $T_{H}$-weight $q + H^\\perp \\in H^*$.\nNext, the piecewise linear map $(\\mu_k^{\\cA^\\vee})^T:M^\\circ_\\seed \\to M^\\circ_{\\mu_k(\\seed)}$ sends $m$ to $m+m'$ for some $m'\\in p^*(\\Nuf)$.\nSo, the choice of torus does not affect the $T_{H}$-weight.\nTherefore, $\\tf^{\\cA}_q$ is an eigenfunction whose weight is $q + H^\\perp$.\nFurthermore, the projection \n\\eqn{M^\\circ &\\to M^\\circ/H^\\perp\\quad \\text{given by} \\quad q \\mapsto q + H^\\perp } \ndualizes the inclusion $H \\hookrightarrow N^\\circ$.\nSo, restricting to seed tori, this is precisely the tropicalization of the map ${T_{M^\\circ} \\rightarrow T_{H^*}}$ whose pullback is the inclusion $H \\hookrightarrow N^\\circ$.\nSince $p^*$ commutes with mutation,\nwe see that the $T_{H}$-weight of $\\tf^{\\cA}_{\\bf q}$ is the image of ${\\bf q}$ under the tropicalization of \n$ w_{H}: \\cA^\\vee \\rightarrow T_{H^*}$.\n\\end{proof}\n\nEvery weight $0$ eigenfunction on $ \\cA$ induces a well defined function on $\\cA/T_{H}$. \nSo in order to construct a scattering-diagram-like structure $\\scat^{\\cA/T_H}$ defining theta functions on $\\cA/T_{H}$ we consider the {\\bf weight zero slice} inside $\\cA^{\\vee}(\\R^T)$ defined as \n$(w^T_H)^{-1}(0)$. \nObserve that identifying $ \\cA^\\vee$ with $M^\\vee$ via a choice of seed, then $(w^T_H)^{-1}(0)$ corresponds to $H^{\\perp}_{\\R}$.\nWith this in mind, we define $\\supp(\\scat^{\\cA/T_{H}})$ as\n\\[\n\\supp(\\scat^{\\cA/T_{H}}):=\\supp (\\scat^{\\cA})\\cap (w^T_H)^{-1}(0).\n\\]\nThe scattering functions attached to the walls of $\\mathfrak{r}_{\\seed^\\vee}(\\supp(\\scat^{\\cA/T_{H}}))$ are the same as the corresponding functions attached to the walls of $\\scat^{\\cA}_\\seed$. 
\nThis gives rise to a scattering diagram $\\scat^{\\cA/T_{H}}_{\\seed}$ inside $(\\cA/T_{H})^\\vee_{\\seed^\\vee}(\\R^T)$ for every $\\seed \\in \\orT$.\nThe broken lines for $\\scat^{\\cA/T_{H}}_\\seed$ are the broken lines for $\\scat^{\\cA}_\\seed $ entirely contained in $\\mathfrak{r}_{\\seed^\\vee}(w^{-1}_{H}(0))$.\n\nIn order to label a theta function on $\\cA/T_{H}$ with an element of $(\\cA/T_{H})^{\\vee}(\\Z^T)$ it suffices to consider a bijection $\n(\\cA/T_{H})^{\\vee}(\\R^T) \\overset{\\sim}{\\longrightarrow} (w_H^T)^{-1}(0)$.\nSuch a bijection can be obtained by tropicalizing the inclusion $\\mathfrak{i}_H:(\\cA/T_{H})^{\\vee} \\hookrightarrow \\cA^\\vee $. \nIndeed, in lattice identifications of the tropical spaces given by a seed $\\seed$, the map\n$ \\mathfrak{i}_H^T:(\\cA/T_{H})^{\\vee}_{\\seed}(\\Z^T)\\hookrightarrow \\cA^\\vee_{\\seed}(\\Z^T)$ corresponds to the inclusion $H^\\perp \\hookrightarrow M^\\vee$ and $w^{-1}_{H}(0)(\\Z) $ corresponds to $H^\\perp$. \n\n\nIn particular, we obtain (as one should have expected) that the theta functions on $\\cA/T_{H}$ are precisely the functions on $\\cA/T_{H}$ induced by the $T_H$-weight zero theta functions on $\\cA$. \nSo we let\n$\\Theta(\\cA/T_H)\\subset (\\cA/T_H)^{\\vee}(\\Z^T)$ be the preimage of $\\Theta(\\cA)\\cap (w_H^T)^{-1}(0)$ under $\\mathfrak{i}^T_H$ and define the middle cluster algebra $\\cmid(\\cA/T_H)$ as in the case of $\\cA$ (see \\thref{def:cmid}).\nIn particular, for ${\\bf m}\\in \\Theta (\\cA/T_H)$ the theta function $\\tf^{\\cA/T_H}_{\\bf m}$ is the function on $\\cA/T_H$ induced by $ \\tf^{\\cA}_{\\mathfrak{i}^T_H({\\bf m})}$. 
So,\n\\begin{center}\n    \\emph{every theta function on $\\cA/T_H$ is naturally labeled by a point of $(\\cA/T_H)^\\vee(\\Z^T)$}.\n\\end{center}\n\n\n\\subsubsection{Theta functions on $\\cX$}\n\\label{sec:tf_X}\nRecall from \\S\\ref{sec:principal_coefficients} that there is an isomorphism $\\chi: \\cAp/T_{H_{\\cAp}}\\to \\cX$, where \\[\nH_{\\cAp}=\\lrc{\\lrp{n,-(p^*)^*(n)}\\in N^\\circ_\\prin \\mid n \\in N^\\circ} \\subset K^{\\circ}_{\\prin}. \n\\]\nHence, the construction of theta functions on $\\cX$ is already covered in the previous subsection. \nHowever, there is a very subtle difference created by treating $ \\cAp/T_{H_{\\cAp}}$ as a cluster $\\cX$-variety as opposed to a quotient of $\\cAp$: \n\\begin{center}\n\\emph{every theta function on $ \\cX$ is naturally labeled by a point of $\\cX^\\vee(\\Z^t)$ as opposed to $\\cX^\\vee(\\Z^T)$.}\n\\end{center} \nIf we proceeded as in the previous subsection we would label theta functions on $ \\cAp/T_{H_{\\cAp}}$ by points of $(\\cAp/T_{H_{\\cAp}})^\\vee(\\R^T)$.\nThe origin of the difference is made explicit by the following lemma.\n\n\\begin{lemma}\n\\label{lem:right_tropical_space}\nThere is a canonical bijection between $\\cX^{\\vee}(\\R^t)$ and $\\lrp{w^T_{H_{\\cAp}}}^{-1}(0)\\subset \\cAp^\\vee(\\R^T)$.\n\\end{lemma}\n\\begin{proof}\nOne can verify directly that the composition $\\xi^T_{\\Gamma^\\vee}\\circ i$ gives rise to the desired bijection, where \\[\ni:\\cX^\\vee(\\R^t) \\to \\cX^\\vee(\\R^T)\n\\]\nis the bijection discussed in \\S\\ref{ss:tropicalization} and \\[\n\\xi_{\\Gamma^\\vee}^T:  \\cX^\\vee(\\R^T) \\to \\cAp^{\\vee}(\\R^T) \n\\]\nis the tropicalization of the map $\\xi_{\\Gamma^\\vee}:\\cX^\\vee=\\cA_{\\Gamma^\\vee} \\to  \\cX_{(\\Gamma^\\vee)_{\\prin}}\\cong \\cX_{(\\Gamma_{\\prin})^\\vee}=\\cAp^{\\vee} $ described in \\eqref{eq:def_xi}, see Remarks \\ref{rem:labels} and \\ref{rem:Lprin}. 
\nHowever, for the convenience of the reader we include computations that show in a rather explicit way the necessity to consider \n$\\cX^\\vee(\\Z^t)$ as opposed to $\\cX^\\vee(\\Z^T)$. For simplicity throughout this proof we denote $w_{H_{\\cAp}}$ simply by $w$.\n\nPick a seed $\\seed=(e_i)_{i \\in I}\\in \\orT$ for $\\Gamma$ and consider the seed $\\seed^\\vee$ for $\\Gamma^\\vee$. Denote by $\\widetilde{\\seed}^\\vee$ the seed for $(\\Gamma_{\\prin})^\\vee$ obtained mutating $ \\seed_{0_{\\prin}}$ in the same sequence of directions needed to obtain $\\seed$ from $\\seed_0$. Then\n\\[\n\\lrp{(w^T)^{-1}(0)}_{\\widetilde{\\seed}^\\vee}(\\R^T)=H^\\perp_{\\cAp}=\\{(p^*(n),n)\\in M^\\circ_{\\prin,\\R} \\mid n\\in N_{\\R}\\}\\subset M^\\circ_{\\prin, \\R}=M^\\circ_{\\R}\\oplus N_{\\R}\n\\]\n(see \\eqref{eq:identification} to recall the meaning of $\\lrp{(w^T)^{-1}(0)}_{\\widetilde{\\seed}^\\vee}(\\R^T)$). We now verify that for every $k\\in \\Iuf$ there is a commutative diagram\n\\[\n\\xymatrix{\n\\lrp{(w^T)^{-1}(0)}_{\\widetilde{\\seed}^{\\vee}}(\\R^T) \\ar^{\\lrp{\\mu^{\\cAp^\\vee}_{k}}^T}[rr]  \\ar_{\\pi^{\\cX^\\vee}_1}[d] & & \\lrp{(w^T)^{-1}  (0)}_{\\mu_k(\\widetilde{\\seed}^{\\vee})}(\\R^T) \\ar^{\\pi^{\\cX^\\vee}_2}[d]\n\\\\\n\\cX^{\\vee}_{\\seed^{\\vee}}(\\R^t)\\ar^{\\lrp{\\mu^{\\cX^\\vee}_{k}}^t}[rr] & & \\cX^{\\vee}_{\\mu_k(\\seed^{\\vee})}(\\R^t),\n}\n\\]\nwhere the vertical maps $\\pi^{\\cX^\\vee}_1$ and $\\pi^{\\cX^\\vee}_2$ are both given by $(p^*(n),n)\\mapsto dn$ (recall that $\\cX^{\\vee}_{\\seed^{\\vee}}(\\R^t)=(N^\\vee)^\\circ_{\\R}= (d\\cdot N)_{\\R} =\\cX^{\\vee}_{\\mu_k(\\seed^{\\vee})} (\\R^t)$). 
By definition we have that\n\\begin{eqnarray*}\n    \\lrp{\\mu^{\\cAp^\\vee}_{k}}^T(p^*(n),n)& \\overset{\\eqref{eq:tropical_X_mutation}}{=} & (p^*(n),n)+[\\langle (d e_k,0),(p^*(n),n)\\rangle]_+\\{d_ke_k, \\cdot \\}^{\\vee}_{\\prin} \\\\\n    &=&  (p^*(n),n)+ [p^*(n)(de_k)]_+(\\{d_ke_k, \\cdot\\}^\\vee,d_ke_k)\\\\\n    &=&(p^*(n) + [\\{n, de_k\\}]_+\\{d_ke_k, \\cdot\\}^\\vee,n+ [\\{n, de_k\\}]_+d_ke_k).\n\\end{eqnarray*}\nUsing the facts that $d, d_k>0$ and \n that $d\\max(a,b)=\\max(da,db)$ and $\\max(a,b)=-\\min(-a,-b)$ for all $a,b \\in \\R$, we compute that\n\\begin{eqnarray*}\n\\pi^{\\cX^\\vee}_2\\lrp{\\lrp{\\mu^{\\cAp^\\vee}_{k}}^T(p^*(n),n)} &=&  dn+ d[\\{n, de_k\\}^\\vee]_+d_ke_k\\\\\n&=&  dn+ [\\{dn, de_k\\}^\\vee]_+d_ke_k\\\\\n &=&  dn+[-\\{de_k,dn\\}^{\\vee}]_+d_ke_k\\\\\n &=&  dn+[-\\{d_ke_k,dn\\}^{\\vee}]_+de_k\\\\\n &=&  dn-[\\{d_ke_k,dn\\}^{\\vee}]_-de_k\\\\\n  &=&  dn-[\\langle v_k^\\vee,dn\\rangle]_-de_k\\\\\n&=& dn+[\\langle v_k^\\vee,dn\\rangle]_-(-d_k^\\vee e^\\vee_k)\\\\\n&\\overset{\\eqref{eq:tropical_A_mutation}}{=}& \\lrp{\\mu^{\\cX^\\vee}_{k}}^t (dn)\\\\\n&=& \\lrp{\\mu^{\\cX^\\vee}_{k}}^t \\lrp{\\pi^{\\cX^\\vee}_1(p^*(n),n)}.\n\\end{eqnarray*}\nThis gives the commutativity of the diagram. \nNotice moreover that $\\pi^{\\cX^\\vee}_1$ and $\\pi^{\\cX^\\vee}_2$ are canonical bijections. 
\nThese two facts together imply that we have a well defined bijection\n\\[\n\\pi^{\\cX^\\vee}: (w^T)^{-1}(0)(\\R^T) \\overset{\\sim}{\\longrightarrow} \\cX^{\\vee}(\\R^t).\n\\]\nThe fact that $\\xi^T_{\\Gamma^\\vee}\\circ i$ is the inverse of $\\pi^{\\cX^\\vee} $ follows from noticing that, in lattice identifications of the domain and codomain of $\\xi^T_{\\Gamma^\\vee}$ given by a choice of seed, we have that\n\\[\n\\xi^T_{\\Gamma^\\vee}(dn)=(-p^*(n),-n).\n\\]\n\\end{proof}\nWe can now define cluster scattering diagrams for $ \\cX$ using cluster scattering diagrams for $\\cAp$ and the quotient map $\\tilde{p}:\\cAp \\to \\cX$ described in \\eqref{eq:def_tilde_p} and the content of Lemma \\ref{lem:right_tropical_space}.\nWe define $\\supp(\\scat^{\\cX})$ as \n\\[\n\\supp(\\scat^{\\cX}):=\\pi^{\\cX^\\vee}\\lrp{\\supp (\\scat^{\\cAp})\\cap (w^T_H)^{-1}(0)}\\subset \\cX^\\vee(\\Z^t).\n\\]\nBy definition the support of the scattering diagram $ \\scat^{\\cX}_{\\seed}$ is $\\mathfrak{r}_{\\seed^\\vee}\\lrp{\\supp(\\scat^{\\cX})}$.\nThe scattering functions attached to the walls of $\\supp(\\scat^{\\cX}_\\seed)$ are obtained by applying $\\tilde{p}^*$ to the scattering functions of the corresponding walls of $\\scat^{\\cAp}_\\seed$. \nWe proceed in an analogous way to define broken lines for $\\scat^{\\cX}_\\seed$.\nAs in the previous cases, supports of broken lines are well defined inside $\\cX^\\vee(\\Z^t)$.\n\nThe labeling of a theta function on $\\cX$ with an element of $\\cX^{\\vee}(\\Z^t)$ is obtained using the bijection of Lemma \\ref{lem:right_tropical_space}. 
More precisely,\nfor ${\\bf n} \\in \\cX^\\vee(\\Z^t)$ with ${\\bf n}\\in \\Theta(\\cX)$ we have\n\\[\n\\tilde{p}^*(\\tf^\\cX_{\\bf n})=\\tf^{\\cAp}_{\\xi^T_{\\Gamma^\\vee}\\circ i({\\bf n})}.\n\\]\nExplicitly, in lattice identifications of the tropical spaces, we have that for $dn \\in \\cX^\\vee_{\\seed^\\vee}(\\Z^t) $\n\\[\n\\tilde{p}^*\\lrp{\\tf^{\\cX}_{dn}}:= \n\\tf^{\\cAp}_{(p^*(n),n)}.\n\\]\n\n\n\\begin{example}\n\\label{running_example_1}\nLet $\\epsilon\n=\n\\lrp{\\begin{matrix}\n0 & 2 \\\\\n-1 & 0\n\\end{matrix}}$\nand $d_1=1, d_2=2$. Using the above parametrization we compute\n\\[\n\\tf^{\\cX}_{2(-1,-2)}=X_1^{-1}X_2^{-2}+2X_1^{-1}X_2^{-1}+X_1^{-1}.\n\\]\nIndeed, we have that $\\xi^T_{\\Gamma^\\vee}\\circ i(2(-1,-2))=(2,-2)$ and\n\\[\n\\tf^{\\cAp}_{(2,-2),(-1,-2)}= \\lrp{\\tf^{\\cAp}_{(1,-1),(0,0)}}^2 \\tf^{\\cAp}_{(0,0),(-1,-2)} = \\lrp{\\dfrac{A_1+t_2}{A_2}}^2t_1^{-1}t_2^{-2}= \\tilde{p}^*(X_1^{-1}X_2^{-2}+2X_1^{-1}X_2^{-1}+X_1^{-1}).\n\\]\n\\end{example}\n\n\n\n\\subsubsection{Theta functions on $\\cXe$}\n\\label{tf_fibre}\nAs in the previous subsections we would like to highlight that \n\\begin{center}\n\\emph{every theta function on $ \\cXe$ is naturally labeled by a point of $(\\cXe)^\\vee(\\Z^t)$}\n\\end{center} \nas we now explain.\nThe tropical space $ (\\cXe)^{\\vee}(\\R^t)  $ is the quotient of $ \\cX^{\\vee} (\\R^t)$ by the tropicalization of the action of $T_H$ on $\\cX^{\\vee}$. 
\nIn other words, since the variety $(\\cXe)^{\\vee}$ is a quotient of $\\cX^\\vee$, we can consider the quotient map $\\varpi_{H}: \\cX^\\vee \\to (\\cXe)^\\vee$\nto obtain a surjection\n\\[\n\\varpi_H^t:  \\cX^{\\vee}(\\R^t) \\to (\\cXe)^{\\vee} (\\R^t).\n\\]\nThen, given $\\overline{\\bf n}\\in (\\cXe)^{\\vee} (\\R^t)$ and ${\\bf n}\\in (\\varpi_H^t)^{-1}(\\overline{\\bf n})$ we define \n\\[\n\\tf^{\\cXe}_{\\overline{\\bf n}}=\\tf^{\\cX}_{\\bf n}|_{\\cXe}.\n\\]\nMore concretely, working in lattice identifications of the tropical spaces, we have that $\\cX^{\\vee}_{\\seed^\\vee}(\\R^t) = N_\\R$ and $ (\\cXe)^{\\vee}_{\\seed^\\vee}(\\R^t) {\\cong}N_\\R/H_{\\R}$. \nThen for every $n \\in N$\n\\[\n\\tf^{\\cXe}_{d n + H}=\\tf^{\\cX}_{dn}|_{\\cXe}.\n\\]\nOne can proceed analogously to the previous cases to construct a scattering-diagram-like structure $\\scat^{\\cXe}_{\\seed}$ inside $(\\cXe)^\\vee_{\\seed}(\\Z^t)$. In turn we obtain a description of  $ \\tf^{\\cXe}_{\\overline{\\bf n}}$ using broken lines and use these to define $\\cmid (\\cXe)$ and $\\Theta(\\cXe)$.\n\n\n\n\\subsubsection{The full Fock--Goncharov conjecture}\n\\label{sec:FG_conj}\n\n\n\nLet $\\cV$ be a scheme of the form $ \\cA$, $\\cX$, $\\cA/T_{H}$ or $\\cX_{{\\bf 1}}$. 
\nThe {\\bf upper cluster algebra} of $\\cV$ is defined as \n\\[\n\\text{up}(\\cV):=H^0(\\cV,\\mathcal{O}_{\\cV}).\n\\]\nEvery polynomial theta function on $\\cV$ belongs to $\\text{up}(\\cV)$; therefore, we have a natural $\\Bbbk$-linear map $\\cmid(\\cV)\\to \\text{up}(\\cV)$.\nIf $\\cV$ is one of $\\cA$ (see Remark \\ref{rem:full_rank_assumption}) or $\\cX$ it was proved in \\cite[Theorem 7.5, Corollary 7.13, Theorem 7.16]{GHKK} that this map is in fact an injective homomorphism of algebras.\nThese cases already imply that the same is true if $\\cV$ is of the form $\\cA/T_H$ or $\\cXe$.\n\n\\begin{remark}\n\\label{rem:integral_domain}\n    If $\\cV= \\cA$, $\\cX$, $\\cA/T_{H}$ or $\\cX_{{\\bf 1}}$ then $\\cmid(\\cV)$ is an integral domain. Indeed, $\\cmid(\\cV)$ is a subalgebra of $ \\up(\\cV)=H^0(\\cV,\\mathcal{O}_{\\cV})$ which is a domain as $\\cV$ is irreducible.\n\\end{remark}\n\n\nAs we have seen in the previous subsections theta functions on varieties of the form $ \\cA$ or $\\cA/T_H$ are naturally labeled by the $\\Z^T$-points of their Fock--Goncharov dual, whereas theta functions on varieties of the form $ \\cX$ or $\\cXe$ are naturally labeled by the $\\Z^t$-points of their Fock--Goncharov dual. \nSince we would like to consider all these cases simultaneously we introduce the following notation. 
For $G= \\Z, \\Q$ or $\\R$ we set\n\n\\begin{equation}\n\\label{eq:unif}\n    \\Trop_G(\\cV):=\n    \\begin{cases}\n        \\cV(G^t) &\\text{ if } \\cV=\\cA \\text{ or } \\cV=\\cA/T_H\\vspace{1mm}\\\\\n        \\cV(G^T) \\ & \\text{ if } \\cV=\\cX \\text{ or } \\cV=\\cXe.\n    \\end{cases}\n\\end{equation}\n\nSimilarly, for a positive rational function $g: \\cV \\dashrightarrow \\Bbbk $  we let\n\\begin{equation}\n\\label{eq:unif_function}\n    \\Trop_G(g):=\n    \\begin{cases}\n        g^t &\\text{ if } \\cV=\\cA \\text{ or } \\cV=\\cA/T_H\\vspace{1mm}\\\\\n        g^T \\ &\\text{ if } \\cV=\\cX \\text{ or } \\cV=\\cXe.\n    \\end{cases}\n\\end{equation}\n\nIn particular, if we think of the seed torus $\\cV_\\seed$ as a cluster variety with only frozen directions then $\\Trop_G(\\cV_\\seed)=\\mathfrak{r}_{\\seed}(\\Trop_G(\\cV))=\\cV_{\\seed}(G^t)$, if $\\cV$ is of the form $\\cA$ or $\\cA/T_H$ and $\\Trop_G(\\cV_\\seed)=\\mathfrak{r}_{\\seed}(\\Trop_G(\\cV))=\\cV_{\\seed}(G^T)$, if $\\cV$ is of the form $\\cX$ or $\\cXe$. For later use we also set\n\\begin{equation}\n    \\label{eq:Theta_seed}\n\\Theta(\\cV)_{\\seed^\\vee}:=\\mathfrak{r}_{\\seed^\\vee}(\\Theta(\\cV))\\subset \\Trop_{\\Z}(\\cV^\\vee),\n\\end{equation}\nsee the line just below equation \\eqref{eq:identification}. \nFollowing \\cite{GHKK} we introduce the following definition.\n\\begin{definition}\n\\label{def:full_FG}\nLet $\\cV$ be a scheme of the form $ \\cA$, $\\cX$, $\\cA/T_{H}$ or $\\cX_{{\\bf 1}}$. 
We say that {\\bf the full Fock--Goncharov conjecture} holds for $\\cV$ if\n\\begin{itemize}\n    \\item $\\Theta(\\cV)=\\Trop_{\\Z}(\\cV^{\\vee})$, and\n    \\item the natural map $\\cmid(\\cV) \\to \\text{up}(\\cV)$ is an isomorphism.\n\\end{itemize}\n\\end{definition}\n\n\n\\section{Bases of theta functions for partial minimal models}\n\\label{sec:minimal_models}\n\nIn \\cite{GHKK}, the authors obtained nearly optimal conditions ensuring that the full Fock--Goncharov conjecture holds for a cluster variety. \nHowever, they were able to prove that the ring of regular functions of a partial compactification of a cluster variety has a basis of theta functions only under much stronger conditions. \nIn this section we outline this framework, including quotients and fibres of cluster varieties, and refer to \\cite[\\S9]{GHKK} for a detailed treatment. \nThe main class of (partial) compactifications we shall consider are the (partial) minimal models defined below.\n\n\\begin{definition}{\\cite{GHK_birational}}\n\\label{def:cv_minimal_model}\nLet $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$. An inclusion $\\cV \\subset Y$ as an open subscheme of a normal variety $Y$ is a {\\bf partial minimal model} of $ \\cV$ if the canonical volume form on $\\cV$ has a simple pole along every irreducible divisor of $Y$ contained in $ Y \\setminus \\cV$. It is a {\\bf minimal model} if $Y$ is, in addition, projective. We call $ Y \\setminus \\cV$ the {\\bf boundary} of $\\cV  \\subset Y$.\n\\end{definition}\n\nFor example, if $\\cV$ is a cluster $\\cA$-variety with frozen variables we can let these variables vanish to obtain a partial minimal model of $\\cV$ as in \\cite[Construction B.9]{GHKK}. 
\nSimilarly, if we consider a torus as a cluster variety (by letting $\\Iuf = \\emptyset$) then a partial minimal model is simply a normal toric variety.\n\n\nGiven a partial minimal model $\\cV\\subset Y$, where $\\cV$ is a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$, we would like to describe the set of theta functions on $\\cV$ (resp. $\\cV^\\vee$) that extend to $Y$ in a similar way as the ring of algebraic functions on a normal toric variety is described in toric geometry using polyhedral fans. \nIn order to be able to do so we need that the pair $(\\cV, \\cV^\\vee)$ satisfies a technical condition --\\emph{theta reciprocity}-- that we will introduce shortly. \nFor this, we need to discuss first the \\emph{tropical pairings} associated to the pair $(\\cV,\\cV^{\\vee})$.\n \nIn order to define the tropical pairings we temporarily assume that $\\cV$ is a variety of the form $\\cA$ or $\\cA/T_{H}$ so that $\\cV^\\vee$ is a cluster $\\cX$-variety or a fibre of a cluster $\\cX$-variety, respectively. \nIn particular, $\\Theta(\\cV)\\subset \\cV^\\vee(\\Z^T)= \\Trop_{\\Z}(\\cV^\\vee)$ and $\\Theta(\\cV^\\vee)\\subset \\cV(\\Z^t)=\\Trop_{\\Z}(\\cV)$, see \\eqref{eq:unif}.  \nRecall from Remark~\\ref{rmk:geometric trop} that the set $\\cV(\\Z^t)$ (resp. $\\cV^\\vee(\\Z^t)$) is canonically identified with the geometric tropicalization\n$\\cV^\\trop(\\Z)$ (resp. $(\\cV^\\vee)^\\trop(\\Z)$). \nTherefore, we systematically think of the elements of $\\cV(\\Z^t)$ (resp. $\\cV^\\vee(\\Z^t)$) as divisorial discrete\nvaluations on $\\Bbbk(\\cV)$ (resp. 
$\\Bbbk(\\cV^\\vee)$).\nWe also consider the bijection $i : \\cV^\\vee(\\Z^T) \\to \\cV^\\vee(\\Z^t )$ introduced in \\S\\ref{ss:tropicalization} (see the comment below \\eqref{eq:imap}).\nThe {\\bf tropical pairings} associated to the pair $(\\cV,\\cV^\\vee) $ are the functions $\n    \\langle \\cdot , \\cdot \\rangle : \\Theta(\\cV^{\\vee})  \\times \\Theta (\\cV)  \\to \\Z   $ and $ \\langle \\cdot , \\cdot \\rangle^{\\vee} : \\Theta(\\cV^{\\vee})  \\times \\Theta (\\cV)  \\to \\Z$ given by\n\\[\n    \\langle {\\bf v} , {\\bf b} \\rangle = {\\bf v}(\\tf^{\\cV}_{\\bf b}) \\qquad \\text{and} \\qquad \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee} = i({\\bf b}) (\\tf^{\\cV^{\\vee}}_{\\bf v}).\n\\]\n\n\n\\begin{definition}\n\\label{def:theta_reciprocity}\n Let $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$. The pair $(\\cV,\\cV^\\vee)$ has {\\bf theta reciprocity} if $\\Theta(\\cV)=\\Trop_{\\Z}(\\cV^\\vee)$, $\\Theta(\\cV^{\\vee})=\\Trop_{\\Z}(\\cV)$, and $ \\langle {\\bf v} , {\\bf b} \\rangle = \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee} $ for all $({\\bf v},{\\bf b})\\in \\Trop_{\\Z}(\\cV) \\times \\Trop_{\\Z}(\\cV^\\vee)$.\n\\end{definition}\n\\begin{remark}\n    Definition \\ref{def:theta_reciprocity} should not be considered artificial. In fact, an analogous conjecture for affine log Calabi--Yau varieties with maximal boundary is expected to hold, see \\cite[Remark 9.11]{GHKK}.\n\\end{remark}\n\n\n\\begin{lemma}\n\\label{lem:tf_that_extend}\n    Let $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$ and let $\\cV\\subset Y$ be a (partial) minimal model. 
Suppose that the pair $(\\cV,\\cV^\\vee)$ has theta reciprocity.\n    Then for every seed $\\seed\\in \\orT$ the set of theta functions on $\\cV$ that extend to $Y$ can be described as the intersection of $\\Theta(\\cV^\\vee)_{\\seed^\\vee}$ (see \\eqref{eq:Theta_seed}) with a polyhedral cone of the vector space $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ (see the sentence below equation \\eqref{eq:unif}).\n\\end{lemma}\n\\begin{proof}\nWe treat only the cases $\\cV= \\cA$ or $\\cA/T_H$, as the proof is completely analogous for the cases $\\cV= \\cX$ or $\\cXe$.\nLet $D_1, \\dots, D_s$ be the irreducible divisors of $Y$ contained in the boundary of $\\cV \\subset Y $. \nSince $Y$ is normal and $Y\\setminus (\\cV \\cup D_1 \\cup \\dots \\cup D_s)$ has codimension greater than or equal to $2$ in $Y$, to describe the theta functions on $\\cV$ that extend to $Y$ it is enough to describe the set of theta functions that extend to $D_1, \\dots , D_s$.\nLet $\\ord_{D_j}$ be the discrete valuation on $ \\Bbbk(\\cV)\\setminus \\{ 0 \\}$ associated to the irreducible divisor $D_j$.\nSince $\\cV \\subset Y$ is a partial minimal model, $\\ord_{D_j}$ determines a point of $  \\cV(\\Z^t) $. Since $\\Theta(\\cV^{\\vee})= \\cV(\\Z^t)$ we have $\\ord_{D_j} \\in \\Theta (\\cV^{\\vee})$. Therefore, $\n\\tf^{\\cV^{\\vee}}_{\\ord_{D_j}}$ is a polynomial theta function and its tropicalization is the function\n\\[\n(\\tf_{\\ord_{D_j}}^{\\cV^\\vee})^t:\\cV^{\\vee}( \\Z^t)\\to \\Z\\quad \\text{given by} \\quad  v \\mapsto v (\\tf^{\\cV^{\\vee}}_{\\ord_{D_j}}).\n\\]\nIn other words, $(\\tf_{\\ord_{D_j}}^{\\cV^{\\vee}})^t(v)=\\langle \\ord_{D_j}, i(v) \\rangle$. 
\nSince $\\Theta(\\cV)= \\cV^\\vee(\\Z^T)$ we have that $i(v)\\in \\Theta(\\cV)$ and, therefore, $\\tf^\\cV_{i(v)}$ is a polynomial theta function.\nThe assumption $ \\langle{\\bf v} , {\\bf b} \\rangle = \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee} $ for all ${\\bf v}$ and ${\\bf b}$ implies that\n\\[\n(\\tf_{\\ord_{D_j}}^{\\cV^{\\vee}})^t(v)= (\\tf^{\\cV}_{i(v)})^t(\\ord_{D_j}),\n\\]\nsince\n\\[\n(\\tf_{\\ord_{D_j}}^{\\cV^{\\vee}})^t(v) =\n\\langle \\ord_{D_j}, i(v) \\rangle =\n\\langle \\ord_{D_j}, i(v)\\rangle^{\\vee} =\n\\ord_{D_j}(\\tf^{\\cV}_{i(v)}) =\n(\\tf^{\\cV}_{i(v)})^t(\\ord_{D_j}).\n\\]\nThus a theta function $\\tf^{\\cV}_{i(v)} \\in \\cmid(\\cV)$ extends to $D_j$ if and only if $0\\leq (\\tf^{\\cV^{\\vee}}_{\\ord_{D_j}})^t(v)$. \nIn particular, a theta function $\\tf^\\cV_{i(v)}$ extends to $Y$ if and only if\n\\[\ni(v)\\in \\bigcap_{j=1}^s\\{b\\in\\cV^{\\vee}_{\\seed^\\vee}(\\R^T)\\mid 0\\leq (\\tf^{\\cV^{\\vee}}_{\\ord_{D_j}})^T(b)\\}\n\\]\nsince $g^T(b)=g^t(i(b))$ for every positive function $g$ on $\\cV$, see \\eqref{eq:comparing_tropicalizations}. By definition of tropicalization, the set  $\\bigcap_{j=1}^s\\{b\\in\\cV^{\\vee}_{\\seed^\\vee}(\\R^T)\\mid 0\\leq (\\tf^{\\cV^{\\vee}}_{\\ord_{D_j}})^T(b)\\}$ is a polyhedral cone of $\\cV^{\\vee}_{\\seed^\\vee}(\\R^T)=\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$. \n\\end{proof}\n\nWe now turn to the problem of understanding when the theta functions on $\\cV$ that extend to a (partial) minimal model $\\cV \\subset Y$ form a basis of $H^0(Y, \\mathcal{O}_Y)$. \nThe following notion is central.\n\\begin{definition}\n    \\label{def:respect_order}\n    Let $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$. 
We say that the theta functions on $\\cV$ {\\bf respect the order of vanishing} if\n    for all ${\\bf v}\\in \\cV(\\Z^t)$ and all $\\displaystyle \\sum_{{\\bf q}\\in \\Theta(\\cV)} \\alpha_{\\bf q} \\tf^{\\cV}_{\\bf q}\\in \\cmid(\\cV)$ we have \n\\[\n{\\bf v}\\lrp{\\sum_{{\\bf q}\\in \\Theta(\\cV)} \\alpha_{\\bf q} \\tf^{\\cV}_{\\bf q}} \\geq 0 \\ \\ \\text{ if and only if }\\ \\  {\\bf v}(\\tf^{\\cV}_{\\bf q})\\geq 0  \\text{ for all } {\\bf q} \\text{ such that } \\alpha_{\\bf q}\\neq 0.\n\\]\n\\end{definition}\n\nNotice that in \\cite[Conjecture 9.8]{GHKK} the authors conjecture that the theta functions on $\\cAp$ respect the order of vanishing.\nThe {\\bf superpotential} associated to a partial minimal model $\\cV \\subset Y $ is the function on $\\cV^\\vee$ defined as\n\\begin{equation}\\label{eq:def superpotential}\n    W_{Y}:=\\sum_{j=1}^s \\tf^{\\cV^{\\vee}}_{j},\n\\end{equation}\nwhere \n\\begin{equation}\n\\label{eq:def superpotential_summands}\n   \\tf^{\\cV^{\\vee}}_{j}=\\begin{cases}\n       \\tf^{\\cV^{\\vee}}_{\\ord_{D_j}} &\\text{ if } \\cV=\\cA \\text{ or } \\cV=\\cA/T_H\\vspace{1mm}\\\\\n       \\tf^{\\cV^{\\vee}}_{i(\\ord_{D_j})} \\ &\\text{ if } \\cV=\\cX \\text{ or } \\cV=\\cXe.\n   \\end{cases}\n\\end{equation}\nThe {\\bf superpotential cone} associated to $W_Y$ is\n\\begin{equation}\\label{eq:def Xi}\n    \\Xi_Y:= \\{ {\\bf v} \\in \\Trop_{\\R}(\\cV^\\vee) \\mid \\Trop_{\\R}(W_Y)({\\bf v})\\geq0 \\},\n\\end{equation} \nsee equation \\eqref{eq:unif_function}.\n\nWe further set $\\Xi_{Y;\\seed^\\vee}:= \\mathfrak{r}_{\\seed^\\vee}(\\Xi_Y)\\subset \\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$.\nNotice that if the theta functions on $\\cV$ respect the order of vanishing then $\\Xi_{Y;\\seed^\\vee}$ is precisely the polyhedral subset of Lemma \\ref{lem:tf_that_extend}.\nThe next result follows at once from the definitions.\n\n\\begin{lemma}\n\\label{lem:basis_for_pmm}\n    Let $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$ and let $\\cV\\subset Y$ 
be a (partial) minimal model. Suppose that the full Fock--Goncharov conjecture holds for $\\cV$, that the pair $(\\cV, \\cV^\\vee) $ has theta reciprocity and that the theta functions on $\\cV$ respect the order of vanishing. \n    Then the set of theta functions on $\\cV$ parametrized by the points of $\\Xi_Y(\\Z)$ is a basis of $H^0(Y, \\mathcal{O}_Y)$.\n\\end{lemma}\n\n\\begin{lemma}\nSuppose there is a cluster ensemble map $p:\\cA \\to \\cX$ that is an isomorphism. Then theta functions on $\\cA$ respect the order of vanishing if and only if theta functions on $\\cX$ respect the order of vanishing.\n\\end{lemma}\n\\begin{proof}\n    The result follows at once from the fact that $p^*(\\tf^{\\cX}_{\\bf n})= \\tf^{\\cA}_{(p^{\\vee})^T\\circ i ({\\bf n})}$. \n\\end{proof}\n\n\nWe propose the following definition, which allows one to obtain the conclusion of Lemma \\ref{lem:basis_for_pmm} without having to verify all its assumptions. \nWe apply this in \\S\\ref{sec:NO_Grass}.\n\n\\begin{definition}\n\\label{def:enough_tf}\nWe say that $ \\cV \\subset Y$ has {\\bf enough theta functions} if the full Fock--Goncharov conjecture holds for $\\cV$ and the theta functions on $\\cV$ parametrized  by $\\Xi_{Y} (\\Z)$ form a basis of $H^0(Y, \\mathcal{O}_Y)$.\n\\end{definition}\n\nWe now recall an important notion introduced in \\cite[Definition 9.1]{GHKK} that can be used to verify in a combinatorial way that a partial minimal model $\\cA\\subset Y$ has enough theta functions provided $Y$ is obtained by letting the frozen variables vanish.\n\n\\begin{definition}\nWe say that a seed $\\seed=(e_i)_{ i \\in I}$ is {\\bf optimized} for a point $ {\\bf n} \\in \\cA(\\Z^t) $ if under the identification of $\\cA(\\Z^t)$ with $N^\\circ$ afforded by $\\seed$ we have that $\\{ e_k, n_{\\seed} \\}\\geq 0 $ for all $k \\in \\Iuf$.\n\\end{definition}\n\n\\begin{lemma}\n\\thlabel{lemm:enough_tf}\nAssume that $\\cA$ satisfies the full Fock--Goncharov conjecture. 
Let $\\cA\\subset Y$ be a partial minimal model of $\\cA$ and let $D_1, \\dots , D_s$ be the irreducible divisors of $Y$ contained in $Y\\setminus \\cA$.  Assume that $p^*_2|_{N^{\\circ}}: N^{\\circ}\\to \\Nuf^*$ is surjective and that the point $\\ord_{D_j}\\in \\cA^{\\vee}(\\Z^t)$ has an optimized seed for every $1 \\leq j \\leq s$. Then the partial minimal model $\\cA \\subset Y$ has enough theta functions.\n\\end{lemma}\n\\begin{proof}\nSince $p^*_2|_{N^{\\circ}}$ is surjective we have that $\\cAp$ is isomorphic to $\\cA\\times T_M $ (see \\cite[Lemma B.7]{GHKK}). \nConsider the partial compactification $\\cAp \\subset Y \\times T_M$. Its boundary is isomorphic to $D\\times T_M$ and the irreducible components of the boundary are the divisors $\\widetilde{D}_1, \\dots, \\widetilde{D}_s$, where $\\widetilde{D}_j:=D_j \\times T_M $. \nBy hypothesis $\\ord_{D_j} $ is optimized for some seed $\\seed_j$. \nLet $\\widetilde{\\seed}_j$ be the seed for $\\Gamma_{\\prin}$ obtained mutating $ \\seed_{0_{\\prin}}$ in the same sequence of directions needed to obtain $\\seed_j$ from $\\seed_0$.\nObserve that for every $1\\leq j \\leq s$, under the identifications \n\\[\n\\cA_{\\prin,\\widetilde{\\seed}_j}(\\Z^t) = N_\\prin^{\\circ} = \\cA_{\\seed_j}(\\Z^t) \\oplus  T_M(\\Z^t), \n\\]\nthe point $ \\ord_{\\widetilde{D}_j}$ of $\\cAp(\\Z^t)$ corresponds to the point $ (\\ord_{D_j},0)$ of $\\cA(\\Z^t)\\times T_M(\\Z^t)$.\n\nRecall that the index set of unfrozen indices for $\\cAp$ is $\\Iuf$. \nIn particular, for every $k \\in \\Iuf$ we have that the $k^{\\text{th}}$ element of $\\widetilde{\\seed}_{j}$ is of the form $( e_{k;j},0)$, where $e_{k;j}$ is the $k^{\\text{th}}$ element of $\\seed_j$. 
Then for each $1\\leq j\\leq s$ we compute\n\\begin{align*}\n\\{ (e_{k;j},0),  \\ord_{\\widetilde{D}_j}\\} & = \\{ (e_{k;j},0), (\\ord_{D_j},0)\\} \\\\\n& = \\{e_{k;j}, \\ord_{D_j} \\} \\geq 0.\n\\end{align*}\n\nThis tells us that $\\widetilde{\\seed}_j$ is optimized for $\\ord_{\\widetilde{D}_j}$.\nLet $W_{Y\\times T_M}=\\sum_{j=1}^{s}\\tf^{\\cAp^\\vee}_{\\ord_{\\widetilde{D}_j}}$ be the superpotential associated to $\\cAp \\subset Y \\times T_M$.\nBy Proposition 9.7 and Lemma 9.10 (3) of \\cite{GHKK} the integral points of $\\Xi_{Y \\times T_M}$ can be described as\n\\[\n\\Xi_{Y \\times T_M}(\\Z) = \\{ b \\in \\Theta(\\cAp) \\mid \\ord_{i(b)} (\\tf^{\\cAp^{\\vee}}_j)\\geq 0 \\text{ for all } j\\}.\n\\] \nWe define $\\cmid(Y\\times T_M)$ to be the vector subspace of $\\cmid (\\cAp)$ spanned by the theta functions parametrized by $\\Xi_{Y \\times T_M}(\\Z)$. \nFor the convenience of the reader we point out that in the notation of \\cite[\\S9]{GHKK} the partial compactification $Y\\times T_M$ of $\\cAp$ would be denoted by $\\overline{\\cA}_{\\text{prin}}^{S}$ and $\\Xi_{Y \\times T_M}(\\Z)$ by $\\Theta(\\overline{\\cA}_{\\text{prin}}^{S})$, where  $S:=\\{ i(\\ord_{\\widetilde{D}_1}),\\dots , i(\\ord_{\\widetilde{D}_s})\\}$.\nBy \\cite[Lemma 9.10(2)]{GHKK} we have\n\\[\\cmid(Y\\times T_M)=H^0(Y\\times T_M, \\mathcal{O}_{Y\\times T_M}) \\cong H^0(Y, \\mathcal{O}_{Y})\\otimes_{\\Bbbk} H^0( T_M, \\mathcal{O}_{ T_M}).\n\\]\nIn particular, $H^0(Y\\times T_M, \\mathcal{O}_{Y\\times T_M})$ has a theta basis parametrized by $\\Xi_{Y \\times T_M}(\\Z)$.\nThe theta function $ \\tf^{\\cA}_{\\ord_{D_j}}$ is obtained from $\\tf^{\\cAp}_{\\ord_{\\widetilde{D}_j}}$ by specializing the coefficients to $1$. 
This implies that\n\\[\n\\Xi_{Y}\\cap \\Trop_{\\Z}(\\cA^{\\vee})= \\Xi_{Y \\times T_M} \\cap \\Trop_{\\Z}(\\cA^{\\vee}).\n\\]\nWe conclude that  $ H^0(Y, \\mathcal{O}_Y)$ has a theta basis parametrized by the integral points of $\\Xi_{Y}$.\n\n\\end{proof}\n\n\\section{Valuations on middle cluster algebras and adapted bases}\n\\label{sec:cluster_valuations}\n In \\cite{FO20} the authors noticed that the so-called {\\bf g}-vectors associated to cluster variables can be used to construct valuations on $\\Bbbk(\\cA)$ provided $\\Gamma$ is of full-rank. In this section we study some properties of these valuations. We also extend this approach to quotients of $\\cA$ and (fibres of) $\\cX$.\n\n Let $\\cV$ be a scheme of the form $\\cA, \\cX, \\cA/T_H$ or $\\cXe$. \n Recall from \\S\\ref{sec:tf_and_parametrizations} that every theta function on $\\cV$ is labeled with a point of $\\Trop_{\\Z}(\\cV^\\vee)$, see \\eqref{eq:unif}. \n\n\\begin{definition}\n\\label{def:dom_order}\nSuppose $\\Gamma$ is of full-rank and let $ \\seed \\in \\orT$ be a seed for $\\Gamma$. \nThe {\\bf opposite dominance order} on $M^\\circ$ defined by $\\seed$ is the partial order $\\preceq_{\\seed}$ on $M^\\circ$ determined by the following condition:\n\\begin{equation}\n\\label{eq:dom_order}\nm_1 \\prec_{\\seed} m_2 \\ \\Leftrightarrow \\ m_2=  m_1 + p^{\\ast}_1(n) \\text{ for some }n\\in N^+_{\\uf, \\seed}.\n\\end{equation}\n\\end{definition}\n\n\\begin{remark}\n\\label{rem:dom_order}\nIn Definition \\ref{def:dom_order}, $m_1\\preceq_{\\seed} m_2$ means that either $m_1 \\prec_\\seed m_2 $ or $m_1=m_2$. We will also adopt this notation for other orders we consider.\nThe dominance order was originally considered in \\cite[Proof of Proposition 4.3]{Labardini_et_al_CC-alg} and is the opposite of the order given in Definition \\ref{def:dom_order}. 
\nThis order was exploited by Qin \\cite{Qin17,Qintropical} in his work on bases for cluster algebras.\nThe full-rank condition is needed so that $\\preceq_{\\seed}$ is antisymmetric. However, observe that for every seed $\\seed$ such that $\\text{ker}(p_1^*)\\cap N^+_{\\uf , \\seed} = \\emptyset$, equation \\eqref{eq:dom_order} still determines a partial order on $M^\\circ$ even if $\\Gamma$ is not of full-rank. \nNonetheless, whenever we talk about an (opposite) dominance order in this paper we will be tacitly assuming that $\\Gamma$ is of full-rank.\n\\end{remark}\n\nIt is straightforward to verify that $\\preceq_{\\seed}$ is {\\bf linear}. That is, $ m_1 \\preceq_\\seed m_2 $ implies that $m_1 + m \\preceq_\\seed m_2 + m$ for all $m \\in M^\\circ$.\n\n\n\\begin{definition}\n\\label{def:val}\nLet $A$ be an integral domain with a $\\Bbbk$-algebra structure, $L$ a lattice isomorphic to $\\Z^r$ and $\\leq$ a total order on $ L$. A {\\bf valuation} on $A$ with values in $L$ is a function $\\nu : A\\setminus \\{0 \\} \\to (L,<)$ such that \n\\begin{itemize}\n    \\item[(1)] $\\nu(f+g) \\geq  \\min\\{\\nu(f), \\nu(g)\\}$, unless $f+g=0$,\n    \\item[(2)] $\\nu(fg)= \\nu(f) + \\nu(g)$,\n    \\item[(3)] $\\nu(cf)=\\nu(f)$ for all $c \\in \\Bbbk^* $.\n\\end{itemize}\nFor $l \\in L$ we define the subspace\n$\nA_{\\nu \\geq l}:= \\{ x\\in A \\setminus \\{0\\} \\mid \\nu(x)\\geq l\\} \\cup \\{ 0 \\}\n$\nof $A$. The subspace $\nA_{\\nu > l}$ is defined analogously.\nWe say that $\\nu $ has {\\bf 1-dimensional leaves} if the dimension of the quotient\n\\begin{equation}\n\\label{eq:graded_piece}\nA_l:=A_{\\nu \\geq l} \\big{/} A_{\\nu > l}    \n\\end{equation}\nis either $0$ or $1$ for all $l\\in L$. A basis $B$ of $A$ is {\\bf adapted} for $ \\nu $ if for all $l\\in L$ the set $B\\cap A_{\\nu \\geq l}$ is a basis of $A_{\\nu\\geq l}$.\n\\end{definition}\n\n\\begin{lemma}\n\\thlabel{product_and_order}\n\nAssume $\\Gamma $ is of full-rank. 
\nLet $\\vartheta^{\\cA}_{m_1},\\vartheta^{\\cA}_{m_2}\\in \\text{mid}(\\cA)$ with $m_1,m_2\\in \\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee})=M^\\circ$. \nThen the product $\\vartheta^{\\cA}_{m_1}\\vartheta^{\\cA}_{m_2}$ expressed in the theta basis of $\\text{mid}(\\cA)$ has the following form\n\\[\n\\vartheta^{\\cA}_{m_1}\\vartheta^{\\cA}_{m_2}= \\vartheta^{\\cA}_{m_1+m_2}+ \\sum_{m_1+m_2 \\prec_{\\seed}  m}c_{m}\\vartheta^{\\cA}_{m}.\n\\]\n\\end{lemma}\n\n\n\\begin{proof}\nFirst notice that for any broken line $\\gamma $ we have that\n\\begin{equation*}\nF(\\gamma)=I(\\gamma) +a_{1}p^*_1(n_{1})+ \\dots + a_{r}p^*_1(n_{r}),  \n\\end{equation*}\nwhere $a_1, \\dots , a_r $ are non-negative integers and $n_1, \\dots , n_r \\in N^+_{\\uf, \\seed}$. \nThis follows from \\cite[Theorem 1.13]{GHKK} and the bending rule of broken lines (\\ie \\thref{def:genbroken}(4)). \nIn particular, we have that $a_{1}n_{1}+ \\dots + a_{r}n_{r}\\in N^+_{\\uf, \\seed} \\cup \\{ 0 \\} $. \nMoreover, since $\\Gamma$ is of full-rank, \n$a_{1}p^*_1(n_{1})+ \\dots + a_{r}p^*_1(n_{r}) = 0$ if and only if $a_1=\\cdots = a_r =0$.\nTherefore, $I(\\gamma) \\preceq_{\\seed} F(\\gamma)$ and $I(\\gamma)=F(\\gamma)$ if and only if $\\gamma$ does not bend at all.\n\nThe statement we want to prove already follows from the observations made above.\nIndeed, by \\cite[Definition-Lemma 6.2]{GHKK} we know that $ \\alpha(m_1,m_2,m)\\neq 0$ if and only if there exist broken lines $\\gamma_1$ and $\\gamma_2$ such that \n$I(\\gamma_i)=m_i$ for $i \\in \\{1,2\\}$ and $F(\\gamma_1)+F(\\gamma_2)=m=\\gamma_1(0)=\\gamma_2(0)$.\nTherefore, if $ \\alpha(m_1,m_2,m)\\neq 0$ then $ m_1 + m_2=I(\\gamma_1) + I(\\gamma_2) \\preceq_{\\seed} m $. \nMoreover, the equality\n$ m_1+ m_2=m$\nholds if and only if neither $\\gamma_1$ nor $\\gamma_2$ bends at all. \nThis latter case can be realized in a unique way; therefore, $\\alpha(m_1,m_2,m_1+m_2)=1$. 
\n\\end{proof}\n\nFrom now on the symbol $\\leq_{\\seed} $ is used to denote a total order on $M^\\circ$ refining $\\preceq_{\\seed}$.\n\n\\begin{definition}\nLet ${\\bf m}=(m_{\\seed^\\vee})\\in \\Trop_{\\Z}(\\cA^{\\vee})$. The {\\bf g-vector of} $ \\tf^{\\cA}_{\\bf m}$ {\\bf with respect to} $\\seed $ is\n\\begin{equation}\n\\label{eq:red-g-val-A}\n{\\bf g}_{\\seed}\\left(\\tf^{\\cA}_{\\bf m}\\right):= m_{\\seed^\\vee}\n\\in \\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee}).\n\\end{equation}\n\\end{definition}\n\n\\begin{definition}\n\\thlabel{g_valuation_A}\nAssume $\\Gamma$ is of full-rank and think of $M^{\\circ}$ as $\\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee})$. Let $\\gv_{\\seed}:\\cmid(\\cA) \\setminus \\{ 0\\} \\to (M^{\\circ},\\leq_{\\seed})$ be the map given by \n\\begin{equation}\n\\label{eq:g_val}\n    \\gv_{\\seed}(f):= \\min{}_{\\leq_{\\seed}}\\{m_1, \\dots , m_t\\},\n\\end{equation}\nwhere $f=c_1\\vartheta^{\\cA}_{m_1} + \\dots + c_t\\vartheta^{\\cA}_{m_t}$, $m_j\\in M^\\circ$ and $c_j\\not=0$ for all $j=1,\\dots,t$ is the expression of $f$ in the theta basis of $\\text{mid}(\\cA)$.\n\\end{definition}\n\n\n\n\n\\begin{lemma}\n\\thlabel{g_is_val}\nFor every seed $\\seed$ the map $\\gv_{\\seed} $ is a valuation on $\\cmid(\\cA)$ with 1-dimensional leaves and the theta basis $\\{ \\tf_{m} \\mid m\\in \\Theta (\\cA) \\}$ is adapted for $\\gv_{\\seed} $.\n\\end{lemma}\n\\begin{proof}\nThis statement follows from \\cite[Remark 2.30]{KM_Khovanskii_bases} but for the convenience of the reader we give a proof here. Items (1) and (3) of Definition~\\ref{def:val} follow directly from the definition of $\\gv_{\\seed}$. 
\nFor item (2) consider the expressions $f=\\sum_{i=1}^r c_i\\vartheta^{\\cA}_{m_i}$ and $g=\\sum_{j=1}^s c'_j\\vartheta^{\\cA}_{m'_j}$ where all $c_i$ and $c'_j$ are non-zero.\nThen by \\thref{product_and_order}\n\\begin{eqnarray}\\label{eq:fg in basis}\nfg=\\sum_{i,j} c_ic'_j\\left(\\vartheta^{\\cA}_{m_i+m'_j} + \\sum_{m_i+m'_j\\prec_{\\seed} m}c_{m}\\vartheta^{\\cA}_{m}\\right).\n\\end{eqnarray}\nBy definition of $\\gv_{\\seed}$ we have $m_\\mu:=\\gv_{\\seed}(f)<_{\\seed} m_i$ for all $i\\in \\{1,\\dots, r\\} \\setminus \\{ \\mu \\}$ and $m'_\\nu:=\\gv_{\\seed}(g)<_{\\seed} m'_j$ for all $j\\in \\{1,\\dots,s\\}\\setminus \\{\\nu\\}$. \nWe need to show that the term $\\vartheta_{m_\\mu+m'_\\nu}$ appears with non-zero coefficient in $fg$.\nAssume there exists $(i,j)\\not =(\\mu,\\nu)$ such that $m_\\mu+m'_\\nu=m_i+m'_j$. \nThen as $\\leq_{\\seed}$ is linear we have\n\\[\nm_\\mu +m'_\\nu \\leq_{\\seed} m_\\mu + m'_j \\leq_{\\seed} m_i + m'_j, \n\\]\nwith at least one of the two inequalities strict, a contradiction. \nMoreover, since $\\leq_{\\seed}$ refines $\\preceq_{\\seed}$, every $m$ appearing in the inner sums of \\eqref{eq:fg in basis} satisfies $m_\\mu+m'_\\nu <_{\\seed} m$.\nHence, the term $\\vartheta_{m_\\mu+m'_\\nu}$ appears in the expression \\eqref{eq:fg in basis} of $fg$ with coefficient $c_\\mu c'_\\nu\\not =0$ and $\\gv_{\\seed}(fg)=m_\\mu+m'_\\nu=\\gv_{\\seed}(f)+\\gv_{\\seed}(g)$.\n\nThe fact that ${\\bf g}_{\\seed}$ has 1-dimensional leaves follows directly from (\\ref{eq:g_val}). It is also clear from the definitions that for $m\\in M^{\\circ}$ the subspace $\\cmid(\\cA)_{m} $ as in (\\ref{eq:graded_piece}) is isomorphic to $\\Bbbk\\cdot \\tf^{\\cA}_{m}$ if $m \\in \\Theta(\\cA)$ and $0$-dimensional otherwise. 
\nIn particular, the fact that we have a bijection between the set of values of $\\gv_\\seed$ and the elements of the theta basis is equivalent to the theta basis being an adapted basis, see \\cite[Remark 2.30]{KM_Khovanskii_bases} .\n\\end{proof}\n\n\n\n\\begin{corollary}\nThe image of the valuation ${\\bf g}_{\\seed}$ is independent of the linear refinement $\\leq_{\\seed}$ of $\\preceq_{\\seed}$.\n\\end{corollary}\n\\begin{proof}\nSince the theta basis is adapted for ${\\bf g}_{\\seed}$ we have \n\\[\n{\\bf g}_{\\seed}\\lrp{\\cmid(\\cA)\\setminus \\{0\\}}= {\\bf g}_{\\seed}\\lrp{\\Theta(\\cA)}.\n\\]\nThe result follows.\n\\end{proof}\n\n\\begin{remark}\n\\thlabel{g-val-field}\nSince $\\cmid(\\cA)$ is a domain (see Remark \\ref{rem:integral_domain}) whose associated field of fractions is isomorphic to $\\Bbbk(A_i :i \\in I)$, we can extend the valuation $ {\\bf g}_{\\vb s} $ on $\\text{mid}(\\cA)$ to a valuation on $\\Bbbk(A_i :i \\in I)$ by declaring ${\\bf g}_{\\vb s} (f/g):={\\bf g}_{\\vb s} (f)- {\\bf g}_{\\vb s} (g) $.\n\\end{remark}\n\nThe valuation ${\\bf g}_{\\seed} $ is called the {\\bf {\\bf g}-vector valuation associated to $\\seed$}. \n%Shortly we will discuss {\\bf g}-vector valuations for quotients of $\\cA$ so we will write   ${\\bf g}^{\\cA}_{\\seed} $ to stress that this valuation is defined on $\\Bbbk(\\cA)$.\n\n\nWe now turn our attention to quotients of $\\cA $. We keep the assumption that $\\Gamma$ is of full-rank and consider a saturated sublattice $H=H_{\\cA}$ of $K^\\circ$. 
\nRecall from \\S\\ref{tf_quotient} that \n\\[\n\\Trop_{\\Z}((\\cA/T_H)^{\\vee}_{\\seed^\\vee})= H^{\\perp}.\n\\]\nSince $ \\Theta(\\cA/T_{H})_{\\seed^\\vee}\\subset  H^{\\perp}$, we can restrict the total order $\\leq_{\\seed}$ on $M^{\\circ}$ to $H^{\\perp}$ to obtain a {\\bf g}-vector valuation on $\\cmid(\\cA/T_{H})$ associated to $\\seed$ as in the previous case:\n\\[\n{\\bf g}_{\\seed}: \\cmid(\\cA/T_H)\\setminus\\{0\\} \\to \\Trop_{\\Z}((\\cA/T_H)^{\\vee}_{\\seed^\\vee}).\n\\]\n\n\\begin{remark}\n\\label{rem:g_val_quotient}\n    As opposed to the case of $\\cA$, in general the field of fractions of $\\cmid(\\cA/T_H)$ might not be isomorphic to $\\Bbbk(\\cA/T_H)$. \n    This fails for example if the smallest cone in $\\Trop_{\\R}((\\cA/T_H)^{\\vee}_{\\seed^\\vee})$ containing $\\Theta(\\cA/T_H)_{\\seed^\\vee}$ is not full-dimensional.\n    However, the field of fractions of $\\cmid(\\cA/T_H)$ is isomorphic to $\\Bbbk(\\cA/T_H)$ provided $\\cA/T_H$ satisfies the full Fock--Goncharov conjecture. \n    In such a case, a {\\bf g}-vector valuation on $\\cmid(\\cA/T_H)$ can be extended to $\\Bbbk(\\cA/T_H)$ as in \\thref{g-val-field}.\n\\end{remark}\n\nWe now treat the case of $\\cX$. Fix a cluster ensemble lattice map $p^*:N \\to M^{\\circ} $ and a seed $\\seed $. \nConsider the identifications $\\Trop_{\\Z}(\\cX^\\vee_{\\seed})= d\\cdot N$ and  $\\Trop_{\\Z}(\\cA^{\\vee}_{\\prin,\\widetilde{\\seed}^\\vee}) = M^{\\circ}_{\\prin}=M^{\\circ}\\oplus N$ where $\\widetilde{\\seed}$ is the seed for $\\Gamma_\\prin$ obtained mutating $\\seed_{0_\\prin}$ in the same sequence of directions needed to obtain $\\seed$ from $\\seed_0$. \nRecall from \\S\\ref{sec:tf_X} that we have an inclusion $\\Trop_{\\Z}(\\cX^\\vee_{\\seed})\\to \\Trop_{\\Z}(\\cA^{\\vee}_{\\prin,\\widetilde{\\seed}^\\vee})$ given by $dn \\mapsto (p^*(n),n)$.\n\n\\begin{definition}\nLet ${\\bf n}=(dn_{\\seed^\\vee})\\in \\Trop_{\\Z}(\\cX^{\\vee})$. 
The {\\bf c-vector of} $ \\tf^{\\cX}_{\\bf n}$ {\\bf with respect to} $\\seed $ is\n\\begin{equation}\n\\label{eq:red-g-val-X}\n{\\bf c}_{\\seed}\\left(\\tf^{\\cX}_{\\bf n}\\right):= dn_{\\seed^\\vee}\n\\in \\Trop_{\\Z}(\\cX^{\\vee}_{\\seed^\\vee}).\n\\end{equation}\n\\end{definition}\n\n\\begin{remark}\nObserve that $\\cv_\\seed (\\tf^{\\cX}_{\\bf n})$ is an element of $d\\cdot N$. In practice we could work with the lattice $N$ as opposed to $d\\cdot N$ as they are canonically isomorphic. \nThe lattice $N$ is the set where the ${\\bf c}$-vectors (in the sense of \\cite{NZ}) live. \n\\end{remark}\n\n\\begin{definition}\nThe {\\bf divisibility order} on $ N$ determined by $\\seed$ is the partial order $\\preceq_{\\seed, \\text{div}}$ determined by\n\\[\nn_1 \\prec_{\\seed, \\text{div}} n_2 \\text{ if and only if } \nn_2- n_1 \\in N_{\\seed}^+.\n\\]\n\\end{definition}\n\n\\begin{lemma}\n\\thlabel{lem:restriction}\nThe restriction of $\\preceq_{\\widetilde{\\seed}^\\vee}$ to the $N$ component of $ M^\\circ_{\\prin}$ coincides with the divisibility order $\\preceq_{\\seed,\\text{div}}$ on $N$.\n\\end{lemma}\n\\begin{proof}\nLet $p^*_{\\prin,1}:N_{\\uf, \\prin}\\to M^\\circ_\\prin$ be the map given by $(n,m)\\mapsto \\{ (n,m), \\cdot \\}_{\\prin}$ (in other words, $p^*_{\\prin,1}$ corresponds to the map $p_1^*$ in \\eqref{eq:p12star} for $\\Gamma_{\\prin}$). \nIn particular, $p^*_{\\prin,1} (n,0) = (p^*_1(n), n) $.\nLet $n_1,n_2 \\in N$ be distinct elements such that $n_2-n_1 \\in N^+_\\seed$. Let $ \\widetilde{m}_i=(p^*_1(n_i),n_i)$ for $i = 1,2$. Then $\\widetilde{m}_2 -\\widetilde{m}_1= (p^*_1(n_2-n_1), n_2 -n_1)$. 
The result follows.\n\\end{proof}\n\nThe next result follows at once from \\thref{lem:restriction} and \\thref{product_and_order}.\n\n\\begin{lemma}\n\\label{lem:tf_X_pointedness}\nLet $ \\tf^{\\cX}_{dn_1},\\tf^{\\cX}_{dn_2} \\in \\cmid(\\cX)$ with $dn_1, dn_2 \\in \\Trop_{\\Z}(\\cX^\\vee_{\\seed^\\vee})=d\\cdot N$.\n Then the product $\\vartheta^{\\cX}_{dn_1}\\vartheta^{\\cX}_{dn_2}$ expressed in the theta basis of $\\cmid(\\cX)$ is of the following form\n\\[\n\\vartheta^{\\cX}_{dn_1}\\vartheta^{\\cX}_{dn_2}= \\vartheta^{\\cX}_{dn_1+dn_2}+ \\sum_{n_1+n_2 \\ \\prec_{\\seed, \\text{div}} \\ n} c_{n}\\vartheta^{\\cX}_{dn}.\n\\]\n\\end{lemma}\n\nFrom now on we let $\\leq_{\\seed,\\text{div}}$ be any total order refining $\\preceq_{\\seed, \\text{div}}$. \n\n\\begin{corollary}\\label{cor:gv on midX}\nLet ${\\bf c}_{\\seed}:\\cmid(\\cX) \\setminus \\{ 0\\} \\to (d \\cdot N,\\leq_{\\seed,\\text{div}})$ be the map defined by \n\\[\n{\\bf c }_{\\seed}(f):= \\min{}_{\\leq_{\\seed,\\text{div}}}\\{dn_1, \\dots , dn_t\\},\n\\]\nwhere $f=c_1\\tf^{\\cX}_{d n_1} + \\dots + c_t\\tf^{\\cX}_{d n_t}$ is the expression of $f$ in the theta basis of $\\text{mid}(\\cX)$. Then ${\\bf c }_{\\seed}$ is a valuation with 1-dimensional leaves and the theta basis for $\\cmid(\\cX) $ is adapted for ${\\bf c}_\\seed$.\n\\end{corollary}\n\n\n\nWe now let $\\cX_{\\bf 1}$ be the fibre of $\\cX$ associated to a sublattice $H:= H_{\\cX} \\subset K$. 
In order to define a {\\bf c}-vector valuation on $\\cmid(\\cX_{\\bf 1})$ we need that \n\\[\nH\\cap N^+_{\\seed}= \\emptyset.\n\\]\nIndeed, if this condition holds, then $\\preceq_{\\seed, \\text{div}}$ induces a well-defined partial order on $N/H =\\mathcal X_{\\bf 1,\\seed}$ determined by\n\\[\nn_1 + H \\prec_{\\seed, \\text{div}} n_2+H \\quad \\text{ if and only if } \\quad n_2 - n_1 \\in N^+_{\\seed}+ H.\n\\]\nThe rest of the construction follows from the cases already treated.\n\n\\begin{lemma}\\label{lem:cval_gval}\nSuppose $\\Gamma$ is of full-rank and let $p: \\cA \\to \\cX$ be a cluster ensemble map. Then we have a commutative diagram\n\\[\n\\xymatrix{\n\\cmid(\\cX) \\setminus \\{0\\} \\ar^{p^*}[r] \\ar_{{\\bf c}_{\\seed}}[d] &  \\cmid(\\cA) \\setminus \\{0\\} \\ar^{{\\bf g}_{\\seed}}[d] \\\\\n\\Trop_{\\Z}(\\cX^{\\vee}_{\\seed^\\vee}) \\ar_{(p^\\vee)^T\\circ i} [r] & \\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee}) \n}\n\\]\n\\end{lemma}\n\\begin{proof}\nIt is enough to show that for ${\\bf n} \\in \\Theta(\\cX) $ we have\n\\[\n\\gv_{\\seed}(p^*(\\tf^\\cX_{\\bf n}))=(p^\\vee)^T\\circ i({\\bf c}_{\\seed} (\\tf^\\cX_{\\bf n})).\n\\]\nLet $dn=\\mathfrak{r}_{\\seed^\\vee}({\\bf n})$. We have that\n\\[\n\\tf^{\\cX}_{dn}=z^n + \\sum_{n\\prec_{\\seed, \\text{div}}n'}a_{n'}z^{n'}.\n\\]\nTherefore,\n\\[\np^*(\\tf^{\\cX}_{dn})=z^{p^*(n)} + \\sum_{n\\prec_{\\seed, \\text{div}}n'}a_{n'}z^{p^*(n')}.\n\\]\nWe conclude that $\\gv_{\\seed}(p^*(\\tf^\\cX_{\\bf n}))=p^*(n)$. On the other hand we have that $ {\\bf c}_{\\seed} (\\tf^\\cX_{\\bf n})=dn$. 
We compute\n\\begin{align*}\n    (\\Lp)^T\\circ i (dn)= ((\\Lp)^*)^*(-dn)=\\lrp{-\\frac{1}{d}(p^*)^*}^*(-dn)=p^*(n).\n\\end{align*}\nThe claim follows.\n\\end{proof}\n\nWe would like to treat {\\bf g}-vector valuations for varieties of the form $\\cA$ and $\\cA/T_H$ and {\\bf c}-vector valuations on $\\cX$ and $\\cXe$ in a uniform way.\nWith this in mind we introduce the following notation.\n\n\\begin{notation}\n\\thlabel{not:g-val}\nLet $\\cV$ be a cluster variety and $ \\cV^{\\vee}$ its Fock--Goncharov dual. The cluster valuation on $\\cmid(\\cV)$ associated to a seed $\\seed\\in \\orT$ is \n\\[\n\\nu_{\\seed}:\\cmid (\\cV)\\setminus\\{0\\} \\to (\\Trop_{\\Z}(\\cV^\\vee_{\\seed^\\vee}), <_{\\seed}),\n\\]\nwhere $\\Trop_{\\Z}(\\cV^\\vee_{\\seed^\\vee})$ is as in \\eqref{eq:unif} and $<_{\\seed}$ is a linear order on $\\Trop_{\\Z}(\\cV^\\vee_{\\seed^\\vee})$ that refines $\\prec_\\seed$ if $\\cV=\\cA$ or $\\cA/T_H$, and refines $\\prec_{\\seed,\\text{div}}$ if $\\cV=\\cX$ or $\\cXe$. \n\\end{notation}\n\n\\section{Newton--Okounkov bodies} \\label{sec:no}\n\nIn this section we provide a general approach to constructing Newton--Okounkov bodies associated to certain partial minimal models of varieties with a cluster structure. \nIn particular, we treat a situation that often arises in representation theory where the universal torsor of a projective variety has a cluster structure of type $\\cA$.  \nThe Newton--Okounkov bodies we construct depend on the choice of an initial seed. 
Hence we discuss how the bodies associated to different choices of initial seed are related and introduce  the intrinsic Newton--Okounkov body which is seed independent.\n\n\\subsection{Schemes and ensembles with cluster structure}\n\n\\begin{definition}\\thlabel{def:cluster-structure}\nWe say a smooth \nscheme (over $\\Bbbk$) $V$ {\\bf can be endowed with cluster structure of type} $\\cV$ if there is a birational map $ \\Phi: \\cV \\dashrightarrow V$ which is an isomorphism outside a codimension two subscheme of the domain and range.\nIn this setting, we say that the pair $(V,\\Phi)$ is {\\bf{a scheme with cluster structure of type}} $\\cV$.\n\\end{definition}\n\n\\begin{remark}\nWe are straying slightly from \\cite{CMNcpt} in \\thref{def:cluster-structure}.\nSpecifically, we are now including $\\Phi$ as part of the data defining a scheme with cluster structure.\nSo, given two different birational maps $\\Phi_1:\\cV_1 \\dashrightarrow V$ and $\\Phi_2: \\cV_2 \\dashrightarrow V$ as in \\thref{def:cluster-structure}, we now consider $(V,\\Phi_1)$ and $(V,\\Phi_2)$ different as schemes with cluster structure (as is the case, for example, for open positroid varieties, see Remark~\\ref{rmk:open positroid}). \nNevertheless, when the map $\\Phi$ is clear from the context or we are just dealing with a single birational map $\\cV \\dashrightarrow V$, we will simply say that $V$ has a cluster structure of type $\\cV$.\n\\end{remark}\n\nLet $V=(V,\\Phi)$ be a scheme with a cluster structure of type $\\cV$. Since $V$ is normal and isomorphic to $\\cV$ up to co-dimension $2$ then $V$ and $\\cV$ have isomorphic rings of regular functions. In turn, we can talk about polynomial theta functions on $V$ which we denote by $\\tf^V_{\\bf v}$ for ${\\bf v}\\in \\Theta (\\cV)$.\nMoreover, recall that $\\cV$ is log Calabi--Yau. By \\cite[Lemma~1.4]{GHK_birational} $V$ is also log Calabi--Yau. 
Hence, $V$ has a canonical volume form whose pullback by $\\Phi $ coincides with the canonical volume form on $\\cV$. Moreover, a (partial) minimal model $V\\subset Y$ and its boundary can be defined as in Definition \\ref{def:cv_minimal_model}.\n\n\n\\begin{definition}{\\cite{GHK_birational}}\n An inclusion $V \\subset Y$ as an open subscheme of a normal variety $Y$ is a {\\bf partial minimal model} of $ V$ if the canonical volume form on $V$ has a simple pole along every irreducible divisor of $Y$ contained in $ Y \\setminus V$. It is a {\\bf minimal model} if $Y$ is, in addition, projective. We call $ Y \\setminus V$ the {\\bf boundary} of $V  \\subset Y$.\n\\end{definition}\n\n\n\\begin{definition}\n    Suppose $\\Phi:\\cV \\dashrightarrow V$ endows $V$ with a cluster structure of type $\\cV$ and that the cluster valuation $\\nu_{\\seed}$ extends to $\\Bbbk(\\cV) $. \n    Then the {\\bf cluster valuation} $\\nu^{\\Phi}_{\\seed}:\\Bbbk(V)^*\\to \\Trop_{\\Z}(\\cV^\\vee)$ is given by\n    \\[\n    \\nu^{\\Phi}_{\\seed}(f)=  \\nu_{\\seed}(\\Phi^*(f)).\n    \\]\n\\end{definition}\n\n\\begin{definition}\n    Suppose $\\Phi_{\\cA}:\\cA \\dashrightarrow V_1$ and $\\Phi_{\\cX}:\\cX \\dashrightarrow V_2$ endow $V_1$ (resp. $V_2$) with cluster structures of type $ \\cA$ (resp. $\\cX$). We say that $V_1 \\overset{\\tau}{\\to} V_2$ is a cluster ensemble structure if there exists a cluster ensemble map $p:\\cA \\to \\cX$ such that the following diagram commutes\n    \\[\n    \\xymatrix{\n    V_1 \\ar^{\\tau}[r] & V_2 \\\\\n    \\cA \\ar@{-->}^{\\Phi_{\\cA}}[u] \\ar_p[r] & \\cX \\ar@{-->}_{\\Phi_{\\cX}}[u].\n    }\n    \\]\n\\end{definition}\n\n\n\n\n\\subsection{Newton--Okounkov bodies for Weil divisors supported on the boundary}\n\\label{sec:NO_bodies}\nThroughout this section we let $\\cV$ be a scheme of the form $ \\cA$, $\\cX$, $\\cA/T_{H}$ or $\\cX_{{\\bf 1}}$. 
Whenever we talk about a cluster valuation on $\\cmid(\\cV)$ we are implicitly assuming we are in a setting where such a valuation exists, see \\S\\ref{sec:cluster_valuations}.\n\n\\begin{definition}{\\cite{GHKK}}\n\\label{def:positive_set}\nA closed subset $S\\subseteq \\Trop_{\\R}(\\cV^{\\vee})$ is {\\bf positive} if for any positive integers $d_1, d_2$, any $p_1\\in d_1\\cdot S(\\Z)$, $p_2\\in d_2\\cdot S(\\Z)$ and any $r \\in \\Trop_{\\Z}(\\cV^{\\vee})$ such that $\\alpha (p_1,p_2,r)\\neq 0$, we have that $r \\in (d_1 +d_2)\\cdot S(\\Z)$. \n\\end{definition}\n\n\\begin{remark}\n\\label{rem:positive_sets_in_vs}\n  We can also define positive sets inside $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ in exactly the same way they are defined in Definition \\ref{def:positive_set}. \n  In particular we have that  $S\\subset \\Trop_{\\R}(\\cV^{\\vee})$ is positive if and only if $\\mathfrak{r}_{\\seed^\\vee}(S)\\subset \\Trop_{\\R}(\\cV^\\vee_{\\seed^\\vee})$ is positive.\n\\end{remark}\n\nIn \\cite[\\S8]{GHKK} the authors discuss how positive sets give rise both to partial minimal models of cluster varieties and to toric degenerations of them. In this section we study the inverse problem. \nNamely, we let $(V,\\Phi)$ be a scheme with a cluster structure of type $\\cV$ and construct Newton--Okounkov bodies associated to a partial minimal model $V \\subset Y$ (see \\S\\ref{sec:minimal_models}). \nThen we show that under suitable hypotheses these Newton--Okounkov bodies are positive sets. \nWe let $D_1, \\dots , D_s $ be the irreducible divisors of $Y$ contained in the boundary of $V\\subset Y$ and let $D:=\\bigcup_{j=1}^s D_j$.\n%\\footnote{Since $Y$ is normal a Cartier divisor $\\{(U_i,f_i)\\}$ on $Y$ is fully determined by the corresponding Weil divisor $\\sum \\text{div}(f_i)$, where $\\text{div}(f_i)$ is the principal divisor associated to $f_i$.} \n\n\nGiven a Weil divisor $D'$ on $Y$ we denote by $R(D')$ the associated {\\bf section ring}. 
\nRecall that $R(D')$ can be described as the $\\Z_{\\geq 0}$-graded ring whose $k^{\\mathrm{th}}$ homogeneous component is\n\\begin{equation*}\n    R_k(D') := H^0(Y, \\mathcal{O}(kD'))= \\lrc{  f\\in \\Bbbk(Y)^* \\mid \\text{div}(f)+kD'\\geq 0 }\\cup \\{ 0\\},\n\\end{equation*}\nwhere $\\text{div}(f)$ is the principal divisor associated to $f$.\nEven more concretely, if $D'=c_1  D'_1 + \\cdots + c_{s'}D'_{s'}$, where $D'_1, \\dots , D'_{s'}$ are distinct prime divisors of $Y$ and $c_1, \\dots , c_{s'}$ are non-negative integers, then $ R_k(D')$ is the vector space consisting of the rational functions on $Y$ that are regular on the complement of $\\bigcup_{j=1}^{s'} D'_j$ and whose order of vanishing along every prime divisor $D'_j$ is bounded below by $-kc_j$. The multiplication of $R(D')$ is induced by the multiplication on $ \\Bbbk(Y)$. \n\n\\begin{definition}\n\\thlabel{def:NOlb}\nLet $\\nu:\\Bbbk(Y)\\setminus \\{ 0 \\} \\to L$ be a valuation, where $(L, < )$ is a linearly ordered lattice. Let $D'$ be a Weil divisor on $Y $ having a non-zero global section. For a choice of non-zero section $\\tau \\in R_1 (D')$ the associated {\\bf Newton--Okounkov body} is\n\\eqn{\n\\Delta_\\nu(D',\\tau) := \\overline{\\conv\\Bigg( \\bigcup_{k\\geq 1}  \\lrc{\\frac{\\nu\\lrp{f/\\tau^k}}{k} \\mid f\\in R_k(D')\\setminus \\{0\\} } \\Bigg) }\\subseteq L\\otimes \\R,\n}\nwhere $\\conv $ denotes the convex hull and the closure is taken with respect to the standard topology of $L\\otimes \\R$.\n\\end{definition}\n\nFrom now on we assume that $D'$ has a non-zero global section.\nWe would like to use a cluster valuation $\\cval: \\Bbbk(V)\\setminus \\{ 0\\} \\to (\\Trop_{\\Z}(\\cV^{\\vee}_{\\seed^\\vee}),<_{\\seed})$ to construct Newton--Okounkov bodies. Notice that if $\\cV$ satisfies the full Fock--Goncharov conjecture, then it is possible to do so as we can extend $\\nu_{\\seed}$ from $\\cmid(\\cV)=\\up(\\cV)$ to $\\Bbbk(\\cV) = \\Bbbk(Y)$. 
\nObserve, moreover, that if $D'$ is supported on $D$ (that is $D'=\\sum_{j=1}^s c_jD_j$ for some integers $c_1,\\dots , c_s$) then every graded piece $R_k(D') $ is contained in $H^{0}(V,\\mathcal{O}_V)\\cong H^{0}(\\cV,\\mathcal{O}_{\\cV})$, so elements of $R_k(D')$ can be described using the theta basis for $H^0(\\cV,\\mathcal{O}_{\\cV})$. \nMoreover, $\\ord_{D_j}\\in \\cV(\\Z^t)$, so we can define $\\tf^{\\cV}_j$ as in \\eqref{eq:def superpotential_summands}.\n\n\\begin{definition}\n\\label{def:graded_theta_basis}\nAssume $\\cV$ satisfies the full Fock--Goncharov conjecture and that $D'$ is of the form $D'=\\sum_{j=1}^s c_jD_j$. We say that $R(D')$ {\\bf has a graded theta basis} if for every integer $k\\geq 0$ the set of theta functions on $\\cV$ parametrized by the integral points of\n\\[\nP_k(D'):= \\bigcap_{j=1}^s \\lrc{b\\in \\Trop_{\\R}(\\cV^{\\vee}) \\mid  \\Trop_{\\R}(\\tf^{\\cV^\\vee}_j)(b) \\geq -kc_j}\n\\] \nis a basis for $R_k(D')$.\n\\end{definition}\n\nThe reader should notice that in case $\\cV$ has theta reciprocity (see Definition \\ref{def:theta_reciprocity}), then the definition of $P_k(D')$ becomes very natural from the perspective of toric geometry, see \\S\\ref{sec:minimal_models}. We now introduce a notion that allows us to make a good choice for the section $\\tau$.\n\n\n\n \\begin{definition}\n \\label{def:linear_action}\n A subset $L\\subset \\Theta(\\cV)$ is {\\bf linear} if \n \\begin{itemize}\n \\item for any $a,b\\in L$ there exists a unique $r\\in\\Theta(\\cV)$ such that $\\alpha(a,b,r)\\neq 0$ and moreover, $r\\in L$,\n \\item for each $a\\in L$ there exists a unique $b\\in L$ such that $\\tf^{\\cV}_a \\tf^{\\cV}_b=1 $. \n \\end{itemize}\nWe further say that a linear subset $L$ {\\bf acts linearly} on $\\Theta(\\cV)$ if for any $a\\in L$ and $ b \\in \\Theta(\\cV)$ there exists a unique $r\\in \\Trop_{\\Z}(\\cV^{\\vee})$ such that $\\alpha(a,b,r)\\neq 0$. 
\n \\end{definition}\n \n\nFor example, if $\\cV=\\cA$  then $\\mathfrak{r}_{\\seed}^{-1}(\\Nuf^\\perp)$ is linear and acts linearly on $\\Theta(\\cV)$. If $\\cV =\\cX$\nthen $\\mathfrak{r}_{\\seed}^{-1}(\\ker(p_2^*))$ is linear and acts linearly on $\\Theta(\\cV)$.\n\n\n\\begin{theorem}\n\\label{NO_bodies_are_positive}\nLet $V\\subset Y$ be a partial minimal model. Assume the full Fock--Goncharov conjecture holds for $ \\cV$. Let $D'=\\sum_{j=1}^s c_j D_j$ be a Weil divisor on $Y$ supported on $D$ such that $R(D')$ has a graded theta basis. Let $\\tau\\in R_1(D')$ be such that $\\nu^{\\Phi}_{\\seed}(\\tau) $ belongs to a linear subset of $ \\Trop_{\\Z}(\\cV^{\\vee}) $ acting linearly on $\\Trop_{\\Z}(\\cV^{\\vee}) $. Then the Newton--Okounkov body  $\\Delta_{\\nu^{\\Phi}_{\\seed}}(D',\\tau)\\subset \\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ is a positive set.\n\\end{theorem}\n\n\\begin{proof}\nTo make notation lighter, throughout this proof we denote $\\Delta_{\\nu^{\\Phi}_{\\seed}}(D',\\tau) $ simply by $ \\Delta $, $P_k(D')_\\seed$ by $P_k$ and $\\nu^{\\Phi}_{\\seed}$ by $\\nu_{\\seed}$.\nWe work in the lattice identification $ \\Trop_{\\Z}(\\cV^{\\vee}_{\\seed^\\vee})$ of $\\Trop_{\\Z}(\\cV^{\\vee})$.\nThe linear subset of the statement corresponds to a sublattice $L \\subseteq \\Trop_{\\Z}(\\cV^{\\vee}_{\\seed^\\vee})$.\n\nConsider $d_1, d_2 \\in \\Z_{>0}$ and $p_1\\in d_1\\Delta(\\Z)$, $p_2\\in d_2\\Delta(\\Z)$. We have to show that for any $r \\in \\Trop_{\\Z}(\\cV^{\\vee}_{\\seed^\\vee})$ with $\\alpha (p_1,p_2,r)\\neq 0$ we have $r \\in (d_1 +d_2)\\Delta(\\Z)$. \nFor this it is enough to show that $k\\Delta = P_k - k\\nu_{\\seed}(\\tau)$ for all $k \\in \\Z_{>0}$ as we now explain.\\footnote{In fact, it is enough that the equality holds at the level of integral points, namely, $k\\Delta(\\Z)= P_k(\\Z) - k\\nu_{\\seed}(\\tau)$. 
However, we are able to show the stronger condition $k\\Delta= P_k - k\\nu_{\\seed}(\\tau)$.}\nIf this is the case then for $i=1,2$, the point $p_i+d_i\\nu_\\seed(\\tau)$ belongs to $P_{d_i}(\\Z)$.\nBy hypothesis $\\tf^V_{p_i+d_i \\nu_{\\seed}(\\tau)}\\in R_{d_i}(D')$.\nIn particular, the product $\\tf^V_{p_1+d_1 \\nu_{\\seed}(\\tau)}\\tf^V_{p_2+d_2 \\nu_{\\seed}(\\tau)} $ must belong to $R_{d_1+d_2}(D')$ and this product must be expressed as a linear combination of theta functions that belong to  $R_{d_1+d_2}(D')$.  \nTo finish we just need to convince ourselves that \n\\[\n\\alpha(p_1+d_1\\nu_\\seed(\\tau),p_2+d_2\\nu_\\seed(\\tau), r+(d_1+d_2)\\nu_\\seed(\\tau))\\neq 0\n\\]\nas this would imply \n\\[\nr+(d_1+d_2)\\nu_\\seed(\\tau)\\in P_{d_1+d_2}(\\Z)=(d_1+d_2)\\Delta(\\Z)+ (d_1+d_2)\\nu_\\seed(\\tau) .\n\\]\nHowever, this follows at once from the fact that $\\nu_\\seed(\\tau)$ belongs to the linear subset $L$. \nIndeed, the condition $\\alpha(p_1,p_2,r)\\neq 0$ implies the existence of a pair of broken lines $\\gamma_1, \\gamma_2$ such that $I(\\gamma_i)=p_i$ and $ F(\\gamma_1)+F(\\gamma_2)=r$. \nSince $\\nu_\\seed(\\tau)\\in L$ we can construct new broken lines $\\gamma'_1$ and $\\gamma'_2$ such that $I(\\gamma'_i)=p_i+d_i\\nu_\\seed(\\tau)$ and $ F(\\gamma'_1)+F(\\gamma'_2)=r+(d_1+d_2)\\nu_\\seed(\\tau)$ by changing the direction of all the domains of linearity of $\\gamma_i$ by  $d_i\\nu_\\seed(\\tau)$. \n\nWe now proceed to show that $k\\Delta= P_k-k\\nu_\\seed(\\tau) $ for all $k \\in \\Z_{>0}$. \nFirst notice that $aP_1= P_a$ for all $a\\in \\R_{\\geq 0}$ (if $g$ is a positive Laurent polynomial then $g^T(ax)=ag^T(x)$ provided $a$ is non-negative). 
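\nFor instance (a sketch in which the coordinates $z_1,z_2$, the function $g$ and the min-convention for tropicalization are illustrative assumptions and not part of the setup above), take $g=z_1+z_1^{-1}z_2$, so that $g^T(x)=\\min\\{x_1,\\,x_2-x_1\\}$. For $a\\geq 0$ we get\n\\[\ng^T(ax)=\\min\\{ax_1,\\,ax_2-ax_1\\}=a\\min\\{x_1,\\,x_2-x_1\\}=ag^T(x),\n\\]\nwhile non-negativity of $a$ cannot be dropped: for $x=(1,0)$ one has $g^T(x)=-1$ and $g^T(-x)=\\min\\{-1,1\\}=-1\\neq -g^T(x)$. 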
\nSince $P_k$ is closed and convex in order to show that $k \\Delta  \\subset P_k- k\\nu_\\seed(\\tau)$ it is enough to show that $\\frac{k}{k'}\\ \\nu_\\seed(f/\\tau^{k'})=\\frac{k}{k'}\\ \\nu_\\seed(f)-k \\nu_\\seed(\\tau)$ belongs to $P_k-k\\nu_\\seed(\\tau)$ for all $k'\\geq 1$ and all $f\\in R_{k'}(D')\\setminus \\{0\\}$. This follows at once from the fact that $\\frac{k}{k'}\\nu_\\seed(f)\\in P_k$ as $\\frac{k}{k'}P_{k'}=P_k$.\nTo obtain the reverse inclusion it is enough to show that the inclusion holds at the level of rational points, namely, $P_k(\\Q)-k\\nu_\\seed(\\tau)\\subset k\\Delta(\\Q)$. \nIndeed, since $P_k$ is a finite intersection of rational hyperplanes in $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ it can be described as the convex hull of its rational points. \nIf $x\\in P_k(\\Q)$ then $\\frac{x}{k}\\in \\frac{1}{k}P_k(\\Q)=P_1(\\Q)$. \nLet $d\\in \\Z_{>0}$ be such that $x':=\\frac{dx}{k} \\in \\Trop_{\\Z}(\\cV^{\\vee}_{\\seed^\\vee})$. \nIn particular, $x'\\in P_{d}(\\Z)_{\\seed}$ which gives that $d^{-1}\\nu_\\seed(\\frac{\\tf_{x'}}{\\tau^{d}})\\in \\Delta$. Finally, notice that $d^{-1}\\nu_\\seed(\\frac{\\tf_{x'}}{\\tau^{d}})=d^{-1}(\\nu_\\seed(\\tf_{x'})-d\\nu_\\seed(\\tau))=d^{-1}x'-\\nu_\\seed(\\tau)$ which implies $x-k\\nu_\\seed(\\tau) \\in k\\Delta$. \n\\end{proof}\n\nIn Theorem \\ref{NO_bodies_are_positive} the assumption that $R(D')$ has a graded theta basis might seem rather strong. We now provide a situation in which this hypothesis holds and in the next subsection we treat a more robust framework in which this condition follows directly from the equivariant nature of theta functions.\n\n\n\\begin{lemma}\n\\label{lem:graded_theta_basis}\nLet $V\\subset Y$ be a minimal model. Assume $D=\\sum_{j=1}^n D_j$ is ample with $D'=cD$ very ample for some $c\\in \\Z_{>0}$. Assume further that the image of the embedding of $Y$ into a projective space given by $D'$ is projectively normal. 
\nIf $\\cV$ has theta reciprocity and the theta functions on $\\cV$ respect the order of vanishing (see Definition~\\ref{def:respect_order}), then $R(D')$ has a graded theta basis.\n\\end{lemma}\n\\begin{proof}\nIt is enough to treat the case $\\cV=V$.\nConsider the affine cone $\\widetilde{Y}$ of the embedding of $Y$ into a projective space given by $D'$. We consider the canonical projection $\\widetilde{Y}\\setminus \\{ 0\\} \\overset{\\pi}{\\to } Y $ and let $ \\cV':= \\pi^{-1}(\\cV) $.\nObserve that $\\cV'\\cong \\cV \\times \\C^*$. \nWe may think of $\\cV'$ as the cluster variety obtained from $\\cV$ by adding a frozen index and extending trivially the bilinear form in the fixed data defining $\\cV$. \nIn particular, $\\text{up}(\\cV')= \\text{up}(\\cV)[x^{\\pm 1}]$, where $x$ is the coordinate for the $\\C^*$ component. Notice that the theta functions on $\\cV'$ are of the form $\\tf^{\\cV'}_{(p,h)}=\\tf^{\\cV'}_{(0,h)}\\tf^{\\cV'}_{(p,0)} =x^h\\tf^{\\cV}_p$, where $\\tf^{\\cV}_p$ is a theta function on $\\cV$ and $h \\in \\Z=\\Trop_{\\Z}(\\C^*) $.\nAn analogous description holds for the theta functions on $(\\cV')^\\vee \\cong \\cV^\\vee \\times \\C^* $. Namely, these theta functions are of the form $x^h\\tf_q^{\\cV^\\vee}$ for some $h\\in \\Z$.\nWe consider the inclusion $R(D')\\hookrightarrow \\text{up}(\\cV')$ given by sending a homogeneous element $f\\in R_k(D')$ to $x^kf$. The map is well defined since $f$ is regular on $ \\cV $. \nMoreover, if we let $\\widetilde{D}_j:= \\pi^{-1}(D_j)$ then for all $j$ we have $\\ord_{\\widetilde{D}_j}\\lrp{x^{k}}=k$ and $\\ord_{\\widetilde{D}_j}\\lrp{\\tf^{\\cV'}_{(p,0)}}=\\ord_{D_j}\\lrp{\\tf^V_p}$. \nIn particular, thinking of $\\Trop_{\\Z}(\\cV')$ as $\\Trop_{\\Z}(\\cV)\\times \\Z$ we have  $\\ord_{\\widetilde{D}_k}=(\\ord_{D_k},1)$. 
\nSince theta functions on $\\cV$ respect the order of vanishing, the same holds for the theta functions on $\\cV'$.\n%This is the argument Tim and I came up with (we might want to leave it as a comment and add it if the referee asks for it): Indeed, let $f=\\sum_{i=1}^sc_ix^{a_i}\\tf^{\\cV}_{b_i}$ be such that $\\ord_{D'}(f)\\geq 0$. Without loss of generality we may assume $\\ord_{D'}(x^{a_1}\\tf^{\\cV}_{b_1}),\\dots ,\\ord_{D'}(x^{a_k}\\tf^{\\cV}_{b_k})\\leq 0$ and $\\ord_{D'}$ of the other terms are non-negative. For each $1\\leq i \\leq k$ let $\\sum_{r_i}\\alpha_{ r_i}x^{a_i}z^{b_{r_i}}$ be the terms in $x^{a_i}\\tf^{\\cV}_{b_i}$ achieving the worst pole of $(x^{a_i}\\tf^{\\cV}_{b_i})$ along $D'$. Since $\\ord_{D'}(f)\\geq 0 $ we have that $\\sum_{i=1}^kc_i\\sum_{r_i}x^{a_i}z^{b_{r_i}}=0$. Since $x$ does not appear in the monomials of the form $z^{b_r}$ we have that $a_1=\\dots=a_k$ and that $\\sum_{i=1}^kc_i\\sum_{r_i}z^{b_{r_i}}=0$. It follows that if we write $\\ord_{D'}= (r, \\ord_{D})$ for $D$ some divisor at infinity for $\\cV$ then $\\ord_{D}(\\sum_{i=1}^S c_i \\tf^{\\cV}_{b_i})\\geq 0$. Since theta functions on $ \\cV$ respect the order of vanishing we have that $\\ord_{D}(\\tf^{\\cV}_{b_1})\\geq 0$ for all $i$. In particular the poles of $x^{a_1}\\tf^{\\cV}_{b_1},\\dots , x^{a_k}\\tf^{\\cV}_{b_k}$ come from $x$ and these poles cancel in $f$. Without loss of generality $x^{a_1}$ gives the worst pole of $x$ along $D'$. Consider $f_2:=x^{a_1}(\\sum_{i=1}^k x^{a_k-ai}\\tf^{\\cV}_{b_i})= \\sum_{i=1}^k c_ix^{a_i}\\tf^{\\cV}_{b_i} $. Since $\\ord_{D'}(f_2)\\geq 0$ we have that $\\sum_{i=1}^k c_ix^{a_k-ai}\\tf^{\\cV}_{b_i}=0$, but this is impossible since theta functions are linearly independent.\nThis implies that for every $a \\in \\Z$ and every $j$, $\\ord_{D_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}\\geq a$ if and only if $\\ord_{D_j}(\\tf_q^{\\cV})\\geq a$ for all $q$ such that $\\alpha_q \\neq 0$. 
\nTo see this there is only one implication to be checked (the other follows from the axioms of valuations). \nSo assume $\\ord_{D_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}\\geq a$.\nSince $\\ord_{D_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}=\\ord_{\\widetilde{D}_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}$ and $x^{-a}\\tf_q^{\\cV}$ is a theta function on $\\cV'$ for all $q$, we have the following\n\\begin{align*}\n    \\ord_{D_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}\\geq a & \\Longleftrightarrow   \\ord_{\\widetilde{D}_j}\\lrp{\\sum_q \\alpha_q \\tf_q^{\\cV}}\\geq a \\\\\n    & \\Longleftrightarrow   \\ord_{\\widetilde{D}_j}\\lrp{x^{-a}\\sum_q \\alpha_q \\tf_q^{\\cV}} \\geq 0 \\\\\n    & \\Longleftrightarrow   \\ord_{\\widetilde{D}_j}(x^{-a} \\tf_q^{\\cV})\\geq 0  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0 \\\\\n     & \\Longleftrightarrow   \\ord_{\\widetilde{D}_j}( \\tf_q^{\\cV})\\geq a  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0 \\\\\n     & \\Longleftrightarrow   \\ord_{D_j}( \\tf_q^{\\cV})\\geq a  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0.\n\\end{align*}\nSince $D'$ is very ample and $Y$ is projectively normal in its embedding given by $D'$ we have that $H^0(\\widetilde{Y}, \\mathcal{O}_{\\widetilde{Y}}) \\cong R(D') \\hookrightarrow \\text{up}(\\cV')$.\nIn particular, if we express $f \\in R_k(D')$ as $f= \\sum_q \\alpha_q \\tf^{\\cV}_q$, we have that $\\ord_{D_j}\\lrp{\\tf^{\\cV}_q}\\geq -kc$ for all $j$ and all $q $ such that $\\alpha_q \\neq 0$. This means that $\\tf_q^{\\cV} \\in R_k(D')$ for all such $q$. \nIn particular, the theta functions of $\\cV$ that lie in $R_k(D')$ form a basis of $R_k(D')$. 
By theta reciprocity, such theta functions are precisely those parametrized by $P_k(D')$.\n\\end{proof}\n\n\\begin{remark}\\label{rmk:toric degen}\nIf $R(D')$ is finitely generated and the semigroup generated by the image of $\\nu^\\cV_{\\seed}$ is of full rank and finitely generated, then there is a one-parameter toric degeneration of $Y$ to the toric variety associated to $ \\Delta_{\\nu^\\cV_{\\seed}}(D',\\tau)$ \\cite{An13}\\footnote{That is, there is a scheme $\\mathcal Y$ and a flat morphism $ \\mathcal{Y}\\to \\mathbb A^1$ whose generic fibre is isomorphic to $Y$ and special fibre isomorphic to the toric variety associated to $ \\Delta_{\\nu_{\\seed}}(D',\\tau)$.}.\nAs explained in \\cite[\\S8.5]{GHKK} for cluster varieties of type $\\cA$ (regardless of the full-rank assumption) a polyhedral positive set defines a partial compactification $\\cA_{\\text{prin}} \\subset \\overline{\\cA}_{\\text{prin}}$. \nThis compactification comes with a flat morphism $\\overline{\\cA}_{\\text{prin}}\\to \\mathbb A^r$ having $\\overline{\\cA}=Y$ as fibre over ${\\bf 1}=(1, \\dots , 1)$ and whose fibre over $0$ is the toric variety associated to the positive set.\nTherefore, both constructions can be used to degenerate varieties with a cluster structure to the same toric variety. However, the variety given by the latter construction contains various intermediate fibres that lie in between $\\mathcal A=\\cV$ and a toric variety. Moreover, while Anderson's degeneration produces a $(\\Bbbk^*)$-equivariant family, for the latter degeneration this is the case if and only if $\\Gamma$ is of full rank.\n\\end{remark}\n\n\\subsection{Newton--Okounkov bodies for line bundles via universal torsors}\n\\label{sec:universal_torsors}\n\nIn this section we consider a particularly nice geometric situation that arises often in representation theory. 
We let $Y$ be an irreducible normal projective scheme whose Picard group $\\text{Pic}(Y) $ is free of finite rank $\\rho \\in \\Z_{>0}$ (recall that $\\text{Pic}(Y) $ is always abelian).\nFollowing \\cite[\\S2]{Hau02} (see also \\cite[\\S3]{BH03}, \\cite[Chapter 1]{ADHL}, \\cite[\\S4]{GHK_birational} or \\cite[\\S2]{HK00}), we consider the universal torsor of $Y$ and the associated Cox ring (\\cf Remark \\ref{rem:Cox}). For the convenience of the reader we recall these concepts. We begin by considering the quasi-coherent sheaf of $\\mathcal{O}_Y$-modules\n\\[\n\\bigoplus_{[\\lb] \\in \\text{Pic}(Y)} \\lb. \n\\]\nIn essence, the universal torsor of $Y$ is obtained by applying a relative spectrum construction (also denoted by {\\bf Spec}) to this sheaf. \nHowever, the choice of the representative $\\lb $ in the class $[\\lb]$ prevents this sheaf from having a natural $\\mathcal{O}_Y$-algebra structure. \nTo address this situation one can proceed as in \\cite[\\S2]{HK00} and consider line bundles $\\lb_1, \\dots, \\lb_{\\rho} $ whose isomorphism classes form a basis of $\\text{Pic}(Y)$. 
For $v=(v_{1},\\dots, v_{\\rho})\\in \\Z^{\\rho}$ we let $\\lb^{v}= \\lb_1^{\\otimes v_1}\\otimes \\cdots \\otimes \\lb_{\\rho}^{\\otimes v_{\\rho}}$ and consider the quasi-coherent sheaf\n\\[\n\\bigoplus_{v \\in \\Z^{\\rho}}\\lb^{v}.\n\\]\nThis sheaf has a natural structure of a reduced $\\mathcal{O}_Y$-algebra that is locally of finite type over $\\mathcal{O}_Y$ (the component associated to the zero element of $\\text{Pic}(Y)$).\nThis means that for sufficiently small\naffine open subsets $U$ of $Y$, the space $\\bigoplus_{v \\in \\Z^{\\rho}}\\lb^{v}(U)$ is a finitely generated $\\mathcal{O}_Y(U)$-algebra.\nThe universal torsor of $Y$ is obtained by gluing the affine schemes $\\text{Spec}\\lrp{\\bigoplus_{v \\in \\Z^{\\rho}}\\lb^{v}(U)}$.\n\n\\begin{definition}\nThe {\\bf universal torsor} of $ Y$ is \n\\[\n\\UT_Y= \\textbf{Spec}\\lrp{\\bigoplus_{v \\in \\Z^{\\rho}}\\lb^{v} }.\n\\]\nThe {\\bf Cox ring} of $Y$ is\n\\[\n\\text{Cox}(Y)= H^0 (\\UT_Y,\\mathcal{O}_{\\UT_Y}).\n\\]\n\\end{definition}\n\nUniversal torsors can be used to generalize the construction of a projective variety from its affine cone as follows.\nObserve that the inclusion of $ \\mathcal{O}_Y  $ as the degree $0$ part of $\\bigoplus_{v \\in \\Z^{\\rho}}\\lb^{v} $ gives rise to an affine regular map $\\UT_Y\\to Y$.\nSince $\\text{Cox}(Y)$ is $\\text{Pic}(Y)$-graded there is an action of $T_{\\text{Pic}(Y)^*}= \\text{Spec}(\\C[\\text{Pic}(Y)])$ on $\\UT_Y$. \nThis action is free and the map $\\UT_Y\\to Y$ is the associated quotient map (see \\cite[Remark 1.4]{Hau02}).\n\n\\begin{remark}\n\\label{rem:Cox}\nThe notion of a Cox ring associated to a projective variety (satisfying some technical assumptions) was first introduced in \\cite[Definition 2.6]{HK00}.\nThis notion was generalized in \\cite{BH03} for any divisorial variety with only constant globally invertible functions, in particular, for any quasi-projective variety (over very general ground fields). 
However, in \\cite{BH03} the term \\emph{Cox ring} was not used.\nThe importance of considering universal torsors and Cox rings in the context of cluster varieties was pointed out in \\cite[\\S4]{GHK_birational} (see also \\cite{Man19}) and satisfactorily pursued in representation theoretic contexts where Cox rings arise naturally, see for example \\cite{Mag20}.\n\\end{remark}\n\n\\begin{remark}\nFor simplicity we are assuming that $\\text{Pic}(Y)$ is free. In case it has torsion we can still construct a universal torsor which might not be unique as it depends on the choice of a \\emph{shifting family} as in \\cite[\\S3]{BH03} (see \\cite[\\S3]{Man19} for a related discussion). Generalizations of the results of this section to the torsion case shall be treated elsewhere. \n\\end{remark}\n\n\n\\begin{remark}\nIf $Y$ is smooth we can construct the Cox ring of $Y$ and the universal torsor (still assuming that $\\text{Pic}(Y)$ is torsion free) in an equivalent way. The Cox ring can be defined as $\\text{Cox}(Y)=\\bigoplus_{v\\in \\Z^{\\rho}} H^0 (Y, \\lb^v)$. If $\\text{Cox}(Y)$ is finitely generated as an $\\mathcal{O}_Y$-algebra, then the universal torsor $\\UT_Y $ is obtained from $\\text{Spec}(\\text{Cox}(Y))$ by removing the unstable locus of the natural $T_{\\text{Pic}(Y)^*}$-action on $\\text{Spec}(\\text{Cox}(Y))$.\n\\end{remark}\n\nFrom now on we assume $ V\\subset \\UT_Y$ is a partial minimal model where $(V,\\Phi)$ is a scheme with a cluster structure of type $\\cA$. 
\nIn most of the results of this section we assume that $ V\\subset \\UT_Y$ has enough theta functions.\nUnder certain conditions that we discuss next, it is possible to show that $Y$ is a minimal model for a scheme with a cluster structure given by a quotient of $\\cA$ and construct Newton--Okounkov bodies for elements of $\\text{Pic}(Y)$.\nThe key point is to relate the action of $T_{\\text{Pic}(Y)^*}$ on $\\UT_Y$ with the torus actions on $\\cA$ arising from cluster ensemble maps.\n\n\\begin{lemma}\n\\label{lem:positivity_of_q_slice}\nLet $p:\\cA \\to \\cX$ be a cluster ensemble map and $H\\subset K^{\\circ}$ be a saturated sublattice. Consider the quotient $\\cA/T_H$ and the fibration $w_H:\\cA^\\vee \\to T_{H^*}$ (see \\S\\ref{sec:FG_dual}).\nThen the set\n\\[\n\\lrc{ \\tf^{\\cA}_{\\bf m} \\in \\cmid(\\cA)  \\mid {\\bf m} \\in \\lrp{\\Trop_{\\Z}(w_H)}^{-1}(q) \\cap \\Theta(\\cA)}\n\\]\nconsists precisely of the polynomial theta functions on $\\cA$ whose $T_H$-weight is $q$. Moreover, \nfor every $q \\in H^*$ the set $\\lrp{\\Trop_{\\R}(w_H)}^{-1}(q)\\subset \\Trop_{\\R}(\\cA^{\\vee})$ is positive.\n\\end{lemma}\n\\begin{proof}\nThe first claim follows from \\thref{prop:dual_fibration}. \nSo we only need to show that $\\lrp{\\Trop_{\\R}(w_H)}^{-1}(q)$ is positive. \nIn order to show this it is convenient to work with a condition equivalent to positivity called broken line convexity, see \\S\\ref{sec:intrinsic_NOB}.\nWe work in the lattice identification $ \\Trop_{\\R}(\\cA^{\\vee}_{\\seed^\\vee})$ of $\\Trop_{\\R}(\\cA^{\\vee})$.\nWe first argue that the set $ \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ is positive.\nFirst notice that any linear segment $L$ of a broken line segment contained in $ \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ has itself tangent direction in $ \\lrp{\\Trop_{\\Z}(w_H)}^{-1}(0)_{\\seed^\\vee}$. \nLet $m\\in \\lrp{\\Trop_{\\Z}(w_H)}^{-1}(0)_{\\seed^\\vee}$ be the tangent direction of $L$. 
The tangent direction of the following linear segment is of the form $m+cp^*(n)$ for some $n\\in N^+_{\\seed}$ and $c\\in  \\Z_{\\geq 0}$.\nFor any $h\\in H^\\circ$ we have\n\\[\n\\langle m+cp^*(n),h\\rangle =\\langle m,h\\rangle + c\\{n,h\\}=0,\n\\]\nas $H^\\circ\\subset K^\\circ$. So the next tangent direction also belongs to $ \\lrp{\\Trop_{\\Z}(w_H)}^{-1}(0)_{\\seed^\\vee}$.\nWe conclude that the set  $\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ is broken line convex and by the main result of \\cite{CMNcpt} (see Theorem \\ref{thm:mainCMN} below) the set $\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ is positive.\nThis already implies that for any $ x\\in \\lrp{\\Trop_{\\R}(w_H)}^{-1}(q)_{\\seed^\\vee}$ the set $x+ \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ remains positive. \nIndeed, let $y, z \\in x+ \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$. Then $y- z \\in \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee} $. \nIn other words, any line segment within the set $ x+\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ has tangent direction in $\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$. \nTherefore, after bending it will remain in the set $x+\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$. \nFinally, observe that $x+\\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}=\\lrp{\\Trop_{\\R}(w_H)}^{-1}(q)_{\\seed^\\vee}$. 
\n\\end{proof}\n\nHaving in mind Proposition~\\ref{prop:dual_fibration} and the action of $T_{\\text{Pic}(Y)^*}$ on $\\UT_Y$ we introduce the following notion.\n\n\\begin{definition}\n\\thlabel{k_and_pic}\nThe pair $(p,H) $ has the {\\bf Picard property} with respect to $V\\subset \\UT_Y$ if\n\\begin{itemize}\n   \\item $H$ and $\\text{Pic}(Y)^*$ have the same rank, and\n    \\item the action of $T_{H}$ on $\\cA$ coincides with the action of $T_{\\text{Pic}(Y)^*}$ on $\\UT_Y$ restricted to the image of $\\Phi:\\cA \\dashrightarrow V $.\n\\end{itemize}\n\\end{definition}\n\nRecall the definitions of the superpotential and its associated cone of tropical points from \\eqref{eq:def superpotential} and \\eqref{eq:def Xi} in \\S\\ref{sec:minimal_models}.\nThe following result adapts the content of Proposition \\ref{prop:dual_fibration} to this framework. \n\n\n\\begin{lemma}\n\\label{lem:basis_of_tf}\nSuppose that $ V \\subset \\UT_Y$ is a partial minimal model with enough theta functions and that  $(p,H)$ has the Picard property with respect to this model.\nThen for every class $[\\lb]\\in \\text{Pic}(Y)\\cong H^*$ the theta functions parametrized by the integral points of the set $ \\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}\\cap \\Xi_{\\UT_Y}$ form a basis for $H^0(Y, \\lb)$.\nIn particular, $\\Cox(Y)$ has a basis of theta functions which are $T_{\\text{Pic}(Y)^*}$-eigenfunctions.  \n\\end{lemma}\n\nWe consider the section ring $R(\\lb)=\\bigoplus_{k\\geq 0} R_k(\\lb)  $. The  $k^{\\mathrm{th}}$ homogeneous component is defined as $R_k(\\lb )=H^0(Y, \\lb^{\\otimes k})$. The product of $R(\\lb)$ is given by the tensor product of sections. \nFix a seed $\\seed\\in \\orT$, a linear dominance order $<_{\\seed}$ on $ \\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee})$ and consider the valuation\n$\n\\gv^{\\Phi}_{\\seed}:\\Bbbk(V)\\setminus \\{ 0 \\} \\to (\\Trop_{\\Z}(\\cA^{\\vee}_{\\seed^\\vee}), <_{\\seed}). 
\n$\nObserve that $R_k(\\lb)\\subset \\Cox(Y)$ for all $k$. Hence we can define the Newton--Okounkov body\n\\eqn{\n\\Delta_{\\gv^{\\Phi}_{\\seed}}(\\lb) := \\overline{\\conv\\Bigg( \\bigcup_{k\\geq 1}\\lrc{ \\frac{1}{k}\\gv^{\\Phi}_{\\seed} (f) \\mid f\\in R_k(\\mathcal L)\\setminus \\{0\\} } \\Bigg) }\\subseteq \\Trop_{\\R}(\\cA^\\vee_{\\seed^\\vee})=M^{\\circ}_{\\R}.\n}\n\n\\begin{theorem}\\thlabel{thm:k_and_pic}\nSuppose that $ V \\subset \\UT_Y$ is a partial minimal model with enough theta functions and that  $(p,H)$ has the Picard property with respect to this model.\nThen for any line bundle $\\lb $ on $Y$\n\\[\n\\Delta_{{\\bf g}^{\\Phi}_{\\seed}}(\\lb)=\\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee}\\cap \\Xi_{\\UT_Y, \\seed^\\vee}.\n\\]\nIn particular, $\\Delta_{{\\bf g}^{\\Phi}_{\\seed}}(\\lb)$ is a positive subset of $\\Trop_{\\R}(\\cA^{\\vee}_{\\seed^\\vee})$.\n\\end{theorem}\n\\begin{proof}\nTo make notation lighter, throughout this proof we let $S=\\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee}\\cap \\Xi_{\\UT_Y,\\seed^\\vee}$ and denote $\\gv^{\\Phi}_{\\seed}$ simply by $\\gv_{\\seed}$. Observe that $[\\lb^{\\otimes k}]=k[\\lb]$ in $\\text{Pic}(Y)$. \nTherefore, by Lemma \\ref{lem:basis_of_tf} we have that ${\\bf g}_{\\seed}(R_k(\\lb))\\subseteq \\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{k[\\lb]}_{\\seed^\\vee}$ for all $k\\geq 1$. \nIn particular, $\\dfrac{1}{k}{\\bf g}_{\\seed}(R_k(\\lb))\\subseteq \\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee}$ for all $k \\geq 1$.\nSince $\\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee} $ is closed in $\\Trop_{\\R}(\\cA^{\\vee}_{\\seed^\\vee})$ and convex we have that $\\Delta_{{\\bf g}_{\\seed}}(\\lb)\\subseteq \\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee}$.\nLet $ \\mathbb B_k$ be the theta basis of $R_k(\\lb)$, see \\thref{g_is_val}. Since the theta basis is adapted for ${\\bf g}_{\\seed}$ we have that ${\\bf g}_{\\seed}(R_k(\\lb))={\\bf g}_{\\seed}(\\mathbb B_k)$. 
\nSince $\\cA \\subseteq \\UT_Y$ has enough theta functions, every theta function $\\tf \\in \\mathbb B_k$ is a global function on $\\UT_Y$; therefore, we have that ${\\bf g}_{\\seed}(\\tf) \\in \\Xi_{\\UT_Y}$.\nSince $\\Xi_{\\UT_Y}$ is closed in $\\Trop_{\\R}(\\cA^{\\vee}_{\\seed^\\vee})$, convex and closed under positive scaling, we have that $\\Delta_{{\\bf g}_{\\seed}}(\\lb)\\subseteq \\Xi_{\\UT_Y,\\seed^\\vee}$.\nHence, $\\Delta_{{\\bf g}_{\\seed}}(\\lb)\\subseteq S$. \nTo see the reverse inclusion we notice that the set of rational points of $S$ coincides with the set $ \\bigcup_{k\\geq 1} \\frac{1}{k}  \\gv_{\\seed }(\\mathbb B_k)=  \\bigcup_{k\\geq 1} \\frac{1}{k}  \\gv_{\\seed }\\lrp{R_k(\\lb)}$.\nSince $S$ can be expressed as the closure of its set of rational points we have that $S\\subseteq \\Delta_{\\gv_{\\seed}}(\\lb)$.\nFinally, since $\\lrp{\\Trop_{\\R}(w_H)}^{-1}\\lrp{[\\lb]}_{\\seed^\\vee}$ and $\\Xi_{\\UT_Y,\\seed^\\vee}$ are positive sets, $S=\\Delta_{{\\bf g}_{\\seed}}(\\lb)$ is an intersection of positive sets. Hence, it is positive.\n\\end{proof}\n\n\\begin{remark}\n\\label{rem:comparing_NO_bodies}\nUnder the assumptions of \\thref{thm:k_and_pic} we have that $Y$ is a minimal model with enough theta functions for an open subscheme $V'\\subset Y$ with a cluster structure given by a birational map $ \\Phi':\\cA/T_{H}\\dashrightarrow V'$ induced by $\\Phi$.\nTo relate the Newton--Okounkov bodies constructed in this section with those constructed in the previous subsection we let $\\lb $ be isomorphic to $\\mathcal{O}(D')$ for some Weil divisor $D'$ on $Y$ satisfying the framework of \\S\\ref{sec:NO_bodies}.\nUnder the identification  $ \\Trop_{\\R}(\\cA^{\\vee}_{\\seed^\\vee}) = M^\\circ_\\R$ we realize $ \\Trop_{\\R}((\\cA/T_H)^{\\vee}_{\\seed^\\vee})$ as the subset of $M^\\circ_\\R$ orthogonal to $H$ (see \\S\\ref{tf_quotient}). 
\nFor any $\\tau \\in R_1(D')$ we have $\\Delta_{\\gv^{\\Phi'}_\\seed}(D',\\tau)\\subset M_{\\R}^\\circ\\cap \\lrp{\\Trop_{\\R}(w_H)}^{-1}(0)_{\\seed^\\vee}$ and by construction\n\\[\n\\Delta_{\\gv^{\\Phi'}_\\seed}(D',\\tau) =\\Delta_{\\gv^{\\Phi}_\\seed}(\\lb)- \\gv^{\\Phi}_{\\seed}(\\tau).\n\\]\n\\end{remark}\n\n\\begin{example}\\thlabel{exp:full_flag}\nAn important class of examples is provided by the base affine spaces. \nConsider $G=SL_{n+1}(\\Bbbk)$ and $B\\subset G$ a Borel subgroup with unipotent subgroup $U\\subset B$.\nThen $G/U$ is a universal torsor for $G/B$. \nMoreover, $G/U$ carries a cluster structure induced by the double Bruhat cell $G^{e,w_0}:=B_-\\cap Bw_0B$, where $B_-\\subset G$ is the Borel subgroup opposite to $B$ (i.e. $B\\cap B_-=:T$ is a maximal torus) and $w_0$, the longest element of the Weyl group $S_{n+1}$, is identified with a matrix representative in $N_G(T)/C_G(T)$ (the normalizer of $T$ modulo the centralizer of $T$).\nThe cluster structure on $G^{e,w_0}$ was introduced by Berenstein--Fomin--Zelevinsky in \\cite{BFZ05} and it follows that (up to co-dimension 2) $G^{e,w_0}$ agrees with the corresponding $\\mathcal A$-cluster variety. \nBy \\cite[Proposition 23]{Mag15} there is an embedding $G^{e,w_0}\\hookrightarrow G/U$ compatible with the cluster structure. \nIn particular, $G/U$ is a partial compactification of the $\\mathcal A$-cluster variety $G^{e,w_0}$ obtained by adding the locus where frozen variables are allowed to vanish.\nMagee further proved in \\cite{Mag20} that the full Fock--Goncharov conjecture holds and a cluster ensemble map satisfying \\thref{k_and_pic} is provided in \\cite{Mag20}.\nHence, we obtain a ${\\bf g}$-vector valuation ${\\bf g}_{\\seed}$ on $H^0(G/U,\\mathcal O_{G/U})$ for every choice of seed $\\seed$.\n\nIn particular, \\thref{thm:k_and_pic} applies: recall that the Picard group of $G/B$ is isomorphic to the lattice spanned by the fundamental weights $\\omega_1,\\dots,\\omega_{n}$. 
\nLet $\\Lambda$ denote the dominant weights, \\ie its elements are $\\lambda=a_1\\omega_1+\\dots+a_n\\omega_n$ with $a_i\\in \\mathbb Z_{\\ge 0}$ and let $\\mathcal L_\\lambda\\to G/B$ be the associated line bundle.\nThe ring of regular functions on the quasi-affine variety $G/U$ coincides with the Cox ring of the flag variety:\n\\[\nH^0(G/U,\\mathcal O_{G/U})\\cong \\bigoplus_{\\lambda \\in \\Lambda} H^0 (G/B,\\mathcal L_\\lambda).\n\\]\nHence, we may restrict the ${\\bf g}$-vector valuations ${\\bf g}_{\\seed}$ for all seeds $\\seed$ to the section ring of any line bundle on $G/B$.\nThe resulting Newton--Okounkov polytopes coincide with slices of the tropicalization of the superpotential corresponding to the compactification. It has been shown in \\cite{BF,GKS_typeA} that for certain choices of seeds these polytopes are unimodularly equivalent to Littelmann's string polytopes (see \\cite{Lit98,BZ01}).\n\\end{example}\n\n\\begin{example}\nGrassmannians also form a distinguished class of examples fitting this framework. 
We treat this class separately in \\S\\ref{sec:NO_Grass}.\n\\end{example}\n\n\n\n\n\\subsection{The intrinsic Newton--Okounkov body}\\label{sec:intrinsic_NOB} \nIn the situation of \\S\\ref{sec:NO_bodies} or \\S\\ref{sec:universal_torsors}, we can choose two seeds $\\seed, \\seed'\\in \\orT$ to obtain two Newton--Okounkov bodies, say $\\Delta_{\\nu_\\seed}$ and $\\Delta_{\\nu_{\\seed'}}$ (these are associated to a line bundle $\\lb$ in case we are in a framework as in  \\S\\ref{sec:universal_torsors} or to a divisor $D'$ and a section $\\tau$ in case our framework is as in \\S\\ref{sec:NO_bodies}).\nIn the same spirit as in \\cite{EH20,FH21} (see also \\cite[\\S4]{BMNC} and \\cite{HN23,CHM22}), in this section we show that if one of $\\Delta_{\\nu_{\\seed}}$ or $\\Delta_{\\nu_{\\seed'}}$ (equivalently both) is a positive set then these Newton--Okounkov bodies are related to each other by a distinguished piecewise linear transformation and, moreover, any such Newton--Okounkov body can be intrinsically described as a \\emph{broken line convex hull} (see Theorems \\ref{thm:intrinsic} and \\ref{thm:intrinsic_lb} below).\nIn order to obtain the last assertion we rely on \\cite{CMNcpt}. Along the way we introduce a theta function analog of the Newton polytope associated to a regular function on a torus.\n\nWe start by considering Newton--Okounkov bodies associated to Weil divisors as in \\S\\ref{sec:NO_bodies}. 
Let $\\cV$ be a scheme of the form $\\cA$, $\\cX$, $\\cA/T_{H}$ or $\\cX_{\\bf 1}$ and $(V, \\Phi)$ a scheme with a cluster structure of type $\\cV$.\nDenote by $\\mathbb{B}_{\\tf}(\\cV)=\\{\\tf^{\\cV}_{\\bf v}\\mid {\\bf v}\\in \\Theta(\\cV)\\}$ the theta basis of $\\cmid(\\cV)$.\nWe begin by observing that a cluster valuation $\\nu_{\\seed}$ on $\\cmid(\\cV)$ can be thought of as an extension of the composition of the seed-independent map \n\\begin{eqnarray}\n    \\label{eq:nu_seed_free}\n    \\nu: \\mathbb{B}_{\\tf}(\\cV)  &\\to&   \\Trop_{\\Z}(\\cV^\\vee)\\\\\n    \\nonumber\n\\tf^{\\cV}_{\\bf v} &\\mapsto &{\\bf v},\n\\end{eqnarray}\nwith the identification $\\mathfrak{r}_{\\seed^\\vee}:\\Trop_{\\Z}(\\cV^\\vee) \\to  \\Trop_{\\Z}(\\cV^\\vee_{\\seed^\\vee})$. \nIf $ \\mathbb B_{\\tf}(V)$ denotes the set of polynomial theta functions on $V$ then we can define $\\nu^{\\Phi} : \\mathbb B_{\\tf}(V) \\to \\Trop_{\\Z}(\\cV^\\vee)$ analogously.\nMoreover, even though $\\Trop_{\\Z}(\\cV^{\\vee})$ may not have a linear structure, if $\\Theta (\\cV)= \\Trop_{\\Z}(\\cV^{\\vee})$ and $L\\subseteq \\Trop_{\\Z}(\\cV^{\\vee})$ is a linear subset acting linearly on $\\Trop_{\\Z}(\\cV^{\\vee})$ (see Definition \\ref{def:linear_action}) then for every $y\\in L$ we have a well defined ``subtraction\" function\n\\eqn{ (\\ \\cdot \\ )-y: \\Trop_{\\Z}(\\cV^{\\vee})  &\\to   \\Trop_{\\Z}(\\cV^{\\vee})\\\\\nx &\\mapsto x-y, }\nwhere $-y $ is the unique point of $ \\Trop_{\\Z}(\\cV^{\\vee})$ such that $\\tf_y\\tf_{-y}=1$ and $x-y$ is the unique point of $\\Trop_{\\Z}(\\cV^{\\vee})$ such that $\\tf_{x}\\tf_{-y}=\\tf_{x-y}$. \n\nWe now define our notion of convexity. Recall from  \\S\\ref{sec:tf_A} that we might think of supports of broken lines as seed independent objects. 
In light of this we consider the following.\n\n\\begin{definition}\\label{def:blc_intro} \\cite{CMNcpt}\nA closed subset $S$ of $\\Trop_{\\R}(\\cV)$ is {\\bf{broken line convex}} \nif for every pair of rational points $s_1, s_2$ in $S(\\Q)$,\nevery segment of a broken line with endpoints $s_1$ and $s_2$ is entirely contained in $S$.\n\\end{definition}\n\n\\begin{remark}\n\\label{rem:non-generic_bl}\nThe broken lines considered in  Definition \\ref{def:blc_intro} include those that are \\emph{non-generic}, namely, broken lines that are obtained as limits of the generic broken lines introduced in \\thref{def:genbroken}. See \\cite[Definition~3.3]{CMNcpt} for details. \n\\end{remark}\n\nThe main result of \\cite{CMNcpt} asserts that positivity of a set is equivalent to its broken line convexity:\n\n\\begin{theorem}\n\\thlabel{thm:mainCMN}\n\\cite[Theorem 6.1]{CMNcpt}\nLet $\\cV$ be a variety of the form $\\cA $, $\\cX$, $\\cA/T_{H}$ or $\\cX_{\\bf 1}$. Then a closed subset $S$ of $\\Trop_{\\R}(\\cV)$ is broken line convex if and only if it is positive.\n\\end{theorem}\n\nMorally, this means that broken line convexity in $\\Trop_{\\R}(\\cV^\\vee)$ plays the same role in describing partial minimal models of $\\cV$ that usual convexity in $M_\\R$ plays in describing normal toric varieties $T_N \\subset X$.\nOne appealing feature of the broken line convexity notion is that it makes no reference to any auxiliary data: given $\\cV$, we can talk about broken line convexity in $\\Trop_{\\R}(\\cV^{\\vee})$.\nIn contrast, the Newton--Okounkov bodies we discussed in \\S\\ref{sec:NO_bodies}\nand \\S\\ref{sec:universal_torsors}\n are convex bodies whose construction depends upon a choice of seed $\\seed$.\nMore generally, a usual Newton--Okounkov body depends not only on the geometric data of a projective variety together with a divisor but also on the auxiliary data of a choice of valuation.\nBroken line convexity makes no reference to any such auxiliary data and will lead us 
to an intrinsic version of a Newton--Okounkov body. \n\n\\begin{definition}\n\\label{def:bl_convex_hull}\nLet $S \\subset\\Trop_{\\R}(\\cV^{\\vee})$ be a set. The {\\bf{broken line convex hull of $S$}}, denoted by $\\bconv(S)$, is the intersection of all broken line convex sets containing $S$. \n\\end{definition}\n\n\\begin{remark}\n  We can also define broken line convexity \n and broken line convex hulls inside $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ in exactly the same way they are defined in Definitions \\ref{def:blc_intro}  and \\ref{def:bl_convex_hull}. \n  In particular, we have that  $S\\subset \\Trop_{\\R}(\\cV^{\\vee})$ is broken line convex if and only if $\\mathfrak{r}_{\\seed^\\vee}(S)\\subset \\Trop_{\\R}(\\cV^\\vee_{\\seed})$ is broken line convex.\n\\end{remark}\n\nUsing this convexity notion, we describe a set analogous to the Newton polytope of a function on a torus.\n\\begin{definition}\n\tGiven a regular function ${f= \\sum_{{\\bf v} \\in \\Trop_{\\Z}(\\cV^{\\vee})}} a_{\\bf v} \\tf^{V}_{\\bf v}$ on $V$, we define the {\\bf{$\\tf$-function analogue of the Newton polytope of $f$}} to be \n\t\\eqn{ \\NewtT(f) := \\bconv\\lrc{ {\\bf v} \\in \\Trop_{\\Z}(\\cV^{\\vee}) \\mid a_{\\bf v} \\neq 0 }. }\n\\end{definition}\n\nThis leads to an intrinsic version of the Newton--Okounkov bodies we have constructed. So consider a partial minimal model $ V \\subset Y$ and let $D'$ be a  divisor on $Y$ supported on the boundary of $V\\subset Y$.  \n\n\\begin{definition}\nAssume that $R(D')$ has a graded theta basis (see Definition \\ref{def:graded_theta_basis}). 
Then the associated {\\bf{intrinsic Newton--Okounkov body}} is\n\\eqn{\n\\Delta_{\\mathrm{BL}}(D'):= \\bconv\\Bigg( \\bigcup_{k\\geq 1} \\Bigg(\\bigcup_{f \\in R_k(D')} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg)\\subseteq \\Trop_{\\R}(\\cV^\\vee).\n}\n\\end{definition}\n\nIn order to describe how the different realizations of intrinsic Newton--Okounkov bodies are related, we record the tropicalization of the gluing map  $\\mu^{\\cV^\\vee}_k:\\cV^{\\vee}_\\seed \\dashrightarrow \\cV^{\\vee}_{\\seed'}$ in terms of the fixed data $\\Gamma$ and the initial seed $\\seed_0=(e_i)_{i\\in I}$ defining $\\cV$. \n\\begin{equation*}\n\\Trop_{\\R}\\lrp{\\mu^{\\cA^\\vee}_{k}}(m)=\\begin{cases} m + \\langle d_ke_k, m \\rangle  v_k & \\text{if } \\langle e_k, m \\rangle \\geq 0,\\\\\nm & \\text{if } \\langle e_k, m \\rangle \\leq 0,\n\\end{cases}\n\\end{equation*}\nfor $m \\in M^{\\circ}$. \n\\[\n\\Trop_{\\R}\\lrp{\\mu^{\\cX^\\vee}_{k}}(n)=\\begin{cases} n + \\{n,d_ke_k \\} e_k & \\text{if } \\{  n,e_k \\}\\geq 0,\\\\\nn & \\text{if } \\{ n,e_k\\} \\leq 0,\n\\end{cases}\n\\]\nfor $n \\in N$. \n\\[\n\\Trop_{\\R}\\lrp{\\mu^{(\\cXe)^\\vee}_{k}}(n+H)=\\begin{cases} n + \\{n,d_ke_k \\}e_k + H & \\text{if } \\{ n, e_k \\}\\geq 0,\\\\\nn + H& \\text{if } \\{ n, e_k \\} \\leq 0,\n\\end{cases}\n\\]\nfor $n + H \\in N/H$. \n\\[\n\\Trop_{\\R}\\lrp{\\mu^{(\\cA/T_H)^\\vee}_{k}} =  \\Trop_{\\R}\\lrp{\\mu^{\\cA^\\vee}_{k}} \\mid_{H^\\perp}.\n\\]\n\n\n\\begin{theorem}\n\\thlabel{thm:intrinsic}\nLet $(V,\\Phi)$ be a scheme with a cluster structure of type $\\cV$ and let $V \\subset Y$ be a partial minimal model. Assume that the full Fock--Goncharov conjecture holds for $\\cV$ and that there exists a theta function $\\tau \\in R_1(D')$ such that $\\nu^{\\Phi}_{\\seed}(\\tau)$ lies in a linear subset of $\\Trop_{\\Z}(\\cV^\\vee)$. 
If $\\Delta_{\\nu^{\\Phi}_{\\seed}}(D',\\tau)$ is positive then for every seed $\\seed \\in \\orT$ we have that $\\mathfrak{r}_{\\seed^\\vee}(\\Delta_{\\mathrm{BL}}(D')-\\nu^{\\Phi}(\\tau))= \\Delta_{\\nu^{\\Phi}_{\\seed}}(D',\\tau) $.\nIn particular, for any other seed $\\seed'\\in \\orT $ we have that\n\\[\n\\Delta_{\\nu_{\\seed'}}(D', \\tau )= \\Trop_{\\R}\\lrp{\\mu^{\\cV^\\vee}_{\\seed,\\seed'}}\\lrp{\\Delta_{\\nu_{\\seed}}(D', \\tau)}.\n\\] \n\\end{theorem}\n\\begin{proof}\nIt is enough to treat the case $V= \\cV$. We consider the broken line convex hull of \n\\[\nS=\\bigcup_{k\\geq 1}\\lrc{\\dfrac{\\nu_{\\seed}(f)}{k}-\\nu_{\\seed}(\\tau)\\mid f\\in R_k(D') \\setminus\\{0\\}} \n\\]\nin $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$. \nSince all line segments of $\\Trop_{\\R}(\\cV^{\\vee}_{\\seed^\\vee})$ can be thought of as a segment of a broken line and $\\Delta_{\\nu_{\\seed}}(D', \\tau)$ is closed we have that $ \\Delta_{\\nu_{\\seed}}(D', \\tau)\\subseteq \\bconv(S)$. \nBy \\thref{thm:mainCMN} $\\Delta_{\\nu_{\\seed}}(D', \\tau)$ is broken line convex. Since $S\\subset \\Delta_{\\nu_{\\seed}}(D', \\tau)$ we have the reverse inclusion. \nThe last statement follows from the fact that broken line convex sets are preserved by $\\Trop_{\\R}(\\mu^{\\cV^{\\vee}}_k)$. \n\\end{proof}\n\nThere is an analogous result for line bundles fitting the framework of \\S\\ref{sec:universal_torsors}.\n\n\\begin{definition}\n\\label{def:intrinsic_lb}\nLet $Y$ be a projective variety such that $\\text{Pic}(Y)$ is free of finite rank. Assume $(V, \\Phi)$ is a scheme with a cluster structure of type $\\cA$ and that $V \\subset \\UT_Y$ is a partial minimal model with enough theta functions. Let $(p, H)$ have the Picard property (see \\thref{k_and_pic}). 
The {\\bf{intrinsic Newton--Okounkov body}} associated to a class $[ \\lb ]\\in \\text{Pic}(Y)\\cong H^*$ is\n\\eqn{\n\\Delta_{\\mathrm{BL}}(\\lb):= \\bconv\\Bigg( \\bigcup_{k\\geq 1} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg)\\subseteq \\Trop_{\\R}(\\cA^{\\vee}).\n}\n\\end{definition}\n\nIn this case we have the following theorem whose proof is completely analogous to the proof of \\thref{thm:intrinsic}. Moreover, it uses the fact that $\\nu_{\\seed}(\\lb)$ is a positive set, as shown in \\thref{thm:k_and_pic}.\n\n\\begin{theorem}\n\\thlabel{thm:intrinsic_lb}\nKeep the assumptions of Definition \\ref{def:intrinsic_lb}.\nFor every seed $\\seed\\in \\orT$ we have that $\\Delta_{\\nu^{\\Phi}_{\\seed}}(\\lb)=\\mathfrak{r}_{\\seed^\\vee}(\\Delta_{\\mathrm{BL}}(\\lb))$. In particular, for every $\\seed' \\in \\orT $ we have that\n\\[\n\\Delta_{\\nu^\\Phi_{\\seed'}}(\\lb )= \\lrp{\\mu^{\\cV^\\vee}_{\\seed^\\vee, \\seed'^\\vee}}^T(\\Delta_{\\nu^\\Phi_{\\seed}}(\\lb)).\\] \n\\end{theorem}\n\\begin{proof}\nWe showed in \\thref{thm:k_and_pic} that $\\Delta_{\\nu_{\\seed}}(\\lb)$ is a positive set. 
The proof of this result is completely analogous to the proof of \\thref{thm:intrinsic}.\n\\end{proof}\n\nIn either situation (divisors or line bundles) we are of course free to compute the intrinsic Newton--Okounkov body as a usual Newton--Okounkov body in any vector space realization of $\\Trop_{\\R}(\\cV^{\\vee})$.\nHowever, the intrinsic definition has certain advantages as we now explain.\nFor simplicity, from now on we concentrate on line bundles as in \\thref{thm:intrinsic_lb}; the reader can make the appropriate changes for the case of divisors as in \\thref{thm:intrinsic}.\nIt is often the case that $\\Delta_{\\mathrm{BL}}(\\lb) = \\bconv \\Big( \\bigcup_{k=1}^\\ell \\frac{1}{k} \\nu^{\\Phi}\\lrp{R_k(\\lb)} \\Big)$ for some finite $\\ell$, meaning that in these cases the infinite union reduces to a finite union.\nConsider such an instance and let $\\ell_{\\seed}$ be the smallest integer such that $\\Delta_{\\nu^{\\Phi}_{\\seed}}(\\lb)=\\conv \\Big( \\bigcup_{k=1}^{\\ell_{\\seed}} \\frac{1}{k} \\nu^{\\Phi}_{\\seed}\\lrp{R_k(\\lb)} \\Big)$.  \nThen the corresponding $\\ell$ for the intrinsic Newton--Okounkov body is at most $\\min_{\\seed}\\lrc{\\ell_{\\seed}}$.\nMoreover, we can give conditions indicating when $\\ell$ has been attained. \nWe will start with a condition that, after adopting a slightly different perspective on theta functions, becomes tautological.\\footnote{This perspective is essentially the {\\it{jagged path}} description of theta functions rather than the broken line description.  See for example \\cite[Section~3]{GS12}.}  \nWe will then adapt this condition to give a sufficient criterion that is more likely to be known for a given minimal model (and a known line bundle or Weil divisor).\n\n\\begin{proposition}\\thlabel{taut}\nLet $\\lb$ be as in \\thref{thm:intrinsic_lb}. 
Suppose there exists a positive integer $\\ell$ such that for all $h>\\ell$, each theta function $\\tf^V_r$ in $R_h(\\lb)$ appears as a summand (with non-zero coefficient) of some product $\\tf^V_p \\tf^V_q$, where $\\tf^V_p \\in R_i(\\lb)$ and $\\tf^V_q \\in R_j(\\lb)$  for some positive integers $i$ and $j$ with $i+j =h$. \nThen \\eqn{\n\\Delta_{\\mathrm{BL}}(\\lb) = \\bconv\\Bigg( \\bigcup_{k=1}^{\\ell} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg) .\n}\n\\end{proposition}\n\n\\begin{proof}\nThis is an immediate consequence of results in \\cite{CMNcpt}.  We adopt the terminology and conventions of {\\it loc. cit.} for this proof.  In particular, we allow non-generic broken lines (see Remark \\ref{rem:non-generic_bl}).\n\nSince the structure constant $\\alpha(p,q,r)$ is non-zero, there exists a pair of broken lines $\\lrp{\\gamma_1,\\gamma_2}$ with $I(\\gamma_1) = p $, $I(\\gamma_2) = q $, $\\gamma_1(0)=\\gamma_2(0) = r$, and $F(\\gamma_1)+ F(\\gamma_2) = r$.\nThen the construction of \\cite[\\S4]{CMNcpt} yields a broken line segment from $\\frac{p}{i}$ to $\\frac{q}{j}$ passing through $\\frac{r}{h}$. \nAs a consequence, we have \n\\eqn{\\frac{r}{h} \\in \\bconv\\Bigg( \\bigcup_{k=1}^{\\max(i,j)} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg) . 
}\nBy hypothesis, $R_k(\\lb)$ has a basis of theta functions for all $k$, so \n\\eqn{\n\\bconv \\Bigg(\\bigcup_{f \\in R_h(\\lb)} \\frac{1}{h} \\NewtT(f) \\Bigg) =  \\bconv \\lrp{ \\frac{r}{h} \\mid \\tf^V_r \\in R_h(\\lb)} .\n}\nWe have just seen that each such $\\frac{r}{h}$ is contained in \n\\eqn{\\bconv\\Bigg( \\bigcup_{k=1}^{h-1} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg), }\nso \n\\eqn{\n\\bconv \\Bigg(\\bigcup_{f \\in R_h(\\lb)} \\frac{1}{h} \\NewtT(f) \\Bigg) \\subset  \\bconv\\Bigg( \\bigcup_{k=1}^{h-1} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg).\n}\nAs this holds for all $h>\\ell$, we conclude that\n\\eqn{\n\\Delta_{\\mathrm{BL}}(\\lb) = \\bconv\\Bigg( \\bigcup_{k=1}^{\\ell} \\Bigg(\\bigcup_{f \\in R_k(\\lb)} \\frac{1}{k} \\NewtT(f) \\Bigg)  \\Bigg) .\n}\n\\end{proof}\n\n\\begin{remark}\nIn dimension 2, Mandel \\cite{Man16} showed that the assumption in \\thref{taut} implies that $r=p+q$ in some seed. It is a very interesting problem to determine whether this holds in higher dimensions. 
\n\\end{remark}\n\nNote that as we have (by assumption) a theta basis for $R(\\lb)$, the condition of \\thref{taut} is implied by the following condition:\n\n\\noindent\n\\begin{condition}\\thlabel{condition_section ring}\nThere exists a positive integer $\\ell$ such that for all $h>\\ell$, the natural map $R_i (\\lb) \\otimes R_j(\\lb) \\to R_h (\\lb)$ is surjective for some positive integers $i$ and $j$ with $i+j =h$.\n\\end{condition} \n\n\\begin{remark}\\label{rmk:borel weil bott}\nThe \\thref{condition_section ring} is satisfied in our main class of examples coming from representation theory: recall the setting of \\thref{exp:full_flag} where line bundles $\\mathcal L_\\lambda$ of the full flag variety $G/B$ are indexed by dominant weights $\\lambda$.\nBy the Borel--Weil--Bott Theorem the graded pieces $R_i(\\mathcal L_\\lambda)$ of the section rings of these line bundles satisfy\n\\[\nR_i(\\mathcal L_\\lambda)\\cong  V(i\\lambda)^*,\n\\]\nwhere $V(i\\lambda)$ is the irreducible $G$-representation of highest weight $i\\lambda$ and $i\\ge 0$. \nBy work of Baur \\cite{Baur_CartanComp} the tensor product $V(i\\lambda)\\otimes V(j\\lambda)$ contains among its irreducible components the unique component of maximal weight, called the Cartan component, which is $V((i+j)\\lambda)$. 
\nHence,\n\\[\nR_i(\\mathcal L_\\lambda)\\otimes R_j(\\mathcal{L}_\\lambda)\\cong V(i\\lambda)^*\\otimes V(j\\lambda)^*\\twoheadrightarrow V((i+j)\\lambda)^*\\cong R_{i+j}(\\mathcal L_\\lambda).\n\\]\nAlthough in \\thref{exp:full_flag} we only treat the case of $SL_{n+1}(\\Bbbk)$, it is worth noticing that the Borel--Weil(--Bott) Theorem holds for semisimple Lie groups and algebraic groups over $\\Bbbk$ and Baur's result holds for irreducible representations of connected, simply-connected complex reductive groups.\nNotice further that these observations also hold for partial flag varieties, \\ie quotients $G/P$ by parabolic subgroups $P\\subset G$, as the cohomology of an equivariant line bundle on $G/P$ is equal to the cohomology of its pullback along the natural projection $G/B\\twoheadrightarrow G/P$. So the cohomology of the line bundle on $G/P$ can be calculated using the usual Borel--Weil(--Bott) Theorem for $G/B$, by the Leray spectral sequence.\n\\end{remark}\n\n\n\n\\section{The case of the Grassmannian}\n\\label{sec:NO_Grass}\n\n\nWe now consider in detail the case of the Grassmannians. \nThroughout this section we work over the complex numbers, fix two positive integers $k<n$ and let\n\\[\nY= \\Grass_{n-k}(\\C^n)\n\\]\nbe the corresponding Grassmannian. \nLet $\\widetilde{Y} $ be the affine cone of $Y$ in its Pl\\\"ucker embedding $Y\\hookrightarrow \\mathbb{P}^{\\binom{n}{n-k}-1}$ and $\\lb_e$ be the bundle over $Y$ obtained by pullback of $\\mathcal{O}(1)$ under this embedding.\nBy definition, the Pl\\\"ucker coordinates are a basis for $H^0(Y,\\lb_e)$.\nIt is well known that $\\Pic({Y})$ is free of rank one and $[\\lb_e]$ is a generator.\nMoreover, the universal torsor of $Y$ is \n\\[\n\\UT_{Y}\\cong \\widetilde{Y}\\setminus\\{ 0\\}\n\\]\nand the action of $T_{\\Pic({Y})^*}$ on $\\UT_{Y}$ coincides with the diagonal action of $\\C^*$. 
\nPl\\\"ucker coordinates are denoted by $p_{J}$ where $J\\in \\binom{[n]}{n-k}$ is an $n-k$-element subset of $\\{1, \\dots , n\\}$.\nWorking with cyclic intervals, we let $ D_i=\\{ p_{[i+1, i+k]} =0 \\}$ and consider the divisor \n\\[\nD=\\bigcup_{i=1}^nD_{i} \\subset Y.\n\\]\nFor any $i$ the line bundle $\\mathcal{O}_Y(D_i)$ is isomorphic to $\\lb_e$ and the Weil divisor $\\sum_{i=1}^nD_i$ is anticanonical.\nWe let $\\widetilde{D}_i \\subset \\UT_Y$ be the preimage of $D_i$ under the quotient map $\\UT_Y\\twoheadrightarrow Y$ and set $\\widetilde{D}= \\bigcup_{i=1}^n \\widetilde{D}_i$.\nThe divisor $\\sum_{i=1}^n\\widetilde{D}_i$ is anticanonical and $(\\UT_Y, \\widetilde{D})$ is a log Calabi--Yau pair.\nIt follows from the work of Scott \\cite{Sco06} that the log Calabi--Yau variety $\\UT_Y \\setminus \\widetilde{D}$ has a cluster structure of type $\\cA$ which is skew-symmetric (that is, for its fixed data all $d_i=1$ and $N=N^\\circ$) and such that the frozen variables are precisely the Pl\\\"ucker variables $\\{ p_{[i+1,i+k]}\\}_{i=1}^{n}$.\nThis cluster structure is given by an inclusion \n\\[\n\\cA\\hookrightarrow \\UT_Y  \\setminus \\widetilde{D}.\n\\]\nSince $\\widetilde{D}$ is the locus in $\\UT_{Y}$ where the frozen variables vanish,\nwe have that $ \\cA \\subset \\UT_Y$ is a partial minimal model (see the example below Definition~\\ref{def:cv_minimal_model}).\nIn \\cite{MS16} (see also \\cite{SW18}), the authors show that $\\cA$ has a seed with a maximal green sequence so we can use \\cite[Proposition 0.4]{GHKK} to conclude that the full Fock--Goncharov conjecture holds for $\\cA$.\nProposition 9.4 in \\cite{GHKK} together with \\thref{lemm:enough_tf} imply that $\\cA \\subset \\UT_Y$ is a partial minimal model with enough theta functions in the sense of Definition \\ref{def:enough_tf}.\nIn the following subsection we exhibit a cluster ensemble lattice map $p^*$ for $\\cA$ such that for $K := \\ker(p^*)$, the pair $(p,K)$ has the Picard property in 
the sense of \\thref{k_and_pic} with respect to $\\cA \\subset \\UT_Y$. \nThese considerations allow us to apply all the results of \\S\\ref{sec:universal_torsors} and \\S\\ref{sec:NO_bodies}.\nIn particular, we can think of the Grassmannian as a minimal model for the quotient $\\cA/T_K$.\n\n\\begin{remark}\\label{rmk:open positroid}\nThe variety $Y \\setminus D$ is usually called the open positroid variety. This variety can be endowed with a cluster structure of any of the kinds we consider in this paper: $\\cA$, $\\cX$, a quotient of $\\cA$ or a fibre of $\\cX$.\n\\end{remark}\n\n\\subsection{The Picard property}\n\\label{sec:Pic_property}\nIn this section we verify that the Picard property (\\thref{k_and_pic}) holds for a certain choice of cluster ensemble map and sublattice. This condition is necessary in order to apply \\thref{thm:k_and_pic} to the Grassmannian.  \n\nWe rely on background from \\cite{RW} but recall important notions below.\nFor background on plabic graphs we refer the reader to \\emph{loc. cit.} \nRecall that plabic graphs\\footnote{To be precise, we are only interested in reduced plabic graphs with trip permutation $\\pi_{k,n}$.} are combinatorial objects encoding those seeds whose associated $\\cA$-cluster variables are Pl\\\"ucker coordinates.\nTo simplify the exposition we do not distinguish between a plabic graph and its associated seed.\nGiven an index set $J\\in\\binom{[n]}{n-k}$ we construct a Young diagram $\\mu_J$ inside a rectangular $(n-k)\\times k$ grid.\nLet $w_J$ be the path along edges of the grid from the north east to the south west corner whose south steps are in $J$. 
\nThen $\\mu_J$ is the Young diagram (inside the rectangle, attached to the north west corner) whose south east border is $w_J$.\nAmong all plabic graphs there is a particularly symmetric one known as the {\\bf rectangles plabic graph} $G_{\\Yng(1)}:=G^{\\rm rec}_{k,n}$.\nThe associated cluster variables are naturally indexed by {\\it rectangular} Young diagrams (together with the {\\it empty} rectangle, denoted by $\\varnothing$). \nIn what follows we focus on this plabic graph as the initial seed and denote by\n\\begin{equation}\n\\label{eq_seed}\n\\seed_{\\Yng(1)}=(e_\\varnothing)\\cup(e_{i\\times j}\\mid  1\\le i\\le n-k, 1\\le j\\le k),    \n\\end{equation}\nthe induced basis of $N=N^\\circ\\cong \\mathbb Z^{k(n-k)+1}$. \nLet $\\{f_\\varnothing\\}\\cup\\{f_{i\\times j}\\mid  1\\le i\\le n-k, 1\\le j\\le k\\}$ denote the corresponding basis of $M^\\circ=M$.\nWe write $N_{\\seed}$ respectively $M_\\seed$ whenever we think of the lattices together with a choice of basis induced by a seed $\\seed$. 
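
The passage from an index set $J$ to the Young diagram $\\mu_J$ described above can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function name `young_diagram` and the encoding of $\\mu_J$ as a list of row lengths (top row first) are assumptions made here and are not taken from \\cite{RW}. The idea is that the $r$-th south step of $w_J$ sits at position $j_r\\in J$, so the path has taken $j_r-r$ west steps before it, and row $r$ of $\\mu_J$ therefore has $k-(j_r-r)$ boxes.

```python
def young_diagram(J, n):
    """Row lengths (top row first) of the Young diagram mu_J.

    J is a subset of {1, ..., n}; its elements mark the south steps of the
    path w_J running from the north east to the south west corner of the
    (n-k) x k rectangle, where n-k = len(J).
    """
    south_steps = sorted(set(J))
    if any(j < 1 or j > n for j in south_steps):
        raise ValueError("J must be a subset of {1, ..., n}")
    k = n - len(south_steps)
    # Before its r-th south step (at position j_r) the path has taken
    # j_r - r west steps, so row r keeps k - (j_r - r) boxes.
    return [k - (j - r) for r, j in enumerate(south_steps, start=1)]


if __name__ == "__main__":
    print(young_diagram({3, 4, 7, 9, 11, 12}, 13))  # [5, 5, 3, 2, 1, 1]
    print(young_diagram({1, 2}, 4))                 # [2, 2]
```

For $J=[1,n-k]$ the sketch returns the full $(n-k)\\times k$ rectangle, and for $J=\\{3,4,7,9,11,12\\}\\subset[13]$ the row lengths $(5,5,3,2,1,1)$ trace out the path drawn in Figure~\\ref{fig:val grec}.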
\n\\begin{figure}\n\\centering\n\\begin{tikzpicture}[scale=.95]\n\\node[blue] at (-1.5,10) {$\\tiny{\\varnothing}$};\n\\draw[->] (-1.25,9.875) -- (-.25,9.125);\n\\draw[->,blue,dashed] (7,9.25) to [out=150,in=0] (-1.25,10.125);\n\\draw[->,blue,dashed] (-.25,4.75) to [out=155,in=-90] (-1.5,9.75);\n\\node at (0,9) {$\\tiny{\\yng(1)}$};\n\\node at (1.5,9) {$\\tiny{\\yng(2)}$};\n\\node at (3,9) {$\\tiny{\\yng(3)}$};\n\\node at (5,9) {$\\tiny{\\yng(4)}$};\n\\node[blue] at (7,9) {$\\tiny{\\yng(5)}$};\n%right arrow\n\\draw[->] (.25,9) -- (1.125,9);\n\\draw[->] (1.875,9) -- (2.5,9);\n\\draw[->] (3.5,9) -- (4.375,9);\n\\draw[->] (5.625,9) -- (6.25,9);\n%down arrows\n\\draw[->] (0,8.75) -- (0,8.25);\n\\draw[->] (1.5,8.75) -- (1.5,8.25);\n\\draw[->] (3,8.75) -- (3,8.25);\n\\draw[->] (5,8.75) -- (5,8.25);\n\\draw[->,blue,dashed] (7,8.75) -- (7,8.25);\n%diagonal arrows\n\\draw[<-] (.25,8.75) -- (1.25,8.25);\n\\draw[<-] (1.875,8.75) -- (2.75,8.25);\n\\draw[<-] (3.5,8.75) -- (4.375,8.25);\n\\draw[<-] (5.625,8.75) -- (6.5,8.25);\n\n\\node at (0,7.875) {$\\tiny{\\yng(1,1)}$};\n\\node at (1.5,7.875) {$\\tiny{\\yng(2,2)}$};\n\\node at (3,7.875) {$\\tiny{\\yng(3,3)}$};\n\\node at (5,7.875) {$\\tiny{\\yng(4,4)}$};\n\\node[blue] at (7,7.875) {$\\tiny{\\yng(5,5)}$};\n\\draw[->] (.25,7.875) -- (1.125,7.875);\n\\draw[->] (1.875,7.875) -- (2.5,7.875);\n\\draw[->] (3.5,7.875) -- (4.375,7.875);\n\\draw[->] (5.625,7.875) -- (6.25,7.875);\n\n\\draw[->] (0,7.5) -- (0,7);\n\\draw[->] (1.5,7.5) -- (1.5,7);\n\\draw[->] (3,7.5) -- (3,7);\n\\draw[->] (5,7.5) -- (5,7);\n\\draw[->,blue,dashed] (7,7.5) -- (7,7);\n%diagonal arrows\n\\draw[<-] (.25,7.5) -- (1.125,7);\n\\draw[<-] (1.875,7.5) -- (2.75,7);\n\\draw[<-] (3.5,7.5) -- (4.375,7);\n\\draw[<-] (5.625,7.5) -- (6.5,7);\n\n\\node at (0,6.5) {$\\tiny{\\yng(1,1,1)}$};\n\\node at (1.5,6.5) {$\\tiny{\\yng(2,2,2)}$};\n\\node at (3,6.5) {$\\tiny{\\yng(3,3,3)}$};\n\\node at (5,6.5) {$\\tiny{\\yng(4,4,4)}$};\n\\node[blue] at (7,6.5) 
{$\\tiny{\\yng(5,5,5)}$};\n%right arrows\n\\draw[->] (.25,6.5) -- (1.125,6.5);\n\\draw[->] (1.875,6.5) -- (2.5,6.5);\n\\draw[->] (3.5,6.5) -- (4.375,6.5);\n\\draw[->] (5.625,6.5) -- (6.25,6.5);\n%down arrows\n\\draw[->] (0,6) -- (0,5.375);\n\\draw[->] (1.5,6) -- (1.5,5.375);\n\\draw[->] (3,6) -- (3,5.375);\n\\draw[->] (5,6) -- (5,5.375);\n\\draw[->,blue,dashed] (7,6) -- (7,5.375);\n%diagonal arrows\n\\draw[<-] (.25,6) -- (1.125,5.375);\n\\draw[<-] (1.875,6) -- (2.75,5.375);\n\\draw[<-] (3.5,6) -- (4.375,5.375);\n\\draw[<-] (5.625,6) -- (6.5,5.375);\n\n\\node[blue] at (0,4.75) {$\\tiny{\\yng(1,1,1,1)}$};\n\\node[blue] at (1.5,4.75) {$\\tiny{\\yng(2,2,2,2)}$};\n\\node[blue] at (3,4.75) {$\\tiny{\\yng(3,3,3,3)}$};\n\\node[blue] at (5,4.75) {$\\tiny{\\yng(4,4,4,4)}$};\n\\node[blue] at (7,4.75) {$\\tiny{\\yng(5,5,5,5)}$};\n\\draw[->,blue,dashed] (.25,4.75) -- (1.125,4.75);\n\\draw[->,blue,dashed] (1.875,4.75) -- (2.5,4.75);\n\\draw[->,blue,dashed] (3.5,4.75) -- (4.375,4.75);\n\\draw[->,blue,dashed] (5.625,4.75) -- (6.25,4.75);\n\\end{tikzpicture}\n\\caption{The quiver of the plabic graph $G^{\\rm rec}_{5,9}$ forming the initial seed for $\\mathcal A\\subset \\widetilde{Y}=\\widetilde{\\text{Gr}}_{4}(\\mathbb C^9)$ with the frozen arrows determining the cluster ensemble map $p^*$.}\n\\label{fig:quivGrec59}\n\\end{figure}\n\nWe start by defining a lattice map \n\\[\n\\psi:N_{\\seed_{\\Yng(1)}} \\to  M_{\\seed_{\\Yng(1)}}\n\\]\nwhich is given with respect to the bases induced by $s_{\\Yng(1)}$ as follows:\nfor $i\\times j$ a mutable vertex and $a\\times b$ with either $a=n-k$ or $b=k$ a frozen vertex we define\n\\begin{eqnarray*}\n    e_{i\\times j} &\\mapsto& f_{(i-1)\\times(j-1)} - f_{(i-1)\\times j} + f_{i\\times(j+1)} - f_{(i+1)\\times (j+1)} + f_{(i+1)\\times j} - f_{i\\times (j-1)} \\\\\n   % e_{1\\times 1} &\\mapsto& f_{1\\times 2} - f_{2\\times 2} + f_{2\\times 1} - f_{\\varnothing}\\\\\n    e_{a\\times b} &\\mapsto& f_{a\\times b} - f_{(a-1)\\times b} + 
f_{(a-1)\\times(b-1)} - f_{a\\times (b-1)} \\\\\n    e_{\\varnothing} &\\mapsto& f_{\\varnothing} - f_{1\\times k} + f_{1\\times 1} - f_{(n-k)\\times 1}\n\\end{eqnarray*}\nwith the convention that $f_{0\\times j}=f_{i\\times 0}=0$ whenever $i,j\\not =0$ and $f_{0\\times 0}=f_{\\varnothing}$. \nWe may present the map pictorially by recording the coefficient of the basis element $e_{i\\times j}$ in the $i\\times j$'th position of the grid (with an extra position $0\\times 0$ representing  the vertex $\\varnothing$). \n\\begin{eqnarray}\\label{eq:pictorial p*}\n\\begin{tikzpicture}[scale=.4]\n\\node at (-3,0){$e_{i\\times j}$};\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5] at (-1,-1) {\\small $0$};\n\\node[opacity=.5] at (-1,0) {\\small$0$};\n\\node[opacity=.5] at (-1,1) {\\small$0$};\n\\node[opacity=.5] at (0,-1) {\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node[opacity=.5] at (0,1) {\\small$0$};\n\\node[opacity=.5] at (1,-1) {\\small$0$};\n\\node[opacity=.5] at (1,0) {\\small$0$};\n\\node[opacity=.5] at (1,1) {\\small$0$};\n\n\\node at (2.5,0) {$\\mapsto$};\n\n\\begin{scope}[xshift=5cm]\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5]  at (-1,-1) {\\small$0$};\n\\node at (-1,0) {\\small$-1$};\n\\node at (-1,1) {\\small$1$};\n\\node at (0,-1) {\\small$1$};\n\\node[opacity=.5]  at (0,0) {\\small$0$};\n\\node at (0,1) {\\small$-1$};\n\\node at (1,-1) {\\small$-1$};\n\\node at (1,0) {\\small$1$};\n\\node[opacity=.5]  at (1,1) {\\small$0$};\n\\end{scope}\n\n\n\\begin{scope}[xshift=15cm]\n\\node at (-3,0){$e_{a\\times b}$};\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) 
-- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5] at (-1,-1) {\\small $0$};\n\\node[opacity=.5] at (-1,0) {\\small$0$};\n\\node[opacity=.5] at (-1,1) {\\small$0$};\n\\node[opacity=.5] at (0,-1) {\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node[opacity=.5] at (0,1) {\\small$0$};\n\\node[opacity=.5] at (1,-1) {\\small$0$};\n\\node[opacity=.5] at (1,0) {\\small$0$};\n\\node[opacity=.5] at (1,1) {\\small$0$};\n\n\\node at (2.5,0) {$\\mapsto$};\n\n\\begin{scope}[xshift=5cm]\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5]  at (-1,-1) {\\small$0$};\n\\node at (-1,0) {\\small$-1$};\n\\node at (-1,1) {\\small$1$};\n\\node[opacity=.5]  at (0,-1) {\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node at (0,1) {\\small$-1$};\n\\node[opacity=.5]  at (1,-1) {\\small$0$};\n\\node[opacity=.5]  at (1,0) {\\small$0$};\n\\node[opacity=.5]  at (1,1) {\\small$0$};\n\\end{scope}\n\\end{scope}\n\\end{tikzpicture}\n\\end{eqnarray}\nAll entries in the grid above \\emph{not} corresponding to vertices in the particular case considered should simply be neglected.\nA straightforward computation reveals the following\n\n\\begin{proposition}\\thlabel{prop:-p* dual}\nWe have $\\ker(\\psi)=\\langle{(1,1,\\dots,1)}\\rangle=K_{\\seed_{\\Yng(1)}}$ and $\\psi(N_{\\seed_{\\Yng(1)}})={(1,1,\\dots,1)}^\\perp$. So, the induced map $\\psi:N/K\\to K^\\perp$ is a lattice isomorphism. 
\n\\end{proposition}\n\nIn fact, $\\psi$ defines a cluster ensemble lattice map (Definition~\\ref{def:p-star}), so we obtain\n\\begin{eqnarray}\\label{eq:p-map Gr}\n    p:\\cA\\to\\cX, \\quad \\text{determined by }\\quad  (p\\vert_{\\cA_{\\seed_{\\Yng(1)}}})^*=\\psi.\n\\end{eqnarray}\nThere is a combinatorial way to obtain the map $\\psi$ by introducing \\emph{frozen arrows} to the quiver of the initial seed to \\emph{close cycles} involving frozen vertices (see Figure~\\ref{fig:quivGrec59}).\nThese arrows are used to determine the submatrix denoted by $*$ in \\eqref{eq:Mp*}.\n\n\nAs a direct consequence of \\eqref{eq:p-map Gr} and \\thref{prop:-p* dual} we observe that the action of $T_K$ on $\\cA$ coincides with the $\\C^*$-action (of simultaneously scaling Pl\\\"ucker coordinates) on $\\UT_Y$ restricted to $\\cA$.\nIn particular:\n\\begin{corollary}\n    The Picard property holds for $(p,K)$ with respect to $\\mathcal A\\hookrightarrow\\UT_Y$.\n\\end{corollary}\n\n\\subsection{Valuations and Newton--Okounkov bodies}\n\\label{sec:GHKK_and_RW}\nThis subsection is the core of our application to the Grassmannian. \nWe show in Theorem~\\ref{thm: val and gv} that certain Newton--Okounkov bodies as they appear in \\thref{thm:k_and_pic} (see also Remark~\\ref{rem:comparing_NO_bodies}) are unimodularly equivalent to Newton--Okounkov bodies of Rietsch--Williams. 
\nWe first introduce the combinatorics that govern Rietsch--Williams' flow valuation and the ${\\bf g}$-vector valuation in this case.\n\n\\subsubsection{The flow valuation}\nBased on Postnikov's \\emph{boundary measurement map} for plabic networks \\cite[\\S11]{Pos06}, Rietsch--Williams associate a {\\bf flow valuation} \\cite[Definition 8.1]{RW} to every plabic graph $G$, or more generally every seed $\\seed$, making use of the $\\cX$-type cluster structure on the Grassmannian.\nWe denote it by \n\\[\n\\val_\\seed:\\mathbb C(Y)\\setminus \\{0\\}\\to \\mathbb Z^{(n-k)\\times k}.\n\\]\nThe valuation is defined as the multidegree of the lowest degree summand (with respect to a fixed graded lexicographic order) on Laurent polynomials in $\\cX$ variables and then extended to rational functions in the natural way.\nThe lattice is of dimension $(n-k)k$ (as opposed to $(n-k)k+1$, which is the number of vertices), as the variable corresponding to $\\varnothing$ never appears (more details below in \\S\\ref{sec:NO_Grass_equal}).\nNotice that it therefore coincides with our definition of a {\\bf c}-vector valuation for cluster $\\cX$ varieties (Corollary~\\ref{cor:gv on midX}).\nFor $G=G_{\\Yng(1)}$ we simply write $\\val_G=\\val_{\\Yng(1)}$.\nThe flow valuation with respect to the rectangles plabic graph can be computed in a particularly explicit way, as Rietsch--Williams show in \\cite[\\S14]{RW}.\nWe briefly summarize some of their findings.\n\n\\begin{proposition}\\cite[Proposition 14.4 and Figure 18]{RW}\\thlabel{prop:val grec}\nFor $J\\in\\binom{[n]}{n-k}$, the valuation $\\val_{\\Yng(1)}(p_J)$ can be represented by a \\emph{GT tableau} (defined as follows, see \\cite[\\S14]{RW}) of size $(n-k)\\times k$ whose $(i\\times j)^{\\text{th}}$ entry represents the coefficient of the corresponding basis element.\nThe entries of the GT tableau are obtained in four steps:\n\\begin{itemize}\n    \\item[\\bf Step 1:] draw the Young diagram $\\mu_J$ whose south border is the path 
$w_J$ associated to $J$ in the $(n-k)\\times k$-rectangle; \n    \\item[\\bf Step 2:] draw another copy of $w_{J}$ shifted by {\\it one step south} and {\\it one step east} (this implies that some steps of the new path $w_J^1$ lie outside of the $(n-k)\\times k$-rectangle);\n    \\item[\\bf Step 3:] continue repeating Step 2 until the new copy of $w_J$ lies \\emph{entirely} outside of the $(n-k)\\times k$-rectangle;\n    \\item[\\bf Step 4:] lastly, place an $i$ inside every box (that is part of the $(n-k)\\times k$-rectangle) in between the paths $w_{J}^{i-1}$ and $w_{J}^i$.\n\\end{itemize}\nAll other boxes are filled with zeros.\n\\end{proposition}\nRietsch--Williams compute the Newton--Okounkov bodies associated to this valuation. In our notation they are of the form $\\Delta_{\\val_{\\Yng(1)}}(D_{n-k},p_{(n-k)\\times k})$, where $p_{(n-k)\\times k}=p_{[1,n-k]}$ is the Pl\\\"ucker coordinate (and hence a section of $\\mathcal L_e$) associated to the frozen vertex $(n-k)\\times k$.\n\n\\begin{example}\\label{exp:Grec6,13}\n    The procedure of \\thref{prop:val grec} is depicted in Figure~\\ref{fig:val grec} for $J=\\{3,4,7,9,11,12\\}\\subset [13]$. 
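
The four steps of \\thref{prop:val grec} can also be carried out mechanically for the same $J$. The Python sketch below is our own reading of the procedure, not code from \\cite{RW}: the names `young_diagram` and `gt_tableau` are hypothetical, $\\mu_J$ is encoded by its row lengths, and we assume that a box lies north west of the shifted path $w_J^i$ as soon as moving it $i$ steps north west either lands it in $\\mu_J$ or pushes it out of the rectangle through its north or west side. Under this assumption the entry of a box is the smallest such $i$.

```python
def young_diagram(J, n):
    """Row lengths (top row first) of mu_J; J marks the south steps of w_J."""
    south_steps = sorted(set(J))
    k = n - len(south_steps)
    return [k - (j - r) for r, j in enumerate(south_steps, start=1)]


def gt_tableau(J, n):
    """Entries of the tableau of Steps 1-4 as a list of rows (top row first)."""
    mu = young_diagram(J, n)
    rows, cols = len(mu), n - len(mu)

    def north_west_of_shift(r, c, i):
        # Move box (r, c) by i steps north west and compare it with w_J.
        r0, c0 = r - i, c - i
        if r0 < 1 or c0 < 1:
            return True  # assumption: the box left the rectangle, so it is beyond w_J^i
        return c0 <= mu[r0 - 1]

    tableau = []
    for r in range(1, rows + 1):
        row = []
        for c in range(1, cols + 1):
            i = 0
            while not north_west_of_shift(r, c, i):
                i += 1
            row.append(i)  # boxes of mu_J itself receive 0
        tableau.append(row)
    return tableau


if __name__ == "__main__":
    for row in gt_tableau({3, 4, 7, 9, 11, 12}, 13):
        print(row)
```

For $J=\\{3,4,7,9,11,12\\}\\subset[13]$ the rows produced are $(0,0,0,0,0,1,1)$, $(0,0,0,0,0,1,2)$, $(0,0,0,1,1,1,2)$, $(0,0,1,1,2,2,2)$, $(0,1,1,2,2,3,3)$ and $(0,1,2,2,3,3,4)$, which agree with the entries shown on the left of Figure~\\ref{fig:val grec}.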
\n\\end{example}\n\n\\begin{figure}\n    \\centering\n\\begin{tikzpicture}[scale=.4]\n\\node[left] at (-1,3) {$\\val_{\\Yng(1)}(p_J)=$};\n\\draw (0,0) -- (0,6) -- (5,6);\n\\draw[thick,magenta] (7,6) -- (5,6) -- (5,4) -- (3,4) -- (3,3) -- (2,3) -- (2,2) -- (1,2) -- (1,0) -- (0,0);\n\\node at (2,4.5) {$\\mu_J$};\n\\draw (7,6) -- (7,0) -- (1,0);\n\\draw[magenta,opacity=.4,thick] (2,0) -- (2,-1) -- (1,-1);\n\\draw[magenta,opacity=.4,thick] (2,0) -- (2,1) -- (3,1) -- (3,2) -- (4,2) -- (4,3) -- (6,3) -- (6,5) -- (7,5);\n\\draw[magenta,opacity=.4,thick] (7,5) -- (8,5);\n\\draw[magenta,opacity=.4,thick] (2,-2) -- (3,-2) -- (3,0) -- (4,0) -- (4,1) -- (5,1) -- (5,2) -- (7,2) -- (7,4) -- (9,4);% -- (10,3) -- (10,5) -- (11,5);;\n\\draw[magenta,opacity=.4,thick] (3,-3) -- (4,-3) -- (4,-1) -- (5,-1) -- (5,0) -- (6,0) -- (6,1) -- (8,1) -- (8,3) -- (10,3);\n\\draw[dashed,opacity=.5] (0,0) -- (3.5,-3.5);\n\\draw[dashed,opacity=.5] (7,6) -- (10.5,2.5);\n\\draw[opacity=.4,dashed] (5,6) -- (7,6);\n\\draw[opacity=.4,dashed] (5,5) -- (7,5);\n\\draw[opacity=.4,dashed] (5,4) -- (7,4);\n\\draw[opacity=.4,dashed] (3,3) -- (7,3);\n\\draw[opacity=.4,dashed] (2,2) -- (7,2);\n\\draw[opacity=.4,dashed] (1,1) -- (7,1);\n\\draw[opacity=.4,dashed] (6,6) -- (6,0);\n\\draw[opacity=.4,dashed] (5,5) -- (5,0);\n\\draw[opacity=.4,dashed] (4,4) -- (4,0);\n\\draw[opacity=.4,dashed] (3,3) -- (3,0);\n\\draw[opacity=.4,dashed] (2,2) -- (2,0);\n\\draw[opacity=.4,dashed] (1,1) -- (1,0); \n\\node at (1.5,.5) {1};\n\\node at (1.5,1.5) {1};\n\\node at (2.5,.5) {2};\n\\node at (2.5,1.5) {1};\n\\node at (2.5,2.5) {1};\n\\node at (3.5,.5) {2};\n\\node at (3.5,1.5) {2};\n\\node at (3.5,2.5) {1};\n\\node at (3.5,3.5) {1};\n\\node at (4.5,.5) {3};\n\\node at (4.5,1.5) {2};\n\\node at (4.5,2.5) {2};\n\\node at (4.5,3.5) {1};\n\\node at (5.5,.5) {3};\n\\node at (5.5,1.5) {3};\n\\node at (5.5,2.5) {2};\n\\node at (5.5,3.5) {1};\n\\node at (5.5,4.5) {1};\n\\node at (5.5,5.5) {1};\n\\node at (6.5,.5) {4};\n\\node at 
(6.5,1.5) {3};\n\\node at (6.5,2.5) {2};\n\\node at (6.5,3.5) {2};\n\\node at (6.5,4.5) {2};\n\\node at (6.5,5.5) {1};\n\n\\node at (11.5,3) {${\\longmapsto}$};\n\\node at (11.5,3.75) {\\small $-\\psi$};\n \n\\begin{scope}[xshift=14cm]\n\\draw (0,0) -- (0,6) -- (5,6) -- (5,4) -- (3,4) -- (3,3) -- (2,3) -- (2,2) -- (1,2) -- (1,0) -- (0,0);\n\\draw (5,6) -- (7,6) -- (7,0) -- (1,0);\n\\draw[opacity=.4,dashed] (0,6) -- (7,6);\n\\draw[opacity=.4,dashed] (0,5) -- (7,5);\n\\draw[opacity=.4,dashed] (0,4) -- (7,4);\n\\draw[opacity=.4,dashed] (0,3) -- (7,3);\n\\draw[opacity=.4,dashed] (0,2) -- (7,2);\n\\draw[opacity=.4,dashed] (0,1) -- (7,1);\n\\draw[opacity=.4,dashed] (6,6) -- (6,0);\n\\draw[opacity=.4,dashed] (5,6) -- (5,0);\n\\draw[opacity=.4,dashed] (4,6) -- (4,0);\n\\draw[opacity=.4,dashed] (3,6) -- (3,0);\n\\draw[opacity=.4,dashed] (2,6) -- (2,0);\n\\draw[opacity=.4,dashed] (1,6) -- (1,0); \n\\node at (.5,.5) {$1$};\n\\node[opacity=.5] at (.5,1.5) {$0$};\n\\node[opacity=.5] at (.5,3.5) {$0$};\n\\node[opacity=.5] at (.5,4.5) {$0$};\n\\node[opacity=.5] at (.5,5.5) {$0$};\n\\node at (.5,2.5) {$-1$};\n\\node at (1.5,2.5) {$1$};\n\\node at (1.5,3.5) {$-1$};\n\\node[opacity=.5] at (1.5,.5) {$0$};\n\\node[opacity=.5] at (1.5,1.5) {$0$};\n\\node[opacity=.5] at (1.5,4.5) {$0$};\n\\node[opacity=.5] at (1.5,5.5) {$0$};\n\\node at (2.5,3.5) {$1$};\n\\node[opacity=.5] at (2.5,.5) {$0$};\n\\node[opacity=.5] at (2.5,1.5) {$0$};\n\\node[opacity=.5] at (2.5,2.5) {$0$};\n\\node[opacity=.5] at (2.5,5.5) {$0$};\n\\node at (2.5,4.5) {$-1$};\n\\node[opacity=.5] at (3.5,.5) {$0$};\n\\node[opacity=.5] at (3.5,1.5) {$0$};\n\\node[opacity=.5] at (3.5,2.5) {$0$};\n\\node[opacity=.5] at (3.5,3.5) {$0$};\n\\node[opacity=.5] at (3.5,4.5) {$0$};\n\\node[opacity=.5] at (3.5,5.5) {$0$};\n\\node at (4.5,4.5) {$1$};\n\\node[opacity=.5] at (4.5,.5) {$0$};\n\\node[opacity=.5] at (4.5,1.5) {$0$};\n\\node[opacity=.5] at (4.5,2.5) {$0$};\n\\node[opacity=.5] at (4.5,3.5) {$0$};\n\\node[opacity=.5] at 
(4.5,5.5) {$0$};\n\\node[opacity=.5] at (5.5,.5) {$0$};\n\\node[opacity=.5] at (5.5,1.5) {$0$};\n\\node[opacity=.5] at (5.5,2.5) {$0$};\n\\node[opacity=.5] at (5.5,3.5) {$0$};\n\\node[opacity=.5] at (5.5,4.5) {$0$};\n\\node[opacity=.5] at (5.5,5.5) {$0$};\n\\node at (6.5,.5) {$-1$};\n\\node[opacity=.5] at (6.5,1.5) {$0$};\n\\node[opacity=.5] at (6.5,2.5) {$0$};\n\\node[opacity=.5] at (6.5,3.5) {$0$};\n\\node[opacity=.5] at (6.5,4.5) {$0$};\n\\node[opacity=.5] at (6.5,5.5) {$0$};\n\n\\node[right] at (7.5,3) {$=\\bar{\\bf g}_{{\\Yng(1)}}(p_{J})$};\n\\end{scope}\n\\end{tikzpicture}\n  \\caption{\n  On the left: the pictorial representation of the GT tableau for $J=\\{3,4,7,9,11,12\\} \\subset [13]$ (\\thref{prop:val grec}).  The south steps of the path cutting out the Young diagram $\\mu_J$ correspond to indices in $J$. \n  On the right, we depict its image under $-\\psi$ which coincides with the ${\\bf g}$-vector of $p_J$ up to homogenization, see \\eqref{eq:g-vectors for Grec} and \\eqref{eq:homogenized g vector op}.\n }\n    \\label{fig:val grec}\n\\end{figure}\n\n\n\\subsubsection{A combinatorial description of ${\\bf g}$-vectors}\\label{sec:g-vects}\nIn this subsection we consider the cluster variety $\\cA^{\\rm op}$ whose initial quiver is obtained by opposing the initial quiver for $\\cA$. It is well known that $\\cA$ and $\\cA^{\\rm op}$ are isomorphic (in general opposing the quiver gives rise to isomorphic cluster $ \\cA$-varieties). 
\nWe also have a partial minimal model $\\cA^{\\op} \\hookrightarrow \\UT_Y$.\nWe write $\\seed_{\\Yng(1)}^{\\rm op}$ to denote the seed $\\seed_{\\Yng(1)}$ of equation \\eqref{eq_seed} thought of as the initial seed for $\\cA^{\\rm op}$.\nNotice that $-\\psi$ determines a cluster ensemble map $p^{\\rm op}:\\cA^{\\rm op} \\to \\cX^{\\rm op}$\nso that the Picard property holds for $(p^{\\rm op},K)$ with respect to $\\cA^{\\rm op}\\hookrightarrow \\text{UT}_Y$.\n\nIn this setting an explicit combinatorial formula to compute {\\bf g}-vectors of Pl\\\"ucker coordinates can be deduced from the categorification of the Grassmannian cluster algebra developed in \\cite{JKS16,BKM16}.\nWe learned about it from Bernhard Keller in private email communication.\nThe formula below describes ${\\bf g}$-vectors with respect to the seed $\\seed_{\\Yng(1)}^{\\rm op}$ for the cluster variety $\\cA^{\\rm op}$, which we think of as another cluster structure on $\\UT_{Y}$.\n\n\n\\begin{corollary}\\thlabel{cor:gv Grec}\n(Hook formula for {\\bf g}-vectors)\nConsider the seed $\\seed_{\\Yng(1)}^{\\rm op}$ and $J \\in \\binom{[n]}{n-k}$. We let $i_1\\times j_1,\\dots ,i_s \\times j_s $ be the rectangles corresponding to the turning points in the path $w_J$ that cuts out $\\mu_J$ inside the $(n-k)\\times k$-rectangle.\nThen \n\\begin{eqnarray}\\label{eq:g-vectors for Grec}\n{\\bf g}_{\\Yng(1)^{\\rm op}}(p_J):={\\bf g}^{\\cA^{\\rm op}}_{\\seed_{\\Yng(1)}^{\\rm op}}(p_J)= \\sum_{p=1}^{s}\\left(f_{i_{p}\\times j_{p}}-f_{i_{p}\\times j_{p+1}}\\right),\n\\end{eqnarray}\nwhere we set $f_{i_s\\times j_{s+1}}:=0$.\n\\end{corollary}\n\n\\begin{example}\nConsider $n-k=4$, $n=9$, and $J=\\{2,4,6,7\\}$. 
We have that $\\mu_{J}=\\Yng(4,3,2,2)$ and by \\thref{cor:gv Grec}\n\\[\n{\\bf g}_{\\Yng(1)^{\\rm op}}\\lrp{p_{\\Yngs(4,3,2,2)}}= f_{\\Yngs(4)}- f_{\\Yngs(3)}+f_{\\Yngs(3,3)}-f_{\\Yngs(2,2)}+f_{\\Yngs(2,2,2,2)}.\n\\]\n\\end{example}\n\n\\subsubsection{Equality of the Newton--Okounkov bodies}\\label{sec:NO_Grass_equal}\n The aim of this section is to identify the Newton--Okounkov bodies of flow valuations with Newton--Okounkov bodies of {\\bf g}-vector valuations for $\\cA^{\\rm op}$. \nWe use a particular cluster ensemble lattice map for the identification and work in the initial seed $\\seed^{\\rm op}_{\\Yng(1)}$ (whose quiver is opposite to the quiver depicted in Figure~\\ref{fig:quivGrec59} for $n=9,k=5$).\n\n\n\nWe think of the open positroid variety inside $Y=\\text{Gr}_{n-k}(\\C^n)$ as the quotient of the cluster variety $\\cA^{\\rm op}$ by the torus $T_{K}$.\nWe choose a section \n\\[\n\\sigma: N/K \\to N \\quad \\text{with image} \\quad N  \\cap f_{\\varnothing}^\\perp;\n\\]\nthat is, a coset $n\\mod K$ is sent to its unique representative satisfying $\\langle n,f_{\\varnothing}\\rangle=0$.\nIt is not hard to see that $\\sigma$ induces an isomorphism between the rings of rational functions $\\mathbb C(T_{M/\\langle f_{\\varnothing}\\rangle})$ and $\\mathbb C(T_{K^\\perp})$ that commutes with cluster $\\mathcal X$ mutation.\nWe use $\\sigma$ to realize $\\Trop_{\\Z}(({\\cX_{\\bf 1}^\\vee})_{\\seed^\\vee})=\\Trop_{\\Z}((\\cA^{\\rm op}/T_K)_{\\seed^{\\rm op}})= N/K$ inside $ \\Trop_\\Z(\\cA^{\\rm op}_{\\seed^{\\rm op}})=N =\\Trop_\\Z(\\cX^\\vee_{\\seed^\\vee})$ for every seed.\nMoreover, the dual of $\\sigma$ induces an isomorphism of lattices\n\\[\n\\sigma^*:M/\\langle f_{\\varnothing}\\rangle \\to K^\\perp.\n\\]\nNotice that $T_{K^\\perp}=\\pi^{-1}(\\bf 1)$ where $\\pi:T_M\\to T_{K^*}$ is the restriction of $\\cX\\to T_{K^*}$ to a cluster chart.\nAs alluded to above, we obtain an isomorphism of cluster $\\cX$-varieties\n\\[\n\\sigma^*:\\cX_{\\setminus 
\\varnothing}\\to \\cX_{\\bf 1},\n\\]\nwhere $\\cX_{\\setminus \\varnothing}$ is the $\\cX$-variety associated with the initial data obtained by deleting the index $\\varnothing$ upon realizing $M/\\langle f_{\\varnothing}\\rangle$ as $\\langle f_{i\\times j}\\mid 1\\le i\\le n-k,1\\le j\\le k\\rangle \\subset M$.\nGiven a seed $\\seed$ we denote the corresponding seed of $\\cX_{\\setminus\\varnothing}$ by $\\seed_{\\setminus\\varnothing}$.\nIn particular, we have\n$\\Trop_{\\Z}((\\mathcal X^\\vee_{\\setminus\\varnothing})_{\\seed^\\vee_{\\setminus \\varnothing}})=N_{\\seed}\\cap f_{\\varnothing}^\\perp$.\nThe flow valuation is defined on the ring of rational functions on the positroid variety, which coincides with\n\\[\n\\mathbb C(\\cX_{\\setminus \\varnothing}) \\cong \\mathbb C(x_{i\\times j}:1\\le i\\le n-k,1\\le j\\le k).\n\\]\n\n\nThe next result follows from the preceding discussion and Corollary~\\ref{cor:gv on midX}.\n\n\\begin{proposition}\\label{prop:flow is gv for X}\nFor every choice of seed $\\seed$ the following diagram commutes:\n\\[\n\\xymatrix{\n\\mathbb C(\\cX_{\\bf 1})\\setminus \\{0\\} \\ar[d]_{(\\sigma^*)^*}\\ar[r]^{\\cv_{\\seed}} & N/K\\ar[d]^{\\sigma}\\\\\n\\mathbb C(\\cX_{\\setminus \\varnothing})\\setminus \\{0\\} \\ar[r]_{\\val_{\\seed}} & N_{\\seed}\\cap f_{\\varnothing}^\\perp.\n}\n\\]\n\\end{proposition}\nThe flow valuation is a ${\\bf c}$-vector valuation for the variety $\\cX_{\\setminus \\varnothing}$ as both are defined by picking the lowest degree exponent of a Laurent polynomial with respect to the same order. 
That is:\n\\[\n\\val_{\\seed}= \\cv^{\\cX_{\\setminus \\varnothing}}_{\\seed\\setminus\\varnothing}.\n\\]\nAlternatively, in light of Proposition \\ref{prop:flow is gv for X} we may think of the flow valuation as a ${\\bf c}$-vector valuation for $\\cX_{\\bf 1}$.\nOur aim now is to identify the images of $\\val_\\seed$ with \nthose of a $\\bf g$-vector valuation for $\\cA^{\\rm op}$, or more precisely a ${\\bf g}$-vector valuation for $\\cA^{\\rm op}/T_{K}$.\nTo avoid confusion we introduce the following notation\n\\begin{eqnarray}\\label{eq:homogenized g vector op}\n\\bar {\\bf g}_{\\Yng(1)^{\\rm op}}:R\\setminus \\{0\\}\\longrightarrow K^\\perp\\cong M_{\\seed}/\\langle f_{(n-k)\\times k}\\rangle \n\\end{eqnarray}\ndefined for a homogeneous element $h\\in R_q\\setminus \\{0\\}$ by\n\\[\nh\\longmapsto \n{\\bf g}_{\\Yng(1)^{\\rm op}}\\lrp{\\frac{h}{\np_{(n-k)\\times k}^q}},\n\\]\nwhere ${\\bf g}_{\\Yng(1)^{\\rm op}}(p_{(n-k)\\times k})=f_{(n-k)\\times k}$.\nNotice that $\\bar {\\bf g}_{\\Yng(1)^{\\rm op}}$ is the restriction of ${\\bf g}_{\\Yng(1)^{\\rm op}}: \\mathbb C(Y)\\setminus \\{0\\}\\to M$ to the section ring $R\\hookrightarrow \\mathbb C(Y)$ where the embedding is defined by $R_q\\ni h\\mapsto h/p_{(n-k)\\times k}^q$.\n\n \n\\begin{theorem}\\thlabel{thm: val and gv}\n\\begin{samepage}\nFor every $J\\in \\binom{[n]}{n-k}$ we have\n\\begin{eqnarray}\n-\\psi(\\val_{\\Yng(1)}(p_J))=\\bar {\\bf g}_{\\Yng(1)^{\\rm op}}(p_{J}). \n\\end{eqnarray}\nIn particular, \nthe Newton--Okounkov bodies $\\Delta_{\\val_{\\Yng(1)}}(D_{n-k},p_{(n-k)\\times k})$ and $\\Delta_{\\bar{\\bf g}_{\\Yng(1)^{\\rm op}}}(D_{n-k},p_{(n-k)\\times k})$ are unimodularly equivalent with lattice isomorphism given by $-\\psi$. \n\\end{samepage}\n\\end{theorem}\n\n\\begin{proof}\nWe prove the claim in several steps. \nFirst, we need to describe $\\val_{\\Yng(1)^{\\rm op}}(p_J)$. 
\nFortunately, this is straightforward using \\thref{prop:val grec}.\nLet us analyze the image of the \\emph{$i$-strip}, \\emph{i.e.} the image of the elements of form $-ie_{a\\times b}$ corresponding to a box in position $a\\times b$ of the grid lying between the path $w_J^i$ and $w_J^{i-1}$. \nWe deduce\n\\begin{center}\n\t\\begin{tikzpicture}[scale=.6]\n\\draw[thick,teal] (1,3) -- (4,3) -- (4,6);\n\\draw[thick,magenta] (2,2) -- (5,2) -- (5,5);\n\\node at (4.5,5.5) {\\tiny $\\vdots$};\n\\node at (1.5,2.5) {\\tiny $\\cdots$};\n\\node at (2.5,2.5) {\\tiny $-i$};\n\\node at (3.5,2.5) {\\tiny $-i$};\n\\node at (4.5,2.5) {\\tiny $-i$};\n\\node at (4.5,3.5) {\\tiny $-i$};\n\\node at (4.5,4.5) {\\tiny $-i$};\n\\draw[dashed,opacity=.5] (1,2.5) -- (1,4.5);\n\\draw[dashed,opacity=.5] (2,1.5) -- (2,5.5);\n\\draw[dashed,opacity=.5] (3,.5) -- (3,6.5);\n\\draw[dashed,opacity=.5] (4,.5) -- (4,6.5);\n\\draw[dashed,opacity=.5] (5,.5) -- (5,5.5);\n\\draw[dashed,opacity=.5] (6,.5) -- (6,4.5);\n\\draw[dashed,opacity=.5] (1.5,5) -- (5.5,5);\n\\draw[dashed,opacity=.5] (.5,4) -- (6.5,4);\n\\draw[dashed,opacity=.5] (.5,3) -- (6.5,3);\n\\draw[dashed,opacity=.5] (1.5,2) -- (6.5,2);\n\\draw[dashed,opacity=.5] (2.5,1) -- (6.5,1);\n\\node[above right,magenta] at (5,5) {\\tiny $w_J^i$};\n\\node[above right,teal] at (4,6) {\\tiny $w_J^{i-1}$};\n\n\\node at (8,3) {$\\mapsto$};\n\n\\begin{scope}[xshift=9 cm]\n\\node at (.5,3.5) {\\tiny $\\cdots$};\n\\node at (1.5,2.5) {\\tiny $i$};\n\\node at (2.5,1.5) {\\tiny $-i$};\n\\node at (3.5,6.5) {\\tiny $\\vdots$};\n\\node at (4.5,5.5) {\\tiny $i$};\n\\node at (5.5,4.5) {\\tiny $-i$};\n\\draw[thick] (1,4) -- (3,4) -- (3,6);\n\\draw[thick,teal] (1,3) -- (4,3) -- (4,6);\n\\draw[thick,magenta] (2,2) -- (5,2) -- (5,5);\n\\draw[thick] (3,1) -- (6,1) -- (6,4);\n\\node at (1.5,3.5) {\\tiny $-i$};\n\\node at (2.5,3.5) {\\tiny $0$};\n\\node at (3.5,3.5) {\\tiny $i$};\n\\node at (3.5,4.5) {\\tiny $0$};\n\\node at (3.5,5.5) {\\tiny $-i$};\n\\node at (2.5,2.5) 
{\\tiny $i$};\n\\node at (3.5,2.5) {\\tiny $0$};\n\\node at (4.5,2.5) {\\tiny $-2i$};\n\\node at (4.5,3.5) {\\tiny $0$};\n\\node at (4.5,4.5) {\\tiny $i$};\n\\node at (3.5,1.5) {\\tiny $0$};\n\\node at (4.5,1.5) {\\tiny $0$};\n\\node at (5.5,1.5) {\\tiny $i$};\n\\node at (5.5,2.5) {\\tiny $0$};\n\\node at (5.5,3.5) {\\tiny $0$};\n\\draw[dashed,opacity=.5] (1,2.5) -- (1,4.5);\n\\draw[dashed,opacity=.5] (2,1.5) -- (2,5.5);\n\\draw[dashed,opacity=.5] (3,.5) -- (3,6.5);\n\\draw[dashed,opacity=.5] (4,.5) -- (4,6.5);\n\\draw[dashed,opacity=.5] (5,.5) -- (5,5.5);\n\\draw[dashed,opacity=.5] (6,.5) -- (6,4.5);\n\\draw[dashed,opacity=.5] (1.5,5) -- (5.5,5);\n\\draw[dashed,opacity=.5] (.5,4) -- (6.5,4);\n\\draw[dashed,opacity=.5] (.5,3) -- (6.5,3);\n\\draw[dashed,opacity=.5] (1.5,2) -- (6.5,2);\n\\draw[dashed,opacity=.5] (2.5,1) -- (6.5,1);\n\\end{scope}\n\\end{tikzpicture}\n\\end{center}\nNotice that unless $i=1$ all non zero entries in the picture cancel with the images of the $(i-1)$- and the $(i+1)$-strips.\nWhen $i=1$ however, the entry $i=1$ above the path  $w_J^{0}=w_J$ stays.\nHence, for every corner in $w_{J}$ corresponding to a south step followed by a west step $-\\psi\\left(\\val_{\\Yng(1)}(p_J)\\right)$ has coefficient $1$ for $f_{a\\times b}$ where $a\\times b$ corresponds to the box whose south east corner coincides with this corner of $w_J$.\nThe case of a corner in $w_{J}$ corresponding to a west step followed by a south step is very similar, with the only difference that the signs change. \nIn particular,\n$-\\psi\\left(\\val_{\\Yng(1)}(p_J)\\right)$ has coefficient $-1$ for $f_{a\\times b}$ where $a\\times b$ corresponds to the box whose south east corner is adjacent to this corner of $w_J$.\n\n\nIt is left to analyze the parts of $-\\psi\\left(\\val_{\\Yng(1)}(p_J)\\right)$ corresponding to \\emph{frozen} vertices. 
\nThe arguments here are very similar, the only special case being the south east corner of the $(n-k)\\times k$-rectangle.\nHence, we restrict our attention to this case and omit the others.\n\nConsider the vertex in position $(n-k)\\times k$ and assume that the corresponding entry in $\\val_{\\Yng(1)}(p_J)$ is $i$.\nNotice that the coefficient for the vertex $(n-k-1)\\times (k-1)$ is necessarily $i-1$.\nSo, applying $-\\psi$ we see that\n\\begin{center}\n\t\\begin{tikzpicture}[scale=.6]\n\\draw (.5,0) -- (3,0) -- (3,2.5);\n\\node at (2.5,.5) {\\tiny $-i$};\n\\node at (1.5,1.5) {\\tiny $-i+1$};\n\\node at (1.5,.5) {\\tiny $\\cdots$};\n\\node at (2.5,1.5) {\\tiny $\\vdots$};\n\n\\draw[opacity=.5,dashed] (.5,1) -- (3,1);\n\\draw[opacity=.5,dashed] (.5,2) -- (3,2);\n\\draw[opacity=.5,dashed] (1,0) -- (1,2.5);\n\\draw[opacity=.5,dashed] (2,0) -- (2,2.5);\n\n\\node at (4,1.25) {$\\mapsto$};\n\n\\begin{scope}[xshift=5cm]\n\\draw (.5,0) -- (3,0) -- (3,2.5);\n\\node at (2.5,.5) {\\tiny $-1$};\n\\node at (1.5,1.5) {\\tiny $-i$};\n\\node at (1.5,.5) {\\tiny $1$};\n\\node at (2.5,1.5) {\\tiny $1$};\n\n\\draw[opacity=.5,dashed] (.5,1) -- (3,1);\n\\draw[opacity=.5,dashed] (.5,2) -- (3,2);\n\\draw[opacity=.5,dashed] (1,0) -- (1,2.5);\n\\draw[opacity=.5,dashed] (2,0) -- (2,2.5);\n\\end{scope}\n\\end{tikzpicture} \n\\end{center}\nObserve that the entries $1$ and $-i$ cancel by similar arguments as above.\nThe only non-zero coefficient in this picture is $-1$ for $f_{(n-k)\\times k}$.\nSummarizing, we have\n\\begin{eqnarray*}\n-\\psi\\left(\\val_{\\Yng(1)}(p_J)\\right) &=& \\sum_{\\begin{smallmatrix}\n  \\text{south to west}\\\\\n  \\text{corners of }w_J\n\\end{smallmatrix} } f_{a'\\times b'} -f_{(n-k)\\times k}\n-\\sum_{\\begin{smallmatrix}\n  \\text{west to south}\\\\\n  \\text{corners of }w_J\n\\end{smallmatrix} } f_{a\\times b}\\\\\n&\\overset{\\text{Equation~\\eqref{eq:g-vectors for Grec}}}{=}& {\\bf g}_{\\Yng(1)^{\\rm op}}(p_{J}) - f_{(n-k)\\times k}=\\bar{\\bf g}_{\\Yng(1)^{\\rm 
op}}(p_J).\n\\end{eqnarray*}\nThis implies $-\\psi\\left(\\Delta_{\\val_{\\Yng(1)}}(D_{n-k},p_{(n-k)\\times k})\\right)=\\Delta_{\\bar{\\bf g}_{\\Yng(1)^{\\rm op}}}(D_{n-k},p_{(n-k)\\times k})$. \n\\end{proof}\n\n\\begin{remark}\\label{rmk:val and cval}\n    The attentive reader might notice that Theorem~\\ref{thm: val and gv} and the discussion preceding it closely resemble Lemma~\\ref{lem:cval_gval}.\n    However, the difference in convention choices between \\cite{RW} and the present paper necessitates a non-trivial change of coordinates.\n    To avoid lengthening the exposition even more, we decided to give the result in a single seed, but we note that Theorem~\\ref{thm: val and gv} is indeed an instance of Lemma~\\ref{lem:cval_gval} (after non-trivial changes of cluster coordinates).\n    After making the appropriate change of coordinates one can show that the map $-\\psi$ may be described as the tropicalization of a cluster ensemble map. In particular, Theorem~\\ref{thm: val and gv} can be extended to all seeds.\n\\end{remark}\n\n\\subsection{The intrinsic Newton--Okounkov body for Grassmannians}\n\\label{sec:Grass_intrinsic}\nAs before, let $\\lb_e$ be the bundle over $\\text{Gr}_{n-k}(\\C^n)$ obtained by pullback of $\\mathcal{O}(1)$ under the Pl\\\"ucker embedding $\\text{Gr}_{n-k}(\\C^n)\\hookrightarrow \\mathbb{P}^{\\binom{n}{k}-1}$. Recall the definition of the intrinsic Newton--Okounkov body from Definition~\\ref{def:intrinsic_lb}.\n\n\\begin{corollary}\\label{cor:intrinsicNO grassmannian}\nConsider the partial minimal model $\\cA^{\\rm op}\\hookrightarrow \\UT_{Y} $, the minimal model $\\cA^{\\op}/T_K \\hookrightarrow Y$ and the map ${\\bf g}:\\mathbb{B}_{\\tf}(\\cA^{\\op})\\to \\Trop_{\\Z}((\\cA^{\\op})^\\vee)$ of \\eqref{eq:nu_seed_free}. 
Then\n\\eqn{\n\\Delta_{\\mathrm{BL}}(\\lb_e) = \\bconv\\Bigg( \\lrc{ \\gv \\lrp{p_J} \\mid J \\in \\binom{[n]}{n-k}} \\Bigg).\n}\n\\end{corollary}\n\n\\begin{proof}\nThe Newton--Okounkov polytope for the flow valuation with respect to $\\seed_{\\Yng(1)}$ is the convex hull of the images of Pl\\\"ucker coordinates (see \\cite[\\S16.1]{RW}).\nSo by Theorem~\\ref{thm: val and gv} the same is true for the Newton--Okounkov body $\\Delta_{\\bar{\\gv}_{\\Yng(1)^{\\op}}}(D_{n-k},p_{(n-k)\\times k})$.\nBy Theorem~\\ref{NO_bodies_are_positive},  $\\Delta_{\\bar{\\gv}_{\\Yng(1)^{\\op}}}(D_{n-k},p_{(n-k)\\times k})$ is positive, hence it is broken line convex.\nTherefore, the broken line convex hull of the set $\\lrc{ \\bar{\\gv}_{\\Yng(1)^{\\op}} \\lrp{p_J} \\mid J \\in \\binom{[n]}{n-k}} $ in the lattice  $\\Trop_\\R({\\cA^{\\op}/T_{K}}^{\\vee}_{\\seed_{\\Yng(1)^{\\op}}})$  \ncoincides with its convex hull.\nWe take into account Remark~\\ref{rem:comparing_NO_bodies} to get that $ \\Delta_{\\gv_{\\Yng(1)^{\\op}}}(\\lb_e) = \\Delta_{\\bar{\\gv}_{\\Yng(1)^{\\op}}}(D_{n-k},p_{(n-k)\\times k}) + \\gv_{\\Yng(1)^{\\op}}(p_{(n-k)\\times k}) $.\nThis implies that $ \\Delta_{\\gv_{\\Yng(1)^{\\op}}}(\\lb_e) $ is the broken line convex hull of the set $\\lrc{ \\gv_{\\Yng(1)^{\\op}} \\lrp{p_J} \\mid J \\in \\binom{[n]}{n-k}} $.\nSince being a broken line convex set is independent of the choice of seed, the result follows.\n\\end{proof}\n\n\\begin{remark}\nIn the proof of Corollary~\\ref{cor:intrinsicNO grassmannian}, we implicitly use the $\\cA$ cluster structure to view the intrinsic Newton--Okounkov body $\\Delta_{\\mathrm{BL}}(\\lb_e)$ as the broken line convex hull of tropical points indexing Pl\\\"ucker coordinates.\nWe could alternatively use the $\\cX$ cluster structure as Rietsch--Williams do, and define theta functions with the corresponding $\\cX$ scattering diagram.\nBy identifying the Rietsch--Williams valuation with the $\\cv$-vector valuation (Corollary~\\ref{cor:gv on midX}) and 
noting that there is a cluster ensemble automorphism of the open positroid variety (see \\cite[Theorem~7.1, Corollary~5.11]{MullSp}, \\cite[Theorem~7.3, Proposition~7.4]{RW}),\nwe can apply Lemma~\\ref{lem:cval_gval} to give a completely analogous statement to Corollary~\\ref{cor:intrinsicNO grassmannian} which uses the Rietsch--Williams valuation rather than the $\\gv$-vector valuation.\nIn fact, in \\S\\ref{sec:Gr36} we present an example of an explicit computation related to the intrinsic Newton--Okounkov body $\\Delta_{\\mathrm{BL}}(\\lb_e)$ defined via the $\\Xnet$ scattering diagram.\n\\end{remark}\n\n\n\n\\subsubsection{Example}\\label{sec:Gr36}\nIn this subsection, we give an example of the intrinsic Newton--Okounkov body for the case of $\\Grass_3\\lrp{\\C^6}$ and compare it to a Newton--Okounkov body of \\cite{RW}. \nIn particular, in \\cite[\\S9]{RW}, Rietsch--Williams discuss a non-integral vertex appearing in the Newton--Okounkov body $\\Delta_{\\val_{G}}(D_{3},p_{123})$ associated to the plabic graph $G$ of Figure~\\ref{fig:3-6}.\nWe illustrate how this non-integral vertex in the usual Newton--Okounkov body framework corresponds to a point in the interior of a broken line segment in $\\Delta_{\\mathrm{BL}}(\\lb_e)$ and thus is not a genuine vertex from the intrinsic Newton--Okounkov body perspective.\nHere, to facilitate comparison with \\cite{RW}, we will view the open positroid variety as $\\Xnet_{\\vb{1}}$ (up to codimension 2).\nSo, the scattering diagram we use to define $\\Delta_{\\mathrm{BL}}(\\lb_e)$ in this subsection will be $\\scat^{\\Xnet_{\\mathbf{1}}}_{\\text{in},\\seed_G}$ for a particular choice of initial seed $\\seed_G$.  
\nThe choice of seed is encoded by the plabic graph illustrated in Figure~\\ref{fig:3-6}.\n\n\\begin{figure}[ht]\n    \\centering\n\t\n\\tikzexternaldisable\n\\begin{tikzpicture}[scale=.5]\n\\tikzset{->-/.style={decoration={\n  markings,\n  mark=at position #1 with {\\arrow{>}}},postaction={decorate}}}\n  \\tikzset{-<-/.style={decoration={\n  markings,\n  mark=at position #1 with {\\arrow{<}}},postaction={decorate}}}\n   \n\\draw (0,0) circle [radius=5];  \n \n\\draw[-<-=.5] (-1,2) -- (1,2);\n\\draw[->-=.5] (1,2) -- (2.25,0);\n\\draw[->-=.5] (2.25,0) -- (1,-2);\n\\draw[->-=.5] (1,-2)-- (-1,-2);\n\\draw[-<-=.5] (-1,-2) -- (-2.25,0);\n\\draw[-<-=.5] (-2.25,0)-- (-1,2);\n\\draw[-<-=.5] (-1,2)  -- (-1,3.5);\n\\draw[->-=.5] (-1,3.5)-- (1,3.5);\n\\draw[->-=.5] (1,3.5) -- (1,2);\n\\draw[-<-=.5] (2.25,0) -- (3.5,-1);\n\\draw[->-=.7] (3.5,-1) -- (2.375,-3);\n\\draw[-<-=.5] (2.375,-3) -- (1,-2);\n\\draw[->-=.5] (-2.25,0) -- (-3.5,-1);\n\\draw[->-=.5] (-3.5,-1) -- (-2.375,-3);\n\\draw[-<-=.5] (-2.375,-3) -- (-1,-2);\n\\draw[->-=.5] (-1,4.9) -- (-1,3.5);\n\\draw[->-=.5] (1,4.9) -- (1,3.5);\n\\draw[->-=.5] (4.7,-1.75) -- (3.5,-1);\n\\draw[->-=.5] (2.375,-3) -- (3.35,-3.75);\n\\draw[->-=.5] (-3.5,-1) -- (-4.7,-1.75);\n\\draw[->-=.5] (-2.375,-3) -- (-3.35,-3.75);\n\n   \n\\draw[fill] (1,3.5) circle [radius=.175];  \n\\draw[fill] (-1,2) circle [radius=.175];  \n\\draw[fill] (2.25,0) circle [radius=.175];  \n\\draw[fill] (2.375,-3) circle [radius=.175];  \n\\draw[fill] (-1,-2) circle [radius=.175];  \n\\draw[fill] (-3.5,-1) circle [radius=.175];  \n\n\\draw[fill, white] (-1,3.5) circle [radius=.175];  \n\\draw (-1,3.5) circle [radius=.175];  \n\\draw[fill, white] (1,2) circle [radius=.175];  \n\\draw (1,2) circle [radius=.175];  \n\\draw[fill, white] (3.5,-1) circle [radius=.175];  \n\\draw (3.5,-1) circle [radius=.175];\n\\draw[fill, white] (1,-2) circle [radius=.175];  \n\\draw (1,-2) circle [radius=.175];  \n\\draw[fill, white] (-2.25,0) circle [radius=.175];  \n\\draw (-2.25,0) 
circle [radius=.175];  \n\\draw[fill, white] (-2.375,-3) circle [radius=.175];  \n\\draw (-2.375,-3) circle [radius=.175];  \n\n\\node[above] at (-1,4.9) {1};\n\\node[above] at (1,4.9) {2};\n\\node[right] at (4.7,-1.75) {3};\n\\node[right] at (3.35,-3.75) {4};\n\\node[left] at (-4.7,-1.75) {6};\n\\node[left] at (-3.35,-3.75) {5};\n\n\\node at (0,0) {\\scalebox{.3}{ $\\yng(2,1)$}};\n\\node at (0,2.75) {\\scalebox{.3}{$\\yng(2)$}};\n\\node at (0,4.25) {\\scalebox{.3}{$\\yng(3)$}};\n\\node at (3,1.75) {\\scalebox{.3}{$\\yng(3,3)$}};\n\\node at (2.3,-1.5) {\\scalebox{.3}{$\\yng(3,3,2)$}};\n\\node at (3.7,-2.25) {\\scalebox{.3}{$\\yng(3,3,3)$}};\n\\node at (0,-3.5) {\\scalebox{.3}{$\\yng(2,2,2)$}};\n\\node at (-2.3,-1.5) {\\scalebox{.3}{$\\yng(1,1)$}};\n\\node at (-3.7,-2.25) {\\scalebox{.3}{$\\yng(1,1,1)$}};\n\\node at (-3,1.75) {{$\\varnothing$}};\n\n\\begin{scope}[xshift=14cm]\n\n\\def\\op{.4}\n\n\\draw[opacity=\\op, name path = boundary] (0,0) circle [radius=5];  \n \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (1b) at (-1,2) {}; \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (2b) at (1,3.5) {}; \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (3b) at (2.25,0) {}; \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (4b) at (2.375,-3) {}; \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (5b) at (-1,-2) {}; \n\\node [circle, draw=black, fill=black, inner sep=0pt, minimum size=5pt, opacity=\\op] (6b) at (-3.5,-1) {}; \n\n\\node [circle, draw=black, fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (1w) at (-1,3.5) {};\n\\node [circle, draw=black, fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (2w) at (1,2) {};\n\\node [circle, draw=black, fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (3w) at (3.5,-1) {};\n\\node [circle, draw=black, 
fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (4w) at (1,-2) {};\n\\node [circle, draw=black, fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (5w) at (-2.375,-3) {};\n\\node [circle, draw=black, fill=white, inner sep=0pt, minimum size=5pt, opacity=\\op] (6w) at (-2.25,0) {};\n\n\n\\draw[color=gray, opacity=\\op] (1b) -- (2w) ;\n\\draw[color=gray, opacity=\\op] (2w) -- (3b) ;\n\\draw[color=gray, opacity=\\op] (3b) -- (4w) ;\n\\draw[color=gray, opacity=\\op] (4w) -- (5b) ;\n\\draw[color=gray, opacity=\\op] (5b) -- (6w) ;\n\\draw[color=gray, opacity=\\op] (6w) -- (1b) ;\n\n\\draw[color=gray, opacity=\\op] (1b) -- (1w) ;\n\\draw[color=gray, opacity=\\op] (1w) -- (2b) ;\n\\draw[color=gray, opacity=\\op] (2b) -- (2w) ;\n\\draw[color=gray, opacity=\\op] (3b) -- (3w);\n\\draw[color=gray, opacity=\\op] (3w) -- (4b);\n\\draw[color=gray, opacity=\\op] (4b) -- (4w);\n\\draw[color=gray, opacity=\\op] (6w) -- (6b);\n\\draw[color=gray, opacity=\\op] (6b) -- (5w);\n\\draw[color=gray, opacity=\\op] (5w) -- (5b);\n\n\\path [name path = 1p] (1w) -- (-1,5);\n\\path [name intersections={of=boundary and 1p, by = 1e}];\n\\path [name path = 2p] (2w) -- (1,5);\n\\path [name intersections={of=boundary and 2p, by = 2e}];\n\\path [name path = 3p] (3b) --++ (2.5,-2);\n\\path [name intersections={of=boundary and 3p, by = 3e}];\n\\path [name path = 4p] (4b) --++ (1.25,-1);\n\\path [name intersections={of=boundary and 4p, by = 4e}];\n\\path [name path = 5p] (5w) --++ (-1.25,-1);\n\\path [name intersections={of=boundary and 5p, by = 5e}];\n\\path [name path = 6p] (6b) --++ (-1.25,-1);\n\\path [name intersections={of=boundary and 6p, by = 6e}];\n\n\n\\draw[opacity=\\op] (1w) -- (1e) node [pos=1, above,opacity=1] {$1$};\n\\draw[opacity=\\op] (2b) -- (2e) node [pos=1, above,opacity=1] {$2$};\n\\draw[opacity=\\op] (3w) -- (3e) node [pos=1, right,opacity=1] {$3$};\n\\draw[opacity=\\op] (4b) -- (4e) node [pos=1, right,opacity=1] {$4$};\n\\draw[opacity=\\op] (5w) -- (5e) node 
[pos=1, left,opacity=1] {$5$};\n\\draw[opacity=\\op] (6b) -- (6e) node [pos=1, left,opacity=1] {$6$};\n\n\n\n\\node (246) at (0,0) {\\footnotesize $246$};\n\\node (256) at (0,2.75) {\\footnotesize $256$};\n\\node (156) at (0,4.25) {\\footnotesize $156$};\n\\node (126) at (3,1.75) {\\footnotesize $126$};\n\\node (124) at (2.3,-1.5) {\\footnotesize $124$};\n\\node (123) at (3.8,-2.35) {\\footnotesize $123$};\n\\node (234) at (0,-4) { \\footnotesize $234$};\n\\node (346) at (-2.3,-1.5) {\\footnotesize $346$};\n\\node (345) at (-3.8,-2.35) {\\footnotesize $345$};\n\\node (456) at (-3,1.75) {\\footnotesize $456$};\n\n\\draw [->] (246) -- (456);\n\\draw [->] (246) -- (126);\n\\draw [->] (246) -- (234);\n\n\\draw [->] (256) -- (246);\n\\draw [->] (124) -- (246);\n\\draw [->] (346) -- (246);\n\n\\draw [->] (256) -- (156);\n\n\\draw [->]  (2.5,-1.8)  -- (3.35,-2.25) ;\n\n\\draw [->] (-2.5,-1.8)  -- (-3.35,-2.25) ;\n\n\\draw [->] (456) -- (346);\n\\draw [->] (234) -- (346);\n\\draw [->] (456) -- (256);\n\\draw [->] (126) -- (256);\n\\draw [->] (126) -- (124);\n\\draw [->] (234) -- (124);\n\n\n\\end{scope}\n\n\\end{tikzpicture}\n\n\\tikzexternalenable\n\n\n    \\caption{A plabic graph $G$ for $\\Grass_3\\lrp{\\C^6}$ for which $\\Delta_{\\val_{G}}(D_{3},p_{123})$ has non-integral vertex (see \\cite[\\S9]{RW}). On the left, the labels are in terms of Young diagrams. 
On the right, we display the quiver and label faces by Pl\\\"ucker coordinates.}\n    \\label{fig:3-6}\n\\end{figure}\n\n\n\n\nRecall that the Young diagrams in Figure~\\ref{fig:3-6} label the network parameters used in flow polynomial expressions (see \\cite[Equation (6.3)]{RW}).\nThe $\\cA$-cluster determined by trips in the plabic graph $G$ consists of the Pl\\\"ucker coordinates whose indices are given in Figure~\\ref{fig:3-6} (see \\cite[Definition 3.5]{RW}).\n\n\n\nAccording to \\cite[\\S9]{RW}, a non-integral vertex in the Newton--Okounkov polytope comes from half the valuation of the flow polynomial for the element $ f = (p_{124} p_{356} - p_{123} p_{456}) / {p_{123}^2}$.\nThey compute $\\frac{1}{2}\\val_G(f)$ and express its entries in tabular form (see \\cite[Table~3]{RW}) as we have reproduced in Table~\\ref{table}.\nThe function $p_{123}^2\\, f$ is one of the two $\\cA$ cluster variables that are not Pl\\\"ucker coordinates, see {\\it e.g.} \\cite[Eq.(4), p.42]{Sco06}. \nIt is obtained by mutation at $\\Yng(2,1)$.\n\n\\begin{table}\n\\begin{center}\n\\captionsetup{type=table}\n\\begin{tabular}{|C|C|C|C|C|C|C|C|C|} \n\\hline\n  \\vphantom{\\Yng(1,1,1,1)} \\Yng(3,3,3)  & \\Yng(3,3,2)  & \\Yng(2,2,2)  & \\Yng(1,1,1)  & \\Yng(3,3)  & \\Yng(2,1)  & \\Yng(1,1)  & \\Yng(3)  & \\Yng(2)\\\\\n\\hline\n     \\vphantom{\\Yng(1,1,1)_{\\Yng(1,1,1)}}\\frac{3}{2} &     \\frac{3}{2} & 1 &   \\frac{1}{2} & 1 & \\frac{1}{2} &\\frac{1}{2} &\\frac{1}{2} &\\frac{1}{2} \\\\\n\\hline\n\\end{tabular} \n\\captionof{table}{The rational vertex $\\frac{1}{2}\\val_G(f)$\\label{table}}\n\\end{center}\n\\end{table}\n\n\n\nNote we can re-interpret the expression for $f$ as the expansion of a product of theta functions.\nAll Pl\\\"ucker coordinates are $\\cA$ cluster variables, and all $\\cA$ cluster monomials are theta functions.\nThen\n\\eqn{p_{124}\\, p_{356} = \np_{123}^2\\, f + p_{123}\\, p_{456},}\nand the right hand side is a sum of two theta functions. 
\nThis means there are only two balanced pairs of broken lines contributing to the product.\nWe will see that the pair with no bending corresponds to the summand $ p_{123}\\, p_{456}$,\nwhile the other involves a maximal bend at an initial wall.\n(Since the bend is at an initial wall, we are able to see the relevant broken line segment without constructing the consistent scattering diagram.)\n\nWe interpret the Rietsch--Williams valuation as being valued in $ \\Trop_{\\Z}((\\cX_{\\mathbf{1}})^{\\vee})$ (see Proposition~\\ref{prop:flow is gv for X}) and consider broken lines in the associated $\\cX$ cluster scattering diagram.\nThe choice of seed identifies $\\Trop_{\\Z}((\\cX_{\\mathbf{1}})^\\vee)$ with $ N^\\vee / {\\lra{(1,1,\\dots, 1)}}$ and we draw the scattering diagram in $\\lrp{ N^\\vee / {\\lra{(1,1,\\dots, 1)}}} \\otimes \\R$.\n\n\nWe will use Figure \\ref{fig:3-6} to define the fixed data and the seed data for the cluster structure. \nThe initial scattering diagram for the $\\Xnet$ variety is \n\\[ \\mathfrak{D}^{\\cX}_{\\text{in},\\seed_G}= \\lrc{   \\left( (v_{\\mu})^{\\perp}, \\  1+ z^{e_{\\mu}}\\right) \\mid \\mu \\in \\{ \\Yng(2), \\Yng(2,1), \\Yng(3,3,2), \\Yng(1,1) \\}}.\\]\nTo get the initial scattering diagram for the fibre over $\\mathbf{1}$ we take the quotient of the support of $\\mathfrak{D}^{\\Xnet}_{\\text{in},\\seed_G}$ by $\\left(\\R \\cdot (1,1,\\dots, 1) \\right)$. (Observe that $(v_{\\mu})^{\\perp}$ is invariant under translations by $\\R \\cdot (1,1,\\dots, 1)$.) \n\n\\[ \\mathfrak{D}^{\\cX_{\\mathbf{1}}}_{\\text{in},\\seed_G}= \\lrc{   \\left( (v_{\\mu})^{\\perp}/ \\left(\\R \\cdot (1,1,\\dots, 1) \\right) , \\  1+ z^{e_{\\mu}}\\right) \\mid \\mu \\in \\{ \\Yng(2), \\Yng(2,1), \\Yng(3,3,2), \\Yng(1,1) \\}}.\\]\n\nAll pertinent valuations may be found in \\cite[Table 3]{bossinger2019full}.\nWe record them here using the ordering of Table~\\ref{table}. 
We choose the representative whose coefficient of $e_\\varnothing$ is $0$ and do not record this entry.\n\\eqn{\\val_{G}({p}_{124}) = (1,0,0,0,0,0,0,0,0)}\n\\eqn{\\val_{G}({p}_{356}) = (2,2,2,1,2,1,1,1,1)}\n\\eqn{\\val_{G}({p}_{123}) = (0,0,0,0,0,0,0,0,0) }\n\\eqn{\\val_{G}({p}_{456}) = (3,2,2,1,2,1,1,1,1) }\nSo, we have $\\val_{G}({p}_{124}) + \\val_G( p_{356} ) = \\val_{G}( p_{123} \\, p_{456}  )$.\nThe summand $p_{123} \\, p_{456} $ in the product $p_{124} \\, p_{356}$ corresponds to the straight broken line segment from $ \\val_{G}({p}_{356})$ to $\\val_G( p_{124} )$, whose midpoint is $\\frac{1}{2}  \\val_{G}( p_{123} \\, p_{456}  )$. \n\nThe bending wall for the other broken line segment is $\\lrp{(v_{\\Yngs(3,3,2)})^{\\perp}, 1+z^{e_{\\Yngs(3,3,2)}}}$.\nNote that $v_{\\Yngs(3,3,2)}= f_{\\Yngs(3,3,3)} + f_{\\Yngs(2,1)}-f_{\\Yngs(3,3)} - f_{\\Yngs(2,2,2)}$, and $\\frac{1}{2}\\val(f) = \\frac{1}{2}\\val(p_{123}^2\\, f)$ is perpendicular to this vector.\nSo, $\\frac{1}{2}\\val(p_{123}^2\\, f)$ lies in the support of this wall.\nWe will see that there is a broken line segment from $\\val_{G}(p_{356})$ to $\\val_{G}(p_{124})$ passing through $\\frac{1}{2}\\val(f)$ and bending maximally here, as depicted in Figure~\\ref{fig:maxbend}.\n\n\\begin{figure}[ht]\n \\centering\n    \\begin{tikzpicture}\n    \\draw[->] (-2,0) -- (1,0) node[anchor=west]{$\\lrp{(v_{\\Yngs(3,3,2)})^{\\perp}, 1+z^{e_{\\Yngs(3,3,2)}}}$};\n    \\filldraw [gray] (0,0) circle (2pt);\n    \\filldraw [purple] (-1.5,0) circle (1pt) node[anchor=south east]{$\\frac{1}{2}\\val_G(f)$};\n    \\draw[purple] (-1.5,0) -- node[anchor=south east]{$ \\ell_2$} (0,1);\n    \\filldraw [purple]  (0,1) circle (1pt) node[anchor=south]{$\\val_G(p_{124})$};\n    \\draw[purple] (-1.5,0) --node[anchor=north east]{$\\ell_1$} (0,-1);\n    \\filldraw [purple]  (0,-1) circle (1pt) node[anchor=north]{$\\val_G(p_{356})$};\n    \\end{tikzpicture}\n \\caption{Rational point obtained from broken line bending maximally at an 
initial wall. In this broken line segment, $\\ell_1$ and $\\ell_2$ will take equal time, corresponding to the summand $p_{123}^2\\, f$ in the product $p_{124} \\, p_{356}$.\n\\label{fig:maxbend}} \n\\end{figure}\n\n\nRecall that the exponent vector of the decoration monomial along $\\ell_i$ is the negative of the velocity vector there. \nTraveling along $\\ell_1$, this velocity vector is positively proportional to \n\\[\\frac{1}{2}\\val_G(f) - \\val_G(p_{356}) = \n-\\left(\\frac{1}{2}, \\frac{1}{2}, 1, \\frac{1}{2},  1, \\frac{1}{2} , \\frac{1}{2}, \\frac{1}{2}, \\frac{1}{2}\\right).\\] \nWe can take a broken line with exponent vector $ v_1=(1,1,2,1,2,1,1,1,1) $ along $\\ell_1$.\nThe possible bendings after crossing the wall correspond to summands of\n\\eqn{z^{v_1}\\lrp{1+z^{e_{\\Yngs(3,3,2)}}}^{-\\lra{v_1,v_{\\Yngs(3,3,2)}}}=z^{v_1}\\lrp{1+z^{e_{\\Yngs(3,3,2)}}}^{2}.}\nThe maximal bending corresponds to the summand $z^{v_1+ 2 e_{\\Yngs(3,3,2)}} = z^{(1,3,2,1,2,1,1,1,1)}$. Let us call $v_2 := (1,3,2,1,2,1,1,1,1)$.\nThen observe that $\\frac{1}{2}\\val_G(f) - \\frac{1}{2}v_2 = \\val_G(p_{124})$. \nSo, we have a broken line segment $\\gamma$ traveling from $\\val_G(p_{356})$ to $\\frac{1}{2}\\val_G(f)$ with decoration monomial $z^{v_1}$, bending maximally and continuing to $\\val_G(p_{124})$ with decoration monomial $z^{v_2}$. 
Precisely $\\frac{1}{2}$ a unit of time is spent in each straight segment.\nFrom this perspective, $\\frac{1}{2}\\val_G(f)$ is not a genuine vertex; $\\frac{1}{2}\\val_G(f)$ is in the relative interior of the support of $\\gamma$, and the endpoints of $\\gamma$ are in the Newton--Okounkov body.\n\n\\footnotesize\n\\begin{thebibliography}{BFMMNC20}\n\n\\bibitem[AB22]{AB22}\nH\\\"ulya Arg\\\"uz and Pierrick Bousseau.\n\\newblock Fock-{G}oncharov dual cluster varieties and {G}ross-{S}iebert\n  mirrors.\n\\newblock {\\em arXiv:2206.10584[math.AG]}, 2022.\n\n\\bibitem[ADHL15]{ADHL}\nIvan Arzhantsev, Uwe Derenthal, Jürgen Hausen, and Antonio Laface.\n\\newblock {\\em Cox rings}, volume 144 of {\\em Cambridge Studies in Advanced\n  Mathematics}.\n\\newblock Cambridge University Press, Cambridge, 2015.\n\n\\bibitem[And13]{An13}\nDavid Anderson.\n\\newblock Okounkov bodies and toric degenerations.\n\\newblock {\\em Mathematische Annalen}, 356(3):1183--1202, 2013.\n\n\\bibitem[{Bau}03]{Baur_CartanComp}\nKarin {Baur}.\n\\newblock {Cartan components and decomposable tensors}.\n\\newblock {\\em {Transform. Groups}}, 8(4):309--319, 2003.\n\n\\bibitem[BECHL21]{Mandy_rank2_MS}\nSam Bardwell-Evans, Man-Wai Cheung, Hansol Hong, and Yu-Shen Lin.\n\\newblock Scattering diagrams from holomorphic discs in log {C}alabi-{Y}au\n  surfaces.\n\\newblock {\\em arXiv:2110.15234 [math.SG]}, 2021.\n\n\\bibitem[BF19]{BF}\nLara Bossinger and Ghislain Fourier.\n\\newblock String cone and superpotential combinatorics for flag and {S}chubert\n  varieties in type {A}.\n\\newblock {\\em J. Combin. Theory Ser. A}, 167:213--256, 2019.\n\n\\bibitem[BFMMNC20]{BFMNC}\nLara Bossinger, Bosco Fr\\'{\\i}as-Medina, Timothy Magee, and Alfredo\n  N\\'{a}jera~Ch\\'{a}vez.\n\\newblock Toric degenerations of cluster varieties and cluster duality.\n\\newblock {\\em Compos. 
Math.}, 156(10):2149--2206, 2020.\n\n\\bibitem[BFZ05]{BFZ05}\nArakady Berenstein, Sergey Fomin, and Andrei Zelevinsky.\n\\newblock Cluster algebras. {III}. {U}pper bounds and double {B}ruhat cells.\n\\newblock {\\em Duke Math. J.}, 126(1):1--52, 2005.\n\n\\bibitem[BH03]{BH03}\nFlorian Berchtold and Jürgen Hausen.\n\\newblock Homogeneous coordinates for algebraic varieties.\n\\newblock {\\em J. Algebra}, 266(2):636--670, 2003.\n\n\\bibitem[BKM16]{BKM16}\nKarin Baur, Alastair~D. King, and Bethany~R. Marsh.\n\\newblock Dimer models and cluster categories of {G}rassmannians.\n\\newblock {\\em Proc. Lond. Math. Soc. (3)}, 113(2):213--260, 2016.\n\n\\bibitem[BMNC21]{BMNC}\nLara Bossinger, Fatemeh Mohammadi, and Alfredo N\\'{a}jera~Ch\\'{a}vez.\n\\newblock Families of {G}r\\\"{o}bner degenerations, {G}rassmannians and\n  universal cluster algebras.\n\\newblock {\\em SIGMA Symmetry Integrability Geom. Methods Appl.}, 17:Paper No.\n  059, 46, 2021.\n\n\\bibitem[Bos21]{bossinger2019full}\nLara Bossinger.\n\\newblock Full-rank valuations and toric initial ideals.\n\\newblock {\\em Int. Math. Res. Not. IMRN}, (10):7433--7469, 2021.\n\n\\bibitem[Bos23]{B-toric}\nLara Bossinger.\n\\newblock A survey on toric degenerations of projective varieties.\n\\newblock {\\em arXiv preprint arXiv:2301.02545 [math.AG] (accepted for\n  publication in the Proceedings of the Nottingham Algebraic Geometry\n  Seminar)}, 2023.\n\n\\bibitem[BZ01]{BZ01}\nArakady Berenstein and Andrei Zelevinsky.\n\\newblock Tensor product multiplicities, canonical bases and totally positive\n  varieties.\n\\newblock {\\em Inventiones mathematicae}, 143(1):77--128, 2001.\n\n\\bibitem[CHM22]{CHM22}\nOliver Clarke, Akihiro Higashitani, and Fatemeh Mohammadi.\n\\newblock Combinatorial mutations and block diagonal polytopes.\n\\newblock {\\em Collect. 
Math.}, 73(2):305--335, 2022.\n\n\\bibitem[CILFS15]{Labardini_et_al_CC-alg}\nGiovanni Cerulli~Irelli, Daniel Labardini-Fragoso, and Jan Schr\\\"{o}er.\n\\newblock Caldero-{C}hapoton algebras.\n\\newblock {\\em Trans. Amer. Math. Soc.}, 367(4):2787--2822, 2015.\n\n\\bibitem[CMN22]{CMNcpt}\nMan-Wai {Cheung}, Timothy {Magee}, and Alfredo {N\\'ajera Ch\\'avez}.\n\\newblock Compactifications of cluster varieties and convexity.\n\\newblock {\\em Int. Math. Res. Not. IMRN}, 2022:10858--10911, 2022.\n\n\\bibitem[EH20]{EH20}\nLaura Escobar and Megumi Harada.\n\\newblock {Wall-crossing for Newton--Okounkov bodies and the tropical\n  Grassmannian}.\n\\newblock {\\em International Mathematics Research Notices}, 09 2020.\n\\newblock rnaa230.\n\n\\bibitem[FFL17]{FFL15}\nXin Fang, Ghislain Fourier, and Peter Littelmann.\n\\newblock Essential bases and toric degenerations arising from birational\n  sequences.\n\\newblock {\\em Adv. Math.}, 312:107--149, 2017.\n\n\\bibitem[FG06]{FG_Teich}\nVladimir Fock and Alexander Goncharov.\n\\newblock Moduli spaces of local systems and higher {T}eichm\\\"uller theory.\n\\newblock {\\em Publ. Math. Inst. Hautes \\'Etudes Sci.}, 103:1--211, 2006.\n\n\\bibitem[FG09]{FG_cluster_ensembles}\nVladimir~V. Fock and Alexander~B. Goncharov.\n\\newblock Cluster ensembles, quantization and the dilogarithm.\n\\newblock {\\em Ann. Sci. \\'Ec. Norm. Sup\\'er. (4)}, 42(6):865--930, 2009.\n\n\\bibitem[FH21]{FH21}\nNaoki Fujita and Akihiro Higashitani.\n\\newblock Newton--{O}kounkov bodies of flag varieties and combinatorial\n  mutations.\n\\newblock {\\em Int. Math. Res. Not. IMRN}, 12:9567--9607, 2021.\n\n\\bibitem[FO20]{FO20}\nNaoki Fujita and Hironori Oya.\n\\newblock Newton--{O}kounkov polytopes of {S}chubert varieties arising from\n  cluster structures.\n\\newblock {\\em arXiv preprint arXiv:2002.09912v1 [math.RT]}, 2020.\n\n\\bibitem[FZ02]{FZ_clustersI}\nSergey Fomin and Andrei Zelevinsky.\n\\newblock Cluster algebras. {I}. 
{F}oundations.\n\\newblock {\\em J. Amer. Math. Soc.}, 15(2):497--529, 2002.\n\n\\bibitem[FZ07]{FZ_clustersIV}\nSergey Fomin and Andrei Zelevinsky.\n\\newblock Cluster algebras. {IV}. {C}oefficients.\n\\newblock {\\em Compos. Math.}, 143(1):112--164, 2007.\n\n\\bibitem[GHK15a]{GHK_birational}\nMark Gross, Paul Hacking, and Sean Keel.\n\\newblock Birational geometry of cluster algebras.\n\\newblock {\\em Algebr. Geom.}, 2(2):137--175, 2015.\n\n\\bibitem[GHK15b]{GHK_logCY}\nMark Gross, Paul Hacking, and Sean Keel.\n\\newblock Mirror symmetry for log {C}alabi-{Y}au surfaces {I}.\n\\newblock {\\em Publ. Math. Inst. Hautes \\'Etudes Sci.}, 122:65--168, 2015.\n\n\\bibitem[GHKK18]{GHKK}\nMark Gross, Paul Hacking, Sean Keel, and Maxim Kontsevich.\n\\newblock Canonical bases for cluster algebras.\n\\newblock {\\em J. Amer. Math. Soc.}, 31(2):497--608, 2018.\n\n\\bibitem[GKS20]{GKS_polyhedral}\nVolker Genz, Gleb Koshevoy, and Bea Schumann.\n\\newblock Polyhedral parametrizations of canonical bases \\& cluster duality.\n\\newblock {\\em Adv. Math.}, 369:107178, 41, 2020.\n\n\\bibitem[GKS21]{GKS_typeA}\nVolker Genz, Gleb Koshevoy, and Bea Schumann.\n\\newblock Combinatorics of canonical bases revisited: type {A}.\n\\newblock {\\em Selecta Math. (N.S.)}, 27(4):Paper No. 67, 45, 2021.\n\n\\bibitem[GKS22]{GKS_string}\nVolker Genz, Gleb Koshevoy, and Bea Schumann.\n\\newblock Combinatorics of canonical bases revisited: string data in type\n  {$A$}.\n\\newblock {\\em Transform. Groups}, 27(3):867--895, 2022.\n\n\\bibitem[GS16]{GS12}\nMark Gross and Bernd Siebert.\n\\newblock Theta functions and mirror symmetry.\n\\newblock In {\\em Surveys in differential geometry 2016. {A}dvances in geometry\n  and mathematical physics}, volume~21 of {\\em Surv. Differ. Geom.}, pages\n  95--138. Int. Press, Somerville, MA, 2016.\n\n\\bibitem[GS22]{GS22}\nMark Gross and Bernd Siebert.\n\\newblock The canonical wall structure and intrinsic mirror symmetry.\n\\newblock {\\em Invent. 
Math.}, 229(3):1101--1202, 2022.\n\n\\bibitem[Hau02]{Hau02}\nJ\\\"{u}rgen Hausen.\n\\newblock Equivariant embeddings into smooth toric varieties.\n\\newblock {\\em Canad. J. Math.}, 54(3):554--570, 2002.\n\n\\bibitem[HK00]{HK00}\nYi~Hu and Sean Keel.\n\\newblock Mori dream spaces and {GIT}.\n\\newblock {\\em Michigan Math. J.}, 48:331--348, 2000.\n\\newblock Dedicated to William Fulton on the occasion of his 60th birthday.\n\n\\bibitem[HN23]{HN23}\nAkihiro Higashitani and Yusuke Nakajima.\n\\newblock Combinatorial mutations of {N}ewton--{O}kounkov polytopes arising\n  from plabic graphs.\n\\newblock {\\em Advanced Studies in Pure Mathematics}, the proceeding of the\n  conference \"The McKay correspondence, mutation and related topics\", to\n  appear., 2023.\n\n\\bibitem[{Iit}77]{Iitaka}\nShigeru {Iitaka}.\n\\newblock On logarithmic {K}odaira dimension of algebraic varieties.\n\\newblock In Walter L.~{Baily} Jr. and Tetsuji {Shioda}, editors, {\\em Complex\n  Analysis and Algebraic Geometry: A Collection of Papers Dedicated to K.\n  Kodaira}, pages 175--190. Cambridge University Press, 1977.\n\n\\bibitem[JKS16]{JKS16}\nBernt~Tore Jensen, Alastair~D. King, and Xiuping Su.\n\\newblock A categorification of {G}rassmannian cluster algebras.\n\\newblock {\\em Proc. Lond. Math. Soc. (3)}, 113(2):185--212, 2016.\n\n\\bibitem[KK12]{KK12}\nKiumars Kaveh and A.~G. Khovanskii.\n\\newblock Newton-{O}kounkov bodies, semigroups of integral points, graded\n  algebras and intersection theory.\n\\newblock {\\em Ann. of Math. (2)}, 176(2):925--978, 2012.\n\n\\bibitem[KLM12]{KLM_NObodies_spherical}\nAlex K\\\"{u}ronya, Victor Lozovanu, and Catriona Maclean.\n\\newblock Convex bodies appearing as {O}kounkov bodies of divisors.\n\\newblock {\\em Adv. Math.}, 229(5):2622--2639, 2012.\n\n\\bibitem[KM19]{KM_Khovanskii_bases}\nKiumars Kaveh and Christopher Manon.\n\\newblock Khovanskii bases, higher rank valuations, and tropical geometry.\n\\newblock {\\em SIAM J. Appl. 
Algebra Geom.}, 3(2):292--336, 2019.\n\n\\bibitem[KY23]{KY19}\nSean {Keel} and Tony~Yue {Yu}.\n\\newblock The {F}robenius structure theorem for affine log-{C}alabi-{Y}au\n  varities containing a torus.\n\\newblock {\\em arXiv preprint arXiv:1908.09861.v2 [math.AG], to appear in\n  Annals of Mathematics}, 2023.\n\n\\bibitem[Lit98]{Lit98}\nPeter Littelmann.\n\\newblock Cones, crystals, and patterns.\n\\newblock {\\em Transformation groups}, 3(2):145--179, 1998.\n\n\\bibitem[LM09]{LM09}\nRobert Lazarsfeld and Mircea Musta\\c{t}\\u{a}.\n\\newblock Convex bodies associated to linear series.\n\\newblock {\\em Ann. Sci. \\'Ec. Norm. Sup\\'er. (4)}, 42(5):783--835, 2009.\n\n\\bibitem[Mag15]{Mag15}\nTimothy Magee.\n\\newblock Fock-{G}oncharov conjecture and polyhedral cones for ${U}\\subset\n  {S}{L}_n$ and base affine space ${S}{L}_n/{U}$.\n\\newblock {\\em arXiv preprint arXiv:1502.03769 [math.AG]}, 2015.\n\n\\bibitem[Mag20]{Mag20}\nTimothy Magee.\n\\newblock Littlewood-{R}ichardson coefficients via mirror symmetry for cluster\n  varieties.\n\\newblock {\\em Proc. Lond. Math. Soc. (3)}, 121(3):463--512, 2020.\n\n\\bibitem[Man16]{Man16}\nTravis Mandel.\n\\newblock Tropical theta functions and log {C}alabi-{Y}au surfaces.\n\\newblock {\\em Selecta Math. (N.S.)}, 22(3):1289--1335, 2016.\n\n\\bibitem[Man19]{Man19}\nTravis Mandel.\n\\newblock Cluster algebras are {C}ox rings.\n\\newblock {\\em Manuscripta Math.}, 160(1-2):153--171, 2019.\n\n\\bibitem[{Mel}23]{ML23}\nCarolina {Melo}.\n\\newblock Ph.{D}. thesis.\n\\newblock {\\em in preparation}, 2023.\n\n\\bibitem[MS16]{MS16}\nBethany~R. Marsh and Jeanne Scott.\n\\newblock Twists of {P}l\\\"{u}cker coordinates as dimer partition functions.\n\\newblock {\\em Comm. Math. Phys.}, 341(3):821--884, 2016.\n\n\\bibitem[MS17]{MullSp}\nGreg Muller and David~E. Speyer.\n\\newblock The twist for positroid varieties.\n\\newblock {\\em Proc. Lond. Math. Soc. 
(3)}, 115(5):1014--1071, 2017.\n\n\\bibitem[NZ12]{NZ}\nTomoki Nakanishi and A.~Zelevinsky.\n\\newblock On tropical dualities in cluster algebras.\n\\newblock In {\\em Algebraic groups and quantum groups}, volume 565 of {\\em\n  Contemp. Math.}, pages 217--226. Amer. Math. Soc., Providence, RI, 2012.\n\n\\bibitem[Oko96]{Oko96}\nAndrei Okounkov.\n\\newblock Brunn-{M}inkowski inequality for multiplicities.\n\\newblock {\\em Invent. Math.}, 125(3):405--411, 1996.\n\n\\bibitem[Oko03]{Oko03}\nAndrei Okounkov.\n\\newblock Why would multiplicities be log-concave?\n\\newblock In {\\em The orbit method in geometry and physics ({M}arseille,\n  2000)}, volume 213 of {\\em Progr. Math.}, pages 329--347. Birkh\\\"{a}user\n  Boston, Boston, MA, 2003.\n\n\\bibitem[Pos06]{Pos06}\nAlexander Postnikov.\n\\newblock Total positivity, {G}rassmannians, and networks.\n\\newblock {\\em arXiv preprint arXiv:math/0609764 [math.CO]}, 2006.\n\n\\bibitem[Qin17]{Qin17}\nFan Qin.\n\\newblock Triangular bases in quantum cluster algebras and monoidal\n  categorification conjectures.\n\\newblock {\\em Duke Math. J.}, 166(12):2337--2442, 2017.\n\n\\bibitem[Qin22]{Qintropical}\nFan Qin.\n\\newblock Bases for upper cluster algerbas and tropical points.\n\\newblock {\\em Journal of the European Mathematical Society (2022),\n  DOI:10.4171/JEMS/1308.}, 2022.\n\n\\bibitem[RW19]{RW}\nKonstanze {Rietsch} and Lauren {Williams}.\n\\newblock Newton--{O}kounkov bodies, cluster duality, and mirror symmetry for\n  {G}rassmannians.\n\\newblock {\\em Duke Math. J.}, 168(18):3437--3527, 2019.\n\n\\bibitem[Sco06]{Sco06}\nJeanne Scott.\n\\newblock Grassmannians and cluster algebras.\n\\newblock {\\em Proc. London Math. Soc. (3)}, 92(2):345--380, 2006.\n\n\\bibitem[SW20]{SW18}\nLinhui Shen and Daping Weng.\n\\newblock Cyclic sieving and cluster duality of {G}rassmannian.\n\\newblock {\\em SIGMA Symmetry Integrability Geom. Methods Appl.}, 16:067, 41\n  pages, 2020.\n\n\\end{thebibliography}\n\n\n\\end{document}\n"
}
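
When reading a raw record like the one above, keep in mind that the LaTeX source is stored as a JSON string, so every backslash appears doubled and every line break appears as the two characters `\n`. The following is a minimal Python sketch of how such a record decodes back into ordinary LaTeX. It assumes the source sits under a `text` key, which is the key used in the extracted record; the key name for the raw record and the miniature snippet are illustrative assumptions, not the dataset's documented schema.

```python
import json

# Hypothetical miniature record: the "text" key and this short snippet are
# illustrative assumptions, not copied from the dataset schema.
raw_line = r'{"text": "\\begin{table}\n\\hline\n$\\frac{1}{2}\\val_G(f)$\n\\end{table}\n"}'

# json.loads undoes the JSON escaping: a doubled backslash in the file becomes
# a single backslash, and the two-character sequence backslash-n becomes a
# real newline.
record = json.loads(raw_line)
latex_source = record["text"]

# The decoded string is plain LaTeX again and can be split into source lines.
lines = latex_source.splitlines()
for line in lines:
    print(line)
```

The same decoding applies to the extracted record, where the `text` value holds the converted prose and math rather than raw LaTeX.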

Extracted format

{
    "text": "abstract: Let $Y$ be a (partial) minimal model of a scheme $V$ with a\n  cluster structure (of type $\\mathcal{A}$, $\\mathcal{X}$ or of a\n  quotient of $\\mathcal{A}$ or a fibre of $\\mathcal{X}$). Under natural\n  assumptions, for every choice of seed we associate a Newton–Okounkov\n  body to every divisor on $Y$ supported on $Y \\setminus V$ and show\n  that these Newton–Okounkov bodies are positive sets in the sense of\n  Gross, Hacking, Keel and Kontsevich . This construction essentially\n  reverses the procedure in loc. cit. that generalizes the polytope\n  construction of a toric variety to the framework of cluster varieties.\n  .\n  In a closely related setting, we consider cases where $Y$ is a\n  projective variety whose universal torsor $\\text{UT} _Y$ is a partial\n  minimal model of a scheme with a cluster structure of type\n  $\\mathcal{A}$. If the theta functions parametrized by the integral\n  points of the associated superpotential cone form a basis of the ring\n  of algebraic functions on $\\text{UT} _Y$ and the action of the torus\n  $T_{\\text{Pic}(Y)^*}$ on $\\text{UT} _Y$ is compatible with the cluster\n  structure, then for every choice of seed we associate a\n  Newton–Okounkov body to every line bundle on $Y$. We prove that any\n  such Newton–Okounkov body is a positive set and that $Y$ is a minimal\n  model of a quotient of a cluster $\\mathcal{A}$-variety by the action\n  of a torus.\n  .\n  Our constructions lead to the notion of the intrinsic Newton–Okounkov\n  body associated to a boundary divisor in a partial minimal model of a\n  scheme with a cluster structure. This notion is intrinsic as it relies\n  only on the geometric input, making no reference to the auxiliary data\n  of a valuation or a choice of seed. The intrinsic Newton–Okounkov body\n  lives in a real tropical space rather than a real vector space. 
A\n  choice of seed gives an identification of this tropical space with a\n  vector space, and in turn of the intrinsic Newton–Okounkov body with a\n  usual Newton–Okounkov body associated to the choice of seed. In\n  particular, the Newton–Okounkov bodies associated to seeds are related\n  to each other by tropicalized cluster transformations providing a wide\n  class of examples of Newton-Okoukov bodies exhibiting a wall-crossing\n  phenomenon in the sense of Escobar–Harada .\n  .\n  This approach includes the partial flag varieties that arise as\n  minimal models of cluster varieties (for example full flag varieties\n  and Grassmannians). For the case of Grassmannians, our approach\n  recovers, up to interesting unimodular equivalences, the\n  Newton–Okounkov bodies constructed by Rietsch–Williams in .\naddress: Instituto de Matemáticas Unidad Oaxaca, Universidad Nacional\nAutónoma de México, León 2, altos, Centro Histórico, 68000 Oaxaca,\nMexico;  School of Mathematics, Kavli IPMU (WPI), UTIAS, The University\nof Tokyo, Kashiwa, Japan, 277-8583; Department of Mathematics, King’s\nCollege London, Strand, London WC2R 2LS, UK;  Consejo Nacional de\nCiencia y Tecnología - Instituto de Matemáticas Unidad Oaxaca,\nUniversidad Nacional Autónoma de México, León 2, altos, Centro\nHistórico, 68000 Oaxaca, Mexico\nauthor: Lara Bossinger, Man-Wai Cheung, Timothy Magee and Alfredo Nájera\nChávez\ndate: 2024-10-13\ntitle: Newton–Okounkov bodies and minimal models for cluster varieties\n\n# Introduction\n\n## Overview\n\nCluster varieties are certain schemes constructed by gluing a (possibly\ninfinite) collection of algebraic tori using distinguished birational\nmaps called cluster transformations. These schemes were introduced in\nand can be studied from many different points of view. They are closely\nrelated to cluster algebras and $Y$-patterns defined by Fomin and\nZelevinsky in . 
In this paper we approach them from the perspectives of\nbirational and toric geometry, mainly following . In , the authors show\nthat certain sets called *positive polytopes* can be used to produce\ncompactifications of cluster varieties and toric degenerations of such\ncompactifications. In the trivial case where the cluster variety in\nquestion is just a torus, a positive lattice polytope is simply a usual\nconvex lattice polytope and this construction produces the toric variety\nassociated to such a polytope. One of the main goals of this paper is to\nreverse this construction in a systematic way and understand this\nprocess from the view-point of Newton–Okounkov bodies. We also study the\nwall-crossing phenomenon for Newton–Okounkov bodies arising from cluster\nstructures. We treat independently the case of the Grassmannians as, in\nthis context, we compare the Newton–Okounkov bodies we construct with\nthose constructed in and explore some consequences. Moreover, throughout\nthe text we systematically consider not only cluster varieties but also\nquotients and fibres associated to them (see § for the precise\ndefinitions of these quotients and fibres). For simplicity, in this\nintroduction our main focus is on cluster varieties. We fix once and for\nall an algebraically closed field $\\Bbbk$ of characteristic zero. Unless\notherwise stated, all the schemes we consider are over $\\Bbbk$.\n\n## The tropical spaces\n\nLet $\\mathcal{V}$ be a cluster variety. By definition, $\\mathcal{V}$ is\nendowed with an atlas of algebraic tori of the form\n$$\\mathcal{V} = \\bigcup_{\\textbf{s}} T_{L;\\textbf{s}},$$ where $L$ is a\nfixed lattice, $T_{L; \\textbf{s}}$ is a copy of the algebraic torus\n$T_L= \\mathop{\\mathrm{Spec}}(\\Bbbk[L^*])$ associated to $L$ (so\n$L^*=\\text{Hom}(L, \\mathbb{Z} )$) and the tori in the atlas are\nparametrized by *seeds $\\textbf{s}$ for* $\\mathcal{V}$. We will exploit\nthe fact that $\\mathcal{V}$ is a log-Calabi–Yau variety. 
This property\nimplies that $\\mathcal{V}$ is endowed with a canonical up-to-scaling\nvolume form $\\Omega$. Moreover, recall that a cluster variety is of one\nof the types: $\\mathcal{A}$ or $\\mathcal{X}$.\n\nJust like in toric geometry where one can consider the dual torus\n$T_L^{\\vee}:=T_{L^*}$, the *dual* of $\\mathcal{V}$ is a cluster variety\n$\\mathcal{V} ^{\\vee}$ whose defining atlas consists of tori of the form\n$T^\\vee_L$. It is well known that the ring\n$H^{0} (T_L,\\mathcal{O}_{T_{L}})$ of algebraic functions on $T_L$ has a\ndistinguished basis –the set of characters of $T_L$– parametrized by\n$L^*$. For nearly 10 years it was conjectured that this fact can be\ngeneralized for $\\mathcal{V}$ using this notion of duality. In order to\nstate such a generalization, we consider the integral tropicalization of\n$\\mathcal{V} ^{\\vee}$, which we denote by\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$. The precise\ndefinition of $\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ can be\nfound in §. For this introduction the key fact that we need is that a\nprime divisor $D$ on a variety birational to $\\mathcal{V} ^\\vee$\ndetermines a point of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ if $\\Omega$ has a\npole along $D$. In Fock–Goncharov conjectured that\n$H^{0}(\\mathcal{V} , \\mathcal{O}_\\mathcal{V} )$ has a canonical vector\nspace basis parametrized by\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$. Although false in\ngeneral, this conjecture does hold in many of the cases of wide\ninterest. In the authors linked this conjecture to the log Calabi–Yau\nmirror symmetry conjecture , suggesting that the canonical basis\nproposed by Fock–Goncharov is the *theta basis*. As we would like to be\nas close to toric geometry as possible we systematically assume that the\nfull Fock–Goncharov conjecture holds for the cluster variety\n$\\mathcal{V}$ under consideration. 
So, under under the assumption that\nthe full Fock–Goncharov conjecture holds for $\\mathcal{V}$, one may\nconsider $\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ as\nreplacing $L^*$ and the characters of $T_L$ are replaced by the theta\nfunctions on $\\mathcal{V}$. Moreover, the real vector space\n$L^*\\otimes \\mathbb{R}$ is replaced by the real tropicalization\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ and convex polyhedra\ninside $L^*\\otimes \\mathbb{R}$ are replaced by positive sets in the real\ntropical space\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})\\supset \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$\n(see § and Definition for the definitions of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ and of positive set,\nrespectively).\n\nBesides the trivial case where $\\mathcal{V}$ is just a torus (and hence\n$\\mathcal{V} ^{\\vee}$ is just the dual torus), the tropical spaces\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ and\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ do not possess a\nlinear structure (there is no natural notion of addition in these spaces\nand only multiplication by positive scalars makes sense). However, in\ncertain situations these tropical spaces do contain subsets where\naddition and scalar multiplication make sense, which we call *linear\nsubsets*. In any case, every choice of seed $\\textbf{s}^\\vee$ for\n$\\mathcal{V} ^{\\vee}$ gives rise to a bijection\n$\\mathfrak{r}_{\\textbf{s}^\\vee}:\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}) \\longrightarrow \\mathbb{R} ^d$\nthat restricts to a bijection\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}) \\overset{\\sim}{\\longrightarrow} \\mathbb{Z} ^d$,\nwhere $d$ is the dimension of both $\\mathcal{V}$ and\n$\\mathcal{V} ^\\vee$. In general, different seeds lead to different\nbijections. 
When we fix one such identification\n$\\mathfrak{r}_{\\textbf{s}^\\vee}$ and talk about linear subsets of\n$\\mathbb{Z} ^d$ and positive subsets of $\\mathbb{R} ^d$, what we mean is\nthat the inverse image of such a set under\n$\\mathfrak{r}_{\\textbf{s}^\\vee}$ has the given property.\n\n## Positive Newton–Okounkov bodies and minimal models\n\nNewton–Okounkov bodies are convex closed sets in real vector spaces.\nTheir systematic study was developed by Lazarsfeld–Mustaţă and\nKaveh–Khovanskii based on the work of Okounkov . This concept is a far\nreaching generalization of both the Newton polytope of a Laurent\npolynomial and the polytope of a polarized projective toric variety. In\nthe authors introduced Newton–Okounkov bodies for Cartier divisors on\nirreducible varieties. In this paper we consider Newton–Okounkov bodies\nassociated to Weil divisors in the setting of minimal models for cluster\nvarieties. More precisely, let $D$ be a Weil divisor on a\n$d$-dimensional normal variety $Y$ admitting a non-zero global section,\nthat is, the space $H^0(Y, \\mathcal{O}(D))$ is non-zero, where\n$\\mathcal{O}(D)$ is the coherent sheaf associated to $D$. The section\nring of $D$ is a graded ring\n$$R(D)=\\bigoplus_{k\\in \\mathbb{Z} _{\\geq 0}}{R}_k(D)$$ whose $k$-th\nhomogeneous component is the vector space\n$R_k(D)=H^0(Y, \\mathcal{O}(kD)) \\subset \\Bbbk (Y)$. 
Fix a non-zero\nelement $\\tau\\in R_1(D)$, and suppose we are given a total order on\n$\\mathbb{Z} ^d$ and a valuation $\\nu: \\Bbbk(Y)^* \\to \\mathbb{Z} ^d$.\nThen the Newton–Okounkov body associated to this data is:\n$$\\begin{split} \n\\Delta_\\nu(D,\\tau) := \\overline{\\mathop{\\mathrm{conv}}\\Bigg( \\bigcup_{k\\geq 1}  \\left\\{\\frac{\\nu\\left(f/\\tau^k\\right)}{k} \\mid f\\in R_k(D)\\setminus \\{0\\} \\right\\} \\Bigg) }\\subseteq \\mathbb{R} ^d.\n \\end{split}$$\n\nGiven a cluster variety $\\mathcal{V}$, our first goal is to use its\ncluster structure to construct Newton–Okounkov bodies associated to\ndivisors in compactifications of $\\mathcal{V}$, generalizing the\nconstruction of the polytope of a torus invariant divisors on a toric\nvariety. Hence, we need to establish the class of compactifications of\n$\\mathcal{V}$, the divisors therein and the valuations we consider.\n\nWe begin discussing valuations obtained from the cluster structure. In\ncase $\\mathcal{V}$ is a cluster $\\mathcal{A}$-variety, this is closely\nrelated to the work of Fujita and Oya . However, our approach includes\nthe cases where $\\mathcal{V}$ is a cluster $\\mathcal{X}$-variety, a\nquotient of a cluster $\\mathcal{A}$-variety, or a fibre of a cluster\n$\\mathcal{X}$-variety. In order to be able to use the cluster structure\nof $\\mathcal{V}$ to construct a valuation on $\\Bbbk(\\mathcal{V} )$\ncertain conditions (depending on whether $\\mathcal{V}$ is of type\n$\\mathcal{A}$ or of type $\\mathcal{X}$) need to be fulfilled. For\ninstances, if $\\mathcal{V}$ is of type $\\mathcal{A}$, a sufficient\ncondition is that the rectangular matrix $\\widetilde{B}$ determining the\ncluster structure of $\\mathcal{V}$ has full rank[^1]; if $\\mathcal{V}$\nis of type $\\mathcal{X}$ we need that the full Fock–Goncharov conjecture\nholds for $\\mathcal{X}$ (as we are assuming), see § for more details,\nincluding the cases of quotients of $\\mathcal{A}$ and fibres of\n$\\mathcal{X}$. 
In case the necessary conditions are satisfied then for\nevery $\\textbf{s}$ for $\\mathcal{V}$ we have a cluster valuation\n$$\\nu_\\textbf{s}: \\Bbbk(\\mathcal{V} ) \\setminus\\{0\\} \\to (\\mathbb{Z} ^d, <_{\\textbf{s}}).$$\nThe total order $<_{\\textbf{s}}$ on $\\mathbb{Z} ^d$ depends also on the\ntype of $\\mathcal{V}$. Moreover, in case $\\mathcal{V}$ is of type\n$\\mathcal{A}$ in the literature this valuation is generally denoted by\n$\\mathbf{g} _{\\textbf{s}}$ and called a $\\mathbf{g}$-*vector valuation*\nas it is closely related to the $\\mathbf{g}$-vectors associated to\ncluster monomials introduced in . In case $\\mathcal{V}$ is of type\n$\\mathcal{X}$ the associated cluster valuation has not been\nsystematically defined yet in the literature to the best of our\nknowledge. In this case we also denote $\\nu_{\\textbf{s}}$ by\n$\\mathbf{c} _{\\textbf{s}}$ and call it a $\\mathbf{c}$-*vector valuation*\nsince this valuation is closely related to the $\\mathbf{c}$-vectors\nassociated to $Y$-variables introduced in and more generally to\n**c**-vectors of theta functions on $\\mathcal{X}$ defined in , and\ncurrently investigated in . In any case, for every seed $\\textbf{s}$ the\ntheta basis of $H^0(\\mathcal{V} , \\mathcal{O}_{\\mathcal{V} })$ is\nadapted for the cluster valuation $\\nu_{\\textbf{s}}$. In particular, if\n$Y$ is a variety birational to $\\mathcal{V}$ and $D$ is a divisor in\n$Y$, then, upon a choice of non-zero section $\\tau \\in R_1(D)$ and a\nseed $\\textbf{s}$, we can construct a Newton–Okounkov body\n$\\Delta_{\\nu_\\textbf{s}}(D,\\tau)$. We are primarily interested in\nconditions ensuring that such a Newton–Okounkov body is a positive set.\nOn the one hand this is a condition that needs to be satisfied if one\nseeks to reverse Gross–Hacking–Keel–Kontsevich’s construction of a\ncompactification of a cluster variety from a positive set. 
On the other\nhand, we are further interested in describing how the change of seed\naffects the Newton–Okounkov body and positivity plays the key role in\nunderstanding this. If $\\Delta_{\\nu_\\textbf{s}}(D,\\tau)$ is positive\nthen any other $\\Delta_{\\nu_{\\textbf{s}'}}(D,\\tau)$ is obtained from\n$\\Delta_{\\nu_\\textbf{s}}(D,\\tau)$ by a composition of tropicalized\ncluster transformations. This will be discussed in more detail in the\nnext subsection of the introduction. In order to be able to show that\n$\\Delta_{\\nu_\\textbf{s}}(D,\\tau)$ is positive we restrict the class of\ncompactifications of $\\mathcal{V}$, the divisors we consider, and the\nsections we choose.\n\nOne can define a partial minimal model for $\\mathcal{V}$[^2] is an\ninclusion $\\mathcal{V} \\subset Y$ such that $Y$ is normal and $\\Omega$\nhas a simple pole along every irreducible divisorial component of the\nboundary $D=Y \\setminus \\mathcal{V}$, see . It is a minimal model if $Y$\nis projective over $\\Bbbk$. These are the kind of (partial)\ncompactifications of $\\mathcal{V}$ we consider. The main reason for this\nis that any prime divisor supported on $D$ determines a primitive point\nof $\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} )$. Let $D'$ be a divisor\nsupported on $D$. We say that $R(D')$ has a *graded theta basis* if for\neach $k$ the set of theta functions on $\\mathcal{V}$ contained in\n$H^0(Y,\\mathcal{O}(kD'))$ forms a basis (see Definition ). Then we can\nprove the following result.\n\n**Theorem 1**. (Theorem ) Let $D'$ be a Weil divisor supported on the\nboundary $D$ of the minimal model $\\mathcal{V} \\subset Y$ such that\n$R(D')$ has a graded theta basis. Let $\\tau\\in R_1(D')$ be such that\n$\\nu_{\\textbf{s}}(\\tau)$ belongs to a linear subset of $\\mathbb{Z} ^d$.\nThen the Newton–Okounkov body $\\Delta_{\\nu_{\\textbf{s}}}(D',\\tau)$ is a\npositive polytope.\n\nIn Lemma  we provide sufficient conditions ensuring that $R(D)$ has a\ngraded theta basis. 
Moreover, the work of Mandel provides conditions\nensuring that a line bundle on a cluster $\\mathcal{X}$-variety has a\ngraded theta basis.\n\nWe further study another setting where we can use cluster structures to\nconstruct Newton–Okounkov bodies and show that they are positive\npolytopes: suppose that $Y$ is a normal projective variety such that its\nPicard group is free and finitely generated. The universal torsor of $Y$\nis a scheme $\\text{UT} _Y$ whose ring of algebraic functions is\nisomorphic to the direct sum of all the spaces of sections associated to\nall (isomorphism classes of) line bundles over $Y$. We assume that\n$\\text{UT} _Y$ is a partial minimal model of a cluster\n$\\mathcal{A}$-variety, which we denote by\n$\\mathcal{A} \\subset \\text{UT} _Y$. For example, we encounter this\nsituation frequently in the study of homogeneous spaces, where moreover\nthe ring of global functions on $\\text{UT} _Y$ has a representation\ntheoretic interpretation due to the Borel–Weil–Bott Theorem (Remark ).\nThis fact is commonly used when constructing Newton–Okounkov bodies in\nLie theory, see e.g. and the references therein.\n\nLet $D_1, \\dots, D_s$ be the irreducible divisorial components of\n$D= \\text{UT} _Y \\setminus \\mathcal{V}$ and let\n$\\vartheta ^{\\mathcal{A} ^{\\vee}}_{i}$ be the theta function on\n$\\mathcal{A} ^{\\vee}$ parametrized by the point in\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} )$ associated to $D_i$. The\n(theta) superpotential[^3] associated to the inclusion\n$\\mathcal{A} \\subset \\text{UT} _Y$ is\n$$W_{\\text{UT} _Y} = \\sum_{i=1}^s \\vartheta ^{\\mathcal{A} ^{\\vee}}_i.$$\nThe associated superpotential cone is the subset $\\Xi_{\\text{UT} _Y}$ of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ where the\ntropicalized superpotential takes non-negative values. 
Given a choice of\nseed $\\textbf{s}^\\vee$ for $\\mathcal{A} ^\\vee$, $\\Xi_{\\text{UT} _Y}$ is\nidentified with a polyhedral cone\n$\\Xi_{\\text{UT} _Y, \\textbf{s}^\\vee}\\subset \\mathbb{R} ^d$. As discussed\nin , in many cases the integral points of $\\Xi_{\\text{UT} _Y}$\nparametrize the set of theta functions on $\\mathcal{A}$ that extend to\n$\\text{UT} _Y$. This happens for example if $\\mathcal{A}$ has *theta\nreciprocity* (see Definition ), a condition that is conjectured to be\ntrue in situations more general than ours. Even stronger, in many of the\nexamples arising in nature the integral points of $\\Xi_{\\text{UT} _Y}$\nparametrize a basis of $H^0(\\text{UT} _Y, \\mathcal{O}_{\\text{UT} _Y})$. In\n, Gross–Hacking–Keel–Kontsevich give criteria ensuring that this is\nsatisfied. These conditions hold true in many cases of interest in\nrepresentation theory, as was proven in several papers including and .\nMoreover, for special choices of seeds $\\textbf{s}^{\\vee}$, in these\ncases the cone $\\Xi_{\\text{UT} _Y, \\textbf{s}^\\vee}$ agrees with known\npolyhedral cones such as the Gelfand–Tsetlin cone, string cones or the\nKnutson–Tao hive cone. Much of the inspiration for this paper is due to\nthe representation theoretic results that precede it. In the case where\nthe integral points of $\\Xi_{\\text{UT} _Y}$ parametrize the set of theta\nfunctions on $\\mathcal{A}$ that extend to $\\text{UT} _Y$, we can\nrestrict a **g**-vector valuation $\\mathbf{g} _\\textbf{s}$ from\n$\\Bbbk(\\mathcal{A} )$ to\n$H^0(\\text{UT} _Y, \\mathcal{O}_{\\text{UT} _Y})$. Therefore, given a line\nbundle $\\mathcal{L}$ on $Y$ we can construct a Newton–Okounkov body\n$\\Delta_{\\mathbf{g} _\\textbf{s}}(\\mathcal{L} )$ in a similar way as\nbefore. 
In order to show that\n$\\Delta_{\\mathbf{g} _{\\textbf{s}}}(\\mathcal{L} )$ is a positive polytope\nwe need to consider torus actions on $\\mathcal{A}$ and fibrations of\n$\\mathcal{A} ^{\\vee}$ over a torus as we now explain.\n\nThe universal torsor $\\text{UT} _Y$ is endowed with the action of the\ntorus $T_{\\text{Pic}(Y)^*}$ associated to the dual of the Picard group\nof $Y$. We first need this torus action to preserve $\\mathcal{A}$ and\nthe induced action on $\\mathcal{A}$ to be cluster in the sense of\n(roughly speaking this means that the restricted action can be\nidentified with the action induced by the choice of a sublattice of the\nkernel of $\\widetilde{B}$). In such situations we have a cluster\nfibration $$w:\\mathcal{A} ^{\\vee}\\to T_{\\text{Pic}(Y)}.$$ Recall that\nthe choice of seed gives rise to the identification\n$\\mathfrak{r}_{\\textbf{s}^\\vee}:\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}) \\to \\mathbb{R} ^d$.\nThe tropicalization of $w$ expressed using such an identification is a\nlinear map $w^T:\\mathbb{R} ^d \\to \\text{Pic}(Y)\\otimes \\mathbb{R}$.\nUnder the conditions above the Newton–Okounkov body\n$\\Delta_{\\mathbf{g} _{\\textbf{s}}}(\\mathcal{L} )$ can be described as a\nslicing of the superpotential cone. More precisely, we have the\nfollowing result (see Definition ).\n\n**Theorem 2**. () Assume that the theta functions on $\\mathcal{V}$\nparametrized by the integral points of $\\Xi_{\\text{UT} _Y}$ form a basis\nof $H^0(\\text{UT} _Y, \\mathcal{O}_{\\text{UT} _Y})$. 
If the action of\n$T_{\\text{Pic}(Y)^*}$ restricts to a cluster action of\n$T_{\\text{Pic}(Y)^*}$ on $\\mathcal{A}$ then for any class\n$[\\mathcal{L} ]\\in \\text{Pic}(Y)$ the Newton–Okounkov body\n$\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )$ can be described as\n$$\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )=\\mathrm{Trop} _{\\mathbb{R} }(w)^{-1}([ \\mathcal{L} ])\\cap \\Xi_{\\text{UT} _Y, \\textbf{s}}.$$\nIn particular, $\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )$ is a\npositive subset of $\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$\nand $Y$ is a minimal model of the quotient of $\\mathcal{A}$ by the\naction of $T_{\\text{Pic}(Y)^*}$.\n\nThe case where $Y$ is the Grassmannian $\\text{Gr}_{n-k}(\\mathbb{C} ^n)$\nfits the framework above, so it is possible to use the cluster\n$\\mathcal{A}$ structure to construct Newton–Okounkov bodies associated\nto arbitrary line bundles over $\\text{Gr}_{n-k}(\\mathbb{C} ^n)$. We show\nthat the Newton–Okounkov bodies we construct are unimodularly equivalent to the\nNewton–Okounkov bodies constructed for $\\text{Gr}_{n-k}(\\mathbb{C} ^n)$\nby Rietsch and Williams in using the cluster $\\mathcal{X}$ structure on\nGrassmannians (see Theorem ). Moreover, the flow valuations of are\ninstances of $\\bf c$-vector valuations.\n\nThis comparison result already has interesting consequences related to\ntoric degenerations:\n\n1.  Given a rational polytopal Newton–Okounkov body $\\Delta$ for a (very\n    ample) line bundle $\\mathcal{L}$ over $Y$, Anderson’s main result in\n    applies and yields a toric degeneration of $Y$ to a toric variety\n    (whose normalization is) defined by $\\Delta$. As the semigroup\n    algebras of the **g**-vector valuations are saturated, no\n    normalization is necessary.\n\n2.  
The construction of Gross–Hacking–Keel–Kontsevich in associates to a\n    positive polytope $P$ a minimal model $\\mathcal{V} \\subset Y$ and\n    moreover, using Fomin–Zelevinsky’s principal coefficients, a toric\n    degeneration of $Y$ to the toric variety defined by $P$. As our\n    Newton–Okounkov bodies are positive polytopes, this construction\n    applies in our setting.\n\nThe identification of the Newton–Okounkov bodies constructed by\nRietsch–Williams and our Newton–Okounkov bodies constructed from\n**g**-vectors implies the following result.\n\n**Theorem 3**. (Theorem  and Remark ) The toric degenerations of\n$\\text{Gr}_{n-k}(\\mathbb{C} ^n)$ determined by the Newton–Okounkov\npolytopes constructed by Rietsch–Williams using Anderson’s result\ncoincide with the toric degenerations of\n$\\text{Gr}_{n-k}(\\mathbb{C} ^n)$ given by the Gross–Hacking–Keel–Kontsevich\nconstruction using principal coefficients.\n\n## The intrinsic Newton–Okounkov body\n\nUnderstanding how Newton–Okounkov bodies change upon changing the\nvaluation is an interesting problem that has attracted the attention of\nseveral authors, see for example . So let us return to the discussion on\nhow the Newton–Okounkov bodies constructed above transform if we change\nthe choice of seed. Given any two seeds $\\textbf{s}$ and $\\textbf{s}'$\nfor $\\mathcal{V} ^\\vee$ there is a piecewise linear bijection\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'}):\\mathbb{R} ^d \\to \\mathbb{R} ^d$\nrelating the identifications of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ with\n$\\mathbb{R} ^d$. 
More precisely, we have a commutative diagram\n$$\\xymatrix{\n&\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})\n\\ar_{\\mathfrak{r}_{\\textbf{s}}}[dl] \\ar^{\\mathfrak{r}_{\\textbf{s}'}}[dr] & \\\\\n\\mathbb{R} ^d \\ar^{\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'})}[rr]&   & \\mathbb{R} ^d.\n}$$ Every map\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'})$\nrestricts to a piecewise linear bijection of $\\mathbb{Z} ^d$ and, by\nconstruction, the maps\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'})$\nare compositions of tropicalized cluster transformations for\n$\\mathcal{V} ^\\vee$ (see § for a more concise description). For a subset\n$P\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ we let\n$P_{\\textbf{s}}=\\mathfrak{r}_{\\textbf{s}}(P)$. One of the main\nproperties behind our interest in showing that the Newton–Okounkov\nbodies we have constructed are positive sets is the following: if\n$P\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ is a\npositive set then\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'})(P_{\\textbf{s}})=P_{\\textbf{s}'}$\nfor any two seeds $\\textbf{s}$ and $\\textbf{s}'$. In particular, in\nthis situation the entire collection of sets\n$\\{P_\\textbf{s}\\}_{\\textbf{s}}$ parametrized by the seeds for\n$\\mathcal{V} ^{\\vee}$ may be replaced by $P$, a single intrinsic object\nthat can be used to recover any $P_\\textbf{s}$ in the family.\n\nIn the case where a Newton–Okounkov body $\\Delta_{\\nu_{\\textbf{s}}}$\n(associated to a line bundle $\\mathcal{L}$ or a pair $(D',\\tau)$ as in\nthe previous subsection) is positive, any other Newton–Okounkov body\n$\\Delta_{\\nu_{\\textbf{s}'}}$ associated to the same data is also\npositive. 
In this situation there is a single intrinsic object\n$\\Delta_{\\mathrm{BL}} \\subset\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$\nrepresenting the entire collection\n$\\{ \\Delta_{\\nu_{\\textbf{s}}}\\}_{\\textbf{s}}$. We call\n$\\Delta_{\\mathrm{BL}}$ the *intrinsic Newton–Okounkov body* (associated\nto the data we begin with). The subindex $\\mathrm{BL}$ in\n$\\Delta_{\\mathrm{BL}}$ stands for *broken line*; the choice of this\nnotation goes back to where the last three authors of this paper\nintroduce *broken line convexity*, a notion of convexity defined in a\ntropical space that ensures positivity. Broken lines are pieces of\ntropical curves in $\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$\nused to define theta functions on $\\mathcal{V}$ and describe their\nmultiplication (see §). Straight line segments defining convexity in a\nlinear space are replaced by broken line segments in the tropical space\nto define broken line convexity. The main result of is that a closed set\nis broken line convex if and only if it is positive.\n\nIn the situations where we are able to show that\n$\\Delta_{\\nu_{\\textbf{s}}}\\subset \\mathbb{R} ^d$ is positive, it turns\nout that it is moreover polyhedral, a property that fails in general,\nsee e.g. . Since\n$\\Delta_{\\nu_{\\textbf{s}'}}=\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'})(\\Delta_{\\nu_{\\textbf{s}}})$\nany other $\\Delta_{\\nu_{\\textbf{s}'}}$ is also polyhedral. The integral\npoints of the convex bodies we consider are naturally associated to\ntheta functions, which suggests the following question: does there\nexist a finite set of theta functions such that\n$\\Delta_{\\nu_{\\textbf{s}'}}$ is the convex hull of their images under\n$\\nu_{\\textbf{s}'}$ for any seed $\\textbf{s}'$? 
Such a collection of\npoints might vary as we change seeds, as exhibited in the case of the\nGrassmannians in an example in and generalized to an infinite family of\nexamples in . Given the notion of broken line convexity, a slight\nreformulation of the question becomes more natural: does there exist a\nfinite set of theta functions such that the broken line convex hull of\ntheir images under $\\nu_{\\textbf{s}'}$ is $\\Delta_{\\nu_{\\textbf{s}'}}$\nfor some (and hence any) seed $\\textbf{s}'$? In fact, from the intrinsic\nNewton–Okounkov body perspective, the valuation is replaced by integral\ntropical points parametrizing theta functions and there is no reference\nto a seed at all. Using this perspective, $\\Delta_{\\mathrm{BL}}$ becomes\na broken line convex subset of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ whose integral\npoints parametrize the theta basis of the first graded piece $R_1$ of\nthe corresponding graded ring. In we give sufficient conditions ensuring\nthat $\\Delta_{\\mathrm{BL}}$ can be described as the broken line convex\nhull of a finite collection of points and describe this collection.\nApplying this result to the setting of Grassmannians we obtain that if\n$\\mathcal{L} _e$ is the line bundle over\n$\\mathop{\\mathrm{Gr}}_{n-k}(\\mathbb{C} ^n)$ obtained by pullback of\n$\\mathcal{O}(1)$ under the Plücker embedding\n$\\mathop{\\mathrm{Gr}}_{n-k}(\\mathbb{C} ^n)\\hookrightarrow \\mathbb P^{\\binom{n}{k}-1}$\nthen the intrinsic Newton–Okounkov body\n$\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e)$ is the broken line convex hull of\nthe ${\\bf g}$-vectors of the Plücker coordinates (Corollary ).\n\nBroken line convexity also allows us to generalize the Newton polytope of a\nLaurent polynomial to the world of cluster varieties. In particular,\nin § we introduce the *theta function analog of the Newton polytope* of\n$f$, for any $f\\in H^0(\\mathcal{V} , \\mathcal{O}_\\mathcal{V} )$. 
The\nintrinsic Newton–Okounkov bodies $\\Delta_{\\mathrm{BL}}$ can be described\nusing this notion. The key idea is exploiting the bijection between the\ntheta basis (a special case of an *adapted basis*) and integral tropical\npoints parametrizing them. This idea is explained for full rank\nvaluations with finitely generated value semigroup in the survey . It is\ntherefore interesting to continue studying this new class of objects.\n\n## Organization of the paper\n\nIn § we review background material on cluster varieties, their quotients\nand their fibres (§), and on tropicalization (§). In § we recall the\nconstruction of cluster scattering diagrams and the theta functions on\n(quotients and fibres of) cluster varieties. In § we elaborate on the\nexistence of a theta basis on the ring of regular functions on a partial\nminimal model of (a quotient or a fibre of) a cluster variety. This\nsection largely follows . In § we recall the **g**-vector valuations for\n(quotients of) $\\mathcal{A}$-varieties. We introduce **c**-vector\nvaluations for (fibres of) $\\mathcal{X}$-varieties. The main results of\nthe paper are contained in §. The study of Newton–Okounkov bodies\nassociated to Weil divisors on minimal models is treated in § while the\nNewton–Okounkov bodies for line bundles are treated in §. The intrinsic\nNewton–Okounkov body and the wall-crossing phenomenon for these are\naddressed in §. Finally, in § we apply the results of the previous\nsection to Grassmannians. One of the main technical conditions to be\nsatisfied is verified in §. In § we prove a unimodular equivalence\nbetween the Newton–Okounkov bodies we construct and those constructed by\nRietsch–Williams in . In § we describe the intrinsic Newton–Okounkov\nbodies for Grassmannians as the broken line convex hull of the\n**g**-vectors of Plücker coordinates (in arbitrary seeds).\n\n### Acknowledgements\n\nThe authors L. Bossinger and A. 
Nájera Chávez were partially supported\nby PAPIIT project IA100122 dgapa UNAM 2022 and by CONACyT project\nCF-2023-G-106. M. Cheung was supported by World Premier International\nResearch Center Initiative (WPI Initiative), MEXT, Japan. T. Magee was\nsupported by EPSRC grant EP/V002546/1.\n\n# Preliminaries\n\n## Cluster varieties, quotients and fibres\n\nWe briefly recall the construction of cluster varieties, their quotients\nand their fibres. The reader is invited to consult for the details we\nshall omit in this section.\n\nUnless otherwise stated, all tensor products are taken with respect to\n$\\mathbb{Z}$. Moreover, given a lattice $L$ we denote by\n$L^*:= \\mathop{\\mathrm{Hom}}(L,\\mathbb{Z} )$ its $\\mathbb{Z}$-dual and\nlet $\\langle \\cdot , \\cdot \\rangle: L\\times L^* \\to \\mathbb{Z}$ be the\ncanonical pairing given by evaluation. We further denote by\n$L_\\mathbb{R} := L \\otimes \\mathbb{R}$ the real vector space associated\nto $L$. We fix an algebraically closed field $\\Bbbk$ of characteristic\n$0$ and let $T_L:= \\text{Spec}(\\Bbbk [L^*])$ be the algebraic torus\nwhose character lattice is $L^*$.\n\n### Cluster varieties and their dualities\n\nThe **fixed data** $\\Gamma$ consist of the following:\n\n-   a finite set $I$ of **directions** and a distinguished subset\n    $I_{\\text{uf}}\\subseteq I$ of **mutable** (or **unfrozen**)\n    **directions**. 
Elements of $I \\setminus I_{\\text{uf}}$ are the\n    **frozen directions**;\n\n-   a lattice $N$ of rank $|I|$ together with a saturated sublattice\n    $N_{\\text{uf}}\\subseteq N$ of rank $|I_{\\text{uf}}|$;\n\n-   a skew-symmetric bilinear form\n    $\\{ \\cdot , \\cdot \\} : N \\times N \\rightarrow \\mathbb{Q}$;\n\n-   a finite index sublattice $N^\\circ \\subseteq N$ such that\n    $\\{ N, N_{\\text{uf}}\\cap N^{\\circ}\\}\\subset \\mathbb{Z}$ and\n    $\\{ N_{\\text{uf}}, N^{\\circ} \\}\\subset \\mathbb{Z}$;\n\n-   a collection of positive integers $\\{d_i\\}_{i \\in I}$ with greatest\n    common divisor $1$;\n\n-   the dual lattices $M = \\mathop{\\mathrm{Hom}}(N, \\mathbb{Z} )$ and\n    $M^{\\circ}=\\mathop{\\mathrm{Hom}}(N^{\\circ},\\mathbb{Z} )$.\n\nA ${\\bf seed}$ for $\\Gamma$ is a tuple $\\textbf{s}:= ( e_i )_{i \\in I}$\nsuch that $\\{ e_i \\}_{i\\in I}$ is a basis for $N$,\n$\\{e_i\\}_{i \\in I_{\\text{uf}}}$ is a basis for $N_{\\text{uf}}$ and\n$\\{d_i e_i \\}_{i \\in I }$ is a basis for $N^{\\circ}$. We let\n$f_i := {d_i}^{-1} e_i^*$ and observe that $\\{f_i\\}_{i\\in I}$ is a basis\nof $M^{\\circ}$. For $i,j\\in I$ we write\n$\\epsilon_{ij}:= \\lbrace e_{i},d_j e_{j} \\rbrace$ and define the matrix\n$\\epsilon=(\\epsilon_{ij})_{i,j\\in I}$. When we work with various seeds\nat the same time we introduce labels of the form $e_{i;\\textbf{s}}$,\n$f_{i;\\textbf{s}}$, $\\epsilon_{\\textbf{s}}=(\\epsilon_{ij;\\textbf{s}})$,\netc. to distinguish the data associated to $\\textbf{s}$. 
We can\n**mutate** a seed $\\textbf{s}=(e_i)_{i\\in I}$ in a mutable direction\n$k\\in I_{\\text{uf}}$ to obtain a new seed\n$\\mu_k(\\textbf{s})=(e'_i)_{i\\in I}$ given by $$\\label{e_mutation}\ne_i':=\\begin{cases} e_i+[\\epsilon_{ik}]_+e_k & i\\neq k,\\\\\n-e_k&i=k,\n\\end{cases}$$ where $[x]_+:= \\text{max}(0,x)$ for $x \\in \\mathbb{R}$.\n\nLet $r:=|I_{\\text{uf}}|$ and let $\\mathbb{T}_r$ denote the $r$-regular\ntree whose edges are labeled by the elements of $I_{\\text{uf}}$. We\nrefer to $r$ as the **rank** and fix it once and for all. By a common\nabuse of notation, the set of vertices of this tree is also denoted by\n$\\mathbb T_r$. We fix once and for all a distinguished vertex\n$v_0\\in \\mathbb{T}_r$ and let $\\overrightarrow{\\mathbb{T}_r}$ be the unique\norientation of $\\mathbb{T}_r$ such that the $r$ edges incident to $v_0$\nare oriented in outgoing direction from $v_0$, and every vertex\ndifferent from $v_0$ has one incoming edge and $r-1$ outgoing edges. We\nwrite $v\\overset{k}{\\longrightarrow}v'\\in \\overrightarrow{\\mathbb{T}_r}$ to indicate that the\nedge between the vertices $v,v'$ of $\\overrightarrow{\\mathbb{T}_r}$ is oriented from $v$ to\n$v'$ and is labeled by $k$.\n\nFix once and for all a seed $\\textbf{s}_0=(e_i\\mid i \\in I)$ and call it\nthe **initial seed**. To every vertex $v\\in \\mathbb{T}_r$ we attach a\nseed $\\textbf{s}_v$ as follows: we let $\\textbf{s}_{v_0}=\\textbf{s}_0$, and\nif $v\\overset{k}{\\longrightarrow}v'\\in \\overrightarrow{\\mathbb{T}_r}$ then\n$\\textbf{s}_{v'}=\\mu_k(\\textbf{s}_{v})$. 
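The seed mutation rule $e_i' = e_i + [\epsilon_{ik}]_+ e_k$ for $i \neq k$, $e_k' = -e_k$, can be tried out numerically. The following is a minimal sketch, not taken from the paper: it assumes that a seed is stored as a list of coordinate vectors in a fixed reference basis of $N$ and that the skew-symmetric form is given by its Gram matrix `gram` in that basis. The helper names `pos`, `form`, `epsilon` and `mutate`, and the rank 3 example data, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def pos(x):
    """Positive part [x]_+ = max(0, x)."""
    return max(0, x)


def form(u, v, gram):
    """Evaluate {u, v} from the Gram matrix of the form in the reference basis."""
    return sum(Fraction(u[a]) * gram[a][b] * v[b]
               for a in range(len(u)) for b in range(len(v)))


def epsilon(seed, gram, d):
    """Matrix with entries eps[i][j] = {e_i, d_j e_j} for the seed (e_i)."""
    n = len(seed)
    return [[form(seed[i], seed[j], gram) * d[j] for j in range(n)]
            for i in range(n)]


def mutate(seed, gram, d, k):
    """Mutation in direction k: e_i' = e_i + [eps_ik]_+ e_k (i != k), e_k' = -e_k."""
    eps = epsilon(seed, gram, d)
    new_seed = []
    for i, e in enumerate(seed):
        if i == k:
            new_seed.append([-c for c in e])
        else:
            m = pos(eps[i][k])
            # Assumption of this sketch: eps_ik is an integer for mutable k.
            assert m.denominator == 1
            m = int(m)
            new_seed.append([c + m * ck for c, ck in zip(e, seed[k])])
    return new_seed


# Hypothetical example data: three directions, all d_i = 1, form of a linear quiver.
gram = [[0, 1, 0], [-1, 0, 1], [0, -1, 0]]
d = [1, 1, 1]
s0 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

s1 = mutate(s0, gram, d, 1)
print(s1)                    # new basis vectors in the reference basis
print(epsilon(s1, gram, d))  # matrix epsilon of the mutated seed
```

In this example, mutating twice in the same direction returns the original matrix $\epsilon$ but not the original basis vectors, which is a useful sanity check when experimenting with the formula.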
For simplicity we write\n$\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ if\n$\\textbf{s}=\\textbf{s}_v$ for some $v\\in \\overrightarrow{\\mathbb{T}_r}$.\n\nFor every seed $\\textbf{s}=(e_{i;\\textbf{s}}\\mid i\\in I)\\in \\overrightarrow{\\mathbb{T}_r}$ we introduce the **seed\ntori** $\\mathcal{A} _{\\textbf{s}} = T_{N^{\\circ}}$ and\n$\\mathcal{X} _{\\textbf{s}} = T_{M}$ which are endowed with the **cluster\ncoordinates** $\\{A_{i;\\textbf{s}} := z^{f_{i;\\textbf{s}}}\\}_{i \\in I}$\nand $\\{X_{i;\\textbf{s}} := z^{e_{i;\\textbf{s}}}\\}_{i \\in I}$,\nrespectively. The **$\\mathcal{A}$-cluster transformation** associated to\n$\\textbf{s}$ and $k \\in I_{\\text{uf}}$ is the birational map\n$\\mu^{\\mathcal{A} }_{k}:\\mathcal{A} _{\\textbf{s}} \\dashrightarrow \\mathcal{A} _{\\mu_k(\\textbf{s})}$\nspecified by the pullback formula $$\\label{A_mut}\n(\\mu^{\\mathcal{A} }_{k})^*(z^m):=z^{m} (1+z^{v_{k;\\textbf{s}}})^{-\\langle d_k e_{k;\\textbf{s}},m\\rangle} \\ \\ \\text{ for }m\\in M^{\\circ},$$\nwhere $v_{k;\\textbf{s}}:=\\{e_{k;\\textbf{s}}, \\cdot \\}\\in M^{\\circ}$.\nSimilarly, the **$\\mathcal{X}$-cluster transformation** associated to\n${\\mathbf{s}}$ and $k$ is the birational map\n$\\mu^{\\mathcal{X} }_{k}:\\mathcal{X} _{\\textbf{s}} \\dashrightarrow \\mathcal{X} _{\\mu_k(\\textbf{s})}$\nspecified by the pullback formula $$\\label{X_mut}\n(\\mu^{\\mathcal{X} }_{k})^*(z^n):=z^{n} (1+z^{e_{k;\\textbf{s}}})^{-[ n,e_{k;\\textbf{s}} ]}\\ \\ \\text{ for }n\\in N,$$\nwhere $[\\cdot, \\cdot]:N\\times N \\to \\mathbb{Q}$ is the bilinear form\ndetermined by setting $[e_i,e_j]=\\left\\{e_i, d_je_j\\right\\}$.\n\nFor seeds $\\textbf{s}, \\textbf{s}'\\in \\overrightarrow{\\mathbb{T}_r}$ connected by iterated\nmutation in a sequence of directions $k_1, \\dots, k_s\\in I_{\\text{uf}}$,\nwe let $\\mu^{\\mathcal{A} }_{\\textbf{s}, \\textbf{s}'}$ (resp.\n$\\mu^{\\mathcal{X} }_{\\textbf{s}, \\textbf{s}'}$) be the composition of\ncluster transformations in the same sequence of directions and in the\nsame order. A birational transformation of the form\n$\\mu^{\\mathcal{A} }_{\\textbf{s}, \\textbf{s}'}$ (or\n$\\mu^{\\mathcal{X} }_{\\textbf{s}, \\textbf{s}'}$) can be used to glue its\ndomain and range by identifying the largest open subschemes where the\ntransformation is an isomorphism. We use this kind of gluing to define\ncluster varieties. More precisely, the cluster $\\mathcal{A}$-variety\nassociated to $\\Gamma$ and $\\textbf{s}_0$ is\n$$\\mathcal{A} _{\\Gamma,\\textbf{s}_0}:=\\bigcup\\limits_{\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}} \\mathcal{A} _{\\textbf{s}}/ \\left( \\text{gluing by } \\mu^{\\mathcal{A} }_{\\textbf{s}', \\textbf{s}''} \\right)_{\\textbf{s}',\\textbf{s}''\\in\\overrightarrow{\\mathbb{T}_r}}.$$ The cluster\n$\\mathcal{X}$-variety associated to $\\Gamma$ and $\\textbf{s}_0$ is\n$$\\mathcal{X} _{\\Gamma,\\textbf{s}_0}:=\\bigcup\\limits_{\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}} \\mathcal{X} _{\\textbf{s}}/ \\left( \\text{gluing by } \\mu^{\\mathcal{X} }_{\\textbf{s}', \\textbf{s}''} \\right)_{\\textbf{s}',\\textbf{s}''\\in\\overrightarrow{\\mathbb{T}_r}}.$$\n\nFrom now on an element 
$\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ will be referred to as\na seed for $\\mathcal{A}$ (or $\\mathcal{X}$). It is important to recall\nthat declaring another $\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ as an initial seed\ngives rise to isomorphic cluster varieties. We fix the pair\n$(\\Gamma,\\textbf{s}_0)$ once and for all and denote\n$\\mathcal{A} _{\\Gamma, \\textbf{s}_0}$ (resp.\n$\\mathcal{X} _{\\Gamma, \\textbf{s}_0}$) simply by $\\mathcal{A}$ (resp.\n$\\mathcal{X}$).\n\n### Quotients of $\\mathcal{A}$-varieties and fibres of $\\mathcal{X}$-varieties\n\nLet\n$N^{\\perp}_{\\text{uf}}:= \\{ m\\in M \\mid \\langle n, m \\rangle=0 \\ \\forall \\  n\\in N_{\\text{uf}} \\}$.\nIn particular,\n$M/ N^{\\perp}_{\\operatorname{uf}}\\cong (N_{\\text{uf}})^*$. By a slight\nabuse of notation we also write $M^{\\circ}/ N_{\\text{uf}}^{\\perp}$. Here\n$N_{\\text{uf}}^\\perp$ is taken in $M^\\circ$ rather than $M$, so\n$M^{\\circ}/ N_{\\text{uf}}^{\\perp}$ is torsion free. Since\n$\\{ N_{\\text{uf}},N \\}\\subseteq \\mathbb{Z}$ the following homomorphisms\nare well defined $$\\begin{aligned}\n \\label{eq:p12star}\n  \\begin{matrix}\n    p_1^*: & N_{\\operatorname{uf}} & \\rightarrow & M^\\circ &\\qquad \\phantom{aaaaa} \\qquad \\qquad & p_2^* : & N & \\rightarrow& M^{\\circ}/ N^{\\perp}_{\\operatorname{uf}}. 
\\\\\n    & n &\\mapsto & \\{ n, \\cdot \\}\n  & \\qquad  & &  n &\\mapsto &  \\{ n, \\cdot \\} + N_{\\text{uf}}^{\\perp}\n  \\end{matrix}\n\\end{aligned}$$ The matrix representing $p_2^*$ with respect to a seed\n$\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ is the *extended\nexchange matrix* $\\widetilde{B}_{\\textbf{s}}$ of .\n\n**Definition 1**.\n\nA **cluster ensemble lattice map** for $\\Gamma$ is a homomorphism\n$p^*: N \\to M^\\circ$ such that $p^*|_{N_{\\text{uf}}} = p^*_1$ and the\ncomposition\n$N \\overset{p^*}{\\longrightarrow} M^\\circ \\twoheadrightarrow M^{\\circ}/ N^{\\perp}_{\\operatorname{uf}}$\nagrees with $p_2^*$, where\n$M^\\circ \\twoheadrightarrow M^{\\circ}/ N^{\\perp}_{\\operatorname{uf}}$\ndenotes the canonical projection. Note that different choices of $p^*$\ndiffer by a homomorphism\n$N/ N_{\\operatorname{uf}} \\rightarrow  N^{\\perp}_{\\operatorname{uf}}$.\n\nIn other words, given a seed $\\mathbf{s}$, the $|I|\\times|I|$ square\nmatrix $B_{p^*;\\mathbf{s}}$ associated to a cluster ensemble lattice map\n$p^*$ with respect to the bases $(e_i)_{i\\in I}$ and $(f_i)_{i\\in I}$\nsatisfies $$\\label{eq:Mp*}\nB_{p^*;\\mathbf{s}} - \\epsilon^{\\rm{tr}}_\\textbf{s}=\n\\left[\\begin{matrix}\n0 & 0 \\\\\n0 & \\ast\n\\end{matrix}\\right],$$ where the $0$ entries represent the blocks\n$I_{\\text{uf}}\\times I_{\\text{uf}}$,\n$I_{\\text{uf}}\\times (I\\setminus I_{\\text{uf}})$, and\n$(I\\setminus I_{\\text{uf}})\\times I_{\\text{uf}}$, and the $\\ast$ entry\nindicates that the\n$(I\\setminus I_{\\text{uf}})\\times(I\\setminus I_{\\text{uf}})$ block has\nno constraints. Every cluster ensemble lattice map $p^*:N\\to M^{\\circ}$\ncommutes with mutation. 
Therefore, $p^*$ gives rise to a **cluster\nensemble map** $$p:\\mathcal{A} \\to \\mathcal{X} .$$\n\nThe map $p:\\mathcal{A} \\to \\mathcal{X}$ yields both torus actions on\n$\\mathcal{A}$ and fibrations of $\\mathcal{X}$ over a torus, as we\nexplain below. Let $$\\label{eq:define K}\nK=\\ker(p_2^*)=\\left\\{k\\in N\\mid \\{k,n\\}=0\\,\\forall\\, n\\in N_{\\rm uf}^\\circ\\right\\} \\quad \\text{and} \\quad K^{\\circ}=K \\cap N^{\\circ}.$$\nTo obtain an action on $\\mathcal{A}$ we consider a saturated sublattice\n$$H_{\\mathcal{A} } \\subseteq K^\\circ.$$ The inclusion\n$H_{\\mathcal{A} } \\hookrightarrow N^\\circ$ gives rise to an inclusion\n$T_{H_{\\mathcal{A} }}\\hookrightarrow T_{N^{\\circ}}$ as a subgroup. Since\n$p^*$ commutes with mutation and $H_{\\mathcal{A} }\\subseteq K$ we have a\nnon-canonical inclusion\n$$T_{H_{\\mathcal{A} }}\\hookrightarrow \\mathcal{A} .$$ The action of\n$T_{H_{\\mathcal{A} }}$ on $T_{N^\\circ}$ given by multiplication extends\nto a free action of $T_{H_{\\mathcal{A} }}$ on $\\mathcal{A}$ and gives\nrise to a geometric quotient\n$\\mathcal{A} \\to \\mathcal{A} /T_{H_{\\mathcal{A} }}$. The scheme\n$\\mathcal{A} /T_{H_{\\mathcal{A} }}$ is obtained by gluing tori of the\nform\n$T_{N^{\\circ}/H_{\\mathcal{A} }}\\cong T_{N^{\\circ}}/T_{H_{\\mathcal{A} }}$;\nthe gluing is induced by the $\\mathcal{A}$-mutations used to glue the\nseed tori for $\\mathcal{A}$. More precisely, for every seed $\\textbf{s}$\nfor $\\mathcal{A}$ we let\n$(\\mathcal{A} /T_{H_{\\mathcal{A} }})_{\\textbf{s}}$ be a copy of the\ntorus $T_{N^{\\circ}/H_{\\mathcal{A} }}$. 
For $k\\in I_{\\text{uf}}$ the\nmutation\n$\\mu^{\\mathcal{A} /T_{H_\\mathcal{A} }}_{k}: (\\mathcal{A} /T_{H_{\\mathcal{A} }})_{\\textbf{s}}  \\dashrightarrow  (\\mathcal{A} /T_{H_{\\mathcal{A} }})_{\\mu_k(\\textbf{s})}$\nis given by $$\\label{A/T_mut}\n\\left(\\mu^{\\mathcal{A} /T_{H_{\\mathcal{A} }}}_{k}\\right)^*(z^m):=z^{m} (1+z^{v_{k;\\textbf{s}}})^{-\\langle d_k e_{k;\\textbf{s}},m\\rangle} \\ \\ \\text{ for }m\\in H_{\\mathcal{A} }^{\\perp}.$$\nLet $\\mu^{\\mathcal{A} /T_{H_{\\mathcal{A} }}}_{\\textbf{s}, \\textbf{s}'}$\ndenote the composition of mutations determined by the path in $\\overrightarrow{\\mathbb{T}_r}$ connecting\n$\\textbf{s}, \\textbf{s}'\\in \\overrightarrow{\\mathbb{T}_r}$. Then\n$$\\mathcal{A} /T_{H_{\\mathcal{A} }}:=\\bigcup\\limits_{\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}} (\\mathcal{A} /T_{H_{\\mathcal{A} }})_{\\textbf{s}}/ \\left( \\text{gluing by } \\mu^{\\mathcal{A} /T_{H_{\\mathcal{A} }}}_{\\textbf{s}', \\textbf{s}''} \\right)_{\\textbf{s}', \\textbf{s}'' \\in \\overrightarrow{\\mathbb{T}_r}}.$$\n\nTo obtain the fibration of $\\mathcal{X}$ over a torus we consider a\nsaturated sublattice $$H_{\\mathcal{X} } \\subseteq K.$$ The inclusion\n$H_{\\mathcal{X} } \\hookrightarrow N$ induces a surjection\n$T_M:= \\mathop{\\mathrm{Spec}}(\\Bbbk[N]) \\to \\mathop{\\mathrm{Spec}}(\\Bbbk[H_{\\mathcal{X} }])=:T_{H_{\\mathcal{X} }^*}$.\nThis extends to a globally defined map $$\\label{eq:weight_map}\n    w_{H_{\\mathcal{X} }}:\\mathcal{X} \\to T_{H^*_\\mathcal{X} }.$$\n\n**Remark 2**. 
The subindex $\\mathcal{V}$ in the lattice\n$H_{\\mathcal{V} }$ stands for the cluster variety $\\mathcal{V}$ for\nwhich the choice of sublattice is relevant. When there is no risk of\nconfusion, we drop the subindex $\\mathcal{V}$ from $H_{\\mathcal{V} }$\n(see the end of §).\n\nWe let $\\mathcal{X} _{\\phi}$ be the fibre of the map over a closed point\n$\\phi\\in T_{H^*_{\\mathcal{X} }}$. In this work we mainly focus on the\nfibre $\\mathcal{X} _{{\\bf 1}_{T_{H^*_{\\mathcal{X} }}}}$, where\n${\\bf 1}_{T_{H^*_{\\mathcal{X} }}}\\in T_{H^*_{\\mathcal{X} }}$ is the\nidentity element. When there is no risk of confusion on the fibration we\nare considering we will denote this scheme simply by\n$\\mathcal{X} _{\\bf 1}$.\n\nThe fibre $\\mathcal{X} _{\\bf 1}$ is obtained by gluing tori isomorphic\nto $T_{H^\\perp_{\\mathcal{X} }}$ via the restrictions of the\n$\\mathcal{X}$-mutations used to glue the seed tori for $\\mathcal{X}$\n(see for a detailed treatment of this construction). As in the previous\nsituations, we have a description of the form\n$$\\mathcal{X} _{\\bf 1}:=\\bigcup\\limits_{\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}} (\\mathcal{X} _{\\bf 1})_{\\textbf{s}}/ \\left( \\text{gluing by } \\mu^{\\mathcal{X} _{\\bf 1}}_{\\textbf{s}', \\textbf{s}''} \\right)_{\\textbf{s}', \\textbf{s}'' \\in \\overrightarrow{\\mathbb{T}_r}},$$ where\n$(\\mathcal{X} _{\\bf 1})_{\\textbf{s}}$ is a torus isomorphic to\n$T_{H_{\\mathcal{X} }^\\perp}$,\n$\\mu^{\\mathcal{X} _{\\bf 1}}_{k}: (\\mathcal{X} _{\\bf 1})_{\\textbf{s}}  \\dashrightarrow  (\\mathcal{X} _{\\bf 1})_{\\mu_k(\\textbf{s})}$\nis given by $$\\label{X_phi_mut}\n\\left(\\mu^{\\mathcal{X} _{\\bf 1}}_{k}\\right)^*(z^{n+H_{\\mathcal{X} }}):=z^{n+H_{\\mathcal{X} }}(1+z^{e_{k;\\textbf{s}}+H_{\\mathcal{X} }})^{-[ 
n,e_{k;\\textbf{s}} ]}\\ \\ \\text{ for } n+H_{\\mathcal{X} } \\in N/H_{\\mathcal{X} }$$\nand $\\mu^{\\mathcal{X} _{\\bf 1}}_{\\textbf{s},\\textbf{s}'}$ is defined as\nfor the other varieties we have introduced so far.\n\n**Definition 3**. A variety of the form\n$\\mathcal{A} /T_{H_{\\mathcal{A} }}$ is referred to as a **quotient of\n$\\mathcal{A}$**. A variety of the form $\\mathcal{X} _{\\bf 1}$ is\nreferred to as a **fibre of $\\mathcal{X}$**. A **cluster action** on\n$\\mathcal{A}$ is the action of a torus of the form\n$T_{H_{\\mathcal{A} }}$.\n\nLet $T$ be an algebraic torus endowed with a set of coordinates\n$z_1, \\dots , z_r$ and let $\\omega_T$ be its canonical bundle. A\n**volume form** on $T$ is a nowhere vanishing form in\n$H^0(T, \\omega_T)$. The **standard volume form** on $T$ is (any non-zero\nscalar multiple of)\n$$\\Omega_T= \\frac{dz_1 \\wedge \\dots \\wedge dz_r}{z_1 \\cdots z_r}.$$\n\n**Definition 4**. A **log Calabi–Yau pair** $(Y, D)$ is a smooth complex\nprojective variety $Y$ together with a reduced normal crossing divisor\n$D\\subset Y$ such that $K_Y+D=0$. We say a scheme $V$ is log Calabi–Yau\nif there exists a log Calabi–Yau pair $(Y,D)$ such that $V$ is\n$Y \\setminus D$ up to codimension 2.\n\nIt follows from that any log Calabi–Yau variety $V$ is endowed with a\nunique up to scaling holomorphic volume form (*i.e.* a nowhere vanishing\nholomorphic top form) $\\Omega_V$ which has at worst a simple pole along\neach component of $D$ for any such $(Y,D)$. See for further details.\n\nAs explained in both $\\mathcal{A}$ and $\\mathcal{X}$ are log Calabi–Yau,\nthe key point being that these schemes are obtained by gluing tori via\nbirational maps that preserve the standard volume form on each seed\ntorus (endowed with cluster coordinates). For the same reason, the\nschemes of the form $\\mathcal{A} /T_{H_{\\mathcal{A} }}$ and\n$\\mathcal{X} _{\\phi}$ are also log Calabi–Yau. 
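To illustrate why gluing maps of this kind preserve the standard volume form, consider as a hypothetical example (not taken from the text) the birational map $\\mu$ of a two-dimensional torus with $\\mu^*(z_1)=z_1^{-1}(1+z_2)$ and $\\mu^*(z_2)=z_2$, which has the shape of a cluster mutation. A direct computation gives
$$\\mu^*\\left(\\frac{dz_1\\wedge dz_2}{z_1z_2}\\right)=\\left(-\\frac{dz_1}{z_1}+\\frac{dz_2}{1+z_2}\\right)\\wedge\\frac{dz_2}{z_2}=-\\frac{dz_1\\wedge dz_2}{z_1z_2},$$
so in this example the standard volume form is preserved up to the non-zero scalar $-1$.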
The canonical volume form\non $\\mathcal{A} /T_{H_{\\mathcal{A} }}$ (resp. $\\mathcal{X} _{\\phi}$) is\ninduced by (resp. the restriction of) the canonical volume form of\n$\\mathcal{A}$ (resp. $\\mathcal{X}$).\n\n### Principal coefficients, $\\mathcal{X}$ as a quotient of $\\mathcal{A}_{\\mathrm{prin}}$ and $\\mathcal{A}$ as a fibre of $\\mathcal{X}_{\\mathrm{prin}}$\n\nFor the fixed data\n$\\Gamma=\\left(I, I_{\\text{uf}}, N,N^{\\circ}, M, M^{\\circ}, \\{ \\cdot, \\cdot \\}, \\{d_i\\}_{i\\in I} \\right)$,\nwe consider its principal counterpart\n$$\\Gamma_{{\\mathrm{prin}} }=\\left(I_{{\\mathrm{prin}} }, (I_{{\\mathrm{prin}} })_{\\text{uf}}, N_{{\\mathrm{prin}} }, N_{{\\mathrm{prin}} }^{\\circ}, M_{{\\mathrm{prin}} }, M^{\\circ}_{{\\mathrm{prin}} }, \\{ \\cdot, \\cdot \\}_{{\\mathrm{prin}} }, \\{d_i\\}_{i\\in I_{{\\mathrm{prin}} }} \\right),$$\nwhere the index set $I_{\\mathrm{prin}}$ is the disjoint union of two\ncopies of $I$, its subset $(I_{\\mathrm{prin}} )_{\\text{uf}}$ is the set\n$I_{\\text{uf}}$ thought of as a subset of the first copy of $I$,\n$$N_{{\\mathrm{prin}} } = N \\oplus M^\\circ, \\quad  N_{{\\mathrm{prin}} }^{\\circ}= N^{\\circ}\\oplus M, \\quad (N_{{\\mathrm{prin}} })_{\\text{uf}}=N_{\\text{uf}}\\oplus 0, \\quad M_{{\\mathrm{prin}} } = M \\oplus N^\\circ, \\quad M_{{\\mathrm{prin}} }^{\\circ}=M^{\\circ}\\oplus N.$$\nFor $i \\in I_{{\\mathrm{prin}} }$ belonging to either the first or second\ncopy of $I$, the corresponding integer in the tuple\n$\\{d_i \\mid i\\in I_{{\\mathrm{prin}} }\\}$ is equal to the integer indexed by\n$i$ for $\\Gamma$, and\n$$\\{(n_1,m_1),(n_2,m_2)\\}_{{\\mathrm{prin}} }= \\{n_1, n_2\\} + \\langle n_1,m_2 \\rangle - \\langle n_2,m_1 \\rangle.$$\nRecall that $\\textbf{s}_0=(e_i)_{i \\in I}$ is the initial seed for\n$\\Gamma$. 
Then the initial seed for $\\Gamma_{{\\mathrm{prin}} }$ is\n${\\textbf{s}_0}_{{\\mathrm{prin}} }=\\left((e_i,0),(0,f_i)\\right)_{i\\in I}$.\nSince $\\Gamma$ and $\\textbf{s}_0$ were already fixed, we denote the\ncluster variety\n$\\mathcal{A} _{\\Gamma_{{\\mathrm{prin}} },{{\\textbf{s}{_0}}_{{\\mathrm{prin}} }}}$\n(resp.\n$\\mathcal{X} _{\\Gamma_{{\\mathrm{prin}} },{{\\textbf{s}{_0}}_{{\\mathrm{prin}} }}}$)\nsimply by $\\mathcal{A}_{\\mathrm{prin}}$ (resp.\n$\\mathcal{X} _{{\\mathrm{prin}} }$). It is moreover worth pointing out\nthat $\\mathcal{A}_{\\mathrm{prin}}$ is in fact independent of the choice\nof initial seed $\\textbf{s}_0$ as explained in .\n\nIn the authors show that the scheme $\\mathcal{X}$ can be described as a\nquotient of $\\mathcal{A}_{\\mathrm{prin}}$ in the sense of Definition .\nTo obtain such a description we need to choose a cluster ensemble\nlattice map $p^*:N \\to M^{\\circ}$ for $\\Gamma$. This choice determines\nthe cluster ensemble map $$\\label{eq:def_p_prin}\np_{{\\mathrm{prin}} }: \\mathcal{A}_{\\mathrm{prin}}\\to \\mathcal{X}_{\\mathrm{prin}}.$$\nThe map $p_{{\\mathrm{prin}} }$ is induced by the cluster ensemble\nlattice map $$\\begin{aligned}\np_{{\\mathrm{prin}} }^*:N_{{\\mathrm{prin}} } &\\to M^\\circ_{{\\mathrm{prin}} }\\\\\n(n,m) &\\mapsto \\left(p^*(n)-m,n\\right)\n\\end{aligned}$$ for $\\Gamma_{\\mathrm{prin}}$. Set\n$K_{{\\mathrm{prin}} }:=\\ker(p_{{\\mathrm{prin}} ,2}^*)$ and\n$K_{{\\mathrm{prin}} }^\\circ:= K_{{\\mathrm{prin}} }\\cap N^\\circ_{\\mathrm{prin}}$,\nwhere $p_{{\\mathrm{prin}} ,2}^*$ corresponds to the map $p_2^*$ in for\n$\\Gamma_{{\\mathrm{prin}} }$. We let $$\\label{eq:H_Aprin} \n  H_{\\mathcal{A}_{\\mathrm{prin}}}:= \\left\\{\\left.\\left(n,-(p^*)^*(n)\\right)\\in N^\\circ_{\\mathrm{prin}} \\, \\right| \\, n \\in N^\\circ\\right\\}.$$\n\nIt is straightforward to verify that $H_{\\mathcal{A}_{\\mathrm{prin}}}$\nis a saturated sublattice of $K^\\circ_{\\mathrm{prin}}$ that is\nisomorphic to $N^\\circ$. 
In particular, we have a quotient\n$\\mathcal{A}_{\\mathrm{prin}}/ T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}$\nendowed with an atlas of seed tori isomorphic to $T_M$ (indeed,\n$T_{N^\\circ_{{\\mathrm{prin}} }}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}\\cong T_{N^\\circ \\oplus M}/T_{N^\\circ}\\cong T_M$).\nThere is an isomorphism $$\\label{eq:def_chi}\n    \\chi : \\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}\\overset{\\sim}{\\longrightarrow} \\mathcal{X}$$\nrespecting the cluster tori of domain and range. The restriction of\n$\\chi$ to a seed torus is a monomial map whose pullback is given by\n$$\\begin{split} \n\\chi^*: N &\\to (H_{\\mathcal{A}_{\\mathrm{prin}}})^\\perp \n\\\\\nn &\\mapsto (p^*(n),n).\n \\end{split}$$ There is also a surjective map $$\\label{eq:def_tilde_p}\n    \\tilde{p}:\\mathcal{A}_{\\mathrm{prin}}\\to \\mathcal{X} .$$ respecting\nseed tori. The restriction of $\\tilde{p}$ to a seed torus is a monomial\nmap whose pullback is given by $$\\begin{aligned}\n    \\tilde{p}^*:  N &\\to M^\\circ_{{\\mathrm{prin}} }\\\\\n       \\ \\ n &\\mapsto  (p^*(n),n).\n\\end{aligned}$$ In particular, we have $\\tilde{p}= \\chi\\circ \\varpi$,\nwhere $$\\label{eq:def_varpi}\n\\varpi: \\mathcal{A}_{\\mathrm{prin}}\\to \\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}$$\nis the canonical projection.\n\nIt is also possible to describe $\\mathcal{A}$ as a fibre of\n$\\mathcal{X}_{\\mathrm{prin}}$. 
There is an injective map\n$$\\label{eq:def_xi}\n\\xi:\\mathcal{A} \\to \\mathcal{X}_{\\mathrm{prin}}$$ respecting seed tori.\nThe restriction of $\\xi$ to a seed torus is a monomial map whose\npullback is given by $$\\begin{aligned}\n\\xi^*: N_{{\\mathrm{prin}} } &\\to M^\\circ\n\\\\\n(n,m) &\\mapsto p^*(n)-m.\n\\end{aligned}$$ Let $$\\label{eq:H_Xprin} \nH_{\\mathcal{X}_{\\mathrm{prin}}}:= \\left\\{\\left(n,p^*(n)\\right)\\in N_{\\mathrm{prin}} \\mid n \\in N\\right\\}.$$\nIt is routine to check that $H_{\\mathcal{X}_{\\mathrm{prin}}}$ is a valid\nchoice to construct a fibration of $\\mathcal{X}_{\\mathrm{prin}}$ over\nthe torus $T_{H^*_{\\mathcal{X}_{\\mathrm{prin}}}}$. Hence, we can\nconsider the fibre\n$(\\mathcal{X}_{\\mathrm{prin}})_{\\bf 1}=(\\mathcal{X}_{\\mathrm{prin}})_{{\\bf 1}_{T_{H^*_{ \\mathcal{X}_{\\mathrm{prin}}}}}}$\nassociated to this fibration. There is an isomorphism\n$$\\label{eq:def_delta}\n\\delta:\n\\mathcal{A} \\overset{\\sim}{\\longrightarrow} (\\mathcal{X}_{\\mathrm{prin}})_{\\bf 1}$$\nrespecting seed tori. The restriction of $\\delta$ to a seed torus is a\nmonomial map whose pullback is given by $$\\begin{aligned}\n\\delta^*: N_{{\\mathrm{prin}} } /H_{\\mathcal{X}_{\\mathrm{prin}}} &\\to M^\\circ\n\\\\\n(n,m) + H_{\\mathcal{X}_{\\mathrm{prin}}} &\\mapsto p^*(n)-m.\n\\end{aligned}$$ In particular, we have that $$\\xi=\\iota \\circ \\delta,$$\nwhere\n$\\iota: (\\mathcal{X}_{\\mathrm{prin}})_{\\bf 1}\\hookrightarrow \\mathcal{X}_{\\mathrm{prin}}$\nis the canonical inclusion. For later reference we also introduce the\nmap $$\\label{eq:def_rho}\n\\rho: \\mathcal{X}_{\\mathrm{prin}}\\to \\mathcal{X} .$$ respecting seed\ntori. The restriction of $\\rho$ to a seed torus is a monomial map whose\npullback is given by $$\\begin{aligned}\n    \\rho^*:  N &\\to N_{{\\mathrm{prin}} }\\\\\n       \\ \\ n & \\mapsto  (n,0).\n\\end{aligned}$$ In particular,\n$\\rho \\circ p_{{\\mathrm{prin}} }= \\tilde{p}$. 
The maps we have\nconsidered so far fit into the following commutative diagram\n$$\\xymatrix{\n(\\mathcal{X}_{\\mathrm{prin}})_{\\bf 1} \\ar@{^{(}->}^{\\ \\iota}[r] & \\mathcal{X}_{\\mathrm{prin}}\\ar_{\\rho}[d] & \\mathcal{A}_{\\mathrm{prin}}\\ar_{p_{{\\mathrm{prin}} }}[l] \\ar@{->>}^{\\varpi}[d] \\ar_{\\tilde{p}}[dl] \\\\\n\\mathcal{A} \\ar^{\\delta}_{\\cong}[u] \\ar_{p}[r] \\ar^{\\xi}[ru] & \\mathcal{X} & \\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}. \\ar_{\\cong \\ \\ }^{\\chi \\ \\ }[l]\n}$$\n\n**Remark 5**. The maps introduced in this section are associated with\n$\\Gamma$, hence, we label the maps with the subindex $\\Gamma$ to stress\nthe fixed data $\\Gamma$ they are associated with.\n\n## Tropicalization\n\nIn this section we discuss tropicalizations of cluster varieties. We\nmainly follow , and .\n\nLet $T_L$ be the torus associated to a lattice $L$. A rational function\n$f$ on $T_L$ is called positive if it can be written as a fraction\n$f=f_1/f_2$, where both $f_1$ and $f_2$ are linear combinations of\ncharacters of $T_L$ with coefficients in $\\mathbb{Z} _{>0}$. The\ncollection of positive rational functions on $T_L$ forms a semifield\ninside $\\Bbbk(T_L)$ denoted by $Q_{\\rm sf}(L)$. A rational map\n$f:T_L\\dashrightarrow T_{L'}$ between two tori is a **positive rational\nmap** if the pullback $f^*:\\Bbbk(T_{L'}) \\to \\Bbbk(T_L)$ restricts to an\nisomorphism $f^*:Q_{\\rm sf}(L') \\to Q_{\\rm sf}(L)$. If $P$ is a\nsemifield, then the $P$-valued points of $T_L$ form the set\n$$\\label{eq:FG_tropicalization}\nT_L(P):=\\mathop{\\mathrm{Hom}}_{\\rm sf}(Q_{\\rm sf} (L), P)$$ of semifield\nhomomorphisms from $Q_{\\rm sf} (L)$ to $P$. In particular, a positive\nbirational isomorphism $\\mu:T\\dashrightarrow T'$ induces a bijection\n$$\\begin{aligned}\n\\mu_*: T(P) & \\to T'(P)\\\\\n h \\ & \\mapsto \\ h \\circ \\mu^*.    
\n\\end{aligned}$$ By a slight but common abuse of notation the sublattice\nof monomials of $Q_{\\rm sf}(L)$ is denoted by $L^*$. Considering $P$\njust as an abelian group, the restriction of an element of\n$T_L(P)$ to $L^*$ determines a canonical bijection\n$T_L(P) \\overset{\\sim}{\\longrightarrow} \\mathop{\\mathrm{Hom}}_{\\rm groups} (L^*, P)$.\n\n**Remark 6**. We systematically identify $T_L(P)$ with $L\\otimes P$ by\ncomposing the canonical bijection\n$T_L(P) \\overset{\\sim}{\\longrightarrow} \\mathop{\\mathrm{Hom}}_{\\rm groups} (L^*, P)$\nwith the canonical isomorphism\n$\\mathop{\\mathrm{Hom}}_{\\rm groups}(L^*, P) \\cong L \\otimes P$.\n\nLet $\\mathcal{V}$ be a (quotient or a fibre of a) cluster variety. For\nevery $\\textbf{s}, \\textbf{s}'\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}$ the gluing map\n$\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'}: \\mathcal{V} _{\\textbf{s}}\\dashrightarrow  \\mathcal{V} _{\\textbf{s}'}$\nis a positive rational map. 
So we can glue\n$\\mathcal{V} _{\\textbf{s}}(P)$ and $\\mathcal{V} _{\\textbf{s}'} (P)$\nusing $(\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})_*$ and define\n$$\\mathcal{V} (P):= \\coprod_{\\textbf{s}\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}} \\mathcal{V} _{\\textbf{s}}(P) / \\left(\\text{gluing by } (\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})_*\\right)_{\\textbf{s}, \\textbf{s}'\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}}.$$ Every point\n${\\bf a}\\in \\mathcal{V} (P)$ can be represented as a tuple\n$(a_{\\textbf{s}})_{\\textbf{s}\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}}$ such that\n$(\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})_*(a_\\textbf{s})=(a_{\\textbf{s}'})$\nfor all $\\textbf{s},\\textbf{s}'\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}$. Since all of the maps\n$(\\mu^\\mathcal{V} _{\\textbf{s},\\textbf{s}'})_*$ are bijections, the\nassignment $$\\label{not:tropical_space} \\begin{split} \n   \\mathfrak{r}_{\\textbf{s}}:\\mathcal{V} (P)&\\to \\mathcal{V} _{\\textbf{s}}(P)\\quad \\text{given by} \\quad {\\bf a}=(a_{\\textbf{s}})_{\\textbf{s}\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}} \\mapsto a_{\\textbf{s}}.\n  \\end{split}$$ determines an identification of $\\mathcal{V} (P)$ with\n$\\mathcal{V} _{\\textbf{s}}(P)$. 
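As a concrete illustration of $P$-valued points of a single torus, the following minimal sketch takes $P$ to be the integers with maximum as semifield addition and ordinary addition as semifield multiplication, and evaluates the semifield homomorphism attached to a lattice point on positive Laurent polynomials. The dictionary encoding of polynomials and the helper names `trop_eval`, `poly_add` and `poly_mul` are assumptions made for this example only; they are not notation from the text.

```python
from itertools import product

def dot(ell, x):
    """Pairing between an exponent vector ell and a lattice point x."""
    return sum(a * b for a, b in zip(ell, x))

def trop_eval(poly, x):
    """Value at poly of the max-plus semifield homomorphism attached to x.

    poly maps exponent tuples to positive integer coefficients; since the
    coefficients are positive, only the exponents matter.
    """
    return max(dot(ell, x) for ell in poly)

def poly_add(f, g):
    """Sum of two positive Laurent polynomials."""
    out = dict(f)
    for ell, c in g.items():
        out[ell] = out.get(ell, 0) + c
    return out

def poly_mul(f, g):
    """Product of two positive Laurent polynomials."""
    out = {}
    for (l1, c1), (l2, c2) in product(f.items(), g.items()):
        ell = tuple(a + b for a, b in zip(l1, l2))
        out[ell] = out.get(ell, 0) + c1 * c2
    return out

f = {(0, 0): 1, (1, 0): 1}   # 1 + z1
g = {(0, 1): 2, (1, 1): 1}   # 2*z2 + z1*z2
x = (3, -2)                  # a point of the lattice, in coordinates

# Addition of functions goes to max, multiplication goes to +.
sum_ok = trop_eval(poly_add(f, g), x) == max(trop_eval(f, x), trop_eval(g, x))
prod_ok = trop_eval(poly_mul(f, g), x) == trop_eval(f, x) + trop_eval(g, x)
# A positive rational function f/g is sent to the difference of the values.
quotient_value = trop_eval(f, x) - trop_eval(g, x)
```

Because all coefficients are positive, no cancellation can occur in sums or products, which is why the two checks hold for every choice of `f`, `g` and `x` in this encoding.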
If $S\\subset \\mathcal{V} (P)$ we let\n$$\\label{eq:identification}\nS_{\\textbf{s}}(P):=\\mathfrak{r}_{\\textbf{s}} (S) \\subset \\mathcal{V} _\\textbf{s}(P)$$\nand write $S_{\\textbf{s}}$ instead of $S_{\\textbf{s}}(P)$ when the\nsemifield $P$ is clear from the context.\n\nThe semifields we consider in this note are the integers, the rationals\nand the real numbers with their additive structure together with the\nsemifield operation determined by taking the maximum (respectively,\nminimum). We denote these semifields by $\\mathbb{Z} ^T$, $\\mathbb{Q} ^T$\nand $\\mathbb{R} ^T$ (respectively, $\\mathbb{Z} ^t$, $\\mathbb{Q} ^t$ and\n$\\mathbb{R} ^t$). The canonical inclusions\n$\\mathbb{Z} \\hookrightarrow \\mathbb{Q} \\hookrightarrow \\mathbb{R}$ give\nrise to canonical inclusions\n$$\\mathcal{V} (\\mathbb{Z} ^T) \\hookrightarrow \\mathcal{V} (\\mathbb{Q} ^T) \\hookrightarrow \\mathcal{V} (\\mathbb{R} ^T) \\quad \\quad \\text{ and } \\quad \\quad  \\mathcal{V} (\\mathbb{Z} ^t) \\hookrightarrow \\mathcal{V} (\\mathbb{Q} ^t) \\hookrightarrow \\mathcal{V} (\\mathbb{R} ^t).$$\nFor a set $S\\subseteq \\mathcal{V} (\\mathbb{R} ^T)$ (resp.\n$S\\subseteq \\mathcal{V} (\\mathbb{R} ^t)$) we let\n$S(\\mathbb{Z} ):= S\\cap \\mathcal{V} (\\mathbb{Z} ^T)$ (resp.\n$S(\\mathbb{Z} ):= S\\cap \\mathcal{V} (\\mathbb{Z} ^t)$). Moreover, for\n$G=\\mathbb{Z} , \\mathbb{Q}$ or $\\mathbb{R}$, the isomorphism of\nsemifields $G^T\\to G^t$ given by $x \\mapsto -x$ induces a canonical\nbijection $$\\begin{aligned}\n \\label{eq:imap}\n    i: \\mathcal{V} (G^T) \\rightarrow \\mathcal{V} (G^t). \n\\end{aligned}$$ Since $i$ amounts to a sign change (see Remark below),\nwe think of $i$ as an involution and denote its inverse again by $i$.\n\n**Remark 7**. 
The set $\\mathcal{V} (\\mathbb{Z} ^t)$ can be identified\nwith the **geometric tropicalization** of $\\mathcal{V}$, defined as\n$$\\mathcal{V} ^{\\mathrm{trop} }(\\mathbb{Z} ) \n     \\coloneqq \\{ \\text{divisorial discrete valuations } \\nu: \\Bbbk(\\mathcal{V} ) \\setminus \\{ 0\\} \\rightarrow \\mathbb Z \\mid \\nu (\\Omega_{\\mathcal{V} }) <0 \\} \\cup \\{ 0\\},$$\nwhere a discrete valuation is divisorial if it is given by the order of\nvanishing of a $\\mathbb{Z} _{>0}$-multiple of a prime divisor on some\nvariety birational to $\\mathcal{V}$.\n\n**Remark 8**. Let $G=\\mathbb{Z} , \\mathbb{Q}$ or $\\mathbb{R}$.\nIdentifying $\\mathcal{V} (G^T)$ with $\\mathcal{V} _{\\textbf{s}}(G^T)$\nvia the bijection $\\mathfrak{r}_\\textbf{s}$ the map $i$ in can be\nthought of as the multiplication by $-1$ (*cf.* Remark ).\n\nA positive rational function $g$ on $\\mathcal{V}$ is a rational function\non $\\mathcal{V}$ such that the restriction of $g$ to every seed torus\n$\\mathcal{V} _{\\textbf{s}}$ is a positive rational function.\n\n**Definition 9**. The **tropicalization** of a positive rational\nfunction $g: \\mathcal{V} \\dashrightarrow \\Bbbk$ with respect to\n$\\mathbb{R} ^T$ is the function\n$g^T:\\mathcal{V} (\\mathbb{R} ^T)\\to \\mathbb{R}$ given by\n$$\\label{eq:restriction}\n{\\bf a}\\mapsto a_{\\textbf{s}}(g),$$ where\n${\\bf a}=(a_{\\textbf{s}})_{\\textbf{s}\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}}$. The tropicalization\nof $g$ with respect to $\\mathbb{R} ^t$ is the function\n$g^t:\\mathcal{V} (\\mathbb{R} ^t)\\to \\mathbb{R}$ defined as\n$${\\bf v} \\mapsto -v_{\\textbf{s}}(g),$$\n\nwhere ${\\bf v}=(v_{\\textbf{s}})_{\\textbf{s}\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}}$. A direct computation\nshows that both $g^T$ and $g^t$ are well defined. 
Namely, one checks\nthat for $\\textbf{s},\\textbf{s}'\\in %\n  \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n    \\vbox to0ex{\\kern-0.5\\ex@\n    \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}$\n$$a_{\\textbf{s}} (g)=a_{\\textbf{s}'}(g),$$ where in the left (resp.\nright) side of the equality we think of $g$ as a rational function on\n$\\mathcal{V} _\\textbf{s}$ (resp. $\\mathcal{V} _{\\textbf{s}'}$).\nMoreover, we have that $$\\label{eq:comparing_tropicalizations}\n    g^T({\\bf a})=g^t(i({\\bf a})),$$ for all\n${\\bf a} \\in \\mathcal{V} (\\mathbb{R} ^T)$.\n\n**Remark 10**. In order to keep notation lighter we adopt the following\nconventions:\n\n-   given a positive rational function\n    $g\\in \\Bbbk (\\mathcal{V} )=\\Bbbk(\\mathcal{V} _\\textbf{s})$ the\n    tropicalizations of $g$ with domains $\\mathcal{V} (\\mathbb{R} ^T)$\n    and $\\mathcal{V} _{\\textbf{s}}(\\mathbb{R} ^T)$ are denoted by the\n    same symbol $g^T$ for all $\\textbf{s}\\in %\n      \\mathrel{\\mathop{\\mathbb{T}_r}\\limits^{\n        \\vbox to0ex{\\kern-0.5\\ex@\n        \\hbox{$\\scriptstyle\\longrightarrow$}\\vss}}}$;\n\n-   the restriction of $g^T$ (resp. $g^t$) to\n    $\\mathcal{V} (\\mathbb{Z} ^T)$ (resp. $\\mathcal{V} (\\mathbb{Z} ^t)$)\n    is also denoted by $g^T$ (resp. $g^t$);\n\n-   when $P$ is one of $\\mathbb{Z} ^T, \\mathbb{Q} ^T$ or $\\mathbb{R} ^T$\n    (resp. $\\mathbb{Z} ^t, \\mathbb{Q} ^t$ or $\\mathbb{R} ^t$) the map\n    $(\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})_*$ is denoted by\n    $(\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})^T$ (resp.\n    $(\\mu^{\\mathcal{V} }_{\\textbf{s}, \\textbf{s}'})^t$).\n\n**Remark 11**. Later we will need to systematically consider\n$\\mathcal{V} (\\mathbb{R} ^t)$ when $\\mathcal{V}$ is a variety of the\nform $\\mathcal{A}$ or $\\mathcal{A} /T_H$ and\n$\\mathcal{V} (\\mathbb{R} ^T)$ when $\\mathcal{V}$ is a variety of the\nform $\\mathcal{X}$ or $\\mathcal{X} _{\\bf 1}$. 
In particular, from § on\nwe use the notation $\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} )$ that\ntakes into account the different kinds of tropicalizations that we use\nfor different kinds of varieties, see equation .\n\nFor later use we record the following formulae associated to the\nmutations determined by $\\Gamma$: $$\\label{eq:tropical_A_mutation}\n    \\left(\\mu^{\\mathcal{A} }_{k}\\right)^T(n)=n+[\\langle v_k,n\\rangle]_+(-d_ke_k)$$\nand $$\\label{eq:tropical_X_mutation}\n    \\left(\\mu^{\\mathcal{X} }_{k}\\right)^T(m)=m+[\\langle d_ke_k,m \\rangle]_+v_k.$$\nIn case we tropicalize these mutations with respect to $\\mathbb{R} ^t$\nwe replace $[\\ \\cdot\\ ]_+$ by $[\\ \\cdot\\ ]_-$.\n\nFinally, if we think of $T_L (\\mathbb{R} ^T)$ (resp.\n$T_L(\\mathbb{R} ^t)$) as a vector space (see Remark ), the\ntropicalization of a positive Laurent polynomial\n$g= \\sum_{\\ell\\in L^*}c_{\\ell} z^{\\ell} \\in Q_{\\rm sf}(L)$ with respect\nto $\\mathbb{R} ^T$ (resp. $\\mathbb{R} ^t$) is the function\n$g^T:  T_L(\\mathbb{R} ^T) \\to \\mathbb{R}$ (resp.\n$g^t:  T_L(\\mathbb{R} ^t) \\to \\mathbb{R}$) given by $$\\begin{aligned}\nx &\\mapsto& - \\max \\{ \\langle \\ell , x  \\rangle \\mid \\ell\\in L^* \\text{ such that } c_{\\ell} \\neq 0 \\}\\\\\n (\\text{resp. 
} x &\\mapsto& \\min \\{ \\langle \\ell , x  \\rangle \\mid \\ell\\in L^* \\text{ such that } c_\\ell \\neq 0 \\}).\n\\end{aligned}$$\n\n# Theta functions and their labeling by tropical points\n\n## Fock–Goncharov duality\n\nFor\n$\\Gamma=(I, I_{\\text{uf}}, N,N^{\\circ}, M, M^{\\circ}, \\{ \\cdot, \\cdot \\}, \\{d_i\\}_{i\\in I} )$\nthe Langlands dual fixed data is\n$\\Gamma^\\vee=(I, I_{\\text{uf}}, N^\\vee, (N^\\vee)^{\\circ}, M^\\vee, (M^\\vee)^{\\circ}, \\{ \\cdot, \\cdot \\}^\\vee, \\{d^\\vee_i\\}_{i\\in I} )$,\nwhere $d:=\\text{lcm}(d_i)_{i\\in I}$,\n$$N^\\vee = N^\\circ, \\quad  (N^\\vee)^{\\circ}= d\\cdot N, \\quad  M^\\vee = M^\\circ, \\quad (M^\\vee)^{\\circ}=d^{-1}\\cdot M, \\quad \\{\\cdot, \\cdot \\}^\\vee= d^{-1}\\{\\cdot, \\cdot \\} \\quad \\text{and} \\quad d^\\vee_i:=d\\,d_i^{-1}.$$\nIf $\\textbf{s}=(e_i)_{ i\\in I}$ is a seed for $\\Gamma$ then the\nLanglands dual seed is $\\textbf{s}^\\vee:=(e_i^\\vee)_{i\\in I}$, where\n$e_i^\\vee:=d_ie_i$. We also set $v^\\vee_i:=\\{e^\\vee_i, \\cdot \\}^\\vee$.\nThese constructions give rise to **Langlands dual cluster varieties**\nwhich we denote as follows $$\\begin{aligned}\n    \\begin{array}{l l l l}\n    {}^L(\\mathcal{A} _{\\Gamma;\\textbf{s}_0}) := \\mathcal{A} _{\\Gamma^\\vee;\\textbf{s}_0^\\vee} \\qquad \\qquad &     \\text{and} \\qquad \\qquad & {}^L(\\mathcal{X} _{\\Gamma; \\textbf{s}_0}) := \\mathcal{X} _{\\Gamma^\\vee; \\textbf{s}_0^\\vee}.\n\\end{array}\n\\end{aligned}$$ Since $\\Gamma$ and $\\textbf{s}_0$ were already fixed, we\ndenote ${}^L(\\mathcal{A} _{\\Gamma;\\textbf{s}_0})$ (resp.\n${}^L(\\mathcal{X} _{\\Gamma;\\textbf{s}_0})$) simply by\n${}^{L}\\mathcal{A}$ (resp. ${}^{L}\\mathcal{X}$).\n\n**Definition 12**. 
The **Fock–Goncharov dual** of $\\mathcal{A}$ (resp.\n$\\mathcal{X}$) is the cluster variety $\\mathcal{A} ^{\\vee}$ (resp.\n$\\mathcal{X} ^{\\vee}$) given by\n$$\\mathcal{A} ^{\\vee} := {}^L\\mathcal{X} \\qquad \\qquad      \\text{and} \\qquad \\qquad  \\mathcal{X} ^{\\vee} := {}^L\\mathcal{A} .$$\n\nIn particular, we have that\n$$\\mathcal{A}_{\\mathrm{prin}}^\\vee = {}^L(\\mathcal{X} _{\\mathrm{prin}} )=\\mathcal{X} _{(\\Gamma_{\\mathrm{prin}} )^\\vee} \\qquad \\qquad     \\mathcal{X}_{\\mathrm{prin}}^{\\vee}= {}^L(\\mathcal{A}_{\\mathrm{prin}})=\\mathcal{A} _{(\\Gamma_{\\mathrm{prin}} )^\\vee}.$$\n\n**Remark 13**. Notice that\n$\\mathcal{A} _{(\\Gamma_{{\\mathrm{prin}} })^\\vee}$ (resp.\n$\\mathcal{X} _{(\\Gamma_{{\\mathrm{prin}} })^\\vee}$) is canonically\nisomorphic to $\\mathcal{A} _{(\\Gamma^\\vee)_{{\\mathrm{prin}} }}$ (resp.\n$\\mathcal{X} _{(\\Gamma^\\vee)_{{\\mathrm{prin}} }}$). Hence, we frequently\nidentify these schemes without making reference to the canonical\nisomorphisms between them.\n\nIt is not hard to see that the map $$\\begin{aligned}\n\\label{eq:L p}\n    {(p^\\vee)^*:= -d^{-1}(p^*)^*:N^\\vee \\to (M^\\vee)^{\\circ}}\n\\end{aligned}$$ is well defined and is a cluster ensemble lattice map\nfor the Langlands dual data ${}^{L}\\Gamma$, where $(p^*)^*$ is the\nlattice map dual to $p^*$. Indeed, in the bases for $N^\\vee$ and\n$(M^\\vee)^{\\circ}$ determined by $\\textbf{s}^\\vee$, and in comparison\nwith the matrix $B_{p^*;\\mathbf{s}}$ in , the matrix of $(p^\\vee)^*$ is\nof the form\n$$B_{(p^\\vee)^*;\\textbf{s}^\\vee}= -B_{p^*;\\textbf{s}}^{\\rm{tr}}.$$ In\nparticular, we have an associated dual cluster ensemble map\n$$p^\\vee:\\mathcal{A} ^\\vee \\to \\mathcal{X} ^\\vee.$$\n\nWe proceed to introduce the Fock–Goncharov dual for a quotient of\n$\\mathcal{A}$. So consider a cluster ensemble lattice map\n$p^*:N\\to M^{\\circ}$ for $\\Gamma$ and the cluster ensemble lattice map\n$(p^\\vee)^*:N^\\vee\\to (M^\\vee)^\\circ$ for $\\Gamma^\\vee$. 
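The Langlands dual fixed data introduced at the start of this subsection can be computed mechanically from a seed. The sketch below is only an illustration under stated assumptions: the form is encoded by its Gram matrix on the seed, the exchange matrix is taken to be $\\epsilon_{ij}=\\{e_i,e_j\\}d_j$ (a common convention that is not spelled out in this section), and the function names `langlands_dual` and `exchange_matrix` are hypothetical. In the rank-two example it checks that the exchange matrix of the dual data is the negative transpose of the original one, in line with the negative-transpose relation stated above for the matrix of $(p^\\vee)^*$.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def langlands_dual(form, d):
    """Gram matrix and multipliers of the Langlands dual data.

    form[i][j] = {e_i, e_j} on the seed (e_i); d[i] = d_i.
    Uses e_i^vee = d_i e_i, {.,.}^vee = {.,.}/d and d_i^vee = d/d_i,
    where d = lcm of the d_i.
    """
    r = len(d)
    big_d = reduce(lcm, d)
    d_dual = [big_d // di for di in d]
    form_dual = [[Fraction(d[i] * d[j], big_d) * form[i][j] for j in range(r)]
                 for i in range(r)]
    return form_dual, d_dual

def exchange_matrix(form, d):
    """Assumed convention: epsilon_ij = {e_i, e_j} * d_j."""
    r = len(d)
    return [[form[i][j] * d[j] for j in range(r)] for i in range(r)]

# Rank-two example with multipliers (1, 2) and {e_1, e_2} = 1.
form = [[0, 1], [-1, 0]]
d = [1, 2]

eps = exchange_matrix(form, d)
form_dual, d_dual = langlands_dual(form, d)
eps_dual = exchange_matrix(form_dual, d_dual)
neg_transpose = [[-eps[j][i] for j in range(2)] for i in range(2)]
```

Exact rational arithmetic via `Fraction` is used because the dual form is obtained by dividing by $d$, so intermediate values need not be integers in general.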
Recall from\nthat $K=\\ker(p_2^*)$. Similarly, we set\n\n$$K^\\vee=\\ker((p^\\vee)_2^*)=\\{k\\in N^\\circ\\mid \\{k,n\\}=0 \\text{ for all } n\\in d\\cdot N_{\\rm uf}\\},$$\nwhere $(p^\\vee)_2^*$ is the map $p^*_2$ of for $\\Gamma^\\vee$. Let\n$H_{\\mathcal{A} }\\subseteq K^\\circ$ be a saturated sublattice and\nconsider the quotient $\\mathcal{A} /T_{H_\\mathcal{A} }$. Recall from §\nthat $\\mathcal{A} /T_{H_\\mathcal{A} }$ is obtained by gluing tori of the\nform $T_{N^{\\circ}/H_{\\mathcal{A} }}$. Since\n$N^{\\circ}/H_{\\mathcal{A} }$ and\n$H_{\\mathcal{A} }^{\\perp} \\subset M^\\circ$ are dual lattices the\nFock–Goncharov dual of $\\mathcal{A} /T_{H_{\\mathcal{A} }}$ should be a\nfibre of $\\mathcal{A} ^\\vee$ obtained by gluing tori of the form\n$T_{H_{\\mathcal{A} }^{\\perp}}$. In order to construct it notice that for\n$n$ in $N_{\\text{uf}}$ we have\n$\\langle k,p^*(n)\\rangle = -{d}^{-1}\\{k,dn\\}=\\langle dk,-(p^\\vee)^*(n)\\rangle$.\nThis implies that $$K^\\circ=p^*(N_{\\rm uf})^\\perp=K^\\vee.$$ In\nparticular, $H_{\\mathcal{A} }$ is a saturated sublattice of $K^\\vee$ as\nit is saturated in $K^\\circ$. 
It is therefore possible to find\n$T_{H_{\\mathcal{A} }^*}$ as the base of a fibration of the form for\n$\\mathcal{A} ^{\\vee}$ as we are allowed to set\n$$H_{\\mathcal{A} ^{\\vee}}=H_{\\mathcal{A} }\\subseteq K^\\vee.$$ So\nconsider the fibration\n$$w_{H_{\\mathcal{A} }}:\\mathcal{A} ^{\\vee}\\to T_{H_\\mathcal{A} ^*}.$$\nNotice that the fibre\n$(\\mathcal{A} ^{\\vee})_{{\\bf 1}_{T_{H_\\mathcal{A} ^*}}}$ is obtained\nby gluing tori of the form $T_{H_{\\mathcal{A} }^\\perp}$ as desired.\nTherefore, we define the Fock–Goncharov dual of the quotient\n$\\mathcal{A} /T_{H_\\mathcal{A} }$ as\n$$(\\mathcal{A} /T_{H_{\\mathcal{A} }})^{\\vee}:= (\\mathcal{A} ^{\\vee})_{{\\bf 1}_{T_{H_\\mathcal{A} ^*}}}=\\left({}^L\\mathcal{X} \\right)_{{\\bf 1}_{T_{H_\\mathcal{A} ^*}}}.$$\n\nSimilarly, let $H_{\\mathcal{X} }\\subseteq K$ be a saturated sublattice\nand let $w_{H_{\\mathcal{X} }}:\\mathcal{X} \\to T_{H^*_\\mathcal{X} }$ be\nthe associated fibration. Recall that\n$\\mathcal{X} _{{\\bf 1}_{T_{H^*_\\mathcal{X} }}}$ is obtained by gluing\ntori of the form $T_{H_{\\mathcal{X} }^{\\perp}}$. Its Fock–Goncharov dual\nis a quotient of $\\mathcal{X} ^{\\vee}$ glued from tori of the form\n$T_{(H^\\perp_\\mathcal{X} )^*}$ which we construct next. A direct\ncomputation shows that $d\\cdot H_{\\mathcal{X} }$ is a saturated\nsublattice of $(K^\\vee)^\\circ$. In particular, we are allowed to choose\n$$H_{\\mathcal{X} ^{\\vee}}= d\\cdot H_{\\mathcal{X} }\\subseteq (K^\\vee)^\\circ$$\nas a sublattice giving rise to a quotient\n${}^{L}\\mathcal{A} /T_{d\\cdot H_{\\mathcal{X} }}$. 
This quotient is\nobtained by gluing tori of the form\n$T_{d\\cdot N}/T_{d\\cdot H_\\mathcal{X} }\\cong T_{N/H_\\mathcal{X} }\\cong T_{(H^{\\perp}_\\mathcal{X} )^*}$.\nTherefore, we define the Fock–Goncharov dual of\n$\\mathcal{X} _{{\\bf 1}_{T_{H^*_{\\mathcal{X} }}}}$ as\n$$\\left(\\mathcal{X} _{{\\bf 1}_{T_{H^*_{\\mathcal{X} }}}}\\right)^{\\vee}:= \\mathcal{X} ^{\\vee}/T_{ H_{\\mathcal{X} ^{\\vee}}}={}^L\\mathcal{A} /T_{d\\cdot H_\\mathcal{X} }.$$\n\nIn what follows, when we consider a saturated sublattice $H$ of\n$K^\\circ$ and write expressions such as $\\mathcal{A} /T_{H}$ or\n$w_{H}:\\mathcal{A} ^{\\vee}\\to T_{H^*}$ we will be implicitly assuming\nthat we have set $$H_{\\mathcal{A} }= H = H_{\\mathcal{A} ^{\\vee}}.$$\nSimilarly, when $H$ is a saturated sublattice of $K$ and we write\nexpressions such as $w_{H}:\\mathcal{X} \\to T_{H^*}$,\n$\\mathcal{X} _{\\bf 1}$ or $\\left(\\mathcal{X} _{\\bf 1}\\right)^{\\vee}$ we\nwill be implicitly assuming that we have set\n$$H_{\\mathcal{X} }= H = d^{-1}\\cdot H_{\\mathcal{X} ^{\\vee}},$$\n$$\\quad  \\mathcal{X} _{\\bf 1}= \\mathcal{X} _{{\\bf 1}_{T_{H^*_{\\mathcal{X} }}}}\\quad \\quad \\text{and} \\quad \\quad \\left(\\mathcal{X} _{\\bf 1}\\right)^{\\vee}= \\mathcal{X} ^{\\vee}/T_{H_{\\mathcal{X} ^{\\vee}}}.$$\n\n**Remark 14**. Let $\\mathcal{V}$ be (a quotient of) $\\mathcal{A}$ or (a\nfibre of) $\\mathcal{X}$. In the skew-symmetric case Argüz and Bousseau\nshowed that $\\mathcal{V}$ and $\\mathcal{V} ^{\\vee}$ are mirror dual\nschemes from the point of view of . A similar result is proven for the\nskew-symmetrizable case when $\\mathcal{V}$ has dimension $2$ in with\narguments that may be generalized to arbitrary dimension.\n\n## Scattering diagrams and theta functions\n\nTheta functions are a particular class of global functions on (quotients\nand fibres of) cluster varieties introduced in . In this subsection we\noutline their construction. 
The main case to consider is the one of\n$\\mathcal{A}_{\\mathrm{prin}}$ since scattering diagrams and theta\nfunctions for (quotients of) $\\mathcal{A}$ and (fibres of) $\\mathcal{X}$\ncan be constructed from this case.\n\n**Remark 15**. From now on, whenever we consider the variety\n$\\mathcal{A} =\\mathcal{A} _{\\Gamma,\\textbf{s}_0}$ we will assume\n$\\Gamma$ is of **full-rank**. By definition this means that the map\n$p_1^*:N_{\\text{uf}}\\to M^\\circ$ given by $n \\mapsto \\{ n , \\cdot \\}$ is\ninjective. There are various results of this article for $\\mathcal{A}$\nthat are valid even if $\\Gamma$ is not of full-rank. However, various\nkey results we shall use do need the full-rank condition (*cf.* Remark\n). Even though we are imposing the full-rank assumption, we will\nfrequently recall it in order to stress its necessity.\n\n### Theta functions on full-rank $\\mathcal{A}$\n\nThroughout this section we systematically identify\n$\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)$ with\n$M^\\circ_{\\mathbb{R} }$, see §. A **wall** in $M^{\\circ}_\\mathbb{R}$ is\na pair $(\\mathfrak{d}, f_{\\mathfrak{d}})$ where\n$\\mathfrak{d}\\subseteq M^{\\circ}_\\mathbb{R}$ is a convex rational\npolyhedral cone of codimension one, contained in $n^{\\perp}$ for some\n$n \\in N_{\\operatorname{uf}, \\textbf{s}}^+$, and\n$f_{\\mathfrak{d}}  = 1+ \\sum_{k \\geq 1} c_k z^{kp^*_1(n)}$ is called a\n**scattering function**, where $c_k \\in \\Bbbk$. 
A **scattering diagram**\n$\\mathfrak{D}$ in $M^{\\circ}_\\mathbb{R}$ is a (possibly infinite)\ncollection of walls satisfying a certain finiteness condition (see ).\nThe **support** and the **singular locus** of $\\mathfrak{D}$ are defined\nas\n$$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}):= \\bigcup_{\\mathfrak{d}\\in \\mathfrak{D}} \\mathfrak{d}\\ \\ \\ \\text{and} \\ \\ \\ \\mathop{\\mathrm{Sing}}(\\mathfrak{D}):= \\bigcup_{\\mathfrak{d}\\in \\mathfrak{D}} \\partial\\mathfrak{d}\\ \\cup \\bigcup_{\\overset{\\mathfrak{d}_1,\\mathfrak{d}_2 \\in \\mathfrak{D}}{\\text{dim}(\\mathfrak{d}_1 \\cap \\mathfrak{d}_2) = |I|-2}} \\mathfrak{d}_1 \\cap \\mathfrak{d}_2.$$\n\nA wall $(\\mathfrak{d}, f_{\\mathfrak{d}})$ defines a **wall-crossing\nautomorphism** $\\mathfrak{p}_{\\mathfrak{d}}$ of $\\Bbbk (M)$ given on a\ngenerator $z^m$ by\n$\\mathfrak{p}_{\\mathfrak{d}}(z^m)=z^m f_{\\mathfrak{d}}^{\\langle n_{\\mathfrak{d}}, m \\rangle }$,\nwhere $n_{\\mathfrak{d}}$ is the primitive normal vector of the wall\n$\\mathfrak{d}$ with a choice of direction going against the flow of the\npath $\\gamma$. If we fix a scattering diagram $\\mathfrak{D}$ and a\npiecewise linear proper map\n$\\gamma:[0,1]\\to M^\\circ_{\\mathbb{R} }\\setminus \\mathop{\\mathrm{Sing}}(\\mathfrak{D})$\nintersecting $\\text{Supp}( \\mathfrak{D})$ transversely then the **path\nordered product** $\\mathfrak{p}_{\\gamma , \\mathfrak{D}}$ is defined as\nthe composition of automorphisms of the form\n$\\mathfrak{p}_{\\mathfrak{d}}$, where we consider the walls\n$\\mathfrak{d}$ that are transversely crossed by $\\gamma$. Observe\nthat $\\gamma$ might cross infinitely many walls, in which case we would\nbe composing infinitely many automorphisms; such an infinite composition\nis nevertheless well defined. Again, the\nreader is referred to for a detailed discussion.\n\n**Definition 16**. 
A scattering diagram $\\mathfrak{D}$ is **consistent**\nif for all $\\gamma$ as above $\\mathfrak{p}_{\\gamma, \\mathfrak{D}}$ only\ndepends on the endpoints of $\\gamma$. Two scattering diagrams\n$\\mathfrak{D}$ and $\\mathfrak{D}'$ are **equivalent** if\n$\\mathfrak{p}_{\\gamma, \\mathfrak{D}}= \\mathfrak{p}_{\\gamma, \\mathfrak{D}'}$\nfor all $\\gamma$.\n\nTo define cluster scattering diagrams for $\\mathcal{A}$ one first\nconsiders\n$$\\mathfrak{D}_{{\\rm in}, \\textbf{s}}^{\\mathcal{A} } := \\left\\{\\left.\\left( e_i^{\\perp} , 1+z^{ p_{1}^*\\left( e_i \\right)}\\right) \\right| \\  i \\in I_{\\text{uf}}\\right\\}.$$\nA **cluster scattering diagram** for $\\mathcal{A}$ is a consistent\nscattering diagram in $M^{\\circ}_{ \\mathbb{R} }$ containing\n$\\mathfrak{D}_{{\\rm in}, \\textbf{s}}^{\\mathcal{A} }$. By the following\ntheorem, cluster scattering diagrams for $\\mathcal{A}$ do exist\n(provided $\\Gamma$ is of full-rank).\n\n**Theorem 17**. Assume $\\Gamma$ is of full-rank. Then for every seed\n$\\textbf{s}$ there is a consistent scattering diagram\n$\\mathfrak{D}_{\\textbf{s}}^{\\mathcal{A} }$ such that\n$\\mathfrak{D}_{{\\rm in}, \\textbf{s}}^{\\mathcal{A} } \\subset \\mathfrak{D}_{\\textbf{s}}^{\\mathcal{A} }$.\nFurthermore $\\mathfrak{D}_{\\textbf{s}}^{\\mathcal{A} }$ is equivalent to\na scattering diagram all of whose scattering functions are of the form\n${ f_{\\mathfrak{d}} = (1+ z^{p_{1}^*(n)})^c}$, for some $n \\in N$, and\n$c$ a positive integer.\n\n**Definition 18**.\n\nFix a cluster scattering diagram\n$\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}}$. Let\n$m\\in M^{\\circ} \\setminus \\{0\\}$ and\n$x_0 \\in M^{\\circ}_{\\mathbb{R} } \\setminus \\text{Supp}(\\mathfrak{D})$. 
A\n(generic) **broken line** for $\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}}$\nwith initial exponent $m$ and endpoint $x_0$ is a piecewise linear\ncontinuous proper path\n$\\gamma : ( - \\infty , 0 ] \\rightarrow M^\\circ_{\\mathbb{R} } \\setminus \\mathop{\\mathrm{Sing}}(\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}})$\nbending only at walls, with a finite number of domains of linearity $L$\nand a monomial $c_L z^{m_L} \\in \\Bbbk[M^\\circ]$ for each of these\ndomains. The path $\\gamma$ and the monomials $c_L z^{m_L}$ are required\nto satisfy the following conditions:\n\n-   $\\gamma(0) = x_0$.\n\n-   If $L$ is the unique unbounded domain of linearity of $\\gamma$, then\n    $c_L z^{m_L} = z^{m}$.\n\n-   For $t$ in a domain of linearity $L$, $\\gamma'(t) = -m_L$.\n\n-   If $\\gamma$ bends at a time $t$, passing from the domain of\n    linearity $L$ to $L'$ then $c_{L'}z^{m_{L'}}$ is a term in\n    $\\mathfrak{p}_{{\\gamma}|_{(t-\\epsilon,t+\\epsilon)},\\mathfrak{D}_t} (c_L z^{m_L})$,\n    where\n    ${\\mathfrak{D}_t = \\left\\{\\left.(\\mathfrak{d}, f_{\\mathfrak{d}}) \\in \\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}}\\right| \\gamma (t) \\in \\mathfrak{d}\\right\\}}$.\n\nWe refer to $m_L$ as the **slope** or **exponent vector** of $\\gamma$ at\n$L$ and set\n\n-   $I(\\gamma) = m$;\n\n-   $\\text{Mono} (\\gamma) = c(\\gamma)z^{F(\\gamma)}$ to be the monomial\n    attached to the unique domain of linearity of $\\gamma$ having $x_0$\n    as an endpoint.\n\n**Definition 19**.\n\nChoose a point $x_0$ in the interior of\n$\\mathcal{C}_\\textbf{s}^+:=\\{m\\in M^{\\circ}_{\\mathbb{R} }\\mid \\langle e_i, m \\rangle \\geq 0 \\text{ for all  } i \\in I_{\\text{uf}}\\}$\nand let\n$m\\in \\mathcal{A} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)=M^\\circ$. 
The\n**theta function** on $\\mathcal{A}$ associated to $m$ is $$\\label{eq:tf}\n    \\vartheta ^{\\mathcal{A} }_{ m}:= \\sum_{\\gamma} \\text{Mono} (\\gamma),$$\nwhere the sum is over all broken lines $\\gamma$ with $I(\\gamma)=m$ and\n$\\gamma(0)=x_0$. For $m= 0$ we define $\\vartheta^{\\mathcal{A} }_{0} =1$.\nWe say $\\vartheta ^{\\mathcal{A} }_{ m}$ is **polynomial** if the sum in\nis finite.\n\n**Remark 20**. It is a nontrivial fact that\n$\\vartheta ^{\\mathcal{A} }_{m}$ is independent of the point\n$x_0\\in \\mathcal{C}_{\\textbf{s}}^+$ we have chosen, see . Moreover, in\ngeneral $\\vartheta ^\\mathcal{A} _{m}$ can be an infinite sum and in\norder to think of $\\vartheta ^{\\mathcal{A} }_{m}$ as a function on a\nspace one needs to work formally and consider a degeneration of\n$\\mathcal{A}$, see for the details. However, if\n$\\vartheta ^{\\mathcal{A} }_{m}$ is polynomial then\n$\\vartheta ^{\\mathcal{A} }_{m}\\in H^0(\\mathcal{A} ,\\mathcal{O}_{\\mathcal{A} })$,\nthat is, $\\vartheta ^{\\mathcal{A} }_{m}$ is an algebraic function on\n$\\mathcal{A}$. The definition of $\\vartheta ^{\\mathcal{A} }_{m}$ in\ncorresponds to the expression of this function written in the\ncoordinates of the seed torus $\\mathcal{A} _{\\textbf{s}}$.\n\n### Labeling by tropical points\n\nRecall that we are identifying\n$\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)$ and\n$M^{\\circ}_{\\mathbb{R} }$. By construction, a theta function on\n$\\mathcal{A}$ is labeled by a point\n$m\\in \\mathcal{A} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)=  M^{\\circ}$.\nBy , this labeling upgrades to a labeling by a point in\n$\\mathcal{A} ^\\vee(\\mathbb{Z} ^T)$. 
The main point is that if we let\n$m'=(\\mu^{\\mathcal{A} ^\\vee}_{k})^T(m)\\in \\mathcal{A} ^{\\vee}_{\\mu_k(\\textbf{s}^\\vee)}(\\mathbb{Z} ^T)$\nfor $k \\in I_{\\text{uf}}$ then $\\vartheta ^{\\mathcal{A} }_m$ and\n$\\vartheta ^{\\mathcal{A} }_{m'}$ correspond to the same function (see\nRemark ) expressed, however, in different cluster coordinates. This fact\nis of great importance for this paper so we would like to highlight it:\n\n*every theta function on $\\mathcal{A}$ is naturally labeled by a point\nof $\\mathcal{A} ^\\vee(\\mathbb{Z} ^T)$*.\n\nIn light of the discussion just above, from now on we label theta\nfunctions on $\\mathcal{A}$ either by elements of\n$\\mathcal{A} ^{\\vee}(\\mathbb{Z} ^T)$ or of\n$\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)$. For the sake of\nclarity, tropical points are denoted in bold font and as tuples. That\nis, ${\\bf m}=(m_{\\textbf{s}^\\vee})_{\\textbf{s}^\\vee\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}}$ denotes an element of\n$\\mathcal{A} ^\\vee(\\mathbb{Z} ^T)$ and\n$m_{\\textbf{s}^\\vee}=\\mathfrak{r}_{\\textbf{s}^\\vee}({\\bf m})$. With this\nnotation we have the following identity\n$$\\vartheta ^{\\mathcal{A} }_{\\bf m}=\\vartheta ^{\\mathcal{A} }_{m_{\\textbf{s}^\\vee}}.$$\n\nEven further, we can think of\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}})$ as a\nsubset of $\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)$. By we\nhave that for every $k \\in I_{\\text{uf}}$,\n$\\mu^{\\mathcal{A} ^\\vee}_{\\textbf{s}^\\vee,\\mu_k(\\textbf{s}^\\vee)}\\left(\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}})\\right)=\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} }_{\\mu_k(\\textbf{s})})$\nand that $\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}}$ and\n$\\mathfrak{D}^{\\mathcal{A} }_{\\mu_k(\\textbf{s})}$ are equivalent. 
Hence\nthere is a well defined subset\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^\\mathcal{A} ) \\subset \\mathcal{A} ^{\\vee}(\\mathbb{R} ^T)$\nsuch that\n$$\\mathfrak{r}_{\\textbf{s}^\\vee}\\left(\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^\\mathcal{A} ) \\right)= \\mathop{\\mathrm{Supp}}(\\mathfrak{D}^\\mathcal{A} _{\\textbf{s}})$$\nfor every $\\textbf{s}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$. The point here is that\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^\\mathcal{A} )$ is seed independent.\nSimilarly, the map\n$\\mu^{\\mathcal{A} ^\\vee}_{\\textbf{s}^\\vee,\\mu_k(\\textbf{s}^\\vee)}$\ndetermines a bijection between the set of broken lines for\n$\\mathfrak{D}^{\\mathcal{A} }_{\\textbf{s}}$ and the set of broken lines\nfor $\\mathfrak{D}^{\\mathcal{A} }_{\\mu_k(\\textbf{s})}$ (see ). In\nparticular, supports of broken lines make sense in\n$\\mathcal{A} ^\\vee(\\mathbb{R} ^T)$.\n\n**Remark 21**. It is possible to upgrade\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^\\mathcal{A} )$ to a scattering\ndiagram inside $\\mathcal{A} ^{\\vee}(\\mathbb{R} ^T)$. In this generality\nscattering functions are described using log Gromov–Witten invariants.\nSee for details.\n\n### The middle cluster algebra\n\nLet us recall now that broken lines also encode the multiplication of\ntheta functions. That is, given a product of arbitrary theta functions\n$\\vartheta ^{\\mathcal{A} }_p \\vartheta ^{\\mathcal{A} }_q$ with\n$p,q \\in \\mathcal{A} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)$, we can use\nbroken lines to express the structure constants\n$\\alpha\\left(p,q,r\\right)$ in the expansion $$\\begin{aligned}\n \\label{eq:product}\n    \\vartheta^{\\mathcal{A} }_p \\vartheta^{\\mathcal{A} }_q = \\sum_{r\\in \\mathcal{A} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)} \\alpha(p,q,r) \\vartheta^{\\mathcal{A} }_r.\n\\end{aligned}$$ We review the construction here. 
First, pick a general\nendpoint $z$ near $r$. Then define ()\n$$\\label{eq:multibrokenline} \\begin{split} \n    \\alpha_z (p, q, r) := \\sum_{\\substack{\\left(\\gamma^{(1)}, \\gamma^{(2)}\\right) \\\\ I(\\gamma^{(1)})= p,\\ I(\\gamma^{(2)})= q\\\\ \\gamma^{(1)}(0) = \\gamma^{(2)}(0) = z\\\\\n    F(\\gamma^{(1)}) + F(\\gamma^{(2)}) = r   }}   c(\\gamma^{(1)})\\ c(\\gamma^{(2)}),   \\end{split}$$\nwhere the sum is over all pairs of broken lines\n$\\left(\\gamma^{(1)}, \\gamma^{(2)}\\right)$ ending at $z$ with initial\nslopes $I(\\gamma^{(1)}) = p$, $I(\\gamma^{(2)}) = q$ and final slopes\nsatisfying $F(\\gamma^{(1)})+F(\\gamma^{(2)}) =r$.\nGross–Hacking–Keel–Kontsevich show that for $z$ sufficiently close to\n$r$, $\\alpha_z (p, q, r)$ is independent of $z$ and gives the structure\nconstant $\\alpha (p, q, r)$ (see ).\n\n**Definition 22**.\n\nLet\n$\\Theta(\\mathcal{A} ):= \\{ {\\bf m} \\in \\mathcal{A} ^{\\vee}(\\mathbb{Z} ^T) \\mid \\vartheta^{\\mathcal{A} }_{{\\bf m}} \\text{ is polynomial}\\}$.\nThe **middle cluster algebra** $\\text{mid}(\\mathcal{A} )$ is the\n$\\Bbbk$-algebra whose underlying vector space is\n$\\{ \\vartheta ^{\\mathcal{A} }_{{\\bf m}} \\mid {\\bf m} \\in \\Theta(\\mathcal{A} ) \\}$,\nthe multiplication of the basis elements is given by and extended\nlinearly to all $\\mathop{\\mathrm{mid}}(\\mathcal{A} )$.\n\n### Theta functions on $\\mathcal{A}_{\\mathrm{prin}}$\n\nThe data $\\Gamma_{{\\mathrm{prin}} }$ is of full-rank. Therefore, this\ncase is a particular case of §. So we can talk about scattering\ndiagrams, broken lines and theta functions for\n$\\mathcal{A}_{\\mathrm{prin}}$. The following result follows from Theorem\nand the definition of theta functions.\n\n**Lemma 23**. Fix a seed $\\widetilde{\\textbf{s}}$ for\n$\\mathcal{A}_{\\mathrm{prin}}$ and express theta functions on the cluster\ncoordinates determined by $\\widetilde{\\textbf{s}}$. 
For\n$(m,n)\\in \\mathfrak{r}_{\\widetilde{\\textbf{s}}}(\\Theta(\\mathcal{A}_{\\mathrm{prin}}))$\nwe have that\n$\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(m,n)}=\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(m,0)}\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(0,n)}$\nand $\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(0,n)}$ is the Laurent\nmonomial on the coefficients given by $n$.\n\nNote that, for $(m_1,n_1),(m_2,n_2) \\in M^{\\circ}_{\\rm prin}$, in\ngeneral we have that\n$\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(m_1+m_2,n_1+n_2)} \\neq \\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(m_1,n_1)} \\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(m_2,n_2)}$.\nThe above lemma holds because the decomposition is only separating the\nunfrozen and frozen parts (*cf.*  below).\n\n**Remark 24**. Scattering diagrams for $\\mathcal{A}_{\\mathrm{prin}}$ can\nbe used to define scattering diagrams, broken lines therein and theta\nfunctions on a variety $\\mathcal{V}$ of form $\\mathcal{A}$ (even if\n$\\Gamma$ is not of full-rank), $\\mathcal{X}$, $\\mathcal{A} /T_{H}$ and\n$\\mathcal{X} _{{\\bf 1}}$. Further, in each one of these cases we can\ndefine the associated middle cluster algebra\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ and the set $\\Theta(\\mathcal{V} )$\nparametrizing its theta basis. In the following subsections we explain\nthe cases of $\\mathcal{A} /T_{H}$, $\\mathcal{X}$, and\n$\\mathcal{X} _{{\\bf 1}}$ individually. We do not treat the case of\n$\\mathcal{A}$ for $\\Gamma$ when $\\Gamma$ is not of full-rank since the\nresults of § do not apply to this case.\n\n### Theta functions on $\\mathcal{A} /T_{H}$\n\nSuppose that $\\Gamma$ is of full-rank (*cf.* Remark ). Let\n$H \\subset K^\\circ$ be a saturated sublattice and consider the quotient\n$\\mathcal{A} /T_{H}$ and the fibration $w_H: \\mathcal{A} ^{\\vee}\\to H^*$\n(see the end of §). 
The next result shows that theta functions on\n$\\mathcal{A}$ have a well defined $T_{H}$-weight.\n\n**Proposition 25**.\n\nEvery polynomial theta function on $\\mathcal{A}$ is an eigenfunction\nwith respect to the $T_{H}$-action. For every\n${\\bf q}\\in \\Theta(\\mathcal{A} )$ the $T_H$-weight of\n$\\vartheta ^{\\mathcal{A} }_{\\bf q}$ is the image of\n${\\bf q} \\in \\mathcal{A} ^{\\vee}(\\mathbb{Z} ^T)$ under the tropicalized\nmap $w^{T}_{H}:\\mathcal{A} ^{\\vee}(\\mathbb{Z} ^T) \\to H^*$. Under the\nisomorphism $H^* \\cong M^\\circ/H^\\perp$ and in the lattice\nidentification $\\mathcal{A} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T)$\nof $\\mathcal{A} ^\\vee (\\mathbb{Z} ^T)$ the map $w^T_{H}$ is given by\n$$\\begin{aligned}\n   w^{T}_{H} : \\ & \\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{Z} ^T) \\to M^\\circ/H^{\\perp},\\\\\n     & \\ \\ \\  \\ \\ \\  q \\longmapsto q + H^{\\perp}.\n\\end{aligned}$$\n\nThe claims are essentially contained in the literature already (see for\ninstance ). The differences are that we are acting by a potentially\nsmaller torus (Gross–Hacking–Keel–Kontsevich act by $T_{K^\\circ}$ rather\nthan $T_{H}$) and, regarding the map\n$w_{H}: \\mathcal{A} ^\\vee \\to T_{H^*}$, we are including $\\Bbbk[H]$ into\n$\\Bbbk[N^\\vee]=\\Bbbk[N^\\circ]$ rather than including $\\Bbbk[K^\\circ]$\ninto $\\Bbbk[N^\\circ]$. For the convenience of the reader we give a proof\nof the statement.\n\n*Proof of Proposition .* By all scattering functions may be taken to be\nof the form $\\left(1+z^{p^*(n)}\\right)^c$ for some $n \\in N_{\\text{uf}}$\nand some positive integer $c$.[^4] For\n$q\\in \\Theta(\\mathcal{A} )_{\\textbf{s}^\\vee}$ we have that\n$\\vartheta ^{\\mathcal{A} }_q$ is a Laurent polynomial in\n$\\Bbbk[M^\\circ]$. All monomial summands of\n$\\vartheta ^{\\mathcal{A} }_{q}$ have the form $c_m z^{q + m}$ for some\n$m \\in p^*(N_{\\text{uf}})$ and $c_m \\in \\mathbb{Z} _{>0}$. 
The\n$T_{H}$-weight of this monomial is obtained from the map\n$$\\begin{split} \nT_{H} \\to T_{\\mathbb{Z} }=\\Bbbk^*\\quad \\text{given by } \\quad z^{h} \\mapsto z^{\\left\\langle{q+m , h}\\right\\rangle} \\quad \\text{for $h \\in {H}$}.\n \\end{split}$$ Since $H \\subset p^*\\left(N_{\\text{uf}}\\right)^\\perp$ we\nhave that\n$z^{\\left\\langle{q+m , h}\\right\\rangle} = z^{\\left\\langle{q, h}\\right\\rangle}$.\nThat is, the $T_{H}$-weight of each monomial $z^{m'}$, $m'\\in M^\\circ$,\nis the character of $T_{H}$ given by\n$m' + H^\\perp  \\in M^\\circ/H^\\perp \\cong H^*$. Moreover, all monomial\nsummands of $\\vartheta ^{\\mathcal{A} }_q$ have the $T_{H}$-weight\n$q + H^\\perp \\in H^*$. Next, the piecewise linear map\n$(\\mu_k^{\\mathcal{A} ^\\vee})^T:M^\\circ_\\textbf{s}\\to M^\\circ_{\\mu_k(\\textbf{s})}$\nsends $m$ to $m+m'$ for some $m'\\in p^*(N_{\\text{uf}})$. So, the choice\nof torus does not affect the $T_{H}$-weight. Therefore,\n$\\vartheta ^{\\mathcal{A} }_q$ is an eigenfunction whose weight is\n$q + H^\\perp$. Furthermore, the projection\n$$\\begin{split} M^\\circ &\\to M^\\circ/H^\\perp\\quad \\text{given by} \\quad q \\mapsto q + H^\\perp  \\end{split}$$\ndualizes the inclusion $H \\hookrightarrow N^\\circ$. So, restricting to\nseed tori, this is precisely the tropicalization of the map\n${T_{M^\\circ} \\rightarrow T_{H^*}}$ whose pullback is the inclusion\n$H \\hookrightarrow N^\\circ$. Since $p^*$ commutes with mutation, we see\nthat the $T_{H}$-weight of $\\vartheta ^{\\mathcal{A} }_{\\bf q}$ is the\nimage of ${\\bf q}$ under the tropicalization of\n$w_{H}: \\mathcal{A} ^\\vee \\rightarrow T_{H^*}$. ◻\n\nEvery weight $0$ eigenfunction on $\\mathcal{A}$ induces a well defined\nfunction on $\\mathcal{A} /T_{H}$. 
So in order to construct a\nscattering-diagram-like structure $\\mathfrak{D}^{\\mathcal{A} /T_H}$\ndefining theta functions on $\\mathcal{A} /T_{H}$ we consider the\n**weight zero slice** inside $\\mathcal{A} ^{\\vee}(\\mathbb{R} ^T)$\ndefined as $(w^T_H)^{-1}(0)$. Observe that, after identifying\n$\\mathcal{A} ^\\vee$ with $M^\\vee$ via a choice of seed,\n$(w^T_H)^{-1}(0)$ corresponds to $H^{\\perp}_{\\mathbb{R} }$. With this in\nmind, we define\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} /T_{H}})$ as\n$$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} /T_{H}}):=\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} })\\cap (w^T_H)^{-1}(0).$$\nThe scattering functions attached to the walls of\n$\\mathfrak{r}_{\\textbf{s}^\\vee}(\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A} /T_{H}}))$\nare the same as the corresponding functions attached to the walls of\n$\\mathfrak{D}^{\\mathcal{A} }_\\textbf{s}$. This gives rise to a\nscattering diagram $\\mathfrak{D}^{\\mathcal{A} /T_{H}}_{\\textbf{s}}$\ninside $(\\mathcal{A} /T_{H})^\\vee_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)$ for\nevery $\\textbf{s}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$. 
The broken lines for\n$\\mathfrak{D}^{\\mathcal{A} /T_{H}}_\\textbf{s}$ are the broken lines for\n$\\mathfrak{D}^{\\mathcal{A} }_\\textbf{s}$ entirely contained in\n$\\mathfrak{r}_{\\textbf{s}^\\vee}(w^{-1}_{H}(0))$.\n\nIn order to label a theta function on $\\mathcal{A} /T_{H}$ with an\nelement of $(\\mathcal{A} /T_{H})^{\\vee}(\\mathbb{Z} ^T)$ it suffices to\nconsider a bijection\n$(\\mathcal{A} /T_{H})^{\\vee}(\\mathbb{R} ^T) \\overset{\\sim}{\\longrightarrow} (w_H^T)^{-1}(0)$.\nSuch a bijection can be obtained by tropicalizing the inclusion\n$\\mathfrak{i}_H:(\\mathcal{A} /T_{H})^{\\vee} \\hookrightarrow \\mathcal{A} ^\\vee$.\nIndeed, in lattice identifications of the tropical spaces given by a\nseed $\\textbf{s}$, the map\n$\\mathfrak{i}_H^T:(\\mathcal{A} /T_{H})^{\\vee}_{\\textbf{s}}(\\mathbb{Z} ^T)\\hookrightarrow \\mathcal{A} ^\\vee_{\\textbf{s}}(\\mathbb{Z} ^T)$\ncorresponds to the inclusion $H^\\perp \\hookrightarrow M^\\vee$ and\n$w^{-1}_{H}(0)(\\mathbb{Z} )$ corresponds to $H^\\perp$.\n\nIn particular, we obtain (as one should have expected) that the theta\nfunctions on $\\mathcal{A} /T_{H}$ are precisely the functions on\n$\\mathcal{A} /T_{H}$ induced by the $T_H$-weight zero theta functions on\n$\\mathcal{A}$. So we let\n$\\Theta(\\mathcal{A} /T_H)\\subset (\\mathcal{A} /T_H)^{\\vee}(\\mathbb{Z} ^T)$\nbe the preimage of $\\Theta(\\mathcal{A} )\\cap (w_H^T)^{-1}(0)$ under\n$\\mathfrak{i}^T_H$ and define the middle cluster algebra\n$\\mathop{\\mathrm{mid}}(\\mathcal{A} /T_H)$ as in the case of\n$\\mathcal{A}$ (see ). In particular, for\n${\\bf m}\\in \\Theta (\\mathcal{A} /T_H)$ the theta function\n$\\vartheta ^{\\mathcal{A} /T_H}_{\\bf m}$ is the function on\n$\\mathcal{A} /T_H$ induced by\n$\\vartheta ^{\\mathcal{A} }_{\\mathfrak{i}^T_H({\\bf m})}$. 
So,\n\n*every theta function on $\\mathcal{A} /T_H$ is naturally labeled by a\npoint of $(\\mathcal{A} /T_H)^\\vee(\\mathbb{Z} ^T)$*.\n\n### Theta functions on $\\mathcal{X}$\n\nRecall from § that there is an isomorphism\n$\\chi: \\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}\\to \\mathcal{X}$,\nwhere\n$$H_{\\mathcal{A}_{\\mathrm{prin}}}=\\left\\{\\left(n,-(p^*)^*(n)\\right)\\in N^\\circ_{\\mathrm{prin}} \\mid n \\in N^\\circ\\right\\} \\subset K^{\\circ}_{{\\mathrm{prin}} }.$$\nHence, the construction of theta functions on $\\mathcal{X}$ is already\ncovered in the previous subsection. However, there is a very subtle\ndifference created by treating\n$\\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}$ as a\ncluster $\\mathcal{X}$-variety as opposed to a quotient of\n$\\mathcal{A}_{\\mathrm{prin}}$:\n\n*every theta function on $\\mathcal{X}$ is naturally labeled by a point\nof $\\mathcal{X} ^\\vee(\\mathbb{Z} ^t)$ as opposed to\n$\\mathcal{X} ^\\vee(\\mathbb{Z} ^T)$.*\n\nIf we would proceed as in the previous subsection we would label theta\nfunctions on\n$\\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}$ by\npoints of\n$(\\mathcal{A}_{\\mathrm{prin}}/T_{H_{\\mathcal{A}_{\\mathrm{prin}}}})^\\vee(\\mathbb{R} ^T)$.\nThe origin of the difference is made explicit by the following lemma.\n\n**Lemma 26**. 
There is a canonical bijection between\n$\\mathcal{X} ^{\\vee}(\\mathbb{R} ^t)$ and\n$\\left(w^T_{H_{\\mathcal{A}_{\\mathrm{prin}}}}\\right)^{-1}(0)\\subset \\mathcal{A}_{\\mathrm{prin}}^\\vee(\\mathbb{R} ^T)$.\n\n*Proof.* One can verify directly that the composition\n$\\xi^T_{\\Gamma^\\vee}\\circ i$ gives rise to the desired bijection, where\n$$i:\\mathcal{X} ^\\vee(\\mathbb{R} ^t) \\to \\mathcal{X} ^\\vee(\\mathbb{R} ^T)$$\nis the bijection discussed in § and\n$$\\xi_{\\Gamma^\\vee}^T:  \\mathcal{X} ^\\vee(\\mathbb{R} ^T) \\to \\mathcal{A}_{\\mathrm{prin}}^{\\vee}(\\mathbb{R} ^T)$$\nis the tropicalization of the map\n$\\xi_{\\Gamma^\\vee}:\\mathcal{X} ^\\vee=\\mathcal{A} _{\\Gamma^\\vee} \\to  \\mathcal{X} _{(\\Gamma^\\vee)_{{\\mathrm{prin}} }}\\cong \\mathcal{X} _{(\\Gamma_{{\\mathrm{prin}} })^\\vee}=\\mathcal{A}_{\\mathrm{prin}}^{\\vee}$\ndescribed in , see Remarks and . However, for the convenience of the\nreader we include computations that show in a rather explicit way the\nnecessity to consider $\\mathcal{X} ^\\vee(\\mathbb{Z} ^t)$ as opposed to\n$\\mathcal{X} ^\\vee(\\mathbb{Z} ^T)$. For simplicity throughout this proof\nwe denote $w_{H_{\\mathcal{A}_{\\mathrm{prin}}}}$ simply by $w$.\n\nPick a seed $\\textbf{s}=(e_i)_{i \\in I}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$ for $\\Gamma$ and\nconsider the seed $\\textbf{s}^\\vee$ for $\\Gamma^\\vee$. Denote by\n$\\widetilde{\\textbf{s}}^\\vee$ the seed for\n$(\\Gamma_{{\\mathrm{prin}} })^\\vee$ obtained by mutating\n$\\textbf{s}_{0_{{\\mathrm{prin}} }}$ in the same sequence of directions\nneeded to obtain $\\textbf{s}$ from $\\textbf{s}_0$. 
Then\n$$\\left((w^T)^{-1}(0)\\right)_{\\widetilde{\\textbf{s}}^\\vee}(\\mathbb{R} ^T)=H^\\perp_{\\mathcal{A}_{\\mathrm{prin}}}=\\{(p^*(n),n)\\in M^\\circ_{{\\mathrm{prin}} ,\\mathbb{R} } \\mid n\\in N_{\\mathbb{R} }\\}\\subset M^\\circ_{{\\mathrm{prin}} , \\mathbb{R} }=M^\\circ_{\\mathbb{R} }\\oplus N_{\\mathbb{R} }$$\n(see to recall the meaning of\n$\\left((w^T)^{-1}(0)\\right)_{\\widetilde{\\textbf{s}}^\\vee}(\\mathbb{R} ^T)$).\nWe now verify that for every $k\\in I_{\\text{uf}}$ there is a commutative\ndiagram $$\\xymatrix{\n\\left((w^T)^{-1}(0)\\right)_{\\widetilde{\\textbf{s}}^{\\vee}}(\\mathbb{R} ^T) \\ar^{\\left(\\mu^{\\mathcal{A}_{\\mathrm{prin}}^\\vee}_{k}\\right)^T}[rr]  \\ar_{\\pi^{\\mathcal{X} ^\\vee}_1}[d] & & \\left((w^T)^{-1}  (0)\\right)_{\\mu_k(\\widetilde{\\textbf{s}}^{\\vee})}(\\mathbb{R} ^T) \\ar^{\\pi^{\\mathcal{X} ^\\vee}_2}[d]\n\\\\\n\\mathcal{X} ^{\\vee}_{\\textbf{s}^{\\vee}}(\\mathbb{R} ^t)\\ar^{\\left(\\mu^{\\mathcal{X} ^\\vee}_{k}\\right)^t}[rr] & & \\mathcal{X} ^{\\vee}_{\\mu_k(\\textbf{s}^{\\vee})}(\\mathbb{R} ^t),\n}$$ where the vertical maps $\\pi^{\\mathcal{X} ^\\vee}_1$ and\n$\\pi^{\\mathcal{X} ^\\vee}_2$ are both given by $(p^*(n),n)\\mapsto dn$\n(recall that\n$\\mathcal{X} ^{\\vee}_{\\textbf{s}^{\\vee}}(\\mathbb{R} ^t)=(N^\\vee)^\\circ_{\\mathbb{R} }= (d\\cdot N)_{\\mathbb{R} } =\\mathcal{X} ^{\\vee}_{\\mu_k(\\textbf{s}^{\\vee})} (\\mathbb{R} ^t)$).\nBy definition we have that $$\\begin{aligned}\n    \\left(\\mu^{\\mathcal{A}_{\\mathrm{prin}}^\\vee}_{k}\\right)^T(p^*(n),n)& \\overset{\\eqref{eq:tropical_X_mutation}}{=} & (p^*(n),n)+[\\langle (d e_k,0),(p^*(n),n)\\rangle]_+\\{d_ke_k, \\cdot \\}^{\\vee}_{{\\mathrm{prin}} } \\\\\n    &=&  (p^*(n),n)+ [p^*(n)(de_k)]_+(\\{d_ke_k, \\cdot\\}^\\vee,d_ke_k)\\\\\n    &=&(p^*(n) + [\\{n, de_k\\}]_+\\{d_ke_k, \\cdot\\}^\\vee,n+ [\\{n, de_k\\}]_+d_ke_k).\n\\end{aligned}$$ Using the facts that $d, d_k>0$ and that\n$d\\max(a,b)=\\max(da,db)$ and $\\max(a,b)=-\\min(-a,-b)$ for all\n$a,b \\in 
\\mathbb{R}$, we compute that $$\\begin{aligned}\n\\pi^{\\mathcal{X} ^\\vee}_2\\left(\\left(\\mu^{\\mathcal{A}_{\\mathrm{prin}}^\\vee}_{k}\\right)^T(p^*(n),n)\\right) &=&  dn+ d[\\{n, de_k\\}^\\vee]_+d_ke_k\\\\\n&=&  dn+ [\\{dn, de_k\\}^\\vee]_+d_ke_k\\\\\n &=&  dn+[-\\{de_k,dn\\}^{\\vee}]_+d_ke_k\\\\\n &=&  dn+[-\\{d_ke_k,dn\\}^{\\vee}]_+de_k\\\\\n &=&  dn-[\\{d_ke_k,dn\\}^{\\vee}]_-de_k\\\\\n  &=&  dn-[\\langle v_k^\\vee,dn\\rangle]_-de_k\\\\\n&=& dn+[\\langle v_k^\\vee,dn\\rangle]_-(-d_k^\\vee e^\\vee_k)\\\\\n&\\overset{\\eqref{eq:tropical_A_mutation}}{=}& \\left(\\mu^{\\mathcal{X} ^\\vee}_{k}\\right)^t (dn)\\\\\n&=& \\left(\\mu^{\\mathcal{X} ^\\vee}_{k}\\right)^t \\left(\\pi^{\\mathcal{X} ^\\vee}_1(p^*(n),n)\\right).\n\\end{aligned}$$ This gives the commutativity of the diagram. Notice\nmoreover that $\\pi^{\\mathcal{X} ^\\vee}_1$ and\n$\\pi^{\\mathcal{X} ^\\vee}_2$ are canonical bijections. These two facts\ntogether imply that we have a well defined bijection\n$$\\pi^{\\mathcal{X} ^\\vee}: (w^T)^{-1}(0)(\\mathbb{R} ^T) \\overset{\\sim}{\\longrightarrow} \\mathcal{X} ^{\\vee}(\\mathbb{R} ^t).$$\nThe fact that $\\xi^T_{\\Gamma^\\vee}\\circ i$ is the inverse of\n$\\pi^{\\mathcal{X} ^\\vee}$ follows from noticing that, in lattice\nidentifications of the domain and codomain of $\\xi^T_{\\Gamma^\\vee}$\ngiven by a choice of seed, we have that\n$$\\xi^T_{\\Gamma^\\vee}(dn)=(-p^*(n),-n).$$ ◻\n\nWe can now define cluster scattering diagrams for $\\mathcal{X}$ using\ncluster scattering diagrams for $\\mathcal{A}_{\\mathrm{prin}}$ and the\nquotient map $\\tilde{p}:\\mathcal{A}_{\\mathrm{prin}}\\to \\mathcal{X}$\ndescribed in and the content of Lemma . 
We define\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{X} })$ as\n$$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{X} }):=\\pi^{\\mathcal{X} ^\\vee}\\left(\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{A}_{\\mathrm{prin}}})\\cap (w^T_H)^{-1}(0)\\right)\\subset \\mathcal{X} ^\\vee(\\mathbb{Z} ^t).$$\nBy definition the support of the scattering diagram\n$\\mathfrak{D}^{\\mathcal{X} }_{\\textbf{s}}$ is\n$\\mathfrak{r}_{\\textbf{s}^\\vee}\\left(\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{X} })\\right)$.\nThe scattering functions attached to the walls of\n$\\mathop{\\mathrm{Supp}}(\\mathfrak{D}^{\\mathcal{X} }_\\textbf{s})$ are\nobtained by applying $\\tilde{p}^*$ to the scattering functions of the\ncorresponding walls of\n$\\mathfrak{D}^{\\mathcal{A}_{\\mathrm{prin}}}_\\textbf{s}$. We proceed in\nan analogous way to define broken lines for\n$\\mathfrak{D}^{\\mathcal{X} }_\\textbf{s}$. As in the previous cases,\nsupports of broken lines are well defined inside\n$\\mathcal{X} ^\\vee(\\mathbb{Z} ^t)$.\n\nThe labeling of a theta function on $\\mathcal{X}$ with an element of\n$\\mathcal{X} ^{\\vee}(\\mathbb{Z} ^t)$ is obtained using the bijection of\nLemma . More precisely, for\n${\\bf n} \\in \\mathcal{X} ^\\vee(\\mathbb{Z} ^t)$ with\n${\\bf n}\\in \\Theta(\\mathcal{X} )$ we have\n$$\\tilde{p}^*(\\vartheta ^\\mathcal{X} _{\\bf n})=\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{\\xi^T_{\\Gamma^\\vee}\\circ i({\\bf n})}.$$\nExplicitly, in lattice identifications of the tropical spaces, we have\nthat for $dn \\in \\mathcal{X} ^\\vee_{\\textbf{s}^\\vee}(\\mathbb{Z} ^t)$\n$$\\tilde{p}^*\\left(\\vartheta ^{\\mathcal{X} }_{dn}\\right):= \n\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(p^*(n),n)}.$$\n\n**Example 27**. Let $\\epsilon\n=\n\\left(\\begin{matrix}\n0 & 2 \\\\\n-1 & 0\n\\end{matrix}\\right)$ and $d_1=1, d_2=2$. 
Using the above parametrization\nwe compute\n$$\\vartheta ^{\\mathcal{X} }_{2(-1,-2)}=X_1^{-1}X_2^{-2}+2X_1^{-1}X_2^{-1}+X_1^{-1}.$$\nIndeed, we have that $\\xi^T_{\\Gamma^\\vee}\\circ i(2(-1,-2))=(2,-2)$ and\n$$\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(2,-2),(-1,-2)}= \\left(\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(1,-1),(0,0)}\\right)^2 \\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{(0,0),(-1,-2)} = \\left(\\dfrac{A_1+t_2}{A_2}\\right)^2t_1^{-1}t_2^{-2}= \\tilde{p}^*(X_1^{-1}X_2^{-2}+2X_1^{-1}X_2^{-1}+X_1^{-1}).$$\n\n### Theta functions on $\\mathcal{X} _{\\bf 1}$\n\nAs in the previous subsections we would like to highlight that\n\n*every theta function on $\\mathcal{X} _{\\bf 1}$ is naturally labeled by\na point of $(\\mathcal{X} _{\\bf 1})^\\vee(\\mathbb{Z} ^t)$*\n\nas we now explain. The tropical space\n$(\\mathcal{X} _{\\bf 1})^{\\vee}(\\mathbb{R} ^t)$ is the quotient of\n$\\mathcal{X} ^{\\vee} (\\mathbb{R} ^t)$ by the tropicalization of the\naction of $T_H$ on $\\mathcal{X} ^{\\vee}$. 
In other words, since the\nvariety $(\\mathcal{X} _{\\bf 1})^{\\vee}$ is a quotient of\n$\\mathcal{X} ^\\vee$, we can consider the quotient map by\n$\\varpi_{H}: \\mathcal{X} ^\\vee \\to (\\mathcal{X} _{\\bf 1})^\\vee$ to\nobtain a surjection\n$$\\varpi_H^t:  \\mathcal{X} ^{\\vee}(\\mathbb{R} ^t) \\to (\\mathcal{X} _{\\bf 1})^{\\vee} (\\mathbb{R} ^t).$$\nThen, given\n$\\overline{\\bf n}\\in (\\mathcal{X} _{\\bf 1})^{\\vee} (\\mathbb{R} ^t)$ and\n${\\bf n}\\in (\\varpi_H^t)^{-1}(\\overline{\\bf n})$ we define\n$$\\vartheta ^{\\mathcal{X} _{\\bf 1}}_{\\overline{\\bf n}}=\\vartheta ^{\\mathcal{X} }_{\\bf n}|_{\\mathcal{X} _{\\bf 1}}.$$\nMore concretely, working in lattice identifications of the tropical\nspaces, we have that\n$\\mathcal{X} ^{\\vee}(\\mathbb{R} ^t)_{\\textbf{s}^\\vee} = N_\\mathbb{R}$\nand\n$(\\mathcal{X} _{\\bf 1})^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^t) {\\cong}N_\\mathbb{R} /H_{\\mathbb{R} }$.\nThen for every $n \\in N$\n$$\\vartheta ^{\\mathcal{X} _{\\bf 1}}_{d n + H}=\\vartheta ^{\\mathcal{X} }_{dn}|_{\\mathcal{X} _{\\bf 1}}.$$\nOne can proceed in an analogous way as in the previous cases to\nconstruct a scattering diagram like structure\n$\\mathfrak{D}^{\\mathcal{X} _{\\bf 1}}_{\\textbf{s}}$ inside\n$(\\mathcal{X} _{\\bf 1})^\\vee_{\\textbf{s}}(\\mathbb{Z} ^t)$. In turn we\nobtain a description of\n$\\vartheta ^{\\mathcal{X} _{\\bf 1}}_{\\overline{\\bf n}}$ using broken\nlines and use these to define\n$\\mathop{\\mathrm{mid}}(\\mathcal{X} _{\\bf 1})$ and\n$\\Theta(\\mathcal{X} _{\\bf 1})$.\n\n### The full Fock–Goncharov conjecture\n\nLet $\\mathcal{V}$ be a scheme of the form $\\mathcal{A}$, $\\mathcal{X}$,\n$\\mathcal{A} /T_{H}$ or $\\mathcal{X} _{{\\bf 1}}$. 
The **upper cluster\nalgebra** of $\\mathcal{V}$ is defined as\n$$\\text{up}(\\mathcal{V} ):=H^0(\\mathcal{V} ,\\mathcal{O}_{\\mathcal{V} }).$$\nEvery polynomial theta function on $\\mathcal{V}$ belongs to\n$\\text{up}(\\mathcal{V} )$; therefore, we have a natural $\\Bbbk$-linear\nmap $\\mathop{\\mathrm{mid}}(\\mathcal{V} )\\to \\text{up}(\\mathcal{V} )$. If\n$\\mathcal{V}$ is one of $\\mathcal{A}$ (see Remark ) or $\\mathcal{X}$ it\nwas proved in that this map is in fact an injective homomorphism of\nalgebras. These cases already imply that the same is true if\n$\\mathcal{V}$ is of the form $\\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$.\n\n**Remark 28**. If $\\mathcal{V} = \\mathcal{A}$, $\\mathcal{X}$,\n$\\mathcal{A} /T_{H}$ or $\\mathcal{X} _{{\\bf 1}}$ then\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ is an integral domain. Indeed,\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ is a subalgebra of\n$\\mathop{\\mathrm{up}}(\\mathcal{V} )=H^0(\\mathcal{V} ,\\mathcal{O}_{\\mathcal{V} })$\nwhich is a domain as $\\mathcal{V}$ is irreducible.\n\nAs we have seen in the previous subsections, theta functions on varieties\nof the form $\\mathcal{A}$ or $\\mathcal{A} /T_H$ are naturally labeled by\nthe $\\mathbb{Z} ^T$-points of their Fock–Goncharov duals, whereas theta\nfunctions on varieties of the form $\\mathcal{X}$ or\n$\\mathcal{X} _{\\bf 1}$ are naturally labeled by the\n$\\mathbb{Z} ^t$-points of their Fock–Goncharov duals. Since we would like\nto consider all these cases simultaneously we introduce the following\nnotation. 
For $G= \\mathbb{Z} , \\mathbb{Q}$ or $\\mathbb{R}$ we set\n\n$$\\label{eq:unif}\n    \\mathrm{Trop} _G(\\mathcal{V} ):=\n    \\begin{cases}\n        \\mathcal{V} (G^t) &\\text{ if } \\mathcal{V} =\\mathcal{A} \\text{ or } \\mathcal{V} =\\mathcal{A} /T_H\\vspace{1mm}\\\\\n        \\mathcal{V} (G^T) \\ & \\text{ if } \\mathcal{V} =\\mathcal{X} \\text{ or } \\mathcal{V} =\\mathcal{X} _{\\bf 1}.\n    \\end{cases}$$\n\nSimilarly, for a positive rational function\n$g: \\mathcal{V} \\dashrightarrow \\Bbbk$ we let $$\\label{eq:unif_function}\n    \\mathrm{Trop} _G(g):=\n    \\begin{cases}\n        g^t &\\text{ if } \\mathcal{V} =\\mathcal{A} \\text{ or } \\mathcal{V} =\\mathcal{A} /T_H\\vspace{1mm}\\\\\n        g^T \\ &\\text{ if } \\mathcal{V} =\\mathcal{X} \\text{ or } \\mathcal{V} =\\mathcal{X} _{\\bf 1}.\n    \\end{cases}$$\n\nIn particular, if we think of the seed torus $\\mathcal{V} _\\textbf{s}$\nas a cluster variety with only frozen directions then\n$\\mathrm{Trop} _G(\\mathcal{V} _\\textbf{s})=\\mathfrak{r}_{\\textbf{s}}(\\mathrm{Trop} _G(\\mathcal{V} ))=\\mathcal{V} _{\\textbf{s}}(G^t)$,\nif $\\mathcal{V}$ is of the form $\\mathcal{A}$ or $\\mathcal{A} /T_H$ and\n$\\mathrm{Trop} _G(\\mathcal{V} _\\textbf{s})=\\mathfrak{r}_{\\textbf{s}}(\\mathrm{Trop} _G(\\mathcal{V} ))=\\mathcal{V} _{\\textbf{s}}(G^T)$,\nif $\\mathcal{V}$ is of the form $\\mathcal{X}$ or $\\mathcal{X} _{\\bf 1}$.\nFor later use we also set $$\\label{eq:Theta_seed}\n\\Theta(\\mathcal{V} )_{\\textbf{s}^\\vee}:=\\mathfrak{r}_{\\textbf{s}^\\vee}(\\Theta(\\mathcal{V} ))\\subset \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee),$$\nsee the line just below equation . Following we introduce the following\ndefinition.\n\n**Definition 29**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A}$, $\\mathcal{X}$, $\\mathcal{A} /T_{H}$ or\n$\\mathcal{X} _{{\\bf 1}}$. 
We say that **the full Fock–Goncharov\nconjecture** holds for $\\mathcal{V}$ if\n\n-   $\\Theta(\\mathcal{V} )=\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$,\n    and\n\n-   the natural map\n    $\\mathop{\\mathrm{mid}}(\\mathcal{V} ) \\to \\text{up}(\\mathcal{V} )$ is\n    an isomorphism.\n\n# Bases of theta functions for partial minimal models\n\nIn , the authors obtained nearly optimal conditions ensuring that the\nfull Fock–Goncharov conjecture holds for a cluster variety. However,\nthey were able to prove that the ring of regular functions of a partial\ncompactification of a cluster variety has a basis of theta functions\nonly under much stronger conditions. In this section we outline this\nframework, including quotients and fibres of cluster varieties, and\nrefer to for a detailed treatment. The main class of (partial)\ncompactifications we shall consider are the (partial) minimal models\ndefined below.\n\n**Definition 30**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$. An inclusion $\\mathcal{V} \\subset Y$ as an open\nsubscheme of a normal variety $Y$ is a **partial minimal model** of\n$\\mathcal{V}$ if the canonical volume form on $\\mathcal{V}$ has a simple\npole along every irreducible divisor of $Y$ contained in\n$Y \\setminus \\mathcal{V}$. It is a **minimal model** if $Y$ is, in\naddition, projective. We call $Y \\setminus \\mathcal{V}$ the **boundary**\nof $\\mathcal{V} \\subset Y$.\n\nFor example, if $\\mathcal{V}$ is a cluster $\\mathcal{A}$-variety with\nfrozen variables we can let these variables vanish to obtain a partial\nminimal model of $\\mathcal{V}$ as in . 
Similarly, if we consider a torus\nas a cluster variety (by letting $I_{\\text{uf}}= \\emptyset$) then a\npartial minimal model is simply a normal toric variety.\n\nGiven a partial minimal model $\\mathcal{V} \\subset Y$, where\n$\\mathcal{V}$ is a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$, we would like to describe the set of theta\nfunctions on $\\mathcal{V}$ (resp. $\\mathcal{V} ^\\vee$) that extend to\n$Y$ in a similar way as the ring of algebraic functions on a normal\ntoric variety is described in toric geometry using polyhedral fans. In\norder to be able to do so we need that the pair\n$(\\mathcal{V} , \\mathcal{V} ^\\vee)$ satisfies a technical condition\n–*theta reciprocity*– that we will introduce shortly. For this, we need\nto discuss first the *tropical pairings* associated to the pair\n$(\\mathcal{V} ,\\mathcal{V} ^{\\vee})$.\n\nIn order to define the tropical pairings we temporarily assume that\n$\\mathcal{V}$ is a variety of the form $\\mathcal{A}$ or\n$\\mathcal{A} /T_{H}$ so that $\\mathcal{V} ^\\vee$ is a cluster\n$\\mathcal{X}$-variety or a fibre of a cluster $\\mathcal{X}$-variety,\nrespectively. In particular,\n$\\Theta(\\mathcal{V} )\\subset \\mathcal{V} ^\\vee(\\mathbb{Z} ^T)= \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$\nand\n$\\Theta(\\mathcal{V} ^\\vee)\\subset \\mathcal{V} (\\mathbb{Z} ^t)=\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} )$,\nsee . Recall from Remark  that the set $\\mathcal{V} (\\mathbb{Z} ^t)$\n(resp. $\\mathcal{V} ^\\vee(\\mathbb{Z} ^t)$) is canonically identified\nwith the geometric tropicalization\n$\\mathcal{V} ^\\mathrm{trop} (\\mathbb{Z} )$ (resp.\n$(\\mathcal{V} ^\\vee)^\\mathrm{trop} (\\mathbb{Z} )$). Therefore, we\nsystematically think of the elements of $\\mathcal{V} (\\mathbb{Z} ^t)$\n(resp. $\\mathcal{V} ^\\vee(\\mathbb{Z} ^t)$) as divisorial discrete\nvaluations on $\\Bbbk(\\mathcal{V} )$ (resp. 
$\\Bbbk(\\mathcal{V} ^\\vee)$).\nWe also consider the bijection\n$i : \\mathcal{V} ^\\vee(\\mathbb{Z} ^T) \\to \\mathcal{V} ^\\vee(\\mathbb{Z} ^t )$\nintroduced in § (see the comment below ). The **tropical pairings**\nassociated to the pair $(\\mathcal{V} ,\\mathcal{V} ^\\vee)$ are the\nfunctions\n$\\langle \\cdot , \\cdot \\rangle : \\Theta(\\mathcal{V} ^{\\vee})  \\times \\Theta (\\mathcal{V} )  \\to \\mathbb{Z}$\nand\n$\\langle \\cdot , \\cdot \\rangle^{\\vee} : \\Theta(\\mathcal{V} ^{\\vee})  \\times \\Theta (\\mathcal{V} )  \\to \\mathbb{Z}$\ngiven by\n$$\\langle {\\bf v} , {\\bf b} \\rangle = {\\bf v}(\\vartheta ^{\\mathcal{V} }_{\\bf b}) \\qquad \\text{and} \\qquad \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee} = i({\\bf b}) (\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\bf v}).$$\n\n**Definition 31**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$. The pair $(\\mathcal{V} ,\\mathcal{V} ^\\vee)$ has\n**theta reciprocity** if\n$\\Theta(\\mathcal{V} )=\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$,\n$\\Theta(\\mathcal{V} ^{\\vee})=\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} )$,\nand\n$\\langle {\\bf v} , {\\bf b} \\rangle = \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee}$\nfor all\n$({\\bf v},{\\bf b})\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ) \\times \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$.\n\n**Remark 32**. Definition should not be considered artificial. In fact,\nan analogous conjecture for affine log Calabi–Yau varieties with maximal\nboundary is expected to hold true, see .\n\n**Lemma 33**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or $\\mathcal{X} _{\\bf 1}$\nand let $\\mathcal{V} \\subset Y$ be a (partial) minimal model. 
Suppose\nthat the pair $(\\mathcal{V} ,\\mathcal{V} ^\\vee)$ has theta reciprocity.\nThen for every seed $\\textbf{s}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$ the set of theta\nfunctions on $\\mathcal{V}$ that extend to $Y$ can be described as the\nintersection of $\\Theta(\\mathcal{V} ^\\vee)_{\\textbf{s}^\\vee}$ (see )\nwith a polyhedral cone of the vector space\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$\n(see the sentence below equation ).\n\n*Proof.* We treat the cases $\\mathcal{V} = \\mathcal{A}$ or\n$\\mathcal{A} /T_H$ as the proof is completely analogous for the cases\n$\\mathcal{V} = \\mathcal{X}$ or $\\mathcal{X} _{\\bf 1}$. Let\n$D_1, \\dots, D_s$ be the irreducible divisors of $Y$ contained in the\nboundary of $\\mathcal{V} \\subset Y$. Since $Y$ is normal and\n$Y\\setminus (\\mathcal{V} \\cup D_1 \\cup \\dots \\cup D_s)$ has codimension\ngreater than or equal to $2$ in $Y$, to describe the theta functions on\n$\\mathcal{V}$ that extend to $Y$ it is enough to describe the set of\ntheta functions that extend to $D_1, \\dots , D_s$. Let $\\mathop{\\mathrm{ord}}_{D_j}$ be the\ndiscrete valuation on $\\Bbbk(\\mathcal{V} )\\setminus \\{ 0 \\}$ associated\nto the irreducible divisor $D_j$. Since $\\mathcal{V} \\subset Y$ is a\npartial minimal model, $\\mathop{\\mathrm{ord}}_{D_j}$ determines a point\nof $\\mathcal{V} (\\mathbb{Z} ^t)$. 
Since\n$\\Theta(\\mathcal{V} ^{\\vee})= \\mathcal{V} (\\mathbb{Z} ^t)$ we have\n$\\mathop{\\mathrm{ord}}_{D_j} \\in \\Theta (\\mathcal{V} ^{\\vee})$.\nTherefore,\n$\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}}$ is a\npolynomial theta function and its tropicalization is the function\n$$(\\vartheta _{\\mathop{\\mathrm{ord}}_{D_j}}^{\\mathcal{V} ^\\vee})^t:\\mathcal{V} ^{\\vee}( \\mathbb{Z} ^t)\\to \\mathbb{Z} \\quad \\text{given by} \\quad  v \\mapsto v (\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}}).$$\nIn other words,\n$(\\vartheta _{\\mathop{\\mathrm{ord}}_{D_j}}^{\\mathcal{V} ^{\\vee}})^t(v)=\\langle \\mathop{\\mathrm{ord}}_{D_j}, i(v) \\rangle$.\nSince $\\Theta(\\mathcal{V} )= \\mathcal{V} ^\\vee(\\mathbb{Z} ^T)$ we have\nthat $i(v)\\in \\Theta(\\mathcal{V} )$ and, therefore,\n$\\vartheta ^{\\mathcal{V} }_{i(v)}$ is a polynomial theta function. The\nassumption\n$\\langle{\\bf v} , {\\bf b} \\rangle = \\langle {\\bf v} , {\\bf b} \\rangle^{\\vee}$\nfor all ${\\bf v}$ and ${\\bf b}$ implies that\n$$(\\vartheta _{\\mathop{\\mathrm{ord}}_{D_j}}^{\\mathcal{V} ^{\\vee}})^t(v)= (\\vartheta ^{\\mathcal{V} }_{i(v)})^t(\\mathop{\\mathrm{ord}}_{D_j}),$$\nsince\n$$(\\vartheta _{\\mathop{\\mathrm{ord}}_{D_j}}^{\\mathcal{V} ^{\\vee}})^t(v) =\n\\langle \\mathop{\\mathrm{ord}}_{D_j}, i(v) \\rangle =\n\\langle \\mathop{\\mathrm{ord}}_{D_j}, i(v)\\rangle^{\\vee} =\n\\mathop{\\mathrm{ord}}_{D_j}(\\vartheta ^{\\mathcal{V} }_{i(v)}) =\n(\\vartheta ^{\\mathcal{V} }_{i(v)})^t(\\mathop{\\mathrm{ord}}_{D_j}).$$\nThus a theta function\n$\\vartheta ^{\\mathcal{V} }_{i(v)} \\in \\mathop{\\mathrm{mid}}(\\mathcal{V} )$\nextends to $D_j$ if and only if\n$0\\leq (\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}})^t(v)$.\nIn particular, a theta function $\\vartheta ^{\\mathcal{V} }_{i(v)}$ extends\nto $Y$ if and only if\n$$i(v)\\in \\bigcap_{j=1}^s\\{b\\in\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)\\mid 0\\leq (\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}})^T(b)\\}$$\nsince $g^T(b)=g^t(i(b))$ for every positive function $g$ on\n$\\mathcal{V}$, see . By definition of tropicalization, the set\n$\\bigcap_{j=1}^s\\{b\\in\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)\\mid 0\\leq (\\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}})^T(b)\\}$\nis a polyhedral cone of\n$\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee}(\\mathbb{R} ^T)=\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$. ◻\n\nWe now turn to the problem of understanding when the theta functions on\n$\\mathcal{V}$ that extend to a (partial) minimal model\n$\\mathcal{V} \\subset Y$ form a basis of $H^0(Y, \\mathcal{O}_Y)$. The\nfollowing notion is central.\n\n**Definition 34**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$. We say that the theta functions on $\\mathcal{V}$\n**respect the order of vanishing** if for all\n${\\bf v}\\in \\mathcal{V} (\\mathbb{Z} ^t)$ and all\n$\\displaystyle \\sum_{{\\bf q}\\in \\Theta(\\mathcal{V} )} \\alpha_{\\bf q} \\vartheta ^{\\mathcal{V} }_{\\bf q}\\in \\mathop{\\mathrm{mid}}(\\mathcal{V} )$\nwe have\n$${\\bf v}\\left(\\sum_{{\\bf q}\\in \\Theta(\\mathcal{V} )} \\alpha_{\\bf q} \\vartheta ^{\\mathcal{V} }_{\\bf q}\\right) \\geq 0 \\ \\ \\text{ if and only if }\\ \\  {\\bf v}(\\vartheta ^{\\mathcal{V} }_{\\bf q})\\geq 0  \\text{ for all } {\\bf q} \\text{ such that } \\alpha_{\\bf q}\\neq 0.$$\n\nNotice that in the authors conjecture that the theta functions on\n$\\mathcal{A}_{\\mathrm{prin}}$ respect the order of vanishing. 
The\n**superpotential** associated to a partial minimal model\n$\\mathcal{V} \\subset Y$ is the function on $\\mathcal{V} ^\\vee$ defined\nas $$\\label{eq:def superpotential}\n    W_{Y}:=\\sum_{j=1}^s \\vartheta ^{\\mathcal{V} ^{\\vee}}_{j},$$ where\n$D_1, \\dots , D_s$ are the irreducible divisors of $Y$ contained in the\nboundary of $\\mathcal{V} \\subset Y$ and\n$$\\label{eq:def superpotential_summands}\n   \\vartheta ^{\\mathcal{V} ^{\\vee}}_{j}=\\begin{cases}\n       \\vartheta ^{\\mathcal{V} ^{\\vee}}_{\\mathop{\\mathrm{ord}}_{D_j}} &\\text{ if } \\mathcal{V} =\\mathcal{A} \\text{ or } \\mathcal{V} =\\mathcal{A} /T_H\\vspace{1mm}\\\\\n       \\vartheta ^{\\mathcal{V} ^{\\vee}}_{i(\\mathop{\\mathrm{ord}}_{D_j})} \\ &\\text{ if } \\mathcal{V} =\\mathcal{X} \\text{ or } \\mathcal{V} =\\mathcal{X} _{\\bf 1}.\n   \\end{cases}$$ The **superpotential cone** associated to $W_Y$ is\n$$\\label{eq:def Xi}\n    \\Xi_Y:= \\{ {\\bf v} \\in \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^\\vee) \\mid \\mathrm{Trop} _{\\mathbb{R} }(W_Y)({\\bf v})\\geq0 \\},$$\nsee equation .\n\nWe further set\n$\\Xi_{Y;\\textbf{s}^\\vee}:= \\mathfrak{r}_{\\textbf{s}^\\vee}(\\Xi_Y)\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$.\nNotice that if the theta functions on $\\mathcal{V}$ respect the order of\nvanishing then $\\Xi_{Y;\\textbf{s}^\\vee}$ is precisely the polyhedral subset\nof Lemma . The next result follows at once from the definitions.\n\n**Lemma 35**. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or $\\mathcal{X} _{\\bf 1}$\nand let $\\mathcal{V} \\subset Y$ be a (partial) minimal model. Suppose\nthat the full Fock–Goncharov conjecture holds for $\\mathcal{V}$, that\nthe pair $(\\mathcal{V} , \\mathcal{V} ^\\vee)$ has theta reciprocity and\nthat the theta functions on $\\mathcal{V}$ respect the order of\nvanishing. Then the set of theta functions on $\\mathcal{V}$ parametrized\nby the points of $\\Xi_Y(\\mathbb{Z} )$ is a basis of\n$H^0(Y, \\mathcal{O}_Y)$.\n\n**Lemma 36**. 
Suppose there is a cluster ensemble map\n$p:\\mathcal{A} \\to \\mathcal{X}$ that is an isomorphism. Then theta\nfunctions on $\\mathcal{A}$ respect the order of vanishing if and only if\ntheta functions on $\\mathcal{X}$ respect the order of vanishing.\n\n*Proof.* The result follows at once from the fact that\n$p^*(\\vartheta ^{\\mathcal{X} }_{\\bf n})= \\vartheta ^{\\mathcal{A} }_{(p^{\\vee})^T\\circ i ({\\bf n})}$. ◻\n\nWe propose the following definition, which allows one to obtain the benefits of\nLemma without having to verify all its assumptions. We apply this in §.\n\n**Definition 37**. We say that $\\mathcal{V} \\subset Y$ has **enough\ntheta functions** if the full Fock–Goncharov conjecture holds for\n$\\mathcal{V}$ and the theta functions on $\\mathcal{V}$ parametrized by\n$\\Xi_{Y} (\\mathbb{Z} )$ form a basis of $H^0(Y, \\mathcal{O}_Y)$.\n\nWe now recall an important notion introduced in that can be used to\nverify in a combinatorial way that a partial minimal model\n$\\mathcal{A} \\subset Y$ has enough theta functions provided $Y$ is\nobtained by letting the frozen variables vanish.\n\n**Definition 38**. We say that a seed $\\textbf{s}=(e_i)_{ i \\in I}$ is\n**optimized** for a point ${\\bf n} \\in \\mathcal{A} (\\mathbb{Z} ^t)$ if\nunder the identification of $\\mathcal{A} (\\mathbb{Z} ^t)$ with $N^\\circ$\nafforded by $\\textbf{s}$ we have that $\\{ e_k, n_{\\textbf{s}} \\}\\geq 0$\nfor all $k \\in I_{\\text{uf}}$.\n\n**Lemma 39**.\n\nAssume that $\\mathcal{A}$ satisfies the full Fock–Goncharov conjecture.\nLet $\\mathcal{A} \\subset Y$ be a partial minimal model of $\\mathcal{A}$\nand let $D_1, \\dots , D_s$ be the irreducible divisors of $Y$ contained\nin $Y\\setminus \\mathcal{A}$. Assume that\n$p^*_2|_{N^{\\circ}}: N^{\\circ}\\to N_{\\text{uf}}^*$ is surjective and\nthat the point\n$\\mathop{\\mathrm{ord}}_{D_j}\\in \\mathcal{A} ^{\\vee}(\\mathbb{Z} ^t)$ has\nan optimized seed for every $1 \\leq j \\leq s$. 
Then the partial minimal\nmodel $\\mathcal{A} \\subset Y$ has enough theta functions.\n\n*Proof.* Since $p^*_2|_{N^{\\circ}}$ is surjective we have that\n$\\mathcal{A}_{\\mathrm{prin}}$ is isomorphic to $\\mathcal{A} \\times T_M$\n(see ). Consider the partial compactification\n$\\mathcal{A}_{\\mathrm{prin}}\\subset Y \\times T_M$. Its boundary is\nisomorphic to $D\\times T_M$ and the irreducible components of the\nboundary are the divisors $\\widetilde{D}_1, \\dots, \\widetilde{D}_s$,\nwhere $\\widetilde{D}_j:=D_j \\times T_M$. By hypothesis\n$\\mathop{\\mathrm{ord}}_{D_j}$ is optimized for some seed $\\textbf{s}_j$.\nLet $\\widetilde{\\textbf{s}}_j$ be the seed for\n$\\Gamma_{{\\mathrm{prin}} }$ obtained mutating\n$\\textbf{s}_{0_{{\\mathrm{prin}} }}$ in the same sequence of directions\nneeded to obtain $\\textbf{s}_j$ from $\\textbf{s}_0$. Observe that for\nevery $1\\leq j \\leq s$, under the identifications\n$$\\mathcal{A} _{{\\mathrm{prin}} ,\\widetilde{\\textbf{s}}_j}(\\mathbb{Z} ^t) = N_{\\mathrm{prin}} ^{\\circ} = \\mathcal{A} _{\\textbf{s}_j}(\\mathbb{Z} ^t) \\oplus  T_M(\\mathbb{Z} ^t),$$\nthe point $\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}$ of\n$\\mathcal{A}_{\\mathrm{prin}}(\\mathbb{Z} ^t)$ corresponds to the point\n$(\\mathop{\\mathrm{ord}}_{D_j},0)$ of\n$\\mathcal{A} (\\mathbb{Z} ^t)\\times T_M(\\mathbb{Z} ^t)$.\n\nRecall that the index set of unfrozen indices for\n$\\mathcal{A}_{\\mathrm{prin}}$ is $I_{\\text{uf}}$. In particular, for\nevery $k \\in I_{\\text{uf}}$ we have that the $k^{\\text{th}}$ element of\n$\\widetilde{\\textbf{s}}_{j}$ is of the form $( e_{k;j},0)$, where\n$e_{k;j}$ is the $k^{\\text{th}}$ element of $\\textbf{s}_j$. 
Then for\neach $1\\leq j\\leq s$ we compute $$\\begin{aligned}\n\\{ (e_{k;j},0),  \\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\} & = \\{ (e_{k;j},0), (\\mathop{\\mathrm{ord}}_{D_j},0)\\} \\\\\n& = \\{e_{k;j}, \\mathop{\\mathrm{ord}}_{D_j} \\} \\geq 0.\n\\end{aligned}$$\n\nThis tells us that $\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}$ is\noptimized for $\\widetilde{\\textbf{s}}_j$. Let\n$W_{Y\\times T_M}=\\sum_{j=1}^{s}\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}^\\vee}_{\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}}$\nbe the superpotential associated to\n$\\mathcal{A}_{\\mathrm{prin}}\\subset Y \\times T_M$. By Proposition 9.7\nand Lemma 9.10 (3) of the integral points of $\\Xi_{Y \\times T_M}$ can be\ndescribed as\n$$\\Xi_{Y \\times T_M}(\\mathbb{Z} ) = \\{ b \\in \\Theta(\\mathcal{A}_{\\mathrm{prin}}) \\mid \\mathop{\\mathrm{ord}}_{i(b)} (\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}^{\\vee}}_j)\\geq 0 \\text{ for all } j\\}.$$\nWe define $\\mathop{\\mathrm{mid}}(Y\\times T_M)$ to be the vector subspace\nof $\\mathop{\\mathrm{mid}}(\\mathcal{A}_{\\mathrm{prin}})$ spanned by the\ntheta functions parametrized by $\\Xi_{Y \\times T_M}(\\mathbb{Z} ^T)$. For\nthe convenience of the reader we point out that in the notation of the\npartial compactification $Y\\times T_M$ of $\\mathcal{A}_{\\mathrm{prin}}$\nwould be denoted by $\\overline{\\mathcal{A} }_{\\text{prin}}^{S}$ and\n$\\Xi_{Y \\times T_M}(\\mathbb{Z} )$ by\n$\\Theta(\\overline{\\mathcal{A} }_{\\text{prin}}^{S})$, where\n$S:=\\{ i(\\mathop{\\mathrm{ord}}_{\\widetilde{D}_1}),\\dots , i(\\mathop{\\mathrm{ord}}_{\\widetilde{D}_s})\\}$.\nBy we have\n$$\\mathop{\\mathrm{mid}}(Y\\times T_M)=H^0(Y\\times T_M, \\mathcal{O}_{Y\\times T_M}) \\cong H^0(Y, \\mathcal{O}_{Y})\\otimes_{\\Bbbk} H^0( T_M, \\mathcal{O}_{ T_M}).$$\nIn particular, $H^0(Y\\times T_M, \\mathcal{O}_{Y\\times T_M})$ has a theta\nbasis parametrized by $\\Theta(Y\\times T_M)$. 
The theta function\n$\\vartheta ^{\\mathcal{A} }_{\\mathop{\\mathrm{ord}}_{D_j}}$ is obtained\nfrom $\\vartheta ^{\\mathcal{A}_{\\mathrm{prin}}}_{\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}}$ by\nspecializing the coefficients to $1$. This implies that\n$$\\Xi_{Y}\\cap \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee})= \\Xi_{Y \\times T_M} \\cap \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}).$$\nWe conclude that $H^0(Y, \\mathcal{O}_Y)$ has a theta basis parametrized\nby the integral points of $\\Xi_{Y}$. ◻\n\n# Valuations on middle cluster algebras and adapted bases\n\nIn the authors noticed that the so-called **g**-vectors associated to\ncluster variables can be used to construct valuations on\n$\\Bbbk(\\mathcal{A} )$ provided $\\Gamma$ is of full-rank. In this section\nwe study some properties of these valuations. We extend this approach\nto quotients of $\\mathcal{A}$ and (fibres of) $\\mathcal{X}$.\n\nLet $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A} , \\mathcal{X} , \\mathcal{A} /T_H$ or\n$\\mathcal{X} _{\\bf 1}$. Recall from § that every theta function on\n$\\mathcal{V}$ is labeled with a point of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$, see .\n\n**Definition 40**. Suppose $\\Gamma$ is of full-rank and let\n$\\textbf{s}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$ be a seed for $\\Gamma$.\nThe **opposite dominance order** on $M^\\circ$ defined by $\\textbf{s}$ is\nthe partial order $\\preceq_{\\textbf{s}}$ on $M^\\circ$ determined by the\nfollowing condition: $$\\label{eq:dom_order}\nm_1 \\prec_{\\textbf{s}} m_2 \\ \\Leftrightarrow \\ m_2=  m_1 + p^{\\ast}_1(n) \\text{ for some }n\\in N^+_{\\operatorname{uf}, \\textbf{s}}.$$\n\n**Remark 41**. In Definition , $m_1\\preceq_{\\textbf{s}} m_2$ means that\neither $m_1 \\prec_{\\textbf{s}} m_2$ or $m_1=m_2$. We will also adopt this\nnotation for other orders we consider. 
The dominance order was\noriginally considered in and it is the opposite order to the one given\nin Definition . This order was exploited by in his work on bases for\ncluster algebras. The full-rank condition is needed so that\n$\\preceq_{\\textbf{s}}$ is antisymmetric. However, observe that for every\nseed $\\textbf{s}$ such that\n$\\text{ker}(p_1^*)\\cap N^+_{\\operatorname{uf}, \\textbf{s}} = \\emptyset$,\nequation still determines a partial order on $M^\\circ$ even if $\\Gamma$\nis not of full-rank. Nonetheless, whenever we talk about an (opposite)\ndominance order in this paper we will be tacitly assuming that $\\Gamma$\nis of full-rank.\n\nIt is straightforward to verify that $\\preceq_{\\textbf{s}}$ is\n**linear**. That is, $m_1 \\preceq_{\\textbf{s}} m_2$ implies that\n$m_1 + m \\preceq_{\\textbf{s}} m_2 + m$ for all $m \\in M^\\circ$.\n\n**Definition 42**. Let $A$ be an integral domain with a $\\Bbbk$-algebra\nstructure, $L$ a lattice isomorphic to $\\mathbb{Z} ^r$ and $\\leq$ a\ntotal order on $L$. A **valuation** on $A$ with values in $L$ is a\nfunction $\\nu : A\\setminus \\{0 \\} \\to (L,\\leq)$ such that\n\n-   $\\nu(f+g) \\geq  \\min\\{\\nu(f), \\nu(g)\\}$, unless $f+g=0$,\n\n-   $\\nu(fg)= \\nu(f) + \\nu(g)$,\n\n-   $\\nu(cf)=\\nu(f)$ for all $c \\in \\Bbbk^*$.\n\nFor $l \\in L$ we define the subspace\n$A_{\\nu \\geq l}:= \\{ x\\in A \\setminus \\{0\\} \\mid \\nu(x)\\geq l\\} \\cup \\{ 0 \\}$\nof $A$. The subspace $A_{\\nu > l}$ is defined analogously. We say that\n$\\nu$ has **1-dimensional leaves** if the dimension of the quotient\n$$\\label{eq:graded_piece}\nA_l:=A_{\\nu \\geq l} \\big{/} A_{\\nu > l}$$ is either $0$ or $1$ for all\n$l\\in L$. A basis $B$ of $A$ is **adapted** for $\\nu$ if for all\n$l\\in L$ the set $B\\cap A_{\\nu \\geq l}$ is a basis of $A_{\\nu\\geq l}$.\n\n**Lemma 43**.\n\nAssume $\\Gamma$ is of full-rank. 
Let\n$\\vartheta^{\\mathcal{A} }_{m_1},\\vartheta^{\\mathcal{A} }_{m_2}\\in \\text{mid}(\\mathcal{A} )$\nwith\n$m_1,m_2\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})=M^\\circ$.\nThen the product\n$\\vartheta^{\\mathcal{A} }_{m_1}\\vartheta^{\\mathcal{A} }_{m_2}$ expressed\nin the theta basis of $\\text{mid}(\\mathcal{A} )$ has the following form\n$$\\vartheta^{\\mathcal{A} }_{m_1}\\vartheta^{\\mathcal{A} }_{m_2}= \\vartheta^{\\mathcal{A} }_{m_1+m_2}+ \\sum_{m_1+m_2 \\prec_{\\textbf{s}}  m}c_{m}\\vartheta^{\\mathcal{A} }_{m}.$$\n\n*Proof.* First notice that for any broken line $\\gamma$ we have that\n$$F(\\gamma)=I(\\gamma) +a_{1}p^*_1(n_{1})+ \\dots + a_{r}p^*_1(n_{r}),$$\nwhere $a_1, \\dots , a_r$ are non-negative integers and\n$n_1, \\dots , n_r \\in N^+_{\\operatorname{uf}, \\textbf{s}}$. This follows\nfrom and the bending rule of broken lines (*i.e.* (4)). In particular,\nwe have that\n$a_{1}n_{1}+ \\dots + a_{r}n_{r}\\in N^+_{\\operatorname{uf}, \\textbf{s}} \\cup \\{ 0 \\}$.\nMoreover, $a_{1}p^*_1(n_{1})+ \\dots + a_{r}p^*_1(n_{r}) = 0$ if and only\nif $a_1=\\cdots = a_r =0$. Therefore,\n$I(\\gamma) \\preceq_{\\textbf{s}} F(\\gamma)$ and $I(\\gamma)=F(\\gamma)$ if\nand only if $\\gamma$ does not bend at all.\n\nThe statement we want to prove already follows from the observations\nmade above. Indeed, by we know that $\\alpha(m_1,m_2,m)\\neq 0$ if and\nonly if there exist broken lines $\\gamma_1$ and $\\gamma_2$ such that\n$I(\\gamma_i)=m_i$ for $i \\in \\{1,2\\}$ and\n$F(\\gamma_1)+F(\\gamma_2)=m=\\gamma_1(0)=\\gamma_2(0)$. Therefore, if\n$\\alpha(m_1,m_2,m)\\neq 0$ then\n$m_1 + m_2=I(\\gamma_1) + I(\\gamma_2) \\preceq_{\\textbf{s}} m$. Moreover,\nthe equality $m_1+ m_2=m$ holds if and only if both $\\gamma_1$ and\n$\\gamma_2$ do not bend at all. This latter case can be realized in a\nunique way, therefore, $\\alpha(m_1,m_2,m_1+m_2)=1$. 
◻\n\nFrom now on the symbol $\\leq_{\\textbf{s}}$ is used to denote a total\norder on $M^\\circ$ refining $\\preceq_{\\textbf{s}}$.\n\n**Definition 44**. Let\n${\\bf m}=(m_{\\textbf{s}^\\vee})\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee})$.\nThe **g-vector of** $\\vartheta ^{\\mathcal{A} }_{\\bf m}$ **with respect\nto** $\\textbf{s}$ is $$\\label{eq:red-g-val-A}\n{\\bf g}_{\\textbf{s}}\\left(\\vartheta ^{\\mathcal{A} }_{\\bf m}\\right):= m_{\\textbf{s}^\\vee}\n\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}).$$\n\n**Definition 45**.\n\nAssume $\\Gamma$ is of full-rank and think of $M^{\\circ}$ as\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$.\nLet\n$\\mathbf{g} _{\\textbf{s}}:\\mathop{\\mathrm{mid}}(\\mathcal{A} ) \\setminus \\{ 0\\} \\to (M^{\\circ},\\leq_{\\textbf{s}})$\nbe the map given by $$\\label{eq:g_val}\n    \\mathbf{g} _{\\textbf{s}}(f):= \\min{}_{\\leq_{\\textbf{s}}}\\{m_1, \\dots , m_t\\},$$\nwhere\n$f=c_1\\vartheta^{\\mathcal{A} }_{m_1} + \\dots + c_t\\vartheta^{\\mathcal{A} }_{m_t}$,\n$m_j\\in M^\\circ$ and $c_j\\not=0$ for all $j=1,\\dots,t$ is the expression\nof $f$ in the theta basis of $\\text{mid}(\\mathcal{A} )$.\n\n**Lemma 46**.\n\nFor every seed $\\textbf{s}$ the map $\\mathbf{g} _{\\textbf{s}}$ is a\nvaluation on $\\mathop{\\mathrm{mid}}(\\mathcal{A} )$ with 1-dimensional\nleaves and the theta basis\n$\\{ \\vartheta _{m} \\mid m\\in \\Theta (\\mathcal{A} ) \\}$ is adapted for\n$\\mathbf{g} _{\\textbf{s}}$.\n\n*Proof.* This statement follows from but for the convenience of the\nreader we give a proof here. Items (1) and (3) of Definition  follow\ndirectly from the definition of $\\mathbf{g} _{\\textbf{s}}$. For item (2)\nconsider the expressions\n$f=\\sum_{i=1}^r c_i\\vartheta^{\\mathcal{A} }_{m_i}$ and\n$g=\\sum_{j=1}^s c'_j\\vartheta^{\\mathcal{A} }_{m'_j}$ where all $c_i$ and\n$c'_j$ are non-zero. 
Then by $$\\begin{aligned}\n\\label{eq:fg in basis}\nfg=\\sum_{i,j} c_ic'_j\\left(\\vartheta^{\\mathcal{A} }_{m_i+m'_j} + \\sum_{m_i+m'_j\\prec_{\\textbf{s}} m}c_{m}\\vartheta^{\\mathcal{A} }_{m}\\right).\n\\end{aligned}$$ By definition of $\\mathbf{g} _{\\textbf{s}}$ we have\n$m_\\mu:=\\mathbf{g} _{\\textbf{s}}(f)\\prec_{\\textbf{s}} m_i$ for all\n$i\\in \\{1,\\dots, r\\} \\setminus \\{ \\mu \\}$ and\n$m'_\\nu:=\\mathbf{g} _{\\textbf{s}}(g)\\prec_{\\textbf{s}} m'_j$ for all\n$j\\in \\{1,\\dots,s\\}\\setminus \\{\\nu\\}$. We need to show that the term\n$\\vartheta_{m_\\mu+m'_\\nu}$ appears with non-zero coefficient in $fg$.\nAssume there exist $i\\not =\\mu$ and $j\\not=\\nu$ such that\n$m_\\mu+m'_\\nu=m_i+m'_j$. Then as $\\prec_{\\textbf{s}}$ is linear we have\n$$m_\\mu +m'_\\nu \\prec_{\\textbf{s}} m_\\mu + m'_j \\prec_{\\textbf{s}} m_i + m'_j,$$\na contradiction. Hence, the term $\\vartheta_{m_\\mu+m'_\\nu}$ appears in\nthe expression of $fg$ with coefficient $c_\\mu c'_\\nu\\not =0$ and\n$\\mathbf{g} _{\\textbf{s}}(fg)=m_\\mu+m'_\\nu=\\mathbf{g} _{\\textbf{s}}(f)+\\mathbf{g} _{\\textbf{s}}(g)$.\n\nThe fact that ${\\bf g}_{\\textbf{s}}$ has 1-dimensional leaves follows\ndirectly from (). It is also clear from the definitions that for\n$m\\in M^{\\circ}$ the subspace $\\mathop{\\mathrm{mid}}(\\mathcal{A} )_{m}$\nas in () is isomorphic to $\\Bbbk\\cdot \\vartheta ^{\\mathcal{A} }_{m}$ if\n$m \\in \\Theta(\\mathcal{A} )$ and $0$-dimensional otherwise. In\nparticular, the fact that we have a bijection between the set of values\nof $\\mathbf{g} _{\\textbf{s}}$ and the elements of the theta basis is\nequivalent to the theta basis being an adapted basis, see . ◻\n\n**Corollary 47**. 
The image of the valuation ${\\bf g}_{\\textbf{s}}$ is\nindependent of the linear refinement $\\leq_{\\textbf{s}}$ of\n$\\preceq_{\\textbf{s}}$.\n\n*Proof.* Since the theta basis is adapted for ${\\bf g}_{\\textbf{s}}$ we\nhave\n$${\\bf g}_{\\textbf{s}}\\left(\\mathop{\\mathrm{mid}}(\\mathcal{A} )\\setminus \\{0\\}\\right)= {\\bf g}_{\\textbf{s}}\\left(\\Theta(\\mathcal{A} )\\right).$$\nThe result follows. ◻\n\n**Remark 48**.\n\nSince $\\mathop{\\mathrm{mid}}(\\mathcal{A} )$ is a domain (see Remark )\nwhose associated field of fractions is isomorphic to\n$\\Bbbk(A_i :i \\in I)$, we can extend the valuation\n${\\bf g}_{\\mathbf{s}}$ on $\\text{mid}(\\mathcal{A} )$ to a valuation on\n$\\Bbbk(A_i :i \\in I)$ by declaring\n${\\bf g}_{\\mathbf{s}} (f/g):={\\bf g}_{\\mathbf{s}} (f)- {\\bf g}_{\\mathbf{s}} (g)$.\n\nThe valuation ${\\bf g}_{\\textbf{s}}$ is called the **${\\bf g}$-vector\nvaluation associated to $\\textbf{s}$**.\n\nWe now turn our attention to quotients of $\\mathcal{A}$. We keep the\nassumption that $\\Gamma$ is of full-rank and consider a saturated\nsublattice $H=H_{\\mathcal{A} }$ of $K^\\circ$. Recall from § that\n$$\\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{A} /T_H)^{\\vee}_{\\textbf{s}^\\vee})= H^{\\perp}.$$\nSince $\\Theta(\\mathcal{A} /T_{H})_{\\textbf{s}^\\vee}\\subset  H^ \\perp$,\nwe can restrict the total order $\\leq_{\\textbf{s}}$ on\n$M^{\\circ}$ to $H^{\\perp}$ to obtain a **g**-vector valuation on\n$\\mathop{\\mathrm{mid}}(\\mathcal{A} /T_{H})$ associated to $\\textbf{s}$\nas in the previous cases:\n$${\\bf g}_{\\textbf{s}}: \\mathop{\\mathrm{mid}}(\\mathcal{A} /T_H)\\setminus\\{0\\} \\to \\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{A} /T_H)^{\\vee}_{\\textbf{s}^\\vee}).$$\n\n**Remark 49**. As opposed to the case of $\\mathcal{A}$, in general the\nfield of fractions of $\\mathop{\\mathrm{mid}}(\\mathcal{A} /T_H)$ might\nnot be isomorphic to $\\Bbbk(\\mathcal{A} /T_H)$. 
This fails for example\nif the smallest cone in\n$\\mathrm{Trop} _{\\mathbb{R} }((\\mathcal{A} /T_H)^{\\vee}_{\\textbf{s}^\\vee})$\ncontaining $\\Theta(\\mathcal{A} /T_H)_{\\textbf{s}^\\vee}$ is not\nfull-dimensional. However, the field of fractions of\n$\\mathop{\\mathrm{mid}}(\\mathcal{A} /T_H)$ is isomorphic to\n$\\Bbbk(\\mathcal{A} /T_H)$ provided $\\mathcal{A} /T_H$ satisfies the full\nFock–Goncharov conjecture. In such a case, a **g**-vector valuation on\n$\\mathop{\\mathrm{mid}}(\\mathcal{A} /T_H)$ can be extended to\n$\\Bbbk(\\mathcal{A} /T_H)$ as in .\n\nWe now treat the case of $\\mathcal{X}$. So fix a cluster ensemble\nlattice map $p^*:N \\to M^{\\circ}$ and a seed $\\textbf{s}$. Consider the\nidentifications\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^\\vee_{\\textbf{s}})= d\\cdot N$\nand\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{{\\mathrm{prin}} ,\\widetilde{\\textbf{s}}^\\vee}) = M^{\\circ}_{{\\mathrm{prin}} }=M^{\\circ}\\oplus N$\nwhere $\\widetilde{\\textbf{s}}$ is the seed for $\\Gamma_{\\mathrm{prin}}$\nobtained mutating $\\textbf{s}_{0_{\\mathrm{prin}} }$ in the same sequence\nof directions needed to obtain $\\textbf{s}$ from $\\textbf{s}_0$. Recall\nfrom § that we have an inclusion\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^\\vee_{\\textbf{s}})\\to \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{{\\mathrm{prin}} ,\\widetilde{\\textbf{s}}^\\vee})$\ngiven by $dn \\mapsto (p^*(n),n)$.\n\n**Definition 50**. Let\n${\\bf n}=(dn_{\\textbf{s}^\\vee})\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^{\\vee})$.\nThe **c-vector of** $\\vartheta ^{\\mathcal{X} }_{\\bf n}$ with respect to\n$\\textbf{s}$ is $$\\label{eq:red-g-val-X}\n{\\bf c}_{\\textbf{s}}\\left(\\vartheta ^{\\mathcal{X} }_{\\bf n}\\right):= dn_{\\textbf{s}^\\vee}\n\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^{\\vee}_{\\textbf{s}^\\vee}).$$\n\n**Remark 51**. 
Observe that\n$\\mathbf{c} _\\textbf{s}(\\vartheta ^{\\mathcal{X} }_{\\bf n})$ is an\nelement of $d\\cdot N$. In practice we could work with the lattice $N$ as\nopposed to $d\\cdot N$ as they are canonically isomorphic. The lattice\n$N$ is the set where the ${\\bf c}$-vectors (in the sense of ) live.\n\n**Definition 52**. The **divisibility order** on $N$ determined by\n$\\textbf{s}$ is the partial order $\\preceq_{\\textbf{s}, \\text{div}}$\ngiven by\n$$n_1 \\preceq_{\\textbf{s}, \\text{div}} n_2 \\text{ if and only if } \nn_2- n_1 \\in N_{\\textbf{s}}^+.$$\n\n**Lemma 53**.\n\nThe restriction of $\\preceq_{\\widetilde{\\textbf{s}}^\\vee}$ to the $N$\ncomponent of $M^\\circ_{{\\mathrm{prin}} }$ coincides with the\ndivisibility order $\\preceq_{\\textbf{s},\\text{div}}$ on $N$.\n\n*Proof.* Let\n$p^*_{{\\mathrm{prin}} ,1}:N_{\\operatorname{uf}, {\\mathrm{prin}} }\\to M^\\circ_{\\mathrm{prin}}$\nbe the map given by $(n,m)\\mapsto \\{ (n,m), \\cdot \\}_{{\\mathrm{prin}} }$ (in\nother words, $p^*_{{\\mathrm{prin}} ,1}$ corresponds to the map $p_1^*$\nin for $\\Gamma_{{\\mathrm{prin}} }$). In particular,\n$p^*_{{\\mathrm{prin}} ,1} (n,0) = (p^*_1(n), n)$. Let $n_1,n_2 \\in N$ be\ndistinct elements such that $n_2-n_1 \\in N^+_\\textbf{s}$. Let\n$\\widetilde{m}_i=(p^*_1(n_i),n_i)$ for $i = 1,2$. Then\n$\\widetilde{m}_2 -\\widetilde{m}_1= (p^*_1(n_2-n_1), n_2 -n_1)$. The\nresult follows. ◻\n\nThe next result follows at once from and .\n\n**Lemma 54**. 
Let\n$\\vartheta ^{\\mathcal{X} }_{dn_1},\\vartheta ^{\\mathcal{X} }_{dn_2} \\in \\mathop{\\mathrm{mid}}(\\mathcal{X} )$\nwith\n$dn_1, dn_2 \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^\\vee_{\\textbf{s}^\\vee})=d\\cdot N$.\nThen the product\n$\\vartheta^{\\mathcal{X} }_{dn_1}\\vartheta^{\\mathcal{X} }_{dn_2}$\nexpressed in the theta basis of $\\mathop{\\mathrm{mid}}(\\mathcal{X} )$ is\nof the following form\n$$\\vartheta^{\\mathcal{X} }_{dn_1}\\vartheta^{\\mathcal{X} }_{dn_2}= \\vartheta^{\\mathcal{X} }_{dn_1+dn_2}+ \\sum_{n_1+n_2 \\ \\prec_{\\textbf{s}, \\text{div}} \\ n} c_{n}\\vartheta^{\\mathcal{X} }_{dn}.$$\n\nFrom now on we let $\\leq_{\\textbf{s},\\text{div}}$ be any total order\nrefining $\\preceq_{\\textbf{s}, \\text{div}}$.\n\n**Corollary 55**. Let\n${\\bf c}_{\\textbf{s}}:\\mathop{\\mathrm{mid}}(\\mathcal{X} ) \\setminus \\{ 0\\} \\to (d \\cdot N,\\leq_{\\textbf{s},\\text{div}})$\nbe the map defined by\n$${\\bf c }_{\\textbf{s}}(f):= \\min{}_{\\leq_{\\textbf{s},\\text{div}}}\\{dn_1, \\dots , dn_t\\},$$\nwhere\n$f=c_1\\vartheta ^{\\mathcal{X} }_{d n_1} + \\dots + c_t\\vartheta ^{\\mathcal{X} }_{d n_t}$\nis the expression of $f$ in the theta basis of\n$\\text{mid}(\\mathcal{X} )$. Then ${\\bf c }_{\\textbf{s}}$ is a valuation\nwith 1-dimensional leaves and the theta basis for\n$\\mathop{\\mathrm{mid}}(\\mathcal{X} )$ is adapted for\n${\\bf c}_\\textbf{s}$.\n\nWe now let $\\mathcal{X} _{\\bf 1}$ be the fibre of $\\mathcal{X}$\nassociated to a sublattice $H:= H_{\\mathcal{X} } \\subset K$. 
In order to\ndefine a **c**-vector valuation on\n$\\mathop{\\mathrm{mid}}(\\mathcal{X} _{\\bf 1})$ we need that\n$$H\\cap N^+_{\\textbf{s}}= \\emptyset.$$ Indeed, if this condition holds, then\n$\\preceq_{\\textbf{s}, \\text{div}}$ induces a well-defined partial order on\n$N/H =\\mathcal X_{\\bf 1,\\textbf{s}}$ defined as\n$$n_1 + H \\preceq_{\\textbf{s}, \\text{div}} n_2+H \\quad \\text{ if and only if } \\quad n_2 - n_1 \\in N^+_{\\textbf{s}}+ H.$$\nThe rest of the construction follows from the cases already treated.\n\n**Lemma 56**. Suppose $\\Gamma$ is of full-rank and let\n$p: \\mathcal{A} \\to \\mathcal{X}$ be a cluster ensemble map. Then we have\na commutative diagram $$\\xymatrix{\n\\mathop{\\mathrm{mid}}(\\mathcal{X} ) \\setminus \\{0\\} \\ar^{p^*}[r] \\ar_{{\\bf c}_{\\textbf{s}}}[d] &  \\mathop{\\mathrm{mid}}(\\mathcal{A} ) \\setminus \\{0\\} \\ar^{{\\bf g}_{\\textbf{s}}}[d] \\\\\n\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{X} ^{\\vee}_{\\textbf{s}^\\vee}) \\ar_{(p^\\vee)^T\\circ i} [r] & \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}) \n}$$\n\n*Proof.* It is enough to show that for\n${\\bf n} \\in \\Theta(\\mathcal{X} )$ we have\n$$\\mathbf{g} _{\\textbf{s}}(p^*(\\vartheta ^\\mathcal{X} _{\\bf n}))=(p^\\vee)^T\\circ i({\\bf c}_{\\textbf{s}} (\\vartheta ^\\mathcal{X} _{\\bf n})).$$\nLet $dn=\\mathfrak{r}_{\\textbf{s}^\\vee}({\\bf n})$. We have that\n$$\\vartheta ^{\\mathcal{X} }_{dn}=z^n + \\sum_{n\\prec_{\\textbf{s}, \\text{div}}n'}a_{n'}z^{n'}.$$\nTherefore,\n$$p^*(\\vartheta ^{\\mathcal{X} }_{dn})=z^{p^*(n)} + \\sum_{n\\prec_{\\textbf{s}, \\text{div}}n'}a_{n'}z^{p^*(n')}.$$\nWe conclude that\n$\\mathbf{g} _{\\textbf{s}}(p^*(\\vartheta ^\\mathcal{X} _{\\bf n}))=p^*(n)$.\nOn the other hand we have that\n${\\bf c}_{\\textbf{s}} (\\vartheta ^\\mathcal{X} _{\\bf n})=dn$. We compute\n$$\\begin{aligned}\n    (p^\\vee)^T\\circ i (dn)= ((p^\\vee)^*)^*(-dn)=\\left(-\\frac{1}{d}(p^*)^*\\right)^*(-dn)=p^*(n).\n\\end{aligned}$$ The claim follows. 
◻\n\nWe would like to treat **g**-vector valuations for varieties of the form\n$\\mathcal{A}$ and $\\mathcal{A} /T_H$ and **c**-vector valuations on\n$\\mathcal{X}$ and $\\mathcal{X} _{\\bf 1}$ in a uniform way. With this in\nmind we introduce the following notation.\n\n**Notation 57**.\n\nLet $\\mathcal{V}$ be a cluster variety and $\\mathcal{V} ^{\\vee}$ its\nFock–Goncharov dual. The cluster valuation on\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ associated to a seed\n$\\textbf{s}\\in \\overset{\\longrightarrow}{\\mathbb{T}_r}$ is\n$$\\nu_{\\textbf{s}}:\\mathop{\\mathrm{mid}}(\\mathcal{V} )\\setminus\\{0\\} \\to (\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee_{\\textbf{s}^\\vee}), <_{\\textbf{s}}),$$\nwhere\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee_{\\textbf{s}^\\vee})$ is\nas in and $<_{\\textbf{s}}$ is a linear order on\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee_{\\textbf{s}^\\vee})$\nrefining $\\prec_\\textbf{s}$ in case $\\mathcal{V} =\\mathcal{A}$ or\n$\\mathcal{A} /T_H$ and it refines $\\prec_{\\textbf{s},\\text{div}}$ if\n$\\mathcal{V} =\\mathcal{X}$ or $\\mathcal{X} _{\\bf 1}$.\n\n# Newton–Okounkov bodies\n\nIn this section we provide a general approach to construct\nNewton–Okounkov bodies associated to certain partial minimal models of\nvarieties with a cluster structure. In particular, we treat a situation\nthat often arises in representation theory where the universal torsor of\na projective variety has a cluster structure of type $\\mathcal{A}$. The\nNewton–Okounkov bodies we construct depend on the choice of an initial\nseed. 
Hence we discuss how the bodies associated to different choices of\ninitial seed are related and introduce the intrinsic Newton–Okounkov\nbody which is seed independent.\n\n## Schemes and ensembles with cluster structure\n\n**Definition 58**.\n\nWe say a smooth scheme (over $\\Bbbk$) $V$ **can be endowed with cluster\nstructure of type** $\\mathcal{V}$ if there is a birational map\n$\\Phi: \\mathcal{V} \\dashrightarrow V$ which is an isomorphism outside a\ncodimension two subscheme of the domain and range. In this setting, we\nsay that the pair $(V,\\Phi)$ is **a scheme with cluster structure of\ntype** $\\mathcal{V}$.\n\n**Remark 59**. We are straying slightly from in . Specifically, we are\nnow including $\\Phi$ as part of the data defining a scheme with cluster\nstructure. So, given two different birational maps\n$\\Phi_1:\\mathcal{V} _1 \\dashrightarrow V$ and\n$\\Phi_2: \\mathcal{V} _2 \\dashrightarrow V$ as in , we now consider\n$(V,\\Phi_1)$ and $(V,\\Phi_2)$ different as schemes with cluster\nstructure (as is the case, for example, for open positroid varieties,\nsee Remark ). Nevertheless, when the map $\\Phi$ is clear from the\ncontext or we are just dealing with a single birational map\n$\\mathcal{V} \\dashrightarrow V$, we will simply say that $V$ has a\ncluster structure of type $\\mathcal{V}$.\n\nLet $V=(V,\\Phi)$ be a scheme with a cluster structure of type\n$\\mathcal{V}$. Since $V$ is normal and isomorphic to $\\mathcal{V}$ up to\nco-dimension $2$ then $V$ and $\\mathcal{V}$ have isomorphic rings of\nregular functions. In turn, we can talk about polynomial theta functions\non $V$ which we denote by $\\vartheta ^V_{\\bf v}$ for\n${\\bf v}\\in \\Theta (\\mathcal{V} )$. Moreover, recall that $\\mathcal{V}$\nis log Calabi–Yau. By $V$ is also log Calabi–Yau. Hence, $V$ has a\ncanonical volume form whose pullback by $\\Phi$ coincides with the\ncanonical volume form on $\\mathcal{V}$. 
Moreover, a (partial) minimal\nmodel $V\\subset Y$ and its boundary can be defined as in Definition .\n\n**Definition 60**. An inclusion $V \\subset Y$ as an open subscheme of a\nnormal variety $Y$ is a **partial minimal model** of $V$ if the\ncanonical volume form on $V$ has a simple pole along every irreducible\ndivisor of $Y$ contained in $Y \\setminus V$. It is a **minimal model**\nif $Y$ is, in addition, projective. We call $Y \\setminus V$ the\n**boundary** of $V  \\subset Y$.\n\n**Definition 61**. Suppose $\\Phi:\\mathcal{V} \\dashrightarrow V$ endows\n$V$ with a cluster structure of type $\\mathcal{V}$ and that the cluster\nvaluation $\\nu_{\\textbf{s}}$ extends to $\\Bbbk(\\mathcal{V} )$. Then the\n**cluster valuation**\n$\\nu^{\\Phi}_{\\textbf{s}}:\\Bbbk(V)^*\\to \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$\nis given by\n$$\\nu^{\\Phi}_{\\textbf{s}}(f)=  \\nu_{\\textbf{s}}(\\Phi^*(f)).$$\n\n**Definition 62**. Suppose\n$\\Phi_{\\mathcal{A} }:\\mathcal{A} \\dashrightarrow V_1$ and\n$\\Phi_{\\mathcal{X} }:\\mathcal{X} \\dashrightarrow V_2$ endow $V_1$ (resp.\n$V_2$) with cluster structures of type $\\mathcal{A}$ (resp.\n$\\mathcal{X}$). We say that $V_1 \\overset{\\tau}{\\to} V_2$ is a cluster\nensemble structure if there exists a cluster ensemble map\n$p:\\mathcal{A} \\to \\mathcal{X}$ such that the following diagram commutes\n$$\\xymatrix{\n    V_1 \\ar^{\\tau}[r] & V_2 \\\\\n    \\mathcal{A} \\ar@{-->}^{\\Phi_{\\mathcal{A} }}[u] \\ar_p[r] & \\mathcal{X} \\ar@{-->}_{\\Phi_{\\mathcal{X} }}[u].\n    }$$\n\n## Newton–Okounkov bodies for Weil divisors supported on the boundary\n\nThroughout this section we let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A}$, $\\mathcal{X}$, $\\mathcal{A} /T_{H}$ or\n$\\mathcal{X} _{{\\bf 1}}$. Whenever we talk about a cluster valuation on\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ we are implicitly assuming we are\nin a setting where such a valuation exists, see §.\n\n**Definition 63**. 
A closed subset\n$S\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ is\n**positive** if for any positive integers $d_1, d_2$, any\n$p_1\\in d_1\\cdot S(\\mathbb{Z} )$, $p_2\\in d_2\\cdot S(\\mathbb{Z} )$ and\nany $r \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ such that\n$\\alpha (p_1,p_2,r)\\neq 0$, we have that\n$r \\in (d_1 +d_2)\\cdot S(\\mathbb{Z} )$.\n\n**Remark 64**. We can also define positive sets inside\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})_{\\textbf{s}^\\vee}$ in\nexactly the same way they are defined in Definition . In particular we\nhave that $S\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$\nis positive if and only if\n$\\mathfrak{r}_{\\textbf{s}^\\vee}(S)\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^\\vee_{\\textbf{s}})$\nis positive.\n\nIn the authors discuss how positive sets give rise to both, partial\nminimal models of cluster varieties and toric degenerations of such. In\nthis section we study the inverse problem. Namely, we let $(V,\\Phi)$ be\na scheme with a cluster structure of type $\\mathcal{V}$ and construct\nNewton–Okounkov bodies associated to a partial minimal model\n$V \\subset Y$ (see §). Then we show that under suitable hypotheses these\nNewton–Okounkov bodies are positive sets. We let $D_1, \\dots , D_s$ be\nthe irreducible divisors of $Y$ contained in the boundary of\n$V\\subset Y$ and let $D:=\\bigcup_{j=1}^s D_j$.\n\nGiven a Weil divisor $D'$ on $Y$ we denote by $R(D')$ the associated\n**section ring**. Recall that $R(D')$ can be described as the\n$\\mathbb{Z} _{\\geq 0}$-graded ring whose $k^{\\mathrm{th}}$ homogeneous\ncomponent is\n$$R_k(D') := H^0(Y, \\mathcal{O}(kD'))= \\left\\{  f\\in \\Bbbk(Y)^* \\mid \\text{div}(f)+kD'\\geq 0 \\right\\}\\cup \\{ 0\\},$$\nwhere $\\text{div}(f)$ is the principal divisor associated to $f$. 
Even\nmore concretely, if $D'=c_1  D'_1 + \\cdots + c_{s'}D'_{s'}$, where\n$D'_1, \\dots , D'_{s'}$ are distinct prime divisors of $Y$ and\n$c_1, \\dots , c_{s'}$ are non-negative integers, then $R_k(D')$ is the\nvector space consisting of the rational functions on $Y$ that are\nregular on the complement of $\\bigcup_{j=1}^{s'} D'_j$ and whose order\nof vanishing along every prime divisor $D'_j$ is bounded below by\n$-kc_j$. The multiplication of $R(D')$ is induced by the multiplication\non $\\Bbbk(Y)$.\n\n**Definition 65**.\n\nLet $\\nu:\\Bbbk(Y)\\setminus \\{ 0 \\} \\to L$ be a valuation, where\n$(L, < )$ is a linearly ordered lattice. Let $D'$ be a Weil divisor on\n$Y$ having a non-zero global section. For a choice of non-zero section\n$\\tau \\in R_1 (D')$ the associated **Newton–Okounkov body** is\n$$\\begin{split} \n\\Delta_\\nu(D',\\tau) := \\overline{\\mathop{\\mathrm{conv}}\\Bigg( \\bigcup_{k\\geq 1}  \\left\\{\\frac{\\nu\\left(f/\\tau^k\\right)}{k} \\mid f\\in R_k(D')\\setminus \\{0\\} \\right\\} \\Bigg) }\\subseteq L\\otimes \\mathbb{R} ,\n \\end{split}$$ where $\\mathop{\\mathrm{conv}}$ denotes the convex hull\nand the closure is taken with respect to the standard topology of\n$L\\otimes \\mathbb{R}$.\n\nFrom now on we assume that $D'$ has a non-zero global section. We would\nlike to use a cluster valuation\n$\\nu^{\\Phi}_{\\textbf{s}}: \\Bbbk(V)\\setminus \\{ 0\\} \\to (\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee}),<_{\\textbf{s}})$\nto construct Newton–Okounkov bodies. Notice that if $\\mathcal{V}$\nsatisfies the full Fock–Goncharov conjecture, then it is possible to do\nso as we can extend $\\nu_{\\textbf{s}}$ from\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )=\\mathop{\\mathrm{up}}(\\mathcal{V} )$\nto $\\Bbbk(\\mathcal{V} ) = \\Bbbk(Y)$. 
Observe, moreover, that if $D'$ is\nsupported on $D$ (that is $D'=\\sum_{j=1}^s c_jD_j$ for some integers\n$c_1,\\dots , c_s$) then every graded piece $R_k(D')$ is contained in\n$H^{0}(V,\\mathcal{O}_V)\\cong H^{0}(\\mathcal{V} ,\\mathcal{O}_{\\mathcal{V} })$,\nso elements of $R_k(D')$ can be described using the theta basis for\n$H^0(\\mathcal{V} ,\\mathcal{O}_{\\mathcal{V} })$. Moreover,\n$\\mathop{\\mathrm{ord}}_{D_j}\\in \\mathcal{V} (\\mathbb{Z} ^t)$, so we can\ndefine $\\vartheta ^{\\mathcal{V} }_j$ as in .\n\n**Definition 66**. Assume $\\mathcal{V}$ satisfies the full\nFock–Goncharov conjecture and that $D'$ is of the form\n$D'=\\sum_{j=1}^s c_jD_j$. We say that $R(D')$ **has a graded theta\nbasis** if for every integer $k\\geq 0$ the set of theta functions on\n$\\mathcal{V}$ parametrized by the integral points of\n$$P_k(D'):= \\bigcap_{j=1}^s \\left\\{b\\in \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}) \\mid  \\mathrm{Trop} _{\\mathbb{R} }(\\vartheta ^{\\mathcal{V} ^\\vee}_j)(b) \\geq -kc_j\\right\\}$$\nis a basis for $R_k(D')$.\n\nThe reader should notice that in case $\\mathcal{V}$ has theta\nreciprocity (see Definition ), then the definition of $P_k(D')$ becomes\nvery natural from the perspective of toric geometry, see §. We now\nintroduce a notion that allows us to make a good choice for the section\n$\\tau$.\n\n**Definition 67**. 
A subset $L\\subset \\Theta(\\mathcal{V} )$ is\n**linear** if\n\n-   for any $a,b\\in L$ there exists a unique $r\\in\\Theta(\\mathcal{V} )$\n    such that $\\alpha(a,b,r)\\neq 0$ and moreover, $r\\in L$,\n\n-   for each $a\\in L$ there exists a unique $b\\in L$ such that\n    $\\vartheta ^{\\mathcal{V} }_a \\vartheta ^{\\mathcal{V} }_b=1$.\n\nWe further say that a linear subset $L$ **acts linearly** on\n$\\Theta(\\mathcal{V} )$ if for any $a\\in L$ and\n$b \\in \\Theta(\\mathcal{V} )$ there exists a unique\n$r\\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ such that\n$\\alpha(a,b,r)\\neq 0$.\n\nFor example, if $\\mathcal{V} =\\mathcal{A}$ then\n$\\mathfrak{r}_{\\textbf{s}}^{-1}(N_{\\text{uf}}^\\perp)$ is linear and acts\nlinearly on $\\Theta(\\mathcal{V} )$. If $\\mathcal{V} =\\mathcal{X}$ then\n$\\mathfrak{r}_{\\textbf{s}}^{-1}(\\ker(p_2^*))$ is linear and acts\nlinearly on $\\Theta(\\mathcal{V} )$.\n\n**Theorem 68**. Let $V\\subset Y$ be a partial minimal model. Assume the\nfull Fock–Goncharov conjecture holds for $\\mathcal{V}$. Let\n$D'=\\sum_{j=1}^s c_j D_j$ be a Weil divisor on $Y$ supported on $D$ such\nthat $R(D')$ has a graded theta basis. Let $\\tau\\in R_1(D')$ be such\nthat $\\nu^{\\Phi}_{\\textbf{s}}(\\tau)$ belongs to a linear subset of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ acting linearly on\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$. Then the\nNewton–Okounkov body\n$\\Delta_{\\nu^{\\Phi}_{\\textbf{s}}}(D',\\tau)\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$\nis a positive set.\n\n*Proof.* To make notation lighter, throughout this proof we denote\n$\\Delta_{\\nu_{\\textbf{s}}}(D',\\tau)$ simply by $\\Delta$,\n$P_k(D')_\\textbf{s}$ by $P_k$ and $\\nu^{\\Phi}_{\\textbf{s}}$ by\n$\\nu_{\\textbf{s}}$. 
We work in the lattice identification\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$ of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$. The linear subset\nof the statement corresponds to a sublattice\n$L \\subseteq \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$.\n\nConsider $d_1, d_2 \\in \\mathbb{Z} _{>0}$ and\n$p_1\\in d_1\\Delta(\\mathbb{Z} )$, $p_2\\in d_2\\Delta(\\mathbb{Z} )$. We\nhave to show that for any\n$r \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$\nwith $\\alpha (p_1,p_2,r)\\neq 0$ then\n$r \\in (d_1 +d_2)\\Delta(\\mathbb{Z} )$. For this it is enough to show\nthat $k\\Delta = P_k - k\\nu_{\\textbf{s}}(\\tau)$ for all\n$k \\in \\mathbb{Z} _{>0}$ as we now explain.[^5] If this is the case then\nfor $i=1,2$, the point $p_i+d_i\\nu_\\textbf{s}(\\tau)$ belongs to\n$P_{d_i}(\\mathbb{Z} )$. By hypothesis\n$\\vartheta ^V_{p_i+d_i \\nu_{\\textbf{s}}(\\tau)}\\in R_{d_i}(D')$. In\nparticular, the product\n$\\vartheta ^V_{p_1+d_1 \\nu_{\\textbf{s}}(\\tau)}\\vartheta ^V_{p_2+d_2 \\nu_{\\textbf{s}}(\\tau)}$\nmust belong to $R_{d_1+d_2}(D')$ and this product must be expressed as a\nlinear combination of theta functions that belong to $R_{d_1+d_2}(D')$.\nTo finish we just need to convince ourselves that\n$$\\alpha(p_1+d_1\\nu_\\textbf{s}(\\tau),p_2+d_2\\nu_\\textbf{s}(\\tau), r+(d_1+d_2)\\nu_\\textbf{s}(\\tau))\\neq 0$$\nas this would imply\n$$r+(d_1+d_2)\\nu_\\textbf{s}(\\tau)\\in P_{d_1+d_2}(\\mathbb{Z} )=(d_1+d_2)\\Delta(\\mathbb{Z} )+ (d_1+d_2)\\nu_\\textbf{s}(\\tau) .$$\nHowever, this follows at once from the fact that $\\nu_\\textbf{s}(\\tau)$\nbelongs to the linear subset $L$. Indeed, the condition\n$\\alpha(p_1,p_2,r)\\neq 0$ implies the existence of a pair of broken\nlines $\\gamma_1, \\gamma_2$ such that $I(\\gamma_i)=p_i$ and\n$F(\\gamma_1)+F(\\gamma_2)=r$. 
Since $\\nu_\\textbf{s}(\\tau)\\in L$ we can\nconstruct new broken lines $\\gamma'_1$ and $\\gamma'_2$ such that\n$I(\\gamma'_i)=p_i+d_i\\nu_\\textbf{s}(\\tau)$ and\n$F(\\gamma'_1)+F(\\gamma'_2)=r+(d_1+d_2)\\nu_\\textbf{s}(\\tau)$ by changing\nthe direction of all the domains of linearity of $\\gamma_i$ by\n$d_i\\nu_\\textbf{s}(\\tau)$.\n\nWe now proceed to show that $k\\Delta= P_k-k\\nu_\\textbf{s}(\\tau)$ for all\n$k \\in \\mathbb{Z} _{>0}$. First notice that $aP_1= P_a$ for all\n$a\\in \\mathbb{R} _{\\geq 0}$ (if $g$ is a positive Laurent polynomial\nthen $g^T(ax)=ag^T(x)$ provided $a$ is non-negative). Since $P_k$ is\nclosed and convex, in order to show that\n$k \\Delta  \\subset P_k- k\\nu_\\textbf{s}(\\tau)$ it is enough to show that\n$\\frac{k}{k'}\\ \\nu_\\textbf{s}(f/\\tau^{k'})=\\frac{k}{k'}\\ \\nu_\\textbf{s}(f)-k \\nu_\\textbf{s}(\\tau)$\nbelongs to $P_k-k\\nu_\\textbf{s}(\\tau)$ for all $k'\\geq 1$ and all\n$f\\in R_{k'}(D')\\setminus \\{0\\}$. This follows at once from the fact\nthat $\\frac{k}{k'}\\nu_\\textbf{s}(f)\\in P_k$ as $\\frac{k}{k'}P_{k'}=P_k$.\nTo obtain the reverse inclusion it is enough to show that the inclusion\nholds at the level of rational points, namely,\n$P_k(\\mathbb{Q} )-k\\nu_\\textbf{s}(\\tau)\\subset k\\Delta(\\mathbb{Q} )$.\nIndeed, since $P_k$ is a finite intersection of rational half-spaces in\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$, it\ncan be described as the convex hull of its rational points. If\n$x\\in P_k(\\mathbb{Q} )$ then\n$\\frac{x}{k}\\in \\frac{1}{k}P_k(\\mathbb{Q} )=P_1(\\mathbb{Q} )$. 
Let\n$d\\in \\mathbb{Z} _{>0}$ be such that\n$x':=\\frac{dx}{k} \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$.\nIn particular, $x'\\in P_{d}(\\mathbb{Z} )_{\\textbf{s}}$ which gives that\n$d^{-1}\\nu_\\textbf{s}(\\frac{\\vartheta _{x'}}{\\tau^{d}})\\in \\Delta$.\nFinally, notice that\n$d^{-1}\\nu_\\textbf{s}(\\frac{\\vartheta _{x'}}{\\tau^{d}})=d^{-1}(\\nu_\\textbf{s}(\\vartheta _{x'})-d\\nu_\\textbf{s}(\\tau))=d^{-1}x'-\\nu_\\textbf{s}(\\tau)$\nwhich implies $x-k\\nu_\\textbf{s}(\\tau) \\in k\\Delta$. ◻\n\nIn Theorem the assumption that $R(D')$ has a graded theta basis might\nseem rather strong. We now provide a situation in which this hypothesis\nholds and in the next subsection we treat a more robust framework in\nwhich this condition follows directly from the equivariant nature of\ntheta functions.\n\n**Lemma 69**. Let $V\\subset Y$ be a minimal model. Assume\n$D=\\sum_{j=1}^n D_j$ is ample with $D'=cD$ very ample for some\n$c\\in \\mathbb{Z} _{>0}$. Assume further that the image of the embedding\nof $Y$ into a projective space given by $D'$ is projectively normal. If\n$\\mathcal{V}$ has theta reciprocity and the theta functions on\n$\\mathcal{V}$ respect the order of vanishing (see Definition ), then\n$R(D')$ has a graded theta basis.\n\n*Proof.* It is enough to treat the case $\\mathcal{V} =V$. Consider the\naffine cone $\\widetilde{Y}$ of the embedding of $Y$ into a projective\nspace given by $D'$. We consider the canonical projection\n$\\widetilde{Y}\\setminus \\{ 0\\} \\overset{\\pi}{\\to } Y$ and let\n$\\mathcal{V} ':= \\pi^{-1}(\\mathcal{V} )$. Observe that\n$\\mathcal{V} '\\cong \\mathcal{V} \\times \\mathbb{C} ^*$. We may think of\n$\\mathcal{V} '$ as the cluster variety obtained from $\\mathcal{V}$ by\nadding a frozen index and extending trivially the bilinear form in the\nfixed data defining $\\mathcal{V}$. 
In particular,\n$\\text{up}(\\mathcal{V} ')= \\text{up}(\\mathcal{V} )[x^{\\pm 1}]$, where\n$x$ is the coordinate for the $\\mathbb{C} ^*$ component. Notice that the\ntheta functions on $\\mathcal{V} '$ are of the form\n$\\vartheta ^{\\mathcal{V} '}_{(p,h)}=\\vartheta ^{\\mathcal{V} '}_{(0,h)}\\vartheta ^{\\mathcal{V} '}_{(p,0)} =x^h\\vartheta ^{\\mathcal{V} }_p$,\nwhere $\\vartheta ^{\\mathcal{V} }_p$ is a theta function on $\\mathcal{V}$\nand $h \\in \\mathbb{Z} =\\mathrm{Trop} _{\\mathbb{Z} }(\\mathbb{C} ^*)$. An\nanalogous description holds for the theta functions on\n$(\\mathcal{V} ')^\\vee \\cong \\mathcal{V} ^\\vee \\times \\mathbb{C} ^*$.\nNamely, these theta functions are of the form\n$x^h\\vartheta _q^{\\mathcal{V} ^\\vee}$ for some $h\\in \\mathbb{Z}$. We\nconsider the inclusion $R(D')\\hookrightarrow \\text{up}(\\mathcal{V} ')$\ngiven by sending a homogeneous element $f\\in R_k(D')$ to $x^kf$. The map\nis well defined since $f$ is regular on $\\mathcal{V}$. Moreover, if we\nlet $\\widetilde{D}_j:= \\pi^{-1}(D_j)$ then for all $j$ we have\n$\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\left(x^{k}\\right)=k$ and\n$\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\left(\\vartheta ^{\\mathcal{V} '}_{(p,0)}\\right)=\\mathop{\\mathrm{ord}}_{D_j}\\left(\\vartheta ^V_p\\right)$.\nIn particular, thinking of $\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ')$\nas $\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} )\\times \\mathbb{Z}$ we have\n$\\mathop{\\mathrm{ord}}_{\\widetilde{D}_k}=(\\mathop{\\mathrm{ord}}_{D_k},1)$.\nSince theta functions on $\\mathcal{V}$ respect the order of vanishing,\nthe same holds for the theta functions on $\\mathcal{V} '$. This implies\nthat for every $a \\in \\mathbb{Z}$ and every $j$,\n$\\mathop{\\mathrm{ord}}_{D_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)\\geq a$\nif and only if\n$\\mathop{\\mathrm{ord}}_{D_j}(\\vartheta _q^{\\mathcal{V} })\\geq a$ for all\n$q$ such that $\\alpha_q \\neq 0$. 
To see this, there is only one\nimplication to be checked (the other follows from the axioms of\nvaluations). So assume\n$\\mathop{\\mathrm{ord}}_{D_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)\\geq a$.\nSince\n$\\mathop{\\mathrm{ord}}_{D_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)=\\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)$\nand $x^{-a}\\vartheta _q^{\\mathcal{V} }$ is a theta function on\n$\\mathcal{V} '$ for all $q$, we have the following chain of equivalences: $$\\begin{aligned}\n    \\mathop{\\mathrm{ord}}_{D_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)\\geq a & \\Longleftrightarrow   \\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\left(\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right)\\geq a \\\\\n    & \\Longleftrightarrow   \\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}\\left(x^{-a}\\sum_q \\alpha_q \\vartheta _q^{\\mathcal{V} }\\right) \\geq 0 \\\\\n    & \\Longleftrightarrow   \\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}(x^{-a} \\vartheta _q^{\\mathcal{V} })\\geq 0  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0 \\\\\n     & \\Longleftrightarrow   \\mathop{\\mathrm{ord}}_{\\widetilde{D}_j}( \\vartheta _q^{\\mathcal{V} })\\geq a  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0 \\\\\n     & \\Longleftrightarrow   \\mathop{\\mathrm{ord}}_{D_j}( \\vartheta _q^{\\mathcal{V} })\\geq a  \\text{ for all } q \\text{ such that } \\alpha_q\\neq 0.\n\\end{aligned}$$ Since $D'$ is very ample and $Y$ is projectively normal\nin its embedding given by $D'$, we have that\n$H^0(\\widetilde{Y}, \\mathcal{O}_{\\widetilde{Y}}) \\cong R(D') \\hookrightarrow \\text{up}(\\mathcal{V} ')$.\nIn particular, if we express $f \\in R_k(D')$ as\n$f= \\sum_q \\alpha_q \\vartheta ^{\\mathcal{V} }_q$, we have that\n$\\mathop{\\mathrm{ord}}_{D_j}\\left(\\vartheta ^{\\mathcal{V} }_q\\right)\\geq -kc$\nfor all $j$ and all $q$ such that $\\alpha_q \\neq 0$. 
This means that\n$\\vartheta _q^{\\mathcal{V} } \\in R_k(D')$ for all such $q$. In\nparticular, the theta functions of $\\mathcal{V}$ that lie in $R_k(D')$\nform a basis of $R_k(D')$. By theta reciprocity, such theta\nfunctions are precisely those parametrized by $P_k(D')$. ◻\n\n**Remark 70**. If $R(D')$ is finitely generated and the semigroup\ngenerated by the image of $\\nu^\\mathcal{V} _{\\textbf{s}}$ is of\nfull-rank and finitely generated, then there is a one-parameter toric\ndegeneration of $Y$ to the toric variety associated to\n$\\Delta_{\\nu^\\mathcal{V} _{\\textbf{s}}}(D',\\tau)$ [^6]. As explained in\nfor cluster varieties of type $\\mathcal{A}$ (regardless of the full-rank\nassumption) a polyhedral positive set defines a partial compactification\n$\\mathcal{A} _{\\text{prin}} \\subset \\overline{\\mathcal{A} }_{\\text{prin}}$.\nThis compactification comes with a flat morphism\n$\\overline{\\mathcal{A} }_{\\text{prin}}\\to \\mathbb A^r$ having\n$\\overline{\\mathcal{A} }=Y$ as fibre over ${\\bf 1}=(1, \\dots , 1)$ and\nwhose fibre over $0$ is the toric variety associated to the positive\nset. Therefore, both constructions can be used to degenerate varieties\nwith a cluster structure to the same toric variety. However, the variety\ngiven by the latter construction contains various intermediate fibres\nthat lie in between $\\mathcal A=\\mathcal{V}$ and a toric variety.\nMoreover, while Anderson’s degeneration produces a\n$(\\Bbbk^*)$-equivariant family, for the latter degeneration this is the\ncase if and only if $\\Gamma$ is of full-rank.\n\n## Newton–Okounkov bodies for line bundles via universal torsors\n\nIn this section we consider a particularly nice geometric situation that\narises often in representation theory. We let $Y$ be an irreducible\nnormal projective scheme whose Picard group $\\text{Pic}(Y)$ is free of\nfinite rank $\\rho \\in \\mathbb{Z} _{>0}$ (recall that $\\text{Pic}(Y)$ is\nalways abelian). 
Following (see also , , or ), we consider the universal\ntorsor of $Y$ and the associated Cox ring (*cf.* Remark ). For the\nconvenience of the reader we recall these concepts. We begin by\nconsidering the quasi-coherent sheaf of $\\mathcal{O}_Y$-modules\n$$\\bigoplus_{[\\mathcal{L} ] \\in \\text{Pic}(Y)} \\mathcal{L} .$$ In\nessence, the universal torsor of $Y$ is obtained by applying a relative\nspectrum construction (also denoted by **Spec**) to this sheaf. However,\nthe choice of the representative $\\mathcal{L}$ in the class\n$[\\mathcal{L} ]$ prevents this sheaf from having a natural\n$\\mathcal{O}_Y$-algebra structure. To address this situation one can\nproceed as in and consider line bundles\n$\\mathcal{L} _1, \\dots, \\mathcal{L} _{\\rho}$ whose isomorphism classes\nform a basis of $\\text{Pic}(Y)$. For\n$v=(v_{1},\\dots, v_{\\rho})\\in \\mathbb{Z} ^{\\rho}$ we let\n$\\mathcal{L} ^{v}= \\mathcal{L} _1^{\\otimes v_1}\\otimes \\cdots \\otimes \\mathcal{L} _{\\rho}^{\\otimes v_{\\rho}}$\nand consider the quasi-coherent sheaf\n$$\\bigoplus_{v \\in \\mathbb{Z} ^{\\rho}}\\mathcal{L} ^{v}.$$ This sheaf has\na natural structure of a reduced $\\mathcal{O}_Y$-algebra that is locally\nof finite type over $\\mathcal{O}_Y$ (the component associated to the\nzero element of $\\text{Pic}(Y)$). This means that for sufficiently small\naffine open subsets $U$ of $Y$, the space\n$\\bigoplus_{v \\in \\mathbb{Z} ^{\\rho}}\\mathcal{L} ^{v}(U)$ is a finitely\ngenerated $\\mathcal{O}_Y(U)$-algebra. The universal torsor of $Y$ is\nobtained by gluing the affine schemes\n$\\text{Spec}\\left(\\bigoplus_{v \\in \\mathbb{Z} ^{\\rho}}\\mathcal{L} ^{v}(U)\\right)$.\n\n**Definition 71**. 
The **universal torsor** of $Y$ is\n$$\\text{UT} _Y= \\textbf{Spec}\\left(\\bigoplus_{v \\in \\mathbb{Z} ^{\\rho}}\\mathcal{L} ^{v} \\right).$$\nThe **Cox ring** of $Y$ is\n$$\\text{Cox}(Y)= H^0 (\\text{UT} _Y,\\mathcal{O}_{\\text{UT} _Y}).$$\n\nUniversal torsors can be used to generalize the construction of a\nprojective variety from its affine cone as follows. Observe that the\ninclusion of $\\mathcal{O}_Y$ as the degree $0$ part of\n$\\bigoplus_{v \\in \\mathbb{Z} ^{\\rho}}\\mathcal{L} ^{v}$ gives rise to an\naffine regular map $\\text{UT} _Y\\to Y$. Since $\\text{Cox}(Y)$ is\n$\\text{Pic}(Y)$-graded there is an action of\n$T_{\\text{Pic}(Y)^*}= \\text{Spec}(\\mathbb{C} [\\text{Pic}(Y)])$ on\n$\\text{UT} _Y$. This action is free and the map $\\text{UT} _Y\\to Y$ is\nthe associated quotient map (see ).\n\n**Remark 72**. The notion of a Cox ring associated to a projective\nvariety (satisfying some technical assumptions) was first introduced in\n. This notion was generalized in for any divisorial variety with only\nconstant globally invertible functions, in particular, for any\nquasi-projective variety (over very general ground fields). However, in\nthe term *Cox ring* was not used. The importance of considering\nuniversal torsors and Cox rings in the context of cluster varieties was\npointed out in (see also ) and satisfactorily pursued in representation\ntheoretic contexts where Cox rings arise naturally, see for example .\n\n**Remark 73**. For simplicity we are assuming that $\\text{Pic}(Y)$ is\nfree. In case it has torsion we can still construct a universal torsor\nwhich might not be unique as it depends on the choice of a *shifting\nfamily* as in (see for a related discussion). Generalizations of the\nresults of this section to the torsion case shall be treated elsewhere.\n\n**Remark 74**. If $Y$ is smooth we can construct the Cox ring of $Y$ and\nthe universal torsor (still assuming that $\\text{Pic(Y)}$ is torsion\nfree) in an equivalent way. 
The Cox ring can be defined as\n$\\text{Cox}(Y)=\\bigoplus_{v\\in \\mathbb{Z} ^{\\rho}} H^0 (Y, \\mathcal{L} ^v)$.\nIf $\\text{Cox}(Y)$ is finitely generated over $\\mathcal{O}_Y$-algebra\nthen the universal torsor $\\text{UT} _Y$ is obtained from\n$\\text{Spec}(\\text{Cox}(Y))$ by removing the unstable locus of the\nnatural $T_{\\text{Pic}(Y)^*}$-action on $\\text{Spec}(\\text{Cox}(Y))$.\n\nFrom now on we assume $V\\subset \\text{UT} _Y$ is a partial minimal model\nwhere $(V,\\Phi)$ is a scheme with a cluster structure of type\n$\\mathcal{A}$. In most of the results of this section we assume that\n$V\\subset \\text{UT} _Y$ has enough theta functions. Under certain\nconditions that we discuss next, it is possible to show that $Y$ is a\nminimal model for a scheme with a cluster structure given by a quotient\nof $\\mathcal{A}$ and to construct Newton–Okounkov bodies for elements of\n$\\text{Pic}(Y)$. The key point is to relate the action of\n$T_{\\text{Pic}(Y)^*}$ on $\\text{UT} _Y$ with the torus actions on\n$\\mathcal{A}$ arising from cluster ensemble maps.\n\n**Lemma 75**. Let $p:\\mathcal{A} \\to \\mathcal{X}$ be a cluster ensemble\nmap and $H\\subset K^{\\circ}$ be a saturated sublattice. Consider the\nquotient $\\mathcal{A} /T_H$ and the fibration\n$w_H:\\mathcal{A} ^\\vee \\to T_{H^*}$ (see §). Then the set\n$$\\left\\{ \\vartheta ^{\\mathcal{A} }_{\\bf m} \\in \\mathop{\\mathrm{mid}}(\\mathcal{A} )  \\mid {\\bf m} \\in \\left(\\mathrm{Trop} _{\\mathbb{Z} }(w_H)\\right)^{-1}(q) \\cap \\Theta(\\mathcal{A} )\\right\\}$$\nconsists precisely of the polynomial theta functions on $\\mathcal{A}$\nwhose $T_H$-weight is $q$. Moreover, for every $q \\in H^*$ the set\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(q)\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee})$\nis positive.\n\n*Proof.* The first claim follows from . So we only need to show that\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(q)$ is positive. 
In\norder to show this it is convenient to work with a condition equivalent\nto positivity called broken line convexity, see §. We work in the\nlattice identification\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$ of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee})$. We first argue that\nthe set\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nis positive. First notice that any linear segment $L$ of a broken line\nsegment contained in\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nitself has tangent direction in\n$\\left(\\mathrm{Trop} _{\\mathbb{Z} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nLet\n$m\\in \\left(\\mathrm{Trop} _{\\mathbb{Z} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nbe the tangent direction of $L$. The tangent direction of the following\nlinear segment is of the form $m+cp^*(n)$ for some $n\\in N^+_{\\textbf{s}}$\nand $c\\in  \\mathbb{Z} _{\\geq 0}$. For any $h\\in H^\\circ$ we have\n$$\\langle m+cp^*(n),h\\rangle =\\langle m,h\\rangle + c\\{n,h\\}=0,$$ as\n$H^\\circ\\subset K^\\circ$. So the next tangent direction also belongs to\n$\\left(\\mathrm{Trop} _{\\mathbb{Z} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nWe conclude that the set\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nis broken line convex and by the main result of (see Theorem below) the\nset\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nis positive. This already implies that for any\n$x\\in \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(q)_{\\textbf{s}^\\vee}$\nthe set\n$x+ \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nremains positive. 
Indeed, let\n$y, z \\in x+ \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nThen\n$y- z \\in \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nIn other words, any line segment within the set\n$x+\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nhas tangent direction in\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nTherefore, after bending it will remain in the set\n$x+\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$.\nFinally, observe that\n$x+\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}=\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(q)_{\\textbf{s}^\\vee}$. ◻\n\nKeeping in mind Proposition  and the action of $T_{\\text{Pic}(Y)^*}$\non $\\text{UT} _Y$ we introduce the following notion.\n\n**Definition 76**.\n\nThe pair $(p,H)$ has the **Picard property** with respect to\n$V\\subset \\text{UT} _Y$ if\n\n-   $H$ and $\\text{Pic}(Y)^*$ have the same rank, and\n\n-   the action of $T_{H}$ on $\\mathcal{A}$ coincides with the action of\n    $T_{\\text{Pic}(Y)^*}$ on $\\text{UT} _Y$ restricted to the image of\n    $\\Phi:\\mathcal{A} \\dashrightarrow V$.\n\nRecall the definitions of the superpotential and its associated cone of\ntropical points from and in §. The following result adapts the content\nof Proposition to this framework.\n\n**Lemma 77**. Suppose that $V \\subset \\text{UT} _Y$ is a partial minimal\nmodel with enough theta functions and that $(p,H)$ has the Picard\nproperty with respect to this model. Then for every class\n$[\\mathcal{L} ]\\in \\text{Pic}(Y)\\cong H^*$ the theta\nfunctions parametrized by the integral points of the set\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)\\cap \\Xi_{\\text{UT} _Y}$\nform a basis for $H^0(Y, \\mathcal{L} )$. 
In particular,\n$\\mathop{\\mathrm{Cox}}(Y)$ has a basis of theta functions which are\n$T_{\\text{Pic}(Y)^*}$-eigenfunctions.\n\nWe consider the section ring\n$R(\\mathcal{L} )=\\bigoplus_{k\\geq 0} R_k(\\mathcal{L} )$. The\n$k^{\\mathrm{th}}$ homogeneous component is defined as\n$R_k(\\mathcal{L} )=H^0(Y, \\mathcal{L} ^{\\otimes k})$. The product of\n$R(\\mathcal{L} )$ is given by the tensor product of sections. Fix a seed\n$\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$, a linear dominance\norder $<_{\\textbf{s}}$ on\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$\nand consider the valuation\n$\\mathbf{g} ^{\\Phi}_{\\textbf{s}}:\\Bbbk(V)\\setminus \\{ 0 \\} \\to (\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}), <_{\\textbf{s}}).$\nObserve that $R_k(\\mathcal{L} )\\subset \\mathop{\\mathrm{Cox}}(Y)$ for all\n$k$. Hence we can define the Newton–Okounkov body $$\\begin{split} \n\\Delta_{\\mathbf{g} ^{\\Phi}_{\\textbf{s}}}(\\mathcal{L} ) := \\overline{\\mathop{\\mathrm{conv}}\\Bigg( \\bigcup_{k\\geq 1}\\left\\{ \\frac{1}{k}\\mathbf{g} ^{\\Phi}_{\\textbf{s}} (f) \\mid f\\in R_k(\\mathcal L)\\setminus \\{0\\} \\right\\} \\Bigg) }\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^\\vee_{\\textbf{s}^\\vee})=M^{\\circ}_{\\mathbb{R} }.\n \\end{split}$$\n\n**Theorem 78**.\n\nSuppose that $V \\subset \\text{UT} _Y$ is a partial minimal model with\nenough theta functions and that $(p,H)$ has the Picard property with\nrespect to this model. 
Then for any line bundle $\\mathcal{L}$ on $Y$\n$$\\Delta_{{\\bf g}^{\\Phi}_{\\textbf{s}}}(\\mathcal{L} )=\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}\\cap \\Xi_{\\text{UT} _Y, \\textbf{s}^\\vee}.$$\nIn particular, $\\Delta_{{\\bf g}^{\\Phi}_{\\textbf{s}}}(\\mathcal{L} )$ is a\npositive subset of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$.\n\n*Proof.* To make notation lighter, throughout this proof we let\n$S=\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}\\cap \\Xi_{\\text{UT} _Y,\\textbf{s}^\\vee}$\nand denote $\\mathbf{g} ^{\\Phi}_{\\textbf{s}}$ simply by\n$\\mathbf{g} _{\\textbf{s}}$. Observe that\n$[\\mathcal{L} ^{\\otimes k}]=k[\\mathcal{L} ]$ in $\\text{Pic}(Y)$.\nTherefore, by Lemma we have that\n${\\bf g}_{\\textbf{s}}(R_k(\\mathcal{L} ))\\subseteq \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left(k[\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}$\nfor all $k\\geq 1$. In particular,\n$\\dfrac{1}{k}{\\bf g}_{\\textbf{s}}(R_k(\\mathcal{L} ))\\subseteq \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}$\nfor all $k \\geq 1$. Since\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}$\nis closed in\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$\nand convex we have that\n$\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )\\subseteq \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}$.\nLet $\\mathbb B_k$ be the theta basis of $R_k(\\mathcal{L} )$, see . 
Since\nthe theta basis is adapted for ${\\bf g}_{\\textbf{s}}$ we have that\n${\\bf g}_{\\textbf{s}}(R_k(\\mathcal{L} ))={\\bf g}_{\\textbf{s}}(\\mathbb B_k)$.\nSince $\\mathcal{A} \\subseteq \\text{UT} _Y$ has enough theta functions,\nevery theta function $\\vartheta \\in \\mathbb B_k$ is a global function on\n$\\text{UT} _Y$; therefore, we have that\n${\\bf g}_{\\textbf{s}}(\\vartheta ) \\in \\Xi_{\\text{UT} _Y}$. Since\n$\\Xi_{\\text{UT} _Y}$ is closed in\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee})$,\nconvex and closed under positive scaling, we have\n$\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )\\subseteq \\Xi_{\\text{UT} _Y,\\textbf{s}^\\vee}$.\nHence, $\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )\\subseteq S$. To see\nthe reverse inclusion we notice that the set of rational points of $S$\ncoincides with the set\n$\\bigcup_{k\\geq 1} \\frac{1}{k}  \\mathbf{g} _{\\textbf{s}}(\\mathbb B_k)=  \\bigcup_{k\\geq 1} \\frac{1}{k}  \\mathbf{g} _{\\textbf{s}}\\left(R_k(\\mathcal{L} )\\right)$.\nSince $S$ can be expressed as the closure of its set of rational points\nwe have that\n$S\\subseteq \\Delta_{\\mathbf{g} _{\\textbf{s}}}(\\mathcal{L} )$. Finally,\nsince\n$\\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}\\left([\\mathcal{L} ]\\right)_{\\textbf{s}^\\vee}$\nand $\\Xi_{\\text{UT} _Y,\\textbf{s}^\\vee}$ are positive sets,\n$S=\\Delta_{{\\bf g}_{\\textbf{s}}}(\\mathcal{L} )$ is an intersection of\npositive sets. Hence, it is positive. ◻\n\n**Remark 79**. Under the assumptions of we have that $Y$ is a minimal\nmodel with enough theta functions for an open subscheme $V'\\subset Y$\nwith a cluster structure given by a birational map\n$\\Phi':\\mathcal{A} /T_{H}\\dashrightarrow V'$ induced by $\\Phi$. To\nrelate the Newton–Okounkov bodies constructed in this section with those\nconstructed in the former we let $\\mathcal{L}$ be isomorphic to\n$\\mathcal{O}(D')$ for some Weil divisor $D'$ on $Y$ satisfying the\nframework of §. 
Under the identification\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}_{\\textbf{s}^\\vee}) = M^\\circ_\\mathbb{R}$\nwe realize\n$\\mathrm{Trop} _{\\mathbb{R} }((\\mathcal{A} /T_H)^{\\vee}_{\\textbf{s}^\\vee})$\nas the subset of $M^\\circ_\\mathbb{R}$ orthogonal to $H$ (see §). For any\n$\\tau \\in R_1(D')$ we have\n$\\Delta_{\\mathbf{g} ^{\\Phi'}_\\textbf{s}}(D',\\tau)\\subset M_{\\mathbb{R} }^\\circ\\cap \\left(\\mathrm{Trop} _{\\mathbb{R} }(w_H)\\right)^{-1}(0)_{\\textbf{s}^\\vee}$\nand by construction\n$$\\Delta_{\\mathbf{g} ^{\\Phi'}_\\textbf{s}}(D',\\tau) =\\Delta_{\\mathbf{g} ^{\\Phi}_\\textbf{s}}(\\mathcal{L} )- \\mathbf{g} ^{\\Phi}_{\\textbf{s}}(\\tau).$$\n\n**Example 80**.\n\nAn important class of examples is provided by the base affine spaces.\nConsider $G=SL_{n+1}(\\Bbbk)$ and $B\\subset G$ a Borel subgroup with\nunipotent radical $U\\subset B$. Then $G/U$ is a universal torsor for\n$G/B$. Moreover, $G/U$ carries a cluster structure induced by the double\nBruhat cell $G^{e,w_0}:=B_-\\cap Bw_0B$, where $B_-\\subset G$ is the\nBorel subgroup opposite to $B$ (i.e. $B\\cap B_-=:T$ is a maximal torus)\nand $w_0$, the longest element of the Weyl group $S_{n+1}$, is identified\nwith a matrix representative in $N_G(T)/C_G(T)$ (the normalizer of $T$\nmodulo the centralizer of $T$). The cluster structure on $G^{e,w_0}$ was\nintroduced by Berenstein–Fomin–Zelevinsky in and it follows that (up to\ncodimension 2) $G^{e,w_0}$ agrees with the corresponding\n$\\mathcal A$-cluster variety. By there is an embedding\n$G^{e,w_0}\\hookrightarrow G/U$ compatible with the cluster structure. In\nparticular, $G/U$ is a partial compactification of the\n$\\mathcal A$-cluster variety $G^{e,w_0}$ obtained by adding the locus\nwhere frozen variables are allowed to vanish. Magee further proved in\nthat the full Fock–Goncharov conjecture holds and a cluster ensemble map\nsatisfying is provided in . 
Hence, we obtain a ${\\bf g}$-vector\nvaluation ${\\bf g}_{\\textbf{s}}$ on $H^0(G/U,\\mathcal O_{G/U})$ for\nevery choice of seed $\\textbf{s}$.\n\nIn particular, applies: recall that the Picard group of $G/B$ is\nisomorphic to the lattice spanned by the fundamental weights\n$\\omega_1,\\dots,\\omega_{n}$. Let $\\Lambda$ denote the set of dominant weights,\n*i.e.* its elements are $\\lambda=a_1\\omega_1+\\dots+a_n\\omega_n$ with\n$a_i\\in \\mathbb Z_{\\ge 0}$ and let $\\mathcal L_\\lambda\\to G/B$ be the\nassociated line bundle. The ring of regular functions on the\nquasi-affine variety $G/U$ coincides with the Cox ring of the flag\nvariety:\n$$H^0(G/U,\\mathcal O_{G/U})\\cong \\bigoplus_{\\lambda \\in \\Lambda} H^0 (G/B,\\mathcal L_\\lambda).$$\nHence, we may restrict the ${\\bf g}$-vector valuations\n${\\bf g}_{\\textbf{s}}$ for all seeds $\\textbf{s}$ to the section ring of\nany line bundle on $G/B$. The resulting Newton–Okounkov polytopes\ncoincide with slices of the tropicalization of the superpotential\ncorresponding to the compactification. It has been shown in that for\ncertain choices of seeds these polytopes are unimodularly equivalent to\nLittelmann’s string polytopes (see ).\n\n**Example 81**. Grassmannians also form a distinguished class of\nexamples fitting this framework. We treat this class separately in §.\n\n## The intrinsic Newton–Okounkov body\n\nIn the situation of § or §, we can choose two seeds\n$\\textbf{s}, \\textbf{s}'\\in \\overrightarrow{\\mathbb{T}_r}$ to obtain two\nNewton–Okounkov bodies, say $\\Delta_{\\nu_{\\textbf{s}}}$ and\n$\\Delta_{\\nu_{\\textbf{s}'}}$ (these are associated to a line bundle\n$\\mathcal{L}$ in case we are in a framework as in § or to a divisor $D'$\nand a section $\\tau$ in case our framework is as in §). 
In the same\nspirit as in (see also and ), in this section we show that if one of\n$\\Delta_{\\nu_{\\textbf{s}}}$ or $\\Delta_{\\nu_{\\textbf{s}'}}$\n(equivalently both) is a positive set then these Newton–Okounkov bodies\nare related to each other by a distinguished piecewise linear\ntransformation and, moreover, any such Newton–Okounkov body can be\nintrinsically described as a *broken line convex hull* (see Theorems and\nbelow). In order to obtain the last assertion we rely on . Along the way\nwe introduce a theta function analog of the Newton polytope associated\nto a regular function on a torus.\n\nWe start by considering Newton–Okounkov bodies associated to Weil\ndivisors as in §. Let $\\mathcal{V}$ be a scheme of the form\n$\\mathcal{A}$, $\\mathcal{X}$, $\\mathcal{A} /T_{H}$ or\n$\\mathcal{X} _{\\bf 1}$ and $(V, \\Phi)$ a scheme with a cluster structure\nof type $\\mathcal{V}$. Denote by\n$\\mathbb{B}_{\\vartheta }(\\mathcal{V} )=\\{\\vartheta ^{\\mathcal{V} }_{\\bf v}\\mid {\\bf v}\\in \\Theta(\\mathcal{V} )\\}$\nthe theta basis of $\\mathop{\\mathrm{mid}}(\\mathcal{V} )$. We begin by\nobserving that a cluster valuation $\\nu_{\\textbf{s}}$ on\n$\\mathop{\\mathrm{mid}}(\\mathcal{V} )$ can be thought of as an extension\nof the composition of the seed-independent map $$\\begin{aligned}\n    \\label{eq:nu_seed_free}\n    \\nu: \\mathbb{B}_{\\vartheta }(\\mathcal{V} )  &\\to&   \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)\\\\\n    \\nonumber\n\\vartheta ^{\\mathcal{V} }_{\\bf v} &\\mapsto &{\\bf v},\n\\end{aligned}$$ with the identification\n$\\mathfrak{r}_{\\textbf{s}^\\vee}:\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee) \\to  \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee_{\\textbf{s}^\\vee})$.\nIf $\\mathbb B_{\\vartheta }(V)$ denotes the set of polynomial theta\nfunctions on $V$ then we can define\n$\\nu^{\\Phi} : \\mathbb B_{\\vartheta }(V) \\to \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$\nanalogously. 
Moreover, even though\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ may not have a\nlinear structure, if\n$\\Theta (\\mathcal{V} )= \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$\nand $L\\subseteq \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ is a\nlinear subset acting linearly on\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ (see Definition )\nthen for every $y\\in L$ we have a well defined “subtraction\" function\n$$\\begin{split}  (\\ \\cdot \\ )-y: \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})  &\\to   \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})\\\\\nx &\\mapsto x-y,  \\end{split}$$ where $-y$ is the unique point of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ such that\n$\\vartheta _y\\vartheta _{-y}=1$ and $x-y$ is the unique point of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})$ such that\n$\\vartheta _{x}\\vartheta _{-y}=\\vartheta _{x-y}$.\n\nWe now define our notion of convexity. Recall from § that we might think\nof supports of broken lines as seed independent objects. In light of\nthis we consider the following.\n\n**Definition 82**. A closed subset $S$ of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} )$ is **broken line convex**\nif for every pair of rational points $s_1, s_2$ in $S(\\mathbb{Q} )$,\nevery segment of a broken line with endpoints $s_1$ and $s_2$ is\nentirely contained in $S$.\n\n**Remark 83**. The broken lines considered in Definition include those\nthat are *non-generic*. Namely, broken lines that are obtained as limits\nof the generic broken lines introduced in . See for details.\n\nThe main result of asserts that positivity of a set is equivalent to its\nbroken line convexity:\n\n**Theorem 84**.\n\nLet $\\mathcal{V}$ be a variety of the form $\\mathcal{A}$, $\\mathcal{X}$,\n$\\mathcal{A} /T_{H}$ or $\\mathcal{X} _{\\bf 1}$. 
Then a closed subset $S$\nof $\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} )$ is broken line convex\nif and only if it is positive.\n\nMorally, this means that broken line convexity in\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^\\vee)$ plays the same role in\ndescribing partial minimal models of $\\mathcal{V}$ that usual convexity\nin $M_\\mathbb{R}$ plays in describing normal toric varieties\n$T_N \\subset X$. One appealing feature of the broken line convexity\nnotion is that it makes no reference to any auxiliary data: given\n$\\mathcal{V}$, we can talk about broken line convexity in\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$. In contrast, the\nNewton–Okounkov bodies we discussed in § and § are convex bodies whose\nconstruction depends upon a choice of seed $\\textbf{s}$. More generally,\na usual Newton–Okounkov body depends not only on the geometric data of a\nprojective variety together with a divisor but also on the auxiliary\ndata of a choice of valuation. Broken line convexity makes no reference\nto any such auxiliary data and will lead us to an intrinsic version of a\nNewton–Okounkov body.\n\n**Definition 85**. Let\n$S \\subset\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ be a set.\nThe **broken line convex hull of $S$**, denoted by\n$\\mathop{\\mathrm{conv_{BL}}}(S)$, is the intersection of all broken line\nconvex sets containing $S$.\n\n**Remark 86**. We can also define broken line convexity and broken line\nconvex hulls inside\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$ in\nexactly the same way they are defined in Definitions and . 
In\nparticular, we have that\n$S\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$ is broken\nline convex if and only if\n$\\mathfrak{r}_{\\textbf{s}^\\vee}(S)\\subset \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^\\vee_{\\textbf{s}})$\nis broken line convex.\n\nUsing this convexity notion, we describe a set analogous to the Newton\npolytope of a function on a torus.\n\n**Definition 87**. Given a regular function\n${f= \\sum_{{\\bf v} \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee})}} a_{\\bf v} \\vartheta ^{V}_{\\bf v}$\non $V$, we define the **$\\vartheta$-function analogue of the Newton\npolytope of $f$** to be\n$$\\begin{split}  \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) := \\mathop{\\mathrm{conv_{BL}}}\\left\\{ {\\bf v} \\in \\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^{\\vee}) \\mid a_{\\bf v} \\neq 0 \\right\\}.  \\end{split}$$\n\nThis leads to an intrinsic version of the Newton–Okounkov bodies we have\nconstructed. So consider a partial minimal model $V \\subset Y$ and let\n$D'$ be a divisor on $Y$ supported on the boundary of $V\\subset Y$.\n\n**Definition 88**. Assume that $R(D')$ has a graded theta basis (see\nDefinition ). 
Then the associated **intrinsic Newton–Okounkov body** is\n$$\\begin{split} \n\\Delta_{\\mathrm{BL}}(D'):= \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k\\geq 1} \\Bigg(\\bigcup_{f \\in R_k(D')} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg)\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^\\vee).\n \\end{split}$$\n\nIn order to describe how the different realizations of intrinsic\nNewton–Okounkov bodies are related we record the tropicalization of the\ngluing map\n$\\mu^{\\mathcal{V} ^\\vee}_k:\\mathcal{V} ^{\\vee}_{\\textbf{s}}\\dashrightarrow \\mathcal{V} ^{\\vee}_{\\textbf{s}'}$\nin terms of the fixed data $\\Gamma$ and initial seed\n$\\textbf{s}_0=(e_i)_{i\\in I}$ defining $\\mathcal{V}$:\n$$\\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{\\mathcal{A} ^\\vee}_{k}\\right)(m)=\\begin{cases} m + \\langle d_ke_k, m \\rangle  v_k & \\text{if } \\langle e_k, m \\rangle \\geq 0,\\\\\nm & \\text{if } \\langle e_k, m \\rangle \\leq 0,\n\\end{cases}$$ for $m \\in M^{\\circ}$.\n$$\\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{\\mathcal{X} ^\\vee}_{k}\\right)(n)=\\begin{cases} n + \\{n,d_ke_k \\} e_k & \\text{if } \\{  n,e_k \\}\\geq 0,\\\\\nn & \\text{if } \\{ n,e_k\\} \\leq 0,\n\\end{cases}$$ for $n \\in N$.\n$$\\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{(\\mathcal{X} _{\\bf 1})^\\vee}_{k}\\right)(n+H)=\\begin{cases} n + \\{n,d_ke_k \\}e_k + H & \\text{if } \\{ n, e_k \\}\\geq 0,\\\\\nn + H& \\text{if } \\{ n, e_k \\} \\leq 0,\n\\end{cases}$$ for $n + H \\in N/H$.\n$$\\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{(\\mathcal{A} /T_H)^\\vee}_{k}\\right) =  \\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{\\mathcal{A} ^\\vee}_{k}\\right) \\mid_{H^\\perp}.$$\n\n**Theorem 89**.\n\nLet $(V,\\Phi)$ be a scheme with a cluster structure of type\n$\\mathcal{V}$ and let $V \\subset Y$ be a partial minimal model. 
Assume\nthat the full Fock–Goncharov conjecture holds for $\\mathcal{V}$ and that\nthere exists a theta function $\\tau \\in R_1(D')$ such that\n$\\nu^{\\Phi}_{\\textbf{s}}(\\tau)$ lies in a linear subset of\n$\\mathrm{Trop} _{\\mathbb{Z} }(\\mathcal{V} ^\\vee)$. If\n$\\Delta_{\\nu^{\\Phi}_{\\textbf{s}}}(D',\\tau)$ is positive then for every\nseed $\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ we have that\n$\\mathfrak{r}_{\\textbf{s}^\\vee}(\\Delta_{\\mathrm{BL}}(D')-\\nu^{\\Phi}(\\tau))= \\Delta_{\\nu^{\\Phi}_{\\textbf{s}}}(D',\\tau)$.\nIn particular, for any other seed $\\textbf{s}'\\in \\overrightarrow{\\mathbb{T}_r}$ we have that\n$$\\Delta_{\\nu_{\\textbf{s}'}}(D', \\tau )= \\mathrm{Trop} _{\\mathbb{R} }\\left(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s},\\textbf{s}'}\\right)\\left(\\Delta_{\\nu_{\\textbf{s}}}(D', \\tau)\\right).$$\n\n*Proof.* It is enough to treat the case $V= \\mathcal{V}$. We consider\nthe broken line convex hull of\n$$S=\\bigcup_{k\\geq 1}\\left\\{\\dfrac{\\nu_{\\textbf{s}}(f)}{k}-\\nu_{\\textbf{s}}(\\tau)\\mid f\\in R_k(D') \\setminus\\{0\\}\\right\\}$$\nin\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$.\nSince all line segments of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee}_{\\textbf{s}^\\vee})$\ncan be thought of as a segment of a broken line and\n$\\Delta_{\\nu_{\\textbf{s}}}(D', \\tau)$ is closed we have that\n$\\Delta_{\\nu_{\\textbf{s}}}(D', \\tau)\\subseteq \\mathop{\\mathrm{conv_{BL}}}(S)$.\nBy $\\Delta_{\\nu_{\\textbf{s}}}(D', \\tau)$ is broken line convex. Since\n$S\\subset \\Delta_{\\nu_{\\textbf{s}}}(D', \\tau)$ we have the reverse\ninclusion. 
The last statement follows from the fact that broken line\nconvex sets are preserved by\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mu^{\\mathcal{V} ^{\\vee}}_k)$. ◻\n\nThere is an analogous result for line bundles fitting the framework of\n§.\n\n**Definition 90**. Let $Y$ be a projective variety such that\n$\\text{Pic}(Y)$ is free of finite rank. Assume $(V, \\Phi)$ is a scheme\nwith a cluster structure of type $\\mathcal{A}$ and that\n$V \\subset \\text{UT} _Y$ is a partial minimal model with enough theta\nfunctions. Let $(p, H)$ have the Picard property (see ). The **intrinsic\nNewton–Okounkov body associated to a class\n$[ \\mathcal{L} ]\\in \\text{Pic}(Y)\\cong H^*$** is $$\\begin{split} \n\\Delta_{\\mathrm{BL}}(\\mathcal{L} ):= \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k\\geq 1} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg)\\subseteq \\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{A} ^{\\vee}).\n \\end{split}$$\n\nIn this case we have the following theorem whose proof is completely\nanalogous to the proof of . Moreover, it uses the fact that\n$\\Delta_{\\nu_{\\textbf{s}}}(\\mathcal{L} )$ is a positive set, as shown in .\n\n**Theorem 91**.\n\nKeep the assumptions of Definition . 
For every seed $\\textbf{s}\\in \\overrightarrow{\\mathbb{T}_r}$ we have that\n$\\Delta_{\\nu^{\\Phi}_{\\textbf{s}}}(\\mathcal{L} )=\\mathfrak{r}_{\\textbf{s}^\\vee}(\\Delta_{\\mathrm{BL}}(\\mathcal{L} ))$.\nIn particular, for every $\\textbf{s}' \\in \\overrightarrow{\\mathbb{T}_r}$ we have that\n$$\\Delta_{\\nu^\\Phi_{\\textbf{s}'}}(\\mathcal{L} )= \\left(\\mu^{\\mathcal{V} ^\\vee}_{\\textbf{s}^\\vee, \\textbf{s}'^\\vee}\\right)^T(\\Delta_{\\nu^\\Phi_{\\textbf{s}}}(\\mathcal{L} )).$$\n\n*Proof.* We showed in that $\\Delta_{\\nu_{\\textbf{s}}}(\\mathcal{L} )$ is\na positive set. The proof of this result is completely analogous to the\nproof of . ◻\n\nIn either situation (divisors or line bundles) we are of course free to\ncompute the intrinsic Newton–Okounkov body as a usual Newton–Okounkov\nbody in any vector space realization of\n$\\mathrm{Trop} _{\\mathbb{R} }(\\mathcal{V} ^{\\vee})$. However, the\nintrinsic definition has certain advantages as we now explain. For\nsimplicity, from now on we concentrate on line bundles as in ; the\nreader can make the appropriate changes for the case of divisors as in .\nIt is often the case that\n$\\Delta_{\\mathrm{BL}}(\\mathcal{L} ) = \\mathop{\\mathrm{conv_{BL}}}\\Big( \\bigcup_{k=1}^\\ell \\frac{1}{k} \\nu^{\\Phi}\\left(R_k(\\mathcal{L} )\\right) \\Big)$\nfor some finite $\\ell$, meaning that in these cases the infinite union\nreduces to a finite union. 
Consider such an instance and let\n$\\ell_{\\textbf{s}}$ be the smallest integer such that\n$\\Delta_{\\nu^{\\Phi}_{\\textbf{s}}}(\\mathcal{L} )=\\mathop{\\mathrm{conv}}\\Big( \\bigcup_{k=1}^{\\ell_{\\textbf{s}}} \\frac{1}{k} \\nu^{\\Phi}_{\\textbf{s}}\\left(R_k(\\mathcal{L} )\\right) \\Big)$.\nThen the corresponding $\\ell$ for the intrinsic Newton–Okounkov body is\nat most $\\min_{\\textbf{s}}\\left\\{\\ell_{\\textbf{s}}\\right\\}$. Moreover,\nwe can give conditions indicating when $\\ell$ has been attained. We will\nstart with a condition that, after adopting a slightly different\nperspective on theta functions, becomes tautological.[^7] We will then\nadapt this condition to give a sufficient criterion that is more likely\nto be known for a given minimal model (and a known line bundle or Weil\ndivisor).\n\n**Proposition 92**.\n\nLet $\\mathcal{L}$ be as in . Suppose there exists a positive integer\n$\\ell$ such that for all $h>\\ell$, each theta function $\\vartheta ^V_r$\nin $R_h(\\mathcal{L} )$ appears as a summand (with non-zero coefficient)\nof some product $\\vartheta ^V_p \\vartheta ^V_q$, where\n$\\vartheta ^V_p \\in R_i(\\mathcal{L} )$ and\n$\\vartheta ^V_q \\in R_j(\\mathcal{L} )$ for some positive integers $i$\nand $j$ with $i+j =h$. Then $$\\begin{split} \n\\Delta_{\\mathrm{BL}}(\\mathcal{L} ) = \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k=1}^{\\ell} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg) .\n \\end{split}$$\n\n*Proof.* This is an immediate consequence of results in . We adopt the\nterminology and conventions of *loc. cit.* for this proof. In\nparticular, we allow non-generic broken lines (see Remark ).\n\nSince the structure constant $\\alpha(p,q,r)$ is non-zero, there exists a\npair of broken lines $\\left(\\gamma_1,\\gamma_2\\right)$ with\n$I(\\gamma_1) = p$, $I(\\gamma_2) = q$, $\\gamma_1(0)=\\gamma_2(0) = r$, and\n$F(\\gamma_1)+ F(\\gamma_2) = r$. 
Then the construction of yields a broken\nline segment from $\\frac{p}{i}$ to $\\frac{q}{j}$ passing through\n$\\frac{r}{h}$. As a consequence, we have\n$$\\begin{split} \\frac{r}{h} \\in \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k=1}^{\\max(i,j)} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg) .  \\end{split}$$\nBy hypothesis, $R_k(\\mathcal{L} )$ has a basis of theta functions for\nall $k$, so $$\\begin{split} \n\\mathop{\\mathrm{conv_{BL}}}\\Bigg(\\bigcup_{f \\in R_h(\\mathcal{L} )} \\frac{1}{h} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg) =  \\mathop{\\mathrm{conv_{BL}}}\\left\\{ \\frac{r}{h} \\mid \\vartheta ^V_r \\in R_h(\\mathcal{L} )\\right\\} .\n \\end{split}$$ Since $i$ and $j$ are positive with $i+j=h$, we have $\\max(i,j)\\leq h-1$, so we have just seen that each such $\\frac{r}{h}$ is\ncontained in\n$$\\begin{split} \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k=1}^{h-1} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg),  \\end{split}$$\nso $$\\begin{split} \n\\mathop{\\mathrm{conv_{BL}}}\\Bigg(\\bigcup_{f \\in R_h(\\mathcal{L} )} \\frac{1}{h} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg) \\subset  \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k=1}^{h-1} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg).\n \\end{split}$$ As this holds for all $h>\\ell$, we conclude that\n$$\\begin{split} \n\\Delta_{\\mathrm{BL}}(\\mathcal{L} ) = \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\bigcup_{k=1}^{\\ell} \\Bigg(\\bigcup_{f \\in R_k(\\mathcal{L} )} \\frac{1}{k} \\mathop{\\mathrm{Newt_{\\vartheta }}}(f) \\Bigg)  \\Bigg) .\n \\end{split}$$ ◻\n\n**Remark 93**. In dimension 2, Mandel showed that the assumption in\nimplies that $r=p+q$ in some seed. 
It is a very interesting problem to\ndetermine if this holds for higher dimensions.\n\nNote that as we have (by assumption) a theta basis for\n$R(\\mathcal{L} )$, the condition of is implied by the following\ncondition:\n\n**Condition 1**.\n\nThere exists a positive integer $\\ell$ such that for all $h>\\ell$, the\nnatural map\n$R_i (\\mathcal{L} ) \\otimes R_j(\\mathcal{L} ) \\to R_h (\\mathcal{L} )$ is\nsurjective for some positive integers $i$ and $j$ with $i+j =h$.\n\n**Remark 94**. Condition 1 is satisfied in our main class of examples coming\nfrom representation theory: recall the setting of where line bundles\n$\\mathcal L_\\lambda$ of the full flag variety $G/B$ are indexed by\ndominant weights $\\lambda$. By the Borel–Weil–Bott Theorem the graded\npieces $R_i(\\mathcal L_\\lambda)$ of the section rings of these line\nbundles satisfy $$R_i(\\mathcal L_\\lambda)\\cong  V(i\\lambda)^*,$$ where\n$V(i\\lambda)$ is the irreducible $G$-representation of highest weight\n$i\\lambda$ and $i\\ge 0$. By work of Baur the tensor product\n$V(i\\lambda)\\otimes V(j\\lambda)$ contains among its irreducible\ncomponents the unique component of maximal weight, called the Cartan\ncomponent, which is $V((i+j)\\lambda)$. Hence,\n$$R_i(\\mathcal L_\\lambda)\\otimes R_j(\\mathcal{L}_\\lambda)\\cong V(i\\lambda)^*\\otimes V(j\\lambda)^*\\twoheadrightarrow V((i+j)\\lambda)^*\\cong R_{i+j}(\\mathcal L_\\lambda).$$\nAlthough in we only treat the case of $SL_{n+1}(\\Bbbk)$ it is worth\nnoticing that the Borel–Weil(–Bott) Theorem holds for semisimple Lie\ngroups and algebraic groups over $\\Bbbk$ and Baur’s result holds for\nirreducible representations of connected, simply-connected complex\nreductive groups. 
Notice further that these observations also hold for\npartial flag varieties, *i.e.* quotients $G/P$ by parabolic subgroups\n$P\\subset G$, as the cohomology of an equivariant line bundle on $G/P$\nis equal to the cohomology of its pullback along the natural projection\n$G/B\\twoheadrightarrow G/P$. So the cohomology of the line bundle on\n$G/P$ can be calculated using the usual Borel–Weil(–Bott) Theorem for\n$G/B$, by the Leray spectral sequence.\n\n# The case of the Grassmannian\n\nWe now consider in detail the case of Grassmannians. Throughout this\nsection we work over the complex numbers, fix two positive integers\n$k<n$ and let $$Y= \\mathop{\\mathrm{Gr}}_{n-k}(\\mathbb{C} ^n)$$ be the\ncorresponding Grassmannian. Let $\\widetilde{Y}$ be the affine cone of\n$Y$ in its Plücker embedding\n$Y\\hookrightarrow \\mathbb{P}^{\\binom{n}{n-k}-1}$ and $\\mathcal{L} _e$ be\nthe bundle over $Y$ obtained by pullback of $\\mathcal{O}(1)$ under this\nembedding. By definition, the Plücker coordinates are a basis for\n$H^0(Y,\\mathcal{L} _e)$. It is well known that\n$\\mathop{\\mathrm{Pic}}({Y})$ is free of rank one and $[\\mathcal{L} _e]$\nis a generator. Moreover, the universal torsor of $Y$ is\n$$\\text{UT} _{Y}\\cong \\widetilde{Y}\\setminus\\{ 0\\}$$ and the action of\n$T_{\\mathop{\\mathrm{Pic}}({Y})^*}$ on $\\text{UT} _{Y}$ coincides with\nthe diagonal action of $\\mathbb{C} ^*$. Plücker coordinates are denoted\nby $p_{J}$ where $J\\in \\binom{[n]}{n-k}$ is an $(n-k)$-element subset of\n$\\{1, \\dots , n\\}$. Working with cyclic intervals, we let\n$D_i=\\{ p_{[i+1, i+k]} =0 \\}$ and consider the divisor\n$$D=\\bigcup_{i=1}^nD_{i} \\subset Y.$$ For any $i$ the line bundle\n$\\mathcal{O}_Y(D_i)$ is isomorphic to $\\mathcal{L} _e$ and the Weil\ndivisor $\\sum_{i=1}^nD_i$ is anticanonical. 
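\n\nTo spell out why $\\sum_{i=1}^nD_i$ is anticanonical, here is a short check. It relies on the standard fact, not restated in this section, that the canonical bundle of a Grassmannian of subspaces of $\\mathbb{C} ^n$ is the pullback of $\\mathcal{O}(-n)$ under the Plücker embedding; the symbol $\\omega_Y$ for the canonical bundle is notation introduced only for this check. Since each $\\mathcal{O}_Y(D_i)$ is isomorphic to $\\mathcal{L} _e$, we get $$\\mathcal{O}_Y\\Big(\\sum_{i=1}^n D_i\\Big)\\cong \\mathcal{L} _e^{\\otimes n}\\cong \\omega_Y^{-1}.$$\n\n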
We let\n$\\widetilde{D}_i \\subset \\text{UT} _Y$ be the preimage of $D_i$ under\nthe quotient map $\\text{UT} _Y\\twoheadrightarrow Y$ and set\n$\\widetilde{D}= \\bigcup_{i=1}^n \\widetilde{D}_i$. The divisor\n$\\sum_{i=1}^n\\widetilde{D}_i$ is anticanonical and\n$(\\text{UT} _Y, \\widetilde{D})$ is a log Calabi–Yau pair. It follows\nfrom the work of Scott that the log Calabi–Yau variety\n$\\text{UT} _Y \\setminus \\widetilde{D}$ has a cluster structure of type\n$\\mathcal{A}$ which is skew-symmetric (that is, for its fixed data all\n$d_i=1$ and $N=N^\\circ$) and such that the frozen variables are\nprecisely the Plücker variables $\\{ p_{[i+1,i+k]}\\}_{i=1}^{n}$. This\ncluster structure is given by an inclusion\n$$\\mathcal{A} \\hookrightarrow \\text{UT} _Y  \\setminus \\widetilde{D}.$$\nSince $\\widetilde{D}$ is the locus in $\\text{UT} _{Y}$ where the frozen\nvariables vanish, we have that $\\mathcal{A} \\subset \\text{UT} _Y$ is a\npartial minimal model (see the example below Definition ). In (see also\n), the authors show that $\\mathcal{A}$ has a seed with a maximal green\nsequence so we can use to conclude that the full Fock–Goncharov\nconjecture holds for $\\mathcal{A}$. Proposition 9.4 in together with\nimply that $\\mathcal{A} \\subset \\text{UT} _Y$ is a partial minimal model\nwith enough theta functions in the sense of Definition . In the\nfollowing subsection we exhibit a cluster ensemble lattice map $p^*$ for\n$\\mathcal{A}$ such that for $K := \\ker(p^*)$, the pair $(p,K)$ has the\nPicard property in the sense of with respect to\n$\\mathcal{A} \\subset \\text{UT} _Y$. These considerations allow us to\napply to all the results of § and §. In particular, we can think of the\nGrassmannian as a minimal model for the quotient $\\mathcal{A} /T_K$.\n\n**Remark 95**. The variety $Y \\setminus D$ is usually called the open\npositroid variety. 
This variety can be endowed with a cluster structure\nof any of the kinds we consider in this paper: $\\mathcal{A}$,\n$\\mathcal{X}$, a quotient of $\\mathcal{A}$ or a fibre of $\\mathcal{X}$.\n\n## The Picard property\n\nIn this section we verify that the Picard property () holds for a\ncertain choice of cluster ensemble map and sublattice. This condition is\nnecessary in order to apply to the Grassmannian.\n\nWe rely on background from but recall important notions below. For\nbackground on plabic graphs we refer the reader to *loc. cit.* Recall\nthat plabic graphs[^8] are combinatorial objects encoding those seeds\nwhose associated $\\mathcal{A}$-cluster variables are Plücker\ncoordinates. To simplify the exposition we do not distinguish between a\nplabic graph and its associated seed. Given an index set\n$J\\in\\binom{[n]}{n-k}$ we construct a Young diagram $\\mu_J$ inside an\n$(n-k)\\times k$ rectangular grid. Let $w_J$ be the path along\nedges of the grid from the north east to the south west corner whose south steps\nare in $J$. Then $\\mu_J$ is the Young diagram (inside the rectangle,\nattached to the north west corner) whose south east border is $w_J$.\nAmong all plabic graphs there is a particularly symmetric one known as\nthe **rectangles plabic graph**\n$G_{\\Yboxdim 4pt \\yng(1)}:=G^{\\rm rec}_{k,n}$. The associated cluster\nvariables are naturally indexed by *rectangular* Young diagrams\n(together with the *empty* rectangle, denoted by $\\varnothing$). In what\nfollows we focus on this plabic graph as the initial seed and denote by\n$$\\label{eq_seed}\n\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}=(e_\\varnothing)\\cup(e_{i\\times j}\\mid  1\\le i\\le n-k, 1\\le j\\le k),$$\nthe induced basis of $N=N^\\circ\\cong \\mathbb Z^{k(n-k)+1}$. Let\n$\\{f_\\varnothing\\}\\cup\\{f_{i\\times j}\\mid  1\\le i\\le n-k, 1\\le j\\le k\\}$\ndenote the corresponding basis of $M^\\circ=M$. 
We write $N_{\\textbf{s}}$,\nrespectively $M_{\\textbf{s}}$, whenever we think of the lattices together\nwith a choice of basis induced by a seed $\\textbf{s}$.\n\nWe start by defining a lattice map\n$$\\psi:N_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}} \\to  M_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}}$$\nwhich is given with respect to the bases induced by\n$\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}$ as follows: for $i\\times j$ a mutable vertex\nand $a\\times b$ with either $a=n-k$ or $b=k$ a frozen vertex we define\n$$\\begin{aligned}\n    e_{i\\times j} &\\mapsto& f_{(i-1)\\times(j-1)} - f_{(i-1)\\times j} + f_{i\\times(j+1)} - f_{(i+1)\\times (j+1)} + f_{(i+1)\\times j} - f_{i\\times (j-1)} \\\\\n   % e_{1\\times 1} &\\mapsto& f_{1\\times 2} - f_{2\\times 2} + f_{2\\times 1} - f_{\\varnothing}\\\\\n    e_{a\\times b} &\\mapsto& f_{a\\times b} - f_{(a-1)\\times b} + f_{(a-1)\\times(b-1)} - f_{a\\times (b-1)} \\\\\n    e_{\\varnothing} &\\mapsto& f_{\\varnothing} - f_{1\\times k} + f_{1\\times 1} - f_{(n-k)\\times 1}\n\\end{aligned}$$ with the convention that $f_{0\\times j}=f_{i\\times 0}=0$\nwhenever $i,j\\not =0$ and $f_{0\\times 0}=f_{\\varnothing}$. 
We may\npresent the map pictorially by recording the coefficient of the basis\nelement $e_{i\\times j}$ in the $i\\times j$’th position of the grid (with\nan extra position $0\\times 0$ representing the vertex $\\varnothing$).\n$$\\begin{aligned}\n\\label{eq:pictorial p*}\n\\begin{tikzpicture}[scale=.4]\n\\node at (-3,0){$e_{i\\times j}$};\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5] at (-1,-1) {\\small $0$};\n\\node[opacity=.5] at (-1,0) {\\small$0$};\n\\node[opacity=.5] at (-1,1) {\\small$0$};\n\\node[opacity=.5] at (0,-1) {\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node[opacity=.5] at (0,1) {\\small$0$};\n\\node[opacity=.5] at (1,-1) {\\small$0$};\n\\node[opacity=.5] at (1,0) {\\small$0$};\n\\node[opacity=.5] at (1,1) {\\small$0$};\n\n\\node at (2.5,0) {$\\mapsto$};\n\n\\begin{scope}[xshift=5cm]\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5]  at (-1,-1) {\\small$0$};\n\\node at (-1,0) {\\small$-1$};\n\\node at (-1,1) {\\small$1$};\n\\node at (0,-1) {\\small$1$};\n\\node[opacity=.5]  at (0,0) {\\small$0$};\n\\node at (0,1) {\\small$-1$};\n\\node at (1,-1) {\\small$-1$};\n\\node at (1,0) {\\small$1$};\n\\node[opacity=.5]  at (1,1) {\\small$0$};\n\\end{scope}\n\n\n\\begin{scope}[xshift=15cm]\n\\node at (-3,0){$e_{a\\times b}$};\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5] at (-1,-1) {\\small $0$};\n\\node[opacity=.5] at (-1,0) {\\small$0$};\n\\node[opacity=.5] at (-1,1) {\\small$0$};\n\\node[opacity=.5] at (0,-1) 
{\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node[opacity=.5] at (0,1) {\\small$0$};\n\\node[opacity=.5] at (1,-1) {\\small$0$};\n\\node[opacity=.5] at (1,0) {\\small$0$};\n\\node[opacity=.5] at (1,1) {\\small$0$};\n\n\\node at (2.5,0) {$\\mapsto$};\n\n\\begin{scope}[xshift=5cm]\n\\draw[dashed,opacity=.5] (-1.5,.5) -- (1.5,.5);\n\\draw[dashed,opacity=.5] (-1.5,-0.5) -- (1.5,-0.5);\n\\draw[dashed,opacity=.5] (-0.5,-1.5) -- (-0.5,1.5);\n\\draw[dashed,opacity=.5] (0.5,-1.5) -- (0.5,1.5);\n\\node[opacity=.5]  at (-1,-1) {\\small$0$};\n\\node at (-1,0) {\\small$-1$};\n\\node at (-1,1) {\\small$1$};\n\\node[opacity=.5]  at (0,-1) {\\small$0$};\n\\node at (0,0) {\\small$1$};\n\\node at (0,1) {\\small$-1$};\n\\node[opacity=.5]  at (1,-1) {\\small$0$};\n\\node[opacity=.5]  at (1,0) {\\small$0$};\n\\node[opacity=.5]  at (1,1) {\\small$0$};\n\\end{scope}\n\\end{scope}\n\\end{tikzpicture}\n\\end{aligned}$$ All entries in the grid above *not* corresponding to\nvertices in the particular case considered should simply be neglected. A\nstraightforward computation reveals the following\n\n**Proposition 96**.\n\nWe have\n$\\ker(\\psi)=\\langle{(1,1,\\dots,1)}\\rangle=K_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}}$\nand $\\psi(N_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}})={(1,1,\\dots,1)}^\\perp$.\nSo, the induced map $\\psi:N/K\\to K^\\perp$ is a lattice isomorphism.\n\nIn fact, $\\psi$ defines a cluster ensemble lattice map (Definition ), so\nwe obtain $$\\begin{aligned}\n\\label{eq:p-map Gr}\n    p:\\mathcal{A} \\to\\mathcal{X} , \\quad \\text{determined by }\\quad  (p\\vert_{\\mathcal{A} _{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}}})^*=\\psi.\n\\end{aligned}$$ There is a combinatorial way to obtain the map $\\psi$ by\nintroducing *frozen arrows* to the quiver of the initial seed to *close\ncycles* involving frozen vertices (see Figure ). 
These arrows are used\nto determine the submatrix denoted by $*$ in .\n\nAs a direct consequence of and we observe that the action of $T_K$ on\n$\\mathcal{A}$ coincides with the $\\mathbb{C} ^*$-action (of\nsimultaneously scaling Plücker coordinates) on $\\text{UT} _Y$ restricted\nto $\\mathcal{A}$. In particular:\n\n**Corollary 97**. The Picard property holds for $(p,K)$ with respect to\n$\\mathcal A\\hookrightarrow\\text{UT} _Y$.\n\n## Valuations and Newton–Okounkov bodies\n\nThis subsection is the core of our application to the Grassmannian. We\nshow in Theorem  that certain Newton–Okounkov bodies as they appear in\n(see also Remark ) are unimodularly equivalent to Newton–Okounkov bodies\nof Rietsch–Williams. We first introduce the combinatorics that govern\nRietsch–Williams’ flow valuation and the ${\\bf g}$-vector valuation in\nthis case.\n\n### The flow valuation\n\nBased on Postnikov’s *boundary measurement map* for plabic networks,\nRietsch–Williams associate a **flow valuation** to every plabic graph\n$G$, or more generally every seed $\\textbf{s}$, making use of the\n$\\mathcal{X}$-type cluster structure on the Grassmannian. We denote it by\n$$\\mathop{\\mathrm{val}}_\\textbf{s}:\\mathbb C(Y)\\setminus \\{0\\}\\to \\mathbb Z^{(n-k)\\times k}.$$\nThe valuation is defined as the multidegree of the lowest degree summand\n(with respect to a fixed graded lexicographic order) on Laurent polynomials\nin $\\mathcal{X}$ variables and then extended to rational functions in\nthe natural way. The lattice has rank $(n-k)k$ (as opposed to\n$(n-k)k+1$, which is the number of vertices), as the variable\ncorresponding to $\\varnothing$ never appears (more details below in §).\nNotice that it therefore coincides with our definition of a **c**-vector\nvaluation for cluster $\\mathcal{X}$ varieties (Corollary ). 
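\n\nTo illustrate the definition of the valuation above, here is a toy computation in two hypothetical variables $x_1,x_2$ (chosen only for illustration and not taken from an actual plabic graph). In $x_1+x_1x_2^2$ the summand $x_1$ has total degree $1$ and the summand $x_1x_2^2$ has total degree $3$, so for any graded order the lowest degree summand is $x_1$. Assuming the usual extension $\\mathop{\\mathrm{val}}(g/h)=\\mathop{\\mathrm{val}}(g)-\\mathop{\\mathrm{val}}(h)$ to rational functions, we get $$\\mathop{\\mathrm{val}}(x_1+x_1x_2^2)=(1,0),\\qquad \\mathop{\\mathrm{val}}\\left(\\frac{x_1+x_1x_2^2}{x_2}\\right)=(1,0)-(0,1)=(1,-1).$$\n\n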
For\n$G=G_{\\Yboxdim 4pt \\yng(1)}$ we simply write\n$\\mathop{\\mathrm{val}}_G=\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}$.\nThe flow valuation with respect to the rectangles plabic graph can be\ncomputed in a particularly explicit way, as Rietsch–Williams show in . We\nbriefly summarize some of their findings.\n\n**Proposition 98**. For $J\\in\\binom{[n]}{n-k}$, the valuation\n$\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)$ can be represented\nby a *GT tableau* (defined as follows, see ) of size $(n-k)\\times k$\nwhose $(i\\times j)^{\\text{th}}$ entry represents the coefficient of the\ncorresponding basis element. The entries of the GT tableau are obtained\nin four steps:\n\n-   draw the Young diagram $\\mu_J$ whose south east border is the path $w_J$\n    associated to $J$ in the $(n-k)\\times k$-rectangle;\n\n-   draw another copy of $w_{J}$ shifted by *one step south* and *one\n    step east* (this implies that some steps of the new path $w_J^1$ lie\n    outside of the $(n-k)\\times k$-rectangle);\n\n-   repeat the second step until the new copy of $w_J$ lies\n    *entirely* outside of the $(n-k)\\times k$-rectangle;\n\n-   lastly, place an $i$ inside every box (that is part of the\n    $(n-k)\\times k$-rectangle) in between the paths $w_{J}^{i-1}$ and\n    $w_{J}^i$.\n\nAll other boxes are filled with zeros.\n\nRietsch–Williams compute the Newton–Okounkov bodies associated to this\nvaluation. In our notation they are of the form\n$\\Delta_{\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}}(D_{n-k},p_{(n-k)\\times k})$,\nwhere $p_{(n-k)\\times k}=p_{[1,n-k]}$ is the Plücker coordinate (and\nhence section of $\\mathcal L_e$) associated to the frozen vertex\n$(n-k)\\times k$.\n\n**Example 99**. 
The procedure of is depicted in Figure  for\n$J=\\{3,4,7,9,11,12\\}\\subset [13]$.\n\n### A combinatorial description of ${\\bf g}$-vectors\n\nIn this subsection we consider the cluster variety\n$\\mathcal{A} ^{\\rm op}$ whose initial quiver is obtained by opposing the\ninitial quiver for $\\mathcal{A}$. It is well known that $\\mathcal{A}$\nand $\\mathcal{A} ^{\\rm op}$ are isomorphic (in general opposing the\nquiver gives rise to isomorphic cluster $\\mathcal{A}$-varieties). We\nalso have a partial minimal model\n$\\mathcal{A} ^{\\mathop{\\mathrm{op}}} \\hookrightarrow \\text{UT} _Y$. We\nwrite $\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}^{\\rm op}$ to denote the seed\n$\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}$ of equation thought of as the\ninitial seed for $\\mathcal{A} ^{\\rm op}$. Notice that $-\\psi$ determines\na cluster ensemble map\n$p^{\\rm op}:\\mathcal{A} ^{\\rm op} \\to \\mathcal{X} ^{\\rm op}$ so that the\nPicard property holds for $(p^{\\rm op},K)$ with respect to\n$\\mathcal{A} ^{\\rm op}\\hookrightarrow \\text{UT}_Y$.\n\nIn this setting an explicit combinatorial formula to compute\n**g**-vectors of Plücker coordinates can be deduced from the\ncategorification of the Grassmannian cluster algebra developed in . We\nlearned about it from Bernhard Keller in private email communication.\nThe below formula describes ${\\bf g}$-vectors with respect to the seed\n$\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}^{\\rm op}$ for the cluster variety\n$\\mathcal{A} ^{\\rm op}$ which we think of as another cluster structure\non $\\text{UT} _{Y}$.\n\n**Corollary 100**.\n\n(Hook formula for **g**-vectors) Consider the seed\n$\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}^{\\rm op}$ and\n$J \\in \\binom{[n]}{n-k}$. We let $i_1\\times j_1,\\dots ,i_s \\times j_s$\nbe the rectangles corresponding to the turning points in the path $w_J$\nthat cuts out $\\mu_J$ inside the $(n-k)\\times k$-rectangle. 
Then\n$$\\begin{aligned}\n\\label{eq:g-vectors for Grec}\n{\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}(p_J):={\\bf g}^{\\mathcal{A} ^{\\rm op}}_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}^{\\rm op}}(p_J)= \\sum_{p=1}^{s}f_{i_{p}\\times j_{p}}-f_{i_{p}\\times j_{p+1}},\n\\end{aligned}$$ where we set $f_{i_s\\times j_{s+1}}:=0$.\n\n**Example 101**. Consider $n-k=4$, $n=9$, and $J=\\{2,4,6,7\\}$. We\nhave that $\\mu_{J}=\\Yboxdim 4pt \\yng(4,3,2,2)$, whose turning points correspond to the rectangles $1\\times 4$, $2\\times 3$ and $4\\times 2$, and by the hook formula\n$${\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}\\left(p_{\\Yboxdim 2pt \\yng(4,3,2,2)}\\right)= f_{\\Yboxdim 2pt \\yng(4)}- f_{\\Yboxdim 2pt \\yng(3)}+f_{\\Yboxdim 2pt \\yng(3,3)}-f_{\\Yboxdim 2pt \\yng(2,2)}+f_{\\Yboxdim 2pt \\yng(2,2,2,2)}.$$\n\n### Equality of the Newton–Okounkov bodies\n\nThe aim of this section is to identify the Newton–Okounkov bodies of\nflow valuations with Newton–Okounkov bodies of **g**-vector valuations\nfor $\\mathcal{A} ^{\\rm op}$. We use a particular cluster ensemble\nlattice map for the identification and work in the initial seed\n$\\textbf{s}^{\\rm op}_{\\Yboxdim 4pt \\yng(1)}$ (whose quiver is opposite\nto the quiver depicted in Figure  for $n=9,k=5$).\n\nWe think of the open positroid variety inside\n$Y=\\text{Gr}_{n-k}(\\mathbb{C} ^n)$ as the quotient of the cluster\nvariety $\\mathcal{A} ^{\\rm op}$ by the torus $T_{K}$. We choose a\nsection\n$$\\sigma: N/K \\to N \\quad \\text{with image} \\quad N  \\cap f_{\\varnothing}^\\perp;$$\nthat is, a coset $n\\mod K$ is sent to its unique representative\nsatisfying $\\langle n,f_{\\varnothing}\\rangle=0$. It is not hard to see\nthat $\\sigma$ induces an isomorphism between the rings of rational\nfunctions $\\mathbb C(T_{M/\\langle f_{\\varnothing}\\rangle})$ and\n$\\mathbb C(T_{K^\\perp})$ that commutes with cluster $\\mathcal X$\nmutation. 
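\n\nIn coordinates, the section $\\sigma$ above can be sketched as follows, assuming $K=\\langle (1,1,\\dots,1)\\rangle$ as in Proposition 96 and writing $n_\\varnothing:=\\langle n,f_\\varnothing\\rangle$ (a shorthand used only here): $$\\sigma(n \\bmod K)= n - n_\\varnothing\\,(1,1,\\dots,1).$$ The right hand side pairs to $n_\\varnothing-n_\\varnothing=0$ with $f_\\varnothing$, and it is unchanged when $n$ is replaced by $n+c\\,(1,1,\\dots,1)$, since $n_\\varnothing$ then becomes $n_\\varnothing+c$.\n\n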
We use $\\sigma$ to realize\n$\\mathrm{Trop} _{\\mathbb{Z} }(({\\mathcal{X} _{\\bf 1}^\\vee})_{\\textbf{s}^\\vee})=\\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{A} ^{\\rm op}/T_K)_{\\textbf{s}^{\\rm op}})= N/K$\ninside\n$\\mathrm{Trop} _\\mathbb{Z} (\\mathcal{A} ^{\\rm op}_{\\textbf{s}^{\\rm op}})=N =\\mathrm{Trop} _\\mathbb{Z} (\\mathcal{X} ^\\vee_{\\textbf{s}^\\vee})$\nfor every seed. Moreover, the dual of $\\sigma$ induces an isomorphism of\nlattices $$\\sigma^*:M/\\langle f_{\\varnothing}\\rangle \\to K^\\perp.$$\nNotice that $T_{K^\\perp}=\\pi^{-1}(\\bf 1)$ where $\\pi:T_M\\to T_{K^*}$ is\nthe restriction of $\\mathcal{X} \\to T_{K^*}$ to a cluster chart. As\nalluded to above, we obtain an isomorphism of cluster\n$\\mathcal{X}$-varieties\n$$\\sigma^*:\\mathcal{X} _{\\setminus \\varnothing}\\to \\mathcal{X} _{\\bf 1},$$\nwhere $\\mathcal{X} _{\\setminus \\varnothing}$ is the\n$\\mathcal{X}$-variety associated with the initial data obtained by\ndeleting the index $\\varnothing$ upon realizing\n$M/\\langle f_{\\varnothing}\\rangle$ as\n$\\langle f_{i\\times j}\\mid 1\\le i\\le n-k,1\\le j\\le k\\rangle \\subset M$.\nGiven a seed $\\textbf{s}$ we denote the corresponding seed of\n$\\mathcal{X} _{\\setminus\\varnothing}$ by\n$\\textbf{s}_{\\setminus\\varnothing}$. In particular, we have\n$\\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal X^\\vee_{\\setminus\\varnothing})_{\\textbf{s}^\\vee_{\\setminus \\varnothing}})=N_{\\textbf{s}}\\cap f_{\\varnothing}^\\perp$.\nThe flow valuation is defined on the ring of rational functions on the\npositroid variety, which coincides with\n$$\\mathbb C(\\mathcal{X} _{\\setminus \\varnothing}) \\cong \\mathbb C(x_{i\\times j}:1\\le i\\le n-k,1\\le j\\le k).$$\n\nThe next result follows from the preceding discussion and Corollary .\n\n**Proposition 102**. 
For every choice of seed $\\textbf{s}$ the diagram\ncommutes: $$\\xymatrix{\n\\mathbb C(\\mathcal{X} _{\\bf 1})\\setminus \\{0\\} \\ar[d]_{(\\sigma^*)^*}\\ar[r]^{\\mathbf{c} _{\\textbf{s}}} & N/K\\ar[d]^{\\sigma}\\\\\n\\mathbb C(\\mathcal{X} _{\\setminus \\varnothing})\\setminus \\{0\\} \\ar[r]_{\\mathop{\\mathrm{val}}_{\\textbf{s}}} & N_{\\textbf{s}}\\cap f_{\\varnothing}^\\perp.\n}$$\n\nThe flow valuation is a ${\\bf c}$-vector valuation for the variety\n$\\mathcal{X} _{\\setminus \\varnothing}$ as both are defined by picking\nthe lowest degree exponent of a Laurent polynomial with respect to the\nsame order. That is:\n$$\\mathop{\\mathrm{val}}_{\\textbf{s}}= \\mathbf{c} ^{\\mathcal{X} _{\\setminus \\varnothing}}_{\\textbf{s}\\setminus\\varnothing}.$$\nAlternatively, in light of Proposition we may think of the flow\nvaluation as a ${\\bf c}$-vector valuation for $\\mathcal{X} _{\\bf 1}$.\nOur aim now is to identify the images of\n$\\mathop{\\mathrm{val}}_\\textbf{s}$ with those of a $\\bf g$-vector\nvaluation for $\\mathcal{A} ^{\\rm op}$, or more precisely a\n${\\bf g}$-vector valuation for $\\mathcal{A} ^{\\rm op}/T_{K}$. 
To avoid\nconfusion we introduce the following notation $$\\begin{aligned}\n\\label{eq:homogenized g vector op}\n\\bar {\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}:R\\setminus \\{0\\}\\longrightarrow K^\\perp\\cong M_{\\textbf{s}}/\\langle f_{(n-k)\\times k}\\rangle \n\\end{aligned}$$ defined for a homogeneous element\n$h\\in R_q\\setminus \\{0\\}$ by $$h\\longmapsto \n{\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}\\left(\\frac{h}{\np_{(n-k)\\times k}^q}\\right),$$ where\n${\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}(p_{(n-k)\\times k})=f_{(n-k)\\times k}$.\nNotice that $\\bar {\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}$ is the\nrestriction of\n${\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}: \\mathbb C(Y)\\setminus \\{0\\}\\to M$\nto the section ring $R\\hookrightarrow \\mathbb C(Y)$ where the embedding\nis defined by $R_q\\ni h\\mapsto h/p_{(n-k)\\times k}^q$.\n\n**Theorem 103**.\n\n*Proof.* We prove the claim in several steps. First, we need to describe\n$\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}(p_J)$.\nFortunately, this is straightforward using . Let us analyze the image of\nthe *$i$-strip*, *i.e.* the image of the elements of form\n$-ie_{a\\times b}$ corresponding to a box in position $a\\times b$ of the\ngrid lying between the path $w_J^i$ and $w_J^{i-1}$. We deduce\n\nNotice that unless $i=1$ all non zero entries in the picture cancel with\nthe images of the $(i-1)$- and the $(i+1)$-strips. When $i=1$ however,\nthe entry $i=1$ above the path $w_J^{0}=w_J$ stays. Hence, for every\ncorner in $w_{J}$ corresponding to a south step followed by a west step\n$-\\psi\\left(\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)\\right)$\nhas coefficient $1$ for $f_{a\\times b}$ where $a\\times b$ corresponds to\nthe box whose south east corner coincides with this corner of $w_J$. The\ncase of a corner in $w_{J}$ corresponding to a west step followed by a\nsouth step is very similar, with the only difference that the signs\nchange. 
In particular,\n$-\\psi\\left(\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)\\right)$\nhas coefficient $-1$ for $f_{a\\times b}$ where $a\\times b$ corresponds\nto the box whose south east corner is adjacent to this corner of $w_J$.\n\nIt is left to analyze the parts of\n$-\\psi\\left(\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)\\right)$\ncorresponding to *frozen* vertices. The arguments here are very similar,\nthe only special case being the south east corner of the\n$(n-k)\\times k$-rectangle. Hence, we restrict our attention to this case\nand omit the others.\n\nConsider the vertex in position $(n-k)\\times k$ and assume that in\n$\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)$ the corresponding\nentry is $i$. Notice that the coefficient for the vertex\n$(n-k-1)\\times (k-1)$ is necessarily $i-1$. So, applying $-\\psi$ we see\nthat\n\nObserve that the entries $1$ and $-i$ cancel by similar arguments as\nabove. The only non-zero coefficient in this picture is $-1$ for\n$f_{(n-k)\\times k}$. Summarizing, we have $$\\begin{aligned}\n-\\psi\\left(\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}(p_J)\\right) &=& \\sum_{\\begin{smallmatrix}\n  \\text{south to west}\\\\\n  \\text{corners of }w_J\n\\end{smallmatrix} } f_{a'\\times b'} -f_{(n-k)\\times k}\n-\\sum_{\\begin{smallmatrix}\n  \\text{west to south}\\\\\n  \\text{corners of }w_J\n\\end{smallmatrix} } f_{a\\times b}\\\\\n&\\overset{\\text{Equation~\\eqref{eq:g-vectors for Grec}}}{=}& {\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}(p_{J}) - f_{(n-k)\\times k}=\\bar{\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}(p_J).\n\\end{aligned}$$ This implies\n$-\\psi\\left(\\Delta_{\\mathop{\\mathrm{val}}_{\\Yboxdim 4pt \\yng(1)}}(D_{n-k},p_{(n-k)\\times k})\\right)=\\Delta_{\\bar{\\bf g}_{\\Yboxdim 4pt \\yng(1)^{\\rm op}}}(D_{n-k},p_{(n-k)\\times k})$. ◻\n\n**Remark 104**. The attentive reader might notice that Theorem  and\nthe discussion preceding it closely resemble Lemma . 
However, the\ndifference between the conventions chosen in and in the present paper\nnecessitates a non-trivial change of coordinates. To avoid lengthening\nthe exposition even more we decided to give the result in a single seed\nbut allude to the fact that Theorem  indeed is an instance of Lemma \n(after non-trivial changes of cluster coordinates). After making the\nappropriate change of coordinates one can show that the map $-\\psi$ may\nbe described as the tropicalization of a cluster ensemble map. In\nparticular, Theorem  can be extended to all seeds.\n\n## The intrinsic Newton–Okounkov body for Grassmannians\n\nAs before, let $\\mathcal{L} _e$ be the bundle over\n$\\text{Gr}_{n-k}(\\mathbb{C} ^n)$ obtained by pullback of\n$\\mathcal{O}(1)$ under the Plücker embedding\n$\\text{Gr}_{n-k}(\\mathbb{C} ^n)\\hookrightarrow \\mathbb{P}^{\\binom{n}{n-k}-1}$.\nRecall the definition of the intrinsic Newton–Okounkov body from\nDefinition .\n\n**Corollary 105**. Consider the partial minimal model\n$\\mathcal{A} ^{\\rm op}\\hookrightarrow \\text{UT} _{Y}$, the minimal model\n$\\mathcal{A} ^{\\mathop{\\mathrm{op}}}/T_K \\hookrightarrow Y$ and the map\n${\\bf g}:\\mathbb{B}_{\\vartheta }(\\mathcal{A} ^{\\mathop{\\mathrm{op}}})\\to \\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{A} ^{\\mathop{\\mathrm{op}}})^\\vee)$\nof . Then $$\\begin{split} \n\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e) = \\mathop{\\mathrm{conv_{BL}}}\\Bigg( \\left\\{ \\mathbf{g} \\left(p_J\\right) \\mid J \\in \\binom{[n]}{n-k}\\right\\} \\Bigg).\n \\end{split}$$\n\n*Proof.* The Newton–Okounkov polytope for the flow valuation with\nrespect to $\\textbf{s}_{\\Yboxdim 4pt \\yng(1)}$ is the convex hull of the\nimages of Plücker coordinates (see ). 
So by Theorem  the same is true\nfor the Newton–Okounkov body\n$\\Delta_{\\bar{\\mathbf{g} }_{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}}(D_{n-k},p_{(n-k)\\times k})$.\nBy Theorem ,\n$\\Delta_{\\bar{\\mathbf{g} }_{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}}(D_{n-k},p_{(n-k)\\times k})$\nis positive, hence it is broken line convex. Therefore, the broken line\nconvex hull of the set\n$\\left\\{ \\bar{\\mathbf{g} }_{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}} \\left(p_J\\right) \\mid J \\in \\binom{[n]}{n-k}\\right\\}$\nin the lattice\n$\\mathrm{Trop} _\\mathbb{R} ({\\mathcal{A} ^{\\mathop{\\mathrm{op}}}/T_{K}}^{\\vee}_{\\textbf{s}_{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}})$\ncoincides with its convex hull. We take into account Remark  to get that\n$\\Delta_{\\mathbf{g} _{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}}(\\mathcal{L} _e) = \\Delta_{\\bar{\\mathbf{g} }_{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}}(D_{n-k},p_{(n-k)\\times k}) + \\mathbf{g} _{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}(p_{(n-k)\\times k})$.\nThis implies that\n$\\Delta_{\\mathbf{g} _{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}}}(\\mathcal{L} _e)$\nis the broken line convex hull of the set\n$\\left\\{ \\mathbf{g} _{\\Yboxdim 4pt \\yng(1)^{\\mathop{\\mathrm{op}}}} \\left(p_J\\right) \\mid J \\in \\binom{[n]}{n-k}\\right\\}$.\nSince being a broken line convex set is independent of the choice of seed, the\nresult follows. ◻\n\n**Remark 106**. In the proof of Corollary , we implicitly use the\n$\\mathcal{A}$ cluster structure to view the intrinsic Newton–Okounkov\nbody $\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e)$ as the broken line convex\nhull of tropical points indexing Plücker coordinates. We could\nalternatively use the $\\mathcal{X}$ cluster structure as\nRietsch–Williams do, and define theta functions with the corresponding\n$\\mathcal{X}$ scattering diagram. 
By identifying the Rietsch–Williams\nvaluation with the $\\mathbf{c}$-vector valuation (Corollary ) and noting\nthat there is a cluster ensemble automorphism of the open positroid\nvariety (see , ), we can apply Lemma  to give a completely analogous\nstatement to Corollary  which uses the Rietsch–Williams valuation rather\nthan the $\\mathbf{g}$-vector valuation. In fact, in § we present an\nexample of an explicit computation related to the intrinsic\nNewton–Okounkov body $\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e)$ defined via\nthe $\\mathcal{X}$ scattering diagram.\n\n### Example\n\nIn this subsection, we will give an example of the intrinsic\nNewton–Okounkov body for the case of\n$\\mathop{\\mathrm{Gr}}_3\\left(\\mathbb{C} ^6\\right)$ and compare this to a\nNewton–Okounkov body of . In particular, in , Rietsch–Williams discuss a\nnon-integral vertex appearing in the Newton–Okounkov body\n$\\Delta_{\\mathop{\\mathrm{val}}_{G}}(D_{3},p_{123})$ associated to the\nplabic graph $G$ of Figure . We illustrate how this non-integral vertex\nin the usual Newton–Okounkov body framework corresponds to a point in\nthe interior of a broken line segment in\n$\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e)$ and thus is not a genuine vertex\nfrom the intrinsic Newton–Okounkov body perspective. Here, to facilitate\ncomparison with , we will view the open positroid variety as\n$\\mathcal{X} _{\\mathbf{1}}$ (up to codimension 2). So, the scattering\ndiagram we use to define $\\Delta_{\\mathrm{BL}}(\\mathcal{L} _e)$ in the\nsubsection will be\n$\\mathfrak{D}^{\\mathcal{X} _{\\mathbf{1}}}_{\\text{in},\\textbf{s}_G}$ for\na particular choice of initial seed $\\textbf{s}_G$. The choice of seed\nis encoded by the plabic graph illustrated in Figure .\n\nRecall that the Young diagrams in Figure  label the network parameters\nused in flow polynomial expressions (see ). 
The $\\mathcal{A}$-cluster\ndetermined by trips in the plabic graph $G$ consists of the Plücker\ncoordinates whose indices are given in Figure  (see ).\n\nAccording to , a non-integral vertex in the Newton–Okounkov polytope\ncomes from half the valuation of the flow polynomial for the element\n$f = (p_{124} p_{356} - p_{123} p_{456}) / {p_{123}^2}$. They compute\n$\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f)$ and express its entries in\ntabular form (see ) as we have reproduced in Table . The function\n$p_{123}^2\\, f$ is one of the two $\\mathcal{A}$ cluster variables that\nare not Plücker coordinates, see *e.g.* . It is obtained by mutation at\n$\\Yboxdim 4pt \\yng(2,1)$.\n\n```latex\n\\begin{tabular}{|C|C|C|C|C|C|C|C|C|} \n\\hline\n  \\vphantom{\\Yboxdim 4pt \\yng(1,1,1,1)} \\Yboxdim 4pt \\yng(3,3,3)  & \\Yboxdim 4pt \\yng(3,3,2)  & \\Yboxdim 4pt \\yng(2,2,2)  & \\Yboxdim 4pt \\yng(1,1,1)  & \\Yboxdim 4pt \\yng(3,3)  & \\Yboxdim 4pt \\yng(2,1)  & \\Yboxdim 4pt \\yng(1,1)  & \\Yboxdim 4pt \\yng(3)  & \\Yboxdim 4pt \\yng(2)\\\\\n\\hline\n     \\vphantom{\\Yboxdim 4pt \\yng(1,1,1)_{\\Yboxdim 4pt \\yng(1,1,1)}}\\frac{3}{2} &     \\frac{3}{2} & 1 &   \\frac{1}{2} & 1 & \\frac{1}{2} &\\frac{1}{2} &\\frac{1}{2} &\\frac{1}{2} \\\\\n\\hline\n\\end{tabular}\n```\n\nNote we can re-interpret the expression for $f$ as the expansion of a\nproduct of theta functions. All Plücker coordinates are $\\mathcal{A}$\ncluster variables, and all $\\mathcal{A}$ cluster monomials are theta\nfunctions. Then $$\\begin{split} p_{124}\\, p_{356} = \np_{123}^2\\, f + p_{123}\\, p_{456}, \\end{split}$$ and the right hand side\nis a sum of two theta functions. This means there are only two balanced\npairs of broken lines contributing to the product. We will see that the\npair with no bending corresponds to the summand $p_{123}\\, p_{456}$,\nwhile the other involves a maximal bend at an initial wall. 
(Since the\nbend is at an initial wall, we are able to see the relevant broken line\nsegment without constructing the consistent scattering diagram.)\n\nWe interpret the Rietsch–Williams valuation as being valued in\n$\\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{X} _{\\mathbf{1}})^{\\vee})$ (see\nProposition ) and consider broken lines in the associated $\\mathcal{X}$\ncluster scattering diagram. The choice of seed identifies\n$\\mathrm{Trop} _{\\mathbb{Z} }((\\mathcal{X} _{\\mathbf{1}})^\\vee)$ with\n$N^\\vee / {\\left\\langle{(1,1,\\dots, 1)}\\right\\rangle}$ and we draw the\nscattering in\n$\\left( N^\\vee / {\\left\\langle{(1,1,\\dots, 1)}\\right\\rangle}\\right) \\otimes \\mathbb{R}$.\n\nWe will use Figure to define the fixed data and the seed data for the\ncluster structure. The initial scattering diagram for the $\\mathcal{X}$\nvariety is\n$$\\mathfrak{D}^{\\mathcal{X} _{}}_{\\text{in},\\textbf{s}_G}= \\left\\{   \\left( (v_{\\mu})^{\\perp}, \\  1+ z^{e_{\\mu}}\\right) \\mid \\mu \\in \\{ \\Yboxdim 4pt \\yng(2), \\Yboxdim 4pt \\yng(2,1), \\Yboxdim 4pt \\yng(3,3,2), \\Yboxdim 4pt \\yng(1,1) \\}\\right\\}.$$\nTo get initial scattering diagram for the fibre over $\\mathbf{1}$ we\ntake the quotient of the support\n$\\mathfrak{D}^{\\mathcal{X} }_{\\text{in},\\textbf{s}_G}$ by\n$\\left(\\mathbb{R} \\cdot (1,1,\\dots, 1) \\right)$. (Observe that\n$(v_{\\mu})^{\\perp}$ is invariant under translations by\n$\\mathbb{R} \\cdot (1,1,\\dots, 1)$.)\n\n$$\\mathfrak{D}^{\\mathcal{X} _{\\mathbf{1}}}_{\\text{in},\\textbf{s}_G}= \\left\\{   \\left( (v_{\\mu})^{\\perp}/ \\left(\\mathbb{R} \\cdot (1,1,\\dots, 1) \\right) , \\  1+ z^{e_{\\mu}}\\right) \\mid \\mu \\in \\{ \\Yboxdim 4pt \\yng(2), \\Yboxdim 4pt \\yng(2,1), \\Yboxdim 4pt \\yng(3,3,2), \\Yboxdim 4pt \\yng(1,1) \\}\\right\\}$$\n\nAll pertinent valuations may be found in . We record them here using the\nordering of Table . 
We choose the representative whose coefficient of\n$e_\\varnothing$ is $0$ and do not record this entry.\n$$\\begin{split} \\mathop{\\mathrm{val}}_{G}({p}_{124}) = (1,0,0,0,0,0,0,0,0) \\end{split}$$\n$$\\begin{split} \\mathop{\\mathrm{val}}_{G}({p}_{356}) = (2,2,2,1,2,1,1,1,1) \\end{split}$$\n$$\\begin{split} \\mathop{\\mathrm{val}}_{G}({p}_{123}) = (0,0,0,0,0,0,0,0,0)  \\end{split}$$\n$$\\begin{split} \\mathop{\\mathrm{val}}_{G}({p}_{456}) = (3,2,2,1,2,1,1,1,1)  \\end{split}$$\nSo, we have\n$\\mathop{\\mathrm{val}}_{G}({p}_{124}) + \\mathop{\\mathrm{val}}_G( p_{356} ) = \\mathop{\\mathrm{val}}_{G}( p_{123} \\, p_{456}  )$.\nThe summand $p_{123} \\, p_{456}$ in the product $p_{124} \\, p_{356}$\ncorresponds to the straight broken line segment from\n$\\mathop{\\mathrm{val}}_{G}({p}_{356})$ to\n$\\mathop{\\mathrm{val}}_G( p_{124} )$, whose midpoint is\n$\\frac{1}{2}  \\mathop{\\mathrm{val}}_{G}( p_{123} \\, p_{456}  )$.\n\nThe bending wall for the other broken line segment is\n$\\left((v_{\\Yboxdim 2pt \\yng(3,3,2)})^{\\perp}, 1+z^{e_{\\Yboxdim 2pt \\yng(3,3,2)}}\\right)$.\nNote that\n$v_{\\Yboxdim 2pt \\yng(3,3,2)}= f_{\\Yboxdim 2pt \\yng(3,3,3)} + f_{\\Yboxdim 2pt \\yng(2,1)}-f_{\\Yboxdim 2pt \\yng(3,3)} - f_{\\Yboxdim 2pt \\yng(2,2,2)}$,\nand\n$\\frac{1}{2}\\mathop{\\mathrm{val}}(f) = \\frac{1}{2}\\mathop{\\mathrm{val}}(p_{123}^2\\, f)$\nis perpendicular to this vector. So,\n$\\frac{1}{2}\\mathop{\\mathrm{val}}(p_{123}^2\\, f)$ lies in the support of\nthis wall. We will see that there is a broken line segment from\n$\\mathop{\\mathrm{val}}_{G}(p_{356})$ to\n$\\mathop{\\mathrm{val}}_{G}(p_{124})$ passing through\n$\\frac{1}{2}\\mathop{\\mathrm{val}}(f)$ and bending maximally here, as\ndepicted in Figure .\n\nRecall that the exponent vector of the decoration monomial along\n$\\ell_i$ is the negative of the velocity vector there. 
Traveling along\n$\\ell_1$, this velocity vector is positively proportional to\n$$\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f) - \\mathop{\\mathrm{val}}_G(p_{356}) = \n-\\left(\\frac{1}{2}, \\frac{1}{2}, 1, \\frac{1}{2},  1, \\frac{1}{2} , \\frac{1}{2}, \\frac{1}{2}, \\frac{1}{2}\\right).$$\nWe can take a broken line with exponent vector $v_1=(1,1,2,1,2,1,1,1,1)$\nalong $\\ell_1$. The possible bendings after crossing the wall correspond\nto summands of\n$$\\begin{split} z^{v_1}\\left(1+z^{e_{\\Yboxdim 2pt \\yng(3,3,2)}}\\right)^{-\\left\\langle{v_1,v_{\\Yboxdim 2pt \\yng(3,3,2)}}\\right\\rangle}=z^{v_1}\\left(1+z^{e_{\\Yboxdim 2pt \\yng(3,3,2)}}\\right)^{2}. \\end{split}$$\nThe maximal bending corresponds to the summand\n$z^{v_1+ 2 e_{\\Yboxdim 2pt \\yng(3,3,2)}} = z^{(1,3,2,1,2,1,1,1,1)}$. Let\nus call $v_2 := (1,3,2,1,2,1,1,1,1)$. Then observe that\n$\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f) - \\frac{1}{2}v_2 = \\mathop{\\mathrm{val}}_G(p_{124})$.\nSo, we have a broken line segment $\\gamma$ traveling from\n$\\mathop{\\mathrm{val}}_G(p_{356})$ to\n$\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f)$ with decoration monomial\n$z^{v_1}$, bending maximally and continuing to\n$\\mathop{\\mathrm{val}}_G(p_{124})$ with decoration monomial $z^{v_2}$.\nPrecisely $\\frac{1}{2}$ a unit of time is spent in each straight\nsegment. 
From this perspective, $\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f)$\nis not a genuine vertex; $\\frac{1}{2}\\mathop{\\mathrm{val}}_G(f)$ is in\nthe relative interior of the support of $\\gamma$, and the endpoints of\n$\\gamma$ are in the Newton–Okounkov body.\n\n[^1]: A weaker condition is enough to construct a valuation for a given\n    seed, see Remark .\n\n[^2]: Throughout the text we consider more generally (partial) minimal\n    models for schemes $V$ with a cluster structure given by a\n    birational map $\\mathcal{V} \\dashrightarrow V$.\n\n[^3]: If $Y$ is Fano, then this should be considered as the\n    superpotential used for mirror symmetry purposes.\n\n[^4]: That is, the equivalence class of consistent scattering diagrams\n    for $\\mathcal{A}$ contains a representative whose scattering\n    functions are of this form.\n\n[^5]: In fact, it is enough that the equality holds at the level of\n    integral points, namely,\n    $k\\Delta(\\mathbb{Z} )= P_k(\\mathbb{Z} ) - k\\nu_{\\textbf{s}}(\\tau)$.\n    However, we are able to show the stronger condition\n    $k\\Delta= P_k - k\\nu_{\\textbf{s}}(\\tau)$.\n\n[^6]: That is, there is a scheme $\\mathcal Y$ and a flat morphism\n    $\\mathcal{Y}\\to \\mathbb A^1$ whose generic fibre is isomorphic to\n    $Y$ and special fibre isomorphic to the toric variety associated to\n    $\\Delta_{\\nu_{\\textbf{s}}}(D',\\tau)$.\n\n[^7]: This perspective is essentially the *jagged path* description of\n    theta functions rather than the broken line description. See for\n    example .\n\n[^8]: To be precise, we are only interested in reduced plabic graphs\n    with trip permutation $\\pi_{k,n}$.\n"
}

S2ORC

The Semantic Scholar Open Research Corpus (S2ORC) is a comprehensive dataset designed for natural language processing (NLP) and text-mining research over scientific papers. It provides rich metadata, abstracts, and full-text content for millions of academic papers across various disciplines. We split this dataset into two components: S2ORC full text and S2ORC abstract.

Download and Extraction: S2ORC was downloaded directly in zip format using an S2ORC API key and a GET request: response = urllib.request.urlopen(url)
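The download step can be sketched as follows. This is a minimal illustration rather than the pipeline's actual code: the `download` helper and the `x-api-key` header name are assumptions, and only the `urllib.request.urlopen` call comes from the description above.

```python
import shutil
import urllib.request


def download(url, dest_path, api_key=None):
    """Stream the resource at `url` to `dest_path` and return the path."""
    # The header name is an assumption; check the API documentation before use.
    headers = {"x-api-key": api_key} if api_key else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        # Copy in chunks so a large zip archive never sits fully in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters for multi-gigabyte archives.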

Filters Applied: We apply the filters suggested by the peS2o dataset, after manually verifying the output of each filter.

Title and Abstract Filter: The paper must have both a title and an abstract.
Language Filter: The paper must be in English. To determine the language of each document, we use the pycld3 library. We run pycld3 on the first 2000 characters of each paragraph in the paper. The language of the paper is the most common language among its paragraphs.
Word Count Filter: Papers with fewer than 500 words are discarded.
Paragraph Count Filter: The paper must have at least 5 paragraphs after removing paragraphs whose average log word probability is below -20.
Frequency Filter: The most frequent word in the paper must consist of alphabetic characters only and must account for less than 7.5% of the document. Words are obtained by splitting the text on whitespace.
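The language, word count, and frequency filters above can be sketched as follows. This is an illustrative reading of the rules, not the pipeline's code: the function names are hypothetical, and `detect` stands in for the pycld3 call so the sketch stays free of external dependencies.

```python
from collections import Counter


def majority_language(paragraphs, detect, max_chars=2000):
    """Return the most common language code over all non-empty paragraphs.

    `detect` is any callable mapping text to a language code; in the
    pipeline described above this role is played by pycld3.
    """
    votes = Counter(detect(p[:max_chars]) for p in paragraphs if p.strip())
    return votes.most_common(1)[0][0] if votes else None


def passes_word_count(text, min_words=500):
    """Keep documents with at least `min_words` whitespace-separated words."""
    return len(text.split()) >= min_words


def passes_frequency_filter(text, max_share=0.075):
    """The most frequent word must be alphabetic and stay under `max_share`."""
    words = text.split()
    if not words:
        return False
    word, count = Counter(words).most_common(1)[0]
    return word.isalpha() and count / len(words) < max_share
```

Passing the detector in as a parameter makes the voting logic easy to test on its own and to reuse with a different language identifier.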

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
S2ORC	12963563	0.00%	0.00%	0.00%	0.00%	100.00%

S2ORC Abstract

Download and Extraction: S2ORC was downloaded directly in zip format using an S2ORC API key and a GET request: response = urllib.request.urlopen(url)

Filters Applied: We apply the filters suggested by the peS2o dataset, after manually verifying the output of each filter. The frequency filter suggested by peS2o was not used, because manual inspection showed that it removed good samples.

Title and Abstract Filter: The paper must have both a title and an abstract.
Majority Language Filter: The abstract must be in English.
Minimum Word Count Filter: Abstracts with fewer than 20 words are discarded.
Unigram Log Probability Threshold: -20
Note: The Frequency Filter (the most frequent word must consist of alphabetic characters only and account for less than 7.5% of the document, with words obtained by splitting the text on whitespace) was not applied to abstracts.
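The unigram log probability threshold can be illustrated with a small sketch. The text does not specify the word-frequency table, the logarithm base, or how unseen words are handled, so the `unigram_probs` mapping, the natural logarithm, and the `floor` fallback below are all assumptions.

```python
import math


def avg_unigram_logprob(text, unigram_probs, floor=1e-12):
    """Average natural-log probability of the words in `text`.

    `unigram_probs` maps a lowercased word to its corpus probability;
    unseen words fall back to `floor`. Both are illustrative assumptions.
    """
    words = text.lower().split()
    if not words:
        return float("-inf")
    return sum(math.log(unigram_probs.get(w, floor)) for w in words) / len(words)


def passes_unigram_filter(text, unigram_probs, threshold=-20.0):
    """Keep text whose average log word probability is above `threshold`."""
    return avg_unigram_logprob(text, unigram_probs) > threshold
```

Text made mostly of common words scores well above -20, while strings of rare or garbled tokens fall below it and are dropped.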

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
S2ORC Abstract	102324176	18.04%	1.17%	0.00%	0.13%	80.66%

PubMed Central and PubMed Abstract

Download and Extraction: All files were downloaded from https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/. PubMed Central (PMC) files are downloaded in xml.tar format. The tar files are opened and converted to markdown format using pandoc: pandoc <raw_xml_path> -s -o <output_markdown_path> -f jats -t markdown_mmd [--lua-filter <lua_filter_path>]. The markdown files are combined to create jsonl files. PubMed Abstract (PMA) files were downloaded in xml format. The BeautifulSoup library was used to extract the abstract, title, and PMID. All files were stored in jsonl format.
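The pandoc invocation above could be wrapped as follows. The flags are taken from the command in the text; the `pandoc_command` and `convert` helpers are hypothetical names, and running `convert` assumes pandoc is installed and on the PATH.

```python
import subprocess


def pandoc_command(raw_xml_path, output_markdown_path, lua_filter_path=None):
    """Build the pandoc argument list for a JATS XML to markdown conversion."""
    cmd = ["pandoc", raw_xml_path, "-s", "-o", output_markdown_path,
           "-f", "jats", "-t", "markdown_mmd"]
    if lua_filter_path:
        # The Lua filter is optional, mirroring the bracketed flag in the text.
        cmd += ["--lua-filter", lua_filter_path]
    return cmd


def convert(raw_xml_path, output_markdown_path, lua_filter_path=None):
    """Run pandoc; raises CalledProcessError on a non-zero exit status."""
    cmd = pandoc_command(raw_xml_path, output_markdown_path, lua_filter_path)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.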

Unique Data Preparation Challenges:

We applied the same approach to PMC as we did to ArXiv. The resulting markdown may differ slightly because the XML files are structured differently.

Filters Applied: We apply the filters suggested by the peS2o dataset, after manually verifying the output of each filter.

Minimum Word Count Filter: PMC documents with fewer than 100 words are discarded; PMA documents with fewer than 20 words are discarded.
Language Filter: English only
Frequency Filter: The most frequent word in the paper must consist of alphabetic characters only and must account for less than 7.5% of the document. Words are obtained by splitting the text on whitespace. This filter was not used for PMA.
Unigram Log Probability Threshold: -20
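Because the same filter family is applied with different thresholds per source, one way to organize it is a small per-source configuration. The sketch below transcribes the thresholds stated in this section; the `FilterConfig` class, the dictionary keys, and the `passes_min_words` helper are hypothetical names, and the assumption that a document exactly at the threshold is kept follows the "not inclusive" wording used for S2ORC and PMC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterConfig:
    min_words: int
    use_frequency_filter: bool
    # For S2ORC full text the -20 threshold is applied per paragraph.
    min_avg_logprob: float = -20.0


# Thresholds transcribed from the filter lists in this section.
CONFIGS = {
    "s2orc": FilterConfig(min_words=500, use_frequency_filter=True),
    "s2orc_abstract": FilterConfig(min_words=20, use_frequency_filter=False),
    "pubmed_central": FilterConfig(min_words=100, use_frequency_filter=True),
    "pubmed_abstract": FilterConfig(min_words=20, use_frequency_filter=False),
}


def passes_min_words(text, source):
    """Keep text with at least the source's minimum whitespace word count."""
    return len(text.split()) >= CONFIGS[source].min_words
```

Keeping the thresholds in one table makes the differences between full-text and abstract sources easy to audit.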

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
PubMed - Central	5230932	7.66%	1.29%	0.02%	0.00%	91.03%

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
PubMed - Abstract	25787474	0.01%	0.14%	0.00%	0.00%	98.85%

PubMed Filtering Examples

Data sample: 3 of 9

Raw format

{
    "html": "\n\n38062601\n\n2023\n12\n08\n\n\n\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n\nPancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.\n\n1426-1432\n\n10.29271/jcpsp.2023.12.1426\n\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n\n\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\neng\n\nJournal Article\n\n\n\nPakistan\nJ Coll Physicians Surg Pak\n9606447\n1022-386X\n\nIM\n\n\n\n\n2023\n02\n19\n\n\n2023\n09\n04\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n0\n6\n\n\nppublish\n\n38062601\n040579197\n10.29271/jcpsp.2023.12.1426\n\n\n\n",
    "body": "\n\n38062601\n\n2023\n12\n08\n\n\n\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n\nPancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.\n\n1426-1432\n\n10.29271/jcpsp.2023.12.1426\n\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n\n\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\neng\n\nJournal Article\n\n\n\nPakistan\nJ Coll Physicians Surg Pak\n9606447\n1022-386X\n\nIM\n\n\n\n\n2023\n02\n19\n\n\n2023\n09\n04\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n0\n6\n\n\nppublish\n\n38062601\n040579197\n10.29271/jcpsp.2023.12.1426\n\n\n\n",
    "pubmedarticle": "\n\n38062601\n\n2023\n12\n08\n\n\n\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n\nPancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.\n\n1426-1432\n\n10.29271/jcpsp.2023.12.1426\n\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n\n\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\neng\n\nJournal Article\n\n\n\nPakistan\nJ Coll Physicians Surg Pak\n9606447\n1022-386X\n\nIM\n\n\n\n\n2023\n02\n19\n\n\n2023\n09\n04\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n0\n6\n\n\nppublish\n\n38062601\n040579197\n10.29271/jcpsp.2023.12.1426\n\n\n",
    "medlinecitation": "\n38062601\n\n2023\n12\n08\n\n\n\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n\nPancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.\n\n1426-1432\n\n10.29271/jcpsp.2023.12.1426\n\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n\n\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\neng\n\nJournal Article\n\n\n\nPakistan\nJ Coll Physicians Surg Pak\n9606447\n1022-386X\n\nIM\n",
    "pmid": "38062601",
    "daterevised": "\n2023\n12\n08\n",
    "year": "2023",
    "month": "12",
    "day": "8",
    "article": "\n\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n\nPancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.\n\n1426-1432\n\n10.29271/jcpsp.2023.12.1426\n\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n\n\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\neng\n\nJournal Article\n\n",
    "journal": "\n1681-7168\n\n33\n12\n\n2023\nDec\n\n\nJournal of the College of Physicians and Surgeons--Pakistan : JCPSP\nJ Coll Physicians Surg Pak\n",
    "issn": "1681-7168",
    "journalissue": "\n33\n12\n\n2023\nDec\n\n",
    "volume": "33",
    "issue": "12",
    "pubdate": "\n2023\nDec\n",
    "title": "Journal of the College of Physicians and Surgeons--Pakistan : JCPSP",
    "isoabbreviation": "J Coll Physicians Surg Pak",
    "articletitle": "Pancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis.",
    "pagination": "\n1426-1432\n",
    "medlinepgn": "1426-1432",
    "elocationid": "10.29271/jcpsp.2023.12.1426",
    "abstract": "\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n",
    "abstracttext": "This review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.",
    "authorlist": "\n\nFilho\nJoao Emilio Lemos Pinheiro\nJELP\n\n\nMarque\nStefanie Sophie Buuck\nSSB\n\n\nHenriques\nAlexandre Cruz\nAC\n\n\nDias\nAndre Roncon\nAR\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n\nWaisberg\nJaques\nJ\n\n\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n\n",
    "author": "\nTustumi\nFrancisco\nF\n\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n\n",
    "lastname": "Tustumi",
    "forename": "Francisco",
    "initials": "F",
    "affiliationinfo": "\nDepartment of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.\n",
    "affiliation": "Department of Gastroenterology, Hospital Israelita Albert Einstein, Morumbi, Brazil.",
    "language": "eng",
    "publicationtypelist": "\nJournal Article\n",
    "publicationtype": "Journal Article",
    "medlinejournalinfo": "\nPakistan\nJ Coll Physicians Surg Pak\n9606447\n1022-386X\n",
    "country": "Pakistan",
    "medlineta": "J Coll Physicians Surg Pak",
    "nlmuniqueid": "9606447",
    "issnlinking": "1022-386X",
    "citationsubset": "IM",
    "pubmeddata": "\n\n\n2023\n02\n19\n\n\n2023\n09\n04\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n0\n6\n\n\nppublish\n\n38062601\n040579197\n10.29271/jcpsp.2023.12.1426\n\n",
    "history": "\n\n2023\n02\n19\n\n\n2023\n09\n04\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n6\n41\n\n\n2023\n12\n8\n0\n6\n\n",
    "pubmedpubdate": "\n2023\n12\n8\n0\n6\n",
    "hour": "0",
    "minute": "6",
    "publicationstatus": "ppublish",
    "articleidlist": "\n38062601\n040579197\n10.29271/jcpsp.2023.12.1426\n",
    "articleid": "10.29271/jcpsp.2023.12.1426"
}

Extracted format

{
    "pmid": 38062601,
    "abstract": "\nThis review evaluated the risks and survival benefits of pancreatoduodenectomy associated with venous resection compared with palliative surgery. A systematic review with meta-analysis was performed. Higher overall survival was observed in the pancreatic resection group (HR = 4.000; 95% CI 2.800 to 5.200). However, the palliative group had fewer complications (RD = -0.170; 95% CI -0.260 to -0.070). There was no significant difference in the mortality rates (RD = 0.000; 95% CI -0.030 to 0.030). In centres with experience in pancreatic surgery, resection may be considered for locally advanced cancer and major venous invasion. Pancreaticoduodenectomy with vascular resection may improve survival for periampullary tumours compared with palliation therapy. However, pancreaticoduodenectomy with major venous resection has potentially higher morbidity than palliation therapy. Key Words: Pancreatoduodenectomy, Pancreatic neoplasms, Vascular surgical procedures.\n",
    "title": "Pancreatoduodenectomy with Venous Resection or Palliative Therapy? A Meta-Analysis."
}
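The mapping from the raw record to the extracted format above can be sketched as follows. The pipeline used BeautifulSoup; this sketch swaps in the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies. The element paths are inferred from the field names in the raw sample, and unlike the sample output the sketch strips surrounding whitespace from the abstract.

```python
import json
import xml.etree.ElementTree as ET


def extract_record(article):
    """Map one PubmedArticle element to a pmid/abstract/title record."""
    pmid = article.findtext("MedlineCitation/PMID")
    title = article.findtext("MedlineCitation/Article/ArticleTitle")
    # An abstract may be split into several AbstractText sections.
    parts = article.findall("MedlineCitation/Article/Abstract/AbstractText")
    abstract = "\n".join("".join(p.itertext()).strip() for p in parts)
    return {"pmid": int(pmid), "abstract": abstract, "title": title}


def xml_to_jsonl(xml_text):
    """Return one JSON line per article found in a PubMed XML document."""
    root = ET.fromstring(xml_text)
    return [json.dumps(extract_record(a)) for a in root.iter("PubmedArticle")]
```

Each returned string is one line of the jsonl output, so writing the file is a matter of joining the list with newlines.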

Phil Papers

Papers from the PhilPapers database, a comprehensive index and bibliography of philosophy research maintained by the Center for Digital Philosophy at the University of Western Ontario.

Download and Extraction: The original PDF files were downloaded from https://philarchive.org/oai.pl. All available PDFs were downloaded. Each PDF was converted to text using java -jar ../philpapers_resources/src/pdfbox-app-2.0.21.jar ExtractText {f0} {FOUT.name}. After conversion to text, the language of each document was detected and recorded using the langdetect library.

Filters Applied:

Hyphenation Removal: "end-of" becomes "end of"
Newline Filtering: "This is\na sentence." becomes "This is a sentence."
Header/Footer Filtering: "(c) 2023 Company Name." is removed
Double Whitespace Filtering: "This is  a test." (two consecutive spaces) becomes "This is a test."
Mean Line Length Check: removes paragraphs with an average line length below 2.0
CID Percentage Filter: removes LaTeX-heavy paragraphs that contain over 10% "CID" font artifacts
Letterness Filter: discards paragraphs with a low proportion of letters
Removing Leading/Trailing Numbers: removes numbers at the start or end of paragraphs. "1 This is a sentence." becomes "This is a sentence."
Fixing Unicode Issues: fixes Unicode issues.
Combining Diacritics Correction: a' becomes å
Unigram Log Probability: The document must have an average unigram log probability above -20.
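Several of the text-cleaning steps in this list can be sketched with the standard library. This is an illustrative reading of a subset of the rules, not the pipeline's code: all function names are hypothetical, the exact definitions of "letterness" and "CID percentage" are assumptions, and NFC normalization is one plausible way to correct combining diacritics.

```python
import re
import unicodedata


def filter_newlines(text):
    """Join lines broken inside a paragraph; blank-line breaks are kept."""
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


def filter_double_whitespace(text):
    """Collapse runs of spaces into a single space."""
    return re.sub(r" {2,}", " ", text)


def strip_edge_numbers(paragraph):
    """Drop numbers (such as page or footnote numbers) at either end."""
    return re.sub(r"^\s*\d+\s+|\s+\d+\s*$", "", paragraph)


def mean_line_length(paragraph):
    """Average number of characters per line."""
    lines = paragraph.splitlines() or [""]
    return sum(len(line) for line in lines) / len(lines)


def letterness(paragraph):
    """Share of characters that are letters."""
    return sum(c.isalpha() for c in paragraph) / max(len(paragraph), 1)


def cid_share(paragraph):
    """Share of whitespace-separated tokens containing a '(cid:N)' artifact."""
    tokens = paragraph.split()
    hits = sum(1 for t in tokens if re.search(r"\(cid:\d+\)", t))
    return hits / max(len(tokens), 1)


def fix_diacritics(text):
    """Compose a base letter and a combining mark into one code point (NFC)."""
    return unicodedata.normalize("NFC", text)
```

Under these definitions, a paragraph would be dropped when `mean_line_length` is below 2.0 or `cid_share` exceeds 0.10, matching the thresholds stated in the list.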

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
Phil Papers	49389	0.00%	0.00%	0.12%	0.00%	99.88%

Phil Papers Filtering Examples

Data sample: 2 of 9

{
"meta": {
"title": "A Trivialist's Travails",
"type": "info:eu-repo/semantics/article",
"creator": "Donaldson, Thomas",
"subject": "Philosophy",
"date": "2014",
"identifier": "oai:philarchive.org/rec/-1345",
"language": "en",
"description": null,
"datestamp": "2023-11-03T00:59:01Z"
},
"text": "A Trivialist's Travails1 Donaldson; DRAFT It's always a pleasure to read a small book with big ambitions. Agustín Rayo's manifesto, The Construction of Logical Space,2 is an outstanding example. In only a couple of hundred pages, Rayo presents a novel metaphysics and epistemology for mathematics and modality, a new defence of a Stalnakerian theory of the propositional attitudes, and some philosophy of language and reflections on philosophical methodology. In his introduction, Rayo states that he wants to 'explain how mathematical knowledge is possible' [pg. ix]. In this critical notice, I will focus on Rayo's epistemology of mathematics. There are two discussions of this in the book. The first is part of his discussion of 'trivialist Platonism' in the first four chapters; the second is in chapter 8, where he introduces a position that I'll call 'postulationism'.3 I'll discuss trivialist Platonism in §1 and §2. In §3 I'll explain Rayo's postulationism. I'll raise a problem for postulationism in §4, to which I'll offer a solution in §5. 1. Background: 'just is'-statements 1.1 Introducing the 'just is' operator Let's start by taking a look at Rayo's 'just is' operator. He introduced the operator in earlier work (Rayo [2009]) but his new book contains a much more thorough discussion of its uses. Here's one of his examples, slightly adapted [pg. 3]: To be composed of water just is to be composed of H2O. This can be paraphrased, 'there is no difference between being composed of water and being composed of H2O', or 'the property of being composed of water is identical with the property of being composed of H2O'.4 'Just is' can also be used to identify relations with various arities: For x to be taller than y just is for y to be shorter than x. For x to be the child of y and z just is for y and z to be the parents of x. 
1 I would like to thank Zeynep Soysal, Jonathan Schaffer, Tobias Wilsch, Ernie Lepore, Jennifer Wang, Cian Dorr, and Andy Egan for their comments on drafts of this paper. I would also like to express my gratitude to Agustín Rayo for discussing his book with me. Most of all I would like to thank Brian Weatherson, who was the best of all possible PhD advisors. 2 Rayo [2013]; throughout the rest of this paper, whenever I refer to Rayo's work it is this book that I have in mind, unless I specify otherwise. 3 This is not Rayo's term. I've taken the word from Potter [2004, pg. 10]. 4 Rayo would add that he accepts this latter paraphrase only on the condition that the property-talk is 'understood in a suitably deflationary way' [pg. 68]. 2 The following exemplifies a rather different use of the expression 'just is': For it to be the case that Chicago is in Illinois just is for it to be the case that Illinois contains Chicago. In this case, it's tempting to say that the 'just is' operator is being used to assert an identity between propositions. But since questions about the nature of propositions are so difficult and highly contested, it is understandable that Rayo is cautious about this way of putting it [pg. 66]. Instead, he prefers to paraphrase this 'just is'-statement like so [pg. 52]: 'Chicago is in Illinois' has the same truth-condition as 'Illinois contains Chicago'. As we will see in §2, one of Rayo's claims is that true statements in pure mathematics have 'trivial truth-conditions', in the sense that 'nothing is required of the world in order for the truth-conditions of a mathematical truth to be satisfied' [pg. 98]. 
Rayo uses his 'just is' operator to clarify this assertion: Rayo's claim is that if φ is a true purely mathematical sentence, the truth-condition of φ is the same as the truth-condition of '∀x x=x'; that is, the following is true: ⌜For it to be the case that φ just is for it to be the case that ∀x x=x.⌝ Of course, the choice of '∀x x=x' in this definition is rather arbitrary; one could also use (for example) any truth of the form ⌜φ→φ⌝.5 1.2 'Just is'-statements and necessity According to Rayo, any set S of 'just is'-statements defines a 'conception of logical space'. Another set of statements T describes a possible scenario, relative to this conception, just in case T is consistent with S. A statement φ is necessary, relative to this conception, just in case S entails φ. For example, if you accept that to be composed of H2O just is to be composed of water, you thereby adopt a conception of logical space relative to which the sentence 'This raindrop is composed of water, but not H2O' does not describe a possible scenario. 5 Rayo's definition of 'trivial' is on pg. 53. I've changed the definition slightly, but it's easy to see that my definition is equivalent to Rayo's. 3 Let's look at a mathematical example. Consider:6 STRONG NUMBERS For it to be the case #x φ = n just is for it to be the case that ∃!n x φ. WEAK NUMBERS #x φ = n ↔ ∃!n x φ Here, the operator #x ... x ... means the number of things x such that ... x ... . For example, '#x Dog(x)' refers to the number of dogs, while '#x (cat(x)∧black(x))' refers to the number of black cats. ⌜∃!n x φ⌝ means there exist exactly n things x such that φ, where this is defined recursively in the usual way. Now if one accepts STRONG NUMBERS, one thereby adopts a conception of modal space according to which WEAK NUMBERS is necessarily true. 1.3 The epistemology of 'just is' Rayo argues that when deciding which 'just is'-statements to accept, we should use a certain sort of cost-benefit analysis. 
His example is this statement of the kinetic theory of heat: For a gas to be hot just is for it to have high mean kinetic energy.7 The benefit of accepting this 'just is'-statement is that one thereby avoids having to answer certain questions, which might otherwise be problematic: [Suppose you accept this 'just is'-statement. Then] you should think there is no need to answer the following question. 'I can see that the gas is hot. But why does it also have high mean kinetic energy?' You should think, in particular, that the question rests on a false presupposition. It presupposes that there is a gap between the gas's being hot and its having high mean kinetic energy – a gap that should be plugged with a bit of theory. But to accept the 'just is'-statement is to think that the gap is illusory. There is no need to explain how the gas's being hot might be correlated with its having high mean kinetic energy because there is no difference between the two... [Pg. 18] 6 Strictly speaking, WEAK NUMBERS and STRONG NUMBERS are schemas; for convenience, I will talk as though they are statements. What I call 'STRONG NUMBERS', Rayo calls simply 'NUMBERS'. 7 As Rayo points out [pg. 18, fn. 9] this statement of the kinetic theory is 'baldly inaccurate'. This doesn't matter: it's just an example. 4 More generally, Rayo claims that anyone who accepts a 'just is'-statement is consequently exempted from having to explain the corresponding universally quantified biconditional. He calls this 'why closure' [section 2.2.4]. But accepting a 'just is'-statement comes with a cost too. By accepting the above 'just is'-statement, one adopts a conception of modal space relative to which there are no possible scenarios at which a gas is hot but does not have high mean kinetic energy, and no possible scenarios at which a gas has high mean kinetic energy but is not hot. Rayo comments, 'having extra scenarios to work with can ... 
prove advantageous, since it makes room for additional theoretical positions, some of which could deliver fruitful theorizing' [pg. 19]. In order to make our final decision about whether to accept this 'just is'-statement, we should weigh the cost and the benefit against each other: There is no quick-and-easy criterion for determining whether the extra theoretical space is fruitful enough to justify paying the price of having to answer a new range of potentially problematic questions. The only reasonable way to proceed is to roll up one's sleeves and do some metaphysics. [Pg. 19] We'll see this style of cost-benefit analysis at work in §2, when we'll look at Rayo's defence of STRONG NUMBERS. 1.4 Objectivism and subjectivism Rayo comments: [T]he decision to adopt a particular conception should be guided by its ability to combine with the rest of one's theorizing to deliver a fruitful tool for scientific or philosophical inquiry. But fruitfulness is a goal-relative notion: a theoretical apparatus that constitutes a fruitful way of pursuing one set of goals may not be a fruitful way of pursuing another. So one might end up with a situation in which one has grounds for accepting a particular conception of logical space relative to one set of goals, and a different conception relative to another. [Pg. 57] Let's suppose that conception C1 is optimal relative to one set of goals that you have, and conception C2 is optimal relative to some other set of goals that you have. What should your attitude be towards these two conceptions? 5 Here, we need to distinguish the 'objectivist' from the 'subjectivist' versions of Rayo's account of necessity and possibility.8 According to the objectivist, there is one privileged, 'objectively correct' conception of logical space. So at most one of C1 and C2 can be 'objectively correct', even if you are unable to choose between them. 
The subjectivist on the other hand rejects the claim that there is an 'objectively correct' conception; for the subjectivist, asking whether it is C1 or C2 which is 'objectively correct' is like asking whether it is chess or draughts which has the 'objectively correct' rules [pg. 58]. In earlier work, Rayo committed himself to the subjectivist position (Rayo [2009, pg. 255]), but in the current book he is more cautious. He challenges the objectivist with the question, 'What does it mean to say that a conception of logical space is objectively correct?' 'The most straightforward answer,' he goes on, 'would be to say that for a conception of logical space to be objectively correct is for the 'just is'-statements it is based on to be objectively true' [pg. 57]. But Rayo argues that it is difficult to understand what it means to say that a 'just is'-statement is objectively true. So Rayo's current position is this. He prefers subjectivism, because he doesn't know how to make sense of the idea that one conception of logical space is 'objectively correct'. However, he leaves open the possibility that someone in the future will find a way of making sense of this idea, thereby vindicating objectivism. So far, what I have said has been purely expository. However, to finish the section I will make a point of my own. I will argue that, however difficult it may be to make sense of the idea that some 'just is'-statements are objectively true, even a subjectivist should admit that some 'just is'-statements are objectively false. I'll start by introducing an ugly technical term (my own, not Rayo's): Given a 'just is'-statement, its 'demodalisation' is the result of replacing the 'just is' operator with a universally quantified biconditional. So for example, the demodalisation of (a) is (a*): (a) To be composed of H2O just is to be composed of water. (a*) Everything that is composed of water is also composed of H2O and vice versa. 
I presume that every 'just is'-statement entails its demodalisation; e.g. (a) entails (a*). Now imagine someone who accepts the following 'just is'-statement: (b) To be a bird just is to be a winged creature. 8 Rayo uses the term 'objectivist' on pg. 57. He does not use the term 'subjectivist', but it's the obvious term to use. The demodalisation of (b) is: (b*) Every bird is a winged creature, and every winged creature is a bird. Now (b*) is false, and presumably objectively false.9 Since (b) entails (b*), (b) must be objectively false too. This point will be important later, when we turn to Rayo's views about mathematics. Recall these two statements: STRONG NUMBERS For it to be the case #x φ = n just is for it to be the case that ∃!n x φ. WEAK NUMBERS #x φ = n ↔ ∃!n x φ As I will explain in §2, Rayo accepts STRONG NUMBERS. So he puts himself at odds with nominalists – those who think that numbers don't exist. Nominalists will say that WEAK NUMBERS is false (objectively false, if you like) from which it follows that STRONG NUMBERS is (objectively) false too. If Rayo's defence of STRONG NUMBERS is to be successful, he must establish that this nominalist position is mistaken.10 2. Trivialist Platonism 2.1 Introducing trivialist Platonism Chapters 3 and 4 of Rayo's book are largely devoted to the defence of what he calls 'trivialist Platonism'. Rayo's 'Platonism' is 'the claim that mathematical objects exist' [pg. viii]. 'Trivialism' is rather harder to define. Rayo himself defines trivialism as 'the view that the truths of pure mathematics have trivial truth-conditions, and the falsities of pure mathematics have trivial falsity-conditions' [pg. 74] (see §1.1 of this paper for an explanation of the term 'trivial'). 9 The adverb 'objectively' is far from the clearest term in the philosophers' lexicon, but I take it that on any reasonable construal of this term, (b*) is 'objectively' false. 
10 For an interesting discussion of the role of subjectivism in Rayo's epistemology of mathematics, see Burgess [forthcoming]. 7 However, Rayo makes it clear that being a trivialist also involves accepting STRONG NUMBERS [pg. 35]: STRONG NUMBERS For it to be the case #x φ = n just is for it to be the case that ∃!n x φ. For our purposes, it will suffice to define 'trivialist' in a rather open ended way by saying that the trivialist is committed to the view that a range of mathematical sentences have trivial truth-conditions, including the truths of pure mathematics, and WEAK NUMBERS.11 Rayo begins his discussion of trivialist Platonism with a cost-benefit analysis, designed to show that we should accept that WEAK NUMBERS and the truths of pure mathematics have trivial truth-conditions. I will consider this cost-benefit analysis in detail in §2.2 and §2.3. He then deals with some other issues: • It follows from STRONG NUMBERS that, for example, the truth-condition of '#x Planet(x) = 8' is the same as the truth-condition of '∃!8 x Planet(x)'. But what should the trivialist say about the truth conditions of more complicated statements about numbers, such as this? 2 + #x WelshCity(x) = #x Planet(x) In order to answer this question, in section 3.3 Rayo develops a 'compositional specification of truth-conditions for arithmetical sentences that assigns to each sentence in the language of arithmetic the truth-conditions that a trivialist thinks it should have' [pg. 76]. • In section 3.4, Rayo extends his compositional semantics to cover set-theoretic vocabulary. • Finally, in chapter four, Rayo offers an account of 'cognitive accomplishment in logic and mathematics'. Rayo adopts a Stalnakerian12 account of the propositional attitudes, according to which the objects of belief are sets of possible worlds. On this view, anyone who believes the proposition that 7+5=12 also believes every theorem of set theory. 
It would seem that anyone who maintains this position will have a hard time explaining what mathematical learning consists in. Rayo's goal in chapter 4 is to deal with this problem. My goal in the rest of §2 is to criticise Rayo's cost-benefit analysis, which he uses to defend his trivialism. 11 Thanks are due to an anonymous referee at Philosophia Mathematica for pointing out that Rayo's definition of 'trivialist' doesn't quite capture his intention. 12 See Stalnaker [1984]. 8 2.2 Rayo's cost-benefit analysis As I said, the trivialist thinks that a number of mathematical statements (including WEAK NUMBERS and all purely mathematical truths) have trivial truth-conditions. When Rayo defends trivialism using cost-benefit analysis, he uses WEAK NUMBERS as an example. Since I intend to criticise the cost-benefit analysis in some detail, I'll quote the relevant passage in full: [I]t is natural to think that the costs of accepting [STRONG NUMBERS] are far outweighed by the benefits. For by accepting [STRONG NUMBERS], one eliminates the need to answer questions such as the following: I can see that there are no dinosaurs. What I want to know is whether it is also true that the number of the dinosaurs is Zero. And I would like to understand how one could ever be justified in taking a stand on the issue, given that we have no causal access to the purported realm of abstract objects. There is no need to explain how the non-existence of dinosaurs might be correlated with dinosaurs' having Zero as a number because there is no difference between the two: for the number of the dinosaurs to be Zero just is for there to be no dinosaurs. It is true that there is also a cost. By accepting [STRONG NUMBERS] one loses access to a certain amount of theoretical space, since one is no longer in a position to work with scenarios in which there are no numbers. 
But it seems to me that this is not much of a price to pay, since the availability of such scenarios is not very likely to lead to fruitful theorizing. The upshot is that there is significant theoretical pressure to accept [STRONG NUMBERS] – at least provided that it can be used to construct a viable philosophy of mathematics. [Pg. 74] 13 Let's unpack this argument, which is stated rather quickly. Rayo identifies two benefits and one cost of accepting STRONG NUMBERS. The first benefit of accepting STRONG NUMBERS is epistemological: by accepting STRONG NUMBERS, the trivialist can deal with the question of 'how one could ever be justified in taking a stand on the issue [of whether #x Dinosaur(x) = 0].' 13 There is an almost identical discussion on pg. 22. The second benefit is that for someone who accepts STRONG NUMBERS, '[t]here is no need to explain how the non-existence of dinosaurs might be correlated with dinosaurs' having Zero as a number'. I take it that the correlation in question here is something like this: Dinosaur Correlation For any time t, the number of dinosaurs at t is Zero just in case there are no dinosaurs at t. This is an instance of 'why closure' (see §1.3 of this paper). Rayo also mentions a cost associated with accepting STRONG NUMBERS. The cost is that 'one loses access to a certain amount of theoretical space, since one is no longer in a position to work with scenarios in which there are no numbers'. Rayo comments that this cost is small, since 'the availability of such scenarios is not very likely to lead to fruitful theorizing'. 2.3 A critique of Rayo's cost-benefit analysis Before getting into the details, it will be helpful to establish some names for the alternative positions. I'll use the term 'nominalism' for the view that there are no mathematical objects, and so no numbers. According to the nominalist, WEAK NUMBERS is false. Platonists on the other hand believe that numbers exist. I'll distinguish two varieties of Platonist. 
Trivialist Platonists accept STRONG NUMBERS; non-trivialist Platonists accept WEAK NUMBERS but not STRONG NUMBERS.14 The non-trivialist Platonist agrees that WEAK NUMBERS is true, but adopts a conception of logical space relative to which there are some possible scenarios at which WEAK NUMBERS is false – perhaps these are scenarios at which there are no numbers. Now let's take a closer look at Rayo's cost-benefit analysis. My conclusion will be that while Rayo's analysis may indeed show that trivialist Platonism is preferable to non-trivialist Platonism, it does not show that Platonism itself is true. The first of the two benefits that Rayo claims for trivialist Platonism is epistemological. Suppose that a sentence S is true, but not trivially true. That is, suppose that S is true, but there is a possible scenario at which S is false. According to Rayo's epistemology, to be justified in accepting S, one must rule out those possible scenarios at which S is false [pg. 36-8]. One is justified for example in accepting 'John is in Paris' only once one has ruled out scenarios in which he is in Rome, scenarios in which he is in Omaha, and so on. 14 This categorization is not exhaustive: there are Platonist positions according to which even WEAK NUMBERS is false. We need not consider such positions, however. Now suppose that one adopts a conception of logical space according to which WEAK NUMBERS is non-trivially true. Then to know that this sentence is true one has to rule out possible scenarios at which it is false. But since we have 'no causal access to the purported realm of abstract objects' [pg. 22] it is hard to see how we could ever rule out such scenarios. So non-trivialist Platonism is an unstable position: if the non-trivialist Platonist is right that WEAK NUMBERS is not trivial, she cannot justify her claim that it is true. This is Rayo's version of the 'access' or 'Benacerraf problem'.15 Rayo argues that the trivialist Platonist doesn't face this problem. 
For her, there are no possible scenarios at which WEAK NUMBERS is false, and so knowing that this sentence is true doesn't involve ruling out any scenarios ([pg. 98]; see also [pg. 22] and [pg. 74]). I am inclined to agree with Rayo that trivialist Platonism is preferable to non-trivialist Platonism for this epistemological reason. At least, I am happy to concede the point. However, I don't think that there is an argument here for preferring Platonism to nominalism. To deal with an epistemological objection to Platonism is not to provide a positive argument for the Platonist position. Someone who was initially agnostic between the trivialist Platonist, non-trivialist Platonist, and nominalist positions might be convinced by this epistemological argument that she should take non-trivialist Platonism off the table. But she has been given no reason to reject nominalism.16 Now let's consider the second putative benefit of trivialist Platonism: Rayo's claim is that it is an advantage of the trivialist Platonist position that (because of 'why closure') she doesn't have to explain the following correlation: Dinosaur Correlation For any time t, the number of dinosaurs at t is zero just in case there are no dinosaurs at t. Once again, it seems to me that however well this works as a defence of the trivialist version of Platonism over the non-trivialist version, this point provides no justification for Platonism itself. It is perhaps a negative feature of non-trivialist Platonism that its proponent is stuck with the difficult task of explaining Dinosaur Correlation. However, nominalism does not share this negative feature: the nominalist, after all, doesn't accept Dinosaur Correlation and so is under no pressure to offer an explanation for it. So the nominalist's position is not threatened by this point. 15 See Benacerraf [1973]. 
16 We could distinguish trivialist from non-trivialist versions of nominalism: the trivialist thinks that it is necessary that there are no numbers; the non-trivialist thinks that it is contingently true that there are no numbers. A variant on Rayo's epistemological argument against non-trivialist Platonism could be used to attack non-trivialist nominalism. But it seems that trivialist nominalism (like trivialist Platonism) survives such arguments. 11 Finally, let's look at Rayo's assessment of the cost of trivialist Platonism. Rayo claims that the disadvantage of the trivialist position is that by accepting it 'one loses access to ... scenarios in which there are no numbers'; he comments, 'this is not much of a price to pay, since the availability of such scenarios is not very likely to lead to fruitful theorizing' [pg. 74]. A nominalist might reasonably accuse Rayo of begging the question at this point. According to the nominalist, the actual scenario is one at which there are no numbers, and so to 'lose access' to such scenarios would be a huge theoretical cost. My conclusion, to repeat, is that while Rayo's cost-benefit analysis may establish that the trivialist version of Platonism is preferable to the non-trivialist version, it does not establish Platonism itself. On its own, this point is hardly devastating – for Rayo can appeal to his postulationism in defence of Platonism. But as we shall see, postulationism has its own problems. To finish, it's worth pointing out a further limitation of Rayo's cost-benefit analysis: Rayo has not yet provided us with a complete account of what justifies us in accepting particular purely mathematical statements – hence, he has not yet completely explained 'how mathematical knowledge is possible' [pg. ix]. Imagine, to pick an example more or less at random, someone who is unsure about whether the axiom of choice is true. 
Rayo's cost-benefit analysis plausibly establishes that the axiom is trivially true if it is true at all, but that doesn't tell us whether it is true. For the trivialist, if the axiom is true, then in order to establish this fact one does not need to 'go to the world to check whether any requirements have been met' [pg. 98]; on the contrary, it suffices to show that the axiom has trivial truth-conditions. But how is one to do that? Rayo is aware of this issue: It is important to keep in mind that getting clear about the truth-conditions of a given mathematical sentence can be highly non-trivial, so determining whether [a given purely mathematical sentence] is true is not, in general, a trivial affair – more on this later. [Pg. 98] The 'later' that Rayo mentions here must be his discussion of postulationism. So let's take a look at that. 3. Rayo's postulationism The final chapter of The Construction of Logical Space is titled, 'Introducing Mathematical Vocabulary'. Rayo begins by stating the goals of the chapter, and these stated goals are modest. He tells the reader that '[a] familiar way of introducing mathematical vocabulary is by linguistic stipulation'; he promises an account of how such stipulations work, including a 'sufficient condition for successful stipulation' [pg. 180]. However, later on it becomes clear that Rayo's goals in the chapter are at least partly epistemological.17 So I will suppose that this chapter forms part of Rayo's attempt to 'explain how mathematical knowledge is possible' [pg. ix]. I'll explain some of the details presently; for now, here's the gist. One can learn mathematical truths by deducing them from stipulative definitions: both explicit definitions, and implicit definitions (which are more usually called 'axioms'). 
For example, one might learn number theory by stipulating that the Peano axioms are implicit definitions of 'number', 'plus', 'times', 'successor' and 'zero', and then deducing theorems from these axioms, introducing additional terms by explicit definition as necessary. Two comments. First, Rayo does not make the mistaken claim that this is how actual mathematical research advances.18 I suppose that Rayo's claim is, more modestly, that in principle one could acquire mathematical knowledge in this way. Second, in the past philosophers who have defended the view that mathematical theorems are entailed by definitions have also defended the claim that these theorems are true a priori. In earlier work, Rayo himself made this claim (Rayo [2008]). In the current book, however, Rayo is more cautious.19 He does not commit himself to the view that purely mathematical claims are a priori; at the same time, he does not at any point suggest that empirical data might be needed when using the method described. So we should keep in mind the question, 'Can one, using Rayo's method, achieve a priori knowledge in pure mathematics?' Now, the idea that one might learn mathematical truths by deducing them from explicit and implicit definitions is hardly original to Rayo. It has been defended, for example, by Reichenbach and (arguably) by Carnap.20 What's novel in Rayo's book is a defence of this position against a certain now standard objection, which I will presently explain. It's interesting to note that in earlier work, Rayo himself used a version of this objection when criticising a position rather like postulationism.21 Let's take a look at the objection. 17 See in particular pg. 185, where Rayo indicates that he's working on an 'account of mathematical knowledge'. 18 The claim would be mistaken because there have been many cases in the history of mathematics in which theorems were established before adequate definitions of the relevant terms were given. 
For example, much was known about continuity before Bolzano gave the first rigorous definition of 'continuous' in 1817. As Lakatos put it, we should reject the claim that 'the logic of discovery is deduction' [Lakatos 1976, pg. 143]. 19 See in particular pg. 185, where Rayo is conspicuously non-committal on this question. 20 Reichenbach [1924]; Carnap [1934/1937]. 21 See Rayo [2003], which is a critique of Crispin Wright's neofregeanism (for which see Hale and Wright [2001]). See section 3.2 of Rayo's book for his current take on neofregeanism. 13 Suppose that someone introduces the name 'Goliath' stipulatively using this sentence as a definition: Goliath is a horse at least 10cm taller than any other horse. It is obvious that this definition will succeed (that is, become true) only if there exists a horse at least 10cm taller than any other. If such a horse does not exist, the definition will 'fail': the term 'Goliath' will be an empty name, and the definition will be either false or truth-value-less. We could say that the 'success condition' of the definition is that there exists a horse at least 10cm taller than any other. More generally, the success condition of a definition is the condition that needs to be met in order for the definition to succeed. It seems that in order to know that one's definition is true, one must know independently that its success condition is met. To see what I mean by 'independently', imagine meeting someone who claims to know that the definition of 'Goliath' is true; you ask her if she knows that its success condition is met, and she says that she does; she tells you that the success condition of the definition is that there exists a horse at least 10cm taller than any other, and it is analytic that this condition is met, because this follows from the definition of 'Goliath'. This is surely absurd. One can't measure horses a priori. Now let's apply this in the case of mathematics. 
Let's imagine an agent – 'Clare', say – who sets out to learn some mathematics using Rayo's method. We'll suppose that Clare defines some primitive number-theoretic vocabulary using some axioms as implicit definitions, perhaps the Peano axioms.22 What is the success condition of Clare's definition? Here's a natural line of thought. A definition will succeed only if there exist suitable referents for any newly introduced singular terms – for example, the success of the definition of 'Goliath' was contingent on the existence of a suitably tall horse to act as referent for the name. We can suppose that Clare's new singular terms include: 0, 0′, 0′′, 0′′′, 0′′′′, ... Now it's not quite clear what some things would have to be like in order to be 'suitable' referents for these terms, but it does seem that no two of these singular terms should be assigned the same referent. And so it seems that the definition can succeed only if there exist infinitely many things. And so, apparently, the success condition of Clare's definition is at least that there exist infinitely many things. 22 For simplicity, I assume that she uses a multi-sorted language, with a new style of variables and new quantifiers to range over the natural numbers. Also, I suppose that her axioms contain no vocabulary other than logical vocabulary and the new number-theoretic vocabulary, and that only the new number-theoretic quantifiers are used in the theory. So in order to know that her definition has succeeded, Clare must establish independently that there exist infinitely many things. In order to know a priori that the definition has succeeded, Clare would have to establish independently and a priori that there exist infinitely many things. And it is not clear how Clare could do this. Now it might be replied that Clare can establish independently and a priori that infinitely many things exist, because this is an implication of set theory (or some other mathematical theory), which is a priori. 
This is not a very helpful response, for it leaves us with the task of explaining how knowledge (a priori or otherwise) in this other branch of mathematics is possible.23 It would not do to reply that knowledge in this other branch of mathematics can be achieved by deducing theorems from implicit and explicit definitions – that would be the first step in an unending regress. Alternatively, it might be replied that Clare can defend the claim that infinitely many things exist by showing empirically that there exist infinitely many physical things – say, by showing that there exist infinitely many spacetime-points. I will return to this idea in §5, but prima facie this approach is rather unattractive. First, if Rayo takes this line he will be left without a defence of the claim that pure mathematical knowledge is a priori. Second, it's really not clear that we are justified in believing that there are infinitely many physical things: so in taking this line, Rayo would make the success of his epistemology of mathematics dependent on a difficult empirical claim, which would require its own defence. This is the 'infinity problem'. Let's take a look at Rayo's response. According to Rayo, the infinity problem results from a faulty understanding of how definitions work. For Rayo, the function of a definition is not to assign referents to the newly introduced terms; rather, the function of a definition is to assign truth conditions (thought of as sets of possible worlds) to whole sentences containing the new terms. A definition will succeed, Rayo claims, provided that a suitable assignment of truth conditions to the newly introduced sentences exists. So Rayo must address the question of what it takes for an assignment of truth conditions to be 'suitable'. Let's assume that L is the set of sentences in Clare's 'old' language, sentences that do not contain the new terms, and that L+ is the set of sentences in Clare's new, extended language. 
We can assume that there already exists an assignment J of truth conditions to the sentences in L; we want to know what a 'suitable' assignment J+ of truth conditions to the sentences in L+ would have to look like. Rayo supposes that any assignment which meets four conditions will be suitable. 23 In addition, if Clare already knows set theory, she can introduce her number-theoretic vocabulary just using explicit definition. In this case, implicit definition is otiose. Here's the first condition:24 (C1) If δ is one of the new definitions, J+(δ) is the set of all possible worlds. Now this might not be an appropriate condition for all definitions. For example, if one defines 'Jack the Ripper' using the sentence, 'Jack the Ripper is the man who murdered five women in Whitechapel in 1888', one would presumably not intend one's definition to be necessarily true. But in our case, I think condition (C1) is appropriate. We can assume that Clare intends her axioms to be true at all possible worlds. The second condition is also straightforward:25 (C2) For any φ in L, J(φ)=J+(φ). This condition is motivated by the thought that, in introducing her new terms, Clare does not intend to change the meanings of any of her existing sentences. She is extending her language, without changing those parts of the language which already exist. Next: (C3) J+ may be a partial function, but it must assign a truth condition to all sentences that Clare 'wishes to make available for use.'26 For example, Clare may have no use for the sentence '2=Julius Caesar',27 in which case J+ need not assign any truth condition to this peculiar sentence. However, any sentence that Clare intends to use should have a truth condition. Finally:28 (C4) J+ should 'respect logical consequence' in the following sense. If Γ is a subset of L+, and J+ is defined at each element of Γ, and possible world w ∈ J+(γ) for each γ ∈ Γ, and if φ is a logical consequence of Γ, then J+ is defined at φ and w ∈ J+(φ). 
It is very plausible that these four conditions are necessary for the success of Clare's definition. What is less clear is whether they are jointly sufficient. I'll return to this question later. 24 This is condition (a) on [pg. 183]. 25 Rayo does not state this condition explicitly, but it is clearly assumed throughout. This is especially clear in the parenthetical remark at the end of the antepenultimate paragraph of section 8.2.3. 26 This is condition 2 from [pg. 181]. 27 This is an allusion to Frege [1884/1974: pg. 68]. 28 This is condition 3, on [pg. 181]. (I've adapted Rayo's condition slightly, but as far as I can tell the change makes no relevant difference). 16 Rayo shows that an interpretation meeting these four conditions will exist provided that Clare's axioms are 'internally coherent' [pg. 183].29 So Rayo concludes that the success condition of Clare's definition is no stronger than the claim that her axioms are internally coherent. Plausibly, this is something that Clare can confirm a priori, and so the infinity problem is apparently solved. The resulting position is reminiscent of Hilbert: if the arbitrarily given axioms do not contradict one another with all their consequences, then they are true and the things defined by the axioms exist. This is for me the criterion of truth and existence.30 So according to Rayo, when Clare stipulates that her axioms are to be necessary truths, her stipulation will succeed provided that the axioms are internally coherent. Clare can know that her axioms are true provided she can establish that they are coherent. And she can know a priori that her axioms are true provided that she is able to establish a priori that they are coherent. She does not, however, need any independent way of showing that there exist enough objects to serve as referents for her newly introduced terms. One feature of this position deserves particular emphasis. 
Rayo thinks that in order to come to know that her axioms are true, Clare must first show that they are coherent. Rayo might, on the contrary, have adopted the position that as long as her axioms are in fact coherent, she will be justified in thinking that they are true without having first to check that they are coherent, at least in the absence of a defeater. According to this latter position, her axioms are 'innocent until proven guilty', as it were. I call this 'the externalist approach'. I think that Rayo was right to reject the externalist approach, and I'd like to finish the section with an (all too brief!) explanation of this point.31 Yitang Zhang recently proved that there exist infinitely many pairs of prime numbers that differ by no more than seventy million32 – a major step forward in the search for a proof of the twin primes hypothesis. Now according to the externalist approach, mathematicians could have established the truth of this statement decades ago, by just including it as one of the axioms of number theory. This is surely wrong: mathematical knowledge is not so easily acquired. So it seems that Rayo is correct to insist that one must establish that one's axioms are coherent before one knows them to be true.33 29 I'll sometimes omit 'internally' for stylistic reasons. 30 This is from a letter to Frege, 29th December 1899. See McGuinness and Kaal [1980: pg. 39]. 31 Rayo himself skillfully criticizes the externalist approach (not under this name) in Rayo [2003]. 32 See Zhang (Forthcoming). Actually, Zhang's theorem is rather stronger than the result stated. 33 See Ebert and Shapiro [2009] for a similar argument in a difference context. Also see Boghossian [2003]. 17 4. But what is coherence? As we've seen, the following thesis is crucial to Rayo's epistemology: The Rayovian Thesis As long as Clare's axioms are coherent, there will exist a 'suitable' interpretation for her newly extended language. 
There are several different ways of understanding the term 'coherent'.34 We will see that on one natural interpretation on the term, The Rayovian Thesis is just false. There are other interpretations of the term on which The Rayovian Thesis seems to be true, but these interpretations raise other problems for Rayo's view, as we will see. I will proceed by looking a number of different interpretations of the term 'coherent' in turn. Option One: Coherence as proof-theoretic consistency First, let's consider the suggestion that some sentences are coherent just in case they are 'consistent' in the sense of proof theory: a set of sentences Γ is 'consistent' (in this proof-theoretic sense) just in case it is not the case that there exists a proof of ⊥ from premises drawn from Γ.35 Will this interpretation of 'coherent' work out for Rayo? It won't: it turns out that the Rayovian thesis is false on this interpretation of 'coherent'. Let 1,...,N⊢T mean ' is provable from premises 1,..., N in formal system T' and suppose that the property of being a proof in T is effectively decidable. Then assuming that the second-order Peano axioms ('PA2') is consistent in T, there will exist a formula φ(x) in the language of second order arithmetic in which only x occurs free, such that: (i) PA2 ⊢T φ(n), for each numeral n. (ii) PA2 ⊬T ∀x φ(x). Now consider the set PA* = PA2∪{¬∀xφ(x)}. By (ii), PA*⊬T⊥, and so PA* is consistent in T, and hence 'coherent' in the current sense. Nevertheless, PA* doesn't have a suitable interpretation. For suppose 34 There are also several ways of interpreting the term 'logical consequence' in condition (C4). It is clear from the argument on pg. 185 that Rayo's intention is that a set of sentences is coherent just in case ⊥ is not a logical consequence of the set. I will make this assumption in what follows. 
35 Strictly speaking, of course, there are many different proof-theoretic notions of consistency, corresponding to different formal deductive systems. 18 that J+ meets conditions (C1)-(C4). Then by (C1), J+(¬∀xφ(x)) is the set of all possible worlds, so by (C4), J+(∀xφ(x))=∅. But by (i), and conditions (C1) and (C4), J+(φ(n)) is the set of all possible worlds, for each numeral n. But this violates what ought to be a fifth constraint on suitability: J+(⌜∀x φ(x)⌝) = J+(φ(0)) ∩ J+(φ(0′)) ∩ J+(φ(0′′)) ∩ J+(φ(0′′′)) ∩ ... J+ fails to respect the intended range of the number-theoretic quantifiers. So it is false that whenever a set of axioms is consistent in T, the set has a suitable interpretation. Option Two: Coherence as consistency in the informal sense Mathematicians frequently ask questions of the form 'Is such-and-such provable?' or 'Is such-and-such provable without such-and-such assumption?' without having any particular formal system of proof in mind. When a mathematician does this, it would be gratuitous to assume that her question must be understood as making some tacit reference to a particular formal system. So there's a notion of provability-let's call it 'informal provability'-that isn't captured by the proof-theoretic definitions. Now there may be some formal system T such that provability in T coincides with informal provability, but if so this is an important and interesting feature of system T, not a triviality. Having introduced the notion of informal provability, we can say that a set of sentences Γ is consistent in the informal sense just in case a contradiction is not provable, in the informal sense, from Γ. I will now consider the suggestion that Rayo's 'coherence' is consistency in the informal sense. Consider again The Rayovian Thesis: The Rayovian Thesis As long as Clare's axioms are coherent, there will exist a 'suitable' interpretation for her newly extended language. 
I showed a moment ago that this thesis is false if 'coherent' is understood as meaning 'consistent in T' for some formal system T that meets certain reasonable conditions. The same argument establishes that the thesis is false if 'coherent' means 'consistent in the informal sense' and informal provability coincides with provability in some such formal system. So in order to maintain The Rayovian Thesis on this interpretation, Rayo will have to commit himself to the view that informal provability outstrips provability in any formal system. This is by no means absurd: but it is a substantial theoretical commitment, which would require defence. 19 Option Three: Coherence as Model-Theoretic Satisfiability A set of sentences Γ is 'satisfiable' in the model-theoretic sense just in case there is some model M (for the relevant language) such that every element of Γ is true at M. Perhaps Rayo's 'internally coherent' means satisfiable? On this interpretation of 'coherent', the Rayovian thesis seems very plausible. But this interpretation of Rayo's position gives rise to another problem. Consider Clare again – and suppose that her axioms are the Peano axioms. Rayo's position, as I've said, is that Clare will only come to know that her axioms are true if she first establishes that they internally coherent. On the current interpretation, this means that in order for Clare to know that the Peano axioms are true she must first establish that they have a model. But the claim that the Peano axioms have a model is already a substantial mathematical claim, and we're left with the task of explaining how Clare could establish it. It won't do to say that Clare could derive this model-theoretic claim from implicit and explicit definitions – that would be the first step of an unending regress. 
The problem is particularly acute because showing that a model exists for the Peano axioms would involve showing that there exist infinitely many things – so this approach just reintroduces the infinity problem in a new form. Rayo might reply: My goal in chapter 8 was not to describe a method for achieving mathematical knowledge ab initio: more modestly, I was attempting to describe a method for learning one mathematical theory, given pre-existing knowledge of some other mathematical theory or theories. If this is the extent of Rayo's ambition in the chapter, then I have no objection to the position he describes. However, on this reading, Rayo fails 'to explain how mathematical knowledge is possible' [pg ix] which is one of the stated goals of the book. To achieve this goal, it does not suffice to explain how one can extend a pre-existing body of mathematical knowledge. An analogy might help. Suppose someone offers the following as an explanation of the possibility of mathematical knowledge: One can obtain mathematical knowledge by deduction: one simply deduces new results from statements already known. While it is true that one can extend one's mathematical knowledge by deduction, this is hardly a satisfactory explanation of 'how mathematical knowledge is possible'. 20 Option Four: Coherence as Informal Satisfiability Suppose you are asked to establish that the following theory is coherent: x Rxx xyz[(Rxy  Ryz) → Rxz] You might respond by offering an 'interpretation' on which both sentences in the theory are true. For example, you might point out that both sentences are true relative to an interpretation on which the quantifiers range over people, and 'R' means is at least as old as. Notice that in giving this interpretation you do not commit yourself to the existence of a set of people, or indeed any set. So this interpretation is not a model in the normal sense. 
The term 'interpretation' does not have a formal definition, but informally the idea is this: to specify an interpretation for a language, you have to identify some objects for the quantifiers to range over, you have to choose referents for the singular terms, and you have to say which objects satisfy the various predicates in the language. We can say that some sentences are 'informally satisfiable' if there is an interpretation on which all of the sentences are true.36 Now let's consider a version of Rayo's position on which 'internal coherence' is informal satisfiability. On this version of the view, in order for Clare to establish that her definition has succeeded, she would have first to establish that there is an interpretation on which her axioms are all true. It's not at all clear how Clare could do this. Suppose, for example, that her axioms are the Peano axioms. Then specifying an interpretation for the axioms would involve identifying infinitely many objects – to serve as referents for the singular terms, to form a domain of quantification, and to form extensions for the predicates. So, on this view, if Clare is to establish that her axioms are true she must first show that there exist infinitely many things. And if she is to establish a priori that her axioms are true, she must first show a priori that there exist infinitely many things. This introduces a new version of the infinity problem. If Clare already has sufficient mathematical knowledge, she may be able to use this knowledge to establish that her new axioms, the Peano axioms, are informally satisfiable. For example, if she already knows ZFC set theory she can interpret the Peano axioms in the universe of sets. But then again we are stuck with the question of how Clare could establish this pre-existing body of mathematical knowledge. Once again, we are threatened with a regress. 36 Informal satisfiability is a close relative of Kriesel's 'informal validity': see Kreisel [1972] and Smith [2011]. 
S1,...,Sn are informally satisfiable just in case (S1 ∧ ... ∧ Sn) is not informally valid. 21 Alternatively, it might be said that Clare can establish that her axioms are informally satisfiable by interpreting them in some well establish physical theory. For example, she could 'identify' the natural numbers with an -sequence of spacetime points. I will return to this idea in §5, but prima facie this proposal is unsatisfactory. It's not clear that we are justified in believing that there exist infinitely many physical things, so if Rayo takes this approach he makes his epistemology of mathematics dependent on a difficult empirical claim, which would require independent defence. Second, even if the approach is workable in the case of Peano arithmetic, it seems unlikely to work for richer theories, such as second-order Zermelo-Fraenkel set theory. It also worth noting that if Rayo takes this line, he will be left without a defence of the claim that mathematical knowledge is a priori. Option Five: Primitivism about logical necessity Hartry Field37 has recently been defending the view that the concept of logical necessity (represented with a box '') should be used as a primitive: that is, we should continue to use the term, but we should not attempt to define it. Now given this primitive concept, we can introduce a concept of coherence (at least for finite sets of sentences) using the following schematic definition: {S1, ..., Sn} is coherent iff (S1 ∧ ... ∧ Sn) I don't have much to say about this proposal, except that if Rayo takes this approach he is left with two substantial tasks to carry out. First, he would have to defend the Rayovian thesis understood in this new way. We've seen that this thesis is false on one way of understanding the term 'coherent'; it's not obvious that the thesis is true if 'coherent' is defined as suggested above using a primitive notion of logical necessity. 
Second, Rayo would have to explain how Clare could establish that (say) the Peano axioms are 'coherent' in this sense, without drawing upon some pre-existing body of mathematical knowledge. I do not claim that these tasks are impossible – perhaps Rayo could make this view work – but I do claim that further theorizing is required. 5. Towards a solution to the coherence problem. I've now explained my criticisms of Rayo's account as it stands. I'd like to finish, very briefly, by recommending a way forward for Rayo. 37 See for example Field [2008: pp. 47-8]. See Smith [2011] for criticisms of Field's argument. 22 Rayo's thinks that the following method is knowledge-producing: Method 1: Suppose that some set of mathematical terms are entirely defined by definitions D1,...,Dn, and that no other non-logical terms occur in D1,...,Dn. Then if you can show that D1,...,Dn are coherent, you may infer that they are all true.38 In the last section, I discussed various interpretations of 'coherent'. I now suggest that Rayo identify coherence with informal satisfiability (that's 'option four'). The problem with Rayo's method, on this interpretation, is that it is not clear how someone not already equipped with a good deal of mathematics could establish that her definitions are coherent. I suggest that Rayo should respond to this difficulty by specifying further methods of acquiring logicomathematical knowledge. Each such method would no doubt require considerable discussion, but for now I will do no more than provide a rough-and-ready list of suggestions. Given the informal semantic account of 'coherence', one can establish that a set of sentences is consistent by defining an interpretation relative to which they are all true; hence: Method 2: Suppose that S is a set of sentences, and that T is a theory you know to be true. Then if you can interpret S in T, you may infer that S is coherent. The term 'interpret' here is meant to be understood as it is used in proof-theory. 
For example, someone who already knew the axioms of ZF to be true might establish that the Peano axioms are coherent by 'interpreting' those axioms in ZF (by identifying the natural numbers with, say, the finite von Neumann ordinals). Another example: using Method 2 one might (or might not!) be able to establish that the axioms of Peano arithmetic are coherent by interpreting them within an empirically well-confirmed theory of space-time, by identifying the natural numbers with an ω-sequence of spacetime points. Next: Method 3: Suppose that S is a set of sentences. Then if you can deduce ⊥ from S, you may conclude that S is incoherent. For example, Russell showed that the axioms presented in Frege's Grundgesetze are incoherent by deriving ⊥ from them. Next: Method 4: Suppose that you try, and repeatedly fail, to deduce ⊥ from a set S. Then with repeated attempts you accumulate evidence that S is in fact coherent. In using both Method 3 and Method 4, one is in effect using consistency in the informal sense as a guide to informal satisfiability. Next, a Quinean proposal: Method 5: If a mathematical theory T plays an indispensable role in an empirically well-confirmed scientific theory, one has good reason for believing that T is true.39 38 In order to deal with cases of mathematical theories which entail that there exist only a limited number of things, we should add here as an extra condition that the quantifiers in D1,...,Dn are appropriately restricted. For example, the quantifiers in one's theory of the natural numbers might be restricted so that they range over only the numbers. 39 See Colyvan [2001] for an interpretation and defence of Quine's views about indispensability. See also Putnam [1972] for another version. 23 The next method is inspired by the work of Charles Parsons:40 Method 6: One can establish truths about types of strings of symbols by experimenting (both on paper and in imagination) with token strings of symbols. 
One can then establish the coherence of Peano arithmetic by interpreting it within one's theory of string-types – for example by 'identifying' the natural numbers with the following sequence of strings: 1, 11, 111, 1111, 11111, ... I have one more method to add: Method 7: Changes to one's overall system of logico-mathematical beliefs can be justified on grounds of simplicity, elegance etc..41 As I say, each of these methods require a great deal of discussion. But my very tentative suggestion is that these methods together could be used to achieve a good deal of mathematical knowledge. This is, at least, a position which Rayo might like to explore. A couple of consequences of the view require mention. First, notice that on this view even rather basic mathematical knowledge might turn out not to be a priori. Second, notice that on this view there is no priority of logical knowledge to mathematical knowledge. 6. Summary of Conclusions • In his discussions of 'trivialist Platonism', Rayo provides a response to the 'Benacerraf problem' or 'access problem' for Platonists. However, he does not in these discussions provide any convincing positive argument for Platonism. • Rayo provides a convincing defence of the claim that this method of belief-formation produces knowledge: Suppose that some set of mathematical terms are entirely defined by definitions D1,...,Dn, and that no other non-logical terms occur in D1,...,Dn. Then if you can show that D1,...,Dn are coherent, you may infer that they are all true. • However, it is by no means clear how someone who doesn't already have a good deal of mathematical knowledge could establish that (say) the axioms of Peano arithmetic are coherent. • I suggest that Rayo should respond to the problem just described by specifying other methods for acquiring logico-mathematical knowledge. 40 See in particular Parsons [1979] and Parsons [2009]. Thanks are due to Sofia Ortiz and Michael Friedman for suggesting that this method be included. 
41 A great deal of Penelope Maddy's work could be read as a demonstration of the power of 'Method 7'. See for example her famous papers Maddy [1988a] and [1988b]. 24 References Benacerraf [1973] 'Mathematical Truth', Journal of Philosophy (70) 19:661-79. Boghossian [2003] 'Blind reasoning', Aristotelian Society Supplementary (77) 1:225–248. Burgess [forthcoming] Review of The Construction of Logical Space, in Critica. Carnap [1934/1937]. The Logical Syntax of Language, trans. A. Smeaton. London: Kegan Paul. Colyvan [2001] The Indispensability of Mathematics. Oxford: Oxford University Press. Ebert and Shapiro [2009] 'The Good, the Bad and the Ugly', Synthese (170) 3:415-441. Field [2008] Saving Truth From Paradox. Oxford: Oxford University Press. Frege [1884/1974] The Foundations of Arithmetic, trans. J. Austin. Oxford: Blackwell. Frege [1893] Grundgesetze der Arithmetik, Band I. Jena: Verlag Herman Pohle. Frege [1903] Grundgesetze der Arithmetik, Band II. Jena: Verlag Herman Pohle. Hale and Wright [2001] The Reason's Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics. Oxford: Oxford University Press. Kreisel [1972] 'Informal Rigour and Completeness Proofs', in Imre Lakatos (ed.), Problems in the Philosophy of Mathematics, pp. 138-86. Amsterdam: North-Holland. Lakatos [1976] Proofs and Refutations. Cambridge: Cambridge University Press. Maddy [1988a] 'Believing the Axioms I,' Journal of Symbolic Logic (53) 2:481-511. Maddy [1988b] 'Believing the Axioms II,' Journal of Symbolic Logic (53) 3:736-764 McGuinness (Editor) and Hans Kaal (Translator) [1980] Philosophical and Mathematical Correspondence of Gottlob Frege. Chicago: University of Chicago Press. Parsons [1979] 'Mathematical Intuition', Proceedings of the Aristotelian Society, (80): 145-68. 25 Parsons [2009] Mathematical Thought and its Objects. Cambridge: Cambridge University Press. Potter [2004] Set Theory and Its Philosophy. Oxford: Clarendon Press. Putnam [1972] The Philosophy of Logic. 
London: Allen and Unwin. Reichenbach [1924] Axiomatik der Relativischen Raum-Zeir-Lehre. Braunschweig: Vieweg. Rayo [2003] 'Success by Default', Philosophia Mathematica 11 (3):305-322. Rayo [2008] 'On Specifying Truth Conditions', Philosophical Review 117 (3): 385-443. Rayo [2009] 'Towards a Trivialist Account of Mathematics', in Bueno and Linnebo (eds.), New Waves in Philosophy of Mathematics, pp. 239-60. New York: Palgrave-Macmillan. Rayo [2013] The Construction of Logical Space. Oxford: Oxford University Press. Russ [1980] 'A Translation of Bolzano's Paper on the Intermediate Value Theorem', Historia Mathematica (7) 2: 156-185. Smith [2011] 'Squeezing Arguments,' in Analysis (71) 1: 22-30. Stalnaker [1984] Inquiry. Cambridge: Cambridge University Press. Zhang (Forthcoming) 'Bounded Gaps Between Primes,' Annals of Mathematics."
}

EuroParl

A collection of multilingual parallel corpora of parliamentary debates from the European Parliament. This is a high-quality legacy dataset that was previously used for machine translation tasks.

Download and Extraction: The original dataset was downloaded from http://www.statmt.org/europarl/v7/europarl.tgz. The files were converted to JSONL (one JSON object per line) for filtering.
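The conversion step can be pictured with a minimal sketch like the one below. It is an illustration only, not the pipeline's actual code: the function name `txt_files_to_jsonl`, the `*.txt` glob pattern, and the `meta.source_file` field are assumptions made for this example, while the `text` key mirrors the field shown in the data samples.

```python
import json
from pathlib import Path


def txt_files_to_jsonl(input_dir, output_path):
    """Write one JSON object per input text file to a JSONL file.

    Hypothetical helper: the glob pattern and the "meta" field are
    assumptions for illustration, not the original pipeline's schema.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        # Sort the paths so the output order is deterministic.
        for path in sorted(Path(input_dir).rglob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="replace")
            record = {"text": text, "meta": {"source_file": path.name}}
            # ensure_ascii=False keeps non-ASCII characters readable.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Keeping one document per line means later filters can stream the file and drop records without loading the whole corpus into memory.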

Filters Applied: EuroParl was initially filtered during the download process: documents with fewer than 200 characters were removed. The documents also contained HTML-style tags, which were removed.

"""Raw single line in data: Hi I am speaker After tag removal: P Hi I am speaker We remove everything that starts with ["P", "BRK", "CHAPTER", "/P"] and only keep tagname == SPEAKER because line starting with <SPEAKER> TEXT TEXT ....... has the relevant text""" def process_tag(original_tag): tag = original_tag.strip(">").strip("<") # Skip empty tags if not tag: return None tagname = tag.split()[0] # Skip paragraph, break, and chapter tags if tagname in ["P", "BRK", "CHAPTER", "/P"]: return None # For speaker tags, return the name if tagname == "SPEAKER": soup = bs4.BeautifulSoup(original_tag, "html.parser") name = soup.speaker["name"] return name # Raise a error here if there is a tag we don't know raise ValueError(f"Unknown tag {tag}")

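To show how the two filters described above might fit together, here is a minimal, self-contained sketch. It is not the pipeline's code: `clean_document`, `filter_documents`, and the regular expressions are hypothetical stand-ins that use only the standard library, and applying the 200-character threshold after tag removal is an assumption, since the text does not state the order.

```python
import re

MIN_CHARS = 200  # minimum document length stated in the text

# Hypothetical stand-in for a tag handler: capture NAME="..." in a SPEAKER tag.
SPEAKER_RE = re.compile(r'<SPEAKER\b[^>]*\bNAME="([^"]*)"[^>]*>', re.IGNORECASE)
# Any remaining tag, such as <P>, <CHAPTER ID=1>, or </P>.
TAG_RE = re.compile(r"<[^>]+>")


def clean_document(raw_text):
    """Replace SPEAKER tags with the speaker's name and drop all other tags."""
    cleaned_lines = []
    for line in raw_text.splitlines():
        line = SPEAKER_RE.sub(lambda m: m.group(1), line)
        line = TAG_RE.sub("", line).strip()
        if line:  # skip lines that held nothing but a tag
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


def filter_documents(raw_documents, min_chars=MIN_CHARS):
    """Yield cleaned documents that contain at least min_chars characters."""
    for raw in raw_documents:
        text = clean_document(raw)
        # Assumption: length is measured after tags are stripped.
        if len(text) >= min_chars:
            yield text
```

Under these assumptions, a short document such as a lone speaker line is dropped, while a debate transcript keeps its speaker names as plain lines, which matches the shape of the sample below.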
Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
EuroParl	69814	0.00%	0.00%	0.00%	1.00%	99.00%

EuroParl Filtering Examples

Europarl

Data sample: 0 of 9

{
    "text": "Associeringsavtal mellan Europeiska unionen och centralamerikanska stater - Associeringsavtal mellan Europeiska unionen och Andinska gemenskapen (debatt)\nTalmannen\nNästa punkt är en gemensam debatt om\nbetänkandet av Willy Meyer Pleite, för utskottet för utrikesfrågor, med ett förslag till Europaparlamentets rekommendation till rådet om förhandlingsmandatet för ett associeringsavtal mellan Europeiska unionen och dess medlemsstater å ena sidan och de centralamerikanska staterna å andra sidan, och\nbetänkandet av Luis Yañez-Barnuevo García, för utskottet för utrikesfrågor, med ett förslag till Europaparlamentets rekommendation till rådet om förhandlingsmandatet för ett associeringsavtal mellan Europeiska unionen och dess medlemsstater å ena sidan och Andinska gemenskapen och dess medlemsstater å andra sidan.\nLuis Yañez-Barnuevo García \nföredragande. - (ES) Herr talman! Vid det fjärde toppmötet mellan EU:s, Latinamerikas och Västindiens stats- och regeringschefer förra våren i Wien gavs klartecken för att inleda förhandlingar om ett strategiskt associeringsavtal mellan EU och Andinska gemenskapen. I mitt betänkande föreslår jag ett trettiotal rekommendationer som rådet och kommissionen bör ta hänsyn till när de utarbetar förhandlingsdirektiven. Vi vill att denna associering ska vara ambitiös, bred och omfattande, i linje med associeringen med såväl Mercosur som Centralamerika, eftersom vi anser att detta är ett strategiskt krav för båda regionerna.\nMed hänsyn till Latinamerikas och EU:s historia, språk, kultur, värderingar och gemensamma syn på världen, samt deras stöd till multilateralism och FN-systemet, måste de bli strategiska allierade i en globaliserad värld. Detta gäller särskilt för de andinska länderna där det på sina håll förekommer extrem fattigdom och där kontinentens största ojämlikheter återfinns.\nAvtalet måste vila på tre pelare: en politisk och institutionell pelare, en samarbetspelare och en handelspelare. 
Inom området för politik och säkerhet bör vi inrätta en freds- och säkerhetsstadga för EU och Andinska gemenskapen, föra en ständig politisk dialog, främja demokratisk kvalitet, social sammanhållning, stöd till god förvaltning, fattigdomsminskning, utbytesverksamhet, terroristbekämpning, konfliktförebyggande arbete och samordning i fråga om reformen av FN, samt civil och militär krishantering.\nDen andra pelaren är att främja hållbar mänsklig utveckling och ett progressivt ökat tillträde för andinska produkter till de europeiska marknaderna, på konkurrenskraftiga villkor, med hänsyn till de enorma ekonomiska skillnaderna och graden av integration mellan EU och Andinska gemenskapen, vilket kommer att kräva en översyn av den gemensamma jordbrukspolitiken och EU:s subventioner.\nDen tredje pelaren är handeln i sig, men till skillnad från andra modeller med tredjeländer, som de andinska ländernas avtal med Förenta staterna, får det inte handla om frihandelsavtal i strikt bemärkelse, eller enbart frihandelsavtal, utan hänsyn måste tas till den enorma klyftan mellan de båda regionerna. Utan ekonomiska åtgärder för att skapa stöd, samarbete och finansiering, kommer en rent kommersiellt politik inte att kunna bidra till utvecklingen.\nArbetstagarrättigheter, särskilt för urbefolkningar och stamfolk, skydd av värdiga arbetsförhållanden, icke-diskriminering och jämställdhet mellan kvinnor och män på arbetsmarknaden samt avskaffande av barnarbete måste ingå i avtalet. 
Vi måste också särskilt betona vikten av europeiska investeringar som en avgörande faktor för utvecklingen i dessa länder, liksom behovet av att europeiska företag tillämpar samma normer för arbetsförhållandena som de tillämpar i europeiska länder.\nInvandring, som fenomen och som källa till möjligheter, måste ingå i avtalet, med skydd för invandrares rättigheter, och valutaöverföringar måste bli enklare, billigare, öppnare och säkrare.\nKapitlet om miljön, som måste få en framträdande plats i avtalet, måste omfatta utarbetandet av en gemensam politik som syftar till att åstadkomma energibesparingar, diversifiering, främjande av alternativa och förnybara energikällor och minskning av förorenande utsläpp, i enlighet med den strategi som antogs vid Europeiska rådets senaste möte.\nSammanfattningsvis anser jag att vårt mål måste vara att kunna ingå detta ambitiösa strategiska associeringsavtal mellan Europeiska unionen och dess medlemsstater å ena sidan och Andinska gemenskapen och dess medlemsstater å andra sidan vid det femte toppmötet mellan EU, Latinamerika och Västindien i Lima år 2008.\nWilly Meyer Pleite \nföredragande. - (ES) Herr talman! Det är uppenbart för alla att denna debatt som ska hållas i parlamentet äger rum vid en viktig tidpunkt för Latinamerika i allmänhet. Den äger rum vid en tidpunkt då Latinamerikas befolkning kraftfullt verkar ifrågasätta den politik som har gjort dem fattiga. De ifrågasätter nu den nyliberala formeln för politiken. President George Bush besök i Latinamerika är ett talande bevis för detta.\nNär det gäller associeringsavtalet med Centralamerika står EU i historisk skuld till den regionen. 
Vi spelade en mycket viktigt roll under 1980-talet i Centralamerikas freds- och demokratiseringsprocess - San Joséavtalen, Esquipulasavtalen - där EU distanserade sig från Förenta staterna, intog en självständig ståndpunkt och spelade en avgörande roll.\nLäget i Centralamerika i dag är mycket tydligt: den ekonomiska tillväxten är mycket låg - för närvarande 0,6 procent - fattigdomen är ungefär lika stor som under 1990-talet och orättvisorna ökar.\nFredsavtal har ännu inte godkänts. Detsamma gäller i fråga om mänskliga rättigheter, straffrihet och korruption samtidigt som den regionala integrationen fortfarande är mycket låg.\nI detta sammanhang valde jag en viss typ av betänkande för att fastställa vilket slags associering vi ville ha. Jag byggde det på tre grundläggande pelare: politisk dialog för att skapa god förvaltning, utvecklingssamarbete för att bidra till att undanröja de strukturella orsakerna till fattigdom och orättvisor samt handel under rättvisa och ömsesidigt fördelaktiga förhållanden som bygger på att man kompletterar varandra och visar solidaritet. Ett avtal som syftar till att skapa regional integration för att bidra till en balanserad och rättvis omfördelning av Centralamerikas inkomster och välstånd. Detta var sammanhanget. Vi ville ha ett avtal som inte förvandlas till ett avtal om ett frihandelsområde och om privatisering av offentliga tjänster. 
Vi ville kort sagt inte att politisk dialog och samarbete skulle dränkas i frihandelsformler.\nJag är övertygad om att ett avtal med en uttalad nyliberal prägel mellan ojämlika regioner - ojämlika i ordets alla bemärkelser - bara skulle spä på denna ojämlikhet och bana väg för näringslivselitens exploatering, som skulle leda till en ännu större cirkel av beroende, utestängning, fattigdom och extremt höga sociala och miljömässiga kostnader.\nJag anser att handel och samarbete måste inriktas på hållbar utveckling på regional nivå, och komma folket till godo, i stället för på en rad projekt som gynnar det transnationella kapitalet, i likhet med Puebla-Panama-planen eller Europeiska investeringsbanken.\nDet var i den avsikten jag utarbetade mitt betänkande, i samarbete med många civila samhällsorganisationer från EU och Centralamerika. Parlamentets utskott för utveckling och utskott för internationell handel har därefter naturligtvis yttrat sig över betänkandet. Jag vill förstås tacka er alla för era bidrag som har förbättrat texten i fråga om den inställning jag ville upprätthålla under hela denna process.\nJag vill särskilt tacka Miguel Ángel Martínez för hans alltid lika rättvisa och samarbetsvilliga bidrag, i det här fallet från utskottet för utveckling. I yttrandet från utskottet för internationell handel har Gianluca Susta lagt fram några mycket genomgripande ändringsförslag till texten, som verkligen snedvrider det betänkande jag hade för avsikt att lägga fram för parlamentet.\nMin verkliga avsikt var att skapa ett balanserat betänkande utifrån de tre pelare jag beskrev tidigare, men i praktiken skapade ändringarna som helhet ett dokument som i grund och botten eftersträvade att ett frihandelsområde skulle inrättas.\nI det avseendet hade jag för avsikt att försöka tona ned den strategin så mycket som möjligt. Jag talar om strategin som går ut på att ge Centralamerika intryck av att vi i EU i första hand vill skapa ett frihandelsområde. 
Vi kom överens om sju kompromissändringar tillsammans med Ignacio Salafranca från gruppen för Europeiska folkpartiet (kristdemokrater) och Europademokrater, Raimon Obiols i Germà från socialdemokratiska gruppen i Europaparlamentet och Gianluca Susta från gruppen Alliansen liberaler och demokrater för Europa, och jag vill än en gång tacka dem varmt för deras ansträngningar för att nå en överenskommelse om hur betänkandet kunde tonas ned och inte förstöras.\nJag vill naturligtvis också tacka Raimon Obiols i Germà och Véronique De Keyser från socialdemokratiska gruppen i Europaparlamentet och Raül Romeva i Rueda från gruppen De gröna/Europeiska fria alliansen för deras ändringsförslag, som förbättrade och utvecklade denna önskan om att göra mandatet till ett tydligt mandat för ett associeringsavtal som inte omfattar ett frihandelsområde.\nDetta har till viss del varit bra, eftersom vi - som jag sa - har lyckats tona ned så viktiga punkter som v), där det uttryckligen rekommenderas att frihandelsområdet ska vara ett prioriterat strategiskt mål och det hänvisas till Cafta, och vi har förstås kunnat tona ned det, men inte tillräckligt mycket.\nJag vet inte om detta har inträffat tidigare, men jag kommer att rekommendera min grupp att avstå från att rösta om detta betänkande, eftersom jag inte anser att det har nått fram till mitt mål, vilket var att utarbeta ett balanserat betänkande.\nI varje fall tycker jag att det ska bli mycket intressant att höra åsikterna från det centralamerikanska parlamentet, Parlacen, och även från de politiska organisationerna i Centralamerika, och min förhoppning är att Europeiska kommissionen när förhandlingarna inleds kommer att hålla i minnet att det Centralamerika frågar efter inte är en kopia av Förenta staternas ståndpunkt, utan en motsatt, annorlunda och självständigt ståndpunkt.\nPeter Mandelson\nledamot av kommissionen. (EN) Herr talman! 
Jag vill inleda med att också på min kollega Benita Ferrero-Waldners vägnar välkomna det synnerligen förtjänstfulla arbete som de båda föredragandena har utfört samt den konstruktiva analys och de positiva synpunkter som utskottet för utrikesfrågor, utskottet för utveckling och utskottet för internationell handel har framfört i anslutning till de olika aspekterna av och utsikterna för kommande avtal med dessa regioner.\nAtt ingå associeringsavtal med Centralamerika och Andinska gemenskapen är ett långsiktigt strategiskt mål för de båda regionerna och har getts upprepat stöd av stats- och regeringscheferna vid toppmötena i Guadalajara och Wien.\nGenom att förhandla om dessa avtal visar EU sitt engagemang i regionen och sin beslutsamhet att stärka förbindelserna med samtliga latinamerikanska länder. Europa och Latinamerika är naturliga partner, och starkare band med Centralamerika och Andinska gemenskapen kommer att innebära att grunden läggs för ett såväl politiskt som ekonomiskt förstärkt partnerskap.\nAvtalsförhandlingarna kommer att föras regionsvis i syfte att ge ytterligare stimulans åt de regionala integrationsprocesserna i både Centralamerika och inom Andinska gemenskapen. Såsom också Europaparlamentet upprepade gånger framhållit är den regionala integrationen nyckeln till politisk och social stabilitet. Integrationen kommer dessutom att innebära att regionerna inlemmas mer effektivt i världsekonomin genom att ekonomierna blir allt större och stabilare och därmed kan locka till sig investeringar. 
Det är trots detta viktigt att avliva myten om att EU försöker ”pådyvla” någon sin egen modell, eftersom den fortsatta regionala integrationen är en angelägenhet för de enskilda regionerna utifrån de egna målsättningarna och den egna agendan.\nAssocieringsavtalen är tänkta att vara övergripande avtal och innefatta hela skalan av EU:s mångfacetterade förbindelser med de båda regionerna, det vill säga såväl politisk dialog som samarbete och handel.\nRespekten för och främjandet av de demokratiska principerna, de grundläggande mänskliga rättigheterna, rättsstatsprincipen och goda styrelseformer kommer fortsatt att vara grundbulten i våra förbindelser med Centralamerika och Andinska gemenskapen. Kommissionen anser dessutom att associeringsavtalen bör medverka till att särskild uppmärksamhet ägnas åt att effektivt genomföra internationellt överenskomna standarder på området för mänskliga rättigheter, liksom på det sociala området och området för grundläggande arbetsnormer samt på miljöområdet i syfte att främja en hållbar utveckling.\nNär det gäller den politiska dialogen kommer avtalen att medföra att ett brett spektrum av ämnen tas upp, exempelvis klimatförändringen, energi, migration och narkotikabekämpning. Dessa är livsviktiga för såväl våra regioner som hela vår planet. En fördjupad dialog med Centralamerika och Andinska gemenskapen syftar till att få till stånd ett konstruktivt engagemang i en effektiv multilateralism och en internationell förvaltning, som innebär att vi kan möta 2000-talets utmaningar.\nDet politiska kapitlet i associeringsavtalen kommer att följas upp av åtgärder som syftar till att på ett välavvägt och rättvist sätt främja biregional handel och biregionala investeringar. Detta bör ske genom en stegvis och ömsesidig avreglering av handeln med varor och tjänster och genom inrättandet av ett rättvist och öppet regelverk. Asymmetrierna mellan våra regioner bör också beaktas. 
Handelsdelen av avtalet kommer att stå helt i överensstämmelse med Världshandelsorganisationens (WTO) bestämmelser och åtaganden men ändå vara mer långtgående än de grundläggande bestämmelserna i syfte att vinna största möjliga ömsesidiga och långsiktiga fördelar av den biregionala avregleringen av handeln.\nSamarbetet mellan de båda sidorna ska vara ordentligt förankrat i de globala mål och principer som ingår i vår utvecklingspolitik, exempelvis Europeiskt samförstånd för utveckling, liksom i de internationella avtal där vi är part, bland annat millennieutvecklingsmålen och Parisdeklarationen om biståndseffektivitet. Den sociala sammanhållningen kommer att prioriteras. Samarbetskapitlet bör återspegla viljan att arbeta tillsammans och att utbyta erfarenheter. Kapitlet bör även återspegla solidariteten med de fattigaste och mest marginaliserade människorna.\nJag vill avsluta med en sammanfattning av förhandlingsförberedelserna. Utkasten till förhandlingsdirektiv antogs av kommissionen den 6 december 2006 och diskuteras för närvarande med medlemsstaterna. Det är kommissionens förhoppning att förhandlingsdirektiven ska antas och att förhandlingar med de två latinamerikanska underregionerna om möjligt ska inledas redan under det första halvåret i år. Om vi lyckas hålla den ambitiösa tidsplanen kommer vi till stor del att ha ert stöd att tacka för detta, liksom er beslutsamhet att stärka förbindelserna mellan EU och Latinamerika och i första hand med dessa båda regioner.\nMiguel Angel Martínez Martínez \nföredragande för yttrandet från utskottet för utveckling. - (ES) Herr talman! Det betänkande som Willy Meyer Pleite ursprungligen lade fram för oss om associeringsavtalet mellan Europeiska unionen och länderna i Centralamerika utgjorde grunden för det yttrande vi utarbetade i utskottet för utveckling. Vi höll i mycket generella drag med om hans förslag och vi kom också överens om en rad rekommendationer från utskottet för utveckling i fråga om dessa. 
Willy Meyer Pleite visade sig vara mycket lyhörd och vi undertecknade tillsammans sju ändringsförslag som införlivade de särskilda frågeställningarna från utskottet för utveckling.\nJag måste påpeka att jag tycker att den text som har lagts fram inför kammaren är mycket blekt jämfört med de ursprungliga förslagen. Dessa har omarbetats i en i stort sett nyliberal anda, vilket kanske avspeglar åsikterna hos majoriteten i parlamentet.\nSanningen är att vi kan leva med dessa texter, tack vare kompromisserna. Vi kommer att rösta för dem, men vi kommer inte att rösta med någon entusiasm, eftersom de inte uppfyller Centralamerikas behov, eller ambitionerna hos folken i Centralamerika, och dessutom kommer denna text inte att förbättra EU:s anseende i dessa samhällen.\nAv de sju ändringsförslag som lades fram av utskottet för utveckling har tre godkänts. I dessa ändringsförslag betonas att associeringsavtalet mellan EU och Centralamerika måste omfatta utvecklingssamarbete och därför ta upp de prioriteringar som fastställts i EU:s samsyn på samarbete, som kommissionsledamoten sa: utrota fattigdomen och uppfylla millennieutvecklingsmålen. Som ett resultat av dessa erkännanden och den vikt vi lägger vid att detta avtal utarbetas, innehåller den text som vi kommer att rösta om just det minimum som krävs för att vi ska stödja den.\nMałgorzata Handzlik \nHerr talman, herr kommissionsledamot! Jag tackar föredraganden för ett omfattande och balanserat betänkande som är oerhört viktigt i dagens värld. Det är ett viktigt tecken och ett stöd för förhandlingarna om associeringsavtalet mellan EU och Andinska gemenskapen i ett avgörande skede med politiska och ekonomiska förändringar i regionen.\nAndinska gemenskapen är ett produktivt och sammanhängande system som integrerar enskilda latinamerikanska länder. Båda parterna - Europeiska unionen och Andinska gemenskapen - kommer att gynnas av de fördjupade ömsesidiga politiska och ekonomiska förbindelserna. 
Förhandlingsdirektiven till rådet är ett sammanhängande och omfattande dokument som innehåller alla delar som krävs för ett tillfredsställande samarbete. Föredraganden lyfter fram den avgörande betydelsen av politisk dialog, främjande av hållbar utveckling, utbildning samt mänskliga rättigheter. Han betonar också vikten av att bekämpa narkotika, vapenhandel och organiserad brottslighet och poängterar att detta samarbete måste bygga på frihandel. Associeringsavtalet måste progressivt liberalisera handeln och utveckla de politiska förbindelserna samtidigt som det främjar demokrati och de sociala och kulturella rättigheter som är utmärkande för regionen.\nDet gläder mig att små och medelstora företags roll i associeringsprocessen har tagits med i förhandlingsdirektiven, något jag betonade i mitt yttrande för utskottet för internationella handel. Som vi alla vet är sektorn för små och medelstora företag en av de viktigaste källorna till ekonomisk tillväxt och har avgörande betydelse för levnadsstandard och minskad fattigdom. Därför anser jag att vi särskilt måste betona främjandet av denna sektor genom att göra det lättare för små och medelstora företag att få tillgång till lån, undanröja onödiga handelshinder och genomföra program för innovation och utveckling.\nGianluca Susta \nföredragande för yttrandet från utskottet för internationell handel. - (IT) Herr talman, mina damer och herrar! 
Jag ska inrikta mig på Meyerbetänkandet, som tar upp ett viktigt initiativ för EU, som måste börja betrakta Centralamerika som en möjlighet, uppmuntra handeln och gradvis minska tullhindren, men inte minska den fria rörligheten för personer, varor och tjänster, och därmed på bästa sätt utnyttja dessa länders specifika särdrag.\nDetta innebär ökat samarbete och ökad utveckling, skydd av social värdighet och personlig värdighet hos de svagaste medlemmarna i samhället och att gradvis öppna våra marknader, främst för dessa länders lokala jordbruksprodukter, som fortfarande utgör en stor andel av deras BNP.\nUtskottet för internationell handel har som vanligt utarbetat sitt bidrag i enlighet med sin sakkunskap, men en ökad konkurrenskraft i länderna i Centralamerika är tveklöst en förutsättning för att skapa politisk stabilitet i ett område som fortfarande lider av följderna av den våldsamma sammandrabbningen mellan de tyranniska institutionerna och de revolutionära krafterna för några år sedan, en sammandrabbning som krävde hundratusentals dödsoffer och som skakade detta geopolitiska område.\nBetänkandets kulturella och politiska strategi är därför positiv och jag anser inte att den har urvattnats av förslagen från utskottet för internationell handel. Dessutom har det faktum att innehållet i vissa av förhandlingsdirektiven har godkänts bidragit till att kombinera frågan om att inrätta ett frihandelsområde med de mer allmänna frågor som hänger samman med utvecklingen av demokrati i detta geopolitiska område.\nJosé Ignacio Salafranca Sánchez-Neyra\nför PPE-DE-gruppen. - (ES) Herr talman, herr kommissionsledamot, mina damer och herrar! 
De betänkanden som Willy Meyer Pleite och Luis Yañez-Barnuevo García har lagt fram är ett svar på parlamentets krav sedan länge att de andinska och centralamerikanska länderna också ska ha associeringsavtal, i likhet med dem som vi har med andra länder i regionen, och därmed dra nytta av de bästa och mest utvecklade instrument som EU använder i sina förbindelser med tredjeländer.\nDessa är tydligen inte de enda länderna med vilka EU förhandlar om associeringsavtal. Eftersom kommissionsledamoten med ansvar för handelsfrågor är närvarande i eftermiddag skulle jag vilja ta tillfället i akt att uppmana honom att anstränga sig särskilt när det gäller vissa förhandlingar som har pågått alltför länge, nämligen EU:s förhandlingar med Mercosur.\nJag kan förstå att det finns svårigheter i samband med dessa förhandlingar. De är naturligtvis inte helt och hållet avhängiga EU:s vilja, men jag anser att vi måste anstränga oss för att sätta fart på dem så att något blir gjort.\nHerr talman! Jag vill påpeka att man i första och andra generationens avtal mellan EU och de latinamerikanska länderna lade tonvikten på forskning och utveckling, i den tredje generationen lades tonvikten på den demokratiska klausulen och i denna fjärde generation av associeringsavtal läggs vikten på en gradvis och ömsesidig liberalisering av handeln.\nDetta innebär naturligtvis inte att de kommersiella aspekterna är de viktigaste. 
Som kommissionsledamoten sa för en stund sedan lägger denna associering grunden för förbindelserna när det gäller politisk dialog, respekt för de mänskliga rättigheterna, för demokratiska värderingar och för rättssäkerheten samt kampen mot korruptionen.\nDet är emellertid tydligt att vi inte kan nonchalera frihandelns betydelse, vilket är något som de centralamerikanska och andinska länderna vill ha, och i detta avseende är min enda rekommendation att denna ambitiösa tidsplan som kommissionsledamoten har tagit upp - med hänsyn till att kommissionen har godkänt förhandlingsdirektiven och att även parlamentet kommer att godkänna dem i morgon - kan omsättas i praktiken så snart som möjligt, för vi har redan väntat för länge på associeringsavtalen med dessa andinska och centralamerikanska länder, samma typ av avtal som vi har med Mexiko och Chile, som förresten har gett utmärkta resultat.\nRaimon Obiols i Germà\nför PSE-gruppen. - (ES) Herr talman! Vår grupp har försökt att nå samförstånd om betänkandena av Luis Yañez-Barnuevo García och Willy Meyer Pleite. Kompromissändringsförslag har eftersträvats eftersom vi anser att det är viktigt att skicka ett budskap till de aktuella latinamerikanska underregionerna att det som EU föreslår inte bara är ett frihandelsavtal, utan ett avtal med bredare räckvidd som i grunden tar hänsyn till politiska avtal och utvecklingssamarbete.\nOm jag har förstått den viktiga diskussionen i parlamentet om dessa två betänkanden rätt så verkar det som att man i PPE-DE:s yttrande lägger större vikt vid frihandelsaspekterna av dessa förhandlingar, samtidigt som andra, däribland vår PSE-grupp, fäster större vikt vid politiska avtal, solidaritet, stöd för demokratiska institutioner, kampen mot fattigdomen och kampen mot våldet.\nOm vi beaktar de faktiska förhållandena för de kommersiella förbindelserna mellan EU och t.ex. 
Centralamerika ser vi att EU:s handel med Centralamerika uppgår till omkring 0,3 procent av vår utrikeshandel och att, även i Centralamerika, handeln med EU inte uppgår till mer än 9 eller 10 procent av deras utrikeshandel.\nOm vi tillämpar den klassiska maximen primum vivere, deinde philosophare (lev först och främst, filosofera senare), kommer vi med hänsyn till situationen i dessa länder snart att dra slutsatsen att den viktigaste aspekten av våra förbindelser inte har att göra med handel i lika hög grad som med kampen mot fattigdomen, kampen mot bristen på säkerhet, kampen mot våldet och, i vissa länder, kampen mot det allt större problemet med narkotikasmuggling och organiserad brottslighet. Detta är den grundläggande sakfrågan.\nFör ett kort tag sedan sa en berömd europeisk journalist, polacken Ryszard Kapucinski, att vi bara uppmärksammar dessa länder när blodsutgjutelse äger rum, och han tillade: ”detta är tråkigt, men så är fallet”. Det är tydligt att vi står inför en situation där vi, efter att ha slutat bry oss och efter tio år av undertecknande av fredsavtal i Centralamerika, nu i högre grad måste börja bry oss och dra största möjliga nytta av de möjligheter som erbjuds genom förhandlingarna om ett associeringsavtal, som vi anser måste få största möjliga samförstånd och majoritetsstöd i parlamentet.\nLeopold Józef Rutowicz\nför UEN-gruppen. - (PL) Herr talman! Jag vill tacka föredragandena Willy Meyer Pleite och Luis Yañez-Barnuevo García för deras utmärkta arbete i samband med associeringsavtalen med de centralamerikanska länderna. I betänkandena tar de på ett bra sätt upp de politiska mål som ett utvidgat samarbete vilar på.\nDe centralamerikanska länderna delar vår europeiska och latinska kultur. De står nära oss och det är bara naturligt att vi bör förhandla om associeringsavtal med dem. Avtalet syftar till att stärka båda parters ställning i en globaliserad värld. För närvarande är vårt stöd till denna region huvudsakligen av humanitär karaktär. 
Vi ger dem en fisk istället för ett fiskespö. Det är Kina, Indien och världskapitalet som hjälper dessa länder att hjälpa sig själva genom att bygga vägar, gruvor, fabriker, skapa arbetstillfällen och framgångsrikt sälja sina produkter där.\nVåra förhandlingar om associeringsavtalen bör säkra ekonomiska kopplingar som kommer att främja både EU och övriga centralamerikanska länder med associeringsavtal. Det är endast på denna grund som vi kan bygga ett hållbart system av ekonomiska och politiska förbindelser mellan våra samhällen. Man kan bara hoppas att det europeiska kapitalet kommer att spela en större roll i de länder med vilka vi vill ingå associeringsavtal, tillsammans med kinesiskt och indiskt stöd.\nAssocieringsavtalen mellan andra länder och EU är av stor politisk betydelse och om de visar sig vara framgångsrika när det gäller att säkra det pågående ekonomiska samarbetet kommer de att klara av sin uppgift med glans.\nRaül Romeva i Rueda\nför Verts/ALE-gruppen. - (ES) Herr talman! Jag vill också börja med att gratulera de två föredragandena till deras ansträngningar att försöka nå samförstånd mellan grupperna för att upprätta ett mandat för förhandlingar om associeringsavtalen med de centralamerikanska och andinska länderna.\nNågot som också har sagts är att man under utarbetandet av dessa betänkanden har avslöjat djupa och betydande meningsskiljaktigheter mellan grupperna. Trots föredragandenas ansträngningar uppvisar den slutliga texten en verklig brist på balans när det gäller de tre grundläggande delarna i detta avtal: politisk dialog, samarbete och handel.\nVi anser inte att ett frihandelsområde är ett realistiskt eller lämpligt mål för regioner som är så sårbara som dem som vi diskuterar här.\nVi anser därför att vi har missat ett bra tillfälle att främja mellanregionala förbindelser som gör det möjligt att stärka dessa förbindelsers många dimensioner och säkra en hållbar människorättslig utveckling för de andinska och centralamerikanska folken. 
Vår grupp kommer därför att avstå från att rösta i morgon. Vi beklagar detta. Vi vill påpeka att vi vill fortsätta att arbeta, men vi beklagar att man inte har nått ett bättre resultat när det gäller de både betänkandena.\nJens Holm\nför GUE/NGL-gruppen. - Herr talman! Dessa betänkanden kräver att utvecklingsländerna skall avreglera, ge europeiska företag makten vid offentlig upphandling, skydda europeiska och nordamerikanska patent, göra allt för att europeiska storföretags investeringar skall skyddas. I ett av betänkandena kräver man t.o.m. att det skall upprättas ett frihandelsområde och nu citerar jag ”utan att exkludera någon sektor”. Sug på den formuleringen. Nej, det är inte denna väg vi skall gå. Ju mer man avreglerar desto bättre blir det kanske för storföretagen, men desto sämre för dem som skall skyddas av de lagar man tar bort, dvs. arbetarna, miljö och lokala småföretag.\nTvå exempel: Det är bra för Monsanto om de lyckas patentera grödor i Sydamerika, men dåligt för bönder och miljön och bra för europeiska vårdbolag om vårdsektorn konkurrensutsätts, men dåligt för dem som inte har råd att betala för vården. Det finns ett alternativ: rättvis handel istället för otyglad frihandel, samarbete och trygghet istället för konkurrens och marknadsvansinne. Det är både vad Europas och Latinamerikas folk behöver. Nu ska jag avsluta med den förenade vänstergruppens hållning, nämligen att vi avstår från att rösta.\nGerard Batten\nför IND/DEM-gruppen. - (EN) Herr talman! Vilket är det bästa sättet att höja levnadsstandarden och främja de medborgerliga och mänskliga rättigheterna i Centralamerika och länderna inom Andinska gemenskapen? Frågan skulle lika gärna kunna gälla samtliga länder i Central- och Sydamerika samt övriga utvecklingsländer.\nDet ligger i de demokratiskt styrda industriländernas eget långsiktiga intresse att utnyttja sin ekonomiska styrka till att främja demokrati och ekonomisk tillväxt i utvecklingsländerna. 
Detta åstadkoms bäst genom en begränsning av handelshindren världen över och genom handels- och samarbetsavtal som ska träda i kraft under förutsättning att respekten för rättsstaten, respekten för egendom och avtalade rättigheter samt för de mänskliga och medborgerliga rättigheterna upprätthålls.\nVi har sett hur Kina till och med under kommunistdiktaturens ok lyckas uppnå en förbluffande ekonomisk utveckling genom att ge de kapitalistiska marknadskrafterna fritt spelrum. Kapitalismen fungerar trots alla brister. Den ger upphov till välstånd och valfrihet och bidrar till att tillhandahålla de villkor som är nödvändiga för demokrati och civiliserade värderingar. Socialismen i all dess idealism fungerar inte. Den ger upphov till förtryck, bristande valmöjligheter och varubrist samt politisk stagnation.\nVad världens utvecklingsländer följaktligen inte är betjänta av är att följa det kvasimarxistiska EU:s exempel. De klarar sig utan de rekommendationer i betänkandena som innebär att EU:s sämsta inslag ska gå på export, nämligen ekonomisk och politisk integration samt en harmoniserad lagstiftning.\nDet som länderna minst av allt behöver är att ta efter en sviktande ekonomisk modell och ett alltmer centraliserat EU med dess tilltagande odemokratiska och ansvarslösa institutioner. I betänkandena uppmanas till frihandel, vilket i och för sig är positivt, men detta får inte villkoras av att länderna också måste införa EU:s fallfärdiga strukturer.\nMarcello Vernola\n(IT) Herr talman, mina damer och herrar! Jag vill börja med att gratulera föredraganden Luis Yañez-Barnuevo García till betänkandet om avtalet med Andinska gemenskapen. Eftersom det grundas på de tre pelarna ger det en ram som inte bara begränsas till ekonomiska aspekter. Alla institutioner hade i själva verket avsikten att ta med frågor som t.ex. 
arbetslöshet, säkerhet, migration, social utveckling, miljö, hållbar utveckling och därmed politisk stabilitet i det kommande associeringsavtalet.\nVi är intresserade av att bevara skyddet för mänskliga, medborgerliga, politiska, ekonomiska och sociala rättigheter och t.o.m., i linje med EU:s politik, den biologiska mångfalden och skyddet för ekosystemen. Det finns ett behov av att bekämpa barnarbete och att investera i utbildning, forskning, vetenskap och teknik. De stora skillnaderna i Andinska gemenskapen kräver ett engagemang för att minska fattigdomen. Vi vill också betona behovet av att bekämpa gisslet narkotikaterrorism och göra allt vi kan för att utrota den organiserade brottsligheten, korruptionen, straffriheten, terrorismen, penningtvätten och vapensmugglingen. Genom detta avtal måste vi därför främja sysselsättningen och, framför allt, odlingen av andra grödor än narkotika.\nVi hoppas också att associeringsavtalet sätter fart på handels- och marknadsavregleringen genom frihandelsområdet, såväl som på de kontrollerade tulltaxorna och på förenklingen och harmoniseringen av tullförfarandena. Vi måste dessutom garantera rättssäkerhet för investerarna genom att inte bara utan vidare acceptera de påtvingade förstatligandena som har ägt rum under den senaste tiden.\nJózef Pinior\n(PL) Herr talman, herr kommissionsledamot! 
Jag vill först och främst tacka Willy Meyer Pleite för att han har utarbetat ett betänkande som innehåller Europaparlamentets rekommendationer till rådet om riktlinjer för förhandlingarna om ett associeringsavtal mellan EU och länderna i Centralamerika, och Luis Yañez-Barnuevo García för hans betänkande om riktlinjer för förhandlingarna om ett associeringsavtal mellan EU och Andinska gemenskapen.\nI Europaparlamentets rekommendationer betonas att associeringsavtalen, samtidigt som de syftar till en gradvis avreglering av handeln och till politisk dialog och samarbete, också har som mål att stödja en fortsatt social utveckling, social sammanhållning, stärkande av demokratin, rättssäkerheten och respekten för mänskliga, politiska, medborgerliga, ekonomiska och sociala rättigheter och sist men inte minst de kulturella och miljömässiga dimensionerna i samband med dessa rättigheter.\nLänderna i Andinska gemenskapen och i Centralamerika har under de senaste 20 åren genomgått en fredlig övergång från auktoritära regimer till demokrati. Under 1980-talet spelade EU en viktig roll i denna process. Genom sina rekommendationer bevarar Europaparlamentet denna tradition.\nNuförtiden kan handelsavreglering inte vara ett mål i sig. Jag betonar: det kan inte vara ett mål i sig, utan bara ett steg i riktning mot ett genomförande av demokrati och rättssäkerhet samt social och hållbar utveckling i Latinamerika. Associeringsavtalen med länderna i Centralamerika och i Andinska gemenskapen måste innefatta politik, handel och utveckling.\nRyszard Czarnecki\n(PL) Herr talman! I Centralamerika har termen ”Europeiska unionen” gradvis börjat att överföras till ordboken för ovanliga uttryck. 
Det europeiska politiska inflytandet i regionen håller på att minska även om samma europeiska länder var mycket viktiga för demokratiseringen av regionen under 1980-talet.\nUnder elva år föll handelsomsättningen mellan Europeiska unionen och Centralamerika med 11 procent till den nuvarande nivån på 13 procent, trots ensidiga förmånliga villkor från vår sida. Associeringsavtalet bör förändra denna situation i någon mån.\nDet andra associeringsavtalet med Andinska gemenskapen sammanfaller med en spännande politisk period i regionen. Den antiamerikanska vänsterns seger i Venezuela och Bolivia och den ändrade maktbalansen i regionen är en utmaning för Europeiska unionen. Detta är i själva verket en fördel för den ekonomiska och politiska integrationen för hela Latinamerika i högre grad än Mercosur.\nJag vill tacka föredragandena Willy Meyer Pleite och Luis Yañez-Barnuevo García och beklagar bara att vi diskuterar dessa viktiga frågor strax före midnatt.\nWilly Meyer Pleite\n- (ES) Herr talman! Denna gång vill jag använda min talartid till att redovisa min grupps inställning, dvs. gruppen Europeiska enade vänstern/Nordisk grön vänster, till Luis Yañez-Barnuevo Garcías betänkande.\nHan kommer att förstå att vi kommer att rösta på samma sätt som vi röstade om det betänkande som jag hade nöjet att lägga fram för parlamentet. Vi kommer att lägga ned våra röster. Vi gör detta av samma anledningar och i vetskapen om att Luis Yañez-Barnuevo García verkligen har ansträngt sig för att lägga fram ett mycket välavvägt betänkande med betoning på de grundläggande aspekter som Latinamerika just nu efterlyser, dvs. politisk dialog och samarbete. Med hänsyn till samarbetsaspekten kan vi spela en mycket betydelsefull roll jämfört med den roll som Förenta staterna spelar i Latinamerika, men tyvärr har andra ledamöter, framför allt i utskottet för internationell handel, ändrat detta förhållningssätt avsevärt.\nVi kommer att lägga ned våra röster. 
Sanningen är den att våra instinkter ibland uppmanar oss att gå lite längre. Men vi lägger ned våra röster därför att vi anser att vi också måste lyssna på Latinamerikas åsikter och, i detta fall, även på dess sociala organisationer. Vi kommer att se till att vårt agerande i detta fall bidrar till den grundläggande diskussionen om associeringsavtalet med Centralamerika och vi kommer att vara mycket kritiska om det rör sig om ett associeringsavtal som inte innebär ett frihandelsområde.\nBogusław Sonik\n- (PL) Herr talman! Trots EU:s obestridliga insatser för att stärka fredsprocessen och skapa demokratiska strukturer i Centralamerika har dess roll i området minskat betydligt under det senaste decenniet.\nSom vi redan har hört kan samma trend observeras inom handeln, som minskade med 24 procent till knappt 13 procent 2001. Denna situation visar tydligt hur viktigt det är att ett nytt associeringsavtal sluts mellan EU och länderna i Centralamerika.\nEtt sådant avtal, förutom dess obestridliga ekonomiska fördelar, kommer också att omfatta vissa skyldigheter för EU, huvudsakligen när det gäller stödet för demokratiserings- och decentraliseringsprocessen, och förbättra den administrativa effektiviteten i samband med kampen mot våld, korruption och överträdelser av de mänskliga rättigheterna. Dessa skyldigheter är skälet till varför det framtida associeringsavtalet bör vara något mer än bara ett handelsavtal. Det måste också omfatta politiskt och socialt samarbete. Kampen mot fattigdomen och sociala orättvisor kan bli ett mycket användbart verktyg när det gäller att stärka demokratin, skapa förtroende för offentliga institutioner och även för den politiska eliten, som borde vara dem som värnar om dessa värderingar.\nYtterligare en mycket viktig del i ett framtida associeringsavtal är upprättandet av obligatoriska miljöskyddsnormer. 
Systemet med incitament som provats tidigare kanske kan visa sig vara användbart i detta sammanhang.\nAlla de delar som jag har nämnt bör ingå i ett framtida associeringsavtal och på samma gång utgöra pelare i samarbetet mellan EU och länderna i Latinamerika. Det är bara genom att spela en aktiv och engagerad roll i denna region som vi kan bidra till en faktisk ekonomisk utveckling, social och politisk stabilitet och till upprättandet av demokratiska värderingar.\nTalmannen\nDebatten är härmed avslutad.\nOmröstningen kommer att äga rum på torsdag kl. 12.00.\n",
    "meta": {
        "language": "sv"
    }
}

HackerNews

A dialog-based dataset in which users comment on links submitted as head posts, aggregated by Y Combinator.

Download and Extraction: The dataset was downloaded from the HackerNews API at https://hacker-news.firebaseio.com/v0/item/ and parsed by Story ID. In this dataset, each post is a story, and each reply is treated as a subsequent story. We considered Story IDs from 1 to 37500000 and requested the URL for each one. If an ID returned an error, it was removed. Each request was given a 2 second wait to account for network time.
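The ID sweep described above can be sketched roughly as follows. This is a minimal illustration, not the code used for TxT360: the `.json` suffix on the item URL, the function names `fetch_item` and `crawl`, the 10 second timeout, and the reading of the "2 second wait" as a pause after each request are all assumptions.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator, Optional

BASE_URL = "https://hacker-news.firebaseio.com/v0/item/"


def fetch_item(item_id: int) -> Optional[dict]:
    """Fetch one item by ID; return None on any error or empty response."""
    try:
        # Assumption: items are addressed as <BASE_URL><id>.json
        with urllib.request.urlopen(f"{BASE_URL}{item_id}.json", timeout=10) as resp:
            return json.load(resp)  # a JSON null decodes to None
    except (OSError, ValueError):
        # Network/HTTP failures are OSError subclasses; bad JSON is a ValueError.
        return None


def crawl(
    first_id: int,
    last_id: int,
    fetch: Callable[[int], Optional[dict]] = fetch_item,
    wait_seconds: float = 2.0,
) -> Iterator[dict]:
    """Yield every item in [first_id, last_id] that could be fetched."""
    for item_id in range(first_id, last_id + 1):
        item = fetch(item_id)
        if item is not None:
            yield item  # IDs that errored are simply dropped
        time.sleep(wait_seconds)  # assumed meaning of the "2 second wait"
```

Passing the fetcher in as a parameter keeps the loop testable offline, for example `list(crawl(1, 3, fetch=my_stub, wait_seconds=0))`.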

The HackerNews dataset contains a vast number of stories and is known for lively discussions. Due to the number of replies a story may contain, only the longest comment thread for each story was sampled past level 3. All stories included the title (1st level) and all direct replies (2nd level). We may consider relaxing this constraint and extracting more data.

Unique Data Preparation Challenges:

The conversation- and forum-style structure can be a very helpful signal for language model training. While processing the dataset, we try to encode this structure without introducing too much noise. We use an <AUTHOR> tag to encode the main thread text by the original poster and a <COMMENT> tag to encode the replies. We initially chose a different tag because it is used by some instruction-tuning datasets, but realized that it can easily conflict with the original text.
As discussed above, the comment hierarchies required a thoughtful approach to extracting meaningful data.
In the comment thread hierarchy, relationships had to be assigned between the comments, sub-comments, and the original story ID.

Filters Applied:

Language Filter: English
Minimum Word Count Filter: 10
Unigram Log Probability Threshold: -20

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
HackerNews	2064931	2.62%	0.02%	0.34%	61.84%	35.18%

USPTO

Patent documents from the United States Patent and Trademark Office.

Download and Extraction: Data was downloaded and extracted using tags from https://bulkdata.uspto.gov/data/patent/grant/redbook/fulltext/. There were three different formats that needed three different functions to download and extract the data based on year: Pre_2002, 2002_to_2004 and post_2004. We used the exact code used in The Pile (citation needed).
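As a rough illustration of the year-based dispatch, the sketch below maps a grant year to one of the three format names. The boundaries are inferred from the names alone (2002 and 2004 both falling in the middle bucket), and the function name is hypothetical; the actual extraction code from The Pile may draw the lines differently.

```python
def uspto_format_for_year(year: int) -> str:
    """Return the name of the parsing format assumed for a grant year."""
    if year < 2002:
        return "Pre_2002"
    if year <= 2004:  # assumption: the middle range is inclusive on both ends
        return "2002_to_2004"
    return "post_2004"
```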

Filters Applied:

Language Filter: English
Minimum Word Count Filter: 50
Unigram Log Probability

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
USPTO	6880276	0.02%	1.88%	0.01%	22.94%	75.15%

FreeLaw

Legal documents and court cases from various jurisdictions, provided by the Free Law Project, a US-registered non-profit. We include data from CourtListener, which includes millions of legal opinions from federal and state courts.

Download and Extraction: The dataset was downloaded from https://storage.courtlistener.com/bulk-data/. There are 19 CSV files, which contain overlapping content. A CSV file can hold content in multiple columns, which requires a holistic extraction approach. Text was extracted from the following columns using the html2text function. The block below shows how each text type was extracted.

("html", html2text), ("html_lawbox", html2text), ("html_columbia", html2text), ("html_anon_2020", html2text), ("html_with_citations", html2text), ("xml_harvard", html2text), plain_text

All content was downloaded, which led to a high number of documents being filtered out during local deduplication. Following The Pile, priority was given to plain_text first, followed by the columns listed above in reverse order.
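The column priority can be sketched as below. This is an illustration under assumptions rather than the pipeline's code: `strip_tags` is a dependency-free stand-in for html2text, and `extract_text` and `TEXT_COLUMNS` are hypothetical names. The column order is taken from the block above.

```python
import re
from html import unescape
from typing import Callable, Optional

# Columns in the order listed above; they are tried in reverse after plain_text.
TEXT_COLUMNS = [
    "html",
    "html_lawbox",
    "html_columbia",
    "html_anon_2020",
    "html_with_citations",
    "xml_harvard",
]


def strip_tags(markup: str) -> str:
    """Crude stand-in for html2text: drop tags, then decode HTML entities."""
    return unescape(re.sub(r"<[^>]+>", "", markup))


def extract_text(row: dict, convert: Callable[[str], str] = strip_tags) -> Optional[str]:
    """Return the text of the highest-priority non-empty column, or None."""
    if row.get("plain_text"):
        return row["plain_text"]  # plain_text is used as-is
    for column in reversed(TEXT_COLUMNS):
        if row.get(column):
            return convert(row[column])
    return None
```

With this ordering, a row that has both html_lawbox and xml_harvard populated is extracted from xml_harvard.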

Unique Data Preparation Challenges:

The FreeLaw text uses a lot of whitespace and newlines to format documents visually. This formatting is not necessary for language model learning and sometimes carries confusing semantic meaning. We attempt to unify how whitespace appears in this dataset with the following heuristics.

Consecutive whitespace characters and tabs were found. These were reduced to a single whitespace.
Whitespace was found between new lines with no additional text. This whitespace was removed.
Consecutive new lines were found in some documents without leading to a new paragraph. All consecutive new lines were reduced to a single new line.
All single new lines were converted to whitespace. If whitespace was found after a new line with no text, the whitespace was removed. All leading and trailing whitespace was removed.
All form feed (\f) characters were removed.
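One possible reading of these heuristics is sketched below. The order of operations is an assumption: the list does not say whether lone newlines are converted before or after runs of newlines are collapsed, so this sketch treats a lone newline as a visual line wrap (replaced by a space) and a run of two or more newlines as a paragraph break (reduced to one newline). The function name is hypothetical.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Normalize visual formatting whitespace in a legal document."""
    # Remove form feed characters first so they cannot split whitespace runs.
    text = text.replace("\f", "")
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop spaces adjacent to a newline (this also empties whitespace-only lines).
    text = re.sub(r" ?\n ?", "\n", text)
    # ASSUMPTION: a lone newline is a line wrap and becomes a space.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # ASSUMPTION: two or more newlines mark a paragraph break; keep one.
    text = re.sub(r"\n{2,}", "\n", text)
    # Strip leading and trailing whitespace.
    return text.strip()
```

For example, `normalize_whitespace("  The  court\t\tordered \n the jail terms.\n \n\n\fNext paragraph. \n")` returns `"The court ordered the jail terms.\nNext paragraph."`.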

Filters Applied:

Language Filter: English
Minimum Word Count Filter: 50
Unigram Log Probability

Note: Local deduplication within FreeLaw itself removed 90%+ of the dataset as duplicate.

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
FreeLaw	75971288	3.00%	7.49%	0.07%	82.73%	6.71%
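The table and the note above can be reconciled with a little arithmetic, assuming (as the rows suggest, since each one sums to 100%) that every "Percent Removed" column is expressed relative to the lines downloaded rather than to the lines that survived the previous step. Under that assumption, the local dedup figure corresponds to roughly 92.5% of the documents that reached the dedup stage.

```python
# Percentages from the FreeLaw row, assumed to be relative to lines downloaded.
removed = {
    "language": 3.00,
    "min_word_count": 7.49,
    "unigram": 0.07,
    "local_dedup": 82.73,
}

# Share of downloaded lines that survive all steps.
remaining = round(100 - sum(removed.values()), 2)

# Share of downloaded lines that reach the local dedup step.
before_dedup = 100 - removed["language"] - removed["min_word_count"] - removed["unigram"]

# Fraction of the documents entering local dedup that it removes.
dedup_rate = removed["local_dedup"] / before_dedup

print(remaining)             # 6.71
print(f"{dedup_rate:.1%}")   # 92.5%
```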

FreeLaw Filtering Examples

Data sample: 2 of 9

Raw format

{
    "id": 2207254,
    "date_created": "2013-10-30 08:36:18.400331+00",
    "date_modified": "2023-08-26 19:06:18.188241+00",
    "author_str": "Lillie",
    "per_curiam": "f",
    "joined_by_str": null,
    "type": "010combined",
    "sha1": "8505aab072258b0a3cc7b79994e4c11615e07bcc",
    "page_count": null,
    "download_url": null,
    "local_path": null,
    "plain_text": null,
    "html": null,
    "html_lawbox": "<div>\n<center><b>208 Cal.App.2d 246 (1962)</b></center>\n<center><h1>THE PEOPLE, Plaintiff and Respondent,<br>\nv.<br>\nSOLLY TERENO, Defendant and Appellant.</h1></center>\n<center>Crim. No. 8014. </center>\n<center><p><b>California Court of Appeals. Second Dist., Div. One.  </b></p></center>\n<center>Oct. 5, 1962.</center>\n<p> Matthews &amp; Stanley for Defendant and Appellant.</p>\n<p> Stanley Mosk, Attorney General, William E. James, Assistant Attorney General, and Norman H. Sokolow, Deputy Attorney General, for Plaintiff and Respondent.</p>\n<p> LILLIE, J.</p>\n<p> Having found defendant guilty of four counts of bookmaking in violation of section 337a, subdivisions 1 and 3, Penal Code, the trial judge on February 10, 1961, sentenced <span class=\"star-pagination\">*247</span> him to a term of 180 days in the county jail; he suspended sentence and granted defendant probation for a period of three years on certain specified terms and conditions, among them, that defendant \"not gamble or engage in any bookmaking activities or have paraphe[r]nalia thereof in his possession, and not be present in places where gambling or bookmaking is conducted,\" and that he \"obey all laws, orders, rules and regulations of the probation department and of the court.\" Shortly thereafter defendant was again arrested and charged with four counts of bookmaking; he was found guilty of a violation of section 337a, subdivision 3, Penal Code, as alleged in count 3 of the information (no. 242588). On August 22, 1961, the court denied probation and sentenced defendant to 90 days in the county jail. At the same time, and in the instant case, the court found defendant to be in violation of the probation order of February 10, 1961, and ordered the same modified to provide that he serve the next 90 days in the county jail, probation to continue under the same terms and conditions upon his release. The court ordered the jail terms in case no. 
242588 and in the instant case to run concurrently. From the judgment defendant appeals.</p>\n<p> [1] It appearing that defendant engaged in bookmaking activities on March 23, 1961, for which he was charged and convicted (judgment affirmed by this court on August 29, 1962, People v. Tereno, 207 Cal.App.2d 246 [24 Cal.Rptr. 501], in violation of section 337a, subdivision 3, Penal Code, and the probation order of February 10, 1961, the lower court properly found defendant to be in violation of the order, and modified the same. The judgment is affirmed.</p>\n<p> [2] While defendant has also appealed from an order denying a motion for new trial, the record in both cases is silent concerning such a motion and no order relative to denial of a new trial appears therein. Thus, appeal from the purported order is dismissed.</p>\n<p> Wood, P. J., concurred.</p>\n</div>",
    "html_columbia": null,
    "html_anon_2020": null,
    "xml_harvard": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<opinion type=\"majority\">\n<author id=\"b266-12\">\n  LILLIE, J.\n </author>\n<p id=\"Aoy\">\n  Having found defendant guilty of four counts of bookmaking in violation of section 337a, subdivisions 1 and 3, Penal Code, the trial judge on February 10, 1961, sentenced\n  <span citation-index=\"1\" class=\"star-pagination\" label=\"247\"> \n   *247\n   </span>\n  'him to a term of 180 days in the county jail; he suspended sentence and granted defendant probation for a period of three years on certain specified terms and conditions, among them, that defendant “not gamble or engage in any bookmaking activities or have paraphe [r] nalia thereof in his possession, and not be present in places where gambling or bookmaking is conducted,” and that he “obey all laws, orders, rules and regulations of the probation department and of the court. ’ ’ Shortly thereafter defendant was again arrested and charged with four counts of bookmaking; he was found guilty of a violation of section 337a, subdivision 3, Penal Code, as alleged in count 3 of the information (no. 242588). On August 22, 1961, the court denied probation and sentenced defendant to 90 days in the county jail. At the same time, and in the instant case, the court found defendant to be in violation of the probation order of February 10, 1961, and ordered the same modified to provide that he serve the next 90 days in the county jail, probation to continue under the same terms and conditions upon his release. The court ordered the jail terms in case no. 242588 and in the instant case to run concurrently. From the judgment defendant appeals.\n </p>\n<p id=\"b267-6\">\n  It appearing that defendant engaged in bookmaking activities on March 23, 1961, for which he was charged and convicted (judgment affirmed by this court on August 29, 1962,\n  <em>\n   People\n  </em>\n  v. Tereno, 207 Cal.App.2d 246 [24 Cal.Rptr. 
501], in violation of section 337a, subdivision 3, Penal Code, and the probation order of February 10, 1961, the lower court properly found defendant to be in violation of the order, and modified the same. The judgment is affirmed.\n </p>\n<p id=\"b267-7\">\n  While defendant has also appealed from an order denying a motion for new trial, the record in both cases is silent concerning such a motion and no order relative to denial of a new trial appears therein. Thus, appeal from the purported order is dismissed.\n </p>\n<p id=\"b267-8\">\n  Wood, P. J., concurred.\n </p>\n</opinion>",
    "html_with_citations": "<div>\n<center><b><span class=\"citation no-link\"><span class=\"volume\">208</span> <span class=\"reporter\">Cal. App. 2d</span> <span class=\"page\">246</span> </span>(1962)</b></center>\n<center><h1>THE PEOPLE, Plaintiff and Respondent,<br>\nv.<br>\nSOLLY TERENO, Defendant and Appellant.</h1></center>\n<center>Crim. No. 8014. </center>\n<center><p><b>California Court of Appeals. Second Dist., Div. One.  </b></p></center>\n<center>Oct. 5, 1962.</center>\n<p> Matthews &amp; Stanley for Defendant and Appellant.</p>\n<p> Stanley Mosk, Attorney General, William E. James, Assistant Attorney General, and Norman H. Sokolow, Deputy Attorney General, for Plaintiff and Respondent.</p>\n<p> LILLIE, J.</p>\n<p> Having found defendant guilty of four counts of bookmaking in violation of section 337a, subdivisions 1 and 3, Penal Code, the trial judge on February 10, 1961, sentenced <span class=\"star-pagination\">*247</span> him to a term of 180 days in the county jail; he suspended sentence and granted defendant probation for a period of three years on certain specified terms and conditions, among them, that defendant \"not gamble or engage in any bookmaking activities or have paraphe[r]nalia thereof in his possession, and not be present in places where gambling or bookmaking is conducted,\" and that he \"obey all laws, orders, rules and regulations of the probation department and of the court.\" Shortly thereafter defendant was again arrested and charged with four counts of bookmaking; he was found guilty of a violation of section 337a, subdivision 3, Penal Code, as alleged in count 3 of the information (no. 242588). On August 22, 1961, the court denied probation and sentenced defendant to 90 days in the county jail. 
At the same time, and in the instant case, the court found defendant to be in violation of the probation order of February 10, 1961, and ordered the same modified to provide that he serve the next 90 days in the county jail, probation to continue under the same terms and conditions upon his release. The court ordered the jail terms in case no. 242588 and in the instant case to run concurrently. From the judgment defendant appeals.</p>\n<p> [1] It appearing that defendant engaged in bookmaking activities on March 23, 1961, for which he was charged and convicted (judgment affirmed by this court on August 29, 1962, People v. Tereno, <span class=\"citation no-link\"><span class=\"volume\">207</span> <span class=\"reporter\">Cal. App. 2d</span> <span class=\"page\">246</span> </span>[<span class=\"citation no-link\"><span class=\"volume\">24</span> <span class=\"reporter\">Cal. Rptr.</span> <span class=\"page\">501</span></span>], in violation of section 337a, subdivision 3, Penal Code, and the probation order of February 10, 1961, the lower court properly found defendant to be in violation of the order, and modified the same. The judgment is affirmed.</p>\n<p> [2] While defendant has also appealed from an order denying a motion for new trial, the record in both cases is silent concerning such a motion and no order relative to denial of a new trial appears therein. Thus, appeal from the purported order is dismissed.</p>\n<p> Wood, P. J., concurred.</p>\n</div>",
    "extracted_by_ocr": "f",
    "author_id": 6484.0,
    "cluster_id": 2207254
}

Extracted format

{
"text": "\n\n LILLIE, J.\n \n\n Having found defendant guilty of four counts of bookmaking in violation of section 337a, subdivisions 1 and 3, Penal Code, the trial judge on February 10, 1961, sentenced\n \n *247\n \n 'him to a term of 180 days in the county jail; he suspended sentence and granted defendant probation for a period of three years on certain specified terms and conditions, among them, that defendant “not gamble or engage in any bookmaking activities or have paraphe [r] nalia thereof in his possession, and not be present in places where gambling or bookmaking is conducted,” and that he “obey all laws, orders, rules and regulations of the probation department and of the court. ’ ’ Shortly thereafter defendant was again arrested and charged with four counts of bookmaking; he was found guilty of a violation of section 337a, subdivision 3, Penal Code, as alleged in count 3 of the information (no. 242588). On August 22, 1961, the court denied probation and sentenced defendant to 90 days in the county jail. At the same time, and in the instant case, the court found defendant to be in violation of the probation order of February 10, 1961, and ordered the same modified to provide that he serve the next 90 days in the county jail, probation to continue under the same terms and conditions upon his release. The court ordered the jail terms in case no. 242588 and in the instant case to run concurrently. From the judgment defendant appeals.\n \n\n It appearing that defendant engaged in bookmaking activities on March 23, 1961, for which he was charged and convicted (judgment affirmed by this court on August 29, 1962,\n \n People\n \n v. Tereno, 207 Cal.App.2d 246 [24 Cal.Rptr. 501], in violation of section 337a, subdivision 3, Penal Code, and the probation order of February 10, 1961, the lower court properly found defendant to be in violation of the order, and modified the same. 
The judgment is affirmed.\n \n\n While defendant has also appealed from an order denying a motion for new trial, the record in both cases is silent concerning such a motion and no order relative to denial of a new trial appears therein. Thus, appeal from the purported order is dismissed.\n \n\n Wood, P. J., concurred.\n \n"
}

StackExchange

A network of question-and-answer websites on various subjects, including programming, science, mathematics, and more. This is one of the largest publicly available repositories of question-answer pairs. We also include comments to capture the overall discussion on each post.

Download and Extraction: The archive dataset was used to download all data from StackExchange and 364 StackExchange sub URLs, including math.stackexchange.com. Raw data was extracted in XML format, and only two files, Posts.xml and Comments.xml, were considered. To match the StackExchange hierarchy, each file was parsed using post_id to connect questions to answers and then to comments. We will include the full list of sub URLs when the code is released.

1. Questions:
2. Comment1:
3. Comment2:
4. Answer1:
5. Comment1:
6. Comment2:
7. Answer2:
8. Comment1:
9. Comment2:
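The hierarchy above can be assembled roughly as follows. This is a sketch under assumptions: records are taken to be already parsed into dictionaries, `PostTypeId` 1 and 2 are assumed to mark questions and answers, the comment fields `PostId` and `Text` are assumed field names, and the `Q:` / `A:` / `C:` markers mirror the extracted sample shown further down. The function name `build_threads` is hypothetical.

```python
from collections import defaultdict


def build_threads(posts: list, comments: list) -> dict:
    """Map each question ID to a single Q/C/A/C text block."""
    # Group comment texts by the post they belong to.
    comments_by_post = defaultdict(list)
    for comment in comments:
        comments_by_post[comment["PostId"]].append(comment["Text"])

    # Group answers by their parent question (ParentId may be stored as 1.0).
    answers_by_question = defaultdict(list)
    for post in posts:
        if post["PostTypeId"] == 2:
            answers_by_question[int(post["ParentId"])].append(post)

    threads = {}
    for post in posts:
        if post["PostTypeId"] != 1:
            continue
        header = [post["Title"]] if post.get("Title") else []
        parts = ["\n\n".join(["Q:"] + header + [post["Body"].strip()])]
        parts += [f"C:\n\n{text}" for text in comments_by_post[post["Id"]]]
        for answer in answers_by_question[post["Id"]]:
            parts.append(f"A:\n\n{answer['Body'].strip()}")
            parts += [f"C:\n\n{text}" for text in comments_by_post[answer["Id"]]]
        threads[post["Id"]] = "\n\n".join(parts)
    return threads
```

Grouping by ID up front means each file is scanned once, and the order of comments and answers within a thread follows their order in the input.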

Unique Data Preparation Challenges:

Handling code blocks required finding the specific blocks and extracting their contents as a single snippet.
Question and answer formatting had to be rewritten so that each answer is matched to its question.
Occasionally a title was not included at the beginning of a question. For consistent formatting, a title was added.

Filters Applied:

Minimum Word Count Filter: 10

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
StackExchange	23246548	0.00%	0.00%	0.00%	0.00%	100.00%

StackExchange Filtering Examples

Data sample: 3 of 9

Raw format

{
"Id": 4,
"PostTypeId": 2,
"AcceptedAnswerId": null,
"CreationDate": "2014-09-16T18:29:58.093",
"Score": 12,
"ViewCount": null,
"Body": "The human body would adapt and deal with elevation. Many top athletes often train at high altitudes to help train their bodies to absorb oxygen more efficiently from the thinner air.\n\nIf the terrain was pure mountains then the ability to adapt to the local geography would be key. Natural shelters in the forms of caves for early survival, though a medieval culture would be able to extend and possibly even create man-made caves for habitation over time. This will help them deal with the cold and natural predators as our, and their ancestors did. Fires at the entrance to the cave would ward off predators and help keep the cold at bay.\n\nThe most significant hazards would be natural. Snow and ice would create dangerous living and working conditions. The constant need for food would also be a source of difficulty, with meat being a large part of the diet. For the survival of a large settlement, farming is necessary, though in mountains this will be near impossible limiting settlements to a small size.\n\nCave ins, earth quakes, avalanches, all sorts of natural disasters would affect their daily lives. We struggle to deal with these in this day and age. They would simply be a fact of life for those in the medieval age that they had to contend with and get through. Any one serious act could end the settlement though.\n",
"OwnerUserId": 15.0,
"LastEditorUserId": null,
"LastEditDate": null,
"LastActivityDate": "2014-09-16T18:29:58.093",
"Title": null,
"Tags": null,
"AnswerCount": null,
"CommentCount": 0,
"ContentLicense": "CC BY-SA 3.0",
"ParentId": 1.0,
"ClosedDate": null,
"FavoriteCount": null,
"LastEditorDisplayName": null,
"OwnerDisplayName": null,
"CommunityOwnedDate": null
}

Extracted format

{
"text": "Q:\n\nCould a cave-in or avalanche in low gravity be dangerous?\n\nI have a character on the surface of Enceladus, one of the moons of Saturn. It has an icy surface on top of what is believed to be a liquid ocean. There is no atmosphere, though near its south polar region there are what appear to be canyons filled with long lines of geysers that eject mostly water vapor into space.\nI need to have a way for this character to be trapped and buried on Enceladus (don't worry about his fate), either as a result of the ground caving in beneath him causing him to fall below the surface, or in something like an avalanche of ice down the side of the one of the canyons. The trouble is, the surface gravity on Enceladus is only 1.13% of Earth's. In that case, it seems to me most scenarios would be easily escapable, both because a cave-in or avalanche would occur very slowly and because the character could jump so high he should be able to simply leap away from the danger. I thought of having him hit by a large piece of ice and knocked out, thus preventing him from escaping; but could a person really be knocked out by a large mass that is slow-moving, or would he just be crushed?\nSo my question is: is there a realistic way for someone to get trapped an buried under these conditions?\n\nC:\n\nIf the ground opens up below you, then there's nothing for you to push of of for a jump. Thus, you'll fall.\n\nC:\n\nYour astronaut won't be able to hear the avalanche. There is less sunlight at Saturn compared to Earth and you have to ask how far electric lights will illuminate the surroundings. There's probably very little warning before the avalanche hits.\n\nC:\n\nIf a pound of feathers lands on you... 
does it weigh you down more or less than a pound of rocks...\n\nC:\n\nA couple of comments not worthy of an answer: 1) I think you (the author) can contrive an artificial scenario where the character drops something valuable in a crevasse, has to retrieve it, and then gets trapped by a mini cave-in or similar. 2) I might suggest the New Yorker article on Antarctic expeditioner Henry Worsley for some insight on visible and hidden dangers (crevasses!) in a similarly harsh environment https://www.newyorker.com/magazine/2018/02/12/the-white-darkness\n\nC:\n\nIf he's wearing a spacesuit I'm not sure how he can be \"knocked out\" without suffering decompression\n\nC:\n\n@WernerCD it weighs less but has less momentum because of its slower fall\n\nC:\n\n@MCMastery Erm, no, it does *not* weigh less. A pound of feathers weighs exactly as much as a pound of rocks, per the definition of \"a pound of [substance]\". Also, I haven't been able to find anything definitive regarding the density of Enceladus' atmosphere, but my impression is that it's probably thin enough that feathers and rocks would fall equally slowly, so they would have the same momentum as well.\n\nC:\n\n@DaveSherohman Oh my god, I must have been so tired. I meant to write the exact opposite; it weighs the same. I think I remember seeing a comedy skit about that once ;)\n\nC:\n\na little bit off-topic but from the wikipedia page, Surface gravity =\n0.113 m/s2 (0.0113 g). However, g = 9.80665, not 10... shouldn't surface gravity be 0.110853m/s2?\n\nC:\n\n@SilverCookies it depends on how the helmet and pressure collar fit. If the occipital bone or the mastoid process are struck with sufficient force in the fall it will result in a knockout.\n\nC:\n\n@pojo-guy indeed, though that does sound like a poorly designed helmet\n\nC:\n\n@BurnsBA thanks for the article, it was helpful\n\nC:\n\nAnother non-answer, but useful thought. 
As a backcountry skier *on earth*, I know that one of the biggest threats of an avalanche is not actually being crushed by snow, but being battered by snow, ice, and natural terrain features (trees, rocks) while you're being carried. So, being buried is not the only danger; if there is other \"astronaut equipment\" nearby, or if the avalanche consists of massive chunks of ice, astro-guy could be dead/injured before he even comes to a rest. Also, remember that **low gravity has no effect on the momentum** of massive chunks of ice...\n\nC:\n\nEven in zero gravity an astronaut can definitely be severely crushed between heavy moving objects.\n\nA:\n\nThe surface gravity of Enceladus is 0.113m/s2. At such a low gravity, you cannot run, for the force you would use in a step will send you on a very long jump that may last more than a minute (if you don't hit anything along the way before you touch ground again). This may be quite dangerous. If you don't have the means to fly, like a jetpack, you may end up landing on a sharp shard of ice that will rip your spacesuit open. Alternatively, you may accidentally jump from a high place to a lower one. And while lower gravity means smaller acceleration, the fact that you can jump dozens to hundreds of meters upwards, to fall on a hole/crater/depression that might be dozens to hundreds of meters lower than your starting point, means that you can land with enough speed on hardened ice to break bones and equipment.\nIf you want to see how walking on such a gravity might look like, I can recommend you a simulator. Like any simulator, this one does not model reality with 100% accuracy, but it is close enough to reality to give you a general idea. Get yourself a copy of Kerbal Space Program and go take a walk on Gilly (surface gravity = 0.049m/s2) or Pol (surface gravity = 0.373m/s2), which are the bodies with gravity that is closest to Enceladus.\nThat said, unless your astronaut has a jetpack, even walking may be suicidal. 
But if he does have a jetpack, he would never be in trouble in the first hand.\nAs for whether the snow can crush him... the density of snow on Earth is 0.1 to 0.8g/cm3. Let us assume that the density of snow on Enceladus is around the lowest range, 0.1g/cm3 so as to be nice with your astronaut. Now let's say that he gets 100 meters of snow on him. Let's do some calculations.\nUnder 100 meters of snow, the mass of snow above a section of one square meter is:\n$$ 10^2m \\times 1m^2 \\times 10^{-1}g/cm^3 = \\frac{10m^3g}{cm^3} = \\frac{10^6 cm^3g}{cm^3} = 10^6 g = 1 \\space metric \\space ton $$\nImpressive, right? But at 1.13% the gravity of the Earth, that metric ton would do for a pressure of 11.3 kilograms per square meter.\nThe average surface of an adult humans is around 2m2. This means that, laying down, your astrounaut is exposing about one square meter to the snow. We can then infer that under ten meters of snow, he would be facing 11.3 kilograms of pressure. That is a laughable fraction of an atmposphere.\nSo is he out of the hook? No.\nDon't forget that the astronaut is considerably denser than the snow around him. If he were naked, he could be ten times as dense as that snow - I figure the equipment in his spacesuit might be denser yet.\nIn other words, he will sink in the snow. The snow will behave like a very viscous liquid, and it should feel like sinking in quicksand for the astrounaut. In the end, he is in for a very slow death in the dark and cold bottom of the avalanche.\n\nC:\n\nWhat goes up must come down. Get in the way of one the geysers and that's going to be a bad day.\n\nC:\n\nReminds me of A Fall Of Moondust by Clarke\n\nC:\n\nFor the KSP surface gravities, I'm assuming you meant 0.049\\0.373 meters per second per second, rather than 0.049\\0.373 square meters?\n\nC:\n\n@Sean thanks, I will fix the metrics in the post.\n\nC:\n\nWhat about some sort of \"gecko tape\" like substance on the boots to help maintain traction? 
You'd still be able to \"peel\" your boot away from most surfaces just through the normal heel-to-toe roll of a footfall, but if you walk such that your other foot makes contact (and thus adheres) before your first foot's toe departs from the surface, you won't be launching yourself into space in the course of ordinary walking. Incidentally, it would make jumping almost impossible, thus preventing that means of escape!\n\nC:\n\n@DoktorJ I don't think you can outrun an avalanche in any gravity.\n\nA:\n\nYes, though Enceladus is probably much safer than Earth for these sorts of things.\nIt all depends on how high the avalanche starts from and how much material is involved. A ton of rock or ice hitting you at 50 mph is going to hurt regardless of whether it's on Earth or Enceladus -- if anything, Enceladus would be worse because of the likelihood of damage to your spacesuit.\nIt's certainly true that an avalanche starting from the same height will do less damage on Enceladus, but if it's high enough, it will still kill. (Also, there's the distinct possibility that Enceladus's low gravity may make much greater elevation differences more common.)\nLikewise with getting buried. The same volume avalanche will weigh less on Enceladus, and for that reason will be easier to get out of (pressure suit damage aside). But a big enough avalanche will still bury you under too much overlaying material for you to dig your way out even if you survived.\nNext there's the question of escape. Once more, it's probably easier to escape on Enceladus -- though just how athletic and controlled you can be in a spacesuit is an open question -- but escape is far from guaranteed. Further, in a vacuum, will you always be aware of an oncoming mass of ice? And if you're in a confined space, will you be able to escape? Consider a deep valley and an avalanche which starts far above you. 
By the time you're aware of it, it's moving 30 mph and is quite inexorable, with the same momentum it would have on Earth. Can you escape? Probably not.\nThe lack of atmosphere had a negligible effect, as air resistance doesn't play a large role in the dynamics. (Its major impact is that the lack of air on Enceladus forces people into space suits and this makes them more vulnerable.)\nSo, assuming your space suit is reasonably rugged, you're most likely safer on Enceladus, but a large avalanche can still trap you and kill you.\n\nC:\n\n@JanDoggen But it has something to do with the weight...\n\nC:\n\nIn H. Beam Piper's *Cosmic Computer*, he has the lines: \"Yves Jacquemont began posting signs in conspicuous places:\nWEIGHT IS WHAT YOU LIFT, MASS IS WHAT HURTS\nWHEN IT HITS YOU.\nWEIGHT DEPENDS ON GRAVITY; MASS IS ALWAYS CONSTANT.\"\n\nC:\n\n@JanDoggen And that's to do with inertia(l mass) – momentum – kinetic energy. None of those have anything to do with weight.\n\nC:\n\n@MarkOlson I'm not convinced by your answer. Lack of air resistance means terminal velocities are higher and things like a cave collapse can happen without needing to push the air out of the way and also less warning. (You won't hear an avalanche coming although you may feel a rumble in the ground). On the other hand though fast-moving events often ride on or mix with air in order to reduce friction so without air the avalanche may behave differently. I don't know the answer to all of this but I doubt the net effect is \"nothing\"..\n\nC:\n\n@Tim B: It's not \"nothing\", it's \"not much, negligible\". One way to look at it is to consider how much effect air resistance has on Earth -- it will be less on Enceladus. And when you're dealing with materials like ice and rock, it has very little effect on Earth. (Snow's another matter, of course, but that's not what the OP was asking about.)\n\nC:\n\n@Tim B: Not much. (But I added that.)\n\nC:\n\n*The same volume avalanche will weigh less on Enceladus*? 
Are you saying that because it wll be ice instead of rock? Because gravity has nothing to do with the mass itself.\n\nC:\n\n@wizzwizz4 Yes, in the case of being buried under it. But *A ton of rock or ice hitting you at 50 mph* is the same here as there.\n\nC:\n\nThis answer is good but misses an important detail - what effect will the lack of atmosphere have?\n\nC:\n\nJust want to point out that weight is a product of gravity. So less gravity means less weight.\n\nA:\n\nYes, it's definitely feasible for either a cave in or an avalanche to trap this character, and ice is heavy enough that chunks the size of two sedans would be very difficult for the average person to move even under Enceladus' gravity.\n\nMaterial Required To Trap a Person: Since the gravity is ~1% of Earth's (rounded for easier math), 100kg on Earth would be only 1kg on Enceladus. Assuming that the character did get trapped under some amount of ice, let's see how much is needed to prevent the character from just pushing their way out once they've been buried.\nBenchpress world records are around 485kg, so if your character is a world record body builder they could theoretically lift 48,500kg, or about 40 Toyota Corollas. Let's assume a more modest 100kg to make the math easy.\nThis site claims the volume of their truck trailers are 82 cubic meters, and this site claims that 82 cubic meters of ice is about 75,000kg. A Toyota Corolla is about 12 cubic meters, which is about 7,300kg of ice. So, a chunk of ice the size of 1 and a half sedans could trap, but not completely crush, someone on Enceladus, and presumably your disaster would involve much more than that.\nAvoiding a Cave In: This is trivially easy to avoid if the character is next to a stable wall to grab on to since they'd fall slowly, so let's assume the entire area around the character is collapsing. 
\nThis question covers the idea of climbing up falling debris, however the answer's best case scenario involves large pieces of rubble that you were already about to jump off of. If the character is just standing, then they will fall at the same speed as the ground below them so they would not be able to push off of anything. Therefore, the character could not jump to safety if the ground below them caves in and they had no solid ground to grab onto.\nAvoiding an Avalanche: Although it would be moving slow, it would actually be pretty hard to avoid being buried in an avalanche on Enceladus. I don't have enough physics degrees to understand the math, however I'd imagine that since the avalanche would behave much like a liquid, trying to stay on top of it would be like try to walk through a flood of quicksand or molasses. This, coupled with a cloud of powdery ice blocking attempts to find a safe route, could definitely lead to the character sinking and getting buried. \n\nC:\n\n@Andrey I happen to have pushed vehicles as heavy as a Toyota Landcruiser, and I can verify that it does NOT take minutes worth of output from a human to get meaningful velocities in multi-ton objects. Think accelerations on the order of ~0.1 M/s^2. Absolutely doable and worthwhile, although one thing that I haven't seen mentioned is that the posture of the person trapped may prevent the same kind of leverage one applies with the proper form. Anyone who has benched can tell you proper form is everything.\n\nC:\n\n@wedstrom that's a good point. Just try pushing at car on a 25 degree incline, and you will have 10% gravity perfectly simulated. See if you can still move it. On a flat surface you are converting 95% of your energy into acceleration, just a little loss to friction, on a lift most of it is being lost to fighting gravity\n\nC:\n\nYou would probably not be able to bench the Toyota. Mass is the same. 
It would take you minutes of pushing at your full stength to accelerate the mass and get it moving. Once it finally moves it would blast off taking you with them if you made the mistake of holding on.\n\nC:\n\n@Andrey: Unless there's no gravity then you don't lift mass, you lift weight. Lifting 'Thing A' that weighs 100kg on Earth would take the same effort as lifting 'Thing B' that weighs 100kg on Enceladus. However, Thing A would weigh 1kg on Enceladus, and Thing B would weigh 10,000kg on Earth.\n\nC:\n\nnot exactly. V=F/M So even at 0 gravity, it takes a huge amount of force to put any useful velocity on an object. Heavy object in low G are extremely dangerous. They soak energy like a sponge and then become freight trains slowly moving forward crushing you\n\nC:\n\n@Andrey: Ah, I see what you mean now. I was focusing more on how much material was needed to trap the character after they're buried, so they couldn't just push the ice out of the way and escape. I'll update that part to be more clear.\n\nC:\n\n\"100kg on Earth would be only 1kg on Enceladus.\" It would be *equivalent* to 1kg, but it wouldn't *be* 1kg. \"Unless there's no gravity then you don't lift mass, you lift weight. \" If you had semi on a completely frictionless track, and were pushing horizontally, you would have no weight. Does that mean it wouldn't take any effort to push it?\n\nC:\n\n@Acccumulation: Pushing an object is not the same as lifting an object. If a small car is in neutral, pushing it is pretty easy, whereas lifting it is not even though the only friction when lifting is from air. Also, it *would* be about 1kg on Enceladus. Weight is just mass*gravity, so if Earth gravity is '1' and Enceladus is '1% of 1': 100kg*1=100kg, 100kg*0.01=1kg\n\nC:\n\nLifting an object is, in part, pushing it. And kg is a unit of mass, not weight.\n\nC:\n\n@Acccumulation: Good point, lifting is basically pushing something that is accelerating towards you. 
And I used kg as weight because most people can think of what a kilogram or pound or whatever 'weighs', but have little reference for a newton or pound-force.\n\nA:\n\nDespite the lower gravity, a cave-in in a sufficiently deep crevasse or cave could still easily happen quickly enough to block the escape.\nAnd once the only entrance is blocked by several tons of ice (and remember, the ice still has the same mass as on earth, so you can't just push it away), your explorer is truly trapped without advanced mining equipment.\nIn fact, the use of such equipment might even pose another hazard, if it melts the ice or causes tremors, which could instabilize the rest of the ice.\nBeing knocked out is also a possibility: Even if the actual collapse is far slower than on earth, the large masses colliding can cause shards and boulders of ice to be ejected at dangerous velocities.\nFinally, ice can be quite sharp, so if your character falls on or is hit by an icycle, they could end up pinned in place, with the ice stuck in their pressure suite being the only thing between them and decompression.\n\nA:\n\nI think the other answers have sufficiently covered how dangerous a cave-in or avalanche might be, but I want to point out that the chance and severity of them will be far higher.\nThe lower gravity will create a far steeper angle of repose as the cohesiveness of snow, rock, etc. will be much greater relative to gravity than we are used to on Earth. That means you can have far more material build up into very steep, even over-hanging and exotic structures. 
Add to that the lower atmospheric disturbance (no wind) and no critters or humans to disturb this moon's surface and you will probably have large, critically balanced structures that are ready to be knocked over at any moment.\nWhether or not cave-ins or avalanches would be as dangerous as on Earth, you will have them occur far more often in virgin territory, and the mass of the material involved will likely be much greater.\n\nA:\n\nWhat I miss in other answers is that the morphology of mountains and avalanche material will be completely different at 1.13% of gravity. The amount of material stacking up before surfaces get crushed to the degree of starting a conversion from sticking to moving friction will be quite higher. So when finally things start getting ugly, the amount of ugliness unleashed will be quite different from that on Earth and the amount of potential energy leading to a chain reaction will be comparable, making the involved masses quite larger. Avalanches will be quite slower at taking up speed, but they will be just as deadly in their effects and the height colloding material will take on will have similar relations and densities compared to the jumping height of a human as on Earth. It's not just the human energy and time frame getting better payoff.\n\nC:\n\nYou apparently didn't read my answer because I addressed that.\n\nA:\n\nAbsolutely yes. Even though the force of gravity is 1.3% of the Earth's, the planetary weight of a landslide or cave-in could still be fatal.\nWeight is the force of gravity on a mass. Newton's second law formula (F = m•a) shows the relationship between mass, acceleration and force. The following weight formula uses Newton's second law:\nw = m • g\nwhere:\n\nw is the weight (force of gravity on a mass)\nm is the mass\ng is the acceleration due to gravity.\n\nUsing this weight calculator with a g value of 0.1274 m/s^2 (1.13% of 9.8 m/s^2), you can make some simple calculations. 
If 1,000 pounds of material would crush you on Earth (453.59 kg), that's a fourth of the weight of a VW Beetle. On your planet, ~34,800 kg would have the same effect, and that could be 21 cubic meters of stone, i.e. not much. \n\n",
"meta": {
"source": "worldbuilding.stackexchange"
}
}

Ubuntu IRC

Chat logs from the Ubuntu Internet Relay Chat (IRC) channels on the Freenode IRC chat server. This data is another form of dialog dataset covering niche topics.

Download and Extraction: The dataset was downloaded from https://irclogs.ubuntu.com/{date.year}/{date.month:02d}/{date.day:02d}/, with the path filled in from the date of each log.
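As a rough illustration of how the URL template above expands, the sketch below enumerates one URL per day over a date range. The helper name daily_log_urls and the choice to iterate day by day are assumptions made for illustration; only the template string comes from this post, and the actual download code may differ.

```python
from datetime import date, timedelta

# URL template as given in this post; it is a valid Python format string.
URL_TEMPLATE = "https://irclogs.ubuntu.com/{date.year}/{date.month:02d}/{date.day:02d}/"


def daily_log_urls(start, end):
    """Yield one log-directory URL per day, from start to end inclusive.

    Hypothetical helper: the day-by-day iteration is an assumption.
    """
    day = start
    while day <= end:
        yield URL_TEMPLATE.format(date=day)
        day += timedelta(days=1)


if __name__ == "__main__":
    for url in daily_log_urls(date(2020, 2, 28), date(2020, 3, 1)):
        print(url)
```

The month and day fields are zero-padded by the :02d format spec, so the generated paths match the directory layout implied by the template.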

During extraction, the logs were cleaned using the following functions:

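# Drop every system line (lines starting with '===').
def exclude_system(x):
    return '\n'.join(line for line in x.split('\n') if not line.startswith('==='))

# Drop only selected system lines: joins, parts, topic notices, and nick changes.
def exclude_select_system(x):
    return '\n'.join(
        line for line in x.split('\n')
        if not (line.startswith('===') and any(
            term in line
            for term in ['has joined #', 'has left #', 'Topic for #', "Topic (#", "is now known as"]
        ))
    )

# Rewrite system lines as '* ...'; drop the first 8 characters
# (presumably the '[HH:MM] ' timestamp) from all other lines.
def clean(x):
    return '\n'.join(
        '* ' + line[4:] if line.startswith('===') else line[8:]
        for line in x.split('\n')
    )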

Unique Data Preparation Challenges:

Similar to the HackerNews challenges, we had to map comments and sub-comments to the original question.
The dataset comes with the usernames of post authors. We attempt to replace them with strings such as <USER1> to remove the PII. This step might also reduce the language model's tendency to memorize the user names.
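The username replacement described above could be sketched as follows. This is a minimal illustration, not the code used for TxT360: it assumes each chat line starts with a speaker tag of the form <nick>, and the function name anonymize_usernames and the numbering by order of first appearance are hypothetical choices.

```python
import re

# Assumption: after cleaning, a chat line looks like "<nick> message".
SPEAKER_RE = re.compile(r"^<([^>]+)>")


def anonymize_usernames(log):
    """Replace each speaker nick with <USER1>, <USER2>, ... (hypothetical scheme)."""
    mapping = {}
    for line in log.split("\n"):
        match = SPEAKER_RE.match(line)
        if match:
            # Number nicks in order of first appearance.
            mapping.setdefault(match.group(1), f"<USER{len(mapping) + 1}>")
    if not mapping:
        return log

    # Longest nicks first so a short nick never shadows a longer one.
    names = sorted(mapping, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Match either a speaker tag "<nick>" or a whole-word mention of the nick.
    pattern = re.compile(
        r"<(" + alternatives + r")>|(?<!\w)(" + alternatives + r")(?!\w)"
    )
    return pattern.sub(lambda m: mapping[m.group(1) or m.group(2)], log)


if __name__ == "__main__":
    print(anonymize_usernames("<alice> hi bob\n<bob> hello alice, see bobby"))
```

Replacing in a single pass means the inserted <USERn> placeholders are never re-scanned, and the word-boundary checks leave longer words that merely contain a nick (such as "bobby") untouched.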

Filters Applied:

Language Filter: English
Minimum Word Count Filter: 10
Unigram Log Probability

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
Ubuntu IRC	37966	0.00%	0.14%	1.12%	0.66%	98.08%

DM Math

DeepMind Math dataset with generated questions from various topics like algebra, calculus, geometry, etc. Math data is included to improve the model's reasoning abilities on downstream tasks.

Download and Extraction: The dataset was downloaded directly from the Huggingface repo: https://huggingface.co/datasets/deepmind/math_dataset. The data was converted to the jsonl format, where each line is represented as:

Question: TEXT Answer: TEXT

Unique Data Preparation Challenges:

In one of our versions, we saved the string as a byte string instead of raw text, which introduced additional byte indicators (such as b'...') at the string level.
There is no space before the keyword "Answer:".
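To make the byte-string issue concrete, the sketch below shows one way such indicators could be stripped before building the "Question: ... Answer: ..." text. It is an illustration under assumptions, not the processing actually applied to TxT360: the helper names decode_bytes_repr and to_jsonl_line are hypothetical, as are the choices to trim trailing newlines and to join the two parts with a newline.

```python
import ast
import json


def decode_bytes_repr(value):
    """Turn a string such as "b'Total of ...\\n'" into plain text.

    Values that are not a bytes literal are returned unchanged.
    """
    if value.startswith(("b'", 'b"')):
        try:
            parsed = ast.literal_eval(value)  # safely parses Python literals only
        except (ValueError, SyntaxError):
            return value
        if isinstance(parsed, bytes):
            return parsed.decode("utf-8")
    return value


def to_jsonl_line(question, answer):
    """Build one jsonl line; trimming and the newline separator are assumptions."""
    q = decode_bytes_repr(question).strip()
    a = decode_bytes_repr(answer).strip()
    return json.dumps({"text": f"Question: {q}\nAnswer: {a}"})


if __name__ == "__main__":
    raw = {
        "question": "b'Total of 0.06 and -1977321735.\\n'",
        "answer": "b'-1977321734.94\\n'",
    }
    print(to_jsonl_line(raw["question"], raw["answer"]))
```

ast.literal_eval is used rather than eval because it only accepts literals, so a malformed or unexpected field cannot execute code.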

Filters Applied:

No filtering was applied to DM Math

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
DM Math	112559888	0.00%	0.00%	0.00%	0.00%	100.00%

DM Math Filtering Examples

Data sample: 3 of 9

Raw format

{
    "question": "b'Total of 0.06 and -1977321735.\\n'",
    "answer": "b'-1977321734.94\\n'"
}

Extracted format

{
    "text": "Question: b'Total of 0.06 and -1977321735.\\n'\\nAnswer: b'-1977321734.94\\n'"
}

PG-19

A collection of books from Project Gutenberg, a digital library of public domain works. The dataset contains the books that were published before 1919.

Download and Extraction: The dataset was downloaded directly from Huggingface: https://huggingface.co/datasets/deepmind/pg19.

Unique Data Preparation Challenges:

The original books use a lot of whitespace to format the text, similar to the case of FreeLaw. Sometimes 10+ consecutive whitespaces were found. These runs were reduced to a single whitespace.
For similar reasons, consecutive new lines were found in some documents. All runs of more than two consecutive new lines were reduced to two new lines.
The books are formatted with end-of-line hyphenation, which breaks a single word across two lines. Hence a regular word such as text could become te-\nxt. We detect the combination of -\n and heuristically remove it to restore the original word.
Text delimiters such as * * * * * * * * were used to indicate structures like sections. We removed such known delimiters and replaced them with proper whitespaces and new lines. For others, we make sure there are no additional leading or trailing whitespaces.
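The cleanup steps above can be approximated with a few regular expressions. The sketch below is illustrative only: the function name clean_book_text, the exact patterns, and the order of the steps are assumptions, and the real pipeline may handle more delimiter variants and edge cases.

```python
import re


def clean_book_text(text):
    """Approximate the PG-19 whitespace cleanup described in this post."""
    # Rejoin words split by end-of-line hyphenation: "te-\nxt" -> "text".
    # Heuristic: a genuinely hyphenated word broken at the hyphen is also merged.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop delimiter lines made only of spaced asterisks, e.g. "* * * * *".
    text = re.sub(r"^[ \t]*\*(?:[ \t]*\*)+[ \t]*$", "", text, flags=re.MULTILINE)
    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Reduce runs of more than two new lines to exactly two.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text


if __name__ == "__main__":
    sample = "A te-\nxt    with     gaps.\n\n\n\n* * * * *\n\n\nNext part."
    print(clean_book_text(sample))
```

Running the hyphenation step first keeps the later whitespace passes from interfering with the -\n pattern, and removing delimiter lines before collapsing new lines ensures the gap they leave behind is also reduced to two new lines.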

Filters Applied:

Language Filter: English
Minimum Word Count Filter: 20
Unigram Log Probability: -20

Dataset	Lines Downloaded	Percent Removed After Language Filter	Percent Removed After Min Word Count Filter	Percent Removed After Unigram Probability Filter	Percent Removed After Local Dedup	Total Percentage Remaining
PG-19	28752	0.24%	0.00%	0.17%	0.80%	98.78%

PG-19 Filtering Examples

Data sample: 0 of 9

{
    "short_book_title": "Walking by Henry David Thoreau",
    "publication_date": 1862,
    "url": "http://www.gutenberg.org/ebooks/1022",
    "text": "\n\n\n\nProduced by Q Myers\n\n\n\n\n\nWALKING\n\nby Henry David Thoreau\n\n\nI wish to speak a word for Nature, for absolute freedom and wildness, as\ncontrasted with a freedom and culture merely civil--to regard man as\nan inhabitant, or a part and parcel of Nature, rather than a member\nof society. I wish to make an extreme statement, if so I may make\nan emphatic one, for there are enough champions of civilization: the\nminister and the school committee and every one of you will take care of\nthat.\n\n\n\nI have met with but one or two persons in the course of my life who\nunderstood the art of Walking, that is, of taking walks--who had a\ngenius, so to speak, for SAUNTERING, which word is beautifully derived\n\"from idle people who roved about the country, in the Middle Ages, and\nasked charity, under pretense of going a la Sainte Terre,\" to the Holy\nLand, till the children exclaimed, \"There goes a Sainte-Terrer,\" a\nSaunterer, a Holy-Lander. They who never go to the Holy Land in their\nwalks, as they pretend, are indeed mere idlers and vagabonds; but they\nwho do go there are saunterers in the good sense, such as I mean. Some,\nhowever, would derive the word from sans terre without land or a home,\nwhich, therefore, in the good sense, will mean, having no particular\nhome, but equally at home everywhere. For this is the secret of\nsuccessful sauntering. He who sits still in a house all the time may be\nthe greatest vagrant of all; but the saunterer, in the good sense, is\nno more vagrant than the meandering river, which is all the while\nsedulously seeking the shortest course to the sea. But I prefer the\nfirst, which, indeed, is the most probable derivation. For every walk is\na sort of crusade, preached by some Peter the Hermit in us, to go forth\nand reconquer this Holy Land from the hands of the Infidels.\n\nIt is true, we are but faint-hearted crusaders, even the walkers,\nnowadays, who undertake no persevering, never-ending enterprises. 
Our\nexpeditions are but tours, and come round again at evening to the old\nhearth-side from which we set out. Half the walk is but retracing our\nsteps. We should go forth on the shortest walk, perchance, in the\nspirit of undying adventure, never to return--prepared to send back\nour embalmed hearts only as relics to our desolate kingdoms. If you are\nready to leave father and mother, and brother and sister, and wife\nand child and friends, and never see them again--if you have paid your\ndebts, and made your will, and settled all your affairs, and are a free\nman--then you are ready for a walk.\n\nTo come down to my own experience, my companion and I, for I sometimes\nhave a companion, take pleasure in fancying ourselves knights of a new,\nor rather an old, order--not Equestrians or Chevaliers, not Ritters or\nRiders, but Walkers, a still more ancient and honorable class, I trust.\nThe Chivalric and heroic spirit which once belonged to the Rider seems\nnow to reside in, or perchance to have subsided into, the Walker--not\nthe Knight, but Walker, Errant. He is a sort of fourth estate, outside\nof Church and State and People.\n\nWe have felt that we almost alone hereabouts practiced this noble art;\nthough, to tell the truth, at least if their own assertions are to be\nreceived, most of my townsmen would fain walk sometimes, as I do, but\nthey cannot. No wealth can buy the requisite leisure, freedom, and\nindependence which are the capital in this profession. It comes only\nby the grace of God. It requires a direct dispensation from Heaven\nto become a walker. You must be born into the family of the Walkers.\nAmbulator nascitur, non fit. 
Some of my townsmen, it is true, can\nremember and have described to me some walks which they took ten years\nago, in which they were so blessed as to lose themselves for half\nan hour in the woods; but I know very well that they have confined\nthemselves to the highway ever since, whatever pretensions they may make\nto belong to this select class. No doubt they were elevated for a moment\nas by the reminiscence of a previous state of existence, when even they\nwere foresters and outlaws.\n\n         \"When he came to grene wode,\n            In a mery mornynge,\n          There he herde the notes small\n            Of byrdes mery syngynge.\n\n         \"It is ferre gone, sayd Robyn,\n            That I was last here;\n          Me Lyste a lytell for to shote\n            At the donne dere.\"\n\nI think that I cannot preserve my health and spirits, unless I spend\nfour hours a day at least--and it is commonly more than that--sauntering\nthrough the woods and over the hills and fields, absolutely free from\nall worldly engagements. You may safely say, A penny for your thoughts,\nor a thousand pounds. 
When sometimes I am reminded that the mechanics\nand shopkeepers stay in their shops not only all the forenoon, but all\nthe afternoon too, sitting with crossed legs, so many of them--as if the\nlegs were made to sit upon, and not to stand or walk upon--I think that\nthey deserve some credit for not having all committed suicide long ago.\n\nI, who cannot stay in my chamber for a single day without acquiring some\nrust, and when sometimes I have stolen forth for a walk at the eleventh\nhour, or four o'clock in the afternoon, too late to redeem the day,\nwhen the shades of night were already beginning to be mingled with the\ndaylight, have felt as if I had committed some sin to be atoned for,--I\nconfess that I am astonished at the power of endurance, to say nothing\nof the moral insensibility, of my neighbors who confine themselves to\nshops and offices the whole day for weeks and months, aye, and years\nalmost together. I know not what manner of stuff they are of--sitting\nthere now at three o'clock in the afternoon, as if it were three o'clock\nin the morning. Bonaparte may talk of the three-o'clock-in-the-morning\ncourage, but it is nothing to the courage which can sit down cheerfully\nat this hour in the afternoon over against one's self whom you have\nknown all the morning, to starve out a garrison to whom you are bound\nby such strong ties of sympathy. I wonder that about this time, or say\nbetween four and five o'clock in the afternoon, too late for the morning\npapers and too early for the evening ones, there is not a general\nexplosion heard up and down the street, scattering a legion of\nantiquated and house-bred notions and whims to the four winds for an\nairing-and so the evil cure itself.\n\nHow womankind, who are confined to the house still more than men, stand\nit I do not know; but I have ground to suspect that most of them do not\nSTAND it at all. 
When, early in a summer afternoon, we have been shaking\nthe dust of the village from the skirts of our garments, making haste\npast those houses with purely Doric or Gothic fronts, which have such\nan air of repose about them, my companion whispers that probably about\nthese times their occupants are all gone to bed. Then it is that I\nappreciate the beauty and the glory of architecture, which itself never\nturns in, but forever stands out and erect, keeping watch over the\nslumberers.\n\nNo doubt temperament, and, above all, age, have a good deal to do with\nit. As a man grows older, his ability to sit still and follow indoor\noccupations increases. He grows vespertinal in his habits as the\nevening of life approaches, till at last he comes forth only just before\nsundown, and gets all the walk that he requires in half an hour.\n\nBut the walking of which I speak has nothing in it akin to taking\nexercise, as it is called, as the sick take medicine at stated hours--as\nthe Swinging of dumb-bells or chairs; but is itself the enterprise and\nadventure of the day. If you would get exercise, go in search of the\nsprings of life. Think of a man's swinging dumbbells for his health,\nwhen those springs are bubbling up in far-off pastures unsought by him!\n\nMoreover, you must walk like a camel, which is said to be the only beast\nwhich ruminates when walking. When a traveler asked Wordsworth's servant\nto show him her master's study, she answered, \"Here is his library, but\nhis study is out of doors.\"\n\nLiving much out of doors, in the sun and wind, will no doubt produce\na certain roughness of character--will cause a thicker cuticle to grow\nover some of the finer qualities of our nature, as on the face and\nhands, or as severe manual labor robs the hands of some of their\ndelicacy of touch. So staying in the house, on the other hand, may\nproduce a softness and smoothness, not to say thinness of skin,\naccompanied by an increased sensibility to certain impressions. 
Perhaps\nwe should be more susceptible to some influences important to our\nintellectual and moral growth, if the sun had shone and the wind blown\non us a little less; and no doubt it is a nice matter to proportion\nrightly the thick and thin skin. But methinks that is a scurf that will\nfall off fast enough--that the natural remedy is to be found in the\nproportion which the night bears to the day, the winter to the summer,\nthought to experience. There will be so much the more air and sunshine\nin our thoughts. The callous palms of the laborer are conversant with\nfiner tissues of self-respect and heroism, whose touch thrills the\nheart, than the languid fingers of idleness. That is mere sentimentality\nthat lies abed by day and thinks itself white, far from the tan and\ncallus of experience.\n\nWhen we walk, we naturally go to the fields and woods: what would become\nof us, if we walked only in a garden or a mall? Even some sects\nof philosophers have felt the necessity of importing the woods to\nthemselves, since they did not go to the woods. \"They planted groves and\nwalks of Platanes,\" where they took subdiales ambulationes in porticos\nopen to the air. Of course it is of no use to direct our steps to the\nwoods, if they do not carry us thither. I am alarmed when it happens\nthat I have walked a mile into the woods bodily, without getting there\nin spirit. In my afternoon walk I would fain forget all my morning\noccupations and my obligations to Society. But it sometimes happens that\nI cannot easily shake off the village. The thought of some work will run\nin my head and I am not where my body is--I am out of my senses. In\nmy walks I would fain return to my senses. What business have I in the\nwoods, if I am thinking of something out of the woods? 
I suspect myself,\nand cannot help a shudder when I find myself so implicated even in what\nare called good works--for this may sometimes happen.\n\nMy vicinity affords many good walks; and though for so many years I have\nwalked almost every day, and sometimes for several days together, I have\nnot yet exhausted them. An absolutely new prospect is a great happiness,\nand I can still get this any afternoon. Two or three hours' walking\nwill carry me to as strange a country as I expect ever to see. A single\nfarmhouse which I had not seen before is sometimes as good as the\ndominions of the King of Dahomey. There is in fact a sort of harmony\ndiscoverable between the capabilities of the landscape within a circle\nof ten miles' radius, or the limits of an afternoon walk, and the\nthreescore years and ten of human life. It will never become quite\nfamiliar to you.\n\nNowadays almost all man's improvements, so called, as the building of\nhouses and the cutting down of the forest and of all large trees, simply\ndeform the landscape, and make it more and more tame and cheap. A people\nwho would begin by burning the fences and let the forest stand! I saw\nthe fences half consumed, their ends lost in the middle of the prairie,\nand some worldly miser with a surveyor looking after his bounds, while\nheaven had taken place around him, and he did not see the angels\ngoing to and fro, but was looking for an old post-hole in the midst of\nparadise. I looked again, and saw him standing in the middle of a boggy\nStygian fen, surrounded by devils, and he had found his bounds without\na doubt, three little stones, where a stake had been driven, and looking\nnearer, I saw that the Prince of Darkness was his surveyor.\n\nI can easily walk ten, fifteen, twenty, any number of miles, commencing\nat my own door, without going by any house, without crossing a road\nexcept where the fox and the mink do: first along by the river, and then\nthe brook, and then the meadow and the woodside. 
There are square miles\nin my vicinity which have no inhabitant. From many a hill I can see\ncivilization and the abodes of man afar. The farmers and their works\nare scarcely more obvious than woodchucks and their burrows. Man and\nhis affairs, church and state and school, trade and commerce, and\nmanufactures and agriculture even politics, the most alarming of them\nall--I am pleased to see how little space they occupy in the landscape.\nPolitics is but a narrow field, and that still narrower highway yonder\nleads to it. I sometimes direct the traveler thither. If you would go to\nthe political world, follow the great road--follow that market-man, keep\nhis dust in your eyes, and it will lead you straight to it; for it, too,\nhas its place merely, and does not occupy all space. I pass from it as\nfrom a bean field into the forest, and it is forgotten. In one half-hour\nI can walk off to some portion of the earth's surface where a man does\nnot stand from one year's end to another, and there, consequently,\npolitics are not, for they are but as the cigar-smoke of a man.\n\nThe village is the place to which the roads tend, a sort of expansion of\nthe highway, as a lake of a river. It is the body of which roads are\nthe arms and legs--a trivial or quadrivial place, the thoroughfare and\nordinary of travelers. The word is from the Latin villa which together\nwith via, a way, or more anciently ved and vella, Varro derives from\nveho, to carry, because the villa is the place to and from which things\nare carried. They who got their living by teaming were said vellaturam\nfacere. Hence, too, the Latin word vilis and our vile, also villain.\nThis suggests what kind of degeneracy villagers are liable to. They\nare wayworn by the travel that goes by and over them, without traveling\nthemselves.\n\nSome do not walk at all; others walk in the highways; a few walk across\nlots. Roads are made for horses and men of business. 
I do not travel\nin them much, comparatively, because I am not in a hurry to get to any\ntavern or grocery or livery-stable or depot to which they lead. I am\na good horse to travel, but not from choice a roadster. The\nlandscape-painter uses the figures of men to mark a road. He would not\nmake that use of my figure. I walk out into a nature such as the old\nprophets and poets, Menu, Moses, Homer, Chaucer, walked in. You may\nname it America, but it is not America; neither Americus Vespueius,\nnor Columbus, nor the rest were the discoverers of it. There is a truer\namount of it in mythology than in any history of America, so called,\nthat I have seen.\n\nHowever, there are a few old roads that may be trodden with profit, as\nif they led somewhere now that they are nearly discontinued. There\nis the Old Marlborough Road, which does not go to Marlborough now,\nme-thinks, unless that is Marlborough where it carries me. I am the\nbolder to speak of it here, because I presume that there are one or two\nsuch roads in every town.\n\n\n\n       THE OLD MARLBOROUGH ROAD\n\n        Where they once dug for money,\n        But never found any;\n        Where sometimes Martial Miles\n        Singly files,\n        And Elijah Wood,\n        I fear for no good:\n        No other man,\n        Save Elisha Dugan--\n        O man of wild habits,\n        Partridges and rabbits\n        Who hast no cares\n        Only to set snares,\n        Who liv'st all alone,\n        Close to the bone\n        And where life is sweetest\n        Constantly eatest.\n     When the spring stirs my blood\n      With the instinct to travel,\n      I can get enough gravel\n     On the Old Marlborough Road.\n        Nobody repairs it,\n        For nobody wears it;\n        It is a living way,\n        As the Christians say.\n     Not many there be\n      Who enter therein,\n     Only the guests of the\n      Irishman Quin.\n     What is it, what is it\n      But a direction out there,\n     And the 
bare possibility\n        Of going somewhere?\n        Great guide-boards of stone,\n        But travelers none;\n        Cenotaphs of the towns\n        Named on their crowns.\n        It is worth going to see\n\n        Where you MIGHT be.\n        What king\n        Did the thing,\n        I am still wondering;\n        Set up how or when,\n        By what selectmen,\n        Gourgas or Lee,\n        Clark or Darby?\n        They're a great endeavor\n        To be something forever;\n        Blank tablets of stone,\n        Where a traveler might groan,\n        And in one sentence\n        Grave all that is known\n        Which another might read,\n        In his extreme need.\n        I know one or two\n        Lines that would do,\n        Literature that might stand\n        All over the land\n        Which a man could remember\n        Till next December,\n        And read again in the spring,\n        After the thawing.\n     If with fancy unfurled\n      You leave your abode,\n     You may go round the world\n      By the Old Marlborough Road.\n\nAt present, in this vicinity, the best part of the land is not private\nproperty; the landscape is not owned, and the walker enjoys comparative\nfreedom. But possibly the day will come when it will be partitioned off\ninto so-called pleasure-grounds, in which a few will take a narrow and\nexclusive pleasure only--when fences shall be multiplied, and man-traps\nand other engines invented to confine men to the PUBLIC road, and\nwalking over the surface of God's earth shall be construed to mean\ntrespassing on some gentleman's grounds. To enjoy a thing exclusively\nis commonly to exclude yourself from the true enjoyment of it. Let us\nimprove our opportunities, then, before the evil days come.\n\n\n\nWhat is it that makes it so hard sometimes to determine whither we will\nwalk? I believe that there is a subtle magnetism in Nature, which, if we\nunconsciously yield to it, will direct us aright. 
It is not indifferent\nto us which way we walk. There is a right way; but we are very liable\nfrom heedlessness and stupidity to take the wrong one. We would fain\ntake that walk, never yet taken by us through this actual world, which\nis perfectly symbolical of the path which we love to travel in the\ninterior and ideal world; and sometimes, no doubt, we find it difficult\nto choose our direction, because it does not yet exist distinctly in our\nidea.\n\nWhen I go out of the house for a walk, uncertain as yet whither I will\nbend my steps, and submit myself to my instinct to decide for me,\nI find, strange and whimsical as it may seem, that I finally and\ninevitably settle southwest, toward some particular wood or meadow\nor deserted pasture or hill in that direction. My needle is slow to\nsettle,--varies a few degrees, and does not always point due southwest,\nit is true, and it has good authority for this variation, but it always\nsettles between west and south-southwest. The future lies that way to\nme, and the earth seems more unexhausted and richer on that side.\nThe outline which would bound my walks would be, not a circle, but a\nparabola, or rather like one of those cometary orbits which have been\nthought to be non-returning curves, in this case opening westward, in\nwhich my house occupies the place of the sun. I turn round and round\nirresolute sometimes for a quarter of an hour, until I decide, for a\nthousandth time, that I will walk into the southwest or west. Eastward I\ngo only by force; but westward I go free. Thither no business leads\nme. It is hard for me to believe that I shall find fair landscapes or\nsufficient wildness and freedom behind the eastern horizon. I am not\nexcited by the prospect of a walk thither; but I believe that the forest\nwhich I see in the western horizon stretches uninterruptedly toward\nthe setting sun, and there are no towns nor cities in it of enough\nconsequence to disturb me. 
Let me live where I will, on this side is the\ncity, on that the wilderness, and ever I am leaving the city more and\nmore, and withdrawing into the wilderness. I should not lay so much\nstress on this fact, if I did not believe that something like this is\nthe prevailing tendency of my countrymen. I must walk toward Oregon, and\nnot toward Europe. And that way the nation is moving, and I may say that\nmankind progress from east to west. Within a few years we have witnessed\nthe phenomenon of a southeastward migration, in the settlement of\nAustralia; but this affects us as a retrograde movement, and, judging\nfrom the moral and physical character of the first generation of\nAustralians, has not yet proved a successful experiment. The eastern\nTartars think that there is nothing west beyond Thibet. \"The world ends\nthere,\" say they; \"beyond there is nothing but a shoreless sea.\" It is\nunmitigated East where they live.\n\nWe go eastward to realize history and study the works of art and\nliterature, retracing the steps of the race; we go westward as into the\nfuture, with a spirit of enterprise and adventure. The Atlantic is a\nLethean stream, in our passage over which we have had an opportunity\nto forget the Old World and its institutions. 
If we do not succeed\nthis time, there is perhaps one more chance for the race left before\nit arrives on the banks of the Styx; and that is in the Lethe of the\nPacific, which is three times as wide.\n\nI know not how significant it is, or how far it is an evidence of\nsingularity, that an individual should thus consent in his pettiest walk\nwith the general movement of the race; but I know that something akin\nto the migratory instinct in birds and quadrupeds--which, in some\ninstances, is known to have affected the squirrel tribe, impelling them\nto a general and mysterious movement, in which they were seen, say some,\ncrossing the broadest rivers, each on its particular chip, with its tail\nraised for a sail, and bridging narrower streams with their dead--that\nsomething like the furor which affects the domestic cattle in the\nspring, and which is referred to a worm in their tails,--affects both\nnations and individuals, either perennially or from time to time. Not\na flock of wild geese cackles over our town, but it to some extent\nunsettles the value of real estate here, and, if I were a broker, I\nshould probably take that disturbance into account.\n\n   \"Than longen folk to gon on pilgrimages,\n   And palmeres for to seken strange strondes.\"\n\nEvery sunset which I witness inspires me with the desire to go to a West\nas distant and as fair as that into which the sun goes down. He appears\nto migrate westward daily, and tempt us to follow him. He is the Great\nWestern Pioneer whom the nations follow. We dream all night of those\nmountain-ridges in the horizon, though they may be of vapor only, which\nwere last gilded by his rays. The island of Atlantis, and the islands\nand gardens of the Hesperides, a sort of terrestrial paradise, appear\nto have been the Great West of the ancients, enveloped in mystery and\npoetry. 
Who has not seen in imagination, when looking into the sunset\nsky, the gardens of the Hesperides, and the foundation of all those\nfables?\n\nColumbus felt the westward tendency more strongly than any before. He\nobeyed it, and found a New World for Castile and Leon. The herd of men\nin those days scented fresh pastures from afar,\n\n  \"And now the sun had stretched out all the hills,\n  And now was dropped into the western bay;\n  At last HE rose, and twitched his mantle blue;\n  Tomorrow to fresh woods and pastures new.\"\n\nWhere on the globe can there be found an area of equal extent with that\noccupied by the bulk of our States, so fertile and so rich and varied in\nits productions, and at the same time so habitable by the European, as\nthis is? Michaux, who knew but part of them, says that \"the species of\nlarge trees are much more numerous in North America than in Europe; in\nthe United States there are more than one hundred and forty species that\nexceed thirty feet in height; in France there are but thirty that attain\nthis size.\" Later botanists more than confirm his observations. Humboldt\ncame to America to realize his youthful dreams of a tropical vegetation,\nand he beheld it in its greatest perfection in the primitive forests of\nthe Amazon, the most gigantic wilderness on the earth, which he has so\neloquently described. The geographer Guyot, himself a European, goes\nfarther--farther than I am ready to follow him; yet not when he says:\n\"As the plant is made for the animal, as the vegetable world is made for\nthe animal world, America is made for the man of the Old World.... The\nman of the Old World sets out upon his way. Leaving the highlands of\nAsia, he descends from station to station towards Europe. Each of his\nsteps is marked by a new civilization superior to the preceding, by a\ngreater power of development. 
Arrived at the Atlantic, he pauses on the\nshore of this unknown ocean, the bounds of which he knows not, and turns\nupon his footprints for an instant.\" When he has exhausted the rich soil\nof Europe, and reinvigorated himself, \"then recommences his adventurous\ncareer westward as in the earliest ages.\" So far Guyot.\n\nFrom this western impulse coming in contact with the barrier of the\nAtlantic sprang the commerce and enterprise of modern times. The younger\nMichaux, in his Travels West of the Alleghanies in 1802, says that the\ncommon inquiry in the newly settled West was, \"'From what part of\nthe world have you come?' As if these vast and fertile regions would\nnaturally be the place of meeting and common country of all the\ninhabitants of the globe.\"\n\nTo use an obsolete Latin word, I might say, Ex Oriente lux; ex Occidente\nFRUX. From the East light; from the West fruit.\n\nSir Francis Head, an English traveler and a Governor-General of Canada,\ntells us that \"in both the northern and southern hemispheres of the New\nWorld, Nature has not only outlined her works on a larger scale, but has\npainted the whole picture with brighter and more costly colors than she\nused in delineating and in beautifying the Old World.... 
The heavens of\nAmerica appear infinitely higher, the sky is bluer, the air is fresher,\nthe cold is intenser, the moon looks larger, the stars are brighter, the\nthunder is louder, the lightning is vivider, the wind is stronger,\nthe rain is heavier, the mountains are higher, the rivers longer, the\nforests bigger, the plains broader.\" This statement will do at least\nto set against Buffon's account of this part of the world and its\nproductions.\n\nLinnaeus said long ago, \"Nescio quae facies laeta, glabra plantis\nAmericanis\" (I know not what there is of joyous and smooth in the aspect\nof American plants); and I think that in this country there are no,\nor at most very few, Africanae bestiae, African beasts, as the Romans\ncalled them, and that in this respect also it is peculiarly fitted for\nthe habitation of man. We are told that within three miles of the\ncenter of the East-Indian city of Singapore, some of the inhabitants\nare annually carried off by tigers; but the traveler can lie down in\nthe woods at night almost anywhere in North America without fear of wild\nbeasts.\n\nThese are encouraging testimonies. If the moon looks larger here than\nin Europe, probably the sun looks larger also. If the heavens of America\nappear infinitely higher, and the stars brighter, I trust that these\nfacts are symbolical of the height to which the philosophy and poetry\nand religion of her inhabitants may one day soar. At length, perchance,\nthe immaterial heaven will appear as much higher to the American mind,\nand the intimations that star it as much brighter. For I believe that\nclimate does thus react on man--as there is something in the mountain\nair that feeds the spirit and inspires. Will not man grow to greater\nperfection intellectually as well as physically under these influences?\nOr is it unimportant how many foggy days there are in his life? 
I trust\nthat we shall be more imaginative, that our thoughts will be clearer,\nfresher, and more ethereal, as our sky--our understanding more\ncomprehensive and broader, like our plains--our intellect generally on a\ngrander scale, like our thunder and lightning, our rivers and mountains\nand forests--and our hearts shall even correspond in breadth and depth\nand grandeur to our inland seas. Perchance there will appear to the\ntraveler something, he knows not what, of laeta and glabra, of joyous\nand serene, in our very faces. Else to what end does the world go on,\nand why was America discovered?\n\nTo Americans I hardly need to say--\n\n\"Westward the star of empire takes its way.\"\n\nAs a true patriot, I should be ashamed to think that Adam in paradise\nwas more favorably situated on the whole than the backwoodsman in this\ncountry.\n\nOur sympathies in Massachusetts are not confined to New England; though\nwe may be estranged from the South, we sympathize with the West. There\nis the home of the younger sons, as among the Scandinavians they took to\nthe sea for their inheritance. It is too late to be studying Hebrew; it\nis more important to understand even the slang of today.\n\nSome months ago I went to see a panorama of the Rhine. It was like\na dream of the Middle Ages. I floated down its historic stream in\nsomething more than imagination, under bridges built by the Romans, and\nrepaired by later heroes, past cities and castles whose very names were\nmusic to my ears, and each of which was the subject of a legend. There\nwere Ehrenbreitstein and Rolandseck and Coblentz, which I knew only in\nhistory. They were ruins that interested me chiefly. There seemed to\ncome up from its waters and its vine-clad hills and valleys a hushed\nmusic as of Crusaders departing for the Holy Land. 
I floated along under\nthe spell of enchantment, as if I had been transported to an heroic age,\nand breathed an atmosphere of chivalry.\n\nSoon after, I went to see a panorama of the Mississippi, and as I\nworked my way up the river in the light of today, and saw the steamboats\nwooding up, counted the rising cities, gazed on the fresh ruins of\nNauvoo, beheld the Indians moving west across the stream, and, as before\nI had looked up the Moselle, now looked up the Ohio and the Missouri and\nheard the legends of Dubuque and of Wenona's Cliff--still thinking more\nof the future than of the past or present--I saw that this was a Rhine\nstream of a different kind; that the foundations of castles were yet to\nbe laid, and the famous bridges were yet to be thrown over the river;\nand I felt that THIS WAS THE HEROIC AGE ITSELF, though we know it not,\nfor the hero is commonly the simplest and obscurest of men.\n\nThe West of which I speak is but another name for the Wild; and what I\nhave been preparing to say is, that in Wildness is the preservation of\nthe World. Every tree sends its fibers forth in search of the Wild. The\ncities import it at any price. Men plow and sail for it. From the\nforest and wilderness come the tonics and barks which brace mankind. Our\nancestors were savages. The story of Romulus and Remus being suckled by\na wolf is not a meaningless fable. The founders of every state which has\nrisen to eminence have drawn their nourishment and vigor from a similar\nwild source. It was because the children of the Empire were not suckled\nby the wolf that they were conquered and displaced by the children of\nthe northern forests who were.\n\nI believe in the forest, and in the meadow, and in the night in which\nthe corn grows. We require an infusion of hemlock, spruce or arbor\nvitae in our tea. There is a difference between eating and drinking\nfor strength and from mere gluttony. 
The Hottentots eagerly devour the\nmarrow of the koodoo and other antelopes raw, as a matter of course.\nSome of our northern Indians eat raw the marrow of the Arctic reindeer,\nas well as various other parts, including the summits of the antlers, as\nlong as they are soft. And herein, perchance, they have stolen a march\non the cooks of Paris. They get what usually goes to feed the fire. This\nis probably better than stall-fed beef and slaughterhouse pork to make\na man of. Give me a wildness whose glance no civilization can endure--as\nif we lived on the marrow of koodoos devoured raw.\n\nThere are some intervals which border the strain of the wood thrush,\nto which I would migrate--wild lands where no settler has squatted; to\nwhich, methinks, I am already acclimated.\n\nThe African hunter Cumming tells us that the skin of the eland, as well\nas that of most other antelopes just killed, emits the most delicious\nperfume of trees and grass. I would have every man so much like a wild\nantelope, so much a part and parcel of nature, that his very person\nshould thus sweetly advertise our senses of his presence, and remind us\nof those parts of nature which he most haunts. I feel no disposition to\nbe satirical, when the trapper's coat emits the odor of musquash even;\nit is a sweeter scent to me than that which commonly exhales from the\nmerchant's or the scholar's garments. When I go into their wardrobes and\nhandle their vestments, I am reminded of no grassy plains and flowery\nmeads which they have frequented, but of dusty merchants' exchanges and\nlibraries rather.\n\nA tanned skin is something more than respectable, and perhaps olive is\na fitter color than white for a man--a denizen of the woods. \"The pale\nwhite man!\" I do not wonder that the African pitied him. 
Darwin the\nnaturalist says, \"A white man bathing by the side of a Tahitian was like\na plant bleached by the gardener's art, compared with a fine, dark green\none, growing vigorously in the open fields.\"\n\nBen Jonson exclaims,--\n\n \"How near to good is what is fair!\"\n\nSo I would say,--\n\n \"How near to good is what is WILD!\"\n\nLife consists with wildness. The most alive is the wildest. Not yet\nsubdued to man, its presence refreshes him. One who pressed forward\nincessantly and never rested from his labors, who grew fast and made\ninfinite demands on life, would always find himself in a new country\nor wilderness, and surrounded by the raw material of life. He would be\nclimbing over the prostrate stems of primitive forest trees.\n\nHope and the future for me are not in lawns and cultivated fields, not\nin towns and cities, but in the impervious and quaking swamps. When,\nformerly, I have analyzed my partiality for some farm which I had\ncontemplated purchasing, I have frequently found that I was attracted\nsolely by a few square rods of impermeable and unfathomable bog--a\nnatural sink in one corner of it. That was the jewel which dazzled me.\nI derive more of my subsistence from the swamps which surround my native\ntown than from the cultivated gardens in the village. There are no\nricher parterres to my eyes than the dense beds of dwarf andromeda\n(Cassandra calyculata) which cover these tender places on the earth's\nsurface. Botany cannot go farther than tell me the names of the shrubs\nwhich grow there--the high blueberry, panicled andromeda, lambkill,\nazalea, and rhodora--all standing in the quaking sphagnum. I often\nthink that I should like to have my house front on this mass of dull red\nbushes, omitting other flower plots and borders, transplanted spruce\nand trim box, even graveled walks--to have this fertile spot under my\nwindows, not a few imported barrowfuls of soil only to cover the sand\nwhich was thrown out in digging the cellar. 
Why not put my house, my\nparlor, behind this plot, instead of behind that meager assemblage of\ncuriosities, that poor apology for a Nature and Art, which I call my\nfront yard? It is an effort to clear up and make a decent appearance\nwhen the carpenter and mason have departed, though done as much for the\npasser-by as the dweller within. The most tasteful front-yard fence was\nnever an agreeable object of study to me; the most elaborate ornaments,\nacorn tops, or what not, soon wearied and disgusted me. Bring your sills\nup to the very edge of the swamp, then (though it may not be the best\nplace for a dry cellar), so that there be no access on that side to\ncitizens. Front yards are not made to walk in, but, at most, through,\nand you could go in the back way.\n\nYes, though you may think me perverse, if it were proposed to me to\ndwell in the neighborhood of the most beautiful garden that ever human\nart contrived, or else of a Dismal Swamp, I should certainly decide for\nthe swamp. How vain, then, have been all your labors, citizens, for me!\n\nMy spirits infallibly rise in proportion to the outward dreariness. Give\nme the ocean, the desert, or the wilderness! In the desert, pure air\nand solitude compensate for want of moisture and fertility. The traveler\nBurton says of it--\"Your MORALE improves; you become frank and cordial,\nhospitable and single-minded.... In the desert, spirituous liquors\nexcite only disgust. There is a keen enjoyment in a mere animal\nexistence.\" They who have been traveling long on the steppes of Tartary\nsay, \"On re-entering cultivated lands, the agitation, perplexity, and\nturmoil of civilization oppressed and suffocated us; the air seemed to\nfail us, and we felt every moment as if about to die of asphyxia.\" When\nI would recreate myself, I seek the darkest woods, the thickest and most\ninterminable and, to the citizen, most dismal, swamp. I enter a swamp as\na sacred place,--a sanctum sanctorum. 
There is the strength, the marrow,\nof Nature. The wildwood covers the virgin mould,--and the same soil is\ngood for men and for trees. A man's health requires as many acres of\nmeadow to his prospect as his farm does loads of muck. There are\nthe strong meats on which he feeds. A town is saved, not more by the\nrighteous men in it than by the woods and swamps that surround it. A\ntownship where one primitive forest waves above while another primitive\nforest rots below--such a town is fitted to raise not only corn and\npotatoes, but poets and philosophers for the coming ages. In such a\nsoil grew Homer and Confucius and the rest, and out of such a wilderness\ncomes the Reformer eating locusts and wild honey.\n\nTo preserve wild animals implies generally the creation of a forest for\nthem to dwell in or resort to. So it is with man. A hundred years ago\nthey sold bark in our streets peeled from our own woods. In the very\naspect of those primitive and rugged trees there was, methinks, a\ntanning principle which hardened and consolidated the fibers of men's\nthoughts. Ah! already I shudder for these comparatively degenerate days\nof my native village, when you cannot collect a load of bark of good\nthickness, and we no longer produce tar and turpentine.\n\nThe civilized nations--Greece, Rome, England--have been sustained by the\nprimitive forests which anciently rotted where they stand. They survive\nas long as the soil is not exhausted. Alas for human culture! little is\nto be expected of a nation, when the vegetable mould is exhausted, and\nit is compelled to make manure of the bones of its fathers. 
There\nthe poet sustains himself merely by his own superfluous fat, and the\nphilosopher comes down on his marrow-bones.\n\nIt is said to be the task of the American \"to work the virgin soil,\" and\nthat \"agriculture here already assumes proportions unknown everywhere\nelse.\" I think that the farmer displaces the Indian even because he\nredeems the meadow, and so makes himself stronger and in some respects\nmore natural. I was surveying for a man the other day a single straight\nline one hundred and thirty-two rods long, through a swamp at whose\nentrance might have been written the words which Dante read over the\nentrance to the infernal regions,--\"Leave all hope, ye that enter\"--that\nis, of ever getting out again; where at one time I saw my employer\nactually up to his neck and swimming for his life in his property,\nthough it was still winter. He had another similar swamp which I\ncould not survey at all, because it was completely under water, and\nnevertheless, with regard to a third swamp, which I did SURVEY from a\ndistance, he remarked to me, true to his instincts, that he would not\npart with it for any consideration, on account of the mud which it\ncontained. And that man intends to put a girdling ditch round the whole\nin the course of forty months, and so redeem it by the magic of his\nspade. I refer to him only as the type of a class.\n\nThe weapons with which we have gained our most important victories,\nwhich should be handed down as heirlooms from father to son, are not the\nsword and the lance, but the bushwhack, the turf-cutter, the spade, and\nthe bog hoe, rusted with the blood of many a meadow, and begrimed with\nthe dust of many a hard-fought field. The very winds blew the Indian's\ncornfield into the meadow, and pointed out the way which he had not\nthe skill to follow. He had no better implement with which to intrench\nhimself in the land than a clam-shell. 
But the farmer is armed with plow\nand spade.\n\nIn literature it is only the wild that attracts us. Dullness is but\nanother name for tameness. It is the uncivilized free and wild thinking\nin Hamlet and the Iliad, in all the scriptures and mythologies, not\nlearned in the schools, that delights us. As the wild duck is more swift\nand beautiful than the tame, so is the wild--the mallard--thought, which\n'mid falling dews wings its way above the fens. A truly good book is\nsomething as natural, and as unexpectedly and unaccountably fair and\nperfect, as a wild-flower discovered on the prairies of the West or\nin the jungles of the East. Genius is a light which makes the darkness\nvisible, like the lightning's flash, which perchance shatters the temple\nof knowledge itself--and not a taper lighted at the hearthstone of the\nrace, which pales before the light of common day.\n\nEnglish literature, from the days of the minstrels to the Lake\nPoets--Chaucer and Spenser and Milton, and even Shakespeare,\nincluded--breathes no quite fresh and, in this sense, wild strain. It\nis an essentially tame and civilized literature, reflecting Greece and\nRome. Her wilderness is a green wood, her wild man a Robin Hood. There\nis plenty of genial love of Nature, but not so much of Nature herself.\nHer chronicles inform us when her wild animals, but not when the wild\nman in her, became extinct.\n\nThe science of Humboldt is one thing, poetry is another thing. The\npoet today, notwithstanding all the discoveries of science, and the\naccumulated learning of mankind, enjoys no advantage over Homer.\n\nWhere is the literature which gives expression to Nature? 
He would be a\npoet who could impress the winds and streams into his service, to speak\nfor him; who nailed words to their primitive senses, as farmers drive\ndown stakes in the spring, which the frost has heaved; who derived his\nwords as often as he used them--transplanted them to his page with earth\nadhering to their roots; whose words were so true and fresh and natural\nthat they would appear to expand like the buds at the approach of\nspring, though they lay half smothered between two musty leaves in a\nlibrary--aye, to bloom and bear fruit there, after their kind, annually,\nfor the faithful reader, in sympathy with surrounding Nature.\n\nI do not know of any poetry to quote which adequately expresses this\nyearning for the Wild. Approached from this side, the best poetry is\ntame. I do not know where to find in any literature, ancient or modern,\nany account which contents me of that Nature with which even I am\nacquainted. You will perceive that I demand something which no Augustan\nnor Elizabethan age, which no culture, in short, can give. Mythology\ncomes nearer to it than anything. How much more fertile a Nature,\nat least, has Grecian mythology its root in than English literature!\nMythology is the crop which the Old World bore before its soil was\nexhausted, before the fancy and imagination were affected with blight;\nand which it still bears, wherever its pristine vigor is unabated. All\nother literatures endure only as the elms which overshadow our houses;\nbut this is like the great dragon-tree of the Western Isles, as old as\nmankind, and, whether that does or not, will endure as long; for the\ndecay of other literatures makes the soil in which it thrives.\n\nThe West is preparing to add its fables to those of the East. The\nvalleys of the Ganges, the Nile, and the Rhine having yielded their\ncrop, it remains to be seen what the valleys of the Amazon, the Plate,\nthe Orinoco, the St. 
Lawrence, and the Mississippi will produce.\nPerchance, when, in the course of ages, American liberty has become\na fiction of the past--as it is to some extent a fiction of the\npresent--the poets of the world will be inspired by American mythology.\n\nThe wildest dreams of wild men, even, are not the less true, though they\nmay not recommend themselves to the sense which is most common among\nEnglishmen and Americans today. It is not every truth that recommends\nitself to the common sense. Nature has a place for the wild Clematis\nas well as for the cabbage. Some expressions of truth are\nreminiscent--others merely SENSIBLE, as the phrase is,--others\nprophetic. Some forms of disease, even, may prophesy forms of health.\nThe geologist has discovered that the figures of serpents, griffins,\nflying dragons, and other fanciful embellishments of heraldry, have\ntheir prototypes in the forms of fossil species which were extinct\nbefore man was created, and hence \"indicate a faint and shadowy\nknowledge of a previous state of organic existence.\" The Hindus dreamed\nthat the earth rested on an elephant, and the elephant on a tortoise,\nand the tortoise on a serpent; and though it may be an unimportant\ncoincidence, it will not be out of place here to state, that a fossil\ntortoise has lately been discovered in Asia large enough to support\nan elephant. I confess that I am partial to these wild fancies, which\ntranscend the order of time and development. They are the sublimest\nrecreation of the intellect. The partridge loves peas, but not those\nthat go with her into the pot.\n\nIn short, all good things are wild and free. There is something in\na strain of music, whether produced by an instrument or by the human\nvoice--take the sound of a bugle in a summer night, for instance--which\nby its wildness, to speak without satire, reminds me of the cries\nemitted by wild beasts in their native forests. It is so much of their\nwildness as I can understand. 
Give me for my friends and neighbors wild\nmen, not tame ones. The wildness of the savage is but a faint symbol of\nthe awful ferity with which good men and lovers meet.\n\nI love even to see the domestic animals reassert their native\nrights--any evidence that they have not wholly lost their original wild\nhabits and vigor; as when my neighbor's cow breaks out of her pasture\nearly in the spring and boldly swims the river, a cold, gray tide,\ntwenty-five or thirty rods wide, swollen by the melted snow. It is the\nbuffalo crossing the Mississippi. This exploit confers some dignity\non the herd in my eyes--already dignified. The seeds of instinct are\npreserved under the thick hides of cattle and horses, like seeds in the\nbowels of the earth, an indefinite period.\n\nAny sportiveness in cattle is unexpected. I saw one day a herd of a\ndozen bullocks and cows running about and frisking in unwieldy sport,\nlike huge rats, even like kittens. They shook their heads, raised their\ntails, and rushed up and down a hill, and I perceived by their horns, as\nwell as by their activity, their relation to the deer tribe. But, alas!\na sudden loud WHOA! would have damped their ardor at once, reduced them\nfrom venison to beef, and stiffened their sides and sinews like the\nlocomotive. Who but the Evil One has cried \"Whoa!\" to mankind?\nIndeed, the life of cattle, like that of many men, is but a sort of\nlocomotiveness; they move a side at a time, and man, by his machinery,\nis meeting the horse and the ox halfway. Whatever part the whip has\ntouched is thenceforth palsied. 
Who would ever think of a SIDE of any of\nthe supple cat tribe, as we speak of a SIDE of beef?\n\nI rejoice that horses and steers have to be broken before they can be\nmade the slaves of men, and that men themselves have some wild oats\nstill left to sow before they become submissive members of society.\nUndoubtedly, all men are not equally fit subjects for civilization;\nand because the majority, like dogs and sheep, are tame by inherited\ndisposition, this is no reason why the others should have their natures\nbroken that they may be reduced to the same level. Men are in the main\nalike, but they were made several in order that they might be various.\nIf a low use is to be served, one man will do nearly or quite as well as\nanother; if a high one, individual excellence is to be regarded. Any man\ncan stop a hole to keep the wind away, but no other man could serve so\nrare a use as the author of this illustration did. Confucius says,--\"The\nskins of the tiger and the leopard, when they are tanned, are as the\nskins of the dog and the sheep tanned.\" But it is not the part of a true\nculture to tame tigers, any more than it is to make sheep ferocious; and\ntanning their skins for shoes is not the best use to which they can be\nput.\n\n\n\nWhen looking over a list of men's names in a foreign language, as\nof military officers, or of authors who have written on a particular\nsubject, I am reminded once more that there is nothing in a name. The\nname Menschikoff, for instance, has nothing in it to my ears more human\nthan a whisker, and it may belong to a rat. As the names of the Poles\nand Russians are to us, so are ours to them. It is as if they had been\nnamed by the child's rigmarole,--IERY FIERY ICHERY VAN, TITTLE-TOL-TAN.\nI see in my mind a herd of wild creatures swarming over the earth,\nand to each the herdsman has affixed some barbarous sound in his own\ndialect. 
The names of men are, of course, as cheap and meaningless as\nBOSE and TRAY, the names of dogs.\n\nMethinks it would be some advantage to philosophy if men were named\nmerely in the gross, as they are known. It would be necessary only to\nknow the genus and perhaps the race or variety, to know the individual.\nWe are not prepared to believe that every private soldier in a Roman\narmy had a name of his own--because we have not supposed that he had a\ncharacter of his own.\n\nAt present our only true names are nicknames. I knew a boy who, from his\npeculiar energy, was called \"Buster\" by his playmates, and this rightly\nsupplanted his Christian name. Some travelers tell us that an Indian had\nno name given him at first, but earned it, and his name was his fame;\nand among some tribes he acquired a new name with every new exploit.\nIt is pitiful when a man bears a name for convenience merely, who has\nearned neither name nor fame.\n\nI will not allow mere names to make distinctions for me, but still\nsee men in herds for all them. A familiar name cannot make a man less\nstrange to me. It may be given to a savage who retains in secret his\nown wild title earned in the woods. We have a wild savage in us, and\na savage name is perchance somewhere recorded as ours. I see that my\nneighbor, who bears the familiar epithet William or Edwin, takes it off\nwith his jacket. It does not adhere to him when asleep or in anger, or\naroused by any passion or inspiration. 
I seem to hear pronounced by some\nof his kin at such a time his original wild name in some jaw-breaking or\nelse melodious tongue.\n\n\n\nHere is this vast, savage, hovering mother of ours, Nature, lying all\naround, with such beauty, and such affection for her children, as the\nleopard; and yet we are so early weaned from her breast to society, to\nthat culture which is exclusively an interaction of man on man--a sort\nof breeding in and in, which produces at most a merely English nobility,\na civilization destined to have a speedy limit.\n\nIn society, in the best institutions of men, it is easy to detect a\ncertain precocity. When we should still be growing children, we are\nalready little men. Give me a culture which imports much muck from the\nmeadows, and deepens the soil--not that which trusts to heating manures,\nand improved implements and modes of culture only!\n\nMany a poor sore-eyed student that I have heard of would grow faster,\nboth intellectually and physically, if, instead of sitting up so very\nlate, he honestly slumbered a fool's allowance.\n\nThere may be an excess even of informing light. 
Niepce, a Frenchman,\ndiscovered \"actinism,\" that power in the sun's rays which produces a\nchemical effect; that granite rocks, and stone structures, and statues\nof metal \"are all alike destructively acted upon during the hours of\nsunshine, and, but for provisions of Nature no less wonderful, would\nsoon perish under the delicate touch of the most subtle of the agencies\nof the universe.\" But he observed that \"those bodies which underwent\nthis change during the daylight possessed the power of restoring\nthemselves to their original conditions during the hours of night,\nwhen this excitement was no longer influencing them.\" Hence it has been\ninferred that \"the hours of darkness are as necessary to the inorganic\ncreation as we know night and sleep are to the organic kingdom.\" Not\neven does the moon shine every night, but gives place to darkness.\n\nI would not have every man nor every part of a man cultivated, any more\nthan I would have every acre of earth cultivated: part will be tillage,\nbut the greater part will be meadow and forest, not only serving an\nimmediate use, but preparing a mould against a distant future, by the\nannual decay of the vegetation which it supports.\n\nThere are other letters for the child to learn than those which Cadmus\ninvented. The Spaniards have a good term to express this wild and dusky\nknowledge--Gramatica parda--tawny grammar, a kind of mother-wit derived\nfrom that same leopard to which I have referred.\n\nWe have heard of a Society for the Diffusion of Useful Knowledge. It is\nsaid that knowledge is power, and the like. Methinks there is equal need\nof a Society for the Diffusion of Useful Ignorance, what we will call\nBeautiful Knowledge, a knowledge useful in a higher sense: for what\nis most of our boasted so-called knowledge but a conceit that we know\nsomething, which robs us of the advantage of our actual ignorance?\nWhat we call knowledge is often our positive ignorance; ignorance our\nnegative knowledge. 
By long years of patient industry and reading of\nthe newspapers--for what are the libraries of science but files of\nnewspapers--a man accumulates a myriad facts, lays them up in his\nmemory, and then when in some spring of his life he saunters abroad into\nthe Great Fields of thought, he, as it were, goes to grass like a horse\nand leaves all his harness behind in the stable. I would say to the\nSociety for the Diffusion of Useful Knowledge, sometimes,--Go to grass.\nYou have eaten hay long enough. The spring has come with its green crop.\nThe very cows are driven to their country pastures before the end of\nMay; though I have heard of one unnatural farmer who kept his cow in the\nbarn and fed her on hay all the year round. So, frequently, the Society\nfor the Diffusion of Useful Knowledge treats its cattle.\n\nA man's ignorance sometimes is not only useful, but beautiful--while his\nknowledge, so called, is oftentimes worse than useless, besides being\nugly. Which is the best man to deal with--he who knows nothing about a\nsubject, and, what is extremely rare, knows that he knows nothing, or he\nwho really knows something about it, but thinks that he knows all?\n\nMy desire for knowledge is intermittent, but my desire to bathe my head\nin atmospheres unknown to my feet is perennial and constant. The highest\nthat we can attain to is not Knowledge, but Sympathy with Intelligence.\nI do not know that this higher knowledge amounts to anything more\ndefinite than a novel and grand surprise on a sudden revelation of the\ninsufficiency of all that we called Knowledge before--a discovery that\nthere are more things in heaven and earth than are dreamed of in our\nphilosophy. It is the lighting up of the mist by the sun. 
Man cannot\nKNOW in any higher sense than this, any more than he can look serenely\nand with impunity in the face of the sun: \"You will not perceive that,\nas perceiving a particular thing,\" say the Chaldean Oracles.\n\nThere is something servile in the habit of seeking after a law which we\nmay obey. We may study the laws of matter at and for our convenience,\nbut a successful life knows no law. It is an unfortunate discovery\ncertainly, that of a law which binds us where we did not know before\nthat we were bound. Live free, child of the mist--and with respect to\nknowledge we are all children of the mist. The man who takes the liberty\nto live is superior to all the laws, by virtue of his relation to the\nlawmaker. \"That is active duty,\" says the Vishnu Purana, \"which is not\nfor our bondage; that is knowledge which is for our liberation: all\nother duty is good only unto weariness; all other knowledge is only the\ncleverness of an artist.\"\n\n\n\nIt is remarkable how few events or crises there are in our histories,\nhow little exercised we have been in our minds, how few experiences we\nhave had. I would fain be assured that I am growing apace and rankly,\nthough my very growth disturb this dull equanimity--though it be with\nstruggle through long, dark, muggy nights or seasons of gloom. It would\nbe well if all our lives were a divine tragedy even, instead of this\ntrivial comedy or farce. Dante, Bunyan, and others appear to have been\nexercised in their minds more than we: they were subjected to a kind of\nculture such as our district schools and colleges do not contemplate.\nEven Mahomet, though many may scream at his name, had a good deal more\nto live for, aye, and to die for, than they have commonly.\n\nWhen, at rare intervals, some thought visits one, as perchance he is\nwalking on a railroad, then, indeed, the cars go by without his hearing\nthem. 
But soon, by some inexorable law, our life goes by and the cars\nreturn.\n\n   \"Gentle breeze, that wanderest unseen,\n   And bendest the thistles round Loira of storms,\n   Traveler of the windy glens,\n   Why hast thou left my ear so soon?\"\n\nWhile almost all men feel an attraction drawing them to society, few are\nattracted strongly to Nature. In their reaction to Nature men appear\nto me for the most part, notwithstanding their arts, lower than the\nanimals. It is not often a beautiful relation, as in the case of the\nanimals. How little appreciation of the beauty of the land-scape there\nis among us! We have to be told that the Greeks called the world Beauty,\nor Order, but we do not see clearly why they did so, and we esteem it at\nbest only a curious philological fact.\n\nFor my part, I feel that with regard to Nature I live a sort of border\nlife, on the confines of a world into which I make occasional and\ntransient forays only, and my patriotism and allegiance to the state\ninto whose territories I seem to retreat are those of a moss-trooper.\nUnto a life which I call natural I would gladly follow even a\nwill-o'-the-wisp through bogs and sloughs unimaginable, but no moon nor\nfirefly has shown me the causeway to it. Nature is a personality so vast\nand universal that we have never seen one of her features. The walker in\nthe familiar fields which stretch around my native town sometimes finds\nhimself in another land than is described in their owners' deeds, as it\nwere in some faraway field on the confines of the actual Concord, where\nher jurisdiction ceases, and the idea which the word Concord suggests\nceases to be suggested. These farms which I have myself surveyed, these\nbounds which I have set up, appear dimly still as through a mist; but\nthey have no chemistry to fix them; they fade from the surface of the\nglass, and the picture which the painter painted stands out dimly from\nbeneath. 
The world with which we are commonly acquainted leaves no\ntrace, and it will have no anniversary.\n\nI took a walk on Spaulding's Farm the other afternoon. I saw the setting\nsun lighting up the opposite side of a stately pine wood. Its golden\nrays straggled into the aisles of the wood as into some noble hall. I\nwas impressed as if some ancient and altogether admirable and shining\nfamily had settled there in that part of the land called Concord,\nunknown to me--to whom the sun was servant--who had not gone into\nsociety in the village--who had not been called on. I saw their\npark, their pleasure-ground, beyond through the wood, in Spaulding's\ncranberry-meadow. The pines furnished them with gables as they grew.\nTheir house was not obvious to vision; the trees grew through it. I do\nnot know whether I heard the sounds of a suppressed hilarity or not.\nThey seemed to recline on the sunbeams. They have sons and daughters.\nThey are quite well. The farmer's cart-path, which leads directly\nthrough their hall, does not in the least put them out, as the muddy\nbottom of a pool is sometimes seen through the reflected skies.\nThey never heard of Spaulding, and do not know that he is their\nneighbor--notwithstanding I heard him whistle as he drove his team\nthrough the house. Nothing can equal the serenity of their lives. Their\ncoat-of-arms is simply a lichen. I saw it painted on the pines and oaks.\nTheir attics were in the tops of the trees. They are of no politics.\nThere was no noise of labor. I did not perceive that they were weaving\nor spinning. Yet I did detect, when the wind lulled and hearing was done\naway, the finest imaginable sweet musical hum,--as of a distant hive in\nMay, which perchance was the sound of their thinking. They had no idle\nthoughts, and no one without could see their work, for their industry\nwas not as in knots and excrescences embayed.\n\nBut I find it difficult to remember them. 
They fade irrevocably out\nof my mind even now while I speak, and endeavor to recall them and\nrecollect myself. It is only after a long and serious effort to\nrecollect my best thoughts that I become again aware of their\ncohabitancy. If it were not for such families as this, I think I should\nmove out of Concord.\n\n\n\nWe are accustomed to say in New England that few and fewer pigeons visit\nus every year. Our forests furnish no mast for them. So, it would seem,\nfew and fewer thoughts visit each growing man from year to year, for\nthe grove in our minds is laid waste--sold to feed unnecessary fires of\nambition, or sent to mill--and there is scarcely a twig left for them\nto perch on. They no longer build nor breed with us. In some more genial\nseason, perchance, a faint shadow flits across the landscape of the\nmind, cast by the WINGS of some thought in its vernal or autumnal\nmigration, but, looking up, we are unable to detect the substance of\nthe thought itself. Our winged thoughts are turned to poultry. They\nno longer soar, and they attain only to a Shanghai and Cochin-China\ngrandeur. Those GRA-A-ATE THOUGHTS, those GRA-A-ATE men you hear of!\n\nWe hug the earth--how rarely we mount! Methinks we might elevate\nourselves a little more. We might climb a tree, at least. I found my\naccount in climbing a tree once. It was a tall white pine, on the top\nof a hill; and though I got well pitched, I was well paid for it, for\nI discovered new mountains in the horizon which I had never seen\nbefore--so much more of the earth and the heavens. I might have walked\nabout the foot of the tree for threescore years and ten, and yet I\ncertainly should never have seen them. But, above all, I discovered\naround me--it was near the end of June--on the ends of the topmost\nbranches only, a few minute and delicate red conelike blossoms,\nthe fertile flower of the white pine looking heavenward. 
I carried\nstraightway to the village the topmost spire, and showed it to stranger\njurymen who walked the streets--for it was court week--and to farmers\nand lumber-dealers and woodchoppers and hunters, and not one had ever\nseen the like before, but they wondered as at a star dropped down. Tell\nof ancient architects finishing their works on the tops of columns as\nperfectly as on the lower and more visible parts! Nature has from\nthe first expanded the minute blossoms of the forest only toward the\nheavens, above men's heads and unobserved by them. We see only the\nflowers that are under our feet in the meadows. The pines have developed\ntheir delicate blossoms on the highest twigs of the wood every summer\nfor ages, as well over the heads of Nature's red children as of her\nwhite ones; yet scarcely a farmer or hunter in the land has ever seen\nthem.\n\n\n\nAbove all, we cannot afford not to live in the present. He is blessed\nover all mortals who loses no moment of the passing life in remembering\nthe past. Unless our philosophy hears the cock crow in every barnyard\nwithin our horizon, it is belated. That sound commonly reminds us\nthat we are growing rusty and antique in our employments and habits of\nthoughts. His philosophy comes down to a more recent time than ours.\nThere is something suggested by it that is a newer testament,--the\ngospel according to this moment. He has not fallen astern; he has got\nup early and kept up early, and to be where he is is to be in season,\nin the foremost rank of time. It is an expression of the health and\nsoundness of Nature, a brag for all the world,--healthiness as of a\nspring burst forth, a new fountain of the Muses, to celebrate this last\ninstant of time. Where he lives no fugitive slave laws are passed. Who\nhas not betrayed his master many times since last he heard that note?\n\nThe merit of this bird's strain is in its freedom from all\nplaintiveness. 
The singer can easily move us to tears or to laughter,\nbut where is he who can excite in us a pure morning joy? When, in\ndoleful dumps, breaking the awful stillness of our wooden sidewalk on\na Sunday, or, perchance, a watcher in the house of mourning, I hear a\ncockerel crow far or near, I think to myself, \"There is one of us well,\nat any rate,\"--and with a sudden gush return to my senses.\n\n\n\nWe had a remarkable sunset one day last November. I was walking in a\nmeadow, the source of a small brook, when the sun at last, just before\nsetting, after a cold, gray day, reached a clear stratum in the horizon,\nand the softest, brightest morning sunlight fell on the dry grass and on\nthe stems of the trees in the opposite horizon and on the leaves of the\nshrub oaks on the hillside, while our shadows stretched long over the\nmeadow east-ward, as if we were the only motes in its beams. It was such\na light as we could not have imagined a moment before, and the air also\nwas so warm and serene that nothing was wanting to make a paradise of\nthat meadow. When we reflected that this was not a solitary phenomenon,\nnever to happen again, but that it would happen forever and ever, an\ninfinite number of evenings, and cheer and reassure the latest child\nthat walked there, it was more glorious still.\n\nThe sun sets on some retired meadow, where no house is visible, with all\nthe glory and splendor that it lavishes on cities, and perchance as it\nhas never set before--where there is but a solitary marsh hawk to have\nhis wings gilded by it, or only a musquash looks out from his cabin, and\nthere is some little black-veined brook in the midst of the marsh, just\nbeginning to meander, winding slowly round a decaying stump. We walked\nin so pure and bright a light, gilding the withered grass and leaves,\nso softly and serenely bright, I thought I had never bathed in such a\ngolden flood, without a ripple or a murmur to it. 
The west side of every\nwood and rising ground gleamed like the boundary of Elysium, and the sun\non our backs seemed like a gentle herdsman driving us home at evening.\n\nSo we saunter toward the Holy Land, till one day the sun shall shine\nmore brightly than ever he has done, shall perchance shine into our\nminds and hearts, and light up our whole lives with a great awakening\nlight, as warm and serene and golden as on a bankside in autumn.\n\n\n\n\n\nEnd of the Project Gutenberg EBook of Walking, by Henry David Thoreau\n\n*** "
}

Overview of Shared Processing Steps

What This Section Contains

This section discusses all details related to the deduplication and filtering steps that were uniformly applied to all data. The section is split into the following topic areas:

Why Global Deduplication
TxT360 Deduplication Process and Implementation
Personally Identifiable Information Removal
Normalization Form C Discussion
Estimated Reading Time: 10 minutes

Why Global Deduplication

Deduplication is beneficial for LM pretraining in several ways, with the most important being controllable upsampling. With unique data, teams gain fine-grained control over the training data. Other benefits of deduplication include avoiding train-test overlap, which prevents evaluation contamination.

Duplicate data can lead to a strong double descent phenomenon, where repeated data causes test loss to increase midway through training. Deduplication also reduces the risk of memorization. By combining deduplication with selective upsampling, we gain control over the pretraining data distribution, rather than relying on the inherent distribution of the source.

To illustrate the need for deduplication, below is the distribution of near-duplicate clusters, organized into buckets of 100. The first bucket contains clusters with sizes ranging from 2 to 100, as found in the Common Crawl dataset. Some clusters even reach up to a million documents.

The example below is from one such cluster. Here most of the text is repeated with just specifics changed.

We started deduplication with 61.8 TB of filtered and compressed documents. The initial dataset had roughly 48.83 billion documents. First, we performed exact deduplication using a Bloom filter with a capacity of 1 billion and a false positive rate of 0.001. This reduced the documents from 48.83 billion to 40.21 billion, removing about 17% as exact duplicates. This step used constant memory for the Bloom filter and lessened the workload for subsequent near-deduplication.
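As a rough illustration of the exact-deduplication step, the sketch below builds a tiny Bloom filter from Python's standard library and uses it to drop repeated documents. The class and function names, the double-hashing scheme, and the toy capacity are assumptions made for illustration; they are not taken from the TxT360 code, whose filter is sized for 1 billion entries.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter; illustrative only, not the pipeline's implementation."""

    def __init__(self, capacity: int, error_rate: float) -> None:
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_if_new(self, item: str) -> bool:
        """Insert item; return True if it was (probably) not seen before."""
        positions = self._positions(item)
        is_new = any(not (self.bits[p // 8] >> (p % 8)) & 1 for p in positions)
        for p in positions:
            self.bits[p // 8] |= 1 << (p % 8)
        return is_new


def exact_dedup(docs, capacity=1000, error_rate=0.001):
    """Keep the first occurrence of each distinct text."""
    seen = BloomFilter(capacity, error_rate)
    return [doc for doc in docs if seen.add_if_new(doc)]
```

Memory stays fixed once the filter is allocated, whatever the number of documents streamed through it. The trade-off is that a false positive occasionally discards a document that was in fact unique.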

For the global near-deduplication, we employed a methodology used by prior works like SlimPajama [3] but scaled it to the entire dataset which includes 99 Common Crawl snapshots (also called “crawls”) and the curated data. The near-deduplication process involved generating signatures for every document, matching these signatures to identify near-duplicates, and then clustering the near-duplicate documents to select all but one for deletion.

We applied the following inclusion criteria for all documents:

Curated Document > Common Crawl Document
Most Recent > Less Recent
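The two criteria above can be read as a lexicographic ordering. A minimal sketch, assuming each document carries a curated flag and an integer recency value (both field names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    is_curated: bool  # True for curated sources, False for Common Crawl
    recency: int      # larger means more recent, e.g. a snapshot index (hypothetical field)


def pick_survivor(cluster):
    """Return the one document to keep from a near-duplicate cluster."""
    # Tuples compare left to right: curated beats Common Crawl, then newer beats older.
    return max(cluster, key=lambda d: (d.is_curated, d.recency))
```

Every other member of the cluster would then go into the delete set.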

Additionally, we maintained statistics about each matching cluster as it was formed during the final stage of deduplication. Below are the details of all four stages of our deduplication pipeline. We use Dask extensively throughout all stages. To give a sense of scale, we include the on-disk size of each stage's results:

MinHash Generation

We use the datasketch library to generate MinHash signatures with the number of permutations set to 128. Each document's signature is represented as a MinHash object. Before calculating the signature, the text is cleaned by stripping whitespace, converting to lowercase, and removing punctuation, consecutive spaces, newlines, and tabs. Next, a list of 13-grams is generated to use as features for creating the document signature. The globally unique document IDs and signatures are then saved to disk. The document ID is produced by an encoding scheme that converts file names and line numbers (there is one document per line) into unique document IDs. This encoding also saved considerable disk space and memory at this stage.

This step produced 20 TB of hashes.
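The cleaning and signature steps can be sketched with the standard library alone. This is a toy stand-in for datasketch, not the code used for TxT360: the seeded BLAKE2 hashes replace datasketch's permutations, and treating the 13-grams as character-level is an assumption, since the text does not say whether they are characters or words.

```python
import hashlib
import re
import string

NUM_PERM = 128  # number of permutations, as in the text
NGRAM = 13      # n-gram width, as in the text


def clean(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace runs (spaces, newlines, tabs)."""
    text = text.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text)


def features(text: str) -> set:
    """Character 13-grams of the cleaned text (character-level is an assumption)."""
    s = clean(text)
    return {s[i:i + NGRAM] for i in range(max(1, len(s) - NGRAM + 1))}


def minhash(text: str) -> list:
    """Toy MinHash: for each seeded hash function keep the minimum over all features."""
    feats = features(text)
    sig = []
    for seed in range(NUM_PERM):
        salt = seed.to_bytes(8, "big")
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(f.encode("utf-8"), digest_size=8, salt=salt).digest(), "big"
            )
            for f in feats
        ))
    return sig


def estimate_jaccard(sig_a, sig_b) -> float:
    """The fraction of agreeing signature slots estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Because cleaning happens before hashing, two documents that differ only in case, punctuation, or spacing receive identical signatures.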

Matching Pairs Generation

We are using a Jaccard similarity threshold of 0.8 to identify near-duplicate documents. To do this, we divide the MinHashes into 9 bands, each with 13 hashes (also known as the range). To save memory during matching, we first store each band of MinHashes separately on disk. We then process each band individually. Within each band, documents are matched based on their hashes, and the matches are saved as document pairs. A document is considered a match if it matches another document in any of the 9 bands. Since we are looking for near-duplicates, a document may match multiple documents across different bands.
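To make the banding concrete, the sketch below splits a signature into the 9 bands of 13 hashes described above and evaluates the standard LSH formula for the chance that two documents collide in at least one band, 1 - (1 - s^13)^9 for Jaccard similarity s. The function names are illustrative. Under this scheme 9 x 13 = 117 of the 128 signature values are consumed.

```python
NUM_BANDS = 9       # bands, as in the text
ROWS_PER_BAND = 13  # hashes per band (the "range"), as in the text


def band_keys(signature):
    """Split a MinHash signature into (band_index, tuple_of_hashes) keys."""
    needed = NUM_BANDS * ROWS_PER_BAND
    if len(signature) < needed:
        raise ValueError("signature too short for the banding scheme")
    return [
        (b, tuple(signature[b * ROWS_PER_BAND:(b + 1) * ROWS_PER_BAND]))
        for b in range(NUM_BANDS)
    ]


def candidate_probability(jaccard: float) -> float:
    """Chance that two documents with the given Jaccard similarity share a band."""
    return 1.0 - (1.0 - jaccard ** ROWS_PER_BAND) ** NUM_BANDS
```

Two documents become a candidate pair when any of their band keys are equal. The probability curve rises steeply around the chosen threshold, so pairs well below 0.8 rarely collide while near-identical pairs almost always do.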

For partitioning and matching the hashes, we utilize Dask's bag data structure to load the document ids and MinHashes. The matching process is simply a group by operation on this bag data structure. This approach allows us to group matches efficiently and distribute the operation to multiple machines. The group by produces full components (documents that share the same signature) within a band which simplifies the later stages. The algorithm can be expressed using the Dask expression below:

(
    dask.bag.from_sequence(doc_file_paths)
    .map_partitions(stream_docs)
    .groupby(lambda doc: doc["hash"])
    .map_partitions(make_doc_pairs)
    .compute()
)

This step produced 9.2 TB of matching pairs from all bands.
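The pairing logic inside a single band can be emulated without Dask. The sketch below is a guess at what a helper such as make_doc_pairs might do, not the actual implementation: it groups records by band hash and links every member of a bucket to the smallest document ID. That star layout is an assumption; it keeps the number of pairs linear in the bucket size.

```python
from collections import defaultdict


def match_band(docs):
    """docs: iterable of {"id": int, "hash": hashable} records for ONE band."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["hash"]].append(doc["id"])  # same role as groupby on doc["hash"]

    pairs = []
    for ids in groups.values():
        if len(ids) < 2:
            continue  # a bucket of one has no near-duplicate
        anchor = min(ids)
        # Star layout (an assumption): link every member to the smallest id,
        # giving n - 1 pairs per bucket instead of n * (n - 1) / 2.
        pairs.extend((anchor, other) for other in sorted(ids) if other != anchor)
    return pairs
```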

Finding Duplicate Pairs

Multiple bands can create the same document pairs, leading to duplicates. The simplest way to eliminate these duplicate pairs is to call distinct() before the compute(). However, we found that Dask is not very efficient when it comes to distributed distinct execution. Additionally, since we process each band separately, this approach wouldn’t remove duplicates across different bands.

To address this, we use a Bloom filter with a capacity of 64 billion and a false positive rate of 0.001 to remove duplicates. We parallelize the Bloom filter execution by partitioning pairs horizontally and running one filter per partition, as shown in the table below. Note: parallelizing the Bloom filter let this step finish in ~5 days, versus ~25 days had the filter run serially.

There is a high chance that duplicates from different bands will have the same pairs in the same horizontal partition. Performing the Bloom filter step reduces the number of pairs by nearly ninefold.

Bloom Filter	Band 0	Band 1	....	Band 8
BF 0	(A,B)	(A,B)	...	(A,B)
	(C,D)	(C,D)	...	(C,D)
	(E,K)	(F,K)	...	(D,E)
	(B,K)	(B,K)	...	(E,K)
BF 1	...	...	...	(B,K)
	...	...	...	...
BF 8	...	...	...	...
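The layout in the table can be sketched as follows. Each partition takes the same row slice from every band and passes it through its own filter. A Python set stands in for the Bloom filter here (so this toy version has no false positives), and the function names and slicing rule are illustrative assumptions.

```python
def dedup_partition(band_slices):
    """band_slices: the i-th horizontal slice of pairs from each band."""
    seen = set()  # stand-in for one Bloom filter
    unique = []
    for pairs in band_slices:
        for pair in pairs:
            if pair not in seen:
                seen.add(pair)
                unique.append(pair)
    return unique


def dedup_all(bands, num_partitions):
    """bands: list of per-band pair lists; each is cut into num_partitions row slices."""
    def slice_of(pairs, i):
        size = -(-len(pairs) // num_partitions)  # ceiling division
        return pairs[i * size:(i + 1) * size]

    # Each partition is independent, so these calls could run in parallel.
    return [dedup_partition([slice_of(b, i) for b in bands]) for i in range(num_partitions)]
```

A pair that lands in different slices in different bands survives more than once. That is harmless for the next stage, because repeated edges do not change the connected components of a graph.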

The resulting unique pairs are then used to identify clusters of near-duplicates by finding connected components in a graph, where the vertices represent documents and the edges represent matches.

This step produced 1.9 TB of unique pairs.

Finding Connected Components using MapReduce

The purpose of this step is to create a set of clusters of matching pairs. For example, a list of pairs (A, B), (B, C), (D, E) is merged into a list of components (A, B, C) and (D, E). Using a third-party library like NetworkX to find connected components would require all pairs to fit into the memory of a single machine, which is not feasible. Instead, we implemented a distributed connected component finder using the Dask framework, which can scale across multiple machines. The algorithm works by mapping edges by both the source and destination of pairs and reducing only edges where the source is greater than the destination. It performs successive iterations of this MapReduce computation until convergence, meaning the number of new edges produced becomes zero. In the end, every document in a cluster points to the smallest document within the cluster. Later, we compile a list of duplicate documents that need deletion and gather statistics about each component.
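The end state described above, where every document points to the smallest document in its cluster, can be reproduced on a single machine with plain minimum-label propagation. This sketch swaps in that simpler technique to show the fixed point being computed; it is not the distributed Dask implementation, and the function names are illustrative.

```python
def connected_components(pairs):
    """Map every document to the smallest document id in its cluster."""
    parent = {}
    for a, b in pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    changed = True
    while changed:  # one pass plays the role of one MapReduce round
        changed = False
        for a, b in pairs:
            low = min(parent[a], parent[b])
            for node in (a, b):
                if parent[node] > low:
                    parent[node] = low  # keep the smaller label
                    changed = True
    return parent


def documents_to_delete(parent):
    """Everything that does not point at itself is a duplicate of its root."""
    return sorted(doc for doc, root in parent.items() if doc != root)
```

Iteration stops when a full pass changes no label, which mirrors the convergence test of no new edges being produced.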

We needed to partition the duplicate pairs generated in the third stage into three groups to reduce memory pressure on the final stage. We observed that the second stage itself generates partial components which have some overlap. These overlapping clusters cause some documents to appear in the delete set multiple times. However, our deletion code handled this overlap.

Below is the distribution of duplicate documents found across different snapshots of CommonCrawl. The distribution is skewed to the right because the documents are bucketed by the dump ID of the document we retain, and we prefer documents from higher dump IDs.

Analysis of Near-Duplicate Clusters

Smaller components tend to have more overlap in their MinHash bands. The smallest components are almost exact pairs but due to small differences, were not included in the local exact deduplication.

Changes in text are incremental from buckets of 3 or more documents onwards. The example below shows a personnel list that has grown over the years.

In sizable clusters comprising 1000 or more documents, we observe a trend towards templatization. This involves the recurrent use of standardized language to convey general topics such as terms and conditions, warnings, and disclaimers. Such language is prevalent on commercial websites, offering a consistent and efficient way to communicate commonly encountered information.

Personally Identifiable Information Removal

Why Personally Identifiable Information Removal

Personally Identifiable Information (PII) refers to any information that can be used to identify an individual, such as names, addresses, phone numbers, email addresses, and social security numbers. PII removal is essential for data privacy and security, as well as for compliance with global regulations. By removing PII from the training data, we can reduce the risk of data breaches and unauthorized access to sensitive information. Additionally, removing PII from training data prevents models from generating that specific PII at inference time.

PII Type	Examples	Target
Email	john.doe@llm360.ai	firstname.lastname@example.com
IP Address	172.217.164.110	[22.214.171.124 , ...]

Removing PII

Similar to prior work, we have removed two types of PII from the dataset: email address and IP address. Regular expressions are used to identify and replace these PII with a generic placeholder. We have also designed PII removal procedures for individual sources, such as replacing names in the Ubuntu IRC dataset mentioned above.

We have used the following regular expressions to identify and replace PII:

Email:
r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[A-Za-z0-9-]*[A-Za-z0-9]:)])"
IP Address:
r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
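A minimal sketch of applying the two patterns with Python's re module. The patterns are transcribed from the expressions above, with the email pattern split across several raw strings for readability. The scrub function and the use of a single fixed IP placeholder (the first entry in the table's placeholder list) are assumptions for illustration.

```python
import re

EMAIL_RE = re.compile(
    r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
    r"|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[A-Za-z0-9-]*[A-Za-z0-9]:)])"
)
IP_RE = re.compile(
    r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
)

EMAIL_PLACEHOLDER = "firstname.lastname@example.com"
IP_PLACEHOLDER = "22.214.171.124"  # assumption: always use the first listed placeholder


def scrub(text: str) -> str:
    """Replace email addresses first, then IPv4 addresses, with generic placeholders."""
    text = EMAIL_RE.sub(EMAIL_PLACEHOLDER, text)
    return IP_RE.sub(IP_PLACEHOLDER, text)
```

For example, scrub("mail john.doe@llm360.ai") keeps the surrounding words and swaps only the address.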

Normalization Form C

Normalization Form C Defined

Normalization Form C (NFC) is a Unicode normalization form that combines characters with diacritics into a single code point. This is important for text processing tasks as it ensures that the text is consistently represented across different languages and scripts. By normalizing the text to NFC, we can avoid issues related to character encoding, such as duplicate tokens and incorrect tokenization.
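The composition step can be seen with Python's built-in unicodedata module, used here only to illustrate what NFC does. Unlike ftfy, it does not repair mojibake such as mis-decoded UTF-8.

```python
import unicodedata

# 'u' followed by U+0308 COMBINING DIAERESIS: renders as "für" but has 4 code points.
decomposed = "fu\u0308r"

# NFC composes the base letter and the combining mark into the single code point U+00FC.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
```

Both strings look identical on screen, yet only after normalization do they compare equal and tokenize identically.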

NFC Implementation

We used the ftfy library to normalize the text to NFC. The library provides a simple API for this, which can be applied to the entire dataset one row at a time. The call below shows how we normalized text to NFC:

ftfy.fix_text(text, normalization="NFC")

NFC Examples

Original Text	NFC Text
fÃ¼r	für
the problem \ud83d\ude42	the problem 🙂
peptidoglycan`s	peptidoglycan's

TxT360 Studies

What This Section Contains

This section shows the learning curve when pre-training on TxT360, with a proper upsampling approach. We compare several simple strategies and demonstrate that one particular upsampling method, inspired by the natural data distribution, performs exceptionally well. In our preliminary experiments, the model learns significantly faster on TxT360 compared to a similarly scaled dataset, FineWeb, on several important evaluation metrics. We believe that a more carefully designed upsampling strategy could further enhance the use of our data.

In addition to the training results, we also provide an analysis of the dataset, including perplexity trends over time across the CommonCrawl snapshots. This section is organized into the following topic areas:

The Learning Curve of TxT360 with an Upsampling Recipe
Perplexity Analysis Across Time
Topic Analysis on Data Cluster Groups
Estimated Reading Time: 25 minutes

A Simple Data Mix Creates a Good Learning Curve

As discussed in prior sections, duplicated documents can significantly reduce training efficiency (i.e., the ratio of model performance to the number of pre-trained tokens). Previous work, such as RefinedWeb, emphasizes the importance of deduplication. Recently, the FineWeb study conducted an interesting analysis, comparing LLM performance when pre-trained on globally deduplicated versus locally deduplicated datasets. They found that training efficiency with a globally deduplicated dataset can be worse. The FineWeb authors hypothesize that global deduplication may remove a higher proportion of high-quality documents.

This finding led us to consider that a pre-training corpus based on crawled websites is naturally upsampled for a variety of reasons. For example, commonly used templates or boilerplates may appear millions of times; a well-regarded article reposted by different users may surface across multiple sites; and the same web pages, crawled by CommonCrawl at different times, will duplicate each other. The reasons behind these duplications vary: some may serve as indirect indicators of high-quality content, while others may not. Therefore, curating a pre-training dataset should involve leveraging these signals and considering data weighting schemes or, at the very least, providing users with the information they need to control the weighting effectively.

To this end, we store rich metadata for each document source, including features like user votes from StackExchange. One crucial piece of metadata is the number of duplicates detected for a document. This information allows users to reconstruct the natural web distribution, but more importantly, we will demonstrate that a simple upsampling recipe based on this metadata can create a high-quality data mix.

Experiment Setup

Motivated by the FineWeb study, we opted to upsample documents based on their natural distribution. However, since duplication is only an indirect indicator of quality, we upsample documents to a few predefined levels rather than using their exact count. Specifically, we set the upsampling weight to 3 for documents with 2 to 5 duplicates, 5 for those with 6 to 100 duplicates, 8 for 101 to 1000 duplicates, and 10 for documents with over 1000 duplicates. These values were selected heuristically and informed by preliminary small-scale experiments. For non-CommonCrawl data sources, we assign a weight of 2 if the document appears more than once. This straightforward approach results in a corpus exceeding 15 trillion tokens, making it one of the largest open-access pre-training datasets available.
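
The tiered weighting above can be sketched as a small lookup function. This is a minimal illustration rather than the released TxT360 code: the function names are hypothetical, `dup_count` is assumed to count all copies of a document (so a unique document has a count of 1), unique documents are assumed to keep a weight of 1, and a document with exactly 5 duplicates is assumed to fall in the lower tier.

```python
def upsampling_weight(dup_count: int, is_common_crawl: bool = True) -> int:
    """Map a document's duplicate count to a repetition weight (hypothetical helper)."""
    if not is_common_crawl:
        # Curated sources: weight 2 if the document appears more than once.
        return 2 if dup_count > 1 else 1
    if dup_count <= 1:
        return 1   # assumption: unique documents are not upsampled
    if dup_count <= 5:
        return 3
    if dup_count <= 100:
        return 5
    if dup_count <= 1000:
        return 8
    return 10


def upsampled_token_count(docs) -> int:
    """Total tokens after repeating each document `weight` times.

    `docs` is an iterable of (token_count, dup_count, is_common_crawl) tuples.
    """
    return sum(
        tokens * upsampling_weight(dups, is_cc) for tokens, dups, is_cc in docs
    )
```

Capping the weight at a handful of levels keeps heavily templated pages (which may have millions of copies) from dominating the mix, while still giving repeated documents more exposure than unique ones.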

To evaluate the training efficiency of our dataset, we sampled 1.5T tokens from both FineWeb and TxT360 (using the aforementioned weighting) and conducted a training ablation on an 8x8B Mixture-of-Experts architecture, similar to Mixtral. We compared the learning curves by tracking training loss, validation scores, and performance across a wide array of diverse evaluation benchmarks. The validation set was sampled independently from SlimPajama. Note that this experiment was done on a slightly earlier version of the dataset.

Learning Curves on the Evaluation Metrics

Evaluation results are the most direct indicator of model quality. We assess the intermediate results of the models across multiple metrics and plot the learning curves. Our findings indicate that the model learns significantly faster with TxT360. For a fair comparison, we evaluate TxT360 against FineWeb using only the CommonCrawl data sources, and we also show the curves after incorporating the 14 curated sources and coding data (Stack V2), demonstrating the full potential of the dataset. Due to compute constraints, we stopped each run once clear trends could be observed.

Based on these metrics, we find that TxT360’s CommonCrawl portion with the upsampling strategy outperforms FineWeb on key metrics such as MMLU and NQ, while falling slightly behind on HellaSwag. Further, we show that by combining TxT360 with coding data (Stack V2), the learning curve becomes significantly more stable and results improve across almost all of the metrics. The preferred dataset may therefore depend on the set of metrics one cares about.

Similar to the findings in DCLM, adding the curated non-CommonCrawl data sources produces mixed results (some preliminary figures are not shown here). Yet such data can help with domain-specific tasks like MedQA.

Comparing the Loss Curves

We also plot the training and validation loss curves for each dataset, showing that TxT360 achieves both lower training and validation losses compared to FineWeb. Although training loss may not correlate directly with final model performance, we observe that the loss curve for TxT360 exhibits fewer spikes compared to FineWeb, indicating more stable training dynamics.

Perplexity Evaluation on Duplicate Data

Model-based Quality Estimation

We adopted one of the model-based data quality evaluation strategies explored by DataComp-LM (DCLM), which considered perplexity filtering as a candidate for quality filtering. The DCLM results show that a simple perplexity filter is still quite strong. DCLM followed CCNet’s practice of using a 5-gram Kneser-Ney model, as implemented in the KenLM library, for efficient perplexity calculation. To gain more insight into our dataset, we also used a KenLM model trained on English Wikipedia data to compute perplexity on data with different duplication patterns, and observed how this signal correlates with those patterns.
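
For readers unfamiliar with the quantity being plotted, the sketch below shows how a document-level perplexity can be derived from n-gram log probabilities. It is an illustration under stated assumptions, not the pipeline used here: the function names are hypothetical, the per-line scores are assumed to be total base-10 log probabilities (the convention KenLM uses), and the token counts are assumed to include the end-of-sentence token.

```python
def perplexity_from_log10(total_log10_prob: float, num_tokens: int) -> float:
    """Perplexity = 10 ** (-(sum of log10 probabilities) / number of tokens)."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return 10.0 ** (-total_log10_prob / num_tokens)


def document_perplexity(line_scores) -> float:
    """Aggregate per-line scores into one document-level perplexity.

    `line_scores` is an iterable of (log10_prob, token_count) pairs, one per
    line; scores and lengths are summed before normalizing, so long lines
    carry proportionally more weight than short ones.
    """
    total_log10 = 0.0
    total_tokens = 0
    for log10_prob, token_count in line_scores:
        total_log10 += log10_prob
        total_tokens += token_count
    return perplexity_from_log10(total_log10, total_tokens)
```

Lower values mean the text looks more like the model's training data (here, English Wikipedia), which is why perplexity is used as a rough proxy for quality rather than a direct measure of it.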

Sampling Strategy

We took an early version of the TxT360 Common Crawl (CC) portion and, for each CC snapshot, bucketed the documents by their duplicate counts into the following buckets: 1, 2-5, 6-10, 11-100, 101-1000, and 1001-infinite. We then sampled the first 10k documents from each bucket.
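
The bucketing and sampling step can be sketched as follows. The record layout `(snapshot, dup_count, text)` and the function names are assumptions made for illustration; only the bucket boundaries and the "first 10k per bucket" rule come from the description above.

```python
from collections import defaultdict

# Inclusive (low, high) bounds; None marks the open-ended last bucket.
BUCKETS = [(1, 1), (2, 5), (6, 10), (11, 100), (101, 1000), (1001, None)]


def bucket_label(dup_count: int) -> str:
    """Return the label of the duplicate-count bucket a document falls into."""
    for low, high in BUCKETS:
        if dup_count >= low and (high is None or dup_count <= high):
            return f"{low}-{high}" if high is not None else f"{low}-inf"
    raise ValueError(f"dup_count must be >= 1, got {dup_count}")


def sample_first_n(docs, n: int = 10_000):
    """Keep the first `n` documents seen for each (snapshot, bucket) pair."""
    samples = defaultdict(list)
    for snapshot, dup_count, text in docs:
        key = (snapshot, bucket_label(dup_count))
        if len(samples[key]) < n:
            samples[key].append(text)
    return dict(samples)
```

Taking the first documents is cheap because it needs only a single streaming pass, but it is not a uniform random sample, so any ordering in the underlying files carries over into the results.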

Perplexity vs. Years

Taking the same data, we can convert it into a graph indicating the yearly trend. For most buckets, the average perplexity of dumps from more recent years seems to be lower than that of earlier years. This could be biased, since we always keep the newest document when we find a duplicate.
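
A yearly view like this one amounts to grouping the per-document perplexities by year and bucket and averaging. The sketch below assumes snapshot names follow the CommonCrawl `CC-MAIN-YYYY-WW` convention and that each record is a `(snapshot, bucket, perplexity)` tuple; both the record layout and the function names are hypothetical.

```python
from collections import defaultdict


def snapshot_year(snapshot: str) -> int:
    """Extract the year from a name such as 'CC-MAIN-2023-50'."""
    return int(snapshot.split("-")[2])


def mean_perplexity_by_year(records):
    """Average perplexity for every (year, bucket) pair."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for snapshot, bucket, perplexity in records:
        acc = sums[(snapshot_year(snapshot), bucket)]
        acc[0] += perplexity
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```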

Perplexity vs. Document Duplication

Instead of bucketing, we also plot perplexity against the number of duplicates directly. The graph becomes noisy at the right end because there are too few samples with large duplication counts. However, there appears to be a low point at around 10-20 duplicates. To see this more clearly, we recommend turning off the other years, looking at one year at a time, and zooming in to the 0-100 region on the X axis.

Perplexity vs. Dump Duplication

The FineWeb authors hypothesize that a document appearing across multiple snapshots (CC dumps) might be an indicator of quality. Hence, we also plot perplexity against the number of snapshots in which a document appears. From the graph below we can see that documents duplicated across around 40-60 snapshots usually have lower perplexity.

Perplexity Plots before Global Deduplication

Previously we saw that documents in recent snapshots tend to have lower perplexity. This might be related to how global deduplication was implemented: during global deduplication, we only keep the copy in the latest dump, so documents that are duplicated across multiple dumps only appear in the latest one. To avoid the bias introduced by this strategy, we tried to recover the state before global deduplication (i.e., the locally deduplicated dataset) using the stored metadata. The trends are a bit different. In the figure below, we do not observe a clear trend in which year has higher quality, especially in the 2-10 bucket region.
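
To make the recovery step concrete, here is a rough sketch of how a globally deduplicated record could be expanded back into one record per dump. The metadata keys `id` and `dumps` are hypothetical stand-ins for whatever the stored metadata actually contains, and dump names are assumed to sort chronologically as strings (true for zero-padded `CC-MAIN-YYYY-WW` names).

```python
def latest_dump(dumps):
    """The dump in which global deduplication keeps the surviving copy."""
    return max(dumps)


def dump_duplication_count(doc) -> int:
    """Number of distinct dumps in which a document was seen."""
    return len(set(doc["dumps"]))


def restore_local_state(kept_docs):
    """Expand each surviving document into one record per dump it appeared in."""
    restored = []
    for doc in kept_docs:
        for dump in sorted(set(doc["dumps"])):
            restored.append({"id": doc["id"], "dump": dump})
    return restored
```

After this expansion, a document seen in many dumps contributes to the statistics of every one of those dumps, not only the most recent, which removes the recency bias described above.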

Perplexity vs. Dump Duplication before Global Deduplication

Following the same practice, we can plot the average perplexity with respect to dump duplication count, before global deduplication. The conclusion is similar: documents with a dump duplication count of around 40-60 have the lowest perplexity.

Llama 3.1 8B

For comparison, we ran the same perplexity evaluation with the Llama 3.1 8B model.

Perplexity vs. Buckets

Perplexity vs. Years

Perplexity vs. Dump Duplication

Perplexity vs. Buckets before Global Deduplication

Perplexity vs. Dump Duplication Count before Global Deduplication

Topic Analysis

To understand our dataset better, we clustered our data into topic groups and examined correlations between topics and other attributes of the documents. We suspect that documents from different topic groups exhibit different distributional characteristics, which can give us some insight into the composition of the dataset.

Methodology

We took an early version of the LLM360 Common Crawl portion and clustered it into 17 topic groups using BERTopic. We collected and aggregated a series of metrics from the stored metadata. For each topic group, we calculated average scores and generated the corresponding bar charts over different metrics for comparison and analysis.

Cluster Groups

We grouped the data into the following 17 clusters. These were obtained by first clustering a seed portion of the dataset into 128 clusters, and then manually inspecting them and combining them into 17 semantically meaningful groups.

Arts
Business & Economics & Finance
Culture & Cultural geography
Daily Life & Home & Lifestyle
Education
Entertainment & Travel & Hobby
Environment
Food & Drink & Cooking
Health & Wellness & Medicine
Law & Justice
Natural Science & Formal Science & Technology
Personal Development & Human Resources & Career
Politics & Government
Religion & Spirituality
Shopping & Commodity
Society & Social Issues & Human Rights
Sports
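
The manual merge from fine-grained clusters to the topic groups listed above can be represented as a plain lookup table. The cluster ids below are invented for illustration, and treating unmapped clusters as unassigned (`None`) is an assumption; only the group names come from the list.

```python
# Hypothetical mapping from fine-grained cluster id to merged topic group.
FINE_TO_GROUP = {
    0: "Sports",
    1: "Sports",
    2: "Education",
    3: "Law & Justice",
    4: "Food & Drink & Cooking",
}


def assign_groups(fine_labels, mapping=FINE_TO_GROUP):
    """Replace each fine-grained cluster id with its merged topic group.

    Ids missing from the mapping yield None, i.e. the document is left
    without a topic group.
    """
    return [mapping.get(label) for label in fine_labels]
```

Keeping the mapping as data rather than code makes the manual decisions easy to review and revise without re-running the clustering.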

Topic vs. Various Metrics

In the following sections, we plot each cluster against its average score on a particular metric stored in the metadata. We recommend that readers jump to the ones they are most interested in.
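
Each chart in this part reduces to a per-topic aggregate over document metadata. A minimal sketch, assuming each document is a dictionary with a `topic` key plus numeric metric fields (the key names and function names are hypothetical):

```python
from collections import Counter, defaultdict


def topic_shares(docs):
    """Fraction of documents that fall into each topic group."""
    counts = Counter(doc["topic"] for doc in docs)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def mean_metric_by_topic(docs, metric: str):
    """Average value of `metric` per topic, skipping documents that lack it."""
    totals = defaultdict(float)
    counts = Counter()
    for doc in docs:
        value = doc.get(metric)
        if value is None:
            continue
        totals[doc["topic"]] += value
        counts[doc["topic"]] += 1
    return {topic: totals[topic] / counts[topic] for topic in totals}
```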

Number of Document of Each Topic

As shown in the graph above, over 20% of the documents are related to Business & Economics & Finance, which makes it the largest topic group in the dataset. In contrast, Culture & Cultural geography contains the fewest documents of all topics.

Fraction of Words Corrected in Lines

On average, documents related to Shopping & Commodity have a larger fraction of words corrected in lines.

Fraction of Lines Ending with Ellipsis

Compared with other topics, Personal Development & Human Resources & Career documents on average contain more lines ending with an ellipsis.

Fraction of Lines Starting with Bullet Point

Shopping & Commodity related documents have a higher percentage of lines starting with a bullet point.
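
The two line-level signals above (lines ending with an ellipsis, lines starting with a bullet point) are simple to compute. The sketch below is one plausible definition, not necessarily the one used to produce the stored metadata: blank lines are ignored, and the set of characters treated as bullets is an assumption.

```python
ELLIPSIS_ENDINGS = ("...", "\u2026")      # three dots or the single ellipsis character
BULLET_PREFIXES = ("\u2022", "-", "*")    # assumed bullet markers


def line_fractions(text: str):
    """Fraction of non-empty lines ending with an ellipsis / starting with a bullet."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return {"ellipsis": 0.0, "bullet": 0.0}
    ellipsis = sum(line.endswith(ELLIPSIS_ENDINGS) for line in lines)
    bullet = sum(line.startswith(BULLET_PREFIXES) for line in lines)
    return {"ellipsis": ellipsis / len(lines), "bullet": bullet / len(lines)}
```

High values of either fraction tend to indicate truncated previews or list-heavy pages such as product listings, which is consistent with the topics that stand out in these charts.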

Number of Lines with Toxic Words

Daily Life & Home & Lifestyle documents on average have more lines with toxic words.

Number of Toxic Words

Daily Life & Home & Lifestyle documents on average have more toxic words.

Word Count

Documents on the topic of Personal Development & Human Resources & Career on average contain more words than those on other topics.

Mean Word Length

There is no significant variance in average word length across topic groups. However, Education related data generally contain longer words than the others.

Number of Sentences

Documents in the topic of Personal Development & Human Resources & Career usually contain more sentences.

Symbol to Word Ratio

Documents related to Entertainment & Travel & Hobby usually have a higher percentage of symbols.

Fraction of Words with Alpha Character

The fraction of words with an alpha character seems to be relatively consistent across different topics.

Number of Stop Words

Culture & Cultural geography documents contain more stop words on average.

Has Curly Bracket

Natural Science & Formal Science & Technology has a significantly higher percentage of documents that contain a curly bracket. This might be related to code appearing in these documents.

Number of Document Duplication

Culture & Cultural geography related documents have a higher duplication count.

Number of Dump Duplication

On average, Culture & Cultural geography related documents are duplicated across a higher number of CommonCrawl dumps. Shopping & Commodity documents are duplicated across fewer dumps than the others.

Number of Year Duplication

On average, Culture & Cultural geography related documents are duplicated across more years than other topics.

Maximum Span of Year Duplication

On average, Culture & Cultural geography related documents are duplicated across a wider span of years.

Language Score

Average language scores of different topic groups are mostly consistent. No significant differences are observed.

Fraction of Duplicate Lines

On average, Shopping & Commodity has a larger fraction of duplicate lines than the others.

Fraction of Characters in Duplicate Lines

Shopping & Commodity usually has a larger fraction of characters in duplicate lines than the others.
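
The two duplicate-line signals differ only in what they count: lines or characters. A minimal sketch follows, assuming that every occurrence of a line after its first counts as a duplicate and that blank lines are ignored; the definition used for the stored metadata may differ, and the function name is hypothetical.

```python
def duplicate_line_fractions(text: str):
    """Return (fraction of duplicate lines, fraction of characters in them)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return 0.0, 0.0
    seen = set()
    dup_lines = 0
    dup_chars = 0
    for line in lines:
        if line in seen:
            # A repeat of an earlier line: count it and its characters.
            dup_lines += 1
            dup_chars += len(line)
        else:
            seen.add(line)
    total_chars = sum(len(line) for line in lines)
    return dup_lines / len(lines), dup_chars / total_chars
```

The character-based variant is less sensitive to many short repeated lines (such as "Add to cart") and more sensitive to long repeated blocks, so the two can rank documents differently.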

Fraction of Characters in Most Common Bigram

Fraction of Characters in Most Common 3-gram

Fraction of Characters in Most Common 4-gram

Fraction of Characters in Duplicate 5-grams

Fraction of Characters in Duplicate 6-grams

Fraction of Characters in Duplicate 7-grams

Fraction of Characters in Duplicate 8-grams

Fraction of Characters in Duplicate 9-grams

Fraction of Characters in Duplicate 10-grams

Number of Document of Each Topic in Duplication Bucket 1-1

Number of Document of Each Topic in Duplication Bucket 2-5

Number of Document of Each Topic in Duplication Bucket 6-10

Number of Document of Each Topic in Duplication Bucket 11-100

Number of Document of Each Topic in Duplication Bucket 101-1000

Number of Document of Each Topic in Duplication Bucket 1001-30000000

Citation

For attribution in academic contexts, please cite this work as

@misc{txt360data2024,
      title={TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend},
      author={Liping Tang and Nikhil Ranjan and Omkar Pangarkar and Xuezhi Liang and Zhen Wang and Li An and Bhaskar Rao and Linghao Jin and Huijuan Wang and Zhoujun Cheng and Suqi Sun and Cun Mu and Victor Miller and Xuezhe Ma and Yue Peng and Zhengzhong Liu and Eric P. Xing},
      year={2024}
}