Rule semantics & selectors
How MonitorTower resolves CSS selectors, alias.field tokens, regex flags, and logic expressions.
The rule builder couples every source (s1, s2, …) with fields that you can reference from regex rules, numeric expressions, semantic similarity checks, and notification templates. This reference explains how the pieces map together so the worker evaluates rules exactly the way you expect.
Source aliases & CSS selectors
- Sources are ordered and auto-assigned aliases (`s1`, `s2`, etc.). The alias remains stable even if you reorder rules; rename a source and the alias persists.
- CSS selectors are parsed by BeautifulSoup/SoupSieve, so anything that works in modern CSS (IDs, classes, descendant combinators, attribute selectors, `:nth-child`, etc.) is valid.
- Monitors fail early if the selector is invalid or matches 0 nodes, so you notice broken scrapes before shipping alerts.
/* Example selector that grabs both the title and body copy */
main article h1,
main article .content p:nth-child(-n+2)
Tip: the builder lists any metadata-backed fields for the source type beneath the selector. Populate the source_fields table in Supabase if you need custom fields beyond the defaults.
Field catalog & token syntax
Every field is referenced via a token shaped like ${s1.content_md} (dollar sign, curly brace, alias dot field, closing curly brace). The default WEBSITE source exposes:
| Token | Type | Description |
|---|---|---|
| ${s1.content} | HTML | Raw HTML string returned by the selector. |
| ${s1.content_md} | Markdown | Markdown-rendered version of the HTML. |
| ${s1.content_text} | Text | Plain-text content with whitespace normalized. |
| ${s1.status_code} | Number | Final HTTP status code. |
| ${s1.match_count} | Number | How many nodes matched the selector. |
| ${s1.content_length} | Number | Character length of the Markdown output. |
Add your own numeric or string fields through Supabase metadata and they will appear in the builder with the same ${alias.field} shape.
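To make the token shape concrete, here is a minimal sketch of how `${alias.field}` tokens could be located and resolved. The names `TOKEN_RE`, `find_tokens`, `resolve_tokens`, and the `sources` dictionary are hypothetical and used only for illustration; the actual worker code may differ.

```python
import re

# Matches tokens shaped like ${alias.field}, e.g. ${s1.content_md}.
TOKEN_RE = re.compile(r"\$\{([A-Za-z_]\w*)\.([A-Za-z_]\w*)\}")


def find_tokens(text):
    """Return the (alias, field) pairs referenced in text, in order."""
    return TOKEN_RE.findall(text)


def resolve_tokens(text, sources):
    """Replace each token with its value; raise KeyError for unknown ones."""
    def substitute(match):
        alias, field = match.group(1), match.group(2)
        try:
            return str(sources[alias][field])
        except KeyError:
            raise KeyError(f"unknown token: {match.group(0)}") from None
    return TOKEN_RE.sub(substitute, text)


# Hypothetical scraped values for one WEBSITE source.
sources = {"s1": {"status_code": 200, "match_count": 3,
                  "content_text": "Guidance raised"}}

print(find_tokens("${s1.status_code} and ${s1.match_count}"))
# [('s1', 'status_code'), ('s1', 'match_count')]
print(resolve_tokens("HTTP ${s1.status_code}: ${s1.content_text}", sources))
# HTTP 200: Guidance raised
```

Raising on an unknown token, rather than leaving it in place, matches the spirit of failing early described for selectors, but that choice is an assumption of this sketch.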
TEXT_REGEX rules
- Patterns are evaluated with Python’s `re` engine using `re.IGNORECASE | re.MULTILINE` by default. `re.DOTALL` is not among the defaults, so `.` never matches newlines unless you opt in with `(?s)`.
- Inline flags let you override behavior per rule: `(?s)` enables dot-all, `(?m)` enables multiline (already on by default), and a scoped group such as `(?-i:...)` forces case-sensitive matching inside it. Python’s `re` rejects the bare `(?-i)` form, so flags can only be switched off within a group.
- Errors from the regex compiler bubble up immediately; the Rule Editor validation (#7) also lints the pattern in the UI so you get a red error state before saving.
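The flag behavior can be checked directly against Python's `re` module. The sketch below assumes the worker calls `re.search` with the two default flags; `regex_matches` and `compile_error` are hypothetical helper names, not MonitorTower APIs. Note that Python accepts flag removal only in the scoped form `(?-i:...)`.

```python
import re

# Default flags as stated in this reference.
DEFAULT_FLAGS = re.IGNORECASE | re.MULTILINE


def regex_matches(pattern, text):
    """Return True if pattern is found anywhere in text under the defaults."""
    return re.search(pattern, text, DEFAULT_FLAGS) is not None


def compile_error(pattern):
    """Return the compiler's error message, or None if the pattern is valid."""
    try:
        re.compile(pattern, DEFAULT_FLAGS)
        return None
    except re.error as exc:
        return str(exc)


text = "Earnings update\nGuidance was RAISED for Q3"

print(regex_matches(r"raised", text))                # True: case-insensitive
print(regex_matches(r"earnings.*raised", text))      # False: "." stops at newline
print(regex_matches(r"(?s)earnings.*raised", text))  # True: dot-all enabled
print(regex_matches(r"^guidance", text))             # True: ^ anchors each line
print(regex_matches(r"(?-i:raised)", text))          # False: case-sensitive group
print(compile_error(r"(unclosed") is not None)       # True: invalid pattern
```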
Example:
${s1.content_text}.regex("(?s)(?:earnings|guidance).*(raised|hiked)") == true
NUMERIC_EXPR rules
- Compose expressions with the numeric tokens you select (`${s1.status_code}`, `${s1.content_length}`, etc.) plus `+ - * /`, comparison operators (`>`, `<`, `>=`, `<=`, `==`, `!=`), and logical connectors `&&` / `||`.
- Parentheses are supported for grouping. The builder will highlight any tokens you reference that were not selected, and validation (#7) blocks saving until the expression parses cleanly.
- Pi workers currently raise `UnsupportedRuleError` for numeric expressions; the Cloud worker honors them, so keep them in monitors even if your local device cannot execute them yet.
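One way to evaluate such an expression safely is to substitute the tokens, translate `&&` / `||` into Python's `and` / `or`, and walk the parsed syntax tree while allowing only the operators listed above. This is a sketch under those assumptions, not the Cloud worker's implementation; `evaluate_numeric` and the `sources` layout are hypothetical.

```python
import ast
import operator
import re

TOKEN_RE = re.compile(r"\$\{(\w+)\.(\w+)\}")

BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
CMP_OPS = {ast.Gt: operator.gt, ast.Lt: operator.lt, ast.GtE: operator.ge,
           ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def _eval(node):
    """Evaluate a whitelisted AST node; reject everything else."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in CMP_OPS):
        return CMP_OPS[type(node.ops[0])](_eval(node.left),
                                          _eval(node.comparators[0]))
    if isinstance(node, ast.BoolOp):
        # Every operand is evaluated (no short-circuiting) in this sketch.
        values = [_eval(value) for value in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def evaluate_numeric(expr, sources):
    """Substitute ${alias.field} tokens, then evaluate the expression."""
    def substitute(match):
        return repr(sources[match.group(1)][match.group(2)])
    python_expr = TOKEN_RE.sub(substitute, expr)
    python_expr = python_expr.replace("&&", " and ").replace("||", " or ")
    tree = ast.parse(python_expr.strip(), mode="eval")
    return bool(_eval(tree.body))


# Hypothetical field values for two sources.
sources = {"s1": {"status_code": 503, "content_length": 2500},
           "s2": {"match_count": 4}}
expr = ("(${s1.status_code} >= 500 && ${s1.content_length} > 2000)"
        " || ${s2.match_count} == 0")
print(evaluate_numeric(expr, sources))  # True
```

Walking the tree instead of calling `eval` keeps function calls, attribute access, and names out of the expression language entirely.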
Example:
(${s1.status_code} >= 500 && ${s1.content_length} > 2000) || ${s2.match_count} == 0
SEMANTIC_SIMILARITY rules
Provide:
- A target field (usually `${s1.content_text}`).
- `reference_text` that captures the tone or phrasing you’re looking for.
- A threshold from 0–100 (most alerts use 70–85). Workers compute embeddings and mark matches above the threshold as `true`.
Like numeric expressions, Pi workers currently raise UnsupportedRuleError; the semantic runtime runs in Cloud workers.
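The threshold comparison can be illustrated with a toy example. The sketch below assumes the score is cosine similarity scaled to 0–100 and that only scores strictly above the threshold count as a match; both are assumptions, as is the word-count "embedding", which is a stand-in for whatever embedding model the Cloud worker actually uses.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_rule(target_text, reference_text, threshold):
    """Return True when the 0-100 similarity score is above the threshold."""
    score = 100 * cosine_similarity(toy_embed(target_text),
                                    toy_embed(reference_text))
    return score > threshold


print(semantic_rule("Guidance raised for the full year",
                    "guidance raised for the full year", 80))  # True
print(semantic_rule("The weather is sunny today",
                    "guidance raised", 70))                    # False
```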
Logic expressions (r1, r2, …)
- Each rule becomes `r1`, `r2`, etc. Use `AND`, `OR`, `NOT`, parentheses, and the literals `TRUE`/`FALSE` to describe how they combine.
- Expressions are case-insensitive (`r1 and r2` works) and validated in both the UI and backend. Missing aliases or stray keywords raise an error before deploy.
- When only one rule exists, the builder auto-fills `r1`. Add more rules and you’ll be prompted for the full expression.
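A logic expression of this kind can be validated and evaluated by whitelisting tokens before anything is executed. The sketch below is illustrative only: `evaluate_logic` is a hypothetical name, and it assumes the usual precedence (`NOT` binds tighter than `AND`, which binds tighter than `OR`), which this reference does not spell out.

```python
import re

TOKEN_RE = re.compile(r"\s*(\(|\)|[A-Za-z_]\w*)")
KEYWORDS = {"and": "and", "or": "or", "not": "not",
            "true": "True", "false": "False"}


def evaluate_logic(expression, results):
    """Evaluate e.g. 'r1 AND (NOT r2)' given results like {'r1': True}."""
    text = expression.strip()
    python_tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        token = match.group(1)
        pos = match.end()
        lowered = token.lower()
        if token in ("(", ")"):
            python_tokens.append(token)
        elif lowered in KEYWORDS:
            python_tokens.append(KEYWORDS[lowered])
        elif lowered in results:
            python_tokens.append(str(bool(results[lowered])))
        else:
            raise ValueError(f"unknown alias or keyword: {token!r}")
    try:
        # Only whitelisted tokens reach eval, and builtins are disabled.
        return bool(eval(" ".join(python_tokens), {"__builtins__": {}}, {}))
    except (SyntaxError, TypeError) as exc:
        raise ValueError(f"malformed expression: {expression!r}") from exc


results = {"r1": True, "r2": True, "r3": False}
print(evaluate_logic("r1 AND (NOT r2) OR r3", results))  # False
print(evaluate_logic("r1 and r2", results))              # True
```

With these results, `r1 AND (NOT r2)` is false and `r3` is false, so the whole expression is false; flipping `r3` to true would make it fire.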
Example:
r1 AND (NOT r2) OR r3
Notification templates & monitor tokens
Templates accept the same token syntax, plus monitor-level helpers that the backend injects ({monitor_name}, {monitor_id}, {processed_at}). Combine them to craft alerts that cite the trigger:
${monitor.name} fired – see ${s1.final_url}\nTop paragraph: ${s1.content_text}
If the template fails to format, the backend falls back to the standard “Monitor triggered” copy, so malformed templates never block alerts.
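The fallback behavior can be sketched as a render function that catches formatting failures. This is a hedged illustration, not the backend's code: `render_notification`, the flat `values` dictionary, and the exact fallback string are assumptions, and only the `${...}` token form is handled.

```python
import re

TOKEN_RE = re.compile(r"\$\{([\w.]+)\}")
FALLBACK_MESSAGE = "Monitor triggered"


def render_notification(template, values):
    """Fill ${...} tokens from values; return the fallback copy on failure."""
    try:
        return TOKEN_RE.sub(lambda match: str(values[match.group(1)]), template)
    except (KeyError, TypeError):
        # Unknown token or non-string template: never block the alert.
        return FALLBACK_MESSAGE


# Hypothetical values the backend might supply at send time.
values = {"monitor.name": "Pricing page", "s1.final_url": "https://example.com"}

print(render_notification("${monitor.name} fired, see ${s1.final_url}", values))
# Pricing page fired, see https://example.com
print(render_notification("${s9.missing} fired", values))
# Monitor triggered
```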
Rule Editor validation (#7)
The builder now provides real-time guardrails:
- Regex inputs compile in your browser; invalid patterns show an inline error.
- Numeric expressions highlight unknown tokens and unsupported characters.
- Token Highlighter previews notification templates and points out tokens the backend won’t recognize.
- Logic expressions must be present when more than one rule exists—missing or misspelled aliases are flagged before submission.
These checks mirror the same validators the backend runs during job_builder and worker ingestion, so anything that saves through the UI will also succeed when the worker evaluates it.