I made a DSL

and I liked it

Email Security 101

You can set a DNS record (DMARC) that asks receiving mail servers to email you reports about the mail they see sent as your domain.

DNS Record

Let's dig my domain's DMARC record:

$ dig +noall +answer _dmarc.robertroskam.com txt

Output:

v=DMARC1;
p=quarantine;
rua=mailto:0y8hka6m@ag.dmarcian.com

DMARC Report

<?xml version="1.0" ?>
<feedback>
  <report_metadata>
    <org_name>google.com</org_name>
    <email>noreply-dmarc-support@google.com</email>
    <extra_contact_info>https://support.google.com/a/answer/2466580</extra_contact_info>
    <report_id>3766706526427983302</report_id>
    <date_range>
      <begin>1626220800</begin>
      <end>1626307199</end>
    </date_range>
  </report_metadata>
  <policy_published>
    <domain>robertroskam.com</domain>
    <adkim>r</adkim>
    <aspf>r</aspf>
    <p>quarantine</p>
    <sp>quarantine</sp>
    <pct>100</pct>
  </policy_published>
  <record>
    <row>
      <source_ip>209.85.220.41</source_ip>
      <count>4</count>
      <policy_evaluated>
        <disposition>quarantine</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>robertroskam.com</header_from>
    </identifiers>
    <auth_results>
      <spf>
        <domain>robertroskam.com</domain>
        <result>softfail</result>
      </spf>
    </auth_results>
  </record>
</feedback>
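To make the report format concrete, here is a minimal sketch of pulling the interesting fields out of one aggregate report with Python's standard library. The `summarize` function and the trimmed `REPORT` string are illustrative assumptions, not part of the production pipeline described in this talk.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the report above; only the fields read below are kept.
REPORT = """<feedback>
  <record>
    <row>
      <source_ip>209.85.220.41</source_ip>
      <count>4</count>
      <policy_evaluated>
        <disposition>quarantine</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
  </record>
</feedback>"""


def summarize(report_xml):
    """Return one dict per <record>, with the row-level fields flattened."""
    root = ET.fromstring(report_xml)
    rows = []
    for record in root.iter("record"):
        row = record.find("row")
        evaluated = row.find("policy_evaluated")
        rows.append({
            "source_ip": row.findtext("source_ip"),
            "count": int(row.findtext("count")),
            "disposition": evaluated.findtext("disposition"),
            "dkim": evaluated.findtext("dkim"),
            "spf": evaluated.findtext("spf"),
        })
    return rows
```

Each resulting row is still keyed by a bare IP address, which is exactly the readability problem the next slides describe.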

Observations

IP Addresses aren't human readable
Server names from reverse IP might not be meaningful
You need another layer to group and label the traffic

Enter Dmarcian.com

Over 1k rules
Based on >1 trillion records
Written by professional analysts

Rule Examples

Label: Salesforce - ID: 105

ip_in_netblocks(ip, ['67.228.34.32/27', '52.128.40.0/21']) or 
regex(ptr_org, 'emsend[1-8].com')

Label: Active Campaign - ID: 107

asn == 22606

Label: Hubspot - ID: 106

ptr_org in ('hubspot.com', 'hubspotemail.net')
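As a rough illustration of what these rules do, the sketch below gives plausible implementations of the two helpers that appear in them and applies the three rules in order. The helper bodies, the `label_for` function, and the "first match wins" ordering are assumptions for the example; the real helpers are not shown in the slides.

```python
import ipaddress
import re


def ip_in_netblocks(ip, netblocks):
    """True if ip falls inside any of the CIDR blocks (assumed semantics)."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(block) for block in netblocks)


def regex(value, pattern):
    """True if pattern is found anywhere in value (assumed semantics)."""
    return re.search(pattern, value) is not None


def label_for(ip, ptr_org, asn):
    """Hypothetical driver: return the label of the first rule that matches."""
    if (ip_in_netblocks(ip, ['67.228.34.32/27', '52.128.40.0/21'])
            or regex(ptr_org, 'emsend[1-8].com')):
        return "Salesforce"
    if asn == 22606:
        return "Active Campaign"
    if ptr_org in ('hubspot.com', 'hubspotemail.net'):
        return "Hubspot"
    return None
```

With these definitions, `label_for('67.228.34.40', 'example.com', 1)` returns `"Salesforce"` because the address sits inside the first netblock.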

Human-Readable Output

The Motivation

Give our end users the same tools as our analysts.

Why?

Sometimes our users know better than us what particular traffic actually is

Label: Roskam's Home - ID: 108

ip == '209.85.220.41' and ptr_org == 'robertroskam.com'

So just expose the internal authoring system, right?

All rules were written in Python and simply eval'ed in production.

NO!

What we like

Features: we gave our analysts a very robust feature set, and it has made them very productive. The first 1k rules took 7 years to write; with the new authoring tool, our analysts went from 1k to 1.5k rules in about 6-8 months.
Performance: the entire 1.5k+ ruleset takes 1-2ms per item of traffic. We process ~10 million records of traffic each day, and that rate doubles every 4-8 months.

What we don't like

Security: pure Python lets a malicious user run arbitrary code
Feedback: the existing approach did some inspection of the incoming rule, but could give little feedback when the code did not parse into a valid Python AST
Global rules: our rules authoring engine was centralized, so we had no way to separate rules on a per-account basis

Requirements

Prevent injection attacks
Give feedback to users during authoring
Enable per-account not just global authoring
Support existing features in 1.5k+ rules
Have the performance be no worse at runtime

Areas of Concern

Authoring

Prevent injection attacks
Give feedback to users

Runtime

Enable per-account authoring
Support existing features in 1.5k+ rules
Have the performance be no worse

Authoring Requirements

Prevent injection attacks
Give feedback to users

Injection Attacks

Is it safer to...

block these?

locals()
import
match

allow these? ✅

!=
in
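A small sketch of why allowing is the safer default: a blocklist only catches what its author thought of, while an allowlist rejects everything else by construction. The name sets and the `names_in` helper are hypothetical, and the sketch checks identifiers only (operators are left out for brevity), so it is an illustration of the idea rather than a complete sanitizer.

```python
import re

# Hypothetical name sets, for illustration only.
BLOCKED = {"locals", "import", "match"}
ALLOWED = {"asn", "ip", "ptr_org", "h_from", "and", "or", "in",
           "regex", "ip_in_netblocks"}


def names_in(rule):
    """Return identifier-like words, ignoring single-quoted string literals."""
    without_strings = re.sub(r"'[^']*'", "", rule)
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_strings)


def passes_blocklist(rule):
    """Reject only rules that use a name someone remembered to block."""
    return not any(name in BLOCKED for name in names_in(rule))


def passes_allowlist(rule):
    """Accept only rules built entirely from known names."""
    return all(name in ALLOWED for name in names_in(rule))
```

For example, `__import__('os').system('id')` slips past this blocklist because `__import__` is not literally `import`, but it fails the allowlist because neither `__import__` nor `system` is a known name.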

Authoring Requirements

Prevent injection attacks. Choices are:
- ~~Block~~
- Allow ✅
Give feedback to users

What kind of feedback?

Syntax

abc == 123
asn == 1234 and
asn == 123 and ( ptr_org == 'foo.com' or h_from == 'm.foo.com'

Semantics

asn == '123'
ptr_org in asn
regex(ptr_org, 123)
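The two kinds of feedback can be sketched with Python's own `ast` module standing in for a real DSL parser. This is not the parser described later in the talk; the variable types in `VARIABLE_TYPES`, the `feedback` function, and the message wording are all assumptions. Python's parser happily accepts unknown names such as `abc`, so the sketch checks those separately and reports them alongside the syntax errors, as the slide does.

```python
import ast

# Assumed variable types; the real symbol table is not shown in the slides.
VARIABLE_TYPES = {"asn": int, "ip": str, "ptr_org": str, "h_from": str}
KNOWN_NAMES = set(VARIABLE_TYPES) | {"regex", "ip_in_netblocks"}


def feedback(rule):
    """Return a list of human-readable problems found in a rule."""
    try:
        tree = ast.parse(rule, mode="eval")
    except SyntaxError as err:
        return [f"syntax: {err.msg} (column {err.offset})"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id not in KNOWN_NAMES:
            problems.append(f"syntax: unknown name {node.id!r}")
        elif isinstance(node, ast.Compare) and isinstance(node.left, ast.Name):
            name = node.left.id
            expected = VARIABLE_TYPES.get(name)
            for op, right in zip(node.ops, node.comparators):
                if isinstance(op, ast.In) and not isinstance(right, ast.List):
                    problems.append("semantics: right side of 'in' must be a list")
                elif (isinstance(right, ast.Constant) and expected
                        and type(right.value) is not expected):
                    got = type(right.value).__name__
                    problems.append(
                        f"semantics: {name} expects {expected.__name__}, got {got}")
        elif isinstance(node, ast.Call) and getattr(node.func, "id", None) == "regex":
            pattern = node.args[1] if len(node.args) > 1 else None
            if not (isinstance(pattern, ast.Constant)
                    and isinstance(pattern.value, str)):
                problems.append("semantics: regex pattern must be a string literal")
    return problems
```

Running `feedback("asn == '123'")` yields `["semantics: asn expects int, got str"]`, while a well-formed rule such as `asn == 22606` yields an empty list.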

Authoring Requirements

Prevent injection attacks. Choices are:
- ~~Block~~
- Allow ✅
Give feedback to users:
- syntax
- semantics

Does this already exist?

Nope

Time to make a DSL

Phases

Plan: Determine the grammar
Implement: Lexer, Parser, Compiler
Deploy: Integrate into Existing Authoring System as beta

Plan

Problem #1: what all we use
Problem #2: DSL syntax

Problem #1: determine what all we use

Historically, "just use Python" was the spec for analysts, so we had no idea what was actually being used.
We wrote a script using Python's ast module that dumped out every token found across our existing rules.
That's when the surprises rolled in:
- Only the comparison operators == and != were used, never < or >
- Both lists and tuples were in use
- Certain variables were injected but completely unused
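An audit along these lines can be sketched in a few lines with the `ast` module. The original script is not shown, so the function names, the sample `RULES` (borrowed from the earlier slides), and the `INJECTED` set are assumptions; the sketch counts AST node types rather than raw tokens.

```python
import ast
from collections import Counter

# Sample rules taken from the earlier slides.
RULES = [
    "asn == 22606",
    "ptr_org in ('hubspot.com', 'hubspotemail.net')",
    "ip_in_netblocks(ip, ['67.228.34.32/27', '52.128.40.0/21']) "
    "or regex(ptr_org, 'emsend[1-8].com')",
]
# Assumed set of variables injected into every rule.
INJECTED = {"asn", "ip", "ptr_org", "h_from"}


def count_node_types(rules):
    """Count how often each AST node type appears across all rules."""
    counts = Counter()
    for rule in rules:
        for node in ast.walk(ast.parse(rule, mode="eval")):
            counts[type(node).__name__] += 1
    return counts


def unused_variables(rules, injected):
    """Return injected variables that no rule ever references."""
    used = set()
    for rule in rules:
        for node in ast.walk(ast.parse(rule, mode="eval")):
            if isinstance(node, ast.Name):
                used.add(node.id)
    return injected - used
```

On the sample rules, the counts show both a `Tuple` and a `List`, no `Lt` or `Gt`, and `unused_variables(RULES, INJECTED)` reports `{"h_from"}`: small-scale versions of the three surprises listed above.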

Problem #2: DSL syntax

Since we could make the syntax different, should we?
- using AND/OR instead of and/or
- or using = instead of ==
For the initial pass, we decided to support a subset of Python's language spec, without limiting ourselves to that permanently.
We decided against supporting both lists [,] and tuples (,). We wanted a single symbol set, and we went with lists using [,].
Because the DSL is a subset of Python's syntax, we didn't have to immediately rewrite the production runtime to use our new compiler. So this was a win for deployment compatibility.

Implement

Made as an internal library in a separate git repo
Tokenizer
- It knows exactly which variables to expect
- It can reject anything it doesn't expect
Parser
- Hand-written LR parser, working depth first
- The parser does the heavy lifting: syntax checking for boolean logic and parentheses, plus semantic checking
- We even check the argument types and output types of each function we have
Compiler
- Targets Python for this initial build and simply unrolls the AST, using a map from each symbol to Python
~70 tests (pytest + parametrize); GitLab CI: tests, black, flake8
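The three stages can be sketched end to end for a tiny subset of the language (comparisons, `in` with lists, `and`/`or`, parentheses; no function calls). This is not the internal library: it uses a recursive-descent parser rather than the LR parser mentioned above, and the variable types, the `RuleError` class, and every function name here are assumptions made for illustration.

```python
import re

# Assumed variable names and types; the real symbol table is not shown.
VARIABLES = {"asn": int, "ip": str, "ptr_org": str, "h_from": str}
KEYWORDS = {"and", "or", "in"}
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<STRING>'[^']*')"
    r"|(?P<OP>==|!=|[()\[\],])|(?P<WORD>[A-Za-z_]\w*))"
)

class RuleError(Exception):
    """Raised with a message meant to be shown to the rule author."""

def tokenize(text):
    """Split a rule into (kind, value) pairs, rejecting anything unexpected."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise RuleError(f"unexpected input at column {pos + 1}")
        kind, value = match.lastgroup, match.group(match.lastgroup)
        if kind == "WORD" and value in KEYWORDS:
            kind = "OP"
        elif kind == "WORD" and value in VARIABLES:
            kind = "NAME"
        elif kind == "WORD":
            raise RuleError(f"unknown name {value!r}")
        tokens.append((kind, value))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else ("EOF", "end of rule")

    def take(self, expected=None):
        kind, value = self.peek()
        if kind == "EOF" or (expected is not None and value != expected):
            raise RuleError(f"expected {expected or 'more input'}, got {value}")
        self.i += 1
        return kind, value

    def parse(self):
        node = self.boolean("or")
        if self.peek()[0] != "EOF":
            raise RuleError(f"unexpected {self.peek()[1]}")
        return node

    def boolean(self, op):
        # 'or' binds looser than 'and', matching Python's precedence.
        operand = self.comparison if op == "and" else lambda: self.boolean("and")
        node = operand()
        while self.peek() == ("OP", op):
            self.take()
            node = (op, node, operand())
        return node

    def comparison(self):
        if self.peek() == ("OP", "("):
            self.take()
            node = self.boolean("or")
            self.take(")")
            return node
        kind, name = self.take()
        if kind != "NAME":
            raise RuleError(f"expected a variable, got {name}")
        op = self.take()[1]
        if op in ("==", "!="):
            return (op, name, self.literal(name))
        if op != "in":
            raise RuleError(f"expected ==, != or in, got {op}")
        self.take("[")
        items = [self.literal(name)]
        while self.peek() == ("OP", ","):
            self.take()
            items.append(self.literal(name))
        self.take("]")
        return ("in", name, items)

    def literal(self, name):
        # Semantic check: the literal's type must match the variable's type.
        kind, text = self.take()
        if kind not in ("NUMBER", "STRING"):
            raise RuleError(f"expected a literal, got {text}")
        value = int(text) if kind == "NUMBER" else text[1:-1]
        if type(value) is not VARIABLES[name]:
            raise RuleError(f"{name} expects {VARIABLES[name].__name__}, got {text}")
        return value

def compile_rule(node):
    """Unroll the tuple-based AST into Python source text."""
    op, left, right = node
    if op in ("and", "or"):
        return f"({compile_rule(left)} {op} {compile_rule(right)})"
    return f"{left} {op} {right!r}"

def to_python(rule):
    return compile_rule(Parser(tokenize(rule)).parse())
```

In this sketch, `to_python("asn == 22606")` returns the string `asn == 22606`, while `to_python("asn == '123'")` raises `RuleError` because the literal's type does not match the variable. Because the tokenizer only knows the listed variables and symbols, anything like `locals()` is rejected before parsing starts.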

Deployment

Rewrote all existing rules that used tuples to use lists
Integrated it into our internal authoring tool fairly simply: before committing to the db, we check each rule. That was deployed after we rewrote all the rules.
- Had briefly considered storing the AST instead of the DSL, but decided against it because it would lock us into particular AST representations.
Once the runtime was rewritten to support multiple accounts, we brought in the new compiler shortly afterwards. It was basically a drop-in replacement and added two lines.
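The tuple-to-list migration is the kind of change that can be automated while the rules are still plain Python. The slides do not say how the rewrite was done; the sketch below shows one possible approach with `ast.NodeTransformer` and `ast.unparse` (Python 3.9+), and both the class and function names are hypothetical.

```python
import ast


class TuplesToLists(ast.NodeTransformer):
    """Replace every tuple literal with a list literal of the same items."""

    def visit_Tuple(self, node):
        self.generic_visit(node)  # handle nested tuples first
        return ast.copy_location(ast.List(elts=node.elts, ctx=node.ctx), node)


def rewrite(rule):
    """Return the rule's source text with tuples rewritten as lists."""
    tree = TuplesToLists().visit(ast.parse(rule, mode="eval"))
    return ast.unparse(tree.body)
```

Applied to the Hubspot rule from earlier, `rewrite("ptr_org in ('hubspot.com', 'hubspotemail.net')")` returns `ptr_org in ['hubspot.com', 'hubspotemail.net']`. Note that `ast.unparse` normalizes spacing and quoting, so a migration like this would still want a review of the diff.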

Final Thoughts

Things I regretted

Not implementing a type system
Not making the DSL reusable in other domains
Not treating operators such as == and in as functions rather than as separate syntax. (Their visitor functions are discrete.)
Trying a functional-only approach with the parser. I found the amount of state being passed around a bit too much.

Things I liked about the approach

Relatively easy to debug weird rule behavior in production
Authoring for existing users and customers is extremely straightforward
Leaves us open to expansion into new ideas
Enables the feedback

Future Opportunities

Rule overlap checking: as it stands, we don't know if two authors write substantially similar rules
Implementing client-side, syntax- and semantics-aware auto-complete/IntelliSense
Making a library for implementing DSLs so that we can reuse this approach in other areas of the business

Stakeholders Involved

Principal Engineer for Core Product
Director of Engineering
Lead Analyst
Director of Deployment Services
VP of Product, later C-Level

Client-Driven WYSIWYG

Example of client side builder for conditions. Src: sentry.io

I made a DSL

By Robert Roskam

Engineer Manager at Pantheon

raiderrobert
