---
author: MulliganSecurity
date: 2025-06-04
gitea_url: "http://git.nowherejezfoltodf4jiyl6r56jnzintap5vyjlia7fkirfsnfizflqd.onion/nihilist/blog-contributions/issues/223"
xmr: 82htErcFXbSigdhK9tbfMoJngZmjGtDUacQxxUFwSvtb9MY8uPSuYSGAuN1UvsXiXJ8BR9BVUUhgFBYDPvhrSmVkGneb91j
tags:
- Core Tutorial
---
# The case for alerting

As you know, [monitoring](../anonymous_server_monitoring/index.md) is important when running any kind of operations, and especially so for clandestine ones.

## Alert Types

Automated alerts have many advantages over plain, un-automated monitoring: if you can define what nominal looks like (which can be done with simple statistical process control measures such as the [Nelson rules](https://en.wikipedia.org/wiki/Nelson_rules)), you can set up a system that:

- is reliable
- is customizable: you define the exact context you want and how you want to receive alerts
- runs 24/7

Simple threshold-based alerts are reactive by nature, but their automation cuts response time down and enables operational agility when responding to threats. Statistics-based alerts let you be proactive: they notify you when something is not a problem yet but might become one.

### Examples from the MulliganSecurity Infrastructure monitoring standard playbook

- Threshold-based: a [SMARTCTL](https://en.wikipedia.org/wiki/Smartctl) alert creating a notification when any hard drive within your infrastructure crosses a pre-failure threshold:

```promql
smartctl_device_attribute{attribute_flags_long=~".*prefailure.*", attribute_value_type="value"}
<=
on (device, attribute_id, instance, attribute_name)
smartctl_device_attribute{attribute_flags_long=~".*prefailure.*", attribute_value_type="thresh"}
```

- Statistical (anomaly detection): CPU spike or under-use:

```promql
cpu_percentage_use > (avg_over_time(cpu_percentage_use[5m]) + (3 * stddev_over_time(cpu_percentage_use[5m])))
OR
cpu_percentage_use < (avg_over_time(cpu_percentage_use[5m]) - (3 * stddev_over_time(cpu_percentage_use[5m])))
```

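If you ship these expressions to Prometheus itself rather than defining them as Grafana-managed alerts, they can be wrapped in a standard Prometheus rule file. Below is a minimal sketch for the SMART check; the rule group name, `for` duration and labels are illustrative assumptions, not part of the playbook above.

```yaml
# Minimal sketch of a Prometheus-side alerting rule wrapping the SMART query
# above. Group name, "for" duration and labels are illustrative assumptions.
groups:
  - name: hardware
    rules:
      - alert: DiskPrefailureThresholdCrossed
        expr: |
          smartctl_device_attribute{attribute_flags_long=~".*prefailure.*", attribute_value_type="value"}
            <= on (device, attribute_id, instance, attribute_name)
          smartctl_device_attribute{attribute_flags_long=~".*prefailure.*", attribute_value_type="thresh"}
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "SMART pre-failure attribute crossed its threshold on {{ $labels.instance }} ({{ $labels.device }})"
```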
## Associated Risks

As your perimeter and infrastructure grow and you add more servers, your system's complexity shoots up rapidly. Simple, organic alerting shows its limits once you have to correlate logs and behaviors across multiple systems.

That's why you need alerting: if an adversary decides to stealthily probe your infrastructure and you know what to look for, you will see their attempt for what it is. Choosing to remain in the dark about it is foolish at best, and irresponsible if you are part of an outfit, as your laziness will put others in harm's way.

### Real world attack scenario

#### Situation

You run a clandestine operation that requires serving a website over Tor in a [highly available](../high_availability/index.md) manner.

#### Assets

To keep this example simple, we will focus only on the website content as the asset to be protected.

![](0.png)

#### Threat model

You have a highly technical, state-backed adversary.

- Adversary objective:
    - Either make your content unavailable or untrusted
- Adversary methods:
    - Availability-based attacks: take the site down
        - Tor service deanonymization techniques (from the [high availability](../high_availability/index.md) attacker playbook)
    - Integrity-based attacks: deface the site or introduce mistakes to break public trust in its content
        - AppSec-based attacks on the website itself, probing for vulnerabilities such as XSS to identify readers or SQLi to change site content or gain access

#### Threat-model based Alerting

When devising a monitoring plan you must take the following into account:

- What application are you running?
    - We are running a website that interacts with users over HTTP through a Tor onion service
- How do you know it is running correctly?
    - The systemd unit must be running (a rule sketch for this check is shown after this list)
    - The associated systemd socket must be running
    - Correct queries should receive answers with the following characteristics:
        - 200 status code
        - 95th percentile response time of at most X ms
- How do you monitor the application substrate (VPSes, networks)?
    - The Onionbalance node must be up and available
    - The Onionbalance node should receive and reply to queries for the highly available server descriptors
    - The Onionbalance node should have minimal network traffic besides that
    - The Onionbalance node and the VPS server should have minimal SSH traffic, and only through a Tor onion service
    - Onionbalance 95th percentile response time of at most Y ms
    - Vanguards warnings must remain off on this infrastructure in normal operating conditions

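For the systemd unit check in the list above, node_exporter's systemd collector (when enabled with `--collector.systemd`) exposes a `node_systemd_unit_state` metric that can be alerted on directly. A minimal sketch, assuming a hypothetical `mywebsite.service` unit name:

```yaml
# Minimal sketch: alert when the (hypothetical) mywebsite.service unit is not
# active, using node_exporter's systemd collector (requires --collector.systemd).
groups:
  - name: application-units
    rules:
      - alert: WebsiteUnitDown
        expr: node_systemd_unit_state{name="mywebsite.service", state="active"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "mywebsite.service is not active on {{ $labels.instance }}"
```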
##### Availability-based attacks

- To discover coordinated attempts at availability-based deanonymization against your infrastructure, monitor your servers' uptime (Prometheus data source):

```promql
absent(up{application="node",instance="myserver5496497897891561asdf.onion"})
```

- Percentile-based detection of performance-degradation attacks (Prometheus data source):

```promql
histogram_quantile(0.95, sum(rate(http_server_duration_milliseconds_bucket{http_method="GET",http_host="mycoolwebsite.onion"}[5m])) by (le, http_method,http_host))
```

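This percentile query only becomes an alert once you attach a threshold to it. A minimal sketch, where the 500 ms budget and the rule metadata are placeholder assumptions to adapt to your own baseline:

```yaml
# Minimal sketch: fire when 95th percentile GET latency on the onion service
# stays above a placeholder 500 ms budget for 10 minutes.
groups:
  - name: onion-service-latency
    rules:
      - alert: OnionServiceDegraded
        expr: |
          histogram_quantile(0.95,
            sum(rate(http_server_duration_milliseconds_bucket{http_method="GET",http_host="mycoolwebsite.onion"}[5m])) by (le, http_method, http_host)
          ) > 500
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "95th percentile latency above 500ms on {{ $labels.http_host }}"
```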
##### Intrusion detection

- Insider threat: track successful logins and session durations (Loki data source):

```logql
{unit="systemd-logind.service", instance="$hostname"} |= `session` | regexp `.* session (?P<session>[0-9]+).*user (?P<user>[^\.]+)` | label_format session="{{.session}}", user="{{.user}}" | session != ""
```

- If the endpoint used to connect remotely over SSH gets discovered by the attacker and becomes the target of a bruteforce attack (Loki data source):

```logql
count_over_time({unit="sshd.service", instance="myserver"} |~ `.*invalid (user|password)`[24h]) > 0
```

- If you have deployed fail2ban and an appropriate telemetry exporter to monitor it (Prometheus data source), this query can give you a heads-up when you are under attack:

```promql
sum by (instance) (rate(f2b_jail_banned_current[5m]))
```

Season with statistical threshold detection depending on how likely your administrators are to fat-finger their username.

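The Loki queries above can also be turned into standing alerts through the Loki ruler, which accepts Prometheus-style rule files with LogQL expressions. A minimal sketch for the SSH bruteforce check; the rule group name and labels are illustrative assumptions:

```yaml
# Minimal sketch of a Loki ruler alerting rule wrapping the sshd bruteforce
# query above. Group name and labels are illustrative assumptions.
groups:
  - name: ssh-intrusion
    rules:
      - alert: SshBruteforceAttempts
        expr: |
          count_over_time({unit="sshd.service", instance="myserver"} |~ `.*invalid (user|password)`[24h]) > 0
        labels:
          severity: warning
        annotations:
          summary: "Invalid SSH login attempts seen on myserver in the last 24h"
```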
- AppSec monitoring (Tempo data source for traces): if your service collects distributed tracing data, you can create alerts based on specific function durations to discover whether an attacker has, for example, dropped a webshell in a traced function:

```traceql
{duration>=10s && .service.name="my-interactive-website"}
```

Do note that Pyroscope should also be used for continuous profiling, but this is highly application-specific (e.g. monitor critical functions for duration variation). You will want to create recording rules that build Prometheus metrics from your continuous profiling infrastructure so you can alert against those. Creation of recording rules is out of scope for this tutorial, but they use the same language and tooling as alerting rules.

![](1.png)

### But alerting carries risk too!

Indeed. Today we will keep building on the [monitoring](../anonymous_server_monitoring/index.md) tutorial.

Alerts can be used to deanonymize you. If an adversary suspects that you or others are tied to a clandestine infrastructure, they might decide to trigger alerts (say, by bruteforcing one of your endpoints) and see whether you receive notifications through channels they control.

![](2.png)

Grafana supports a large number of notification methods, most of them unfit for our purpose, as those channels can be watched by the adversary:

![](3.png)

- Telegram: opaque and centralized, tied to phone numbers
- AWS, Cisco, DingDing, Slack, Discord... same issue
- Email: you can control it, but it is still traceable and tied to a domain name and clearweb infrastructure, with all the accompanying metadata
- Webhook: simply calls an arbitrary URL with the alert message => **that's the way to go, with the most control for ourselves**

# Tools of the trade

Let's start with webhooks. Webhooks are a great swiss-army knife when you need a system to act on its environment. They only need an HTTP endpoint and the ability to make HTTP requests. They can easily be run as onion services too!

For this tutorial, Mulligan Security open-sourced the tool they use for their own infrastructure: the [grafana-simplex-alerter](http://git.nowherejezfoltodf4jiyl6r56jnzintap5vyjlia7fkirfsnfizflqd.onion/Capably7710/simplex-alerter).

It will be used together with the [simplex-chat](https://github.com/simplex-chat/simplex-chat) CLI client.

## Target architecture

![](4.png)

## Simplex-chat

You can build simplex-chat from its source repository or download it directly from the [simplex-chat release page](https://github.com/simplex-chat/simplex-chat/releases).

Download the simplex-chat Ubuntu release as shown:

![](5.png)

The release page can be accessed through your web browser, or you can download the latest release (at the time of writing) with the following command:

```sh
wget https://github.com/simplex-chat/simplex-chat/releases/download/v6.2.3/simplex-chat-ubuntu-22_04-x86-64
```

### Initializing your Simplex client

Run your client in server mode:

```sh
[user@devnode:~]$ simplex-chat -d clientDB -p 1337 -x
No user profiles found, it will be created now.
Please choose your display name.
It will be sent to your contacts when you connect.
It is only stored on your device and you can change it later.
display name: myAlertBot
Current user: myAlertBot
```

This command will:

- create the two databases your client uses for reconnecting to groups and interacting with the SimpleX network
- set this client's display name (what you will see in your groups when receiving alerts)
- open a websocket port on 127.0.0.1:1337 that the alerter will use

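If you want to confirm the websocket is actually listening before wiring up the alerter, a quick check (assuming `ss` from iproute2 is available) looks like this:

```sh
# Should show a LISTEN entry on 127.0.0.1:1337 owned by simplex-chat
ss -tlnp | grep 1337
```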
## Grafana-Simplex-alerter

### What's this code and how can I trust it?

Great question. This code runs a simple webserver using uvicorn and configures the simplex-chat client based on a YAML config file you provide, containing invite links to the alert groups you want to configure.

#### FOSS
This code is FOSS. All of it. No secret sauce. Furthermore, it's pretty easy to analyze yourself as it's short.

#### Dependency pinning
Everything that goes into building this code is cryptographically pinned and auditable:

- flake.lock => Python version, build tools, and so on
- uv.lock => all libraries used by the code

#### Opentelemetry-enabled
Your alerting system itself should be monitored; this can be done the following way:

- Prometheus metrics: the alerter exposes a /metrics endpoint so you can collect telemetry data about it
- Continuous profiling: you have the option to connect it to a Pyroscope server to get low-level, per-function performance profiles
- OpenTelemetry integration: connecting it to an OpenTelemetry collector such as Alloy will give you correlated logs, traces, metrics and profiles, allowing you to easily debug and monitor it

And if you don't care about any of this: if you don't use those options, no resources will be consumed.

### Building the alerter

#### With nix

If you have the nix package manager installed, you can simply run:

```sh
nix profile install github:MulliganSecurity/grafana-simplex-alerter
```

You will then be able to run simplex-alerter:

```text
[user@devnode:~]$ simplex-alerter --help
usage: simplex-alerter [-h] [-a ADDR] [-m PROMETHEUS_CONFIG] [-o OTEL_SERVER] [-f PYROSCOPE_SERVER] [-b BIND_ADDR] [-d] [-c CONFIG] [-g] [-e ENDPOINT]

options:
  -h, --help            show this help message and exit
  -o, --opentelemetry OTEL_SERVER
                        opentelemetry server
  -f, --profiling PYROSCOPE_SERVER
                        pyroscope server address for profiling
  -b, --bind-addr BIND_ADDR
                        host:port to run the app on
  -d, --debug           enable debug mode, increases pyroscope sampling rate if configured
  -c, --config CONFIG   config file
  -g, --generate-config
                        generate config file with placeholder values
  -e, --endpoint ENDPOINT
                        simplex endpoint
```

#### With Docker

```sh
[user@devnode:~]$ git clone https://github.com/MulliganSecurity/grafana-simplex-alerter.git
cd grafana-simplex-alerter
docker build . -t simplex-alerter
```

For the rest of the tutorial I will show the docker commands. If you installed the alerter using nix, simply replace "docker run simplex-alerter" with "simplex-alerter".

Do take note of the following:

- In docker, the "--rm" parameter is used to automatically destroy the container after it has run. If you are using nix, you can disregard it.

### Running the alerter

#### Create your invite links

Now, on your phone or on your desktop, create a group to receive alerts.

##### Create a group
In this tutorial I will create the "clandestineInfraAlerts" group and invite anyone from within my organization who needs to receive those alerts.

![](6.png)

![](7.png)

I now need to create an invite link so the alerter can send messages there:

![](8.png)

#### Generate a basic config file

To get started we need a basic config file to fill out with our alerter information:

```sh
[user@devnode:~]$ docker run --rm simplex-alerter -g > config.yml
[user@devnode:~]$ cat config.yml
alert_groups:
- invite_link: https://simplex.chat/contact#/?v=2-7&sm...
  name: alert_group0
```

##### Configure the alerter

Update your config file with the invite link you created earlier and set the name to your group name:

```sh
[user@devnode:~]$ vi config.yml
```

so it looks like this:

```sh
[user@devnode:~]$ cat config.yml
alert_groups:
- invite_link: https://simplex.chat/contact#/?v=2-7&smp=smp%3A%2F%2FUkMFNAXLXeAAe0beCa4w6X_zp18PwxSaSjY17BKUGXQ%3D%40smp12.simplex.im%2FlOgzQT8ZxfF3TV_x00c0mNLMFBDkl6gj%23%2F%3Fv%3D1-4%26dh%3DMCowBQYDK2VuAyEAypkpAgfmsShNThQBGvPXxjBk8O03vKe1x0311UHhK3I%253D%26q%3Dc%26srv%3Die42b5weq7zdkghocs3mgxdjeuycheeqqmksntj57rmejagmg4eor5yd.onion&data=%7B%22groupLinkId%22%3A%22EfuyLGxGhsc0iWkqr9NYvQ%3D%3D%22%7D
  name: clandestineInfraAlerts
```

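If you want different alerts routed to different SimpleX groups (for example one group for hardware alerts and one for intrusion alerts), the `alert_groups` list appears to accept several entries. A sketch, with hypothetical group names and truncated placeholder invite links:

```yaml
# Sketch of a multi-group config. Group names are hypothetical and the invite
# links are truncated placeholders; use the full links generated in SimpleX.
alert_groups:
- invite_link: https://simplex.chat/contact#/?v=2-7&sm...
  name: clandestineInfraAlerts
- invite_link: https://simplex.chat/contact#/?v=2-7&sm...
  name: hardwareAlerts
```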
##### Start the alerter

Run the container:

```sh
sudo docker run -v $(pwd):/config --network="host" --rm simplex-alerter -c /config/config.yml -e 127.0.0.1:1337
```

It will connect to the simplex-chat client we started earlier. You can make sure it's running by checking the metrics page:

```sh
curl http://localhost:7898/metrics | less
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 600.0
python_gc_objects_collected_total{generation="1"} 15.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
...
...
...
```

### Configuring a grafana endpoint

Browse to the contact points page in Grafana and click on "add a new contact point":

![](9.png)

Fill out the details. Don't forget the path after the URL: it **must** match your SimpleX group name, as that's how the alerter knows where to deliver the messages.

![](10.png)

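For example, with the alerter listening on the address used for the metrics check above (an assumption; use whatever you passed to `--bind-addr`) and the group created earlier, the URL would look like:

```text
http://127.0.0.1:7898/clandestineInfraAlerts
```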
And click on Test!

Result:

![](11.png)

### Configuring an actual alert

Now that our webhook is ready we can configure an actual alert!

Go to alert rules and click on "create a new alert rule":

![](12.png)

Now you need to configure it:

![](13.png)

- set a name
- make sure it uses the Prometheus data source
- add an alerting condition (must be True to fire)
- use the preview button to check that the alert would indeed be firing upon creation

#### A more useful alert

Since the goal of this alert is to test the end-to-end notification pipeline, it is built to be a minimal example.

A more useful one would be the following, monitoring the uptime of a server based on the reachability of its metrics endpoint:

```promql
absent(up{application="node",instance="myserver5496497897891561asdf.onion"})
```

#### Keeping things Tidy

Your alert must live in a folder.

Click on "New folder":

![](14.png)

and add a folder name:

![](15.png)

#### Alert Evaluation

![](16.png)

Alerts are regularly *evaluated* by Grafana, which means Grafana will run the query at specific intervals and fire the alert if the specified conditions are met.

Let's imagine that we want to keep a close eye on this alert: if 0 ever does not equal 0, we will have big problems.

First configure an evaluation group and set its name. You can leave the one-minute evaluation interval as it's the shortest.

![](17.png)

You can leave all other options at their defaults.

#### Alert Contact Point

Now we are going to use the alert contact point we created earlier:

![](18.png)

Choose the webhook we configured from the drop-down menu.

#### Alert Message

When the conditions are fulfilled, you want some information to be conveyed: this is where you configure it.

![](19.png)

#### And now let's blow up some phones

Save the rule and exit. In one minute it will be evaluated and you will receive a notification:

![](20.png)

And here's the alert:

![](21.png)

# Configuring those systems as systemd services

To turn simplex-chat and the alerter into systemd services, you only need to create two files.

## /etc/systemd/system/simplex-chat.service

```sh
vim /etc/systemd/system/simplex-chat.service
cat /etc/systemd/system/simplex-chat.service
```

```ini
[Unit]
Requires=tor.service
After=tor.service

[Service]
ExecStart=simplex-chat -d /etc/alerter_clientDB -p 1337 -x

[Install]
WantedBy=multi-user.target
```

## /etc/systemd/system/alerter.service

```sh
vim /etc/systemd/system/alerter.service
cat /etc/systemd/system/alerter.service
```

```ini
[Unit]
Requires=simplex-chat.service
After=simplex-chat.service

[Service]
# --network=host and the bind mount mirror the manual docker run used earlier,
# so the container can reach the websocket on 127.0.0.1:1337 and read the config
ExecStart=docker run --rm --network=host -v /etc/alerter-config.yaml:/etc/alerter-config.yaml simplex-alerter -c /etc/alerter-config.yaml -e 127.0.0.1:1337

[Install]
WantedBy=simplex-chat.service
```

## Enable the services
Now enable the services:

```sh
sudo systemctl daemon-reload
sudo systemctl enable --now simplex-chat.service
sudo systemctl enable --now alerter.service
```

# Conclusion

We now have an easy way to send multiple alerts to different groups from our monitoring system; furthermore, those alerts are delivered over Tor through a privacy-preserving messaging system.