HackWeek 2019

Last week team Thinkst downed tools again for our bi-annual HackWeek. The rules of HackWeek are straightforward:
  • Make Stuff;
  • Learn;
  • Have fun.
We discussed HackWeek briefly last year:
Our HackWeek parameters are simple: We down tools on all but the most essential work (primarily anything customer-facing) and instead scope and build something. The project absolutely does not have to be work-related, and people can work individually or in teams. The key deadline is a 10-minute demo on the Friday afternoon. The demos are in front of the rest of the team, and results count more than intentions.
We pride ourselves on being a "learning organization" and HackWeek is one of the things that help make that happen. It's always awesome seeing a software developer solder their first board, or someone non-technical write their first lines of Python.

Project highlights this year: 

Az used the SimH simulator to run an obscure Soviet Mainframe (the BESM-6):

Eventually, he had the mainframe pushing the keys on a Pokemon game running in a simulator using Fortran (because, of course!). Along the way he had to deal with Russian manuals and, uh, learning Fortran.

Mike built "Incubator" to manage our stock of Canary raw materials:

Riaan threw in a physical hack to make sure fewer cars were scratched when parking in the basement, and built a physical status monitor for our support queues:

Keagan decided to combine ModSecurity hackery and testing to add extra protection to our new flocks consoles:

Haroon took a crack at some d3 fiddling to create art (and inspectable graphs) with our customer logos but sadly this can't be shown :)

Quinton used an Arduino and some jury-rigged hardware to keep better track of scores for the indoor cricket games held in the Jhb office:

Jay used the incredible work by the OpenDrop people to create a fake AirDrop service on our Canaries.

You configure it through our Canary console:

Once the bird loads, it becomes visible to people in its vicinity using AirDrop on their Macs or iPhones:

After an attacker submits a file, the Canary alerts as usual:

Donovan flirted with Flask and Python to make another interface to download Canarytokens.

Danielle dived into Verilog to get her Quartus II FPGA to voice-print individuals:

Marco embedded draw.io into our Phabricator setup to allow us phriction-phree-phlowcharting:

Max broke out Unity to build a game for the Oculus:

Matt wrote a game for his Nintendo Switch:

Bradley attempted to give Apple designers aneurysms by affixing a travel LCD to his laptop for a MacGyver'd screen extender:

Nick and Anna paired up to create a hardware/software combo. They used Raspberry Pis, a pack of blank credit cards, stepper motors and (yes) toothpicks to create a 9-digit split-flap display for the Cape Town office.


(I would have totally given it the prize for "most soothing sound made by any HackWeek project, ever".)

Adrian combined the Canary API and his nostalgia for CLI interfaces to make a lo-fi Canary Console:

Yusuf built an app/bot that could be summoned on Twitter to compile tweet-storms into blog posts (and learned the harsh lessons of unforgiving HackWeek deadlines).

"A fun time was had by all" (tm)

Canary Alerts, Part 2 - Bonus Flavours

Canaries and Canarytokens are tripwires that can alert you to intrusions. When alerts trigger, we want to make sure you get them where you need them. While our Slack integration is cool, you might prefer to send alerts through your SIEM. Or to a security automation tool. Maybe you want to leverage our API to integrate Canary alerts into a custom SOC tool. Want to turn a smart light bulb red and play the Imperial March? You could do that too.
(An IFTTT applet that blinks a light when a Canary alert is received.)

Your way or the highway

We often puzzle at products that require customers to totally revamp how they do things. We never presume to be the most important tool in your toolbox, which is why our product is designed to be installed, configured, and (somewhat) forgotten, in minutes. We’d rather disappear into your existing workflow, only becoming visible again when you need us most.

Our customers dictate where and how they see our alerts. To enable this, we provide a wide variety of flexible options for sending and consuming alerts.

By default, you’ll get alerts on your console...

In your email…

...and as a text message.

And that’s not all…

For those of you wondering where the SIEM love is at, don't worry. We can send syslog where you need it, as securely as you need it. A quick email to support@canary.tools with the details of your syslog endpoint will get the logs flowing in no time.

For Splunk fans, we have a Splunk app that works with both Splunk Enterprise and Splunk Cloud. Details on installing and configuring the Splunk app can be found in our help documentation.

Email can also be an easy way to integrate Canary alerts with other tools. For example, most task and ticket management systems support creating tickets or tasks from an email. ServiceNow and BMC Remedy are common in large enterprises, but what about something simpler, with a free plan? Something you could set up in minutes, like a Canary?

Build a SOC dashboard in 5 minutes, for free

We’re going to use Trello as an example of how flexible email can be for alert integration.

It turns out, Trello aligns well with the spirit of simple, fast and ‘just works’. Finding the custom email address that allows new card creation takes just a few clicks. Then, paste it in the email notifications list in your console settings and you’re good to go. Canary alerts will start showing up in Trello on the board and list you chose to attach the Trello email to.

A simple three-list configuration should work for basic alert triage: new alerts, acknowledged (being worked) and completed.

Any Canary or Canarytoken that triggers will result in a new card dropping into the New Alerts column immediately. Drag the card to the Ack column and assign it to someone; Trello can notify them (based on your Trello configuration). Each card contains the full content of the alert and supports comments and attachments.

Once the investigation is complete, the card can be dragged over to the final column.

And, of course, an API

Anything you can do or view in the Canary console can be done via our fully documented API. It’s possible to control Canaries, create Canarytokens, view alerts, manage alerts and much more. Following is a simple bash script demonstrating how to grab a week’s worth of alerts and dump them into a spreadsheet-friendly format (CSV). Also available as a gist.

#!/bin/bash
# Create a CSV with the last week's worth of alerts from your Canary console
# Requires curl and jq to be in the path

# Set this variable to your API token
token=deadbeef12345678

# Customize this variable to match your console URL
console=ab123456.canary.tools

# Date format (one week ago; BSD date syntax - on Linux use: date -d '1 week ago')
dateformat=$(date -v-1w "+%Y-%m-%d-%H:%M:%S")

# Filename date (right now)
filedate=$(date "+%Y%m%d%H%M%S")

# Complete filename
filename="$filedate-$console-1week-alert-export.csv"

# Base URL
baseurl="https://$console/api/v1/incidents/all?auth_token=$token&shrink=true&newer_than"

# Run the jewels
echo "Datetime,Alert Description,Target,Target Port,Attacker,Attacker RevDNS" > "$filename"
curl -s "$baseurl=$dateformat" | jq -r '.incidents[] | [.description | .created_std, .description, .dst_host, .dst_port, .src_host, .src_host_reverse | tostring] | @csv' >> "$filename"
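If you'd rather stay out of bash, the same export can be sketched in Python. This is a best-effort translation of the script above - it assumes the same /api/v1/incidents/all endpoint and the same response shape, so treat it as a starting point rather than an official client:

```python
# Python sketch of the same export: pull a week of incidents and write a CSV.
# Assumes the endpoint and field names used in the bash script above.
import csv
import datetime
import json
import urllib.parse
import urllib.request


def incidents_to_rows(incidents):
    # Mirrors the jq filter: each field lives under the incident's
    # "description" object, and everything is stringified for CSV.
    fields = ("created_std", "description", "dst_host",
              "dst_port", "src_host", "src_host_reverse")
    return [[str(i["description"].get(k, "")) for k in fields]
            for i in incidents]


def export_alerts(console, token, out_path, days=7):
    since = datetime.datetime.now() - datetime.timedelta(days=days)
    params = urllib.parse.urlencode({
        "auth_token": token,
        "shrink": "true",
        "newer_than": since.strftime("%Y-%m-%d-%H:%M:%S"),
    })
    url = f"https://{console}/api/v1/incidents/all?{params}"
    with urllib.request.urlopen(url) as resp:
        incidents = json.load(resp)["incidents"]
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Datetime", "Alert Description", "Target",
                         "Target Port", "Attacker", "Attacker RevDNS"])
        writer.writerows(incidents_to_rows(incidents))
```

Call export_alerts("ab123456.canary.tools", "deadbeef12345678", "alerts.csv") to produce the same spreadsheet-friendly output.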

Taking Flight

Like everything else Canary-related, alerts should be dead simple and easy to work with. Though alert volumes from Canaries are incredibly low (customers with dozens of Canaries report just a handful of alerts per year) we include a bunch of options to cover everything from common requests to esoteric requirements.

If you have any clever ideas on integrating alerts or consuming them, we’d love to hear them! Drop us a message on Twitter @ThinkstCanary or via email, support at canary dot tools.

Alerts Come in Many Flavours

If you force people to jump through hoops to handle alerts, they'll soon stop doing it 🤯
Canary optimizes for fewer alerts, but we also ensure that you can handle alerts easily without us. It takes just 4 minutes to set up a Canary, and far less to pull our alerts into Slack.

By default, your console will send you alerts via email or SMS, but there are a few other tricks up its sleeve. It is trivial to also get alerts via webhooks, syslog or our API.

This post will show you how to get alerts into your Slack. The process is similar for Microsoft Teams and other messaging apps that use webhooks for integration. It’s quick, painless and super useful.

(This post is unfortunately now also bound to be anti-climactic - it’s going to take you longer to read this than to do the integration).

Did you know how easy this can be?
The Canary Console can integrate with Microsoft Teams and Slack in seconds and with a few more steps, can integrate with any other webhook-friendly platform. The process is similar for most platforms, but here’s how it looks for Slack.

  1. Enable Webhooks in your Canary Console settings.
  2. Click Add to Slack, choose the channel to drop alerts into, and click Allow.
  3. That’s it! You now have Canary alerts showing up in Slack. Elapsed setup time? About 30 seconds.

Now that you’ve got Canary alerts integrated into Slack, you can actually interact with them. When an alert shows up in Slack, you’re given an option to mark it as ‘seen’, which removes it from the queue of unacknowledged alerts.

You can even permanently delete it from inside Slack - no need to even log into the console. Here’s a peek at what the process looks like.
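For platforms without a one-click integration, pointing the console's webhooks at a small endpoint of your own works too. Here's a minimal sketch in Python (stdlib only); the JSON field names are illustrative assumptions, not the documented webhook schema, so check your console's webhook output for the real payload:

```python
# Minimal sketch of a generic webhook receiver for Canary alerts.
# The payload field names below are illustrative assumptions.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alert(payload):
    """Reduce a webhook payload to a one-liner for your chat/ticket tool."""
    return "{}: {} -> {}".format(
        payload.get("Description", "Canary Alert"),
        payload.get("SourceIP", "?"),
        payload.get("CanaryIP", "?"),
    )


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        print(summarize_alert(payload))  # forward to your tool of choice here
        self.send_response(200)
        self.end_headers()


# To run it: HTTPServer(("127.0.0.1", 8080), AlertHandler).serve_forever()
```

From there, forwarding the one-liner into Teams, Mattermost, or a ticketing queue is whatever your tool's inbound API makes of it.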

Why we’re so keen to get alerts out of the console

You’ve got enough consoles already. Heck, you may even have multiple "single panes of glass". We’re not interested in adding our console to the already long list of security tools to check on a daily or hourly basis. We realise and deeply understand that it’s not about us, it’s about you. That’s why we make it so easy to pull Canary alerts into your existing workflows.

Live in Slack? We’ll alert you there.
Live on your phone? We’ll text you.
Live in Outlook? We’ll drop you an email.
Want all-of-the-above, just in case? We can do that too.

I'm Running Canaries, but...

...what if someone finds out?

Do attackers care if there are canaries in my network?

People wonder if they need to hide the defensive tech used on their networks. Like all interesting dilemmas, the answer is nuanced.

In defense of obscurity

In any discussion about obscurity you will almost certainly have someone shout about “security through obscurity” being bad. As a security strategy, obscurity is a terrible plan. As an opportunity to slow down or confuse attackers, it’s an easy win. Every bit of information an attacker has to gather during a campaign gains the defender time.

This is very much a race against time. No breach happens the moment a shell is popped or SQL injection is discovered. Attackers are flying blind and must explore the environments they’ve broken into to find their target. Defenders can seize the opportunity to stop an incident before it becomes a breach.

It is often true that attackers typically operate with a fuller view of the chessboard than defenders. However, when environments are running with defaults, they meet attackers' expectations. Defenders who are able to introduce unexpected defenses or tripwires to this chessboard can turn this asymmetry to their advantage.

What are defenders so afraid of?

Defenders tend to be concerned that their security products:
1. could, themselves, be insecure
2. may not work as expected when attacked
3. could possibly be evaded if attackers are aware of them
4. will simply eat labor without producing much value

Pardon the pun, but this isn’t a very defensible position to be in.

We know very well from Tavis Ormandy, Joxean Koret, Veracode, and others that security software and products are notoriously insecure. According to Veracode, in fact, they come in next-to-last place.

If that’s not discouraging enough, the average security product is difficult to configure, challenging to use and requires significant resources to run and maintain. There is no shortage of reasons for wanting to hide the details of security products in use.

The Importance of Resilience

Let’s consider the flipside for a moment: offensive tools and capabilities. There’s a solid argument for keeping offensive capabilities secret. For example, the zero-day vulnerabilities used by Stuxnet wouldn’t have been as effective if they had been previously reported and patched. For some time, military aircraft have had advantages because details of their capabilities or even their very existence were closely guarded secrets.

Defenses are a very different case, however. They must stand the test of time. They are often visible to outsiders and similar to defenses used by other organizations; vendors, after all, advertise their products in order to sell them. Defenses need to hold up under close scrutiny and be robust enough to last for years without needing to be replaced. Keeping them secret might slow an attacker down, but not by an appreciable amount.

Ultimately, defenses need to work regardless of whether attackers are aware of their presence.

Attackers Discover Your Secret: Canaries

It’s okay - we’ve planned for this moment. We spent significant effort ensuring Canaries are unlikely to ever be the ‘low hanging fruit’ on any network. We’ve also made architecture choices that minimize blast radius should a Canary ever be exploited (e.g. we won’t span VLANs, ever). In short, compromising a Canary would be very difficult and will never improve an attacker’s position.

With a direct attack against a Canary unlikely to prove useful, let’s look at the attacker’s remaining options.

Scenario 1: The attacker has no idea you’ve deployed Canaries and Canarytokens. Since they’re not expecting honeypots, they’re less concerned with being noisy. They’re likely to trip alerts all over the place, as they run scans and attempt to log into interesting-looking devices.

Scenario 2: The attacker knows you use Canaries, but they’re flying blind. Even though they know honeypots are in use, they don’t know which are real and which are fake. This presents them with a dilemma - being sneaky is a lot more work, but they still need some way of exploring the network without triggering alerts. It’s likely to be in the attacker’s best interest to find a different target.

An unexpected bonus we never planned for is that Canaries are super scalable. Many customers start with five or ten and grow to dozens or hundreds. Stepping back into the attacker’s shoes - are you on a network with five or five hundred? Has this organization deployed a hundred Canarytokens or a million?


The underlying principle is a shift in thinking. Defeatist phrases like, “it’s not a matter of if, but when you get breached” have discouraged defenders. The reality is that the attacker is typically coming in blind, while the defender has control over the environment. By setting traps and tripwires, the defender can tip the outcome in their favor.

We think it’s a very positive and empowering change for the defender mindset. It’s your network - own it and rig the game.

Introducing Rapsheet

We've got hundreds of servers and thousands of Canaries deployed in the world. Keeping them healthy is a large part of what we do, and why customers sign up for Canary. Monitoring plays a big role in supporting our flocks and keeping the infrastructure humming along. A pretty common sight in operations is a dashboard covered with graphs, charts, widgets, and gizmos, all designed to give you insight into the status of your systems. We are generally against doing things "just because everyone does it" and have avoided plastering the office with "pew-pew maps" or vanity graphs.

(although the odd bird-migration graph does slip through)

As with most ops-related checks, many of ours are rooted in previous issues we've encountered. We rely heavily on DNS for comms between our birds and consoles, and interruptions in DNS are something we want to know about early. Likewise, we want to ensure each customer console (plus our other web properties) is accessible.

There are tools available for performing DNS and HTTP checks, but our needs around DNS are somewhat unusual. We want to be able to quickly confirm whether each of our consoles is responding correctly, across multiple third-party DNS providers (e.g. Google, Cloudflare, OpenDNS). For a handful of domains that's scriptable with host, but for many hundreds of domains it becomes a problem if you want to detect failures quickly (i.e. within tens of seconds).
To plug this gap we built Rapsheet, which gives us a "list of crimes" for our systems as seen by intermediary network service providers. In this post I'll provide a quick run-through of why we built it.

Goal: "zero measurements"

In this post, I'm going to dive a little deeper into the thinking behind Rapsheet and why the dashboard behaves the way it does: why "zero measurements" are the goal, and what that actually means. (Hint: it doesn't mean "take no measurements".)
Much of the thinking was expounded in this awesome Eric Brandwine AWS re:Invent talk:

If you have not yet watched it, you really should. Eric is whip-smart and an excellent presenter.
The key takeaway for this post is that alarms and dashboards often lead to engineers and other technicians developing what is known as "alarm deafness". Primarily studied in the medical world, it describes the situation where operators rely on a raft of metrics with tunable parameters. If the parameters are too tight, the operators learn to ignore the alarm. If they're too loose, bad things happen without anyone being the wiser. Alarm deafness grows when a specific check or metric is constantly in an "alerting" state. Eric points out that the best way to alleviate alarm deafness is to strive constantly for "a zero measurement": as soon as you see anything other than zero, you know that action is required.

If you can find those zero-measurement metrics, they provide clear tasks for the operations folks, whose key objective is to keep all panels in a non-alerting state (i.e. at zero). With our current setup, I will drop most things I'm busy with whenever I see a panel in any colour other than green.

A counter-example, i.e. a non-actionable measurement, is almost anything that provides a count or tally of an expected condition (even if it's an error). For example, we rely on DNS for a range of services and are able to track DNS failures across multiple providers. A poor metric would be a graph of how many DNS queries have failed within the last few hours against any of the DNS servers. The graph may look interesting, but nothing about it leads to an exact action an engineer could take. Transient network issues cause queries to fail every so often, and our engineers aren't expected (or authorised) to fix the backbone network between our hosting site and the third-party DNS servers.

Instead, we track only which of our servers are not currently responding correctly to our various DNS queries. With this metric it becomes a lot easier to spot patterns, and therefore to understand the root cause of the failing DNS queries. For example, we rely on Rapsheet to tell us when a particular DNS provider is experiencing issues, when a particular server is experiencing DNS issues, or when one of our domains has been flagged as unsafe. Issues then get mitigated and resolved more timeously.
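The difference between the two kinds of metric is small in code but large in practice. A sketch (the data shapes here are hypothetical, not Rapsheet's actual internals):

```python
# Sketch of a "zero measurement": rather than counting failed DNS queries
# over time (a tally that is rarely zero and rarely actionable), report
# which servers are failing right now. The panel is green iff the list is
# empty. Data shapes are hypothetical, not Rapsheet's internals.
def failing_servers(latest_results):
    """latest_results maps (server, provider) to True/False: whether the
    most recent check of that server via that provider succeeded."""
    return sorted({server
                   for (server, provider), ok in latest_results.items()
                   if not ok})
```

A healthy fleet returns an empty list - the zero measurement that lets an operator ignore the panel with a clear conscience.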


At its core, it's straightforward. A long-running process uses asyncio to periodically perform a bunch of network tests (DNS, HTTP) and pumps the results into InfluxDB; a simple UI is built on top in Grafana.

The tests are tied together based on a customer, so we can quickly determine if an issue is customer-specific or more general.

Rapsheet has modules implementing each of its tests. Current modules include DNS checks against four different DNS providers, and various HTTP health checks covering reachability, served content, and endpoint presence on site blacklists such as Google's Safe Browsing. All endpoints are checked asynchronously and the results collated before the metrics are posted to InfluxDB.
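Rapsheet's code isn't public, but the gather-and-collate pattern it uses can be sketched roughly like this (the check coroutines below are stand-ins for the real DNS/HTTP modules):

```python
# Rough sketch of the gather-and-collate pattern: run every check for
# every customer concurrently, then group the results per customer so
# customer-specific vs general failures stand out. The check coroutines
# are stand-ins, not Rapsheet's real modules.
import asyncio


async def dns_check(customer):   # stand-in: would query several providers
    return ("dns", customer, True)


async def http_check(customer):  # stand-in: would fetch the console page
    return ("http", customer, True)


async def run_checks(customers, checks):
    results = await asyncio.gather(
        *(check(c) for c in customers for check in checks))
    by_customer = {}
    for kind, customer, ok in results:
        by_customer.setdefault(customer, {})[kind] = ok
    return by_customer


status = asyncio.run(run_checks(["acme", "globex"], [dns_check, http_check]))
```

Grouping per customer is what makes the "is it just them, or is it everyone?" question a single glance.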

Each new panel added to the dashboard goes through a number of backend iterations to ensure it keeps to this mindset.

It’s built to be extensible, so we can quickly add new zero-based measurements we want to track. 

A recent example was exception counting. We have, as a design goal, the fixing of unhandled exceptions. The dashboard has an "exception monitoring" panel that tracks the number of unhandled exceptions across all of our deployed code (on customer consoles and our own servers). Coming back to the notion of focus, it becomes a very clear goal for me: handle the exceptions that are cropping up. Most of the time this involves a deep dive into what is causing the exception and how best to mitigate those cases. When introduced, the panel would get tripped up by noisy but harmless exceptions (and would just burn red). After a flurry of root-cause fixes, and thanks to the goal of driving them to zero, we now see only a couple every other day across hundreds of production systems (and work is under way to alleviate those too).


Some of the panels involve performing requests or queries against third-party services. For example, we want to know that someone using the Chrome browser is able to log in to their console.
In order to "play well with others" and use third-party services fairly, Rapsheet rate-limits its queries (especially since asyncio can push out lots of traffic). Most third-party services publish their rate limits so you can implement them correctly. Interestingly, Google's DNS service doesn't; only by testing the limits in practice did we figure out how aggressively to throttle our queries against it.
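One way to keep an asyncio checker polite is a semaphore plus a pacing sleep. A rough sketch (the concurrency limit and interval below are made-up illustrative values, not any provider's published numbers):

```python
# Sketch of rate-limiting asyncio queries against a third-party service:
# a semaphore caps in-flight requests, and a short sleep while holding a
# slot paces successive queries. The default limit of 10 and 0.1s interval
# are illustrative values only.
import asyncio


async def polite_query(sem, do_query, *args, min_interval=0.1):
    """Run do_query(*args) while holding a semaphore slot, then pace."""
    async with sem:
        result = await do_query(*args)
        await asyncio.sleep(min_interval)  # hold the slot briefly to pace
    return result


async def run_politely(coro_fn, items, limit=10, min_interval=0.1):
    sem = asyncio.Semaphore(limit)  # at most `limit` queries in flight
    return await asyncio.gather(
        *(polite_query(sem, coro_fn, item, min_interval=min_interval)
          for item in items))
```

Throughput tops out at roughly limit / min_interval queries per second, which gives you one obvious knob to turn when a provider starts pushing back.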

Location, location, location

Almost all of our Canary services run inside AWS, so it made sense to run the monitoring outside AWS. We didn't want a situation where everything looked good from tests inside AWS while, due to an AWS issue, we were silent to the outside world. So we run Rapsheet at a wholly independent provider.

So far, it's proved quite adept at picking up issues in that provider's network...


The dashboard is projected onto a wall in the operations office, which is the main thoroughfare from our desks to the office facilities, so anyone working in the office sees it when leaving their desk.

(Of course, it’s online for remote workers who are able to keep it open on their desktops if needed.)

Rapsheet is a work in progress, but helps us make sure that things are ticking along as expected.

Introducing the Office 365 Mail Token

Shared passwords, sensitive documents: mailboxes are great targets for attackers. Would you know they were targeted? We’ve got your back! Our Office 365 token deploys to thousands of mailboxes in minutes and alerts you when someone is snooping around.

Why an Office 365 Mail token?

Enterprises have been flocking (ha) to Office 365 for years now, and a large number of Thinkst customers use it. Their Canaries will detect attackers on their networks, but nothing lets them know if an attacker has compromised a single mailbox and is snooping around.

Canarytokens are great at becoming high fidelity tripwires in places that other tools can’t easily go. You can quickly head over to https://canarytokens.org to create a token, and then place it in Bob’s mailbox, but how does this work for an entire office? Will it work for an entire org?


The Office 365 Mail token drops a pre-written, tokened email into multiple mailboxes at once. We insert the emails into mailboxes directly, so they avoid getting caught by email security filters. We avoid dropping them in the default inbox so users won't stumble on them accidentally, but an attacker searching for booty can still quickly find one and trigger an alert.

Deploying the Token

To deploy this token, there are a few easy steps.

  1. Log into an Office 365 account that has the proper permissions (details here). Bonus: this token also works with on-prem Exchange implementations; see the link above for details.
  2. Log into your Canary Console and choose the Office 365 Mail token under Canarytokens.
  3. Select the mailboxes to token from the list presented to you.
  4. You now have tokened mailboxes, which will be displayed in the list of enabled Canarytokens.
  5. Wait for some unsuspecting attacker to stumble upon the email. To test it yourself, search for "password reset" and you're likely to find the gift we left for attackers.

When someone stumbles upon the trap, you’ll receive an alert like this one.

While it's difficult to rule out false positives altogether, we employ a few tricks to avoid them that require no additional effort on your part. First, we place the email in the archive folder, reducing the chance of legitimate users finding this email in their own inbox. Second, because we insert the email directly into the mailbox, we avoid security gateways inspecting tokens directly and creating false positives.

Tokens like this are great for the attacker details they give you, but they're also useful just as a heads-up: someone just searched for password-reset emails in Bob's mailbox. That's probably something you should be aware of.

Wrapup; What's Next?

With the Office 365 Mail token, we’ve gone from some basic token ingredients to something that simply scales to hundreds of mailboxes in the same 3-4 minutes it takes to deploy a Canary. That's it - quick, easy and likely to catch the bad guys.

For more thoughts on Canarytokens, check out our post on the AWS API Key token. The official documentation for Canarytokens is a concise and useful read as well.