USENIX Security Symposium 2019

Thinkst in Santa Clara

Last week Haroon and I found ourselves at the 28th USENIX Security Symposium in balmy Santa Clara. We made the trip from Vegas for Haroon's invited talk at the main event, and I took the opportunity to present at one of the side workshops (HotSec). This is a short recap of our USENIX experience.

Neither Haroon nor I have attended USENIX events previously, despite over 20 Black Hat USAs between the two of us. What's worse, we both used to read ;login: regularly, and the research coming out of USENIX Security is typically thorough. When this opportunity presented itself, we couldn't turn it down.

Drawing comparisons between USENIX and Black Hat/DEF CON is a bit unfair as they have entirely different goals, but given that they run in consecutive weeks, I think it's ok. Compared to Black Hat/DEF CON, the obvious differences are the smaller scale (there were fewer speaking rooms and smaller audiences), the primarily academic focus, and no side events that we saw. (Black Hat and DEF CON usually have a ton of parties and events in parallel with the main conference.) USENIX is billed as a place for industry and academia to meet, but most of the talks were academic. I'll come back to this shortly.

The event

Event organisation was slick, and the venue was a welcome respite from the Vegas casinos. No one was screwing around with the WiFi (at least, not detectably...) and the AV just worked for the most part. Session chairs played their role admirably in corralling speakers and audiences, keeping close track of time and chiming in with questions when the audience was quiet.

USENIX was much more sedate than either Black Hat or DEF CON. No Expo area (the handful of sponsors each had a table in a small foyer), no special effects, no massive signs, no hacker cosplay, no one shouting in the corridors, no visible media, no gags or gimmicks. The list goes on. It just reinforces how much of an outlier the Vegas events are.

Our talks

Haroon's talk used our Canarytokens as a lens to explore how defensive teams need to embrace quicker, hackier solutions to win. The central thesis is that the current battlefield is too fluid, which favours the lighter, more agile M.O. of attackers. We'll publish more details on this in the coming weeks.

My talk was a brief exposition of our take on honeypots as breach detectors vs observers, backed by our experience running Canary. In the next few days we'll publish another post here delving into this.

.edu dominates

Turning to the talks, virtually all of them were densely packed with information. The acceptance rate was something like 15% (115 from 740 submissions), and (as is typical for academic conferences) authors submitted completed works. To rise above the pack, papers must cover lots of ground, yet accepted authors only have a 20-minute slot to present their work and take questions. This means the authors fly through the highlights of their research, frequently leaving large chunks of content out of the presentation and deferring to the paper in order to make the time limit. That's at odds with Black Hat's 50-minute slots, which usually include a weighty background section (I recall us having to fill 75 minutes at Black Hat at one point).

Abbreviated talks also mean that sometimes the speakers just have to assume the audience has a background in their topic; there's simply not enough time to cover the background plus what they did. In those talks, expect to end up reading the paper, as you can quickly be left behind.

In contrast to Black Hat's many parallel tracks with breaks between every talk, USENIX ran three parallel tracks with up to five 20-minute talks in a row. This meant that you could potentially see 53 talks if you sat in one of the main speaking rooms for the three days. It's a firehose of new work, and it was great.

Academia dominated the talks. Of the 36 talks I saw, just two were purely from industry (both were from the Google Chrome folks). I suspect the completed paper requirement serves as a natural barrier against submissions from industry. A completed paper is the output of regular academic work, and finding a publication venue afterwards is a stressful but comparatively minor part.

For industry folks, a research paper isn't a natural goal; they'd need to set aside time for this side project. It's easier to hit an earlier result (like a bug or technique) and submit to an industry conference. Since career progression isn't tied to paper publication, there's much less incentive to write one.

In addition, the standards for talks are very different. It's clear that merely finding a bug or figuring out a way to find bugs isn't going to get your paper accepted. Virtually every paper had extensive surveys or evaluations. At USENIX there's a big push towards gathering data, either by measuring prevalence or by designing tests for evaluating new defences, and then making that data available for others to analyse. Collecting data is a large part of the research effort. Contrast that with a Black Hat talk which describes (say) a new heap manipulation technique demonstrated with a specific bug: the bug's prevalence will get a cursory slide or two, but the talk is accepted if the exploitation techniques are novel.

Talk skim

In terms of the talks, it was a deluge of content. Speculative execution attacks had a bunch of attention, with new variants being demonstrated as well as a deepening of the fundamental understanding of the attacks. One talk highlighted that not only is execution speculative, but so are other operations, like memory loads. The authors demonstrated a speculative load attack in which an attacker can leak physical memory address mappings. This category of research is now squarely led by the academy.

There was a talk on threats in the NPM ecosystem, showing that the average NPM package trusts 79 other packages and 39 maintainers. That's food for thought when worrying about supply chain attacks and software provenance. The authors also showed that a tiny group of 100 maintainers appears in 50% of the dependencies (i.e. a small group to subvert in order to affect a huge swathe of dependents). A later talk on In-Toto, a software supply chain protection tool, provided some limited hope for finding our way out of the supply chain mess.

I enjoyed the ERIM talk, which claims a way to achieve in-process memory isolation. This could be used to let a process compute cryptographic results over a key stored in the process' own memory, but still prevent the process from reading the memory. Kinda wild to think about.

There was one honeypot-related talk I saw. The authors realised that honeypot smart contracts are a thing; apparently scammers deploy contracts which appear to have flaws in them, prompting folks looking for smart contract vulnerabilities to send Ether to the contracts in the hopes of exploiting them. However, the flaws are mirages; it's an example of a scam that takes advantage of other scammers.


There were further talks on crypto (cryptography, cryptographic attacks, and cryptocurrencies), hardware, side-channels galore, web stuff, and much much more. A good portion dealt with building better defences, which is in further contrast to Black Hat's primarily offence-oriented talks.

We hope to return to USENIX soon; while the time away was significant, it was well worth it.

PS. Seeing Rik Farrow in person was a delight, exactly what you'd imagine he might look like. Sandals, Hawaiian shirt and ponytail!

One Month with Thinkst

Recently, I was faced with a career dilemma.
  • Go back to the enterprise and be a CISO
  • Take a gig that would be part research, part bizdev
  • A research and writing gig
  • Consulting/Advisory work
  • Join another vendor
SPOILER: I chose the last one… but why?

Why Thinkst?

Thinkst Applied Research is the company behind the popular Canary product. Though they started off as more of a research firm that would build various products, the Canary product took off and has become their primary focus.

They are a moderately-sized company of enthusiastic industry veterans, developers and engineers that love to learn and try new things. They’ve managed a sort of startup nirvana: bootstrapped with a popular product that customers openly love and a great company culture. When Haroon pitched me the idea of joining to help out, I was immediately flattered, excited and skeptical.

I knew Canaries were wildly successful. It’s kinda hard to ignore. Especially hard to ignore when you’re working product marketing for another vendor, trying to figure out how to recreate that same fierce customer/brand loyalty and excitement. The chance to join a company that had already figured this out… yes. Very much yes, I wanted to be a part of that.

Though I knew much about the product, what did I know about the company? Not much, I had to admit. I knew that there were 10–15 employees. I knew they had never taken funding. Oh yeah — and they’re based in South Africa.

I had some reservations.
I’ve never been to the continent of Africa, much less South Africa. Not that they needed me there.

I have built a measure of trust with Haroon over the years. We’ve been chatting regularly over the last five years and we were both very much on the same page when it came to principles and InfoSec. He clearly knew how to execute on ideas. I was down for a chat at the very least.

My key reservations related to the ability of the company to support me and the distance between us (literally, not metaphorically). Specifically:
  1. Was this company large enough and stable enough to meet my salary needs without some sort of weird comp plan arrangement? (Note: I’ve never been a dedicated sales guy, so all comp plans are ‘weird’ to me.)
  2. Though I was more than comfortable working remotely, being the only US employee of a South African company was super remote. How remote?
If you think my travel time to them looks bad, imagine what half the company goes through when attending Black Hat USA (another 5 hours West of me).
He had sold me on his product, but now, Haroon set his mind on selling me on his company. It was no less compelling. While Thinkst is small in employee count, they’re certainly not in terms of revenue, profits or customers. I’d say they handily beat 99% of the rest of the security industry by many metrics. For example, their revenue-to-employee ratio is probably double or triple the average I’ve seen in vendor-land. Let’s just say my concerns were addressed satisfactorily. #HumbleBrag #NewEmployee #BrowniePoints #Shameless

The distance was still a concern, however. Doing the math in my head, I realized Cape Town was six hours ahead of me. In fact, all of South Africa is one timezone — South Africa Standard Time (SAST). When I start my day at 9am in Eastern Time (the closest the US gets to SAST), the folks in Joburg and Cape Town are almost done with their day. Except… they’re not.

Though they have customers all over the world, a large chunk of them are in the US and a large chunk of those are in California. PST is an additional 3 hours West of me, so they have zero overlap with a normal 9–5 workday in South Africa. That means it has become fairly normal for part of the Thinkst team to work late. I was well aware of this — while one of my primary concerns was the distance between us, I knew that a big part of the attraction of hiring me was the fact that I wasn’t in South Africa.

Yes, in addition to my charming sense of humor and scruffy visage, I was in that sweet EST time zone. The same time as New York City — only 3 hours off from Silicon Valley, but still decently overlapped with South Africa and the United Kingdom. In addition to my time zone, there’s the fact that I can attend conferences for a tiny fraction of the cost and effort it takes the team in South Africa to get over to this continent.

This was a tough choice. So naturally, I asked my family for help. I built a “help me pick my new job” presentation and presented it to them.

Hmmm, looks like some visual bias occurred here? A coincidence, surely.

After some collective eye-rolling and deep sighs, my personal board of directors agreed to grant me five minutes of their attention. They came out of it Team Thinkst all the way (and surely, with new appreciation for my presentation skills).

My first week

As with most jobs these days, especially as a remote employee, the first thing you get access to is email and other corporate/backoffice systems. They also mentioned they’d be sending me some things. I was actually in Indianapolis for CircleCityCon with my kids when the packages started arriving. After a few days, my wife messaged me. “It’s looking like Christmas on your side of the table!”

Sure enough, I came home to:
  • Stickers
  • Business cards
  • A GoRuck GR1
  • A new MacBook Pro
  • A custom Canary Space Pen (I hear it may be possible for some of these to get into non-employee hands…)
  • A custom Thinkst Canary-themed WASD keyboard
  • 3 Canary t-shirts
  • 1 Canary hoodie
  • A free Audible subscription (SQUEE — big audiobook fan here!)
  • 5 Canaries
  • and a Partridge in a Pear Tree
This is a BEAST of a backpack. Pretty sure it can stop bullets.
My wife was right, it was like Christmas morning. What I was most excited about, however, wasn’t the cool swag or mechanical keyboard.

I had heard so much about these Canaries, I was ready to dig in and see what all the fuss was about. A few minutes after I started setting them up, it occurred to me that I should capture this moment. Obviously, it should be no problem for the Thinkst staff to set up a Canary in five minutes or less — they’ve been putting on Canary demos for years! A new employee with no previous experience with these devices was a perfect opportunity to see if the product matched the writing on the tin.
Video coming as soon as I finish learning how to use iMovie.
Side note: I previously didn’t have a positive impression of DHL, but now, I’m seriously impressed with them. They not only regularly get stuff to me in 2–3 days from South Africa, they will pick up packages from my house, to deliver to South Africa in the same 2–3 days!


So much is said about company culture, but it often comes off as forced or is used as a facade. If there’s a Foosball table, you shouldn’t be judged a slacker for using it. Culture isn’t a ‘perk’ or tool that exists to balance out dysfunctional management and inefficient processes. A snapshot view of how Thinkst does things probably looks like the most hipster, trendy SF startup thing ever. When you get to know everyone and start working with them, you realize they come by all of it honestly.

Email — Sales and support use it to communicate with customers, but that’s it. I had to go check just to make sure what I’m saying is accurate, but I’m exactly at the one month mark and I have yet to receive a single internal email from anyone. As far as I can tell, there’s no employee handbook or rule that says we can’t use email to discuss things, we just use Slack for all that.

Slack — Thinkst is far from the only organization leaning heavily on Slack for internal communications, but I found it interesting how Thinkst organizes things there. There’s no #general and no #random, so I was a bit off-balance at first. Once I got used to the organization, it all made sense, including Thinkst’s flavor of ChatOps, which is very impressive. My favorite channel is #learning. I had been wondering how a staff of 15 manages and supports north of 600 customers. Good, efficient communication and smart use of Slack’s integration features.

Bi-weekly Tech Talks — I’ve been places that talked about doing things like this and occasionally did them, but they’d often get cancelled, as ‘Tech Talks’ were a low-priority item in management’s eyes that took up an inconvenient amount of time. At Thinkst, they always happen and they’re recorded, so anyone who misses one can go back and watch it. I find myself really looking forward to these and can’t wait to do my first one.

Meetings — At each of my last four jobs, I’ve shared a link to Jason Fried’s epic TED talk, Why work doesn’t happen at work. I haven’t sent this video to anyone at Thinkst. Aside from customer demos, the Bi-weekly Tech Talks and the odd spontaneous chat, there aren’t any meetings. I think the devs do standups. Looking at others’ availability on the corporate calendar, I can confirm that I’m not just being shielded from meetings so the ‘new guy can settle in’. They generally don’t exist, because they’re largely unnecessary thanks to other positive aspects of the culture here.

Decisions can be made without meetings. Announcements can be made without meetings. Basically, anything other companies have meetings for, happens asynchronously in Slack, where it doesn’t disrupt anyone’s workflow. Coworkers and management often take hours to respond. There’s no ‘ego’ here creating a culture of fear that makes employees feel like there’s a race to respond.
Can a company be both laid back and super productive?
It sounds odd to say it this way, but Thinkst manages to give off a laid-back, calm vibe. At the same time, everyone’s busy and you can see that. Between the Slack channels used by support, customer success and R&D, there’s a visible ‘exhaust’ that makes completed work visible and tangible.

A new feature was just rolled out. Cool, I remember seeing chatter about that last week after a customer suggested it! There’s an issue — support works it, finds the problem and fixes it. Someone else suggests documenting it in the knowledge base, since this has happened two times before. Documented. Done.

Work/Life Balance — In contrast to this visible productivity, several key employees (I suppose everyone’s key when there are only 15?) were on vacation when I joined, or soon after, with no negative impact I could see. I’ve been known to ‘forget’ to take vacation at other jobs, often leaving with dozens or even hundreds of unused hours accumulated. Here, I’ve already found myself thinking about when and where I want to use mine and looking forward to it. 
Perhaps it’s just coincidence that two senior folks were on vacation right around the time I joined, but I found it refreshing and encouraging. I’ve been burned out several times in my career and was worried about joining an organization with an ‘embrace the grind’ attitude, bragging about how many nights and weekends they work. 
Not that people here don’t ever work nights or weekends, but no one treats it as bragging rights or some badge of honor. Sometimes shit breaks in the night and you have to fix it. I can’t say I’ve seen any scrambling or overtime due to poor planning or management so far.

What Sucks?

#WaterCooler — I generally get busy quickly when I join a company. I’m the kind of employee that can always find things to do and doesn’t require much direction. I do wish I could have had more time to talk to my co-workers and get to know them before diving in. When everyone else is twenty-one hours of travel away and six hours ahead, that’s just the reality.

The good news is that Black Hat is only a month away and I’ll get to meet a lot of the team there and spend more time with them. 

Looking from the other perspective, this is my fifth job working out of my home office and I think I’d have a hard time adjusting back to cubicle life. For me, the benefits of working from home far outweigh the drawbacks.

#TooMuch? — I’ve read Trevor Noah’s book. I know who Elon Musk is. I enjoy Hugh Masekela’s music. That’s about the extent of my knowledge of South Africa. I tend to use humor to connect with people and that can be hard to do when there are large cultural differences. I use a lot of self-deprecating humor, which is always easy and convenient to reach for as the lone American in the company. I live in the South. In the bible belt. It is July 3rd as I write this and fireworks are going off. A lot of the stereotypes are true, especially here.

I’ve already posted pictures of pickup trucks with offensive slogans. Too soon?

But is it funny? Is it awkward? Do I sound like a cartoon? I’m not sure yet. I’ve often heard comics talk about how necessary it is to research cultures and shift their material for shows on an international tour. The difference between a laugh and a riot can be small in some places. 

I just need more time to settle in, I suppose. All I can do is observe and hold back until I grok enough to not make an ass of myself. Hopefully, I haven’t already. Currently, I’m trying to convince everyone that it’s okay to make fun of me for being old. I feel like I’ve earned it by not dying yet.

What’s Next

I’m going to be doing a lot of advocacy and sales support at Thinkst. There are opportunities to do a ton of other stuff as well, which makes me happy. Soon, I’ll be digging into the API and looking for novel uses for Canarytokens. Haroon and I are delivering the closing keynote at Virus Bulletin (London, early October). I’m already comfortable giving demos, so let me know if you’re interested. Probably a lot more blog posts. I’m a bit behind on my writing.

Just a few drafts…
Also, if you’re going to Black Hat or DEF CON, I’ll be at both. We will be at Booth #474 at Black Hat and you’ll likely find me at the Aviation, Medical Device or Biohacking villages at DEF CON.

When document.domain is not equal to document.domain


One of our most popular Canarytokens is one we call the "Cloned-Site Token". Essentially, we give you a tiny piece of JavaScript to add to your public webpage. If this JS is ever loaded on a server that doesn't belong to you, it fires an alert. You can be alerted at an email address or webhook in the free version, or to your SIEM, slack channel or a bunch of other alternatives in the paid version.

The Cloned-Site Token is super useful at catching phishers who duplicate your website as a precursor to an actual phishing attack.
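For illustration, the check behind such a token can be sketched in a few lines of JavaScript. This is a hedged sketch of the idea only, not our production token; the expected domain and the beacon URL are made-up placeholders.

```javascript
// Sketch of a cloned-site check (illustrative only, not our production
// token; EXPECTED_DOMAIN and BEACON_URL are made-up placeholders).
const EXPECTED_DOMAIN = "example.com";
const BEACON_URL = "https://canary.example.net/trigger";

// Pure decision logic: given where the page believes it is running,
// return the alert URL to request, or null if the site is legitimate.
function clonedSiteBeacon(currentDomain, href) {
  if (currentDomain === EXPECTED_DOMAIN) {
    return null; // served from our own site: stay silent
  }
  // Encode the clone's location so the alert shows where it's running.
  return BEACON_URL + "?l=" + encodeURIComponent(href);
}

// In a browser, the token would do something like:
//   const url = clonedSiteBeacon(document.domain, location.href);
//   if (url) { new Image().src = url; }  // the request itself is the alert
```

The key property is that the snippet is inert on the legitimate site and only "phones home" when the surrounding page is served from somewhere else.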

A notification that the website from was now running on

The Issue

Recently, a financial services customer was periodically getting alerts in which the Cloned-Site domain matched their actual domain. This was unexpected, as the token should only trigger if the domains are different. In other words, the token should only fire if the page is loaded at a different URL, but in this case the alert was firing even though the page was (supposedly) loaded at the legitimate URL.

First thing to do was to investigate the alert:

Date: Thu Jun 20 2019 08:36:12 GMT+0000 (UTC)
Original Site:
Cloned Site:
Headers:      Accept: image/webp,image/apng,image/*,*/*;q=0.8
     Accept-Encoding: gzip, deflate
     Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
     Connection: keep-alive
     Forwarded: for=
     Save-Data: on
     Scheme: http
     User-Agent: Mozilla/5.0 (Linux; Android 7.0; VTR-L09) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.89 Mobile Safari/537.36
     Via: 1.1 Chrome-Compression-Proxy

Okay… There are a few interesting things here. Firstly, all the alerts seemed to be coming from mobile devices and, more specifically, via the “Chrome-Compression-Proxy”. What the heck is that thing? After a bit of Googling, it turns out that if you enable the Data Saver feature on your Android device, it routes all traffic through a Google proxy.

Why would anyone want that? According to the peeps at Google, “Data Saver may reduce data use by up to 90% and load pages two times faster, and by making pages load faster, a larger fraction of pages actually finish loading on slow networks.”

Google optimises the fetched page by performing a bunch of built-in optimisations on its own servers. These include:
  • Rewriting slow pages
  • Replacing images with placeholders
  • Disabling scripts
  • Stopping the loading of non-critical resources
Google calls these optimisations Chrome Lite pages ( You can imagine how they do this for HTTP pages, but they recently announced Lite support for HTTPS pages too.

Digging into it

At this point I turned to my mobile device to try and recreate the cloned site alert and after a bit of fiddling I managed to trigger it BUT ONLY ONCE!

For our customer we deployed a small server-side fix to make the token work again but were curious about these Chrome Lite pages. If Google is rewriting my site’s HTML, how are they not breaking SSL? Are they mitm’ing my site?

We hosted a toy site with a bunch of static files and viewed the access logs on page load to see what files came from Google and which were requested from our server.

It turns out that triggering Lite mode for a site is annoyingly difficult. Some things I tried were:
  • Creating an enormous index page. This should be rewritten, right? Maybe?
  • Massive unminified JS files
  • Massive unminified CSS files
  • Including links to content that was being blocked by Lite mode on other sites
One method that did turn out to work consistently was to kill Chrome on my mobile device while on the target page and then reopen Chrome.  (ick!)

Chrome has a nifty feature that allows you to remotely debug your mobile phone’s browser (good luck opening dev tools on your phone). Having the ability to see what network operations were taking place in the browser was great. I could see what items Lite mode would reject and which items it would minify.

Unfortunately, I was still having trouble recreating the Cloned-Site alert we'd seen on the customer's page (I had only ever triggered it once). It took me a few days to realise that the fix I implemented on our backend was blocking me... 

Even with that taken into account, I was still unable to reproduce the triggered alert. (At this point I had spent way too much time trying to trigger the alert / force Lite mode, without any wins).

Then, almost as we were putting it to bed, we had a happy accident. Without thinking, I hit cmd + R in the remote-debugging window on my desktop to refresh the mobile browser and hey! The Canarytoken triggered!

It seems there’s a flow in Chrome (Lite mode via close/open) that sets 'document.domain' to the empty string "", which is why the alert was triggering. (The observant reader will note that our token reported it was running at the legitimate URL; that's because we checked document.domain, but reported location.href. The bug creates a disconnect between the two.)

So, if you were using Chrome and your connectivity was bad enough, you'd drop into Lite pages mode, and then it would be possible for the document to be served from ... Chrome itself on reload. At that point document.domain would suddenly be "".


This seems to be pretty unexpected behaviour and is interesting to us for two reasons:

    1. Any site making use of document.domain will have a bad day;
    2. We wouldn't have known any of this was happening without a well-deployed Canarytoken!
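To make the disconnect concrete, here's a minimal sketch of the failure mode (names are placeholders; the fallback shown is a hypothetical mitigation, not our actual server-side fix):

```javascript
// Sketch of the misfire (illustrative). With a Lite page served from
// Chrome's local cache, document.domain came back as "" even though
// location.href still pointed at the legitimate page.

// The naive check a domain-matching token relies on:
function shouldAlert(docDomain, expectedDomain) {
  return docDomain !== expectedDomain;
}
// shouldAlert("", "example.com") is true -> false positive on Lite pages

// A hypothetical defensive variant: fall back to the hostname parsed
// out of location.href when document.domain is empty.
function shouldAlertRobust(docDomain, href, expectedDomain) {
  const domain = docDomain || new URL(href).hostname;
  return domain !== expectedDomain;
}
```

Any site comparing document.domain against an expected value hits the same trap, which is why falling back to location-derived data is the safer habit.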

This is the second time that Canarytokens deployed by users have found Chrome flirting with the creepy line. In 2018, Kelly Shortridge found Chrome reading files in her Documents folder.

That's the value of both Canaries and Canarytokens: knowing when things go bump in the night.


We pinged the Chrome team and got this reply:
Max, some of the lite pages are served locally by Chrome itself.  Specifically, if Chrome has an offline version of the page available locally on the device, it will serve that page directly from cache. Since that page is not coming directly from origin, document.origin for those pages is not set.

Developing a full stack… of Skyballs

We like solving problems. Sometimes, we make up new ones so we can solve them. Skyball Pyramids are one such case!

Last year we discovered these amazing Skyballs and decided to make them a regular feature at our conference booths. 

Canary Skyballs
They have just the right amount of heft and weight to make them genuinely fun to play with. Of course, this leaves us with the devilish problem of how to display them...

At Infosec Europe 2018, some of our team attempted to stack them in a pyramid shape.

The problem: Skyballs do not like to be stacked. In fact, they like to roll all over the place uncontrollably, frustrating the person that is attempting to stack them.

Exhibit A
Exhibit B
Note the use of Canary-green duct tape in an attempt to keep them in place. 

So, as RSAC 2019 was approaching we needed a better solution; something that was simple, yet effective. (We could have simply taken a bowl, but have you ever tried to fly with a bowl in your carry-on?)

Last year we purchased an Ultimaker 2+ for the office, and since then we have printed some pretty awesome (though ultimately useless) things.
Yes Max, he missed the goal because of vision problems
Finally! A moment for our 3D printer to shine. 

The Criteria:
  • Easy and light to transport (it would need to fly with our baggage)
  • Modular (we weren’t sure of how big/small the base needed to be)
  • Simple to print (no complex connections or overhangs)
The Solution:
We created a model of the ball (measure the ball and insert the dimensions; easy-peasy) and then, with Andrew Hall's help, designed a ring with a simple dovetail joint that allowed for symmetric assembly (yes, there are fancier joints we could have used, but this design was time-efficient and bulk-printing friendly).


We were able to fit 5 rings on the print-bed at a time, and whilst the print failed on a handful of the connectors (we were experiencing a heat-wave at our office at the time, so warping was an issue; glue to the rescue!), we were able to print 65 connectors (enough for an 8 x 8 pyramid) pretty quickly.
A 4x4 Base!
The simple design worked perfectly at our booth. At the beginning of the conference, we used a 7 x 7 pyramid, and by day 4, with a dwindling supply of Skyballs, we were able to reduce the base size all the way down to 2 x 2. 

Look Ma! No Duct Tape!
If you’d like to check it out, and/or use our design, you can download the STL files here.

When you can’t do awesome things, because of crushing bureaucracy

I’ve sometimes bumped into people who bemoan their broken company cultures with varying degrees of self-awareness. Around 2007, a then-customer heard we were heading to Vegas to speak at Black Hat and said:
You guys are so lucky.. my company won’t let us go to anything like that
At the time I bristled. We worked for months on that research, sacrificing many nights of family time before we could stand up and talk. For sure our company celebrated those wins, but it irked me that someone who spent his free time tearing up backroads in a 4x4 felt we were gifted access to Black Hat.

In the intervening decade-plus, we’ve encountered genuinely broken work cultures. I’ve looked at some of the brokenness and wondered how on earth those environments could ever lead to awesomeness in the face of all the obvious impediments.

And then, fortunately, I started reading “Skunk Works” by Ben Rich. (I'm not quite finished but so far it's been excellent.)

(Awesome on Audible too)
The book is an amazing account of a pivotal invention in modern warfare: the creation of stealth jets. It’s filled with tiny lessons for any company wanting to build innovative things, but for this post, I want to focus on just one of them: supposedly being crippled by your org / bureaucracy.

Ben had just taken over the running of Lockheed’s Skunk Works from its illustrious founder, Kelly Johnson. Knowing he had to perform, but without the halo of his vaunted predecessor, he had to dot his i’s and cross his t’s. A Skunk Works mathematician (Denys Overholser) brought Ben a 10-year-old paper by a Russian scientist (Petr Ufimtsev) on predicting the radar reflectivity of a geometric shape.

Denys convinced Rich that this was the key to radar stealth, and they began testing their theories. Rich's famed predecessor, who had built Skunk Works and had a track record of incredible judgement, was against the idea and regarded stealth as largely a waste of time.

Bucking all of this, while still delivering on existing contracts to keep the group alive, they proceeded to develop “Have Blue”, which turned out to be almost miraculous (and went on to spur a revolution in the design of bombers, and later fighter jets).

The Original "Have Blue"
There’s an interesting snippet in the book from the point when the initial concept had been proven and it was time to build production stealth aircraft:
(It's a short 59s listen, and describes Skunk Works bureaucratic oversight that was starkly at odds with the core stealth mission.)

While reading that, an arresting thought hit home. It’s easy to assume that the famed Skunk Works gave its employees free rein to get the job done: innovation with no boundaries and an open cheque-book. (There’s a thread of research which strongly suggests that constraints aid creativity; folks often incorrectly assume the opposite, i.e. that to be creative you need to be unconstrained.)

Bringing it closer to (the infosec) home, you won’t have to search for long to find examples of this incorrect reasoning in discussions of success stories.

Taviso finds a boatload of bugs, but it isn’t because he works at Google Project Zero and is “given” the time to.
He was finding boatloads of bugs when he was bug-hunting for free on open-source projects.
Assigning extra weight to the org lightens the burden on us: we could be Taviso too, if only we were in P0. Tavis’ disclosure timelines show that this just isn’t true:
Taviso finds boatloads of bugs because he’s Taviso, and he’s worked like hell to become Taviso.
Ben Rich didn’t build the stealth fighter because he had no constraints. He built it in spite of the constraints, because he was Ben Rich. If you aren’t managing to “build your stealth fighter”, it’s probably not just because your organization is a bureaucratic nightmare. It’s because you’re not Ben Rich.

Postscript: There is an important point here that needs to be made. As a company leader, it’s still the smart thing to do to remove friction for your people wherever possible. This isn’t a get-out-of-jail-free card for stifling your people. After all, Kelly Johnson managed to recruit a Ben Rich, and then trained him for three decades to make him Ben Rich. If you don’t have any Ben Riches, maybe it’s because you aren’t a Kelly Johnson ;>

Post-postscript: Of course this post isn’t about being Ben Rich, or Tavis, or inventing a billion-dollar business; it’s about knowing yourself, and the excuses we make for not making an impact.

HackWeek 2018

Two weeks ago we ran the second edition of our internal HackWeek, and it was fantastic. Last year’s event was great fun and produced projects we still use; going into this year’s HackWeek we anticipated a leveling up, and weren’t disappointed. We figured we’d talk a little bit about the week, and discuss some of the “hacks”.

Our HackWeek parameters are simple: we down tools on all but the most essential work (primarily anything customer-facing) and instead scope and build something. The project absolutely does not have to be work-related, and people can work individually or in teams. The key deadline is a 10-minute demo on the Friday afternoon. The demos are in front of the rest of the team, and results count more than intentions.

Everyone participated and everyone presented at the Friday demo, including sales, dev, support, back office and yours truly. We strive to keep Thinkst a learning organisation and this HackWeek is one way that we do it. For example, it’s great to see a salesperson taking their first steps in writing Python, and our HackWeek helps make that happen. Here’s a roundup of a few of the notable submissions.

Portable Demo Kit
Bradley showed off an early diversion into hardware hacking with his jury-rigged demo station. We often demo Canary over WebEx/GoToMeeting, and he decided to spend his HackWeek upgrading the current webcam setup.

He removed a camera from a non-functioning laptop, added some LEDs for lighting, attached both to a single USB cable, and then kept iterating on packaging until he had a tiny unit that hides in a pocket but sets up for great overhead shots.
It appears to have cost his kids a few toy arrows, but it was totally worth it! Wish him luck getting home-rolled electronics through airport security...

CanaryQL
Az was up next and blew us away with his OSQuery-like hack to make our back-end infrastructure data more queryable in real time. It’s pretty neat: SQLite lets you write plugins (virtual tables) that incorporate underlying data sources which look nothing like relational tables. The upshot of this project is that we can run SQL queries which go out and fetch data from our customer consoles using SaltStack, and then perform standard operations like filtering and joins.
I’m hoping we write a CanaryQL blog post in good time.
Projection Central
Anna used the week to claim a piece of our downstairs office wall. She started by projecting a simple web page on the wall which showed off our customer tweets, and then gradually iterated the complexity upwards.
Step 2 displayed a cool animated clock, Step 3 showed bird deployments, and Step 4 integrated a websockets-based chat system (allowing people in the office to send messages that would display on the projector). This is perfect for kicking off long-running jobs that notify people downstairs when done. Part of what made this awesome is the fact that Anna had never touched Python before HackWeek! She summarised her win early on with a John Gall quote I love:
Kinect Resurrection
Jay swapped projects midstream, and eventually went for a hack related to the Kinect. This meant resurrecting and saving an old device before building an office facial-recognition-based IDS.
A Better MouseTrap
We have a janky internal system for testing sample SD cards, comprising a series of Raspberry Pis and a terrible-looking breadboard. Marco decided to replace the breadboard rat’s nest with a custom circuit board, built in the office. This meant turning the office into a meth-lab, and a lot of failed attempts.

Of course in true Marco fashion, he prevailed, in time and under budget:
Canary-War
Nick and Max teamed up to build a Unity3D-based game they called Canary-War. They designed the characters from scratch in Blender and then built the mechanics for a multiplayer game, all in a week. Pretty awesome.
Grafana meets IoT
Danielle decided that Grafana dashboards that merely displayed data from IoT devices were too limiting, and hacked a module using MQTT and WebSockets to get bi-directional comms going with her IoT device. Since Grafana is designed to be uni-directional, this took some finagling.
Instapaper for Video
My project was purely to scratch my own itch: I wanted a way to tag video links during the day, and to then have them magically saved on my iPad for later viewing.
I ended up with a Rube Goldberg machine. Essentially it lets me send a video link to an email address, which is then parsed by an EC2 server that downloads the video and adds it to my personal podcast feed. My iPad then subscribes to and auto-downloads episodes for that podcast, so the videos are there even if I’m on a plane with no connectivity.
I’ve extended this to make the system multi-user, so I’ll blog about this one separately too.
It’s probably enough to say “a fun time was had by all” and end it there, because if we can’t have hacker fun, then what is this all for anyway? But there’s always more. After the presentations, we noted at least the following points on our internal Slack:

(Ed's warning: cut & paste from internal slack)
  • Make sure we always give credit for stuff we use from other people. It breeds a type of academic honesty that’s important and clarifying, and gets us into the habit of more generally giving credit when it’s due.
  • We often talk about “being a learning org” and the HackWeek demos warmed my heart in this regard. Az said, “last time I missed the mark by doing A, so now I did B”. I also changed heavily from the last HackWeek. (Last year I planned time for HackWeek, “work happened”, and I barely shipped. This time work also happened, but I expected it, so I had cleared up personal time heavily, and that gave me enough time to ship satisfactorily.) Learning (from past mistakes) is what we do.
  • Why bother? Things like a HackWeek come and go, and if you don’t stretch for it, there’s actually no perceptible difference to your life. In fact, you quickly figure out that life is much easier if you don’t put yourself into stretch-needing situations. The reason for consciously doing it during an artificial one-week sprint? Because you’re building those muscles; during a HackWeek you’re not just building the new tech skills you bumped into, but also meta-skills: knowing when to dive deep and when to walk, how to pick a date, commit and ship. It’s super trite, but ultimately, “we are what we repeatedly do”.

Making NGINX slightly less “surprising”

Dan Geer famously declared that security is “the absence of unmitigatable surprise”. He said it while discussing how dependence is the root source of risk, where increasing system dependencies change the nature of the surprises that emanate from composed systems. Recently, two of our servers “surprised” us due to an unexpected dependence, and we thought the incident was worth talking about. (We also discuss how we’re mitigating such surprises going forward.)

Every Canary deployment is made up of at least two pieces: Canaries (hardware, VM or Cloud), which report in to the customer’s dedicated console hosted in EC2. We’ve gone to great lengths to make sure that the code and infrastructure we run is secure, and we ensure that any unexpected activity on these servers is raised in the form of an alert.

A few weeks ago, this real-time auditing tripped an alert on a development server. Our servers are built either as production servers, which have been tested and effectively have a frozen footprint, or as development servers, which are used by our developers for testing. The anomalous activity triggered our incident response process, and the response team swung into action.

A quick check showed that the server was owned by an automated tool blasting through the AWS address space looking for a bunch of simple misconfigurations and attack vectors. On this dev server, it found a debug interface at the path /console. This debug console comes with the underlying framework and is automatically included whenever the framework’s debug flag is enabled. Importantly, it’s served from deep within the framework, and introspecting the application’s internal routes didn’t show /console.

What a gift! The developer working on this console had enabled debug mode without realizing its full implications, and the attacker’s script found it in short order. The fault here was ours (we turned on debug mode in Flask), but our surprise came from the fact that we never expected our webserver to serve up pages we didn’t know about.
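To make the failure mode concrete, here's a minimal sketch (not our production code; a toy app with one route) of how debug mode's Werkzeug middleware serves an interactive console at a path the app's own route map never mentions:

```python
from flask import Flask
from werkzeug.debug import DebuggedApplication
from werkzeug.test import Client

app = Flask(__name__)

@app.route("/")
def index():
    return "hello"

# The app's own route map knows nothing about /console...
routes = [rule.rule for rule in app.url_map.iter_rules()]
print(sorted(routes))

# ...but debug mode wraps the app in Werkzeug's debugger middleware,
# which answers on /console regardless of what the app registered.
debugged = DebuggedApplication(app, evalex=True)
response = Client(debugged).get("/console")
print(response.status_code)  # the hidden console responds
```

This is exactly why introspecting the application's internal routes came up empty: the console lives in the middleware layer, below the routing table.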

The immediate fix was to disable the debug flag and reboot the server, which killed the access. (We subsequently tore down all our dev servers, not because of signs of compromise, but because our tooling makes it trivial to launch new ones.) However, we wanted to examine the pattern a little more closely, to see if we could reduce our unmitigatable surprises. If /console was present, what other surprises await us now, or in future as new development happens?

So we started looking at creating a whitelist generator for NGINX, which is the web server we rely on. What we had in mind was a stand-alone tool that would coax NGINX into only serving documents from known paths, with minimal effort and minimal impact on an existing setup.

1) What we tried that didn’t work
NGINX has a module which allows one to embed Lua scripts inside the config file. We explored this (because we really wanted an excuse to play with Lua) but ultimately rejected it, as Lua support isn’t part of most default NGINX packages. We’d have to build the NGINX package from scratch, which would create an additional operational burden, and so fails our minimal-effort-and-impact goal. We then explored njs, the custom-built NGINX JavaScript scripting module developed and supported by NGINX themselves. It installs on top of existing NGINX setups and is super cool and interesting, but it also turned out to be too limited for our needs. (It essentially prevented us from calling out to the Flask setup to learn about valid routes.)

2) What we currently do
    1. Grab nginx_flaskapp_whitelister from our Github account;
    2. Run nginx_flaskapp_whitelister to generate a new include.whitelist file;
    3. Include this file in your current NGINX config.

The skinny:
A typical Flask app has a url_map object which holds all of the routes used in the application. Assume we have a Flask app, with defined routes that look something like this:
'/', '/login', '/chat' and '/admin'
Our url_map will look like this:
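A hedged sketch of inspecting it in Python (hypothetical app with the four routes above; note Flask also registers /static/<path:filename> on its own):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "index"

@app.route("/login")
def login():
    return "login"

@app.route("/chat")
def chat():
    return "chat"

@app.route("/admin")
def admin():
    return "admin"

# url_map holds one Rule object per registered route.
for rule in app.url_map.iter_rules():
    print(rule.rule)
```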

Now NGINX has a concept of whitelisting routes, using what they refer to as “location directives”. A simple pseudo-configuration will look something like this:
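For instance, a hedged sketch (the paths and upstream here are invented for illustration):

```nginx
# Exact match: used only when the request path is exactly /login
location = /login {
    proxy_pass http://127.0.0.1:5000;
}

# Prefix match with ^~: if this is the longest matching prefix,
# NGINX uses it immediately and skips the regex checks below
location ^~ /static/ {
    root /var/www/app;
}

# Regex match: consulted only when no exact or ^~ prefix match applies
location ~ \.php$ {
    return 404;
}

# Plain prefix: the fallback catch-all
location / {
    proxy_pass http://127.0.0.1:5000;
}
```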

So NGINX’s basic lookup sequence, to determine what to serve for a requested path, is as follows:

    1. NGINX first looks for an exact-match location (defined with ‘=’).
    2. Failing that, it finds the longest matching prefix location; if that location is marked with the ‘^~’ modifier, it is used immediately.
    3. Otherwise, it checks its regular-expression locations in order, and falls back to the longest-prefix match only if no regex matches.

Thus by specifying locations using the ‘=’ and ‘^~’ modifiers, you are able to override the natural behaviour of route lookups.

We make use of this by extracting all rules/routes defined for our app (by grabbing them from the app’s associated url_map object) and mangling them into a neatly bundled, bite-size chunk for NGINX. We then fetch the current NGINX config describing the '/' route of the running server (most likely serving the currently allowed endpoints).
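The core idea can be sketched in a few lines (this is a simplification, not the actual tool; the real nginx_flaskapp_whitelister handles parameterised routes and config fetching, and the upstream body below is invented):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "index"

@app.route("/login")
def login():
    return "login"

# Stand-in for the '/' location body fetched from the running server.
ORIGINAL_LOCATION_BODY = "proxy_pass http://127.0.0.1:5000;"

def make_whitelist(app, body):
    """Emit an exact-match ('=') location for every known Flask route,
    then a regex catch-all that 404s anything not explicitly allowed."""
    lines = []
    for rule in app.url_map.iter_rules():
        # Skip parameterised rules in this sketch; the real tool must
        # translate <converter> segments into regex locations.
        if "<" in rule.rule:
            continue
        lines.append(f"location = {rule.rule} {{ {body} }}")
    # A regex location takes priority over plain-prefix locations,
    # without colliding with an existing 'location /' block.
    lines.append("location ~ /.* { return 404; }")
    return "\n".join(lines)

whitelist = make_whitelist(app, ORIGINAL_LOCATION_BODY)
print(whitelist)
```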

We then create a separate include.whitelist file with the following NGINX configuration:

    1. If an endpoint is exactly equal to '/', or equal to any of the fetched Flask endpoints, pass it to the original NGINX config that was fetched;
    2. For any other requested endpoint, return a 404.

Including this whitelist file ahead of the existing location directive definitions ensures that the config written by the tool takes priority where there are conflicts, without overriding or ignoring additional configuration oddities in the current setup.

So with a simple git clone and install:

…and a run of the tool:

…you can be sure that even if a dev accidentally enables ‘/console’ again, it won’t be served, as it isn’t explicitly allowed. Instead you will be served an NGINX 404 page (or, if so defined in your NGINX configuration file, a custom 404 page, like ours!):

Example: to implement the nginx_flaskapp_whitelister for your Flask application called app, defined in the file /module/ and run from /path/to/python/virtualenv, you would run the following command:

3) How it fits into our pipeline
We use SaltStack to ease our configuration management, with event-driven orchestration and automation driven by configuration files referred to as ‘states’. Including our NGINX Flask App Whitelisting tool in our Salt states was quite simple: a state ensures the tool is present and installed, and the tool is run after the NGINX config file is deployed but before the NGINX process is started.
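In Salt terms, the ordering looks something like this (a sketch only: the state IDs, paths and the exact tool invocation are illustrative, not our real states):

```yaml
# Illustrative Salt states; the real flags for the tool are in its README.
whitelister-tool:
  pip.installed:
    - name: nginx_flaskapp_whitelister

nginx-config:
  file.managed:
    - name: /etc/nginx/sites-enabled/app.conf
    - source: salt://nginx/app.conf

generate-whitelist:
  cmd.run:
    - name: nginx_flaskapp_whitelister  # exact arguments omitted here
    - require:
      - pip: whitelister-tool
      - file: nginx-config

nginx:
  service.running:
    - watch:
      - cmd: generate-whitelist
```

The `require` and `watch` requisites are what enforce the "after the config is deployed, before NGINX starts" ordering described above.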

4) Where you can find it
The nginx_flaskapp_whitelister tool is available on our Github page at

As trite as it sounds, nothing beats solid security design and multiple layers of detection. We were able to discover the compromise within minutes because all syscalls on our servers are audited and exceptions are alerted on. The deployed architecture means that a single server compromise leaks nothing an attacker could use to target other servers, and grants her no preferential access to them either. Now, with all of our servers running nginx_flaskapp_whitelister, there’s less chance of them surprising us too.

Edit: Also check out by @nickdothutton