I'm Running Canaries, but...

...what if someone finds out?

Do attackers care if there are canaries in my network?

People wonder if they need to hide the defensive tech used on their networks. Like most interesting questions, the answer is nuanced.


In defense of obscurity

In any discussion about obscurity you will almost certainly have someone shout about “security through obscurity” being bad. As a security strategy, obscurity is a terrible plan. As an opportunity to slow down or confuse attackers, it’s an easy win. Every bit of information an attacker has to gather during a campaign gains the defender time.

This is very much a race against time. No breach happens the moment a shell is popped or SQL injection is discovered. Attackers are flying blind and must explore the environments they’ve broken into to find their target. Defenders can seize the opportunity to stop an incident before it becomes a breach.

Attackers typically operate with a fuller view of the chessboard than defenders. When environments run with defaults, they meet attackers' expectations. Defenders who introduce unexpected defenses or tripwires to this chessboard can turn that asymmetry to their advantage.


What are defenders so afraid of?

Defenders tend to be concerned that their security products:
1. could, themselves, be insecure
2. may not work as expected when attacked
3. could possibly be evaded if attackers are aware of them
4. will simply eat labor without producing much value

Pardon the pun, but this isn’t a very defensible position to be in.

We know very well from Tavis Ormandy, Joxean Koret, Veracode, and others that security software and products are notoriously insecure. According to Veracode's rankings, in fact, security products come in next-to-last place.


If that’s not discouraging enough, the average security product is difficult to configure, challenging to use and requires significant resources to run and maintain. There is no shortage of reasons for wanting to hide the details of security products in use.

The Importance of Resilience

Let’s consider the flipside for a moment: offensive tools and capabilities. There’s a solid argument for keeping offensive capabilities secret. For example, the zero-day vulnerabilities used by Stuxnet wouldn’t have been as effective if they had been previously reported and patched. For some time, military aircraft have had advantages because details of their capabilities or even their very existence were closely guarded secrets.

Defenses are a very different case, however. They must stand the test of time. They are often visible to outsiders and similar to defenses used by other organizations; vendors, after all, advertise their products in order to sell them. Defenses need to hold up under close scrutiny and be robust enough to last for years without needing to be replaced. Keeping them secret might slow an attacker down, but not by an appreciable amount.

Ultimately, defenses need to work regardless of whether attackers are aware of their presence.

Attackers Discover Your Secret: Canaries

It’s okay - we’ve planned for this moment. We spent significant effort ensuring Canaries are unlikely to ever be the ‘low hanging fruit’ on any network. We’ve also made architecture choices that minimize blast radius should a Canary ever be exploited (e.g. we won’t span VLANs, ever). In short, compromising a Canary would be very difficult and would never improve an attacker’s position.

With a direct attack against a Canary unlikely to prove useful, let’s look at the attacker’s remaining options.

Scenario 1: The attacker has no idea you’ve deployed Canaries and Canarytokens. Since they’re not expecting honeypots, they’re less concerned with being noisy. They’re likely to trip alerts all over the place, as they run scans and attempt to log into interesting-looking devices.

Scenario 2: The attacker knows you use Canaries, but they’re flying blind. Even though they know honeypots are in use, they don’t know which are real and which are fake. This presents them with a dilemma - being sneaky is a lot more work, but they still need some way of exploring the network without triggering alerts. It’s likely to be in the attacker’s best interest to find a different target.


An unexpected bonus is that Canaries are super scalable. Many customers start with five or ten and grow to dozens or hundreds. Stepping back into the attacker’s shoes - are you on a network with five or five hundred? Has this organization deployed a hundred Canarytokens or a million?

Conclusion

The underlying principle is a shift in thinking. Defeatist phrases like, “it’s not a matter of if, but when you get breached” have discouraged defenders. The reality is that the attacker is typically coming in blind, while the defender has control over the environment. By setting traps and tripwires, the defender can tip the outcome in their favor.

We think it’s a very positive and empowering change for the defender mindset. It’s your network - own it and rig the game.



Introducing Rapsheet

We've got hundreds of servers and thousands of Canaries deployed in the world. Keeping them healthy is a large part of what we do, and why customers sign up for Canary. Monitoring plays a big role in supporting our flocks and keeping the infrastructure humming along. A pretty common sight in operations is a dashboard covered with graphs, charts, widgets, and gizmos, all designed to give you insight into the status of your systems. We are generally against doing things “just because everyone does it” and have avoided plastering the office with “pew-pew maps” or vanity graphs.

(although the odd bird-migration graph does slip through)

As with most ops-related checks, many of ours are rooted in previous issues we've encountered. We rely heavily on DNS for comms between our birds and consoles, and interruptions in DNS are something we want to know about early. Likewise, we want to ensure that each customer console (plus our other web properties) is accessible.

There are tools available for performing DNS and HTTP checks, but our needs around DNS are somewhat unusual. We want to be able to quickly confirm whether each of our consoles is responding correctly, across multiple third-party DNS providers (e.g. Google, Cloudflare, OpenDNS). For a handful of domains that's scriptable with host, but for many hundreds of domains it becomes a problem if you want to detect failures quickly (i.e. within tens of seconds).
To plug this gap we built Rapsheet, which gives us a “list of crimes” committed by our systems against intermediary network service providers. In this post I'll provide a quick run-through of why we built it.

Goal: "zero measurements"

In this post, I am going to dive a little deeper into the thinking behind Rapsheet and why the dashboard behaves the way it does: why we aim for "zero measurements", and what this actually means. (Hint: it doesn't mean "take no measurements".)
Much of this thinking was expounded in this awesome AWS re:Invent talk by Eric Brandwine:


If you have not yet watched it, you really should. Eric is whip-smart and an excellent presenter.
The key takeaway for this post is that alarms and dashboards often lead to engineers and other technicians developing what is known as “alarm deafness”. Primarily studied in the medical world, it describes the situation where operators rely on a raft of metrics with tunable parameters. If the parameters are too tight, the operators learn to ignore the alarm. If they’re too loose, bad things happen without anyone being the wiser. Alarm deafness grows when a specific check or metric is constantly in an “alerting” state. Eric points out that the best way to alleviate alarm deafness is to constantly strive for “a zero measurement”: as soon as you see anything other than zero, you know that action is required.

If you can find those zero-measurement metrics, they provide clear tasks for the operations folks, whose key objective is to keep all panels in a non-alerting state (i.e. at zero). With our current setup, I will drop most of what I am busy with whenever I see a panel in any colour other than green.

A counterexample to an actionable measurement is almost anything that provides a count or tally of an expected condition (even if that condition is an error). For example, we rely on DNS for a range of services and track DNS failures across multiple providers. A poor metric would be a graph of how many DNS queries have failed within the last few hours against any of the DNS servers. This graph may look interesting, but nothing about it leads to a specific action our engineers could take. Transient network issues cause queries to fail every so often, and our engineers aren't expected (or authorised) to fix the backbone network between our hosting site and the third-party DNS servers.



Instead, we track only which of our servers are not currently responding correctly to our various DNS queries. With this metric it becomes much easier to spot patterns, and therefore to understand the root cause of the failing DNS queries. For example, we rely on Rapsheet to tell us when a particular DNS provider is experiencing issues, when a particular server is experiencing DNS issues, or when one of our domains has been flagged as unsafe. Issues then get mitigated and resolved more timeously.
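To make the contrast concrete, here's a toy sketch of the two metric shapes (every name here is invented for illustration):

```python
# Toy illustration of the two metric shapes (all names invented).

def page_the_operator(failing):
    """Stand-in for whatever turns a non-zero measurement into action."""
    print(f"action required: {sorted(failing)}")

# Poor metric: a rolling tally of failed queries. It's almost never
# zero, so it never demands a specific action.
failed_queries_last_hour = 17  # ...so what?

# Zero-measurement metric: the set of (server, provider) pairs failing
# *right now*. An empty set is the zero measurement (a green panel);
# anything else points directly at what needs attention.
currently_failing = {("console-042", "cloudflare"), ("console-042", "google")}
if currently_failing:
    page_the_operator(currently_failing)
```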

Architecture

At its core, Rapsheet is straightforward. A long-running process uses asyncio to periodically perform a whole bunch of network tests (DNS, HTTP) and pumps the results into InfluxDB; a simple UI is built on top in Grafana.
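As a rough sketch of that pattern (not our production code; the aiodns library and every name below are illustrative choices):

```python
# Minimal sketch of the Rapsheet pattern: fan DNS checks out across
# multiple third-party resolvers with asyncio, and keep only failures
# (a correct answer contributes nothing, since zero is the goal).
import asyncio
import aiodns

RESOLVERS = {
    "google": "8.8.8.8",
    "cloudflare": "1.1.1.1",
    "opendns": "208.67.222.222",
}

async def check(domain, provider, nameserver):
    resolver = aiodns.DNSResolver(nameservers=[nameserver])
    try:
        await asyncio.wait_for(resolver.query(domain, "A"), timeout=5)
        return None  # success is the zero measurement
    except Exception:
        return {"domain": domain, "provider": provider}

async def sweep(domains):
    tasks = [check(d, p, ns) for d in domains for p, ns in RESOLVERS.items()]
    results = await asyncio.gather(*tasks)
    return [r for r in results if r]  # failures feed the InfluxDB/Grafana panel

if __name__ == "__main__":
    failures = asyncio.run(sweep(["example.com", "example.org"]))
    print(failures or "all green")
```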

The tests are tied together based on a customer, so we can quickly determine if an issue is customer-specific or more general.

Rapsheet has modules implementing each of its tests. Current modules include DNS checks against four different DNS providers, and various HTTP health checks covering reachability, served content, and endpoint checks against site blacklists such as Google’s SafeBrowsing. All endpoints are checked asynchronously and the results collated before the metrics are pushed into InfluxDB for Grafana to display.

Each new panel added to the dashboard has gone through a number of backend iterations to ensure that it keeps to this mindset.


It’s built to be extensible, so we can quickly add new zero-based measurements we want to track. 

A recent example was exception counting. We have, as a design goal, the fixing of unhandled exceptions. The dashboard has an “exception monitoring” panel that tracks the number of unhandled exceptions across all of our deployed code (on customer consoles and our own servers). Coming back to the notion of focus, it becomes a very clear goal for me: handle the exceptions that are cropping up. Most of the time this involves a deep dive into what is causing the exception and how we should best mitigate those cases. When introduced, the panel would get tripped up by noisy but harmless exceptions (and would just burn red). After a flurry of root-cause fixes, and thanks to the goal of driving them to zero, we now see only a couple every other day across hundreds of production systems (and work is under way to alleviate those too).
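A sketch of the idea (the metric plumbing is stubbed out, and none of these names come from our codebase):

```python
# Sketch: turn unhandled exceptions into a zero-measurement metric by
# hooking sys.excepthook. The plumbing is stubbed; in reality the count
# would land in the time-series store behind the dashboard panel.
import sys
import traceback

def record_metric(name, value, tags):
    print(f"metric {name}={value} tags={tags}")  # stand-in for the real pipeline

def report_unhandled(exc_type, exc, tb):
    record_metric("unhandled_exceptions", 1, {"type": exc_type.__name__})
    traceback.print_exception(exc_type, exc, tb)  # still surface the traceback

sys.excepthook = report_unhandled
```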

Rate-limiting

Some of the panels involve performing requests or queries against third-party services. For example, we want to know if someone using the Chrome browser is able to log in to their console.
In order to "play well with others" and use third-party services fairly, Rapsheet has rate limits (especially since asyncio can push out lots of traffic). Most third-party services let you know what these rate limits are so they can be implemented correctly. Interestingly, Google's DNS service doesn’t. Only by testing the limits in practice did we figure out what rate to hold our queries to.

Location, location, location

Almost all of Canary's services run inside AWS, so it made sense to keep the monitoring outside of AWS. We didn’t want a situation where everything looked good to tests running inside AWS while an AWS issue left us silent to the outside world. So we run Rapsheet at a wholly independent provider.

So far, it's proved quite adept at picking up issues in this hosting provider's network...

Wrap-up

The dashboard is projected onto a wall in the operations office, which is on the main thoroughfare from our desks to the office facilities, so anyone working in the office sees it when leaving their desk.


(Of course, it’s online for remote workers who are able to keep it open on their desktops if needed.)

Rapsheet is a work in progress, but helps us make sure that things are ticking along as expected.

Introducing the Office 365 Mail Token

Shared passwords, sensitive documents: mailboxes are great targets for attackers. Would you know they were targeted? We’ve got your back! Our Office 365 token deploys to thousands of mailboxes in minutes and alerts you when someone is snooping around.

Why an Office 365 Mail token?

Enterprises have been flocking (ha) to Office 365 for years now and a large number of Thinkst customers are using it. The Canaries will detect attackers on their networks, but nothing lets them know if an attacker has compromised a single mailbox and is snooping around.

Canarytokens are great at becoming high fidelity tripwires in places that other tools can’t easily go. You can quickly head over to https://canarytokens.org to create a token, and then place it in Bob’s mailbox, but how does this work for an entire office? Will it work for an entire org?

Easy!

The Office 365 Mail token can drop a pre-written, tokened email into multiple mailboxes at once. We insert the emails directly into mailboxes (rather than sending them), so they avoid email security filters. We avoid dropping the email in the default inbox so users won’t stumble on it accidentally, but an attacker searching for booty can still quickly find it and trigger an alert.
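For the curious: mail can be written straight into a mailbox folder via Microsoft Graph, never traversing SMTP, which is why the gateways don't get a look at it. A hypothetical sketch of that mechanism (this is not necessarily how the Canary Console does it, and all values are placeholders):

```python
# Hypothetical sketch: create a message directly in a mailbox's archive
# folder via Microsoft Graph, so it never traverses SMTP or mail filters.
# Illustrates the mechanism only; not necessarily how the Canary Console
# implements the token. All values below are placeholders.
import requests

ACCESS_TOKEN = "..."  # an OAuth2 token with the Mail.ReadWrite permission
MAILBOX = "bob@example.com"

message = {
    "subject": "Password reset confirmation",
    "body": {
        "contentType": "HTML",
        # A tokened link: opening it would fire the alert.
        "content": '<a href="https://example-token-server/abc123">View details</a>',
    },
}

# "archive" is one of Graph's well-known folder names; creating the mail
# there keeps it out of the inbox where legitimate users would see it.
resp = requests.post(
    f"https://graph.microsoft.com/v1.0/users/{MAILBOX}/mailFolders/archive/messages",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=message,
)
resp.raise_for_status()
```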

Deploying the Token

To deploy this token, there are a few easy steps.

  1. Log into an Office 365 account that has the proper permissions (details here). Bonus - this token also works with on-prem Exchange implementations - see the link above for details.
  2. Log into your Canary Console and choose the Office 365 Mail token under Canarytokens
  3. Select the mailboxes to token from the list presented to you
  4. You now have tokened mailboxes, which will be displayed in the list of enabled Canarytokens
  5. Wait for some unsuspecting attacker to stumble upon the email. To test yourself, search for “password reset” and you’re likely to find the gift we left for attackers.



When someone stumbles upon the trap, you’ll receive an alert like this one.


While it's difficult to rule out false positives altogether, we employ a few tricks to avoid them that require no additional effort on your part. First, we place the email in the archive folder, reducing the chance of legitimate users finding this email in their own inbox. Second, because we insert the email directly into the mailbox, we avoid security gateways inspecting tokens directly and creating false positives.

Tokens like this are great for the attacker details they give you, but they’re also useful as a simple heads-up: someone just searched for password reset emails in Bob’s mailbox. That’s probably something you should be aware of.

Wrap-up: What's Next?

With the Office 365 Mail token, we’ve gone from some basic token ingredients to something that simply scales to hundreds of mailboxes in the same 3-4 minutes it takes to deploy a Canary. That's it - quick, easy and likely to catch the bad guys.

For more thoughts on Canarytokens, check out our post on the AWS API Key token. The official documentation for Canarytokens is a concise and useful read as well.

USENIX Security Symposium 2019

Thinkst in Santa Clara

Last week Haroon and I found ourselves at the 28th USENIX Security Symposium in balmy Santa Clara. We made the trip from Vegas for Haroon's invited talk at the main event, and I took the opportunity to present at one of the side workshops (HotSec). This is a short recap of our USENIX experience.



Neither Haroon nor I have attended USENIX events previously, despite over 20 Black Hat USAs between the two of us. What's worse, we both used to read ;login: regularly, and the research coming out of USENIX Security is typically thorough. When this opportunity presented itself, we couldn't turn it down.

Drawing comparisons between USENIX and Black Hat/DEF CON is a bit unfair as they have entirely different goals, but given the consecutive weeks they run on, I think it's ok. Compared to Black Hat/DEF CON, the obvious differences are the smaller scale (there were fewer speaking rooms and smaller audiences), the primarily academic focus, and no side events that we saw. (Black Hat and DEF CON usually have a ton of parties and events in parallel with the main conference.) USENIX is billed as a place for industry and academia to meet, but most of the talks were academic. I'll come back to this shortly.

The event

Event organisation was slick, and the venue was a welcome respite from the Vegas casinos. No one was screwing around with the WiFi (at least, not detectably...) and the AV just worked for the most part. Session chairs played their role admirably in corralling speakers and audiences, keeping close track of time and chiming in with questions when the audience was quiet.

USENIX was much more sedate than either Black Hat or DEF CON. No Expo area (the handful of sponsors each had a table in a small foyer), no special effects, no massive signs, no hacker cosplay, no one shouting in the corridors, no visible media, no gags or gimmicks. The list goes on. It just reinforces how much of an outlier the Vegas events are.

Our talks

Haroon's talk used our Canarytokens as a lens to explore how defensive teams need to embrace quicker, hackier solutions to win. The central thesis is that the current battlefield is too fluid, which favours the lighter, more agile M.O. of attackers. We'll publish more details on this in the coming weeks.

My talk was a brief exposition of our take on honeypots as breach detection versus observers, backed by our experience of running Canary. In the next few days we'll publish another post here delving into this.

.edu dominates

Turning to the talks, virtually all of them were densely packed with information. The acceptance rate was something like 15% (115 from 740 submissions), and (as is typical for academic conferences) authors submitted completed works. To rise above the pack, papers must cover lots of ground, yet accepted authors have only a 20-minute slot in which to present their work and take questions. This means the authors fly through the highlights of their research, frequently leaving out large chunks of content and deferring to the paper in order to make the time limit. That's at odds with Black Hat's 50-minute slots, which usually include a weighty background section (I recall us having to fill 75 minutes at Black Hat at one point.)

Abbreviated talks also mean that sometimes the speakers just have to assume the audience has a background in their topic; there's simply not enough time to cover background plus what they did. In those talks you can expect to be reading the papers as you can quickly be left behind.

In contrast to Black Hat's many parallel tracks with breaks between every talk, USENIX ran three parallel tracks with up to five 20-minute talks in a row. This meant that you could potentially see 53 talks if you sat in one of the main speaking rooms for the three days. It's a firehose of new work, and it was great.

Academia dominated the talks. Of the 36 talks I saw, just two were purely from industry (both were from the Google Chrome folks). I suspect the completed paper requirement serves as a natural barrier against submissions from industry. A completed paper is the output of regular academic work, and finding a publication venue afterwards is a stressful but comparatively minor part.

For industry folks, a research paper isn't a natural goal; they'd need to set aside time for this side project. It's easier to hit an earlier result (like a bug or technique) and submit to an industry conference. Since career progression isn't tied to paper publication, there's much less incentive to write one.

In addition, there's also very different standards for the talks. It's clear that merely finding a bug or figuring out a way to find bugs isn't going to get your paper accepted. Virtually every paper had extensive surveys or evaluations. At USENIX there's a big push towards gathering data, either in the form of measuring prevalence or in designing tests for evaluating new defences, and then making that data available for others to analyse. Collecting data is a large part of the research effort. Contrast that with a Black Hat talk which describes (say) a new heap manipulation technique demonstrated with a specific bug. The bug's prevalence will get a cursory slide or two, but talks are accepted if the exploitation techniques are novel.

Talk skim

In terms of the talks, it was a deluge of content. Speculative execution attacks had a bunch of attention, with new variants being demonstrated as well as deepening of the fundamental understanding of the attacks. One of these highlighted that not only is execution speculative, but so are other operations like memory load. The authors demonstrated a speculative load attack, in which an attacker can leak physical memory address mappings. This category of research is now squarely being led by the academy.

There was a talk on threats in the NPM ecosystem, showing how the average NPM package trusts 79 other packages and 39 maintainers. That's food for thought when worrying about supply chain attacks and software provenance. The authors also showed how a tiny group of 100 maintainers appears in 50% of the dependencies (i.e. a small group to subvert, to affect a huge swathe of dependents). A later talk on In-Toto, a software supply chain protection tool, provides some limited hope for finding our way out of the supply chain mess.

I enjoyed the ERIM talk, which claims a way to achieve in-process memory isolation. This could be used to let a process compute cryptographic results over a key stored in the process' own memory, but still prevent the process from reading the memory. Kinda wild to think about.

There was one honeypot-related talk I saw. The authors realised that honeypot smart contracts are a thing: apparently scammers deploy contracts which appear to have flaws in them, prompting folks looking for smart contract vulnerabilities to send Ether in the hope of exploiting the contracts. However, the flaws are mirages; it's a scam that takes advantage of other scammers.

Wrap-up

There were further talks on crypto (cryptography, cryptographic attacks, and cryptocurrencies), hardware, side-channels galore, web stuff, and much much more. A good portion dealt with building better defences, which is in further contrast to Black Hat's primarily offence-oriented talks.

We hope to return to USENIX soon; while the time away was significant, it was well worth it.

PS. Seeing Rik Farrow in person was a delight, exactly what you'd imagine he might look like. Sandals, Hawaiian shirt and ponytail!

One Month with Thinkst

Recently, I was faced with a career dilemma.
  • Go back to the enterprise and be a CISO
  • Take a gig that would be part research, part bizdev
  • A research and writing gig
  • Consulting/Advisory work
  • Join another vendor
SPOILER: I chose the last one… but why?

Why Thinkst?

Thinkst Applied Research is the company behind the popular Canary product. Though they started off as more of a research firm that would build various products, the Canary product took off and has become their primary focus.

They are a moderately-sized company of enthusiastic industry veterans, developers and engineers that love to learn and try new things. They’ve managed a sort of startup nirvana: bootstrapped with a popular product that customers openly love and a great company culture. When Haroon pitched me the idea of joining to help out, I was immediately flattered, excited and skeptical.

I knew Canaries were wildly successful. It’s kinda hard to ignore. Especially hard to ignore when you’re working product marketing for another vendor, trying to figure out how to recreate that same fierce customer/brand loyalty and excitement. The chance to join a company that had already figured this out… yes. Very much yes, I wanted to be a part of that.

Though I knew much about the product, what did I know about the company? Not much, I had to admit. I knew that there were 10–15 employees. I knew they had never taken funding. Oh yeah — and they’re based in South Africa.

I had some reservations.
I’ve never been to the continent of Africa, much less South Africa. Not that they needed me there.

I have built a measure of trust with Haroon over the years. We’ve been chatting regularly over the last five years and we were both very much on the same page when it came to principles and InfoSec. He clearly knew how to execute on ideas. I was down for a chat at the very least.

My key reservations related to the ability of the company to support me and the distance between us (literally, not metaphorically). Specifically:
  1. Was this company large enough and stable enough to meet my salary needs without some sort of weird comp plan arrangement? (Note: I’ve never been a dedicated sales guy, so all comp plans are ‘weird’ to me.)
  2. Though I was more than comfortable working remotely, being the only US employee of a South African company was super remote. How remote?
If you think my travel time to them looks bad, imagine what half the company goes through when attending Black Hat USA (another 5 hours West of me).
He had sold me on his product, but now, Haroon set his mind on selling me on his company. It was no less compelling. While Thinkst is small in employee count, they’re certainly not in terms of revenue, profits or customers. I’d say they handily beat 99% of the rest of the security industry by many metrics. For example, their revenue-to-employee ratio is probably double or triple the average I’ve seen in vendor-land. Let’s just say my concerns were addressed satisfactorily. #HumbleBrag #NewEmployee #BrowniePoints #Shameless

The distance was still a concern, however. Doing the math in my head, I realized Cape Town was six hours ahead of me. In fact, all of South Africa is one timezone — South Africa Standard Time (SAST). When I start my day at 9am in Eastern Time (the closest the US gets to SAST), the folks in Joburg and Cape Town are almost done with their day. Except… they’re not.

Though they have customers all over the world, a large chunk of them are in the US and a large chunk of those are in California. PST is an additional 3 hours West of me, so they have zero overlap with a normal 9–5 workday in South Africa. That means it has become fairly normal for part of the Thinkst team to work late. I was well aware of this — while one of my primary concerns was the distance between us, I knew that a big part of the attraction of hiring me was the fact that I wasn’t in South Africa.

Yes, in addition to my charming sense of humor and scruffy visage, I was in that sweet EST time zone. The same time as New York City — only 3 hours off from Silicon Valley, but still decently overlapped with South Africa and the United Kingdom. In addition to my time zone, there’s the fact that I can attend conferences for a tiny fraction of the cost and effort it takes the team in South Africa to get over to this continent.

This was a tough choice. So naturally, I asked my family for help. I built a “help me pick my new job” presentation and presented it to them.

Hmmm, looks like some visual bias occurred here? A coincidence, surely.

After some collective eye-rolling and deep sighs, my personal board of directors agreed to grant me five minutes of their attention. They came out of it Team Thinkst all the way (and surely, with new appreciation for my presentation skills).
   

My first week

As with most jobs these days, especially as a remote employee, the first thing you get access to is email and other corporate/backoffice systems. They also mentioned they’d be sending me some things. I was actually in Indianapolis for CircleCityCon with my kids when the packages started arriving. After a few days, my wife messaged me. “It’s looking like Christmas on your side of the table!”

Sure enough, I came home to:
  • Stickers
  • Business cards
  • A GoRuck GR1
  • A new MacBook Pro
  • A custom Canary Space Pen (I hear it may be possible for some of these to get into non-employee hands…)
  • A custom Thinkst Canary-themed WASD keyboard
  • 3 Canary t-shirts
  • 1 Canary hoodie
  • A free Audible subscription (SQUEE — big audiobook fan here!)
  • 5 Canaries
  • and a Partridge in a Pear Tree
This is a BEAST of a backpack. Pretty sure it can stop bullets.
My wife was right, it was like Christmas morning. What I was most excited about, however, wasn’t the cool swag or mechanical keyboard.






I had heard so much about these Canaries, I was ready to dig in and see what all the fuss was about. A few minutes after I started setting them up, it occurred to me that I should capture this moment. Obviously, it should be no problem for the Thinkst staff to set up a Canary in five minutes or less — they’ve been putting on Canary demos for years! A new employee with no previous experience with these devices was a perfect opportunity to see if the product matched the writing on the tin.
Video coming as soon as I finish learning how to use iMovie.
Side note: I previously didn’t have a positive impression of DHL, but now I’m seriously impressed with them. They not only regularly get stuff to me from South Africa in 2–3 days, they will pick up packages from my house to deliver to South Africa in the same 2–3 days!
   

Culture

So much is said about company culture, but it often comes off as forced or is used as a facade. If there’s a Foosball table, you shouldn’t be judged a slacker for using it. Culture isn’t a ‘perk’ or tool that exists to balance out dysfunctional management and inefficient processes. A snapshot view of how Thinkst does things probably looks like the most hipster, trendy SF startup thing ever. When you get to know everyone and start working with them, you realize they come by all of it honestly.

Email — Sales and support use it to communicate with customers, but that’s it. I had to go check just to make sure what I’m saying is accurate, but I’m exactly at the one month mark and I have yet to receive a single internal email from anyone. As far as I can tell, there’s no employee handbook or rule that says we can’t use email to discuss things, we just use Slack for all that.

Slack — Thinkst is far from the only organization leaning heavily on Slack for internal communications, but I found it interesting how Thinkst organizes things there. There’s no #general and no #random, so I was a bit off-balance at first. Once I got used to the organization, it all made sense, including Thinkst’s flavor of ChatOps, which is very impressive. My favorite channel is #learning. I had been wondering how a staff of 15 manages and supports north of 600 customers. Good, efficient communication and smart use of Slack’s integration features.

Bi-weekly Tech Talks — I’ve been at places that talked about doing things like this and occasionally did them, but they’d often get cancelled, as ‘Tech Talks’ were a low-priority item in management’s eyes that took up an inconvenient amount of time. At Thinkst, they always happen and they’re recorded, so anyone who misses one can go back and watch it. I find myself really looking forward to these and can’t wait to do my first one.

Meetings — At each of my last four jobs, I’ve shared a link to Jason Fried’s epic TED talk, Why work doesn’t happen at work. I haven’t sent this video to anyone at Thinkst. Aside from customer demos, the bi-weekly Tech Talks and the odd spontaneous chat, there aren’t any meetings. I think the devs do standups. Looking at others’ availability on the corporate calendar, I can confirm that I’m not just being shielded from meetings so the ‘new guy can settle in’. They generally don’t exist, because they’re largely unnecessary thanks to other positive aspects of the culture here.

Decisions can be made without meetings. Announcements can be made without meetings. Basically, anything other companies have meetings for, happens asynchronously in Slack, where it doesn’t disrupt anyone’s workflow. Coworkers and management often take hours to respond. There’s no ‘ego’ here creating a culture of fear that makes employees feel like there’s a race to respond.
Can a company be both laid back and super productive?
It sounds odd to say it this way, but Thinkst manages to give off a laid-back, calm vibe. At the same time, everyone’s busy and you can see that. In the way several of the Slack channels are used by support, customer success and R&D, there’s a visible ‘exhaust’ that makes completed work visible and tangible.

A new feature was just rolled out. Cool, I remember seeing chatter about that last week after a customer suggested it! There’s an issue — support works it, finds the problem and fixes it. Someone else suggests documenting it in the knowledge base, since this has happened two times before. Documented. Done.

Work/Life Balance — In contrast to this visible productivity, several key employees (I suppose everyone’s key when there are only 15?) were on vacation when I joined, or soon after, with no negative impact I could see. I’ve been known to ‘forget’ to take vacation at other jobs, often leaving with dozens or even hundreds of unused hours accumulated. Here, I’ve already found myself thinking about when and where I want to use mine and looking forward to it. 
Perhaps it’s just coincidence that two senior folks were on vacation right around the time I joined, but I found it refreshing and encouraging. I’ve been burned out several times in my career and was worried about joining an organization with an ‘embrace the grind’ attitude, bragging about how many nights and weekends they work. 
Not that people here don’t ever work nights or weekends, but no one treats it as bragging rights or some badge of honor. Sometimes shit breaks in the night and you have to fix it. I can’t say I’ve seen any scrambling or overtime due to poor planning or management so far.
   

What Sucks?

#WaterCooler — I generally get busy quickly when I join a company. I’m the kind of employee that can always find things to do and doesn’t require much direction. I do wish I could have had more time to talk to my co-workers and get to know them before diving in. When everyone else is twenty-one hours of travel away and six hours ahead, that’s just the reality.

The good news is that Black Hat is only a month away and I’ll get to meet a lot of the team there and spend more time with them. 

Looking from the other perspective, this is my fifth job working out of my home office and I think I’d have a hard time adjusting back to cubicle life. For me, the benefits of working from home far outweigh the drawbacks.

#TooMuch? — I’ve read Trevor Noah’s book. I know who Elon Musk is. I enjoy Hugh Masekela’s music. That’s about the extent of my knowledge of South Africa. I tend to use humor to connect with people and that can be hard to do when there are large cultural differences. I use a lot of self-deprecating humor, which is always easy and convenient to reach for as the lone American in the company. I live in the South. In the bible belt. It is July 3rd as I write this and fireworks are going off. A lot of the stereotypes are true, especially here.

I’ve already posted pictures of pickup trucks with offensive slogans. Too soon?

But is it funny? Is it awkward? Do I sound like a cartoon? I’m not sure yet. I’ve often heard comics talk about how necessary it is to research cultures and shift their material for shows on an international tour. The difference between a laugh and a riot can be small in some places. 

I just need more time to settle in, I suppose. All I can do is observe and hold back until I grok enough to not make an ass of myself. Hopefully, I haven’t already. Currently, I’m trying to convince everyone that it’s okay to make fun of me for being old. I feel like I’ve earned it by not dying yet.

What’s Next

I’m going to be doing a lot of advocacy and sales support at Thinkst. There are opportunities to do a ton of other stuff as well, which makes me happy. Soon, I’ll be digging into the API and looking for novel uses for Canarytokens. Haroon and I are delivering the closing keynote at Virus Bulletin (London, early October). I’m already comfortable giving demos, so let me know if you’re interested. Probably a lot more blog posts. I’m a bit behind on my writing.

Just a few drafts…
Also, if you’re going to Black Hat or DEF CON, I’ll be at both. We will be at Booth #474 at Black Hat and you’ll likely find me at the Aviation, Medical Device or Biohacking villages at DEF CON.



When document.domain is not equal to document.domain


Background

One of our most popular Canarytokens is one we call the "Cloned-Site Token". Essentially, we give you a tiny piece of JavaScript to add to your public webpage. If this JS is ever loaded on a server that doesn't belong to you, it fires an alert. You can be alerted via an email address or webhook in the free version, or via your SIEM, Slack channel or a bunch of other alternatives in the paid version.


The Cloned-Site Token is super useful at catching phishers who duplicate your website as a precursor to an actual phishing attack.

A notification that the website from http://thinkst.com was now running on http://fake-thinkst.com

The Issue

Recently, a financial services customer was periodically getting alerts in which the Cloned-Site domain matched their actual domain. This was unexpected: the token should only trigger when the domains differ. In other words, the token for http://domain.com should only fire if the page is loaded at a different URL, but in this case the alert was firing even though the page was (supposedly) loaded at the legitimate URL http://domain.com.


First thing to do was to investigate the alert:

Date: Thu Jun 20 2019 08:36:12 GMT+0000 (UTC)
Original Site: xxxxxxxx.com
Cloned Site: https://xxxxxxxx.com
Headers:      Accept: image/webp,image/apng,image/*,*/*;q=0.8
     Accept-Encoding: gzip, deflate
     Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
     Connection: keep-alive
     Forwarded: for=1.2.3.4
     Save-Data: on
     Scheme: http
     User-Agent: Mozilla/5.0 (Linux; Android 7.0; VTR-L09) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.89 Mobile Safari/537.36
     Via: 1.1 Chrome-Compression-Proxy
     X-Forwarded-For: 1.2.3.4

Okay… There are a few interesting things here. Firstly, all the alerts seemed to be coming from mobile devices, and more specifically via the “Chrome-Compression-Proxy”. What the heck is that thing? After a bit of Googling, it turns out that if you enable the Data Saver feature on your Android device, it routes all traffic through a Google proxy.


What does Data Saver actually buy you? I’m glad you asked. According to the peeps at Google, “Data Saver may reduce data use by up to 90% and load pages two times faster, and by making pages load faster, a larger fraction of pages actually finish loading on slow networks.”


Google optimises the page being fetched by performing a bunch of built-in optimisations and using Google servers. Some of the optimisations include:
  • Rewriting slow pages
  • Replacing images with placeholders
  • Disabling scripts
  • Stopping the loading of non-critical resources
Google calls these optimisations Chrome Lite pages (https://blog.chromium.org/2019/03/chrome-lite-pages-for-faster-leaner.html). You can imagine how they do this for HTTP pages, but they recently announced Lite support for HTTPS pages too.

Digging into it

At this point I turned to my mobile device to try and recreate the cloned site alert and after a bit of fiddling I managed to trigger it BUT ONLY ONCE!

For our customer we deployed a small server-side fix to make the token work again but were curious about these Chrome Lite pages. If Google is rewriting my site’s HTML, how are they not breaking SSL? Are they mitm’ing my site?

We hosted a toy site with a bunch of static files and viewed the access logs on page load to see what files came from Google and which were requested from our server.

It turns out that triggering Lite mode for a site is annoyingly difficult. Some things I tried were:
  • Creating an enormously large index page. This should be rewritten, right? Maybe?
  • Massive unminified JS files
  • Massive unminified CSS files
  • Including links to content that was being blocked by Lite mode on other sites
One method that did turn out to work consistently was to kill Chrome on my mobile device while on the target page and then reopen Chrome.  (ick!)

Chrome has a nifty feature that allows you to remotely debug your mobile phone’s browser (good luck opening dev tools on your phone). Having the ability to see what network operations were taking place in the browser was great. I could see what items Lite mode would reject and which items it would minify.



Unfortunately, I was still having trouble recreating the Cloned-Site alert we'd seen on the customer's page (I had only ever triggered it once). It took me a few days to realise that the fix I implemented on our backend was blocking me... 



Even with that taken into account, I was still unable to reproduce the triggered alert. (At this point I had spent way too much time trying to trigger the alert / force Lite mode, without any wins).

Then, almost as we were putting it to bed, we had a happy accident. Without thinking, I hit cmd + R to refresh the mobile browser from my desktop via the debugging window and, hey, the Canarytoken triggered!

It seems there’s a flow in Chrome (Lite mode via close/open) that sets document.domain to the empty string "", which is why the alert was triggering. (The observant reader will note that our token reported it was running at http://domain.com; that’s because we check document.domain but report location.href. The bug means there’s a disconnect between the two.)

So, if you were using Chrome, and your connectivity was bad enough, you'd drop to Lite pages mode, and then it would be possible for the document to be served from ... Chrome on reload? So the document.domain would suddenly be "".
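Boiled down to its firing rule, the behaviour looks like this (a simplified sketch, not our actual token code):

```python
# Simplified sketch of the Cloned-Site firing rule: alert only when the
# domain the page reports differs from the registered domain. An empty
# document.domain (the Lite-pages case) counts as "different", so the
# token fires even on the legitimate site.

def should_fire(registered_domain: str, reported_domain: str) -> bool:
    return reported_domain.lower() != registered_domain.lower()

assert should_fire("thinkst.com", "fake-thinkst.com")  # a genuine clone
assert not should_fire("thinkst.com", "thinkst.com")   # normal legitimate load
assert should_fire("thinkst.com", "")                  # the Lite-pages reload
```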

Takeaway

This seems to be pretty unexpected behaviour and is interesting to us for two reasons:

    1. Any site making use of document.domain will have a bad day;
    2. We wouldn't have known any of this was happening without a well-deployed Canarytoken!

This is the second time that Canarytokens deployed by users have found Chrome flirting with the creepy line. In 2018, Kelly Shortridge found Chrome reading files in her Documents folder:


That's the value of both Canaries and Canarytokens: knowing when things go bump in the night.

Postscript


We pinged the Chrome team and got this reply:
Max, some of the lite pages are served locally by Chrome itself.  Specifically, if Chrome has an offline version of the page available locally on the device, it will serve that page directly from cache. Since that page is not coming directly from origin, document.origin for those pages is not set.

Developing a full stack… of Skyballs

We like solving problems. Sometimes, we make up new ones so we can solve them. Skyball Pyramids are one such case!

Last year we discovered these amazing Skyballs and decided to make them a regular feature at our conference booths. 

Canary Skyballs
They have just the right amount of heft to make them genuinely fun to play with. Of course, this leaves us with the devilish problem of how to display them...

At Infosec Europe 2018, some of our team attempted to stack them in a pyramid shape.

The problem: Skyballs do not like to be stacked. In fact, they like to roll all over the place uncontrollably, frustrating whoever is attempting to stack them.

Exhibit A
Exhibit B
Note the use of Canary-green duct tape in an attempt to keep them in place. 

So, as RSAC 2019 was approaching we needed a better solution; something that was simple, yet effective. (We could have simply taken a bowl, but have you ever tried to fly with a bowl in your carry-on?)

Last year we purchased an Ultimaker 2+ for the office, and since then we have printed some pretty awesome (though ultimately useless) things.
Yes Max, he missed the goal because of vision problems
Finally! A moment for our 3D printer to shine. 

The Criteria:
  • Easy and light to transport (it would need to fly with our baggage)
  • Modular (we weren’t sure of how big/small the base needed to be)
  • Simple to print (no complex connections or overhangs)
The Solution:
We created a model of the ball (measure the ball and insert the dimensions; easy-peasy) and then, with Andrew Hall's help, designed a ring with a simple dovetail joint that allowed for symmetric assembly (yes, there are fancier joints we could have used, but this design was time-efficient and bulk-printing friendly).

Skyball-Ring-v1

We were able to fit 5 rings on the print-bed at a time, and whilst the print failed on a handful of the connectors (we were experiencing a heat-wave at our office at the time, so warping was an issue; glue to the rescue!), we were able to print 65 connectors (enough for an 8 x 8 pyramid) pretty quickly.
A 4x4 Base!
The simple design worked perfectly at our booth. At the beginning of the conference, we used a 7 x 7 pyramid, and by day 4, with a dwindling supply of Skyballs, we were able to reduce the base size all the way down to 2 x 2. 

Look Ma! No Duct Tape!
If you’d like to check it out, and/or use our design, you can download the STL files here.