Ping Me Maybe - When SubCrawl Started Talking to Teams

When you spend enough time with a research framework, you start having conversations with it.
Sometimes those conversations go like this:

Me: “Hey SubCrawl, could you just tell me when something juicy pops up?”
SubCrawl: “Sure. I’ll store it in SQLite or MISP for you.”
Me: “No, I mean like… tell me. Ping me.”
SubCrawl: “You want me to… talk?”

And that’s how TeamsStorage was born.


1. Why SubCrawl Needed a New Voice

The original SubCrawl by HP Threat Research is a brilliant framework:
it crawls open directories, fingerprints suspicious content, matches YARA and ClamAV rules, and stores results elegantly.

But in 2025, “store and analyze later” sometimes isn’t enough.
When you’re knee-deep in an operation, you want real-time context, not just a neat database.

Hence, my fork: kaeptenbalu/subcrawl.
Same architecture, same philosophy — but with a few tweaks to make SubCrawl speak with you.


2. The MISPStorage Evolution

Let’s start with the older sibling — the MISP connector.

The original MISPStorage module was solid, but built for a world where you run a scan, go for coffee,
and later import everything into your shiny MISP instance.

My updated version simply… grew up.

It now includes:

  • Event caching, so repeated domains don’t spam MISP with twins.
  • Tagging, with tags pulled from URLhaus, YARA, ClamAV, payload and TLSH results.
  • Adaptive findings logic, creating events only when something genuinely interesting happens.
  • Better error handling — no more crying over one bad header field.

All of this keeps the original module’s spirit, just tuned for real-world tempo.
It’s not “faster” or “better” in a marketing sense — it’s just more conversational.
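
To make the first and third bullets a bit more concrete, here is a minimal sketch of the idea, not the code from the fork. The names EventCache, get_or_create, has_findings and the result keys are all assumptions made up for illustration; the real module talks to MISP, while this sketch takes any callable that creates an event.

```python
class EventCache:
    """Hypothetical in-memory cache: one event per domain."""

    def __init__(self):
        self._events = {}

    def get_or_create(self, domain, create_event):
        # Normalise the key so "Example.test" and "example.test " match.
        key = domain.strip().lower()
        if key not in self._events:
            # create_event is any callable that builds the event and
            # returns its identifier (a stand-in for the MISP call).
            self._events[key] = create_event(key)
        return self._events[key]


def has_findings(result):
    """Return True only if at least one source reported something.

    The key names below are assumed for this sketch.
    """
    keys = ("yara", "clamav", "payload", "urlhaus_tags")
    return any(result.get(k) for k in keys)


# Usage: only results with findings ever reach the cache.
cache = EventCache()
result = {"domain": "example.test", "yara": ["php_webshell"]}
if has_findings(result):
    event_id = cache.get_or_create(result["domain"], lambda d: f"event-for-{d}")
```

The point of the gate in front of the cache is that a clean domain never creates an event at all, and a noisy domain creates exactly one.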



3. Introducing: TeamsStorage 🛰️

Then there’s the extrovert in the family — TeamsStorage.

Where MISP is your sharing and knowledge library, TeamsStorage is your chatty assistant who bursts in shouting,

“Hey, found an open directory full of PHP shells! You might wanna see this.”

Technically speaking, it’s a new storage module that sends results directly to Microsoft Teams via webhook.
No servers, no dashboards, no MISP dependencies — just instant, formatted notifications.

Each message includes:

  • 🧩 Domain name
  • 🧬 Detected findings (YARA, ClamAV, Payloads)
  • 🪪 URLhaus tags
  • 📂 Open directory detections
  • Associated teams_id metadata, if present in the results (optional)

All wrapped in tidy Markdown — because security alerts should be readable and stylish.
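
As a rough illustration of what such a message could look like, here is a small sketch that assembles the fields listed above into Markdown. The function name format_teams_message, its parameters and the exact layout are assumptions for this example; the actual module may format things differently.

```python
def format_teams_message(domain, findings=None, urlhaus_tags=None,
                         open_dirs=None, teams_id=None):
    """Build a Markdown message; sections without data are skipped."""
    lines = [f"**Domain:** {domain}"]
    if findings:
        lines.append("**Findings:**")
        # findings maps a source name (e.g. "YARA") to a list of matches.
        for source, matches in findings.items():
            lines.append(f"- {source}: {', '.join(matches)}")
    if urlhaus_tags:
        lines.append("**URLhaus tags:** " + ", ".join(urlhaus_tags))
    if open_dirs:
        lines.append("**Open directories:**")
        lines.extend(f"- {url}" for url in open_dirs)
    if teams_id is not None:
        lines.append(f"**teams_id:** {teams_id}")
    # Blank lines between entries keep each one on its own line.
    return "\n\n".join(lines)


message = format_teams_message(
    "example.test",
    findings={"YARA": ["php_webshell"], "ClamAV": ["Php.Trojan.Example"]},
    open_dirs=["http://example.test/backups/"],
)
```

Skipping empty sections keeps quiet results short, so the eye lands on whatever actually fired.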



4. Under the Hood

The module is fully compatible with the SubCrawl core, implemented as a standard storage class.

It uses:

  • dataclasses for clean data aggregation
  • A dedicated _analyze_url_content() method to unify module results
  • Timeout-hardened requests to avoid hanging on external APIs
  • Simple JSON payloads for Teams (no fancy cards — because reliability > glitter)
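
The pieces above can be sketched roughly as follows. This is an illustration under assumptions, not the module's source: DomainSummary, build_payload and post_to_teams are hypothetical names, the standard library's urllib stands in for whatever HTTP client the module really uses, and the payload is the plain {"text": ...} body that Teams incoming webhooks accept.

```python
import json
import urllib.request
from dataclasses import dataclass, field


@dataclass
class DomainSummary:
    """Hypothetical per-domain aggregate of module results."""
    domain: str
    yara: list = field(default_factory=list)
    clamav: list = field(default_factory=list)
    payloads: list = field(default_factory=list)

    def has_findings(self):
        return bool(self.yara or self.clamav or self.payloads)


def build_payload(summary):
    """Build a simple text payload (no cards) from a summary."""
    parts = [f"**{summary.domain}**"]
    for label, items in (("YARA", summary.yara),
                         ("ClamAV", summary.clamav),
                         ("Payloads", summary.payloads)):
        if items:
            parts.append(f"{label}: {', '.join(items)}")
    return {"text": "\n\n".join(parts)}


def post_to_teams(webhook_url, payload, timeout=10):
    """POST the payload; return False instead of hanging or raising
    on network errors and timeouts."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError and socket timeouts.
        return False


summary = DomainSummary("example.test", yara=["php_webshell"])
payload = build_payload(summary)
# post_to_teams("<your webhook URL>", payload) would send it.
```

Returning a boolean instead of raising means one unreachable webhook cannot take the rest of the crawl down with it.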

Configuration is as simple as adding this to your config.yml:

teams:
  webhook_url: "https://outlook.office.com/webhook/your-webhook-url"

Then, run SubCrawl like this:

python3 subcrawl.py -f urls.txt -p YARAProcessing,ClamAVProcessing -s MISPStorage,TeamsStorage

That’s it. The next time SubCrawl hits something interesting, your Teams channel lights up faster than your caffeine tolerance.


5. The Takeaway

SubCrawl didn’t change at its core — it just found new ways to speak. MISPStorage got a vocabulary upgrade; TeamsStorage got a microphone.

One stores intelligence; the other shares it. And together, they make sure no “Index of /backups” hides quietly again.


Bottom line:
Threat intelligence shouldn’t only collect information — it should communicate it. Sometimes that means structured events. Sometimes, it just means a friendly ping saying:

“Hey, I think you’ll want to see this one.”


Tags: #subcrawl #misp #teams #cti #automation #opensource