Executive Overview

This project demonstrates how a standard public hiring website can be transformed into a high-fidelity source of security telemetry. While most organizations prioritize monitoring for authenticated systems and cloud control planes, they often ignore public-facing surfaces like careers pages and job search inputs. These anonymous, input-heavy areas are frequently probed by attackers but rarely instrumented, creating a significant visibility gap in modern detection engineering.

To address this, I developed a fictional hiring platform for Big’s BBQ & Smokehouse and embedded custom client-side detection logic into every user input field. By simulating controlled attack scenarios—including SQL injection, XSS, and URL manipulation—I generated real-world telemetry that correlates disparate actions into a single session. This case study shows that web input fields can be treated as monitored attack surfaces, providing structured events that allow security teams to identify and hunt for early-stage attacker behavior.

Problem Framing

Public input surfaces are rarely logged or treated as security signals.

Modern detection engineering focuses on authenticated systems, control planes, and privileged access paths.
However, many attacks begin long before authentication - in public, anonymous input surfaces such as search bars, application forms, chat widgets, and URL parameters.
Because these surfaces are treated as "application features" rather than "security signals", they are rarely instrumented, logged, or monitored, allowing early-stage attacker behavior to go completely unnoticed.

Detection architecture diagram showing attack surfaces and telemetry flow
Fig 1. 📸 Screenshot: The search bar, application form, chat, and URL params highlighted

How Might We

How might we treat every public input field on a hiring platform as a source of security telemetry so that common web attack behavior can be captured, structured, and made visible to detection and threat hunting workflows?

Introduction

From a UX perspective, hiring platforms are built for convenience:

  • Job search functionality
  • Quick-apply forms
  • Navigation filters
  • Chat assistants

From an attacker’s perspective, these same features are entry points.

This project simulates a realistic candidate hiring environment and intentionally treats these UX features as monitored attack surfaces. The goal was not to block input, but to capture attacker behavior as structured events.

Design Architecture

Design Standard

A client-side detection engine was built in JavaScript and embedded into the site via security-logger.js.

This logger (sketched below):

  • Inspects user input across search, chat, forms, and URL parameters
  • Matches input against injection and probing patterns
  • Normalizes events into a canonical schema (hcs.web_event.v1)
  • Correlates activity using a persistent session ID
  • Logs structured events to the browser console for evidence capture
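
A minimal sketch of that logic follows. Rule names, weights, helper functions, and the flagging threshold are illustrative assumptions, not the exact contents of security-logger.js:

```javascript
// Illustrative ruleset; the real logger's patterns and weights may differ.
const RULES = [
  { id: 'sql_injection',     pattern: /('|--|\bor\b\s+\d+\s*=\s*\d+)/i, weight: 4 },
  { id: 'xss_script_tag',    pattern: /<script\b/i,                      weight: 4 },
  { id: 'path_traversal',    pattern: /\.\.\//,                          weight: 3 },
  { id: 'command_injection', pattern: /;\s*(ls|cat|rm|whoami)\b/i,       weight: 3 },
];

// One session ID persists for the whole visit so events from different
// surfaces can be correlated later.
const SESSION_ID = sessionStorage.getItem('hcs_session_id') || crypto.randomUUID();
sessionStorage.setItem('hcs_session_id', SESSION_ID);

// Naive, non-cryptographic hash used here only to illustrate redaction
// (raw input is never logged, only its length and hash).
function redactHash(value) {
  let h = 0;
  for (const ch of value) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h.toString(16);
}

// Match input against the ruleset and sum the weights of every hit.
function scoreInput(value) {
  const matched = RULES.filter((rule) => rule.pattern.test(value));
  return {
    hits: matched.map((rule) => rule.id),
    score: matched.reduce((sum, rule) => sum + rule.weight, 0),
  };
}

// Normalize every observation into the hcs.web_event.v1 shape and write it
// to the console, where DevTools exports provide the evidence trail.
function logEvent(eventType, surface, value) {
  const { hits, score } = scoreInput(value);
  console.log(JSON.stringify({
    schema: 'hcs.web_event.v1',
    session_id: SESSION_ID,
    event_type: eventType,            // page_view, input_blur, form_submit, chat_message, query_params
    surface,                          // search, form, chat, url
    input_length: value.length,
    input_hash: redactHash(value),
    rule_hits: hits,
    score,
    flagged: score >= 3,              // illustrative escalation threshold
    url: location.href,
    timestamp: new Date().toISOString(),
    user_agent: navigator.userAgent,  // client context
  }));
}

// Example wiring: inspect every monitored field when the user leaves it.
document.querySelectorAll('input, textarea').forEach((el) =>
  el.addEventListener('blur', () => logEvent('input_blur', el.dataset.surface || 'form', el.value))
);
```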

Each event contains (an example record follows the list):

  • session_id
  • event_type (page_view, input_blur, form_submit, chat_message, query_params)
  • Redacted input metadata (length + hash)
  • Detection score and rule hits
  • URL, timestamp, and client context
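
For concreteness, a single normalized record might look like the following. Every value shown is hypothetical; only the field set mirrors the list above:

```javascript
// Hypothetical hcs.web_event.v1 record as exported from the DevTools console.
const exampleEvent = {
  schema: 'hcs.web_event.v1',
  session_id: '7f3c9a2e-5b1d-4c88-9e21-0d4f6a1b2c3d',
  event_type: 'form_submit',
  surface: 'form',
  input_length: 14,              // redacted metadata: length + hash, never the raw value
  input_hash: '1a9c2f4b',
  rule_hits: ['sql_injection'],
  score: 4,
  flagged: true,
  url: 'https://bigs-bbq.example/careers/apply',   // hypothetical URL
  timestamp: '2025-01-15T18:42:07.311Z',
  user_agent: 'Mozilla/5.0 ...',
};
```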

Browser DevTools output and the hcs.web_event.v1 telemetry stream were captured for each action.

A Jupyter notebook was then used to analyze the raw JSON event logs exported from the session.

Attack Scenarios (Phase 1)

Session 1 - Attack Matrix

The following attacks were performed and documented during Session 1:

| Attack Type | Surface | Example Payload | Result |
|---|---|---|---|
| SQL Injection | Search bar | ' OR 1=1 -- | Detected, flagged |
| XSS Injection | Chat | <script>alert(1)</script> | Detected, flagged |
| Parameter Tampering | URL | ?item=../../admin | Detected, flagged |
| Command Injection | Form | ; ls -la | Detected, flagged |
| Prompt-style abuse | Chat | ignore previous instructions | Low score, not flagged |
| Long anomalous input | Search | 200+ char string | Logged, not flagged (visibility gap) |
| Script tag probe | Chat | <script> | Flagged, matched injection regex |
| HTML tag probe | Search | <hello> | Not logged; browser input sanitation prevented telemetry |
| HTML tag probe | Chat | <hello> | Logged; rule not modeled |
| Baseline normal input | All | Normal user behavior | Score 0 |

Detection Engineering Insights

Evidence Standard

Each insight below is backed by DevTools telemetry, correlated by session ID, and validated via exported JSON event analysis:

  • DevTools console screenshots showing structured events
  • JSON exports reviewed in Jupyter Notebook
  • Session ID correlation across surfaces
01

Detection Coverage is Surface Dependent

What you're seeing

An attacker-style anomaly probe is used to test whether the system flags non-signature abuse that can lead to evasion or performance impact.

What this proves

Detection logic is not global to the application - it is defined by the browser controls, HTML input constraints, and which surfaces are modeled with detection rules.

    Evidence

    Long anomalous input in search bar
    Fig 1.1 📸 Screenshot: Browser controls stopped the probing attempt, so no detection from our ruleset was possible.
    Long anomalous input in chatbot
    Fig 1.2 📸 Screenshot: The same probing attempt in the chatbot did generate a telemetry event, but it was not flagged (flagged: false).
02

Malicious Intent Can Be Seen Even When the Browser Blocks the Attack

What you're seeing

This is a SQL injection payload, a technique used to trick the database into granting the attacker unauthorized access.

What this proves

The browser prevented submission because the payload did not match the ZIP code format, yet the telemetry still captured it as an attempted attack.
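
One plausible mechanism for this behavior, sketched under the assumption that the ZIP field uses an HTML pattern constraint and that the logger listens on blur (the selector and logEvent() helper are illustrative, not the exact implementation):

```javascript
// The pattern constraint lets the browser reject the submit, but the blur
// listener has already seen the raw value and emitted telemetry.
const zipField = document.querySelector('input[name="zip"]');   // e.g. pattern="[0-9]{5}"

zipField.addEventListener('blur', () => {
  // Runs whether or not the value would pass validation, so a payload like
  // "' OR 1=1 --" is scored and logged even though submission never happens.
  logEvent('input_blur', 'form', zipField.value);
});
```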

    Evidence

    Browser controls prevention
    Fig 2.0 📸 Screenshot: As in the previous screenshots, browser controls prevented the attack, but unlike those attempts, this one triggered a telemetry event AND received a score of 4.
03

Semantic Abuse is Invisible to Syntactic Detection Models

What you're seeing

A normal chat message with angle brackets did NOT trigger detection.

What this proves

Semantic attacks are not modeled: the ruleset matches syntax, not meaning, so this message produced no detection signal.

    Evidence

    Semantic attacks not modeled
    Fig 3.0 📸 Screenshot: Evidence that semantic attacks are not modeled by the current ruleset.
04

Same Input, Different Surfaces, Different Outcomes

What you're seeing

The same benign HTML input <hello> was entered into three different input surfaces:

  • Search field
  • Chat assistant
  • Application form

The results were not consistent across the three surfaces.

What this proves

Detection behavior is not determined solely by input content; the surface that receives the input also determines whether telemetry is generated and which rules apply.

    Evidence

    Harmless HTML stopped at browser controls
    Fig 4.1 📸 Screenshot: The HTML tag entered into the search field did not trigger any detection event.
    Harmless HTML stopped at browser controls
    Fig 4.2 📸 Screenshot: The same HTML tag entered into the application form was blocked by browser validation before submission.
    Harmless HTML not flagged in chatbot
    Fig 4.3 📸 Screenshot: The HTML tag was entered into the chatbot; this time it generated a telemetry event with a score of 0 and was not flagged. The chat surface captured telemetry but does not model HTML tag patterns as threats.
05

The System Models Attacks, Not Adversaries

What you're seeing

A 200+ character high-entropy string was entered, and although it resembles nothing a normal user would type, the detection system sees it as normal behavior.

What this proves

The detection strategy is exploit-centric rather than adversary-centric: it can catch known-bad payloads, but it has no model of what normal human behavior looks like.

    Evidence

    High-entropy attack
    Fig 5.0 📸 Screenshot: 200+ character input entered into the search field.
06

Post-Load URL Manipulation Bypasses Telemetry Collection

What you're seeing

The URL was modified after the page finished loading, and no telemetry event was generated.

What this proves

Query parameter inspection only runs during initial page load. Any URL changes after load occur completely outside the telemetry pipeline.
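
A likely cause, sketched under the assumption that the logger reads location.search only once at startup; the listeners that would close the gap are noted as a Phase 2 direction:

```javascript
// Gap: query parameters are inspected a single time, when the page loads.
window.addEventListener('DOMContentLoaded', () => {
  logEvent('query_params', 'url', location.search);
});

// Anything that changes the URL afterward never reaches the logger. Closing
// the gap would mean also listening for later navigation events, for example:
window.addEventListener('popstate', () =>       // back/forward navigation
  logEvent('query_params', 'url', location.search));
window.addEventListener('hashchange', () =>     // fragment edits
  logEvent('query_params', 'url', location.search + location.hash));
// Calls to history.pushState would additionally need to be wrapped.
```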

    Evidence

    Post-load URL manipulation
    Fig 6.0 📸 Screenshot: Evidence that telemetry does not observe post-load URL manipulation.
07

Detected but Not Escalated: Scoring Models Can Hide Real Attacks in Plain Sight

What you're seeing

A low score was given to a clear injection/probe attempt.

What this proves

Risk scoring models can downplay true malicious behavior, creating false negatives without any rule failures.
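
A minimal illustration of the failure mode, using a hypothetical rule weight and flagging threshold in the style of the logger sketch above:

```javascript
// The probe rule fires, but its weight sits below the escalation threshold,
// so the event is recorded as detected yet never flagged.
const PROBE_RULES = [
  { id: 'probe_admin', pattern: /\badmin\b/i, weight: 1 },   // illustrative weight
];
const FLAG_THRESHOLD = 3;                                     // illustrative threshold

function evaluate(value) {
  const matched = PROBE_RULES.filter((rule) => rule.pattern.test(value));
  const score = matched.reduce((sum, rule) => sum + rule.weight, 0);
  return { hits: matched.map((rule) => rule.id), score, flagged: score >= FLAG_THRESHOLD };
}

console.log(evaluate('admin'));   // hypothetical probe input
// => { hits: ['probe_admin'], score: 1, flagged: false }   detected, not escalated
```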

    Evidence

    Probe attempt
    Fig 7.0 📸 Screenshot: The system detected a probe (probe_admin) but then assigned it a low score and did not flag it - evidence that detection occurred, but prioritization logic caused the attack to be operationally ignored.
08

Detection Only Sees What It Is Designed to See

What you're seeing

The following is a screenshot from the Jupyter notebook, showing the telemetry analysis of all events generated during the attack session.

This notebook shows:

  • All event types recorded
  • Which events were flagged
  • Which events were not flagged
  • That every event belonged to the same session
  • A count of how often each surface was exercised

What this proves

The system did not fail randomly; it behaved exactly according to the detection rules implemented.

Events were only flagged when they matched an explicit rule.

All other attack attempts passed through as normal telemetry because no rule existed to classify them as malicious.
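
The same counts can be reproduced outside the notebook; a minimal Node.js sketch over the exported JSON, assuming an events.json file containing an array of hcs.web_event.v1 records:

```javascript
// Summarize the exported session: event types, surfaces, flag counts, sessions.
const fs = require('fs');

const events = JSON.parse(fs.readFileSync('events.json', 'utf8'));

const countBy = (key) =>
  events.reduce((acc, e) => ((acc[e[key]] = (acc[e[key]] || 0) + 1), acc), {});

console.log({
  total_events: events.length,
  unique_sessions: new Set(events.map((e) => e.session_id)).size,  // expected: 1
  by_event_type: countBy('event_type'),
  by_surface: countBy('surface'),
  flagged: events.filter((e) => e.flagged).length,
  not_flagged: events.filter((e) => !e.flagged).length,
});
```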

    Evidence

09

Canonical Event Telemetry Was Consistent

What you're seeing

Independent attack attempts across multiple UI surfaces (search field, form inputs, chatbot) all produced the same small set of normalized event types.

What this proves

Every attack surface produced clean, structured, and repeatable events.

    Evidence

    The 4 normalized event types
    Fig 9.1 📸 Screenshot: Event analysis shows that every attack surface generated the same normalized event types, proving the telemetry architecture was canonical, structured, and repeatable - a prerequisite for detection engineering.
    The 4 normalized event types
    Fig 9.2
10

Single Attacker Across Multiple Surfaces

What you're seeing

Jupyter Notebook output showing only one session observed across all events.

What this proves

All events came from the same actor and session.

    Evidence

    Proof of same session observed
    Fig 10.0 📸 Screenshot: Notebook evidence showing all recorded events share the same session_id, proving a single actor moved across multiple attack surfaces.

What's Next

Turning Findings into Detection Rules

With the baseline attacks and findings established, Phase 2 will rework security-logger.js to harden it against those visibility gaps and deploy the telemetry to Cloud Logger. Phase 3 will incorporate an actual AI model into the chatbot to address prompt-injection security and related concerns.

The initial findings exposed visibility gaps:

  1. Inputs that were logged but not flagged.
  2. Inputs the browser prevented from reaching telemetry.
  3. Inputs that revealed rule coverage blind spots.
  4. Events that needed better semantic modeling.

Those visibility gaps define the new detection engineering requirements (a sketch of possible rule additions follows the list):

  • Model semantic prompt-injection behavior.
  • Differentiate harmless HTML tags from executable script tags.
  • Add rate limiting and reliability to telemetry.
  • Detect URL manipulation after page load.
  • Introduce session correlation across attack surfaces.
  • Implement field classification (search vs form vs chat).
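
A sketch of how a few of these requirements might look in the Phase 2 ruleset; the phrase list, regexes, weights, and data-role attributes are assumptions rather than the final design:

```javascript
// Possible Phase 2 additions: semantic prompt-injection phrases, separating
// executable <script> tags from harmless HTML, length anomalies, and
// classification of which surface produced the input.
const PROMPT_INJECTION_PHRASES = [
  'ignore previous instructions',
  'disregard the system prompt',
];

const PHASE2_RULES = [
  // Executable script tags stay high severity...
  { id: 'xss_script_tag',   test: (v) => /<script\b/i.test(v), weight: 4 },
  // ...while generic tags like <hello> are logged at low severity instead of being ignored.
  { id: 'html_tag_probe',   test: (v) => /<[a-z][^>]*>/i.test(v) && !/<script\b/i.test(v), weight: 1 },
  // Semantic modeling: match phrases and intent, not just syntax.
  { id: 'prompt_injection', test: (v) => PROMPT_INJECTION_PHRASES.some((p) => v.toLowerCase().includes(p)), weight: 3 },
  // Adversary-style signal: input far longer than any legitimate field value.
  { id: 'anomalous_length', test: (v) => v.length > 200, weight: 2 },
];

// Field classification so the same input can be interpreted per surface.
function classifySurface(el) {
  if (el.closest('[data-role="chat"]')) return 'chat';
  if (el.closest('form[data-role="application"]')) return 'form';
  return 'search';
}
```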

Successful implementation takes the original logger from simply capturing events to performing behavior classification before logging.