Skip to content

fix: parse threat detection results from detection.log instead of agent_output.json; and add tests#22832

Merged
davidslater merged 6 commits intomainfrom
davidslater/bugfix-thread-detection-parsing
Mar 25, 2026
Merged

fix: parse threat detection results from detection.log instead of agent_output.json; and add tests#22832
davidslater merged 6 commits intomainfrom
davidslater/bugfix-thread-detection-parsing

Conversation

@davidslater
Copy link
Contributor

@davidslater davidslater commented Mar 25, 2026

Changes

  • Read from detection.log instead of agent_output.json (new DETECTION_LOG_FILENAME constant)
  • Support both output formats: stream-json (--output-format stream-json) and raw text (--print)
    • Stream-json: extracts from {"type":"result","result":"THREAT_DETECTION_RESULT:..."} envelopes
    • Raw text: matches bare THREAT_DETECTION_RESULT:{...} lines
    • Only extracts from type:result (not type:assistant) to avoid double-counting
  • Fail-safe error handling: errors on missing result, multiple results, invalid JSON, and file I/O failures instead of silently defaulting to clean
  • Comprehensive tests: 34 tests covering both formats, edge cases, and error paths

…nt_output.json

The threat detection parser was reading agent_output.json (the main agent's
structured output) instead of detection.log (the detection copilot's stdout
piped via tee). Since agent_output.json never contains THREAT_DETECTION_RESULT,
the verdict always defaulted to clean — silently ignoring real detections.

Changes:
- Read from detection.log (DETECTION_LOG_FILENAME constant) instead of
  agent_output.json
- Support both stream-json format (type:result envelope) and raw text format
- Error on missing result, multiple results, invalid JSON, and file I/O
  failures instead of silently defaulting to clean
- Extract only from type:result envelopes (not type:assistant) to avoid
  double-counting in stream-json mode
- Add comprehensive tests for both output formats
Copilot AI review requested due to automatic review settings March 25, 2026 05:15
@davidslater davidslater changed the title fix: parse threat detection results from detection.log instead of agent_output.json fix: parse threat detection results from detection.log instead of agent_output.json; and add tests Mar 25, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes threat-detection verdict handling by parsing the detection copilot’s detection.log output (instead of agent_output.json) so real detections are not silently ignored.

Changes:

  • Switch threat detection parsing to read detection.log, supporting both stream-json and raw text formats.
  • Add fail-closed error handling for missing/invalid/multiple verdict lines and I/O failures.
  • Introduce DETECTION_LOG_FILENAME constant and add comprehensive Vitest coverage for the parser.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
actions/setup/js/parse_threat_detection_results.cjs Implements detection.log parsing (stream-json + text), strict result extraction, and fail-safe workflow failure behavior.
actions/setup/js/parse_threat_detection_results.test.cjs Adds extensive unit tests for parsing logic and documents intended main() behaviors.
actions/setup/js/constants.cjs Adds DETECTION_LOG_FILENAME export for shared path construction.
actions/setup/js/constants.test.cjs Extends constants tests to cover the new filename export.
Comments suppressed due to low confidence (1)

actions/setup/js/parse_threat_detection_results.test.cjs:306

  • Similarly, this skipped test asserts "Multiple THREAT_DETECTION_RESULT lines found", but the current code reports "Multiple THREAT_DETECTION_RESULT entries found (...)" (and prefixes with ERR_PARSE). Updating these assertions will keep the tests accurate as behavioral documentation.
      expect(mockCore.setOutput).toHaveBeenCalledWith("success", "false");
      expect(mockCore.setFailed).toHaveBeenCalledWith(expect.stringContaining("Multiple THREAT_DETECTION_RESULT lines found"));
    });

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…sages, update prompt

- Reject non-boolean threat flag types (string "false" would have been
  misclassified as true via Boolean coercion)
- Add regression tests for string/number flag values
- Fix skipped main() test expectations to match current error messages
- Update threat_detection.md prompt to require JSON boolean types
Reject null, arrays, strings, and numbers with a clear error message
instead of falling through to the boolean field check with a confusing
'expected boolean, got undefined' error.
@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2026

⚠️ Security scanning failed for Smoke Claude. Review the logs for details.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2026

✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready.

@github-actions
Copy link
Contributor

📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...

@github-actions github-actions bot removed the smoke label Mar 25, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2026

⚠️ Security scanning failed for Smoke Codex. Review the logs for details.

@github-actions
Copy link
Contributor

Agent Container Tool Check

Tool Status Version
bash 5.2.21
sh available
git 2.53.0
jq 1.7
yq v4.52.4
curl 8.5.0
gh 2.87.3
node v20.20.1
python3 3.12.3
go 1.24.13
java openjdk 21.0.10
dotnet 10.0.102

Result: 12/12 tools available ✅
Overall Status: PASS

🔧 Tool validation by Agent Container Smoke Test ·

@github-actions
Copy link
Contributor

🤖 Smoke test #23526665713 complete! Results for PR by @davidslater:

  • GitHub MCP ✅ | GH CLI ✅ | Serena MCP ❌ | Playwright ✅
  • Web Fetch ✅ | File Write ✅ | Bash ✅ | Discussion ✅
  • Build ✅ | Dispatch ✅ | PR Review ✅

Overall: ⚠️ PARTIAL PASS (11/12 — Serena MCP unavailable)

📰 BREAKING: Report filed by Smoke Copilot ·

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smoke test review pass ✅ — PR looks solid. The refactor to read from detection.log instead of agent_output.json is well-structured, with clear support for both stream-json and raw text formats. The 34 tests provide good coverage. Minor suggestions: add clarifying comments on the type:result filter rationale and ensure DETECTION_LOG_FILENAME usage is consistent throughout.

📰 BREAKING: Report filed by Smoke Copilot

@github-actions
Copy link
Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

The detection command writes to the same file via both --debug-file and
tee -a, so the same THREAT_DETECTION_RESULT line appears 2-3 times.
Deduplicate identical entries instead of erroring. Only error when
entries conflict (different verdicts).
@davidslater davidslater self-assigned this Mar 25, 2026
@davidslater davidslater merged commit e2cea0b into main Mar 25, 2026
55 checks passed
@davidslater davidslater deleted the davidslater/bugfix-thread-detection-parsing branch March 25, 2026 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants