fix: parse threat detection results from detection.log instead of agent_output.json; and add tests#22832
Conversation
…nt_output.json The threat detection parser was reading agent_output.json (the main agent's structured output) instead of detection.log (the detection copilot's stdout piped via tee). Since agent_output.json never contains THREAT_DETECTION_RESULT, the verdict always defaulted to clean — silently ignoring real detections. Changes: - Read from detection.log (DETECTION_LOG_FILENAME constant) instead of agent_output.json - Support both stream-json format (type:result envelope) and raw text format - Error on missing result, multiple results, invalid JSON, and file I/O failures instead of silently defaulting to clean - Extract only from type:result envelopes (not type:assistant) to avoid double-counting in stream-json mode - Add comprehensive tests for both output formats
There was a problem hiding this comment.
Pull request overview
Fixes threat-detection verdict handling by parsing the detection copilot’s detection.log output (instead of agent_output.json) so real detections are not silently ignored.
Changes:
- Switch threat detection parsing to read
detection.log, supporting both stream-json and raw text formats. - Add fail-closed error handling for missing/invalid/multiple verdict lines and I/O failures.
- Introduce
DETECTION_LOG_FILENAMEconstant and add comprehensive Vitest coverage for the parser.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| actions/setup/js/parse_threat_detection_results.cjs | Implements detection.log parsing (stream-json + text), strict result extraction, and fail-safe workflow failure behavior. |
| actions/setup/js/parse_threat_detection_results.test.cjs | Adds extensive unit tests for parsing logic and documents intended main() behaviors. |
| actions/setup/js/constants.cjs | Adds DETECTION_LOG_FILENAME export for shared path construction. |
| actions/setup/js/constants.test.cjs | Extends constants tests to cover the new filename export. |
Comments suppressed due to low confidence (1)
actions/setup/js/parse_threat_detection_results.test.cjs:306
- Similarly, this skipped test asserts
"Multiple THREAT_DETECTION_RESULT lines found", but the current code reports"Multiple THREAT_DETECTION_RESULT entries found (...)"(and prefixes withERR_PARSE). Updating these assertions will keep the tests accurate as behavioral documentation.
expect(mockCore.setOutput).toHaveBeenCalledWith("success", "false");
expect(mockCore.setFailed).toHaveBeenCalledWith(expect.stringContaining("Multiple THREAT_DETECTION_RESULT lines found"));
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…sages, update prompt - Reject non-boolean threat flag types (string "false" would have been misclassified as true via Boolean coercion) - Add regression tests for string/number flag values - Fix skipped main() test expectations to match current error messages - Update threat_detection.md prompt to require JSON boolean types
Reject null, arrays, strings, and numbers with a clear error message instead of falling through to the boolean field check with a confusing 'expected boolean, got undefined' error.
|
|
|
✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready. |
|
📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing... |
|
|
Agent Container Tool Check
Result: 12/12 tools available ✅
|
|
🤖 Smoke test #23526665713 complete! Results for PR by @davidslater:
Overall:
|
There was a problem hiding this comment.
Smoke test review pass ✅ — PR looks solid. The refactor to read from detection.log instead of agent_output.json is well-structured, with clear support for both stream-json and raw text formats. The 34 tests provide good coverage. Minor suggestions: add clarifying comments on the type:result filter rationale and ensure DETECTION_LOG_FILENAME usage is consistent throughout.
📰 BREAKING: Report filed by Smoke Copilot
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
The detection command writes to the same file via both --debug-file and tee -a, so the same THREAT_DETECTION_RESULT line appears 2-3 times. Deduplicate identical entries instead of erroring. Only error when entries conflict (different verdicts).
Changes
detection.loginstead ofagent_output.json(newDETECTION_LOG_FILENAMEconstant)--output-format stream-json) and raw text (--print){"type":"result","result":"THREAT_DETECTION_RESULT:..."}envelopesTHREAT_DETECTION_RESULT:{...}linestype:result(nottype:assistant) to avoid double-counting