Procurement Audit Automation
Automated Compliance & Audit Engine for ERP procurement exports
Turn messy ERP procurement dumps (Excel/CSV) into audit-ready evidence: ghost vendor detection, PO variance checks, high-value flags, and FOIP/PII risk scanning.
Procurement exports often contain exceptions that are easy to miss in spreadsheets. This project converts those risks into traceable exception reports that can be reviewed quickly and exported as evidence.
The project provides config-driven controls, repeatable audit checks, privacy risk scanning on unstructured text, and exportable CSV evidence tables that support compliance and internal audit workflows.
What this project delivers
Audit thresholds and switches live in YAML to enable change control without code churn.
Ghost vendor detection, PO variance computation, and high-value invoice flagging using deterministic logic.
NER + lightweight heuristics flag potential names/emails typed into unstructured Notes fields.
Each run produces clean CSV outputs suitable for attaching to tickets, review packages, or internal audit evidence bundles.
GitHub Actions runs the deterministic audit pipeline and unit tests. The AI step can be skipped in CI to keep runs stable and fast.
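As an illustration of the deterministic checks listed above, here is a minimal sketch of high-value invoice flagging. The field names (`invoice_id`, `amount`), the function name, and the threshold value are assumptions for the example, not taken from the project; in practice the threshold would be read from the YAML configuration.

```python
from decimal import Decimal

# Hypothetical threshold; the real value would come from the YAML config.
HIGH_VALUE_THRESHOLD = Decimal("50000")


def flag_high_value(invoices, threshold=HIGH_VALUE_THRESHOLD):
    """Return the invoices whose amount is at or above the threshold."""
    # Decimal avoids float rounding surprises on currency amounts.
    return [inv for inv in invoices if Decimal(str(inv["amount"])) >= threshold]


invoices = [
    {"invoice_id": "INV-001", "amount": "1200.00"},
    {"invoice_id": "INV-002", "amount": "75000.00"},
    {"invoice_id": "INV-003", "amount": "50000.00"},
]
flagged = flag_high_value(invoices)
flagged_ids = [inv["invoice_id"] for inv in flagged]
print(flagged_ids)
```

Whether the boundary is inclusive (`>=`) or exclusive (`>`) is a policy choice worth recording next to the threshold itself.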
⚡ Quick Links
Demo Flow (video + screenshot plan)
Command: python src/data_generator.py
Command: python src/rule_engine.py
Command: python src/ai_auditor.py
Command: streamlit run app/dashboard.py
Command: pytest -q
Evidence outputs
Evidence tables exported to data/audit_reports/ after each run (timestamped).
Exception list of flagged Notes content to support privacy review before sharing or archiving.
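A timestamped evidence export could look roughly like the sketch below. The function name `export_evidence`, the file-name pattern, and the column names are assumptions made for illustration; only the `data/audit_reports/` directory and the CSV format come from the description above. The demo writes to a temporary directory so it does not touch real evidence folders.

```python
import csv
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def export_evidence(rows, name, fieldnames, out_dir="data/audit_reports", now=None):
    """Write rows to <out_dir>/<name>_<UTC timestamp>.csv and return the path."""
    now = now or datetime.now(timezone.utc)
    out_path = Path(out_dir) / f"{name}_{now:%Y%m%dT%H%M%SZ}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()  # header is written even when there are no exceptions
        writer.writerows(rows)
    return out_path


# Demo with a fixed timestamp so the resulting file name is predictable.
with tempfile.TemporaryDirectory() as tmp:
    path = export_evidence(
        rows=[{"invoice_id": "INV-002", "vendor_id": "V-999"}],
        name="ghost_vendors",
        fieldnames=["invoice_id", "vendor_id"],
        out_dir=tmp,
        now=datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc),
    )
    written_name = path.name
    written_text = path.read_text(encoding="utf-8")
```

Writing the header even for an empty result gives reviewers a file that shows the check ran and found nothing, rather than a missing file.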
Documentation pages
How rules map to evidence tables and why the checks are audit-grade.
Why this design is audit-grade
Thresholds and switches live in a single configuration file, making changes reviewable, traceable, and consistent across runs.
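One way to keep configuration changes reviewable is to validate the loaded values against a fixed schema. The sketch below assumes the YAML file has already been parsed into a plain dict (for example by a YAML loader); the class name, key names, and default values are hypothetical and not taken from the project.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AuditConfig:
    # Hypothetical keys and defaults, for illustration only.
    po_variance_threshold: float = 0.10
    high_value_threshold: float = 50_000.0
    enable_pii_scan: bool = True

    @classmethod
    def from_mapping(cls, raw):
        """Build a config from a parsed mapping, rejecting unknown keys."""
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            # A misspelled key should fail loudly instead of being ignored.
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        cfg = cls(**raw)
        if cfg.po_variance_threshold < 0 or cfg.high_value_threshold < 0:
            raise ValueError("Thresholds must be non-negative")
        return cfg


# Stand-in for the dict a YAML loader would return.
raw = {"po_variance_threshold": 0.05, "enable_pii_scan": False}
cfg = AuditConfig.from_mapping(raw)
print(cfg)
```

Rejecting unknown keys means a typo in the config file surfaces as an error at startup rather than as a silently unchanged threshold.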
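Invoices are left-joined against the vendor master and missing matches are extracted as an evidence table. The join is a single set-based operation, so it scales with export size, and each flagged row shows exactly which vendor ID failed to match.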
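The left-join idea can be sketched without any dependencies. This is not the project's implementation (which may well use a DataFrame merge); the function name, the `vendor_id` join key, and the `exception` label are assumptions for the example.

```python
def find_ghost_vendor_invoices(invoices, vendor_master):
    """Left-join invoices to the vendor master and keep the rows with no match."""
    vendors_by_id = {v["vendor_id"]: v for v in vendor_master}
    evidence = []
    for inv in invoices:
        match = vendors_by_id.get(inv["vendor_id"])  # None means no matching vendor
        if match is None:
            # Keep the full invoice row so the reviewer sees the original data.
            evidence.append({**inv, "exception": "VENDOR_NOT_IN_MASTER"})
    return evidence


vendor_master = [
    {"vendor_id": "V-001", "name": "Acme Supplies"},
    {"vendor_id": "V-002", "name": "Northern Freight"},
]
invoices = [
    {"invoice_id": "INV-001", "vendor_id": "V-001", "amount": 1200.0},
    {"invoice_id": "INV-002", "vendor_id": "V-999", "amount": 8400.0},
]
ghosts = find_ghost_vendor_invoices(invoices, vendor_master)
print(ghosts)
```

Real exports often need the join key normalized first (whitespace, case, leading zeros); otherwise formatting differences show up as false ghost vendors.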
abs(invoice - po) / po produces a transparent control metric that can be tuned via config and justified during review.
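A worked example of the variance metric, as a minimal sketch. The function names and the 10% threshold are assumptions; the formula itself is the one stated above. Treating a zero or negative PO amount as "undefined" is also an assumption made here, since the formula does not say how to handle it.

```python
def po_variance(invoice_amount, po_amount):
    """Return abs(invoice - po) / po, or None when the PO amount is not positive."""
    if po_amount <= 0:
        # Assumption: a zero or negative PO cannot be a meaningful denominator,
        # so the row is routed to manual review instead of being scored.
        return None
    return abs(invoice_amount - po_amount) / po_amount


def exceeds_variance(invoice_amount, po_amount, threshold=0.10):
    """True when the variance is defined and strictly above the threshold."""
    variance = po_variance(invoice_amount, po_amount)
    return variance is not None and variance > threshold


# Invoice of 1150 against a PO of 1000: abs(1150 - 1000) / 1000 = 0.15
variance = po_variance(1150.0, 1000.0)
print(variance, exceeds_variance(1150.0, 1000.0))
```

Because the metric is a ratio, the same threshold applies to small and large purchase orders, which makes it straightforward to justify during review.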
Unstructured Notes is the highest FOIP/PII risk surface. The scanner outputs an exception list for review (names/emails), not a compliance decision.
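The heuristic half of the scanner can be illustrated with an email pattern. This sketch covers emails only and does not reproduce the NER step for names; the regular expression, function name, and output columns are assumptions for the example.

```python
import re

# Deliberately simple pattern: good enough to raise an exception for review,
# not a full validator of email address syntax.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_notes(rows):
    """Return one exception record per email-like string found in a Notes field."""
    exceptions = []
    for row in rows:
        notes = row.get("notes") or ""  # tolerate missing or empty notes
        for match in EMAIL_RE.findall(notes):
            exceptions.append(
                {
                    "invoice_id": row["invoice_id"],
                    "risk_type": "EMAIL",
                    "matched_text": match,
                }
            )
    return exceptions


rows = [
    {"invoice_id": "INV-001", "notes": "Contact jane.doe@example.com for approval."},
    {"invoice_id": "INV-002", "notes": "Delivered to dock 4."},
    {"invoice_id": "INV-003", "notes": None},
]
pii_exceptions = scan_notes(rows)
print(pii_exceptions)
```

The output is a list of candidates for a human reviewer, consistent with the scanner producing an exception list rather than a compliance decision.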
CI validates deterministic logic + tests reliably. The AI step can be toggled off with SKIP_AI=1 to keep CI stable while preserving full local demos.
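The toggle could be wired in along these lines. Only the `SKIP_AI=1` variable comes from the description above; the function names and step labels are hypothetical, and the steps are represented as strings so the sketch stays self-contained.

```python
import os


def ai_step_enabled(environ=None):
    """Return False only when SKIP_AI is set to exactly '1'."""
    env = os.environ if environ is None else environ
    return env.get("SKIP_AI", "0") != "1"


def planned_steps(environ=None):
    """List the pipeline steps that would run under the given environment."""
    steps = ["rule_engine"]  # deterministic checks always run
    if ai_step_enabled(environ):
        steps.append("ai_auditor")  # model-backed scan, skipped in CI
    return steps


ci_steps = planned_steps({"SKIP_AI": "1"})
local_steps = planned_steps({})
print(ci_steps, local_steps)
```

Passing the environment as a parameter keeps the toggle itself unit-testable without mutating `os.environ`.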
Mapping to IT Reporting / Compliance work
Produces repeatable exception tables for review, triage, and downstream reporting.
Supports policy-driven checks and highlights FOIP/PII risk before data is shared or archived.