System Architecture · 2026

Job Automation Engine

A nine-workflow, state-driven n8n pipeline that discovers LinkedIn jobs, scores fit, enriches company context, extracts hiring contacts, and generates tailored CV/cover-letter artifacts before notifying the operator in Slack.

System snapshot

Airtable status transitions route each job through a gated, auditable n8n AI pipeline

Workflow 0: CV ingestion + candidate profile extraction/variant generation (OpenAI)

Workflow 1: weekday job discovery using OpenAI-generated LinkedIn search configs + Apify harvesting

Workflow 2: Airtable webhook router that re-reads job state and dispatches the next workflow

Workflows 3–7: staged extraction/scoring/enrichment/contact-finding/tailoring with input validation and snapshot persistence

Workflow 8: READY_TO_APPLY notification that bundles job metadata + document links to Slack

Subflows: OpenAI response validator/unwrapper and Google Drive file content retrieval utilities

Design focus

Airtable status fields are the single source of truth for inter-workflow communication
Fit-scoring gate prevents enrichment/tailoring unless thresholds are met
Hard-capped search volume by schema constraints (exactly two configs, maxItems=7 each → 14 results/day ceiling)
AI outputs persisted as human-readable .txt snapshots in Google Drive for auditability
Shared subflows reuse OpenAI response validation and Drive retrieval across stages

Context

The Architectural Challenge

Job searching at volume commonly fails in two ways: (1) applying to roles that are a poor fit, and (2) writing generic application materials that don’t reflect the specific role. This system addresses both by treating job search and application preparation as a controlled data pipeline with explicit state transitions, a fit-gating step that blocks weak matches before any tailoring begins, and AI-generated application assets grounded in real job data, real company context, and the candidate’s actual CV. The pipeline is intentionally constrained for daily, curated throughput (maximum 14 raw results per daily run) with an operator reviewing the final packages. The system also emphasizes traceability: every intermediate artifact is persisted as a text snapshot in Google Drive and tracked in Airtable, and the workflow graph is inspectable at every stage.

Project parameters

Domain: Automation
Type: System
Complexity Level: Advanced

Technology stack

n8n, Airtable, OpenAI GPT-5 Mini, Apify, Google Drive, Slack, Python 3

Core Innovation

The key architectural distinction is using Airtable status field changes as the mechanism for inter-workflow communication. Instead of chaining workflows directly or using a queue, each stage workflow writes a new status to Airtable; that triggers the webhook on Workflow 2, which re-reads the canonical job record from Airtable and routes it to the appropriate downstream workflow. As a result, the operator can pause, inspect, restart, or manually advance a job by editing a status field in Airtable, without touching n8n, and any job record in the right state can trigger the next stage without replaying the entire pipeline. The pipeline also limits where compute is spent by enforcing a strict fit-scoring gate in Workflow 4: enrichment and tailoring only occur when the defined score thresholds are satisfied; otherwise jobs are marked REJECTED with a scoring reason.

Implementation

Implementation Details

Workflow 0 is triggered manually via an n8n form URL. It accepts a PDF CV upload, extracts text, uploads the CV to Google Drive, and runs two sequential OpenAI agents: one builds a canonical candidate profile, and another generates multiple strategically differentiated profile variants. Each variant is split into a separate Airtable Profile record containing fields such as type, summary, skills, preferences, visa constraints, and locations.

Workflow 1 runs on a scheduled trigger (Monday–Friday at 09:00). It loads active profiles from Airtable, selects one per day using day-of-week rotation, and calls an OpenAI agent to generate two LinkedIn search configurations scoped to the UK, full-time roles, and tech industry IDs 4/5/6 with maxItems=7 each. A subflow validates the generated configs before Apify harvests jobs. The workflow deduplicates fetched jobs by job ID and company, upserts company records, and creates Job records in Airtable with status LISTED.
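The rotation, volume cap, and deduplication steps in Workflow 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow's actual Code-node logic: the function names, the `id` and `company` field names, and the choice to treat the (job ID, company) pair as the deduplication key are all hypothetical.

```python
from datetime import date

MAX_CONFIGS = 2  # "exactly two configs" per the design notes
MAX_ITEMS = 7    # per-config cap, giving a 14-result daily ceiling


def pick_profile(profiles: list[dict], today: date) -> dict:
    """Rotate through active profiles by weekday (Monday=0 ... Friday=4)."""
    if not profiles:
        raise ValueError("no active profiles")
    return profiles[today.weekday() % len(profiles)]


def validate_search_configs(configs: list[dict]) -> int:
    """Reject config sets that break the cap; return the daily result ceiling."""
    if len(configs) != MAX_CONFIGS:
        raise ValueError(f"expected exactly {MAX_CONFIGS} configs, got {len(configs)}")
    for i, cfg in enumerate(configs):
        max_items = cfg.get("maxItems")
        # Assumption: values below the cap are tolerated, values above are not.
        if not isinstance(max_items, int) or not 1 <= max_items <= MAX_ITEMS:
            raise ValueError(f"configs[{i}].maxItems must be an integer in 1..{MAX_ITEMS}")
    return sum(cfg["maxItems"] for cfg in configs)


def dedupe_jobs(jobs: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (job ID, company) pair."""
    seen = set()
    unique = []
    for job in jobs:
        key = (job["id"], job["company"])  # hypothetical field names
        if key not in seen:
            seen.add(key)
            unique.append(job)
    return unique
```

Validating the generated configs before the Apify call means a malformed or oversized model output fails fast instead of silently raising the daily volume.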

Workflow 2 receives Airtable webhook events for every job status change. It validates the webhook payload, then re-reads the full job record from Airtable to avoid relying on incomplete webhook content. A Switch node routes to the correct downstream workflow for each active transition: LISTED → DETAILED → SCORED → COMPANY_ENRICHED → CONTACTS_FOUND → TAILORED → READY_TO_APPLY. Each stage workflow begins with a Validate Inputs Code node to enforce preconditions (status and required fields) and throws explicit field-level errors to make failure modes visible in the n8n execution log. Stage workflows call OpenAI and/or Apify as required, write structured text snapshots to Google Drive, create Document records in Airtable, and advance the job status when complete.
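The router and the per-stage precondition check described above might look roughly like the following sketch. The status names come from the transition chain; the workflow labels in `ROUTES`, the `record_id` payload key, and the `fetch_job` callback are placeholders introduced for illustration, and the status-to-workflow mapping is inferred rather than taken from the exported workflows.

```python
from typing import Callable, Optional

# Status -> next workflow. Labels are placeholders, not real workflow IDs.
ROUTES = {
    "LISTED": "workflow_3_detail_extraction",
    "DETAILED": "workflow_4_fit_scoring",
    "SCORED": "workflow_5_company_enrichment",
    "COMPANY_ENRICHED": "workflow_6_contact_finding",
    "CONTACTS_FOUND": "workflow_7_tailoring",
    "TAILORED": "workflow_8_notification",
}


class InputValidationError(Exception):
    """Raised with field-level detail so the failure is visible in logs."""


def validate_inputs(job: dict, expected_status: str, required_fields: list[str]) -> None:
    """Enforce a stage's preconditions: matching status and non-empty fields."""
    errors = []
    if job.get("status") != expected_status:
        errors.append(f"status: expected {expected_status!r}, got {job.get('status')!r}")
    for field in required_fields:
        if job.get(field) in (None, ""):
            errors.append(f"{field}: missing or empty")
    if errors:
        raise InputValidationError("; ".join(errors))


def route(payload: dict, fetch_job: Callable[[str], dict]) -> Optional[str]:
    """Re-read the canonical record, then pick the next workflow (or None)."""
    record_id = payload.get("record_id")  # hypothetical payload key
    if not record_id:
        raise InputValidationError("record_id: missing from webhook payload")
    job = fetch_job(record_id)  # trust the stored record, not the webhook body
    return ROUTES.get(job.get("status"))  # terminal states return None
```

Because `route` ignores any status carried in the webhook body, a stale or partial event cannot dispatch a job to the wrong stage; statuses with no entry, such as REJECTED or READY_TO_APPLY, simply dispatch nothing.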

OpenAI calls use the Responses API via n8n’s OpenAI/langchain node with json_schema structured outputs (additionalProperties:false and strict:true on schemas). A shared subflow (OpenAI Response Validator & Payload Extractor) unwraps outputs from the expected content path (output[0].content[0].text) and throws if the response status is not completed. Per-job artifacts are organized into Google Drive folders named {Job Title} - {Company} - {Job ID}.
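A minimal Python sketch of what the shared validator subflow does, assuming the response arrives as a plain dictionary and that the structured output at `output[0].content[0].text` is a JSON string to be parsed. The function names are hypothetical, and the JSON parsing step is an assumption based on the use of `json_schema` outputs rather than something the subflow is confirmed to do.

```python
import json


def unwrap_response(response: dict) -> dict:
    """Check the response status, then extract and parse the structured payload."""
    status = response.get("status")
    if status != "completed":
        raise ValueError(f"OpenAI response status is {status!r}, expected 'completed'")
    try:
        text = response["output"][0]["content"][0]["text"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("response is missing output[0].content[0].text") from exc
    return json.loads(text)  # assumption: the payload is a JSON document


def job_folder_name(title: str, company: str, job_id: str) -> str:
    """Build the per-job Drive folder name: {Job Title} - {Company} - {Job ID}."""
    return f"{title} - {company} - {job_id}"
```

Centralizing the unwrap step means every stage fails the same way on an incomplete or malformed response, rather than each workflow guessing at the content path.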

Workflow 8 validates that tailored CV and cover-letter documents exist, merges job metadata (title, company, location, fit score, visa risk, application type), and sends a Slack structured block message including a fit-score label (High/Medium/Low) plus action buttons (Apply, View CV, View Cover Letter). It then sets the job status to READY_TO_APPLY.
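As an illustration of the notification payload, the sketch below builds Slack Block Kit blocks with a section and three link buttons. The label cut-offs (0.75 and 0.6) are invented for the example, since the write-up does not state where High, Medium, and Low begin; the job field names and the function names are likewise hypothetical.

```python
def fit_label(score: float, high: float = 0.75, medium: float = 0.6) -> str:
    """Map a 0-1 fit score to a label; the cut-offs are illustrative only."""
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Low"


def build_ready_message(job: dict, cv_url: str, cover_letter_url: str) -> list[dict]:
    """Build Slack Block Kit blocks for a job whose documents are ready."""
    if not cv_url or not cover_letter_url:
        raise ValueError("tailored CV and cover letter must both exist")

    def button(label: str, url: str) -> dict:
        # A Block Kit button with a `url` opens the link when clicked.
        return {
            "type": "button",
            "text": {"type": "plain_text", "text": label},
            "url": url,
        }

    summary = (
        f"*{job['title']}* at {job['company']} ({job['location']})\n"
        f"Fit: {fit_label(job['fit_score'])} | "
        f"Visa risk: {job['visa_risk']} | "
        f"Application type: {job['application_type']}"
    )
    return [
        {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
        {
            "type": "actions",
            "elements": [
                button("Apply", job["apply_url"]),
                button("View CV", cv_url),
                button("View Cover Letter", cover_letter_url),
            ],
        },
    ]
```

Checking for both documents before building the message mirrors the workflow's own validation step, so an incomplete package never reaches the operator.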

A Python 3 sanitization script (scripts/sanitize_n8n_exports.py) is used to strip private identifiers from exported workflow JSON while preserving node structure, routing logic, prompts, and architecture for publishing.
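The actual script is not reproduced here; the following is a minimal sketch of the idea, assuming the usual shape of an n8n workflow export (a top-level `pinData` object and a `nodes` list whose entries may carry `webhookId` and `credentials`). The `sanitize_workflow` name and the `REDACTED` placeholder are illustrative, and a real script would also need to cover the Airtable, Drive, and Slack identifiers embedded in node parameters.

```python
import copy

PLACEHOLDER = "REDACTED"  # illustrative replacement value


def sanitize_workflow(workflow: dict) -> dict:
    """Return a copy of an exported workflow with private identifiers removed."""
    clean = copy.deepcopy(workflow)  # never mutate the original export
    clean.pop("pinData", None)       # pinned test data can contain real records
    for node in clean.get("nodes", []):
        if "webhookId" in node:
            node["webhookId"] = PLACEHOLDER
        for cred in node.get("credentials", {}).values():
            if isinstance(cred, dict) and "id" in cred:
                cred["id"] = PLACEHOLDER
    return clean
```

Node names, types, parameters, and connections pass through untouched, which is what keeps the routing logic and prompts readable in the published export.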

Latency profile

Per-job end-to-end latency: ~5–10 minutes for passing jobs

Each stage includes OpenAI API calls and Airtable reads/writes, and the pipeline advances asynchronously via webhook chaining, so multiple jobs can be in flight at different stages. End-to-end latency from LISTED to READY_TO_APPLY for a job that passes gating is approximately 5–10 minutes under normal API conditions; Workflow 7 (tailoring) makes four sequential OpenAI calls, which contributes to the minutes-scale runtime.

System focus

Correctness and auditable decisions over bulk throughput

The pipeline is designed for daily, curated throughput with an operator applying manually. It prioritizes correctness via explicit state gating and validation: workflows only run when the incoming job status matches expected preconditions, and the fit-scoring gate blocks enrichment/tailoring when thresholds are not met. A job passes the gate when skill_alignment_score >= 0.55 OR (overall_fit_score >= 0.6 AND skill_alignment_score >= 0.45); otherwise it is marked REJECTED with a populated scoring reason, so no compute is spent enriching or tailoring for weak matches. Operationally, the system also emphasizes inspectability and auditability by persisting intermediate AI outputs as human-readable .txt snapshots in Google Drive and tracking progression in Airtable, enabling review before any operator action. Search volume is capped to a daily ceiling of 14 raw results to keep the workflow manageable and signal-focused.
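The gate condition can be written as a small predicate. The thresholds are the ones quoted above; the function names, the returned dictionary shape, and the use of SCORED as the passing status are assumptions made for this sketch.

```python
def passes_fit_gate(skill_alignment_score: float, overall_fit_score: float) -> bool:
    """Pass on strong skill alignment, or on good overall fit with adequate alignment."""
    return skill_alignment_score >= 0.55 or (
        overall_fit_score >= 0.6 and skill_alignment_score >= 0.45
    )


def apply_gate(skill_alignment_score: float, overall_fit_score: float) -> dict:
    """Return the next status plus a reason string for the audit trail."""
    if passes_fit_gate(skill_alignment_score, overall_fit_score):
        return {"status": "SCORED", "scoring_reason": "thresholds met"}
    return {
        "status": "REJECTED",
        "scoring_reason": (
            f"skill_alignment_score={skill_alignment_score}, "
            f"overall_fit_score={overall_fit_score} below gate thresholds"
        ),
    }
```

For example, a job with skill alignment 0.50 and overall fit 0.62 passes through the second branch, while one with skill alignment 0.44 is rejected regardless of its overall fit.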

Outcomes

Outcomes & Future Iterations

- Full pipeline functional across Workflows 0–8, progressing from LISTED to READY_TO_APPLY via explicit Airtable status transitions.
- Fit scoring gate enforced: jobs below thresholds are marked REJECTED with a populated scoring reason before enrichment or tailoring runs.
- Automatic per-job Google Drive folder creation containing Job_Info.txt, Company_Info.txt, CV_Info.txt, and Cover_Letter_Info.txt.
- Slack notification sends a structured block message including fit score label, job metadata, visa risk, application type, and action buttons with direct links to apply and view documents.
- Repository exports are sanitized (pin data, credential IDs, webhook IDs, Airtable resource identifiers, Drive folder/file IDs, Slack IDs) while keeping node structure, routing logic, system prompts, scoring thresholds, and overall architecture intact.

Why this matters

Job-application automation without a fit gate wastes compute and time generating tailored materials for poor matches. This system enforces the decision boundary explicitly (before any tailoring), records why jobs are accepted or rejected (via scoring reason fields), and produces an auditable trail for every passing job (Google Drive snapshots for job/company/candidate/tailored artifacts). Because state transitions are stored in Airtable, the operator can inspect and manually advance or halt jobs by editing status fields. The modular n8n design also means individual stages can be replaced, upgraded, or tested independently without rewriting the entire pipeline, making the system practical for iterative improvement while keeping the end-to-end process consistent and controlled.