P.E.N.E.: The Prompt Framework I’m Using to Build AI Agents for Network Operations

A practical field guide for using the P.E.N.E. framework as an agent behavior contract for NetOps workflows like BGP reviews, firewall audits, alert triage, NetBox drift detection, change planning, and incident summaries.

Direct Answer

P.E.N.E. is a practical framework for designing LLM-powered NetOps workflows that behave predictably. It started as a prompt framework, but in the age of AI agents it works better as an agent behavior contract: Persona & Purpose, Examples, kNowledge & coNstraints, and Evaluation & iteration.

Most AI demos for network engineers look impressive until you ask one simple question:

Can I trust this inside a real workflow?

That is where things get interesting.

Because asking an AI model to explain a config is one thing.

Asking it to analyze network state, return structured findings, respect operational constraints, call tools safely, and feed the next automation step?

That is a different game.

That is why I created the P.E.N.E. Framework.

P.E.N.E. stands for:

  • P — Persona & Purpose
  • E — Examples
  • N — kNowledge & coNstraints
  • E — Evaluation & iteration

At first, I thought about P.E.N.E. mostly as a prompt framework.

A better way to ask AI for useful answers.

But the more I build with LLMs, agents, tools, and automation platforms, the more I see it differently.

P.E.N.E. is not just prompt engineering anymore.

It is an agent behavior contract.

It helps answer the questions that matter before an AI system touches a real NetOps workflow:

  • What role should the AI play?
  • What outcome is it responsible for?
  • What does a good answer look like?
  • What context does it need?
  • What should it never do?
  • What format does the workflow need?
  • How do we know it is getting better instead of just sounding better?

That last one matters.

Because in network operations, sounding confident is not the same as being useful.

The goal is simple: help network engineers design LLM-powered workflows that produce outputs you can actually use.

Not random walls of text.

Not vague recommendations.

Not “this looks fine” when it absolutely does not.

I want AI responses that are:

  1. Consistent
  2. Actionable
  3. Parseable
  4. Constrained
  5. Testable
  6. Useful inside automation

You can find the framework here:

P.E.N.E. Framework on GitHub

The GitHub repo is the framework and toolkit.

This article is the field guide.

Let me show you how I think about it.


Why I Built P.E.N.E.

Most network engineers are not short on tools.

We have monitoring platforms.

We have IPAM.

We have source-of-truth systems.

We have ticketing systems.

We have scripts.

We have dashboards.

And now we have AI.

But here is the problem:

AI without structure becomes another noisy tool.

It may answer the question.

It may explain something.

It may even sound brilliant.

But if the output is inconsistent, hard to parse, too verbose, or ignores your environment, it becomes hard to trust.

And in network operations, trust matters.

A bad recommendation can create an outage.

A vague response can waste time.

A hallucinated command can send someone down the wrong path.

So the real question is not:

Can AI help with network operations?

The better question is:

How do we design AI prompts, agents, and workflows so they behave predictably inside network operations?

That is where P.E.N.E. comes in.


Why P.E.N.E. Still Matters With Better LLMs

A fair question is:

Do we still need a prompt framework if the models are getting better?

Yes.

Actually, I think we need it more.

Better LLMs do not remove the need for structure.

They raise the bar for what we can safely build.

A stronger model may understand BGP better.

It may summarize logs better.

It may generate cleaner change plans.

It may even understand the difference between IOS-XE, IOS-XR, Junos, Arista EOS, and Palo Alto syntax better than older models.

That is useful.

But inside network operations, understanding is only one part of the job.

The bigger question is:

Can the system behave predictably when connected to tools, source-of-truth data, tickets, approvals, and production workflows?

That is where P.E.N.E. still matters.

Persona gives the agent a role.

Examples show the pattern.

Knowledge and constraints define the operating boundary.

Evaluation turns the prompt from a clever demo into something you can test, measure, and improve.

That is the difference between playing with AI and engineering with AI.


The Shift: From Prompt Framework to Agent Behavior Contract

Prompt engineering used to feel like finding the right words.

Write a better instruction.

Add a better example.

Tell the model to “think like a senior engineer.”

That helped.

But modern LLM workflows are moving beyond single prompts.

Now we are building systems that include:

  • structured outputs
  • JSON schema validation
  • tool calling
  • external context
  • source-of-truth lookups
  • multi-step workflows
  • human approvals
  • tickets and cases
  • audit trails
  • evals and regression testing

So the prompt is not just a message anymore.

It is part of the operating system for the agent.

That is why I like thinking about P.E.N.E. as a behavior contract.

It defines what the agent is supposed to do, what it should never do, what output it must return, and how we will judge whether it worked.

For NetOps, that matters because we are not building toy workflows.

We are dealing with routing, firewall rules, capacity, incidents, source-of-truth drift, change windows, approvals, and production risk.

Giving an AI agent a tool without a behavior contract is risky.

That is not automation.

That is gambling with better branding.


The GitHub Repo Is the Framework. This Article Is the Field Guide.

The P.E.N.E. GitHub repo gives you the reusable structure.

It defines the framework, shows the template, includes quick-start guidance, and provides practical NetOps examples like BGP analysis, network alert triage, and configuration generation.

This article has a different job.

This article is about the operational mindset.

| GitHub Repo | Blog Article |
|---|---|
| Defines the framework | Explains why it matters |
| Gives the reusable template | Shows where to apply it |
| Includes prompt examples | Connects examples to real workflows |
| Teaches the mechanics | Teaches the operational thinking |
| Built for reference | Built for adoption |

That difference matters.

Because prompt engineering for network operations is not just about writing better prompts.

It is about creating better operating loops.


The Problem With “Analyze This Config”

Here is where most of us start:

Analyze this BGP config.

That is not a terrible prompt.

But it is incomplete.

The model does not know:

  • what role it should play
  • what platform it should assume
  • what risks matter most
  • what output format you need
  • whether it should recommend changes
  • whether those changes are allowed
  • whether this is production, lab, or staging
  • whether it should call a tool
  • whether it should open a ticket
  • whether the output needs to be consumed by automation

So the model fills in the blanks.

That is dangerous.

Not because the model is useless.

Because the prompt gave it too much freedom.

In network operations, I do not want AI making assumptions about my environment.

I want it working inside boundaries.

That is exactly what P.E.N.E. helps with.


P — Persona & Purpose

This is where we tell the AI what hat to wear.

Not just:

You are helpful.

Better:

You are a senior network engineer reviewing BGP configurations for a production service provider network.

Your task is to identify misconfigurations that could cause routing instability, security exposure, or operational risk.

That gives the model direction.

It now knows the role.

It knows the job.

It knows what “good” looks like.

For NetOps, this is huge because different tasks require different perspectives.

A NOC engineer triaging alerts thinks differently from a security engineer reviewing firewall policy.

A network architect reviewing design intent thinks differently from an automation engineer generating config.

A change planning assistant needs to be slower, safer, and more structured than a documentation extractor.

The persona shapes the response.

Here are a few examples:

You are a NOC engineer triaging network alerts and deciding whether escalation is required.
You are a network security engineer reviewing firewall policy for risky access rules and least-privilege violations.
You are a NetOps automation engineer comparing NetBox intended state against observed device configuration.
You are a change planning assistant creating safe, reviewable network change plans with rollback steps.

Same model.

Different jobs.

Different behavior.

That is the point.

A stronger version also defines success:

Success means returning accurate, evidence-based findings in the required schema without recommending unsafe production changes.

That one line changes the tone of the whole workflow.

Now the model is not just “being helpful.”

It has a job to perform.


E — Examples

This is where things start getting practical.

Examples are how you show the model the pattern.

A lot of prompts fail because they describe what they want, but they do not show what “good” looks like.

For network automation, examples are gold.

Example input:

access-list 101 permit ip any any

Expected output:

{
  "severity": "critical",
  "issue": "Overly permissive ACL allows all traffic",
  "recommendation": "Replace with explicit permit rules"
}

Now the model has a pattern.

It can see the input.

It can see the output.

It can see the structure.

This is where AI starts becoming useful for automation, because we are not just asking for insight.

We are asking for insight in a format another system can use.

That is a huge difference.

But here is the part I would add today:

Examples are not only for output style.

They are also for behavior.

Give examples of what should happen when the input is clean.

Give examples of what should happen when the input is broken.

Give examples of what should happen when the model does not have enough information.

That last one is important.

A useful NetOps agent should be able to say:

{
  "status": "insufficient_context",
  "reason": "The peer policy is not provided, so the maximum-prefix value cannot be safely recommended.",
  "next_step": "Request approved peer policy or retrieve it from source of truth."
}

That is better than guessing.

And in production, better than guessing is a very good place to start.
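
Here is a minimal sketch of how a workflow might consume that kind of response, assuming the model returns a JSON object with a `status` field like the example above. The function name `next_action` and the action labels are hypothetical, not part of the P.E.N.E. repo.

```python
import json


def next_action(raw_response: str) -> str:
    """Decide what the workflow does with a model response.

    Hypothetical action labels: 'fetch_context', 'process_findings', 'reject'.
    """
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return "reject"  # not parseable, so nothing downstream should trust it
    if not isinstance(data, dict):
        return "reject"  # valid JSON, but not the object shape we asked for
    if data.get("status") == "insufficient_context":
        # The model declined to guess, so go get the missing data instead.
        return "fetch_context"
    return "process_findings"
```

The point of the sketch is that "I do not have enough context" becomes a branch the workflow can act on, not a dead end.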


N — kNowledge & coNstraints

This is the part I think most people skip.

And it is probably the most important part for real network operations.

The model needs context:

Context:
- Platform: Cisco IOS-XE 17.x
- This is an eBGP peer
- All external BGP peers require authentication
- Maximum prefix limits should be configured
- This is a production network

But it also needs boundaries:

Do NOT:
- Recommend IOS-XR commands
- Recommend rebooting production devices
- Suggest disruptive changes without human approval
- Flag password strength; that is handled by another policy engine

This is how we move from “AI gave me advice” to “AI followed operational policy.”

That is a big deal.

Because a useful AI agent should not just know what to do.

It should know what not to do.

This is also where P.E.N.E. connects to what people now call context engineering.

The work is not just prompt wording anymore.

The work is deciding what context the model needs at the moment it needs it.

For NetOps, that context might include:

  • NetBox intended state
  • current device config
  • recent alerts
  • interface metadata
  • maintenance windows
  • routing policy
  • change freeze calendars
  • approved standards
  • known exceptions
  • prior incident notes
  • device role and site criticality

The model should not have to invent any of that.

The workflow should provide it.

That is how you build something safer.
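
One way to make "the workflow should provide it" concrete is to render the Context block from data the workflow already holds, and refuse to build the prompt when something essential is missing. This is a sketch under assumptions: the key names in `REQUIRED_KEYS` and the function `build_context_block` are hypothetical, and your minimum set will differ.

```python
# Assumed minimum set of facts; adjust to your environment.
REQUIRED_KEYS = ("platform", "environment", "device_role")


def build_context_block(facts: dict) -> str:
    """Render workflow-supplied facts as a 'Context:' section for the prompt."""
    missing = [key for key in REQUIRED_KEYS if not facts.get(key)]
    if missing:
        # Fail before calling the model, so it never has to invent context.
        raise ValueError(f"missing context: {', '.join(missing)}")
    lines = ["Context:"]
    for key in sorted(facts):
        lines.append(f"- {key}: {facts[key]}")
    return "\n".join(lines)
```

In practice the `facts` dictionary would be filled from NetBox, monitoring, or the ticket, not typed by hand.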


E — Evaluation & Iteration

The first prompt is rarely the final prompt.

That is not failure.

That is engineering.

For NetOps, I would test prompts against:

  • clean configs
  • broken configs
  • empty inputs
  • malformed configs
  • mixed vendor syntax
  • very large configs
  • configs with no issues
  • configs with many issues
  • alerts with missing fields
  • logs with misleading symptoms
  • source-of-truth data with bad assumptions
  • cases where the right answer is “do nothing”
  • cases where the right answer is “ask for approval”
  • cases where the right answer is “I do not have enough context”

This is where prompts become more reliable.

Not perfect.

Reliable.

And in operations, reliable beats flashy every time.

I would also separate evaluation into three buckets:

| Evaluation Type | What It Checks | NetOps Example |
|---|---|---|
| Format checks | Did the model return valid structured output? | Does the JSON include severity, evidence, and requires_human_approval? |
| Behavior checks | Did the model follow the rules? | Did it avoid recommending a reboot or unsafe production change? |
| Outcome checks | Was the recommendation useful and accurate? | Did it correctly identify missing BGP authentication or ignore a compliant peer? |

That is how you move from “the prompt worked once” to “the workflow is getting safer over time.”
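
The three buckets can be sketched as three small functions. This is an illustration, not the repo's test suite: the field names are borrowed from the BGP example later in this article, and the `FORBIDDEN_PHRASES` deny-list is an assumption you would replace with your own policy.

```python
import json

REQUIRED_FIELDS = {"summary", "risk_score", "issues",
                   "requires_human_approval", "safe_to_auto_remediate"}
FORBIDDEN_PHRASES = ("reload", "reboot", "clear ip bgp")  # assumed deny-list


def format_check(raw: str) -> bool:
    """Format: is it a JSON object with every required field?"""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and REQUIRED_FIELDS.issubset(data)


def behavior_check(raw: str) -> bool:
    """Behavior: does the response avoid forbidden recommendations?"""
    text = raw.lower()
    return not any(phrase in text for phrase in FORBIDDEN_PHRASES)


def outcome_check(raw: str, expected_findings: set) -> bool:
    """Outcome: do the reported findings match the known-correct set?"""
    data = json.loads(raw)
    found = {issue["finding"] for issue in data["issues"]}
    return found == expected_findings
```

A substring deny-list is crude and will produce false positives, but it is cheap enough to run on every response while better behavior checks are built.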


Important: P.E.N.E. Does Not Replace Structured Outputs

This is an important distinction.

P.E.N.E. defines the behavior.

Structured outputs or JSON schema enforce the shape.

Workflow logic decides what happens next.

Human approval controls risk.

Evals prove whether the system is improving.

So I would not build production workflows around “Return only JSON” alone.

That is fine for learning and demos.

But for production-style systems, I would rather think like this:

P.E.N.E. prompt
↓
Structured output schema
↓
Validation / retry / fallback
↓
Workflow decision
↓
Human approval if risk is present
↓
Action, ticket, case, or report

That keeps the model in the right place.

The model reasons.

The schema validates.

The workflow controls.

The human approves.

That is a much better design pattern.
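
The validation / retry / fallback step might look something like this. It is a sketch, assuming the model call is wrapped in a function you pass in as `ask_model`; that wrapper, the function name `call_with_validation`, and the fallback payload are all hypothetical.

```python
import json
from typing import Callable


def call_with_validation(ask_model: Callable[[str], str],
                         prompt: str,
                         required: set,
                         max_attempts: int = 2) -> dict:
    """Call the model, keep only answers that parse and carry the required fields."""
    for _ in range(max_attempts):
        raw = ask_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # retry: the response was not valid JSON
        if isinstance(data, dict) and required.issubset(data):
            return data
    # Fallback: hand off to a human rather than act on an unvalidated answer.
    return {"status": "validation_failed", "requires_human_approval": True}
```

The design choice here is that a failed validation never turns into an action. It turns into a request for a human.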


Example 1: BGP Risk Analyzer

Here is a stronger P.E.N.E. prompt for a BGP review.

## PERSONA & PURPOSE

You are a senior network engineer reviewing BGP configurations for a production service provider network.

Your task is to identify configuration risks that could cause routing instability, security issues, or operational incidents.

Success means returning evidence-based findings in the required schema without recommending unsafe production changes.

## EXAMPLES

Input:
router bgp 65001
 neighbor 10.0.0.1 remote-as 65002

Expected Output:
{
  "summary": "BGP peer is missing authentication and maximum-prefix protection.",
  "risk_score": 7,
  "issues": [
    {
      "severity": "high",
      "category": "security",
      "finding": "Missing BGP neighbor authentication",
      "evidence": "neighbor 10.0.0.1 remote-as 65002",
      "recommendation": "Add BGP neighbor authentication using the approved standard."
    },
    {
      "severity": "medium",
      "category": "stability",
      "finding": "Missing maximum-prefix limit",
      "evidence": "No maximum-prefix statement found",
      "recommendation": "Add maximum-prefix limits based on peer agreement."
    }
  ],
  "requires_human_approval": true,
  "safe_to_auto_remediate": false
}

Input:
router bgp 65001
 neighbor 10.0.0.1 remote-as 65002
 neighbor 10.0.0.1 password configured
 neighbor 10.0.0.1 maximum-prefix 5000 90

Expected Output:
{
  "summary": "BGP peer meets baseline policy.",
  "risk_score": 1,
  "issues": [],
  "requires_human_approval": false,
  "safe_to_auto_remediate": false
}

Input:
router bgp 65001
 neighbor 10.0.0.1 remote-as 65002
 neighbor 10.0.0.1 password configured

Expected Output:
{
  "summary": "BGP peer is missing maximum-prefix protection, but the approved peer limit is not provided.",
  "risk_score": 5,
  "issues": [
    {
      "severity": "medium",
      "category": "stability",
      "finding": "Missing maximum-prefix limit",
      "evidence": "No maximum-prefix statement found",
      "recommendation": "Retrieve the approved prefix limit from source of truth before drafting a change."
    }
  ],
  "requires_human_approval": true,
  "safe_to_auto_remediate": false
}

## KNOWLEDGE & CONSTRAINTS

Context:
- Platform: Cisco IOS-XE
- All eBGP peers require authentication
- All eBGP peers require maximum-prefix limits
- This is production
- Recommendations must be safe for change review

Do NOT:
- Recommend IOS-XR syntax
- Recommend immediate production changes
- Recommend clearing BGP sessions
- Guess missing peer policy values
- Claim the configuration was changed

## OUTPUT CONTRACT

Return structured output with these fields:
- summary
- risk_score from 1 to 10
- issues array
- requires_human_approval boolean
- safe_to_auto_remediate boolean

## NOW ANALYZE THIS:

[paste BGP config here]

This is already more useful than “analyze this config.”

Now a workflow can route based on risk_score.

A ticket can be created if requires_human_approval is true.

A workflow can stop if safe_to_auto_remediate is false.

A human can review the finding before any change is made.

This is how automation should feel.
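
Those routing decisions can be sketched in a few lines. The queue names and the `review_threshold` default are assumptions for illustration, and the function expects a result that has already passed schema validation.

```python
def route_bgp_finding(result: dict, review_threshold: int = 5) -> str:
    """Return a hypothetical queue name for a validated BGP analysis result."""
    if result["requires_human_approval"]:
        return "open_change_ticket"
    if result["risk_score"] >= review_threshold:
        return "engineer_review"
    if not result["safe_to_auto_remediate"]:
        return "log_only"  # stop here: nothing may be changed automatically
    return "auto_remediation_queue"
```

Run against the example outputs above, the first peer lands in `open_change_ticket` and the compliant peer lands in `log_only`.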


Example 2: Network Alert Triage Agent

Alerts are where most operations teams feel the pain.

Too much noise.

Not enough context.

Too many manual decisions.

Here is a P.E.N.E. prompt for alert triage.

## PERSONA & PURPOSE

You are a NOC engineer triaging network alerts.

Your task is to classify the alert, identify likely impact, recommend immediate checks, and decide whether escalation is required.

Success means routing the alert to the right next step without recommending configuration changes during triage.

## EXAMPLES

Input:
Interface Ethernet1 is down on CORE-RTR-01

Expected Output:
{
  "severity": "critical",
  "priority": 1,
  "impact": "Potential production traffic impact on core router",
  "immediate_checks": [
    "show interface Ethernet1",
    "show logging | include Ethernet1",
    "show ip route summary",
    "check upstream/downstream dependency"
  ],
  "escalate": true,
  "route_to": "network_oncall",
  "requires_human_approval": true
}

Input:
Interface Ethernet48 is down on ACCESS-SW-22

Expected Output:
{
  "severity": "low",
  "priority": 4,
  "impact": "Likely single access port impact",
  "immediate_checks": [
    "show interface Ethernet48",
    "check connected endpoint inventory"
  ],
  "escalate": false,
  "route_to": "service_desk",
  "requires_human_approval": false
}

## KNOWLEDGE & CONSTRAINTS

Context:
- Devices starting with CORE are critical infrastructure
- Devices starting with ACCESS are edge/access switches
- Critical alerts after hours must page on-call
- Low severity access port alerts should become tickets
- Business hours are 8 AM to 6 PM Eastern

Do NOT:
- Recommend rebooting a device
- Recommend configuration changes during triage
- Escalate low-priority alerts unless multiple related alerts exist
- Claim customer impact unless provided by monitoring, dependency mapping, or incident data

## OUTPUT CONTRACT

Return structured output with:
- severity
- priority
- impact
- immediate_checks
- escalate
- route_to
- requires_human_approval

Now imagine this inside a workflow:

  1. Alert comes in from monitoring.
  2. Workflow enriches the alert with device role and site context.
  3. AI triages it using P.E.N.E.
  4. Workflow validates the structured output.
  5. Critical alert pages the on-call engineer.
  6. Low-priority alert opens a ticket.
  7. Human gets a clean summary instead of raw noise.

That is where AI starts to earn its keep.
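
Steps 2, 5, and 6 are plain workflow logic, so they do not need a model at all. A rough sketch, assuming the hostname prefixes and business hours from the prompt above: the function names and action labels are hypothetical, the timestamp is assumed to already be in Eastern time, and weekends are ignored for brevity.

```python
from datetime import datetime


def device_role(hostname: str) -> str:
    """Enrichment: derive the device role from the hostname prefix."""
    name = hostname.upper()
    if name.startswith("CORE"):
        return "critical_infrastructure"
    if name.startswith("ACCESS"):
        return "access_switch"
    return "unknown"


def is_business_hours(eastern_time: datetime) -> bool:
    """8 AM to 6 PM; the caller supplies a time already in Eastern."""
    return 8 <= eastern_time.hour < 18


def dispatch(triage: dict, eastern_time: datetime) -> str:
    """Turn a validated triage result into a workflow action."""
    if triage["severity"] == "critical" and not is_business_hours(eastern_time):
        return "page_oncall"
    if triage["escalate"]:
        return "notify_" + triage["route_to"]
    return "open_ticket"
```

Keeping this logic outside the prompt means the paging policy can change without retesting the model's behavior.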


Example 3: NetBox Intent Validator

This is where P.E.N.E. becomes really useful for source-of-truth workflows.

Let’s say NetBox says a site should have this VLAN:

{
  "site": "ATL-DC1",
  "vlans": [
    {
      "id": 120,
      "name": "SERVER_BACKEND",
      "prefix": "10.120.0.0/24"
    }
  ]
}

But the device config does not match.

A P.E.N.E. prompt can compare intended state against observed state.

## PERSONA & PURPOSE

You are a NetOps automation engineer validating device configuration against NetBox source-of-truth data.

Your task is to identify drift between intended state and observed device configuration.

Success means identifying drift without assuming the device or NetBox should be changed automatically.

## EXAMPLES

Input:
Intent:
VLAN 120 SERVER_BACKEND 10.120.0.0/24

Observed:
vlan 120
 name SERVER_BACKEND
interface Vlan120
 ip address 10.120.0.1 255.255.255.0

Expected Output:
{
  "drift_detected": false,
  "findings": [],
  "recommended_action": "No action required",
  "requires_change_request": false
}

Input:
Intent:
VLAN 120 SERVER_BACKEND 10.120.0.0/24

Observed:
vlan 120
 name SERVER-BACKEND

Expected Output:
{
  "drift_detected": true,
  "findings": [
    {
      "severity": "medium",
      "type": "naming_drift",
      "intent": "SERVER_BACKEND",
      "observed": "SERVER-BACKEND",
      "recommendation": "Update VLAN name to match NetBox standard after approval."
    },
    {
      "severity": "high",
      "type": "missing_svi",
      "intent": "SVI for VLAN 120 within 10.120.0.0/24",
      "observed": "No SVI found",
      "recommendation": "Create SVI only after change approval and dependency validation."
    }
  ],
  "recommended_action": "Open change request",
  "requires_change_request": true
}

## KNOWLEDGE & CONSTRAINTS

Context:
- NetBox is the source of truth
- Device configuration is observed state
- VLAN names must use uppercase and underscores
- Missing SVIs require change approval
- This workflow detects drift only

Do NOT:
- Generate configuration unless requested
- Assume NetBox is wrong
- Auto-remediate production drift
- Recommend deleting existing VLANs
- Treat naming drift as a production outage

## OUTPUT CONTRACT

Return structured output with:
- drift_detected
- findings
- recommended_action
- requires_change_request

This is the kind of AI use case I like.

Not:

AI, go configure my network.

More like:

AI, compare intent to reality, explain the gap, and help me decide what should happen next.

That is useful.

That is safer.

That is much closer to how real network teams operate.
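
Simple drift like this can also be checked deterministically, which gives you ground truth for outcome checks on the model's findings. The parser below is a sketch that only understands the IOS-style `vlan` and `interface Vlan` stanzas shown in the examples. The function names are hypothetical.

```python
import re


def observed_vlan_name(config: str, vlan_id: int):
    """Return the name configured under 'vlan <id>', or None if absent."""
    current = None
    for line in config.splitlines():
        if not line.startswith(" "):
            # A top-level line starts a new stanza.
            match = re.fullmatch(r"vlan (\d+)", line.strip())
            current = int(match.group(1)) if match else None
        elif current == vlan_id:
            match = re.fullmatch(r"name (\S+)", line.strip())
            if match:
                return match.group(1)
    return None


def has_svi(config: str, vlan_id: int) -> bool:
    """True if an 'interface Vlan<id>' stanza exists."""
    target = f"interface Vlan{vlan_id}"
    return any(line.strip() == target for line in config.splitlines())
```

If the model reports `naming_drift` for a VLAN where `observed_vlan_name` matches the intent, that is an eval failure worth recording.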


Example 4: Firewall Policy Review

Firewall reviews are a perfect place to use structured AI assistance.

Not because AI magically understands every business requirement.

It does not.

But it can help create consistency around the first-pass review.

## PERSONA & PURPOSE

You are a network security engineer reviewing firewall policy for risky access rules.

Your task is to identify overly permissive rules, missing logging, risky source/destination combinations, and rules that may violate least privilege.

Success means identifying review-worthy risk without making unsupported assumptions about business justification.

## EXAMPLES

Input:
access-list OUTSIDE_IN permit ip any any

Expected Output:
{
  "risk_level": "critical",
  "finding": "Any-to-any inbound access rule",
  "why_it_matters": "This violates least privilege and may expose internal services.",
  "recommendation": "Replace with explicit source, destination, and service-based rules after dependency review.",
  "requires_review": true,
  "safe_to_auto_change": false
}

Input:
access-list OUTSIDE_IN permit tcp host 203.0.113.10 host 10.1.10.5 eq 443 log

Expected Output:
{
  "risk_level": "low",
  "finding": "Specific HTTPS access rule with logging enabled",
  "why_it_matters": "Rule is scoped to a specific source, destination, and service.",
  "recommendation": "No immediate action required.",
  "requires_review": false,
  "safe_to_auto_change": false
}

## KNOWLEDGE & CONSTRAINTS

Context:
- Inbound internet rules must be explicit
- Any-to-any rules are prohibited
- Logging is required on internet-facing permit rules
- This is a review workflow, not an enforcement workflow

Do NOT:
- Recommend deleting rules without dependency analysis
- Assume business justification is missing
- Recommend emergency changes unless risk is critical
- Generate vendor-specific commands
- Claim a rule is unused unless usage data is provided

## OUTPUT CONTRACT

Return structured output with:
- risk_level
- finding
- why_it_matters
- recommendation
- requires_review
- safe_to_auto_change

Notice the guardrails.

The AI is reviewing.

Not making the change.

Not deleting rules.

Not pretending it understands every dependency.

That distinction matters.


Example 5: Change Plan Generator With Human Approval

This is where I think AI agents become really useful.

Not:

Make the change.

More like:

Draft the change plan, explain the risk, and wait for approval.

## PERSONA & PURPOSE

You are a network change planning assistant.

Your task is to create a safe, reviewable change plan based on the approved remediation recommendation.

Success means producing a plan that includes pre-checks, implementation steps, rollback, post-checks, risk notes, and approval status.

## EXAMPLES

Input:
Issue: Missing maximum-prefix on eBGP neighbor 10.0.0.1
Platform: Cisco IOS-XE
Approved recommendation: Add maximum-prefix 5000 90

Expected Output:
{
  "change_summary": "Add maximum-prefix protection to eBGP neighbor 10.0.0.1",
  "risk_level": "medium",
  "pre_checks": [
    "show ip bgp summary",
    "show run | section router bgp",
    "confirm peer accepted prefix limit"
  ],
  "implementation_steps": [
    "Enter BGP configuration mode",
    "Apply maximum-prefix setting to neighbor",
    "Save configuration after validation"
  ],
  "rollback_plan": [
    "Remove maximum-prefix statement from neighbor",
    "Validate BGP session remains established"
  ],
  "post_checks": [
    "show ip bgp summary",
    "show logging | include BGP",
    "confirm prefixes received are below threshold"
  ],
  "requires_human_approval": true,
  "execution_allowed": false
}

## KNOWLEDGE & CONSTRAINTS

Context:
- All production changes require approval
- Change plans must include pre-checks, implementation steps, rollback, and post-checks
- Do not include secrets
- Do not execute changes

Do NOT:
- Claim the change has been applied
- Skip rollback steps
- Recommend clearing BGP sessions
- Include exact passwords or secrets
- Convert the plan into commands unless explicitly requested

## OUTPUT CONTRACT

Return structured output with:
- change_summary
- risk_level
- pre_checks
- implementation_steps
- rollback_plan
- post_checks
- requires_human_approval
- execution_allowed

This is the kind of workflow I want network engineers to build.

AI helps structure the work.

Humans stay in control.

Automation handles the repeatable parts.

That is the balance.
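
Before a drafted plan reaches a reviewer, the workflow can check that the plan is complete and that nothing is marked executable. A minimal sketch, assuming the output contract above. The function name `plan_problems` and the wording of the messages are hypothetical.

```python
LIST_FIELDS = ("pre_checks", "implementation_steps", "rollback_plan", "post_checks")


def plan_problems(plan: dict) -> list:
    """Return a list of reasons the plan should not go to review yet."""
    problems = []
    for field in LIST_FIELDS:
        if not plan.get(field):
            problems.append(f"{field} is missing or empty")
    if plan.get("execution_allowed") is not False:
        problems.append("execution_allowed must be false until a human approves")
    if plan.get("requires_human_approval") is not True:
        problems.append("requires_human_approval must be true for production")
    return problems
```

An empty list means the plan is structurally ready for a human. It says nothing about whether the plan is technically right, which is still the reviewer's call.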


Example 6: Documentation Extractor

Let’s talk about something less flashy but extremely useful.

Documentation.

Every network team has config details that should be documented but usually live inside the config itself.

A P.E.N.E. prompt can extract useful details into a consistent format.

## PERSONA & PURPOSE

You are a network documentation assistant.

Your task is to extract important operational details from a router configuration and return them in a clean Markdown table.

Success means extracting only documented facts from the provided configuration.

## EXAMPLES

Input:
interface GigabitEthernet0/0
 description WAN to ISP-A
 ip address 203.0.113.2 255.255.255.252

Expected Output:
| Section | Value |
|---|---|
| Interface | GigabitEthernet0/0 |
| Description | WAN to ISP-A |
| IP Address | 203.0.113.2/30 |
| Role | WAN |

## KNOWLEDGE & CONSTRAINTS

Context:
- WAN interfaces usually include ISP, WAN, MPLS, DIA, or Internet in the description
- LAN interfaces usually include Users, Servers, Voice, Wireless, or Campus
- Return only documented values from the config
- If a value is missing, write "Not found"

Do NOT:
- Invent missing descriptions
- Guess circuit IDs
- Include secrets
- Include enable passwords, SNMP communities, or keys

## OUTPUT CONTRACT

Return a Markdown table.

This may not get as many clicks as “autonomous AI agent fixes BGP.”

But this is the kind of work that saves teams hours.

And it reduces tribal knowledge.

That matters.
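
One detail in the example is worth checking in code: the mask 255.255.255.252 becomes /30 in the output table. That conversion is deterministic, so it makes a good outcome check. Here is a sketch using Python's standard `ipaddress` module. The helper names `to_cidr` and `extract_ip` are hypothetical.

```python
import ipaddress


def to_cidr(address: str, netmask: str) -> str:
    """Convert 'address + dotted netmask' into address/prefix notation."""
    return ipaddress.ip_interface(f"{address}/{netmask}").with_prefixlen


def extract_ip(config: str) -> str:
    """Return the first 'ip address A M' as CIDR, or 'Not found'."""
    for line in config.splitlines():
        parts = line.split()
        if parts[:2] == ["ip", "address"] and len(parts) >= 4:
            return to_cidr(parts[2], parts[3])
    return "Not found"
```

Comparing the model's IP Address cell against `extract_ip` catches a wrong prefix length before it lands in documentation.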


Example 7: Incident Summary Generator

Another practical use case is summarizing incident details for handoff.

The key is to keep it factual.

## PERSONA & PURPOSE

You are an incident communications assistant for a network operations team.

Your task is to summarize the incident using only the provided alert, ticket, and timeline data.

Success means creating a clear, calm, factual summary that works for technical and business stakeholders.

## EXAMPLES

Input:
Alert: CORE-RTR-01 interface Ethernet1 down at 02:14
Timeline:
02:14 alert triggered
02:17 on-call acknowledged
02:21 link restored
02:25 monitoring cleared

Expected Output:
{
  "executive_summary": "A core router interface outage was detected and the link was restored within 7 minutes; monitoring cleared 11 minutes after the alert.",
  "customer_impact": "Not provided in the incident data.",
  "timeline": [
    "02:14 - Alert triggered for CORE-RTR-01 Ethernet1",
    "02:17 - On-call engineer acknowledged the alert",
    "02:21 - Link restored",
    "02:25 - Monitoring cleared"
  ],
  "next_steps": [
    "Review interface logs",
    "Confirm physical layer stability",
    "Determine whether provider follow-up is required"
  ],
  "confidence": "medium"
}

## KNOWLEDGE & CONSTRAINTS

Context:
- Audience may include technical leaders and business stakeholders
- Keep summary factual and calm
- Include only information provided
- Separate confirmed facts from assumptions

Do NOT:
- Assign blame
- Invent root cause
- Claim customer impact unless provided
- Use dramatic language

## OUTPUT CONTRACT

Return structured output with:
- executive_summary
- customer_impact
- timeline
- next_steps
- confidence

This is where AI can help us communicate better.

Not by making things fluffy.

By making them clear.
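
Duration claims in a summary are easy to get subtly wrong, so I would compute them from the timeline and not trust the model's arithmetic. A small sketch, assuming timeline entries in the "HH:MM event" form used above and an incident that does not cross midnight. The function names are hypothetical.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes from start to end, both 'HH:MM' on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def elapsed_by_event(timeline: list) -> dict:
    """Map each event to minutes elapsed since the first timeline entry."""
    start = timeline[0].split(" ", 1)[0]
    result = {}
    for entry in timeline:
        stamp, event = entry.split(" ", 1)
        result[event] = minutes_between(start, stamp)
    return result
```

For the example timeline this gives 7 minutes to link restoration and 11 minutes to monitoring clear, which are the only figures the summary should quote.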


Where P.E.N.E. Fits Into AI Agents

A lot of people hear “AI agent” and immediately think:

Cool, the AI can use tools.

That is part of it.

But tool use without clear instructions is risky.

The prompt is the behavior layer.

It defines:

  • who the agent is
  • what job it performs
  • what good output looks like
  • what context matters
  • what actions are forbidden
  • when escalation is required
  • how the workflow should consume the result
  • how the system should behave when context is missing
  • how the output should be evaluated

This is why P.E.N.E. matters.

It gives the agent a job description, examples, boundaries, and a test loop.

Without that, you are just giving a model tools and hoping it behaves.

That is not engineering.

That is gambling with better branding.


A Simple Agent Flow

Here is how I think about the flow:

Network Event
↓
Enrichment / Context Gathering
↓
P.E.N.E. Behavior Contract
↓
Structured AI Output
↓
Schema Validation
↓
Workflow Decision
↓
Human Approval
↓
Automation Action
↓
Evidence / Ticket / Report
↓
Evaluation / Improvement Loop

That loop is important.

The AI does not need to own every step.

In fact, it probably should not.

A strong design gives the AI the job it is good at:

  • classification
  • summarization
  • reasoning over context
  • drafting recommendations
  • extracting structured details
  • identifying missing information

Then automation handles the repeatable workflow pieces:

  • routing
  • ticket creation
  • approval
  • enrichment
  • notification
  • validation
  • reporting
  • evidence capture

That is the sweet spot.


What This Looks Like in Tines

This is also where tools like Tines become really interesting.

You can take a P.E.N.E. prompt and use it inside an automated workflow.

For example:

  1. A monitoring alert comes in.
  2. Tines enriches the alert with device data.
  3. The AI agent uses a P.E.N.E. behavior contract to triage it.
  4. The output comes back as structured data.
  5. The workflow validates the shape of the response.
  6. The workflow routes based on severity.
  7. A human approves any risky action.
  8. The workflow creates the ticket, updates the case, or triggers the next step.
  9. The result is captured as evidence for later review.

That is not AI replacing the engineer.

That is AI helping the engineer move faster with better context.

And because the output is structured, the workflow can actually use it.

This is the part I care about.

AI should not just sound smart.

It should make the workflow smarter.
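Here is a hedged sketch of the routing and approval steps, written as plain Python rather than as a Tines configuration. The severity labels, queue names, and the `risky_action` flag are all assumptions for illustration; your own output contract would define the real ones.

```python
# Hypothetical routing table: severity label -> where the case goes.
ROUTES = {
    "critical": {"queue": "noc-oncall", "needs_approval": True},
    "major": {"queue": "netops", "needs_approval": True},
    "minor": {"queue": "backlog", "needs_approval": False},
}

# Anything the table does not recognize goes to a person.
FALLBACK = {"queue": "manual-review", "needs_approval": True}


def route(triage: dict) -> dict:
    """Pick a queue and approval requirement from structured triage output."""
    severity = str(triage.get("severity", "")).strip().lower()
    decision = dict(ROUTES.get(severity, FALLBACK))  # copy, do not mutate the table
    # A risky recommendation always needs a human, whatever the severity.
    if triage.get("risky_action"):
        decision["needs_approval"] = True
    return decision
```

The design choice that matters is the fallback: an unexpected or missing severity lands in manual review instead of silently taking the least restrictive path.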


How I’ll Teach This in the Workshop

This is also why I’m tying P.E.N.E. directly into my upcoming workshop:

Build AI Agents for Network Operations

We are going to move beyond “ask the chatbot.”

The goal is to help network engineers understand how AI agents are built, how prompts shape behavior, how tools fit into the loop, and how to keep humans in control.

We will look at how to build agents that can:

  • reason through network operations tasks
  • work with structured prompts
  • use tools safely
  • return parseable outputs
  • support human-in-the-loop workflows
  • help engineers troubleshoot and document faster
  • fail safely when context is missing
  • become more reliable through evaluation

The point is not to create a black box.

The point is to build something you can understand, test, and improve.

That is what makes this practical.


Why This Matters Right Now

I do not think AI replaces network engineers.

But I do think network engineers who understand AI systems will move faster than those who only use AI tools.

There is a big difference between this:

Ask ChatGPT a question

And this:

Build an agent that can reason over network state, use tools, respect constraints, return structured outputs, and support workflow decisions.

That second one is where the opportunity is.

And you do not need to boil the ocean.

Start with one workflow.

Pick something painful:

  • BGP peer reviews
  • firewall rule audits
  • interface alert triage
  • NetBox drift detection
  • config documentation
  • change plan generation
  • incident summaries

Then apply P.E.N.E.

Give the AI a role.

Show it examples.

Add your environment context.

Set hard constraints.

Define the output contract.

Validate the output.

Test it against messy inputs.

Iterate.

That is how we build smarter, not harder.
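The "test it against messy inputs" step can start very small. Below is a minimal sketch of an evaluation loop, assuming the agent returns a dict with a `severity` field. The `keyword_agent` stub stands in for a real model call, and the case names, inputs, and expected labels are invented for illustration.

```python
from typing import Callable


def evaluate(agent: Callable[[str], dict], cases: list[dict]) -> dict:
    """Run every case through the agent and report which ones failed."""
    failures = []
    for case in cases:
        try:
            result = agent(case["input"])
            ok = result.get("severity") == case["expected_severity"]
        except Exception:
            ok = False  # a crash or a non-dict result counts as a failure
        if not ok:
            failures.append(case["name"])
    return {
        "passed": len(cases) - len(failures),
        "total": len(cases),
        "failures": failures,
    }


def keyword_agent(text: str) -> dict:
    # Stand-in for a real model call, deliberately naive.
    if not text.strip():
        return {"severity": "unknown"}
    if "down" in text.lower():
        return {"severity": "critical"}
    return {"severity": "minor"}


CASES = [
    {"name": "clean", "input": "Interface Gi0/1 is down",
     "expected_severity": "critical"},
    {"name": "shouting", "input": "  INTERFACE gi0/1 DOWN!!!  ",
     "expected_severity": "critical"},
    {"name": "empty", "input": "", "expected_severity": "unknown"},
    {"name": "flapping", "input": "Gi0/1 flapping 12 times in 5 min",
     "expected_severity": "major"},
]

report = evaluate(keyword_agent, CASES)
```

With this stub, three of the four cases pass and `flapping` fails, which is the point: the failing case tells you what to add to the examples or constraints before the next iteration.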


What I Wish I Had When I Started

Here is what I wish I had when I started using AI for network automation:

  1. A reusable prompt structure
  2. A way to make outputs predictable
  3. A reminder to add constraints early
  4. A testing loop for messy inputs
  5. A way to connect AI output to automation safely
  6. A clear separation between model reasoning, schema validation, workflow control, and human approval

That is what P.E.N.E. is trying to give you.

It is not magic.

It is structure.

And structure matters when we are trying to move from cool demos to useful systems.


Final Thoughts

P.E.N.E. is not about making prompts fancy.

It is about making AI useful inside real operations.

For network operations, useful means:

  • consistent
  • actionable
  • parseable
  • constrained
  • testable
  • safe enough to fit into a workflow

That is the bar.

And with LLMs getting better, that bar does not go away.

It gets more important.

Because the better the model gets, the more tempting it becomes to give it access to tools, workflows, and production-adjacent systems.

That is exactly when structure matters most.

So yes, P.E.N.E. started as a prompt framework.

But for AI agents in network operations, I think it is more than that now.

It is a behavior contract.

The repo gives you the framework.

This article gives you the field guide.

The workshop will show you how to build with it.

And later this week, I’ll be publishing a video on my YouTube channel, Automation with Sif Baksh, walking through how I’m using P.E.N.E. to build better AI agents for NetOps.

Grab the framework here:

P.E.N.E. Framework on GitHub

Join the workshop here:

Build AI Agents for Network Operations

Try it with one messy network task and let me know how it goes 💬