Talks about AI transforming telecom often lean on demonstrations that use a cloud API, a made-up schema, and a demo that works exactly once. I wanted something different: a 7-billion-parameter model running entirely on my laptop, converting a messy natural-language broadband order into a JSON object that a real OSS backend could actually consume, aligned to the TM Forum TMF641 Service Order API.
What followed was Milestone 1 of my Telecom-AI Agent Framework. Here is what I learned forcing a small local model to stop generating prose and start emitting precise, schema-valid data.
The Problem/Context
When a field engineer or customer service rep raises a broadband order, the downstream provisioning systems — OSS, BSS, network controllers — require exact, structured data. There is no room for “Uh, it’s a 1Gbps fiber thing for John at 123 Main Street.”
The challenge of M1 was answering one foundational question: can a 7B local model reliably bridge that gap? And if so, what does the architecture need to look like to make it production-worthy — without cloud APIs, without GPT-4, and without crossing your fingers?
The Delta (my learnings)
Before this milestone, I understood intent-based fulfilment as a concept. I knew LLMs could extract information from natural language. What I did not understand was how to make a small local model do it reliably and measurably.
1. JSON-mode is not enough — you need a re-ask loop
Ollama’s format: json parameter tells the model to output JSON. What it doesn’t guarantee is that the JSON matches your schema. The solution is an Instructor-style coercion loop: if Pydantic validation fails, the model gets another attempt with the specific validation error injected back into the prompt.
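from pydantic import ValidationError

async def generate_order(prompt: str, max_retries: int = 2) -> ServiceOrder:
    # Illustrative wrapper; ollama_client, ServiceOrder and inject_validation_error come from the project
    for attempt in range(max_retries):
        response = await ollama_client.generate(prompt)
        try:
            return ServiceOrder.model_validate_json(response)
        except ValidationError as e:
            # Feed the specific validation error back into the next attempt
            prompt = inject_validation_error(prompt, e)
    raise RuntimeError(f"no schema-valid output after {max_retries} attempts")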
This turns a ~70% baseline schema-validity rate into 90%+ without changing the model or the base prompt.
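To make the feedback step concrete, here is a minimal sketch of what a helper like inject_validation_error could do. It is an assumption, not the repository's code: the function bodies, the REQUIRED_FIELDS table, and the fake model replies are hypothetical, and a plain-Python validator stands in for Pydantic so the snippet runs without dependencies.

```python
import json

# Hypothetical required fields and types; the real contract lives in the Pydantic models.
REQUIRED_FIELDS = {"customer_name": str, "bandwidth_mbps": int, "address": str}


def validate(raw):
    """Return (order, errors); a stand-in for ServiceOrder.model_validate_json."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be a JSON object"]
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            errors.append(f"{name}: field required")
        elif not isinstance(data[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return (None, errors) if errors else (data, [])


def inject_validation_error(prompt, errors):
    """Append the concrete errors so the next attempt can correct them."""
    bullets = "\n".join(f"- {e}" for e in errors)
    return (
        f"{prompt}\n\nYour previous answer failed validation:\n{bullets}\n"
        "Return corrected JSON only."
    )


def reask(prompt, generate, max_retries=2):
    """Call generate(prompt) until the reply validates or retries run out."""
    errors = []
    for _ in range(max_retries + 1):
        order, errors = validate(generate(prompt))
        if order is not None:
            return order
        prompt = inject_validation_error(prompt, errors)
    raise ValueError(f"no valid order after {max_retries} retries: {errors}")


# Fake model: the first reply has a string bandwidth, the second is valid.
replies = iter([
    '{"customer_name": "John", "bandwidth_mbps": "1Gbps", "address": "123 Main Street"}',
    '{"customer_name": "John", "bandwidth_mbps": 1000, "address": "123 Main Street"}',
])
prompts_seen = []


def fake_generate(prompt):
    prompts_seen.append(prompt)
    return next(replies)


order = reask("Extract the order as JSON.", fake_generate)
print(order["bandwidth_mbps"], len(prompts_seen))
```

The second prompt the fake model receives contains the line "bandwidth_mbps: expected int", which is the point of the loop: the model is told exactly what to fix rather than simply being asked again.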
2. Two good few-shot examples beat a long system prompt
The order_intake.py prompt uses exactly two few-shot examples — not five, not ten. Two, covering the most common ambiguities: a residential FTTH order and a business upgrade. A long system prompt telling the model how to think performed worse than a short one with concrete input → output pairs.
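As a rough illustration of the shape such a prompt can take, the sketch below assembles a short instruction plus two input → output pairs. Everything in it is hypothetical (the SYSTEM text, the two example orders, the field names, and build_prompt); it is not the content of order_intake.py.

```python
import json

# Hypothetical short instruction; the real system prompt is in order_intake.py.
SYSTEM = "Convert the broadband order into JSON. Reply with JSON only."

# Two hypothetical few-shot pairs: a residential FTTH order and a business upgrade.
FEW_SHOT = [
    (
        "New 1Gbps fibre for John at 123 Main Street",
        {"service_type": "FTTH", "bandwidth_mbps": 1000, "address": "123 Main Street"},
    ),
    (
        "Upgrade Acme Ltd at 5 High Street from 100Mbps to 500Mbps",
        {"service_type": "BUSINESS_UPGRADE", "bandwidth_mbps": 500, "address": "5 High Street"},
    ),
]


def build_prompt(order_text):
    """Join the instruction, the worked examples, and the new order."""
    parts = [SYSTEM]
    for text, expected in FEW_SHOT:
        parts.append(f"Input: {text}\nOutput: {json.dumps(expected)}")
    # Leave the final Output empty so the model completes it.
    parts.append(f"Input: {order_text}\nOutput:")
    return "\n\n".join(parts)


prompt = build_prompt("500Mbps fibre for Jane at 9 Oak Road")
print(prompt)
```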
3. Build your eval harness before you tune anything
The evals/ directory contains a 10-example JSONL dataset with expected ServiceOrder outputs. The runner scores two metrics per model:
Schema validity — did it produce parseable JSON matching the Pydantic model?
Field accuracy — did it correctly capture bandwidth, address, service type?
Running this against qwen2.5:7b before touching the prompt gave me a baseline I could actually optimise against. Without the harness, I would have been tuning by gut feel.
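The two metrics can be sketched as a small scoring function. This is an illustration under stated assumptions, not the runner in evals/: the score function, the case tuples, and the field names are hypothetical, and "parses as a JSON object" stands in for full Pydantic validation.

```python
import json


def score(cases):
    """Score (raw_model_output, expected_fields) pairs on the two metrics."""
    valid = matched = total_fields = 0
    for raw_output, expected in cases:
        total_fields += len(expected)
        try:
            predicted = json.loads(raw_output)
        except json.JSONDecodeError:
            continue  # unparseable output earns nothing on either metric
        if not isinstance(predicted, dict):
            continue
        valid += 1
        matched += sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return {
        "schema_validity": valid / len(cases),
        "field_accuracy": matched / total_fields,
    }


EXPECTED = {"bandwidth_mbps": 1000, "service_type": "FTTH"}
CASES = [
    ('{"bandwidth_mbps": 1000, "service_type": "FTTH"}', EXPECTED),  # fully correct
    ('{"bandwidth_mbps": 500, "service_type": "FTTH"}', EXPECTED),   # one wrong field
    ('{"bandwidth_mbps": 1000, "service_type": "FTTH"}', EXPECTED),  # fully correct
    ("Sure! Here is your order: 1Gbps fibre", EXPECTED),             # prose, not JSON
]

result = score(CASES)
print(result)
```

Counting field accuracy over all expected fields, including those of invalid outputs, keeps the two numbers honest: a model that emits prose cannot score well on fields it never produced.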
Here are the actual results from the M1 eval run (make eval, 10 cases, qwen2.5:7b):
| Metric | Result |
| --- | --- |
| Schema Validity | 100% (10/10 cases) |
| Field Accuracy | 100% (9/9 fields per case) |
| Latency p50 | 80.3 s |
| Latency p95 | 121.4 s |
The accuracy exceeded the M1 acceptance criteria (≥90% schema validity, ≥75% field accuracy). The latency is the honest story: 80 seconds p50 on a laptop running a 7B model locally. This is not a production serving latency — it’s a proof-of-concept baseline. Optimising inference speed is out of scope for M1 and is a known trade-off of fully local execution with no GPU acceleration. For the purpose of this milestone, proving the schema contract holds under real inputs matters far more than response time.
Why This Matters for Telecom AI
The repository is structured around a principle I now consider non-negotiable: isolate your schema contracts from everything else before writing a single prompt.
src/telecom_ai/
├── schemas/ # Pydantic models — the TMF641-aligned source of truth
│ ├── service_order.py
│ ├── customer.py
│ ├── product.py
│ └── common.py
├── llm/ # LLM communication layer
│ ├── ollama_client.py # Async Ollama wrapper with retry logic
│ └── structured.py # Reask loop on ValidationError
├── prompts/
│ └── order_intake.py # System prompt + 2 few-shot examples
└── cli.py # `nl-order` CLI entry point
M1’s only job is to prove the schema contract holds. No RAG, no tool calling, no multi-agent workflows — just: can the model reliably emit TMF-compliant JSON?
This is load-bearing for every later milestone. When M2 adds vector retrieval and something breaks, I will know it is the retrieval layer — not the model’s ability to follow the schema. When M3 introduces tool calling, I start from a stable floor.
The instinct to skip this step and go straight to “agentic RAG” is strong. Resist it.
How To
To try it yourself — no API keys, no cloud:
git clone https://github.com/spereir2/telecom-ai-agent-framework
cd telecom-ai-agent-framework
make up # starts Ollama via Docker Compose, pulls qwen2.5:7b
make smoke # runs the canonical nl-order example
make eval # scores all 3 models on the 10-example eval set
To adapt this pattern for your own domain:
1. Define your Pydantic schema first: model it on an industry standard (e.g. TM Forum Open APIs or 3GPP) so it’s immediately meaningful to OSS/BSS integrations.
2. Write two few-shot examples: cover your two most common input variations; don’t over-engineer the prompt.
3. Add the reask loop: catch ValidationError, inject the error message, retry up to 2 times.
4. Build a small eval set: 10 hand-written examples are enough to establish a baseline; grow it as you discover edge cases in production.
In closing
Before you build a telecom AI agent, prove your local model can emit schema-valid, TMF-aligned JSON — with an eval harness to measure it. Everything downstream depends on this floor being solid.
Telecom service fulfilment has always been difficult to reason about in a single frame, because it touches the commercial catalogue (BSS), the network, the operational support stack, the customer’s premises, and the access layer at the same time. The order that the customer places is a commercial artefact — a Product Order. The thing that gets delivered is a coordinated set of configurations on physical and virtual resources, delivered via a Service Order, with corresponding records in billing, inventory, and assurance. Fulfilment is the work of holding all of this together while the underlying domains evolve at different rates.
This post sets out a compact set of foundational principles that make the shape of telecom fulfilment explicit. These principles apply regardless of the specific product (e.g., broadband, mobile, Ethernet, VPN, 5G slice) and the specific stack implementing it. They are not a framework or a process. They are the invariants that a fulfilment design must respect if the resulting service is to be usable, billable, supportable, and observable upon delivery.
The motivation for re-stating these from first principles is hardly academic. Capital expenditure intensity has compressed amid flat connectivity revenues, and a majority of operator leadership now considers the existing operational model unsustainable. Reasoning by analogy — patching the existing eTOM-style process flow with one more incremental project — has produced the technical debt that fulfilment designers spend most of their time working around. Reasoning from first principles starts somewhere else: with the network’s physical and rational constraints and the customer’s specific business intent.
Three Elemental Principles of Service Fulfilment
When stripped to its essence, fulfilment must satisfy three irreducible criteria.
The first is requirement persistence. The customer’s goal must remain the primary driver throughout the service lifecycle, from initial order through to ongoing assurance. A goal captured at order time and discarded at activation produces a service that is technically delivered but operationally unmoored from what the customer asked for.
The second is resource visibility. The system must hold an accurate, real-time source of truth about the available network assets — across access, transport, core, and edge — so that the gap between what records describe and what is actually deployed is closed. Without this, every downstream activity is a best-effort approximation.
The third is autonomous conversion. Translating a high-level intent into a specific technical configuration cannot indefinitely depend on static decomposition rules and manual integration. As products diversify and resource topologies evolve, the conversion must be carried out by intelligent agents that can reason about the network’s live state rather than a frozen snapshot encoded in catalogue mappings.
These three principles point toward “invisible journeys”: fulfilment flows where the customer is shielded from the underlying complexity, and where a single misquote, mismatch, or configuration error does not cascade into a breach of trust or an SLA failure.
From CFS/RFS to Intent-Driven Modelling
The dominant service modelling paradigm to date has been organised around the CFS/RFS hierarchy. The customer-facing service (CFS) is the technology-agnostic capability the customer perceives — for example, low-latency connectivity, a managed VPN, or a voice trunk. The resource-facing service (RFS) is the technology-specific implementation that delivers it — for example, a 5G slice, an HFC access tail, or an MPLS L3VPN instance. The CFS depends on one or more RFSs; the RFSs depend on resources; and the mapping between them is encoded as a set of decomposition rules in the catalogue and the orchestration layer.
This model has served well as a way to demarcate business and technology domains. The mapping from CFS to RFS, however, is fundamentally static. Decomposition rules are authored against known specification pairs, and the task templates that they produce sequence interactions between systems and users along a fixed path. Where the network is hybrid, cloud-native, and continuously evolving, that fixed path becomes the bottleneck rather than the simplification it was meant to be.
The intent-driven paradigm reframes the CFS as a customer intent—a goal expressed in declarative terms—rather than a fixed entry point within a deterministic decomposition chain. The RFS layer is no longer a single mapped descendant of the CFS; it is a graph of candidate sub-intents and resource compositions that are evaluated against live network situations, statistical priors derived from prior outcomes, and well-defined schemas that act as the contract between layers. This is the same shift that allows stateful agentic systems to wrap stateless LLM reasoning with predictable control logic, which the next sections and future posts build on.
The contrast can be summarised as follows.
| Feature | Traditional CFS/RFS Model | Intent-Driven Model |
| --- | --- | --- |
| Control logic | Imperative: how to build | Declarative: what to achieve |
| Abstraction | Static layers | Dynamic, goal-based |
| Lifecycle | Discrete order-to-provision | Continuous active persistence |
| Integration | Manual mapping, hand-authored YANG | Autonomous agent decomposition |
| Response to deviation | Reactive | Proactive, self-healing |
Accordingly, the working definition of intent in this post is consistent with Intent-Based Autonomous Networks: a set of clearly stated objectives and desired outcomes, described in high-level terms without specifying execution steps. The separation of goals from execution is the defining characteristic of the model and the precondition for everything that follows.
Feasibility Precedes Commitment
Before promising service, the provider must confirm that service can actually be delivered. This feasibility check spans technical availability (is there a port, a slot, a licence), capacity (is there enough of each resource along the path), compatibility (does the selected profile work with the customer’s location and equipment), reachability (is the site served by the relevant access technology), and regulatory or policy compliance (e.g., local regulatory approvals, porting eligibility, export controls).
Feasibility is where catalogue ambition meets network reality. A fulfilment stack that commits to an order before these checks pass creates downstream failure modes that are expensive to unwind — customer-visible rejections, truck rolls to unserviceable addresses, and billing records for services that never activated. Accordingly, feasibility must be treated as a pre-commit activity, not a post-commit diagnostic.
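A pre-commit gate of this kind can be sketched as a list of independent checks that must all pass before the order is accepted. The sketch below is illustrative only: the lookup tables, the check functions, and the OrderRequest fields are hypothetical stand-ins for real serviceability and inventory queries, and only three of the checks named above are modelled.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for serviceability and inventory lookups.
ACCESS_TECH_BY_ADDRESS = {"ADDR-1": "FTTH", "ADDR-2": "FTTH"}
FREE_PORTS_BY_ADDRESS = {"ADDR-1": 2, "ADDR-2": 0}
MAX_MBPS_BY_TECH = {"FTTH": 1000}


@dataclass(frozen=True)
class OrderRequest:
    address_id: str
    bandwidth_mbps: int


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool


def check_reachability(order):
    # Is the site served by any access technology at all?
    return CheckResult("reachability", order.address_id in ACCESS_TECH_BY_ADDRESS)


def check_availability(order):
    # Is there at least one free port for this address?
    return CheckResult("availability", FREE_PORTS_BY_ADDRESS.get(order.address_id, 0) > 0)


def check_capacity(order):
    # Can the access technology carry the requested bandwidth?
    tech = ACCESS_TECH_BY_ADDRESS.get(order.address_id)
    return CheckResult("capacity", order.bandwidth_mbps <= MAX_MBPS_BY_TECH.get(tech, 0))


CHECKS = (check_reachability, check_availability, check_capacity)


def commit(order):
    """Run every check first; commit only if none of them failed."""
    failed = [r.name for r in (check(order) for check in CHECKS) if not r.passed]
    if failed:
        return "REJECTED: " + ", ".join(failed)
    return "COMMITTED"


print(commit(OrderRequest("ADDR-1", 1000)))
print(commit(OrderRequest("ADDR-2", 1000)))
```

Running all checks rather than stopping at the first failure gives the order-capture channel the full list of reasons in one pass.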
The harder version of this problem is the gap between as-planned and as-built. The records held in inventory describe what was planned to be deployed, while the on-site reality reflects what was actually deployed, often with differences that accumulate over years of patching, upgrades, and incident-driven changes. Fulfilment that relies on as-planned records as its source of truth will repeatedly commit to orders that the network cannot, in fact, accept, and the failures will surface during physical activation rather than at order capture. Closing this gap is increasingly addressed through digital twins of the access network and the tower estate, and through proactive on-site validation that updates inventory against observed reality before commitments are made against it.
End-to-End Coherence among Domains
Activation must preserve end-to-end coherence. A service is only fulfilled when the customer record, the billing configuration, the service inventory, the network configuration, the device state, and the assurance visibility are all harmonised around a single view of what was delivered. Partial fulfilment — where the circuit is live but billing never activated, or where inventory says one thing and the device says another — is indistinguishable from failure from the operator’s perspective, because the first incident that touches the misaligned record will expose it.
Such coherence has to be built across a multi-domain network. Fulfilment coordinates across access, transport, core, OSS/BSS, field operations, partner networks, and cloud or platform domains, each having its own change cadence, its own interfaces, and its own source of truth. No single domain owns the service. Accordingly, the fulfilment layer is responsible for imposing a coherent commitment across domains that would otherwise drift independently.
Service quality is designed into this coordination, not bolted on. The intended SLA, bandwidth, latency, resilience, priority, and security posture are established at fulfilment time through the specific profiles, paths, priorities, and policies that are configured. A service that is merely connected is not the same as one that meets its intended quality envelope, and the distinction is determined at delivery rather than in operation.
Assurance Begins During Fulfilment
Assurance is not a separate phase that follows fulfilment. Testing, telemetry, alarms, monitoring hooks, and service observability must be enabled as part of delivery, with the configured state registered in the systems that will later detect faults against it. A service that activates without the corresponding assurance scaffolding is observably connected but operationally invisible, which means faults either go undetected or surface first through the customer. The fulfilment flow owns the enablement of these hooks.
Customer premises and field work may be part of this control loop. Telecom fulfilment often includes appointments, installation, device shipment, number porting, site access, and physical activation. These steps are slower, more failure-prone, and less automatable than software configuration, and they frequently gate the completion of otherwise-ready service state. Fulfilment design has to include these steps as first-class activities, with their own commitments, dependencies, and observable outcomes — not as exceptions to the automated path.
Agentic AI as the Engine of Autonomous Fulfilment
The shift from script-based automation to agentic AI is the most consequential operational change of the current cycle. Automation in the traditional sense is incremental and imperative: it replaces a human-typed command with a templated one, but the workflow itself remains static. Agentic AI describes systems that can reason about an objective, plan a chain of actions to meet it, and interact with the environment to close the loop, with limited human intervention. In a fulfilment context, this maps directly onto the autonomous-conversion principle introduced earlier.
The architecture of agentic operations places autonomous agents at the centre of the operational loop. Such agents decompose intent via a structured inference chain rather than a direct mapping. Given a goal statement, the agent first applies Chain-of-Thought reasoning to resolve the four dimensions of every intent: the service type and topology (the what), the endpoints (mapping operator shorthand like “the Frankfurt hub” to system resource identifiers), the SLA parameters (converting qualitative descriptors like “high capacity” or “super stable” into quantified thresholds such as bandwidth_mbps: 10000 or availability_class: PLATINUM_99_999), and any path or regulatory constraints. Only after this reasoning pass does the agent commit to a structured technical output—a typed schema that serves as the executable instruction flowing into downstream network controllers. Critically, the model is also primed with industry-specific few-shot examples that anchor its inference to telecom-professional logic rather than generic patterns, and where the goal statement is under-specified or technically contradictory, the agent surfaces a clarification request rather than guessing.
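The output side of that reasoning pass can be illustrated with a small typed schema and an explicit clarification path. In the sketch below, plain keyword lookup stands in for the LLM's Chain-of-Thought step, so it shows only the contract, not the reasoning. The lookup tables, the SITE-FRA-01 identifier, ResolvedIntent, and ClarificationNeeded are all hypothetical; the bandwidth_mbps and availability_class values are the ones quoted above.

```python
from dataclasses import dataclass

# Hypothetical mapping from operator shorthand to system resource identifiers.
ENDPOINT_ALIASES = {"the frankfurt hub": "SITE-FRA-01"}

# Qualitative descriptors mapped to quantified SLA parameters.
SLA_DESCRIPTORS = {
    "high capacity": ("bandwidth_mbps", 10000),
    "super stable": ("availability_class", "PLATINUM_99_999"),
}


class ClarificationNeeded(Exception):
    """Raised when the goal statement is under-specified."""


@dataclass(frozen=True)
class ResolvedIntent:
    endpoint_id: str
    bandwidth_mbps: int
    availability_class: str


def resolve(goal):
    """Resolve a goal statement into a typed intent, or ask for clarification."""
    text = goal.lower()
    endpoints = [rid for alias, rid in ENDPOINT_ALIASES.items() if alias in text]
    params = dict(pair for phrase, pair in SLA_DESCRIPTORS.items() if phrase in text)
    missing = [] if endpoints else ["endpoint"]
    missing += [f for f in ("bandwidth_mbps", "availability_class") if f not in params]
    if missing:
        # Surface the gap instead of guessing a value.
        raise ClarificationNeeded("please specify: " + ", ".join(missing))
    return ResolvedIntent(endpoint_id=endpoints[0], **params)


intent = resolve("High capacity, super stable link to the Frankfurt hub")
print(intent)
```

A goal such as "a link to the Frankfurt hub" raises ClarificationNeeded naming the two missing SLA parameters, which mirrors the behaviour described above for under-specified intents.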
This unified loop replaces the historical “throw it over the wall” pattern between BSS and OSS. The system senses the current network state, decides on the configuration that best satisfies the intent under live conditions, and acts by deploying the service and registering the corresponding assurance hooks. Order fallout decreases because brittle hand-offs between segmented data models are replaced with a shared model that the agents reason against directly.
The capability picture can be summarised as follows.
| Agent capability | Fulfilment impact | Operational value |
| --- | --- | --- |
| LLM-led intent reasoning | Translates declarative goals into deterministic configurations | Simplifies B2B and enterprise ordering |
| Autonomous IaC generation | Removes the manual CLI / scripting step | Reduces deployment time and configuration errors |
| Closed-loop assurance | Detects and remediates degradation against the original intent | Preserves SLA compliance and customer trust |
| Agentic BSS interaction | Personalises offer composition and rating against live capability | Improves monetisation of differentiated 5G and slicing features |
Change as State Transition
In telecom, change and fulfilment are the same class of problem. New installs, modify, suspend, resume, disconnect, and migrate are all state transitions on a service instance, differing in the starting state, the target state, and the resources that must be added, altered, or released. Treating these as distinct workflows produces redundant logic and inconsistent behaviour across the portfolio; treating them as transitions over a common service-instance model keeps inventory, billing, and configuration aligned across the full lifecycle. This is the lifecycle that the agentic loop operates on, not a separate set of provisioning paths.
While manual steps remain when fieldwork and regulated processes are involved, the dominant execution path must be automated for the economics and error rates to work at modern volumes. The state-transition framing makes this automation tractable, because each transition is defined as a delta against a known service-instance state rather than as a bespoke procedure.
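The state-transition framing can be sketched as a single transition table over one service-instance model. The state names and the table below are hypothetical and deliberately small (migrate is omitted); the point is that every order type is a lookup on (action, current state) rather than a separate workflow.

```python
from enum import Enum


class State(Enum):
    DESIGNED = "designed"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


# Hypothetical transition table: (action, current state) -> target state.
TRANSITIONS = {
    ("install", State.DESIGNED): State.ACTIVE,
    ("modify", State.ACTIVE): State.ACTIVE,
    ("suspend", State.ACTIVE): State.SUSPENDED,
    ("resume", State.SUSPENDED): State.ACTIVE,
    ("disconnect", State.ACTIVE): State.TERMINATED,
    ("disconnect", State.SUSPENDED): State.TERMINATED,
}


def apply(state, action):
    """Return the target state, or reject an action that is invalid from here."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed from {state.name}") from None


# Walk one service instance through its lifecycle.
state = State.DESIGNED
for action in ("install", "suspend", "resume", "disconnect"):
    state = apply(state, action)
print(state)
```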
A Pragmatic Path – Modular Modernisation
The principles above describe the destination, not the migration. The migration question (how to get from the current OSS/BSS estate to a model that supports intent, agentic operations, and end-to-end coherence) is where most fulfilment programmes fail. The recurring failure mode is the rip-and-replace programme that promises a unified stack but stalls under integration load before the first revenue benefit lands.
The pragmatic alternative is modular modernisation. Within the TM Forum Open Digital Architecture (ODA), fulfilment is decomposed into functional blocks (Core Commerce Management, Production, Party Management, and adjacent domains), each with defined responsibilities and standardised Open APIs (e.g., TMF622 for product ordering, TMF641 for service ordering, TMF639 for resource inventory, TMF620 for product catalogue). Modular modernisation upgrades one or two high-impact blocks at a time, against the same external API surface, so that integration debt is paid down incrementally rather than capitalised into a multi-year programme.
Two caveats apply. The Open APIs standardise the most common system functions, but advanced and operator-specific behaviours frequently remain exposed through proprietary interfaces, so conformance does not equate to interoperability. Real-life integrations still require customisation against generic attribute definitions, and the published conformance lists indicate that very few organisations are certified against more than a handful of APIs. Accordingly, modular modernisation is most effective when paired with a deliberate choice about which API surfaces to standardise on and which domains to keep proprietary, rather than a blanket obligation to the entire ODA catalogue.
The blocks that consistently produce the highest early return are the product catalogue (because it constrains everything downstream), the order management surface (because it is where intent first becomes commitment), and the resource inventory (because it is the source of truth that feasibility, assurance, and agentic decision-making all depend on).
Service fulfilment is complete only when the service is usable, billable, supportable, and observable. A delivered service that fails any of these four is not a partial success; it is a fulfilment defect that will surface later as a dispute, an escalation, or a blind spot. The circuit may work while billing is broken. The inventory may be correct, while assurance has no hooks. The customer record may remain consistent even when the device is misconfigured. Each of these combinations is a fulfilment failure regardless of whether traffic flows.
This is the checklist that the entire chain of principles ultimately serves. Requirement persistence, resource visibility, autonomous conversion, intent-driven modelling, feasibility, coherence, assurance enablement, agentic execution, state transitions, and modular modernisation are the mechanisms; usable, billable, supportable, and observable are the acceptance criteria.
Closing
These principles describe the shape of the problem rather than the shape of any given solution. Different operators will land on different orchestration stacks, different catalogue models, and different degrees of agentic autonomy, but the invariants above are what those designs have to satisfy. Subsequent posts will expand upon specific areas, including the boundary between intent and decomposition, the feasibility-check patterns that close the as-planned versus as-built gap, the reference architecture for an agentic provisioning–assurance loop, and the practical sequencing of a modular ODA transformation.
Introducing telecommunications products and services to the market has always been challenging, as the technical realisation of services in the network tends to permeate the commercial definition and perception of the products and related services.
Network resources such as routers, switches, VNFs, CNFs, and network services (e.g., Layer 2 and Layer 3 VPNs and 5G slices) are tangible. In contrast, marketing definitions of product offerings, with related commercial implications for bundling, pricing of rate plans, and end-user devices, are intangible.
Additionally, the lifecycle of products and services sold to customers differs from that of network resources and services. Commercial reasons for product changes may include sales periods, new end-user device launches, and competitors’ products and services. Conversely, technical solutions that build connectivity (access, aggregation, and core) can evolve as standards evolve (e.g., 3GPP), vendor equipment adopts new standards, and services are migrated to new equipment or virtualised resources that provide equipment functions and services (e.g., as in 5G core evolution).
This requires separating commercial concerns from technical concerns. Additionally, the implementation of the network and network services, using underlying multi-vendor, domain-specific ecosystems that now include network virtualisation, involves a degree of abstraction to manage the complexity of service delivery.
Decoupling commercial and technical changes
One way to achieve agility in the face of change is to specify products together with the knowledge of how they translate into services that the customer understands, and of how product orders translate into service orders. These product specifications are then related to the technical specifications that represent the resources that allocate and provide the capacity to deliver those services in the network.
Thus, the modelling approach adopted by TM Forum and, to some extent, the MEF is to separate the Specification of an Entity from the Entity that is instantiated. This is one of the four patterns discussed in this post. Given my additional work in enterprise architecture, I find it helpful to compare these patterns with Domain-Driven Design patterns.
The Specification / Entity Pattern
The Specification/Entity Pattern is different from the Specification Pattern in Domain-Driven Design (DDD). The TM Forum EntitySpecification–Entity pattern is a type/instance configuration pattern, whereas the DDD Specification pattern is a business rule predicate pattern; they answer different questions and are used in different layers.
TM Forum Specification/Entity Pattern
Intent: Model “type versus instance” and make entity types configurable at runtime. ‘EntitySpecification’ is a template/configuration profile; ‘Entity’ (or ‘ManagedEntity’) is a concrete instance that “conforms to” that spec.
Structure: ‘EntitySpecification’ holds characteristics, default values, and often an ‘entitySchema’ that defines the structure of instances (example from TM Forum). Entity instances carry values for those characteristics and reference exactly one (or sometimes zero or one) specification.
Example use cases: Situations where new “types” must be added or extended without changing the code model, using characteristics and schemas for dynamic attributes.
This pattern is used when there is a need for a catalog of types and a runtime-configurable data model: catalog-driven products, custom entities, configuration profiles, etc.
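A minimal sketch of the type/instance split follows. It is not the TM Forum data model: CharacteristicSpec, the instantiate method, and the broadband example are hypothetical simplifications meant only to show a new "type" being defined as data at runtime and an instance being checked against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacteristicSpec:
    name: str
    value_type: type
    default: object = None


@dataclass(frozen=True)
class EntitySpecification:
    """The 'type': a named template listing the allowed characteristics."""
    name: str
    characteristics: tuple

    def instantiate(self, **values):
        known = {c.name for c in self.characteristics}
        unknown = set(values) - known
        if unknown:
            raise ValueError(f"unknown characteristics: {sorted(unknown)}")
        resolved = {}
        for c in self.characteristics:
            value = values.get(c.name, c.default)  # fall back to the spec default
            if not isinstance(value, c.value_type):
                raise TypeError(f"{c.name} must be {c.value_type.__name__}")
            resolved[c.name] = value
        return Entity(spec=self, values=resolved)


@dataclass(frozen=True)
class Entity:
    """The 'instance': concrete values that conform to exactly one spec."""
    spec: EntitySpecification
    values: dict


# A new type is added as data, without changing any class definitions.
broadband_spec = EntitySpecification(
    "Broadband",
    (
        CharacteristicSpec("downlink_mbps", int, 100),
        CharacteristicSpec("technology", str, "FTTH"),
    ),
)
entity = broadband_spec.instantiate(downlink_mbps=1000)
print(entity.values)
```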
DDD Specification pattern
Intent: This is a behavioural design pattern used to encapsulate business rules (criteria predicates) that objects (or functions) must satisfy.
Structure: A Specification is a predicate over an entity (e.g. OrderIsApproved, CustomerIsCreditworthy). Specifications can be composed with and, or, not into larger rules.
Example use cases: Complex validation/business rules that do not naturally belong inside a single entity. Reusable filters for querying (e.g. passing Specifications to repositories).
This pattern is used when there is a need for declarative, composable business rules that can be reused in validation, decision logic, and querying.
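For contrast, the DDD Specification pattern can be sketched as a composable predicate. The Specification class, the Order fields, and the credit-score threshold below are hypothetical; the two rule names follow the examples given above.

```python
from dataclasses import dataclass


class Specification:
    """A predicate over a candidate object, composable with &, | and ~."""

    def __init__(self, predicate):
        self._predicate = predicate

    def is_satisfied_by(self, candidate):
        return self._predicate(candidate)

    def __and__(self, other):
        return Specification(lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c))

    def __or__(self, other):
        return Specification(lambda c: self.is_satisfied_by(c) or other.is_satisfied_by(c))

    def __invert__(self):
        return Specification(lambda c: not self.is_satisfied_by(c))


@dataclass(frozen=True)
class Order:
    approved: bool
    customer_credit_score: int


# Rules live outside the entity and can be reused for validation or filtering.
order_is_approved = Specification(lambda o: o.approved)
customer_is_creditworthy = Specification(lambda o: o.customer_credit_score >= 600)  # assumed threshold
can_fulfil = order_is_approved & customer_is_creditworthy

orders = [Order(True, 700), Order(True, 500), Order(False, 800)]
fulfillable = [o for o in orders if can_fulfil.is_satisfied_by(o)]
print(fulfillable)
```

Here the specification tests an object that already exists, whereas in the TM Forum pattern the specification is the template an entity is created from.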
Comparison
| Aspect | Specification Pattern | Specification/Entity Pattern |
| --- | --- | --- |
| Main Purpose | Encapsulate business rules / processes | Separate type definition from instances |
| Usage | Filtering, validation, querying | Catalogs, dynamic schemas, extensible types |
| Specification Role | Predicate logic (isSatisfiedBy) | Template/schema for entity creation |
| Entity Role | Object being tested | Instance of a specification |
| Composability | Combine specifications (AND/OR/NOT) | Extend/compose schemas and relationships |
| Typical Domain | Domain-driven design, business logic | Catalog-driven systems, TM Forum SID |
Summary
Specification Pattern (Wikipedia):
Focuses on expressing and combining business rules.
Used for querying/filtering objects based on criteria.
Specification/Entity Pattern (Catalog):
Focuses on defining types and instantiating entities based on those types.
These notes aim to guide the specification, design, and building of an interface between NetCo and ServCo. A NetCo is primarily responsible for managing physical infrastructure, such as fibre-optic cables. Consequently, the Physical Network Inventory (PNI) system and application must accurately represent physical network elements (which are mostly passive) both logically and spatially. The ServCo, on the other hand, centres on customer services and product offerings, enabling mobile and Internet product and service access for end users.
A PNI application and system must integrate with Geographic Information Systems (GIS) and with Production systems (in the TM Forum ODA sense) that are part of the ServCo. While GIS gather, manage, and analyze spatial and geographic data, TM Forum ODA Production systems are responsible for fulfilling telecommunication services.
A key challenge in fibre network management is the lack of open standards for inventory data exchange. This necessitates that Communication Service Providers (CSPs) and Digital Service Providers (DSPs) specify and develop custom integrations between the PNI system and the Production System, specifically the Logical Network Inventory (LNI) system and the Service Fulfillment system. Some level of decoupling between systems, particularly when provided by different vendors, can be achieved using a REST architecture or SOAP interface.
Constraints on specifying, designing and building the interface
The specification, design, and building of this interface are subject to specific constraints, particularly concerning communication, coordination, and consistency. These concepts, as covered in “Fundamentals of Software Architecture” by Mark Richards and Neal Ford, are critical to understanding the design choices made.
Communication: The chosen communication method had to be synchronous. Implementing and productionising queuing and messaging frameworks or mediations between existing systems was not feasible within the project’s scope.
Coordination: The LNI and Service Fulfillment components already orchestrate the provisioning of Resource Facing Services for various transport technologies (e.g., DWDM, MPLS) and enterprise services (e.g., Carrier Ethernet, SD-WAN). This existing orchestration relies on APIs for synchronous request-response interactions, rather than the choreography seen in Intent-Based Networking where autonomous systems communicate collaboratively to fulfill an order.
Consistency: Given the synchronous communication and the coordinated nature of orders between the two systems, it is important for transactions to operate atomically. This ensures near real-time consistency in the lifecycles and operational states of Optical Fibres and the services they transport.
Fulfillment and Orchestration Logic, wherever possible, requires automated decisions for path-finding in both the physical and logical networks, along with corresponding resource allocations in PNI and LNI. This highlights the complex handoffs between the NetCo and ServCo domains. Visualizing the flow within the interface contract is essential to surface the intricate dependencies for completing specific flows that consume dedicated API endpoints to achieve the above.
Data and Information Modelling
This section outlines the critical aspects of data and information modeling required for effective interface specification within the ServCo-NetCo domain. These elements are essential for enabling effective system interactions.
Physical and Resource Logical Relationship Mapping: This involves defining and mapping the relationships between physical network assets and their logical representations. A clear understanding of these relationships is fundamental for network planning, provisioning, and fault management.
Location and Cost Modeling for Path-Finding: Accurate location and cost models are vital for efficient path-finding algorithms. These models facilitate the identification of optimal routes for services, considering factors such as latency, bandwidth, and cost to route.
Data Master-ship: A clear definition of data master-ship is essential. This specifies which system is the authoritative source for each shared entity, preventing data inconsistencies and conflicts across the integrated environment.
Fulfilment and Orchestration Logic
Fulfillment and Orchestration Logic requires, where possible, automated decisions for path-finding in the physical network and correspondingly in the logical network, and related resource allocations in PNI and LNI.
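As a minimal sketch of what an automated path-finding decision over a cost model could look like, the snippet below runs Dijkstra's algorithm on a small directed graph. The node names, link costs, and the `cheapest_path` helper are hypothetical and are not part of the interface described in this post; a real cost model would likely blend distance, latency, and commercial cost.

```python
import heapq

def cheapest_path(links, source, target):
    """Return (total_cost, [nodes]) for the cheapest route, or None if unreachable.

    links: dict mapping node -> list of (neighbour, cost) tuples (directed).
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in links.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None

# Hypothetical physical topology: exchange, cabinets, customer site.
links = {
    "EXCH-A": [("CAB-1", 4), ("CAB-2", 2)],
    "CAB-1": [("SITE-X", 3)],
    "CAB-2": [("CAB-1", 1), ("SITE-X", 7)],
}
result = cheapest_path(links, "EXCH-A", "SITE-X")
# result == (6, ["EXCH-A", "CAB-2", "CAB-1", "SITE-X"])
```

The same routine could in principle be run once over the physical inventory (PNI) and once over the logical inventory (LNI), with resource allocation following only if both return a path.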
Error and Fallout Handling
Designing comprehensive error handling strategies, including specific retries, circuit breaking, and backpressure expectations, is essential for managing unexpected failures.
1. Communication Errors: These can arise from network issues or misconfigured endpoints. Handling involves implementing retries, timeouts, and backoff strategies to manage transient failures.
2. Data Consistency Errors: Inconsistent data between systems can lead to incorrect operations. Ensuring atomic transactions and using techniques like idempotency can help maintain consistency.
3. Authorization and Authentication Errors: Incorrect handling of security tokens or permissions can prevent access. Implementing robust authentication mechanisms like OAuth2 and ensuring proper scope management can mitigate these issues.
4. Resource Allocation Errors: Failures in finding or reserving resources can occur. Implementing fallback mechanisms and pre-order planning can help manage resource-related errors.
5. Orchestration and Coordination Errors: Errors in the sequence of operations can disrupt service fulfillment. Using sequence diagrams and ensuring proper orchestration logic can help prevent these errors.
6. Observability and Operability: Implementing correlation IDs, audit events, and metrics can help in diagnosing and resolving errors quickly by providing insights into system operations.
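The retry, backoff, and idempotency points above can be sketched together. This is an illustrative pattern only: `TransientError`, `call_with_retries`, and `flaky_reserve` are assumed names, and the idempotency key is simply a UUID reused across attempts so that the receiving system could deduplicate a retried request.

```python
import time
import uuid

class TransientError(Exception):
    """Stand-in for a timeout or connection failure (hypothetical)."""

def call_with_retries(operation, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Call operation(idempotency_key), retrying transient failures with backoff."""
    idempotency_key = str(uuid.uuid4())  # the same key is sent on every attempt
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(idempotency_key)
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure as fallout
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff

# Usage: a fake reservation call that fails twice, then succeeds.
calls = []

def flaky_reserve(key):
    calls.append(key)
    if len(calls) < 3:
        raise TransientError("timeout")
    return {"status": "reserved", "idempotency_key": key}

outcome = call_with_retries(flaky_reserve, sleep=lambda _: None)
```

Only errors classed as transient are retried here; authorization or validation failures would normally be returned to the caller immediately, since repeating them cannot succeed.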
Key Considerations for Robust Interface Design
Beyond the core logic, several other critical aspects contribute to a robust and maintainable interface:
Diverse Fibred Paths: This refers to having multiple routes for data to travel through the network. The idea is to ensure that if one path fails (due to damage or congestion), others are available to maintain communication. Paths can be physical, as in Optical Fibres, and logical, as in service circuits that carry Ethernet and IP traffic.
Security and Compliance: This includes defining authentication and authorization (e.g., OAuth2/bearer tokens), scopes, token handling, and considerations for Personally Identifiable Information (PII).
Observability and Operability: This covers correlation IDs, audit events, metrics, logs, and traces aligned to Service Level Indicators (SLIs) and Service Level Objectives (SLOs), along with documented dashboards and alerts.
Lifecycle and Governance: This encompasses change management and history, ensuring proper control and tracking of interface evolution.
Ultimately, the integration option selected depends on multiple factors, both existing and emerging, that surface as requirements are clarified and implemented. This post has outlined some points to consider when designing and building such an interface.
Intent-based networking enhances the operationalisation of network services by translating high-level business intents into dynamic network configurations. Traditional networks enable services for customers through imperative commands. The service’s lifecycle follows a typical provisioning order workflow, where orders create, modify, or delete services. Each order is created and modified based on the necessary rework required for aspects of service provisioning parameters.
Intent-based networking provides services for customers through declarative commands that achieve the desired state of the network, supporting the intended use by customers. The service lifecycle is now supplanted by the lifecycle of the intent, which will last as long as the goals for the customer’s intended use of the network remain relevant to their objectives for the consuming service.
Expectations of an intent-based network
Intent-based network design augments automated networks to adapt rapidly to changing demands. Such a design assumes that some level of automation is already achieved in the networks and that the network demonstrates a level of elasticity to scale and accommodate fluctuating demands.
An intent-based network ensures that it consistently meets the desired service levels and quality of service (QoS) parameters defined by the intent. The QoS here would indicate how close the observed network state is to the desired state as defined by the intent.
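One way to make "how close the observed network state is to the desired state" concrete is a per-metric compliance check. The sketch below is an assumption for illustration: the metric names, thresholds, and the `intent_compliance` function are invented here and do not come from any standard.

```python
def intent_compliance(targets, observed):
    """Return the fraction of intent targets currently met (0.0 to 1.0).

    targets: metric -> (comparison, threshold), where comparison is "max" or "min".
    observed: metric -> measured value. Metrics with no measurement count as not met.
    """
    met = 0
    for metric, (comparison, threshold) in targets.items():
        value = observed.get(metric)
        if value is None:
            continue
        if comparison == "max" and value <= threshold:
            met += 1
        elif comparison == "min" and value >= threshold:
            met += 1
    return met / len(targets) if targets else 1.0

# Hypothetical intent targets and telemetry readings.
targets = {
    "latency_ms": ("max", 20),
    "bandwidth_mbps": ("min", 1000),
    "packet_loss_pct": ("max", 0.1),
    "availability_pct": ("min", 99.9),
}
observed = {
    "latency_ms": 18,
    "bandwidth_mbps": 940,
    "packet_loss_pct": 0.05,
    "availability_pct": 99.95,
}
score = intent_compliance(targets, observed)  # 0.75: bandwidth target is missed
```

A score below 1.0 would be the trigger for the network to adjust itself, rather than for someone to raise a modify order.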
Expectations of customers of intent-based services
As network functions shift to cloud-based services, more dynamic and flexible network provisioning must interwork with changing customer service demands. Customers demand faster and customised services with zero wait time to consume them. Additionally, the security of the network and services is maintained with every change occurring dynamically in the network.
Characteristics of an intent-based network
Intent-based networking uses Dynamic Network Resource Allocation (DNRA) to leverage such automation and AI to optimise network resource consumption based on high-level business intents.
At a high level, the following bounded contexts are required to design and implement an intent-based network, either as a layered architecture or a microservices / service-based architecture.
Analytics and Monitoring
Provides real-time insights into network performance.
Uses telemetry data to inform decision-making.
AI and Machine Learning
Predicts network demands and optimises resource allocation.
Learns from historical data to improve future allocations.
Intent Engine and related Management and Orchestration
Interprets high-level intents and translates them into network policies.
Continuously monitors and updates policies based on feedback and analytics.
Automation
Automates configuration changes and resource adjustments.
Continuously monitors and updates policies based on feedback and analytics.
Network state reflects the configured behaviour for intent-based networking.
This is an in-depth topic that will be covered in future posts.
Monitoring and security are maintained at all layers for intent-based networking.
This is an in-depth topic that will be covered in future posts.
Use of Digital Twins
Digital twins are increasingly used in intent-based networking (IBN) to enhance network management and optimisation. More on this topic in a later post.
Defining and using Domain Specific Languages (DSLs) for Intent-based Networking
Domain-Specific Languages (DSLs) for IBN are specified to express intents in a human-readable and machine-executable way.
Key features include:
High-Level Abstractions
Declarative Syntax
Policy Definition
Description of Network Topology and the relationships between network elements (e.g., protected, diverse)
Declarations for validating intents against the current network state and policies
Declarations for closing feedback loops that monitor network state and adjust configurations as needed.
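To illustrate a few of these features (declarative syntax, topology relationships, and validation against network state and policy), here is a small sketch that embeds an intent "language" in Python data classes. The `Intent` fields, the `validate` function, and the site identifiers are all hypothetical; a real DSL would have its own grammar and a much richer model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Intent:
    """Declarative statement of what the customer needs, not how to build it."""
    name: str
    endpoints: tuple            # (a_end, z_end) site identifiers
    min_bandwidth_mbps: int
    max_latency_ms: float
    relationship: str = "unprotected"   # e.g. "protected" or "diverse"

def validate(intent, known_sites, policy_max_bandwidth_mbps):
    """Return a list of problems; an empty list means the intent is acceptable."""
    problems = []
    for site in intent.endpoints:
        if site not in known_sites:
            problems.append(f"unknown site: {site}")
    if intent.min_bandwidth_mbps > policy_max_bandwidth_mbps:
        problems.append("bandwidth exceeds policy limit")
    if intent.relationship not in {"unprotected", "protected", "diverse"}:
        problems.append(f"unsupported relationship: {intent.relationship}")
    return problems

intent = Intent("branch-to-dc", ("LON-01", "MAN-07"), 1000, 15.0, "diverse")
problems = validate(intent, known_sites={"LON-01", "LON-02"},
                    policy_max_bandwidth_mbps=10_000)
# problems == ["unknown site: MAN-07"]
```

Note that the intent carries no device names, ports, or commands: those are the concern of the layers that translate it into configuration.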
Challenges with designing and maintaining Intent-based Networks
Intent-based networking is not without its challenges.
Complexity with closed loops at multiple layers – Business, Service, Technology
Achieving closed-loop automation, a key goal of Autonomous Networks (AN), relies on the network’s ability to translate intents into configuration actions, monitor the outcomes of those actions, and make necessary adjustments to ensure intent fulfilment. Such closed-loop automation requires advanced monitoring, analytics, and AI/ML capabilities at multiple layers to enable the network to learn and adapt to dynamic conditions.
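Stripped of the analytics and AI/ML, the translate, monitor, and adjust cycle described above reduces to a reconciliation loop. The sketch below is deliberately simplified to a single layer and a dictionary standing in for the network; `reconcile` and its parameters are assumed names for illustration.

```python
def reconcile(desired, observe, apply, max_iterations=5):
    """Drive observed state toward desired state.

    Returns the iteration on which no drift was found, or None if the loop
    never converged within max_iterations.
    """
    for iteration in range(1, max_iterations + 1):
        current = observe()
        drift = {key: value for key, value in desired.items()
                 if current.get(key) != value}
        if not drift:
            return iteration
        apply(drift)  # push only the settings that differ
    return None

# A dictionary stands in for the real network state.
network = {"vlan": 100, "bandwidth_mbps": 500}
desired = {"vlan": 100, "bandwidth_mbps": 1000}
iterations = reconcile(desired, observe=lambda: dict(network), apply=network.update)
```

The complexity the heading refers to arises when several such loops run at the business, service, and technology layers at once, each changing the state that the others observe.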
Hidden side effects of closed loops – hidden commands and hidden states
Developing robust mechanisms for expressing intents unambiguously and enabling the network to interpret those intents accurately is essential for network stability and interoperability.
Advances in natural language processing, standardisation of intent models, and potentially domain-specific ontologies would benefit such defined expressions of intent and ensure shared understanding between users and the network.
Security and trust in intent-based systems
As network operations evolve towards greater autonomy, ensuring the security and trustworthiness of intent-based interactions should also grow in parallel. Security in intent-based networks includes protecting intent expressions from unauthorised modification, verifying the authenticity of intent sources, and implementing protections to prevent malicious or unintended consequences from automated actions.
International voice services are still relevant to telecommunications as they play a key role in connecting people and businesses across borders. Millions of calls are made daily as such services facilitate frictionless communication. The premise of such services is the termination of voice calls at the correct destination in any part of the world that has telephony.
And as the margin for such services falls each year due to advances in technology, such as VoIP and 5G, among other factors, it is essential to have finely tuned business models and processes for the three Ts: Trading, Trunking, and Technology. While over-the-top (OTT) services continue to grow in terms of minutes of voice traffic, carrier services still terminate large volumes of international voice traffic for business and social calls. So the challenge is offering such services with improved QoS and improved efficiencies across all three Ts. The following are some of my learnings on this topic.
Trunking in International Voice Services
Consumer Voice vs Wholesale Voice
Voice traffic is broadly classified as wholesale and retail. Wholesale voice traffic refers to large volumes of calls terminating at destinations in a carrier’s network. Retail voice is largely carrier-specific, and consumer voice traffic is routed directly by the service provider to the destination, usually bundled with some form of data service.
Voice Termination in Destination Networks
This is the final step in a carrier’s call processing and delivery ecosystem for international carrier voice. The high-level steps are as follows:
Call Origination: A call starts when initiated by a caller and is routed through the caller’s local telecom provider’s network.
International Transit: The call is transported across borders using physical transport technologies like submarine cables, satellite links, and protocols for VoIP like SIP and RTP.
Call Routing: The call is routed across borders by international carriers selecting paths based on efficiency, cost, and agreements, using interconnect solutions that include private and dedicated IP links, public IP networks, third-party private IP networks, and hubbing services.
Call Termination: The call reaches the recipient’s local carrier, ensuring delivery with quality and minimal latency.
Technology in International Voice Service
High-quality voice transmission is essential, requiring technologies to minimize latency, jitter, and packet loss. In addition to these network functions, the following section describes control plane functions involving decision-making and policy application, and data plane functions representing data generated from call activities.
Routing Technology
Least Cost Routing: As explained in detail on Wikipedia, least-cost routing (LCR) in telecommunications is a method used by carriers to select the most cost-effective paths for outbound communication traffic. This process involves analyzing various routing options based on cost, quality, and capacity. LCR teams within telecom companies regularly evaluate and choose routes from multiple carriers, which can be done manually or automated through least-cost router software.
The LCR Process is also explained on Wikipedia. An example of the LCR team cycle in a carrier is as follows.
Buyers negotiate new price schedules with suppliers.
Prices are loaded into software for termination cost calculations and comparisons.
A route is selected, establishing a cost-for-pricing, and new prices are issued.
New routes are implemented on the switch; traffic volumes and margins are monitored via billing system reports.
Loss-making traffic and odd routing are investigated; billing data is corrected, or action is taken on routing and pricing as needed.
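The route selection step in this cycle can be sketched as "cheapest offer that still meets a quality floor". The supplier names and rates below are made up, and using answer-seizure ratio (ASR) as the only quality measure is a simplifying assumption; `select_route` is a hypothetical helper, not any vendor's LCR engine.

```python
def select_route(offers, min_asr=0.0):
    """Pick the cheapest supplier offer whose quality meets the floor.

    offers: list of dicts with "supplier", "rate_per_min", and "asr"
            (answer-seizure ratio between 0 and 1).
    Returns the chosen offer, or None if no offer qualifies.
    """
    eligible = [offer for offer in offers if offer["asr"] >= min_asr]
    if not eligible:
        return None
    return min(eligible, key=lambda offer: offer["rate_per_min"])

# Hypothetical price schedule loaded for one destination.
offers = [
    {"supplier": "CarrierA", "rate_per_min": 0.012, "asr": 0.42},
    {"supplier": "CarrierB", "rate_per_min": 0.015, "asr": 0.61},
    {"supplier": "CarrierC", "rate_per_min": 0.019, "asr": 0.68},
]
cheapest = select_route(offers)                # CarrierA: pure least cost
quality = select_route(offers, min_asr=0.5)    # CarrierB: cheapest above the floor
```

The gap between the two answers is the cost of quality for that destination, which feeds directly into the pricing step that follows route selection.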
Origin Based Routing: With origin-based routing, carriers include the originating country or originating network to determine the routing of calls. So, the routing of calls is based on the origin of the call. Origin-based routing should allow carriers to manage cost, QoS, and regulatory compliance more effectively using the following data sets.
Origin Sets: These are groups of area codes or country codes that define the origin of calls. These sets are used to categorize calls based on their geographical or network origin, allowing for differentiated routing strategies.
Routing Destinations: The endpoint or target location for call termination, as explained above. Routing destinations are also manifested as country codes or numbers allocated to specific network operators within a country. It is the destination that is used to determine what route a call can take to be terminated correctly.
Call Routing Labels: Routing rules and policies define the specific routes the call takes through the network. Each routing rule has an identifier or tag associated with it, and this mechanism enables the call to be routed according to the rule enforced by the tag, e.g., QoS, priority routes per tag(s), cost-based routing decisions per tag(s). In this way, call routing is aligned with business models and pricing models.
Routes: The specific path the call takes from call origination to call termination. Routes are determined based on various factors, including cost, quality, and regulatory compliance. Carriers use routing tables and algorithms to select the most appropriate route for each call, taking into account the origin set and routing destination.
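How origin sets, routing destinations, and routing labels could fit together is sketched below using longest-prefix matching on the calling and called numbers. Every prefix, set name, and label here is invented for illustration, and the wildcard fallback is an assumed design choice rather than a description of any particular switch.

```python
def longest_prefix(number, prefixes):
    """Return the longest prefix in `prefixes` that `number` starts with, or None."""
    matches = [prefix for prefix in prefixes if number.startswith(prefix)]
    return max(matches, key=len) if matches else None

# Hypothetical data sets, keyed by dialled-digit prefix.
ORIGIN_SETS = {"44": "UK", "447": "UK-MOBILE", "49": "DE"}
DESTINATIONS = {"91": "INDIA", "9198": "INDIA-MOBILE-OP1"}

# (origin set, destination) -> routing label; "*" matches any origin set.
ROUTING_LABELS = {
    ("UK-MOBILE", "INDIA-MOBILE-OP1"): "PREMIUM",
    ("*", "INDIA-MOBILE-OP1"): "STANDARD",
    ("*", "INDIA"): "ECONOMY",
}

def routing_label(calling, called):
    """Resolve the routing label for a call, or None if the destination is unknown."""
    dest_prefix = longest_prefix(called, DESTINATIONS)
    if dest_prefix is None:
        return None
    origin = ORIGIN_SETS.get(longest_prefix(calling, ORIGIN_SETS), "*")
    destination = DESTINATIONS[dest_prefix]
    specific = ROUTING_LABELS.get((origin, destination))
    return specific or ROUTING_LABELS.get(("*", destination))

label = routing_label("447700900123", "919812345678")  # "PREMIUM"
```

The label, rather than the raw numbers, is then what selects the actual route, which keeps commercial policy separate from the routing tables.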
The Role of Call Detail Records
As defined on Wikipedia, a Call Detail Record (CDR) is a data record produced by a telephone exchange or other telecommunications equipment that documents the details of a telephone call or other telecommunications transaction, such as a text message or data session. CDRs are used by telecommunications companies for billing, monitoring network usage, and analyzing network performance. Here are the key components typically found in a CDR:
1. Call Date and Time: The date and time when the call was initiated and terminated.
2. Call Duration: The length of the call, usually measured in seconds.
3. Calling Party Number: The phone number of the person who initiated the call.
4. Called Party Number: The phone number of the person who received the call.
5. Call Type: The type of call, such as local, long-distance, international, or roaming.
6. Call Status: Information about whether the call was completed, failed, or busy.
7. Unique Call Identifier: A unique identifier for the call, which helps in tracking and managing records.
8. Route Information: Details about the network path the call took, which can include information about the switches and trunks used.
9. Billing Information: Data used for billing purposes, such as the rate plan, cost of the call, and any applicable taxes or surcharges.
10. Service Provider Information: Details about the service provider handling the call.
11. Additional Features: Information about any additional services used during the call, such as call forwarding, conference calling, or voicemail.
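A subset of these components can be modelled as a simple record type. The field names below are my own, real CDR formats vary by vendor and switch, and the per-second charging rule is an assumption made only to show how billing can be derived from the record.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal

@dataclass(frozen=True)
class CallDetailRecord:
    call_id: str
    start: datetime
    end: datetime
    calling_number: str
    called_number: str
    call_type: str        # e.g. "local", "international", "roaming"
    status: str           # e.g. "completed", "failed", "busy"
    route: str            # trunk or route identifier
    rate_per_min: Decimal

    @property
    def duration_seconds(self) -> int:
        return int((self.end - self.start).total_seconds())

    @property
    def charge(self) -> Decimal:
        """Assumed rule: per-second billing, and only completed calls are charged."""
        if self.status != "completed":
            return Decimal("0")
        amount = self.rate_per_min * self.duration_seconds / 60
        return amount.quantize(Decimal("0.0001"))

cdr = CallDetailRecord(
    call_id="c-0001",
    start=datetime(2024, 1, 1, 10, 0, 0),
    end=datetime(2024, 1, 1, 10, 2, 30),
    calling_number="447700900123",
    called_number="919812345678",
    call_type="international",
    status="completed",
    route="TRUNK-7",
    rate_per_min=Decimal("0.0150"),
)
# cdr.duration_seconds == 150 and cdr.charge == Decimal("0.0375")
```

`Decimal` is used for money to avoid the rounding surprises of binary floats when many small per-call charges are summed.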
Trading in International Voice Services
Termination fees in International Voice Services
Whenever a call is completed on a carrier’s network, a termination fee is charged by that carrier. This fee forms the basis of the cost structure of international voice services, and it is the cost incurred by the carrier to connect the call to the called user. Each carrier involved in routing the call from the call origination to the call destination plays a role in call termination and hence is entitled to intercarrier compensation. These fees vary significantly depending on the destination country and the related carrier network in that country. This is explained in detail on Wikipedia.
Cost Structures and Pricing Models for International Voice
In the international voice carrier wholesale market, carriers employ various pricing models to optimize revenue and maintain competitiveness. These models are designed to address the dynamic nature of the market, which is influenced by factors such as demand fluctuations, regulatory changes, and technological advancements. Here are some common pricing models used by carriers:
1. Cost-Plus Pricing: This model involves calculating the cost of providing the service and adding a markup to ensure profitability. It is straightforward but may not always reflect market conditions or competitive pricing.
2. Tiered Pricing: Carriers offer different pricing tiers based on volume commitments or quality of service. Higher volumes or better quality often come with lower per-minute rates, encouraging customers to commit to larger volumes.
3. Destination-Based Pricing: Prices are set based on the destination of the call. Rates can vary significantly depending on the country or region, reflecting the cost of termination and regulatory factors in those areas.
4. Time-of-Day Pricing: Rates vary depending on the time of day the call is made. This model helps manage network congestion and incentivizes off-peak usage.
5. Flat-Rate Pricing: A single rate is charged regardless of the destination or time of the call. This model simplifies billing and can be attractive to customers who prefer predictability.
6. Dynamic Pricing: Prices are adjusted in real-time based on demand and network capacity. This model requires sophisticated technology to implement but can maximize revenue by capturing higher rates during peak demand.
7. Bundled Pricing: Carriers offer packages that include a set number of minutes or a combination of services (e.g., voice, data, SMS) for a fixed price. This can increase customer loyalty and reduce churn.
8. Promotional Pricing: Temporary discounts or special offers are used to attract new customers or increase usage among existing customers. This can be effective for entering new markets or countering competitive threats.
9. Quality-Based Pricing: Different rates are offered based on the quality of service, such as premium routes with guaranteed quality versus standard routes. Customers can choose based on their quality requirements and budget.
10. Revenue Sharing: In some cases, carriers may enter into revenue-sharing agreements with partners, where revenue is split based on predefined terms. This can be beneficial for expanding reach without significant upfront investment.
Carriers often use a combination of these models to tailor their offerings to different customer segments and market conditions. The choice of pricing model can significantly impact a carrier’s competitiveness and profitability in the international voice carrier wholesale market.
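Two of these models, cost-plus and tiered pricing, are simple enough to express directly. The tier thresholds, rates, and markup below are invented numbers, and `cost_plus_rate` and `tiered_rate` are hypothetical helpers meant only to show the arithmetic.

```python
from decimal import Decimal

# Hypothetical tiers: (minimum committed minutes, rate per minute).
TIERS = [
    (0, Decimal("0.0200")),
    (100_000, Decimal("0.0180")),
    (1_000_000, Decimal("0.0150")),
]

def cost_plus_rate(termination_cost, markup_pct):
    """Cost-plus pricing: termination cost plus a percentage markup."""
    rate = termination_cost * (1 + Decimal(markup_pct) / 100)
    return rate.quantize(Decimal("0.0001"))

def tiered_rate(committed_minutes):
    """Tiered pricing: the rate of the highest tier the commitment reaches."""
    rate = TIERS[0][1]
    for threshold, tier_rate in TIERS:
        if committed_minutes >= threshold:
            rate = tier_rate
    return rate

sell_rate = cost_plus_rate(Decimal("0.0120"), 25)   # Decimal("0.0150")
volume_rate = tiered_rate(250_000)                  # Decimal("0.0180")
```

Combining models then becomes a matter of composition, for example taking the lower of the tiered rate and a promotional rate while never going below the cost-plus floor.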
Hubbing vs Bilateral Trading in Voice International Interconnections
Bilateral trading focuses on direct, high-quality connections between two carriers, while hubbing involves multiple carriers and offers more flexible routing options but can vary in quality depending on the interconnection topology used.
Bilateral Trading: Bilateral voice international IP interconnections involve two carriers directly interconnecting their networks to transport voice calls and services. This setup is typically used to interconnect retail networks, whether mobile or fixed. The combination of voice traffic to be carried through the international IP interconnection is determined through bilateral negotiations between carriers.
Hubbing: The international voice hubbing service involves multiple international carriers to deliver voice services between end users. It allows the exchange of international voice calls with multiple networks via one voice over IP interconnection. Hubbing services can involve multiple transit hops as calls transit through several carriers towards their destinations, especially for lower quality levels. Hubbing allows for more flexible routing options and can be used to reach multiple destinations through a single interconnection point.
While the industry is mature, the challenge is in efficiencies and optimization for these three Ts.
With rapidly changing digital landscapes in IT and Telecoms, agile and efficient provisioning of network services is of increasing importance. A well-behaved network service orchestration layer is essential for meeting these demands, particularly as technologies like Software-Defined Networking (SDN) and Network Functions Virtualization (NFV) continue to evolve. The following identifies key elements that constitute an effective orchestration layer, highlighting its role in facilitating seamless service delivery; these will be explored further in future posts.
1. Model-Driven Architecture
A robust orchestration layer should adopt a model-driven architecture. This approach utilizes service models to abstract service configurations from the specific implementations tied to vendor devices. The benefits of this abstraction include:
Simplified Service Design:
By decoupling service design from device specifics, operators can create and modify services more rapidly.
Standardisation:
Utilizing standards like YANG allows for a human and machine-readable format, making programmatic manipulation easier and more efficient.
2. Programmatic Configuration
Programmatic configuration is another enabler of a well-behaved orchestration layer. This involves:
Automation of Configuration Tasks:
By leveraging standards-based interfaces such as NETCONF and YANG, the orchestration layer can automate network element configurations.
Increased Accuracy and Consistency:
Automating these tasks reduces reliance on manual command-line interface (CLI) inputs, minimizing errors and enhancing service provisioning speed.
While the adoption of NETCONF and YANG is still progressing, supporting non-standard interfaces for devices lacking native support is important for comprehensive orchestration.
3. Integration with Operational Systems
A successful orchestration layer must seamlessly integrate with existing operational systems, including:
Business Support Systems (BSS) and Operational Support Systems (OSS):
This integration is facilitated through well-defined northbound APIs, enabling efficient information exchange and process coordination.
NFV Management and Orchestration (MANO):
Interaction with the NFV MANO stack is important for deploying and configuring virtual network functions (VNFs) based on specific service requirements.
4. Transactional Reliability
Transactionality and reliability maintain network integrity. A well-behaved orchestration layer should:
Enforce Atomic Operations:
Configuration changes should be applied as a single transaction, meaning either all changes succeed or none do. This prevents partial configurations that could lead to inconsistencies and errors.
Safeguard Network Consistency:
By rolling back transactions in case of failures, the orchestration layer helps maintain a stable network environment.
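The all-or-nothing behaviour described here can be sketched with a snapshot and restore around a batch of changes. This is an in-memory illustration under stated assumptions: `ConfigTransaction`, `set_vlan`, and `bad_mtu` are hypothetical, and a dictionary stands in for device configuration. On real devices this is typically delegated to the management protocol, for example NETCONF's candidate datastore and commit operation where the device supports it.

```python
import copy

class ConfigTransaction:
    """Apply a batch of changes to a config dict atomically."""

    def __init__(self, config):
        self.config = config

    def apply(self, changes):
        snapshot = copy.deepcopy(self.config)  # state to restore on failure
        try:
            for change in changes:
                change(self.config)
        except Exception:
            # Roll back in place so other references see the restored state.
            self.config.clear()
            self.config.update(snapshot)
            raise

def set_vlan(cfg):
    cfg["vlan"] = 200

def bad_mtu(cfg):
    raise ValueError("MTU out of range")

device = {"vlan": 100, "mtu": 1500}
txn = ConfigTransaction(device)
try:
    txn.apply([set_vlan, bad_mtu])   # second change fails
except ValueError:
    pass
# device is unchanged: {"vlan": 100, "mtu": 1500}
```

Across multiple devices the same idea needs coordination between them, which is where the orchestration layer, rather than any single device, has to own the transaction.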
5. Stateful Network Awareness
Maintaining stateful network awareness is essential for effective orchestration. This involves:
Real-Time Monitoring:
The orchestration layer should have a near-real-time view of the network’s configured state, including the status of all services and devices.
Informed Decision-Making:
This awareness supports reliable configuration decisions and enables quick rollbacks if issues arise.
Incorporating these key elements into a network service orchestration layer significantly enhances the agility, efficiency, and reliability of network operations. Market evolution, along with the complexity and diversity of telecommunication networks, makes it challenging to standardize APIs. This could be one of the reasons why SDOs like TM Forum and MEF take a service-centric approach to defining and standardising APIs. Additionally, network equipment vendors often have proprietary interfaces and APIs, adding to integration challenges. To be explored further in future posts.