GenAI for telecom OSS

This post explores the use of GenAI in transforming service life cycles and service operations in Telecom OSS.

The use of AI starts with identifying and prioritising the use cases that will lead to transformed service lifecycles and service operations built on AI or GenAI.

Identifying and prioritising use cases for OSS

One way of identifying use cases for ML, AI, and GenAI in Telecoms is by considering the sources of data available from CSPs.

These sources of data are available at the following layers.

Data Collection and Data Flow Layers

This includes

  • Data collected from Network Elements, Network Element Management Systems per Technology Domain
  • Data collected from Customer Devices, and Physical and Virtual Customer Network elements
  • Data entering and leaving the network via Carrier Interconnect
  • Data traversing internally within the CSP Network (Access, Aggregation and Core)
  • Data leaving customer sites, traversing the access, aggregation and core networks, entering and leaving data centres and cloud.

Data Aggregation, Processing, Control/Routing Layers

This includes

  • Data Aggregated across Technology Domains – Network Device and Connectivity data from related platforms and Cloud platform data
  • Control Data from Network Controllers that work across domains with respective Domain Controllers
  • Transactional Data from Network Automation used for network configuration, network provisioning, and service activation.

Service Orchestration, Service Assurance Layers

This includes

  • Monitoring and performance data from Network Services and Customer Facing Services
  • Service availability and uptime metrics
  • Performance metrics of automated processes
  • Success and failure rates of automated tasks

Selection and use of Pre-Trained models

General-purpose LLMs are in most cases not suitable for driving improvements in the use cases identified above.
The inferences such models provide mostly do not justify the cost of using them, because Telecom and OSS tasks are specific to multiple technology domains like DWDM, Carrier Ethernet, Wireless, and IP.
Hence models pre-trained on telecom data provide a head start for these use cases, in the context of the relevant business scenarios.

Identifying and selecting pre-trained models requires considering the cost to obtain a license, the cost to run, the cost to customise and the cost to fine-tune. Additional considerations include the type of model to use for the type of problem, e.g. Transformer-type models as used in LLMs, or “simpler” models such as Decision Trees and Linear/Logistic Regression for supervised learning, K-Means for unsupervised learning, or Feedforward Neural Networks.

Additionally, as LLMs continue to evolve, a commercially available model could be updated by the vendor. Inferences for the same prompt might then differ significantly from earlier outputs, requiring previously used prompts to be altered.
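
One way to notice such changes early is to keep a small set of reference prompts with previously accepted outputs and compare them with the outputs of the updated model. The sketch below is illustrative only: the function name, the threshold value and the sample prompts are assumptions, and a plain text-similarity ratio stands in for whatever evaluation metric a team would actually choose.

```python
from difflib import SequenceMatcher


def drifted_prompts(baseline, current, similarity_threshold=0.8):
    """Return prompts whose new output differs too much from the accepted one.

    baseline and current map prompt text to model output text.
    The threshold is an arbitrary illustrative value.
    """
    drifted = []
    for prompt, accepted in baseline.items():
        new_output = current.get(prompt, "")
        ratio = SequenceMatcher(None, accepted, new_output).ratio()
        if ratio < similarity_threshold:
            drifted.append(prompt)
    return drifted


# Hypothetical outputs recorded before and after a vendor model update.
baseline = {
    "Summarise alarm LOS on port 1/1/1": "Loss of signal on port 1/1/1; check fibre.",
    "Classify ticket: slow broadband": "Category: performance",
}
current = {
    "Summarise alarm LOS on port 1/1/1": "Loss of signal on port 1/1/1; check fibre.",
    "Classify ticket: slow broadband": "The customer appears unhappy with their router.",
}

print(drifted_prompts(baseline, current))
```

Prompts flagged this way are candidates for rewording or for pinning to an earlier model version, where the vendor allows it.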

Fine Tuning Models for Telecommunications use

Generic models can be fine-tuned for telecom by training on telecom-specific Q&A types of texts or prompts. A final stage would use reinforcement learning from human feedback.
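
As a rough sketch of what telecom-specific Q&A training data could look like, the snippet below builds prompt/completion records and serialises them as JSON Lines. The field names `prompt` and `completion`, the domain tag in the prompt, and the sample questions are assumptions for illustration; the format actually required depends on the fine-tuning toolchain in use.

```python
import json


def to_finetune_record(question, answer, domain):
    """Build one prompt/completion training record (field names are assumed)."""
    return {
        "prompt": f"[{domain}] {question.strip()}",
        "completion": answer.strip(),
    }


# Hypothetical telecom Q&A pairs; real data would come from curated CSP sources.
qa_pairs = [
    ("What does an LOS alarm indicate?", "Loss of signal on the receiving port.", "DWDM"),
    ("What is an EVC?", "An Ethernet Virtual Connection between UNIs.", "Carrier Ethernet"),
]

jsonl_lines = [json.dumps(to_finetune_record(q, a, d)) for q, a, d in qa_pairs]
print("\n".join(jsonl_lines))
```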

Other Challenges and Considerations

One of the challenges with using GenAI is remaining unbiased, which requires vigilance to avoid training on skewed data sets.
Another challenge is understanding the training algorithms and training parameters, which means understanding how the models are trained to help prevent bias creep and misinterpreted results.
A third challenge is handling data privacy, especially with the use of customer data.
And a fourth challenge is keeping up with regulatory laws as they are updated.
Future posts will elaborate on the above.

Navigating the Complexities of TMF APIs in Modern Telecom and E-commerce

Overview of TMF APIs

The TMForum OpenAPIs are a set of RESTful APIs designed to support the telecommunications industry. The OpenAPIs can be grouped into categories such as Product & Service Management, Customer Management, Service Assurance, and Resource Management.

The current v5 API definitions use OpenAPI v3, and the shared data model is based on the TMF SID and published as a set of JSON schema definitions. The shared model is implemented across multiple TMF Open API specifications, so any specific business model (e.g. the Customer domain) requires model components (schema definitions) from other related Open API specs in order to maintain SID data model compliance.

While these APIs and models promote interoperability across service providers in a Telecom services domain, there are challenges when working with business services in e-commerce-like domains that have marketplaces for products representing endpoint OTT services. These OTT services can use any of the underlying connectivity services provided by CSPs.

The API-first approach in these domains is different from TMF API-first built products and services in the BSS/OSS domain.

Navigating API and Schema Integration Challenges in Business Services

Marketplaces for business applications and/or business applications themselves have sets of API and related JSON schema definitions that rarely map one-to-one with TMF JSON schema definition sets used by TMF APIs.

The need for such mapping could arise when offering additional services to customers with existing telco services and subscriptions. This allows customers to also subscribe to these OTT services, which can be delivered and billed with existing Telco services.

Given that most of the interactions are B2B, there is some challenge in building this adaptation for use cases related to ordering and fulfilling such services.

Evolving API Architecture for Interoperability and Consumer Focus

While there is a need for interoperability between service providers there is also a need for simplicity and interoperability between B2B2X providers and consumers. And API design ideally needs to be API consumer focussed. Consumer designs depend on the context. There are consumers of Telco services APIs and consumers of e-commerce business services APIs.

When using TMF APIs, REST resources represent Collections, Tasks, and Individual Resources, each described by a REST API and related JSON schemas. These resources are identified via the identity of an “Addressable” resource, and they can be extended (using polymorphism) via an “Extensible” resource. The “Types” of resources carry type-related metadata. Resources can be extended or modified by way of characteristics, which add attributes to the resource that the “Type” represents.
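
A minimal sketch of these ideas follows. The attribute names `id`, `href`, `@type` and `@baseType` follow the TMF design conventions, and `serviceCharacteristic` is the characteristic list used for services; the helper function, the path, the `SdwanService` subtype and the characteristic values are hypothetical and only illustrate the pattern.

```python
def make_resource(resource_id, base_path, base_type, sub_type=None, characteristics=None):
    """Build a simplified TMF-style resource as a plain dict.

    id/href make the resource addressable; @type/@baseType mark an extension;
    characteristics add attributes without changing the schema.
    """
    resource = {
        "id": resource_id,
        "href": f"{base_path}/{resource_id}",
        "@baseType": base_type,
        "@type": sub_type or base_type,
    }
    if characteristics:
        # e.g. "Service" -> "serviceCharacteristic"
        key = base_type[0].lower() + base_type[1:] + "Characteristic"
        resource[key] = [
            {"name": name, "value": value} for name, value in characteristics.items()
        ]
    return resource


# Hypothetical extension of Service with two extra attributes.
svc = make_resource(
    "svc-001",
    "/serviceInventory/v5/service",  # illustrative path, not taken from a spec
    base_type="Service",
    sub_type="SdwanService",
    characteristics={"bandwidthMbps": 200, "overlay": "ipsec"},
)
print(svc["href"], svc["@type"], svc["serviceCharacteristic"])
```

The point of the sketch is that the resource stays flat and addressable by `id`/`href`, while type information and characteristics carry the variation.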

When using e-commerce APIs, however, in many cases the APIs are nested to maintain logical relationships between resources, grouping related information in logical endpoints by association. This helps align with the bounded context of the solution. For example: /users/:userId/orders/:orderId or /categories/:categoryId/products/:productId.

Bridging the Gap: Integrating Diverse Models and Languages

Integrating the two API styles requires additional effort, such as the development of anti-corruption layers or translation layers to bridge the gap between different models and languages.
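
To make the idea of a translation layer concrete, here is a small sketch that converts a marketplace order into a simplified payload loosely modelled on a TMF product order. The marketplace field names (`orderId`, `userId`, `lines`, `sku`, `qty`) are invented for the example, and only a handful of target fields are mapped; the exact shape of fields such as `externalId` and `relatedParty` differs between TMF API versions, so treat this as an assumption-laden illustration rather than a compliant mapping.

```python
def ecommerce_order_to_tmf(order):
    """Translate a hypothetical marketplace order into a simplified
    TMF-style ProductOrder payload. Only a few fields are mapped."""
    items = []
    for index, line in enumerate(order["lines"], start=1):
        items.append({
            "id": str(index),
            "action": "add",
            "quantity": line["qty"],
            "productOffering": {"id": line["sku"], "name": line["title"]},
        })
    return {
        "@type": "ProductOrder",
        "externalId": order["orderId"],
        "relatedParty": [{"id": order["userId"], "role": "customer"}],
        "productOrderItem": items,
    }


# Hypothetical order as it might arrive from a marketplace API.
order = {
    "orderId": "ORD-1001",
    "userId": "user-7",
    "lines": [{"sku": "SKU-1", "title": "Cloud Backup", "qty": 2}],
}
tmf_order = ecommerce_order_to_tmf(order)
print(tmf_order["productOrderItem"][0])
```

Keeping this mapping in one place means neither side's model leaks into the other, which is the main purpose of an anti-corruption layer.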

This makes it challenging to implement all the mandatory aspects of the TMF APIs as described in the API specification, especially when integrating with non-TMF-compliant SaaS APIs, which leads to complexity in data mapping and API extensions.

Therefore, the scope and extent of adoption of the TMF APIs have to be carefully considered in any architecture design for such scenarios.

Exploring Automation in Telecoms – opportunities and challenges – drivers of efficiencies

While I was thinking about Telecom Automation for a local round table with CSPs in the country, a few topics surfaced; they are noted here as part of a series.

The first discusses the why of automation in the Telecom industry and the telcos’ move from CSPs to DSPs.

One aspect of the why is the drivers for efficiency, which include the usual candidates: reducing costs, improving customer experience, and enhancing network performance.

A second aspect of the why is the adoption of standards to leverage AI and ML for network optimisation and service operationalisation (e.g. ETSI TS 129 520, which corresponds to 3GPP TS 29.520). This allows for effectiveness and efficiency in managing the network, deploying Devices, Virtual Devices and Containerised Devices representing Network Functions to the network, and fine-tuning product and service offerings to each individual customer.

A third aspect of why is to enable integrations with Suppliers and Partners to allow entry to the Digital Services Markets offered by SaaS providers. Business capabilities are exchanged to provide E2E services. It is almost certain that some data transformation is required for integration.

Assume patterns of Domain-Driven Design are used in such integrated bounded contexts, and data model translation occurs in those contexts following best practice for effective and efficient microservice designs. Then, depending on how the SaaS service forms the bounded context relationship with the CSP, the definition and use of APIs for effective collaboration and communication changes. Additionally, any such translation can be stateful or stateless. Given such volatility in design, standards such as TMForum’s OpenAPIs cannot readily adapt, which increases integration complexity in such contexts.

Operationalization of SD-WAN services

SD-WAN service orchestration and control allows managing the life-cycle of SD-WAN services through the APIs, automation tools, and ecosystems available from SD-WAN providers.

There is a need to operationalize SD-WAN services using existing IT systems in CSPs that have customer data.

The key capabilities of SD-WAN are as follows.

SD-WAN includes secure overlay connectivity that uses IP packet forwarding and routing based on policies for end-user applications and their traffic demand.

SD-WAN has multiple options for Transport underlay and is independent of it.

Service assurance is automated for each SD-WAN connection (as a virtual tunnel). SD-WAN Orchestrators and Controllers centralize management, orchestration and control of SD-WAN services.

Services are highly available; this helps automate deployments of specific SD-WAN offerings from SD-WAN vendors.

Operationalization via IT systems integration

Integration with these systems would depend on a few factors – at least three.

  1. Whether the CSP SD-WAN solutions include multiple SD-WAN vendors
  2. Whether the deployment is on a large scale, especially at the Telco Edge
  3. How well any automation loop can be closed with the operational stack’s downward-facing service fulfilment and the operational stack’s upward-facing service assurance

All of these factors call for the introduction of a NaaS layer to allow IT systems to use the SD-WAN deployment systems.
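
The role of such a layer can be sketched as a vendor-neutral facade in front of per-vendor adapters. Everything named below (`SdwanVendorAdapter`, `NaasLayer`, `activate_site`, the vendor labels) is hypothetical; real adapters would call each vendor's orchestrator API, which is not shown here.

```python
from abc import ABC, abstractmethod


class SdwanVendorAdapter(ABC):
    """Hypothetical adapter interface; one implementation per SD-WAN vendor."""

    @abstractmethod
    def activate_site(self, site_id: str, bandwidth_mbps: int) -> dict:
        ...


class VendorAAdapter(SdwanVendorAdapter):
    def activate_site(self, site_id, bandwidth_mbps):
        # A real adapter would call the vendor orchestrator API here.
        return {"vendor": "A", "site": site_id, "bw": bandwidth_mbps, "state": "active"}


class VendorBAdapter(SdwanVendorAdapter):
    def activate_site(self, site_id, bandwidth_mbps):
        return {"vendor": "B", "site": site_id, "bw": bandwidth_mbps, "state": "active"}


class NaasLayer:
    """Vendor-neutral entry point used by IT systems."""

    def __init__(self, adapters):
        self._adapters = adapters  # vendor name -> adapter instance

    def activate_sdwan_site(self, vendor, site_id, bandwidth_mbps):
        if vendor not in self._adapters:
            raise ValueError(f"No adapter registered for vendor {vendor!r}")
        return self._adapters[vendor].activate_site(site_id, bandwidth_mbps)


naas = NaasLayer({"A": VendorAAdapter(), "B": VendorBAdapter()})
print(naas.activate_sdwan_site("B", "site-42", 100))
```

IT systems then depend only on the `NaasLayer` operations, so adding a vendor means adding an adapter rather than changing the ordering or fulfilment systems.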

Such a NaaS layer can be implemented

  • Using standards as described in TMForum NaaS
  • Using Hyperscalers
  • Using a Customized in-house adaptation of Vendor offering

Using TMForum NaaS

This is described in TMForum IG1224. The application of NaaS to SD-WAN is covered by my contribution to IG1224, the SD-WAN PSR (Product-Service-Resource) model, which was peer reviewed and approved in Appendix A.

Using Hyperscalers

Using compute, storage and network from hyperscalers increases the cost as management and orchestration endpoints at the edge increase. Large-scale edge deployments with virtualized CPEs may not scale cost-wise on hyperscalers.

In-house

Using a customised in-house adaptation of a Vendor offering depends on the CSP’s existing systems and capabilities.

Finally, for any SD-WAN service, aligning with the performance and capacity of the underlying transport layer is required for closed-loop automation.

The existing performance monitoring solution and its closed loop will ensure that SD-WAN services meet their SLAs.
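
A single iteration of such a loop might look like the sketch below: compare measured KPIs against SLA limits and decide on an action. The KPI names, limits and the action string are placeholders chosen for illustration; a real loop would take measurements from the monitoring solution and trigger the SD-WAN controller or orchestrator.

```python
def sla_violations(measured, sla):
    """Return the KPIs that breach their SLA limit.

    Both dicts map KPI name to a number; every KPI here is 'lower is better'.
    """
    return sorted(kpi for kpi, limit in sla.items() if measured.get(kpi, 0) > limit)


def closed_loop_step(measured, sla):
    """One iteration of a monitor-analyse-act loop (actions are placeholders)."""
    breaches = sla_violations(measured, sla)
    if not breaches:
        return "no_action"
    return "reroute_over_alternate_underlay:" + ",".join(breaches)


# Hypothetical SLA limits and a measurement sample for one SD-WAN tunnel.
sla = {"latency_ms": 50, "jitter_ms": 10, "loss_pct": 0.5}
measured = {"latency_ms": 72, "jitter_ms": 4, "loss_pct": 0.9}
print(closed_loop_step(measured, sla))
```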

SD-WAN Operating Models 

Traditional Enterprise WAN connectivity requires L2/L3 (L2.5) type protocols like MPLS to guarantee bandwidth between sites. The speeds selected from product offers for MPLS were guaranteed by the network based on QoS settings in the network. A form of security is provided by the MPLS network in the sense that the network is separate from the Internet and forms a VPN between enterprise sites.

However, maintaining such networks is costly for Enterprise Business, and SD-WAN has emerged as a possible less costly alternative. With SD-WAN, the control and data plane are separated, with the routing decisions centralized in the controller and forwarding done at the edge.

SD-WAN has been described extensively by MEF, and the service components there have been identified as

  • SD-WAN Edge – SD-WAN provider to SD-WAN subscriber interface
  • SD-WAN Controller – For Service Management
  • Service Orchestrator – which I consider a Domain Orchestrator
  • SD-WAN Gateway – for interconnect of transport service
  • Subscriber Web Portal – for Service Management and Service Operations

While the service and service management models have been described extensively, the “operate” model that connects to IT systems also has to be considered.

Just as such operate models are being considered for 5G slicing, an “operate” model for SD-WAN is considered here. Additional details are in my contribution to TMForum IG1224 for NaaS.

The following figures are based on the MEF models and illustrate a logical representation of the SD-WAN network and a logical representation of SD-WAN related circuits.

The “operate” model utilizes this by exposing SD-WAN services to the IT layer, which is the core commerce layer in TMForum’s open digital architecture.

Such a model is described in detail in my contribution to IG1224 Release 13 (currently in draft).

Further notes on 5G Service and 5G Slice Relationship

5G Slicing enables the partitioning of physical 5G core and RAN Networks into one or more logical networks. 

Each network slice is isolated end-to-end and tailored to fulfil diverse requirements that a particular service requests. This technology allows multiple 5G services to use one or many 5G slices, and one 5G service can use multiple 5G slices, depending on the specific needs of the service.

In the era of 5G, diverse use cases and services have different requirements in terms of latency, reliability, bandwidth, security, and quality.

Network slicing allows the creation of dedicated network resources tailored to various business customers, paving the way for the provisioning of specialized services.

Example of 5G Service and 5G Slice Relationship

To illustrate the relationship between 5G services and 5G slices, let’s consider three main types of 5G services that use network slicing: Enhanced Mobile Broadband (eMBB), Massive Machine-Type Communications (mMTC), and Ultra-Reliable Low-Latency Communications (URLLC).

  1. eMBB: This service focuses on providing high data rates and improved broadband experiences for users. A network slice tailored for eMBB would prioritize high bandwidth and capacity to support data-intensive applications like video streaming and virtual reality.
  2. mMTC: This service is designed to support a massive number of connected devices, such as IoT sensors and smart meters. A network slice for mMTC would prioritize the efficient handling of a large number of connections with lower data rates and power consumption.
  3. URLLC: This service aims to deliver secure communications with ultra-low latency, high reliability, and minimal data packet loss. A network slice for URLLC would prioritize low latency and high reliability, making it suitable for applications like autonomous vehicles and remote surgery.

In this example, each 5G service has unique requirements, and network slicing allows the creation of dedicated network resources tailored to these needs. A single 5G service may use multiple slices to meet its requirements, or multiple 5G services may share a single slice if their requirements align. This flexibility in resource allocation enables efficient utilization of network resources and ensures that diverse services can coexist on the same physical network infrastructure.
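
The many-to-many relationship described above can be sketched as a simple mapping that is inverted to see which services share a slice. The service and slice identifiers are made up for the example.

```python
from collections import defaultdict

# Hypothetical assignments of 5G services to slices (many-to-many).
service_to_slices = {
    "live-video-broadcast": ["slice-embb-1", "slice-urllc-1"],
    "smart-metering": ["slice-mmtc-1"],
    "fixed-wireless-access": ["slice-embb-1"],
}


def slices_to_services(mapping):
    """Invert the mapping to see which services share each slice."""
    inverted = defaultdict(list)
    for service, slices in mapping.items():
        for slice_id in slices:
            inverted[slice_id].append(service)
    return dict(inverted)


shared = slices_to_services(service_to_slices)
print(shared["slice-embb-1"])
```

Here one service uses two slices, while the eMBB slice is shared by two services whose requirements align.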

One such example is the use of slicing to broadcast a sporting event like the Tour de France. The PSR (Product, Service, Resource) model for this is as follows. 

How this PSR relates to the CFS/RFS model was presented logically in a previous post and will be described further in the next post.

What defines a communication/connectivity service model consuming Network Slice as technology neutral

As described in TMF IG1224, 3GPP identifies services requiring 5G slices as Communication Services (3GPP term). An example of a communication service would be a service modelled with technology-neutral Customer Facing Service Specifications (CFSSpec). The service model is instantiated as an end-to-end service instance of the corresponding Customer Facing Service Specifications and their relationships. The service instance represents the connected flow of traffic in the network; this flow is the actual network service delivering customer traffic from source to destination.

Modelling such a service requires defining the service technically in terms of its service parameters that must meet service quality requirements defined by SLOs. Service parameters are specified in CFSSpecs, and the definition and relationship between CFSSpecs of an end-to-end service evolve from service dependencies within the Service Topology model. The Service Topology model is built from the Network Topology Model, and related Network Traffic flows for the service.

The Network Topology model is a resource model that represents a technical domain (e.g. Carrier Ethernet, DWDM). Service Models represent L2 Services (e.g. L2 VPNs, L2 Access) and L3 Services (e.g. L3 VPNs, Service Tunnels) that are ultimately abstracted as technology-neutral before being provided to customers. This makes the lifecycle of the service presented to the customer independent of the realization of the services via the lifecycle of the resources and technologies.

In addition, the modelling of such services also needs to consider the management and operations of such services and how tasks that control them, such as service creation, modification and termination, are orchestrated.

Finally, the modelling of such services also needs to consider the lifecycle of the service itself, how the service specification is created, what its requirements are, how it would be designed and assigned, and how it would be configured and provisioned.

This makes up the CFS-RFS relationship, as I have specified in IG1224 for 5G slicing. To be elaborated on in future posts.

5G Slicing in Standalone Mobile Network – Fulfilment & Assurance

This post relates to my contributions to the TMForum IG1224 NaaS Transformation Document.

A 5G Network Slice supports communication (and connectivity) services offered by the Production layer (of TMForum’s ODA Framework) using Network-as-a-Service (NaaS) APIs.

Examples of such communication services and fulfilment of the same are described in TMF IG1224 Appendix A.

The communication and connectivity services offered by NaaS are exposed as technology-neutral services to the Core Commerce Layer of ODA.

This means the information model of this connectivity service should map TMForum’s SID model to the RFS models of 3GPP’s NetworkSlice.

The data model requires mapping

  • the communication and connectivity services exposed by NaaS
  • each customer’s or tenant’s use of this connection as a flow within the communication service
  • technology service exposed by 3GPP

Assurance

Assurance in 5G is dynamic, with various levels of closed loops. The assurance approach and standards are described in TMF IG1224 Appendix A.

RAN Slicing and QoS Management

RAN slicing creates independent virtualized sub-networks that convert user requirements to different QoS requirements on the network. User requirements are service requirements as specified in CFSSpecs.
These requirements cover scenarios for creating RAN subnet slices and modifying RAN subnet slices as required by created or modified services.

One of the service requirements for creating or modifying a RAN subnet slice relates to Quality of Service (QoS). The QoS Profile allows mapping of packets flowing from the UE to the Data Network using QoS flows and Data Radio Bearers (DRBs). In 5G NR, QoS flows represent logical streams of data packets that have specific QoS requirements.
Rules in the UE and in the RAN map QoS Flows to DRBs. Setting up QoS Profiles and mapping rules is part of the design time activity for the slice.

The following diagram illustrates a logical view of QoS in the RAN.

In the figure, each cell is identified by the radio beam and frequency of a Remote Radio Head (RRH) sector.
The Tracking Area Code (TAC) identifies the Tracking Area that a cell belongs to.
TAI = PLMN-ID + TAC
PLMN-ID = MCC + MNC
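
The two identities above can be sketched as simple string composition. The function names, the dash separator and the hex rendering of the TAC are display conventions assumed for this example; the digit lengths for MCC and MNC and the 24-bit TAC size in 5G follow the 3GPP definitions.

```python
def plmn_id(mcc: str, mnc: str) -> str:
    """PLMN-ID = MCC (3 digits) + MNC (2 or 3 digits)."""
    if not (mcc.isdigit() and len(mcc) == 3):
        raise ValueError("MCC must be 3 digits")
    if not (mnc.isdigit() and len(mnc) in (2, 3)):
        raise ValueError("MNC must be 2 or 3 digits")
    return mcc + mnc


def tai(mcc: str, mnc: str, tac: int) -> str:
    """TAI = PLMN-ID + TAC; the TAC is shown as 6 hex digits (24 bits in 5G)."""
    if not 0 <= tac < 2**24:
        raise ValueError("TAC must fit in 24 bits")
    return f"{plmn_id(mcc, mnc)}-{tac:06X}"


# Example values are made up for illustration (001/01 is a test PLMN).
print(tai("001", "01", 0x1A2B))
```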

Ultimately the RAN subnet slice would have its QoS Profile defined at design time via the NST and then mapped to QoS definitions in the CFSSpec. These definitions abstract a specific QoS Profile in the RFSSpec representing the Slice Profile.

QoS characteristics are considered representative of service usage as they are relevant to dynamic decisions made during operations of QoS flow, such as using the priority level as a tiebreaker when two flows compete for resources.
There are many scenarios for QoS events, as seen in the 5G QoS Identifier (5QI) table available from 3GPP and ETSI.

QoS ER and logical relationships are shown below.

From the figure, each RAN slice is associated with a specific set of QoS parameters that define the desired quality of the slice and its performance characteristics, such as latency, throughput, packet loss, and reliability. These parameters represent the requirements of the applications or services the slice supports and comprise the QoS Profile.

In this example, the slice consumes resources that are primarily radio resources in terms of physical resource blocks in a radio beam that makes up the sector of a cell.

Multiple QoS flows can be mapped to the same QoS Profile.
However, each QoS flow is associated with a single QoS Profile represented by 5QI. The 5QI value defines a set of QoS parameters, such as priority and packet error rate, that can be applied to multiple QoS flows, each identified by its unique QoS Flow Identifier. At runtime, the QoS characteristics work with a priority level.
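
A small sketch of this relationship: several QoS flows, keyed by QFI, point at a shared 5QI profile, and the priority level breaks ties between competing flows. The class and function names are assumptions, and the profile values are illustrative entries that should be checked against the 3GPP 5QI table before being relied on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosProfile:
    five_qi: int
    resource_type: str
    priority_level: int        # lower value means higher priority
    packet_error_rate: float


# Illustrative entries; check the 3GPP 5QI table for authoritative values.
PROFILES = {
    1: QosProfile(1, "GBR", 20, 1e-2),
    9: QosProfile(9, "Non-GBR", 90, 1e-6),
}

# Several QoS flows (keyed by QFI) can point at the same 5QI profile.
flow_to_5qi = {1: 1, 2: 9, 3: 9}


def pick_flow(competing_qfis):
    """Use the priority level as a tiebreaker between competing flows."""
    return min(competing_qfis, key=lambda qfi: PROFILES[flow_to_5qi[qfi]].priority_level)


print(pick_flow([2, 1, 3]))
```

Flows 2 and 3 share one profile, while flow 1 wins the contention because its profile has the lower priority level value.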

Multiple QoS flows can be associated with a single RAN slice, as long as the QoS flows have similar QoS requirements that can be met by the same RAN slice. For example, multiple QoS flows carrying video streaming traffic with similar QoS requirements may be associated with a single RAN slice designed to support high data rates and low latency.

If we assume RAN slicing is done at the MAC layer (which performs QoS scheduling), the Physical Resource Blocks (PRBs) on the radio beams from the RRH are allocated to a slice dynamically, depending on the users allocated to the slice and traffic conditions.
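
As a simplified stand-in for such dynamic allocation, the sketch below splits a pool of PRBs across slices in proportion to their current demand. The proportional policy, the function name and the demand figures are assumptions for illustration; a real MAC scheduler works per scheduling interval and considers channel conditions, guarantees and priorities.

```python
def allocate_prbs(total_prbs, demand_by_slice):
    """Split PRBs across slices in proportion to current demand.

    Uses the largest-remainder method so the shares always sum to total_prbs.
    This is a simplified stand-in for a real MAC scheduler.
    """
    total_demand = sum(demand_by_slice.values())
    if total_demand == 0:
        return {slice_id: 0 for slice_id in demand_by_slice}
    exact = {s: total_prbs * d / total_demand for s, d in demand_by_slice.items()}
    shares = {s: int(v) for s, v in exact.items()}
    leftover = total_prbs - sum(shares.values())
    # Hand remaining PRBs to the slices with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - shares[s], reverse=True)
    for s in by_remainder[:leftover]:
        shares[s] += 1
    return shares


# 273 PRBs corresponds to a 100 MHz carrier at 30 kHz subcarrier spacing.
print(allocate_prbs(273, {"embb": 60, "urllc": 25, "mmtc": 15}))
```

Re-running the function as demand changes gives each slice a new share, which is the dynamic behaviour described above in its simplest form.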

The QoS scheduler determines the appropriate allocation of resources for the RAN slice. The QoS scheduler also assigns Data Radio Bearers (DRBs) to the slice. DRBs are logical connections between the user equipment (UE) and the RAN that carry the UE's traffic towards the core network. Each DRB is associated with specific QoS parameters and represents a dedicated channel for the slice to communicate with the network.

By allocating DRBs based on the QoS requirements, the QoS scheduler enables the identification and creation of different slices within the RAN. Each slice can have its own allocated resource set, including DRBs, allowing for the isolation and customization of services based on the associated QoS parameters.

Mapping of QoS flows to be discussed in future.

S-NSSAI and 5G Slice Usage

As illustrated by 3GPP TR 28.801, Study on management and orchestration of network slicing for next generation network (Release 15), one Network Slice Instance (NSI) can be consumed by many Communication Service Instances, and one Communication Service Instance can consume more than one NSI. This is as per the logical model shown in Figure 4.9.3.1 of 3GPP TR 28.801.

This relationship is shown in the information model of Figure 4.2.2.1 of 3GPP TR 28.801.

A specific example of this is illustrated in my contributions to TMForum IG1224.

The S-NSSAI is created during the 5G Network Slice Creation Process. The specific requirements for the slice (based on the service profile) are provided by the OSS/BSS system when ordering a slice from the Network Slice Management function (NSMF). This includes the S-NSSAI used to identify the appropriate Network Slice Instance based on Service Profile.

How the S-NSSAI is used depends on how the 5G slicing network is configured and on the available network functions and configuration, such as

  • NSSF
  • URSP
  • pre-configured S-NSSAIs based on specific requirements of network slice
  • default S-NSSAIs as defined by 3GPP for specific types of services and applications

The NSSF is not a mandatory 5G Core Network function and plays a role in 5G slicing either

a) when AMF cannot provide the requested NSSAI (made up of S-NSSAIs)

b) when there is more than one slice with the same NSSAI

Without an NSSF function in the network, the S-NSSAI would be pre-configured in the network. This would be part of the slice creation workflow. In this case, the S-NSSAI can be selected based on the preconfigured policies without needing an NSSF.

In cases where there is no NSSF function in the network and pre-configured policies have not been defined, the default S-NSSAI is used.
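
The fallback order described above can be sketched as follows. The function signature, the `service_type` key, the policy table and the choice of default are assumptions made for this example; only the standardised SST values in the comment come from 3GPP.

```python
from typing import Callable, Optional

# Standardised SST values defined by 3GPP: 1 = eMBB, 2 = URLLC, 3 = MIoT.
DEFAULT_S_NSSAI = {"sst": 1, "sd": None}  # assumed default for this sketch


def select_s_nssai(
    requested: dict,
    nssf_lookup: Optional[Callable[[dict], dict]] = None,
    preconfigured: Optional[dict] = None,
) -> dict:
    """Fallback order sketched from the text:
    NSSF (if deployed) -> pre-configured policy -> default S-NSSAI."""
    if nssf_lookup is not None:
        return nssf_lookup(requested)
    if preconfigured:
        key = requested.get("service_type")
        if key in preconfigured:
            return preconfigured[key]
    return DEFAULT_S_NSSAI


# Hypothetical pre-configured policy table created in the slice creation workflow.
policies = {"urllc": {"sst": 2, "sd": "0000A1"}}
print(select_s_nssai({"service_type": "urllc"}, preconfigured=policies))
print(select_s_nssai({"service_type": "iot"}))
```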

Multiple consuming services on a slice

For the scenario in which more than one Communication Service Instance consumes one Network Slice Instance, each Communication Service Instance would have its own Resource Facing Service (RFS) Instance related to the NSI, because the provisioning of the RFS uses parameters specific to that RFS and specified in the RFSSpec.

As shown below multiple RFSs relate to one NSI.

Consuming service requires multiple slices

For the scenario where one Communication Service needs more than one NSI, the CFS would decompose into multiple RFSs, with each RFS instance related to its respective NSI.

Each RFS instance tracks service usage and the characteristics of service usage per slice consumer.
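
Both scenarios can be sketched with two small data classes: a CFS decomposes into one RFS per NSI it consumes, and several RFSs from different services can relate to the same NSI. The class names, identifiers, parameter names and the `usage_gb` field are hypothetical and only illustrate the relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Rfs:
    rfs_id: str
    nsi_id: str
    parameters: dict
    usage_gb: float = 0.0   # usage tracked per slice consumer


@dataclass
class Cfs:
    cfs_id: str
    rfs_list: list = field(default_factory=list)


def decompose(cfs_id, nsi_params):
    """Create one RFS per NSI that the communication service consumes."""
    cfs = Cfs(cfs_id)
    for nsi_id, params in nsi_params.items():
        cfs.rfs_list.append(Rfs(f"{cfs_id}-rfs-{nsi_id}", nsi_id, params))
    return cfs


# Two services sharing nsi-1; the second also needs nsi-2 (identifiers are made up).
cfs_a = decompose("cfs-a", {"nsi-1": {"maxUEs": 100}})
cfs_b = decompose("cfs-b", {"nsi-1": {"maxUEs": 20}, "nsi-2": {"latencyMs": 5}})

rfs_on_nsi_1 = [r.rfs_id for c in (cfs_a, cfs_b) for r in c.rfs_list if r.nsi_id == "nsi-1"]
print(rfs_on_nsi_1)
```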