Introduction to SML – A Standard Semantic Modeling Language

It’s been over ten years since my co-founders and I launched AtScale to democratize analytics for everyone. We quickly realized that we needed to create a business-friendly view on top of the technical data so that ordinary business users could ask questions about data using familiar business terminology. To achieve that goal, we needed to free the semantic layers embedded within analytics tools and data platforms and create an independent, universal semantic layer for any tool or data platform.

The Need for an Industry Standard Modeling Language

There were precisely zero semantic layer platform vendors when AtScale was founded in 2013. Now, besides AtScale, several vendors are pushing their own versions of a semantic layer. We’ve heard from many of our customers, partners, and industry analyst friends that now is the time to coalesce around a standard language for semantic modeling.

In response to this demand, we are open-sourcing a semantic modeling specification to promote semantic model portability and foster a vibrant community of model builders. By standardizing on a single modeling language, I hope we can encourage others to develop a library of shareable semantic models that can be plugged into any semantic layer platform. With a ready-made semantic model definition, thousands of known schemas, whether SaaS applications or industry ontologies, could be instantly consumable by any business user or data scientist.

SML: The Semantic Modeling Language

Semantic Modeling Language, or SML for short, reflects over a decade of hands-on development, solving use cases for hundreds of customers across industries such as finance, healthcare, retail, manufacturing, CPG, and more. SML covers more than just tabular use cases. At its core, it is a multidimensional semantic modeling language that supports metrics, dimensions, hierarchies, semi-additive measures, many-to-many relationships, cell-based expressions, and much more.
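
To make one of these constructs concrete: a semi-additive measure can be summed along some dimensions but not others. An inventory balance, for example, adds up across warehouses but not across days, where a common choice is to take the last recorded value. The Python sketch below illustrates the idea only. The sample data, the function name `closing_balance`, and the last-value rule are assumptions made for illustration; this is not SML syntax or AtScale's implementation.

```python
from collections import defaultdict

# Hypothetical daily inventory snapshots: (day, warehouse, units_on_hand).
SNAPSHOTS = [
    ("2024-01-01", "east", 100),
    ("2024-01-01", "west", 40),
    ("2024-01-02", "east", 90),
    ("2024-01-02", "west", 55),
]


def closing_balance(rows):
    """Sum across warehouses, but take the last day instead of summing days."""
    per_day = defaultdict(int)
    for day, _warehouse, units in rows:
        per_day[day] += units      # additive across the warehouse dimension
    last_day = max(per_day)        # ISO dates sort chronologically as strings
    return per_day[last_day]       # non-additive across time: last value wins


# A plain sum adds every snapshot together and double-counts stock over time.
naive_total = sum(units for _, _, units in SNAPSHOTS)
print(naive_total)
# The semi-additive result is the stock on hand at the end of the period.
print(closing_balance(SNAPSHOTS))
```

A semantic model that records this rule once lets every downstream tool aggregate the measure the same way, instead of each report re-implementing it.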

SML delivers on the following requirements:

  • Object-oriented: SML is an object-oriented language that promotes composability and inheritance. This allows semantic objects to be shared within other semantic objects and across organizations, supporting easy and consistent model-building.
  • Comprehensive: SML is based on more than a decade of modeling experience across various industry verticals and use cases. SML handles multidimensional constructs and serves as a superset of all other existing semantic modeling languages.
  • Familiar: SML is based on YAML, a widely adopted, human-readable, industry-standard syntax.
  • CI/CD Friendly: SML is code, so it is compatible with Git and CI/CD practices for version control, automated deployment, and software lifecycle management.
  • Extensible: SML syntax can be enhanced to support additional properties and features.
  • Open: SML is open source under an Apache license to support community innovation and is free to use in any application or use case.

You can see many of these principles in action in the SML code snippet below.

[Image: Sample SML code for a Model object]
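
The object-oriented principle from the list above, where semantic objects are defined once and then shared or extended, can also be sketched in Python. This is a conceptual illustration rather than SML itself: the class names `Dimension`, `Metric`, and `Model`, their fields, and the `extend` method are all hypothetical and do not come from the SML specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dimension:
    """A reusable semantic object, such as a date or customer dimension."""
    name: str
    levels: tuple


@dataclass(frozen=True)
class Metric:
    """A named aggregation over a source column."""
    name: str
    column: str
    aggregation: str = "sum"


@dataclass
class Model:
    """A model composes shared dimensions and metrics by reference."""
    name: str
    dimensions: list = field(default_factory=list)
    metrics: list = field(default_factory=list)

    def extend(self, name, extra_metrics=()):
        """Derive a new model that inherits everything and adds metrics."""
        return Model(name, list(self.dimensions), self.metrics + list(extra_metrics))


# Define the date dimension once...
date_dim = Dimension("Date", ("Year", "Quarter", "Month", "Day"))

# ...and reuse it in two otherwise unrelated models.
sales = Model("Sales", [date_dim], [Metric("Revenue", "amount")])
inventory = Model("Inventory", [date_dim], [Metric("Units On Hand", "units", "last")])

# A derived model inherits the base model's objects and adds its own.
sales_with_margin = sales.extend("Sales With Margin", [Metric("Margin", "margin")])

print(sales.dimensions[0] is inventory.dimensions[0])
print([m.name for m in sales_with_margin.metrics])
```

Because both models point at the same `Dimension` object, a change to the date hierarchy is made in one place, which is the consistency benefit that composability is meant to deliver.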

What Is Being Open-Sourced?

Open-sourcing SML aims to promote the building of reusable models and semantic objects. We are making the SML specification available for public consumption and collaboration. Soon, we will add software tools to make serializations and translations from various semantic dialects easier.

We are open-sourcing, or will soon open-source, the following:

  1. A YAML-based Language Specification: The SML specification is documented and encompasses tabular and multidimensional constructs.
  2. Pre-built Semantic Models: The GitHub repository contains pre-built semantic models incorporating standard data models such as TPC-DS, common training models such as Wide World Importers and AdventureWorks, and marketplace models such as Snowplow and CRISP. We expect to add semantic models for SaaS applications such as Salesforce, Google Analytics, and Jira soon.
  3. Helper Classes (coming soon): We will release helper classes that will facilitate the programmatic reading and writing of SML syntax.
  4. Semantic Translators (coming soon): We will release converters for migrating other semantic modeling languages to SML, including dbt Labs’ semantic layer and Power BI. We also expect to release converters for legacy (e.g., MicroStrategy, Business Objects, Cognos) and modern (e.g., Looker) semantic modeling tools.

What’s Next

By open-sourcing SML, we at AtScale hope to promote the adoption of semantic layer platforms and facilitate customers’ migration between proprietary vendor solutions. In addition, by coalescing around a single semantic modeling standard, we encourage others to author and share semantic models for various applications and industry ontologies. With a standard way of expressing business logic, we aim to promote accelerated analytics consumption and interoperability to bring data and analytics to more people.
