Using a Schema.org-First approach to build a single source of truth and a unified Content Management System

October 6, 2020

This blog post picks up where my previous post about exploring a Schema-First approach to Drupal and Content Management Systems left off. After more research and thought into the goals and processes for taking a Schema-First approach to the Information Architecture behind a Content Management System, I realized that what I was calling a "Schema-First" approach is more aptly named a "Schema.org-First" approach, because I am exploring using Schema.org to structure reusable data within a "Schema-First" approach to developing software.

To clarify…

Schema-first development establishes a contract between developers and the operational expectations from project managers. A schema is foundationally an agreed upon set of standards and approaches – and in establishing the schema as a constrained contract, you can ensure that no matter what comes out of development, it will align with the stated overall goals.

-- Using a Schema-First design as your single source of truth

...and...

Schema.org is a collaborative community activity with a mission to "create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond."

-- https://en.wikipedia.org/wiki/Schema.org 

Key to the success of a Schema-First approach is "an agreed-upon set of standards and approaches," which I believe for web content (i.e., structured data on the Internet) is Schema.org.

Previously, I highlighted some of the benefits of using Schema.org to structure a Content Management System's information architecture. The background, reasoning, and process for taking a Schema-First approach to API design is discussed in Kristopher Sandoval's blog post, Using a Schema-First design as your single source of truth. In this post, I want to spend some time talking about an overarching benefit for using a "Schema.org-First" approach to create a "single source of truth," leading to a unified content management system.

Single source of truth

The goal of Schema.org-First is to unify an organization's data, make it easy to author and distribute content, and create a "single source of truth."

In information systems design and theory, single source of truth (SSOT) is the practice of structuring information models and associated data schema such that every data element is mastered (or edited) in only one place.

-- https://en.wikipedia.org/wiki/Single_source_of_truth 

To create a single source of truth for content, organizations need to create one place for content to be accessed and ideally edited.

Most enterprise organizations have multiple Content Management Systems and authoring tools for creating content. It is no surprise each department's schema for a blog post is different.

Reinventing the (information architecture) wheel when it comes to web-related content is unnecessary

Establishing that all web content within an organization should use Schema.org as the foundation for structuring the data pushes all departments to create homogeneous content. An added benefit of having Schema.org as the standard is that it reduces duplicate work and costs related to building and maintaining a website's content architecture. At the same time, a common problem remains: organizations with multiple departments most likely have multiple Content Management Systems.

In theory, each department could maintain a separate Content Management System that transforms and pushes Schema.org data into a centralized Content Hub/Repository, but this could result in a lot of repetitive work, additional overhead, and complexity. The most practical approach is to have a centralized and unified Content Management System using a Schema.org-First approach.

Unified Content Management System

Besides the immediate benefits of Schema.org improving Omnichannel publishing, SEO, and the syndication and aggregation of content, a Schema.org-First approach helps address one of the biggest challenges to unifying content, which is "Governance."

Governance in this context is about unifying content and departments within an organization. Frankly, I find it easy to transform and migrate content between systems but always challenging to transform a team's approach to authoring and managing content. Sometimes it is hard for people within an organization to accept change unless it helps the organization and makes their lives easier. Having a unified Content Management System helps an organization establish better governance, improve the quality of the content, and reduce the costs of maintaining multiple Content Management Systems.

I want to pause and emphasize that reducing costs is about reducing the amount of duplicate work and money spent on maintaining multiple Content Management Systems, and not about eliminating people's jobs. My goal is to show how reducing redundant costs would allow departments to do more with their resources and people.

Simply put, a unified Content Management System makes it easier for an organization to produce higher quality content, faster.

Moving dozens of department websites into a unified Content Management System can be a daunting task. Addressing this challenge is where the "Schema-First" aspect of a "Schema.org-First" approach stands out when migrating content and people to a unified Content Management System.

A Schema.org-First Migration

Schema-First is an agreed-upon set of standards and approaches. Schema.org is the standard. Now let's talk about the process of migrating content to a unified Content Management System.

Migrations are a two-step process where the data must be mapped/transformed from the old system and then consumed into the new system. There are lots of ways to approach transforming and migrating data from one system to another. In this discussion, the destination system, which is the unified Content Management System, can pull data from each source Content Management System. One can also have the source systems push content into the destination system. Finally, there can be a collaborative migration, where the source system is transforming the data via an API, and the destination system is consuming the data. A collaborative migration can work out the best because an organization can have the developers and architects in the source system working in parallel with the developers and architects in the destination who are consuming the data.

The beauty of a Schema.org-First approach is that the source and destination content management teams know exactly how the migrated data needs to be structured before any code is created; the specification is available at Schema.org.

The source Content Management System's team knows the existing data architecture, and they have the immediate tools to transform it into Schema.org data. In most cases, the team should already know Schema.org, and they may have already started using it on their website. Because the source Content Management System most likely has a unique content architecture, each source team will have to write a custom data transformation. The good news is this could be the last time these developers ever need to write a data transformation script for the organization.
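As a rough illustration of what such a source-side transformation might look like, here is a minimal sketch in Python. The legacy field names (post_title, body_text, created, writer, permalink) and the function name are hypothetical and stand in for whatever a department's existing system actually uses; only the output properties (headline, articleBody, datePublished, author, url) come from Schema.org's BlogPosting type.

```python
import json
from datetime import datetime, timezone


def legacy_post_to_schema_org(record: dict) -> dict:
    """Map one hypothetical legacy blog record to a Schema.org BlogPosting."""
    # Assumption: the legacy system stores the creation time as a Unix timestamp.
    published = datetime.fromtimestamp(record["created"], tz=timezone.utc)
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": record["post_title"],
        "articleBody": record["body_text"],
        "datePublished": published.date().isoformat(),
        "author": {"@type": "Person", "name": record["writer"]},
        "url": record["permalink"],
    }


# A made-up record shaped like one department's legacy schema.
legacy_record = {
    "post_title": "Quarterly update",
    "body_text": "Our department shipped three features.",
    "created": 1601942400,  # 2020-10-06 00:00:00 UTC
    "writer": "Jane Doe",
    "permalink": "https://example.com/blog/quarterly-update",
}

schema_org_post = legacy_post_to_schema_org(legacy_record)
print(json.dumps(schema_org_post, indent=2))
```

Each source team would write its own version of this mapping, but every version targets the same Schema.org output shape.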

The even better news is the destination Content Management System's team only needs to write one script, ideally a microservice, to consume data from multiple Content Management Systems. If the unified Content Management System development team spends the time to create a robust and stable Schema.org data aggregation system, the unified Content Management System will have a tool to ingest data from any internal or external content source. For enterprise organizations that might acquire a new company with a website, the unified Content Management System will have a well-defined process, as well as a tool, for moving that website into the unified Content Management System.
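To make the "one script" idea concrete, here is a minimal sketch of a destination-side ingest function in Python. Everything about it is an assumption for illustration: the ContentItem class, the ingest function, the source names, and the choice of supported types and required properties are hypothetical, not part of any particular Content Management System. The point is only that a single code path can accept Schema.org documents regardless of which source system produced them.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    """Hypothetical internal record in the unified CMS."""
    schema_type: str
    title: str
    body: str
    source: str


# Assumed subset of Schema.org types this hypothetical CMS accepts.
SUPPORTED_TYPES = {"Article", "BlogPosting", "NewsArticle"}
# Assumed minimum set of properties every incoming document must carry.
REQUIRED_PROPERTIES = ("headline", "articleBody")


def ingest(document: dict, source: str) -> ContentItem:
    """Validate one Schema.org document and convert it to a ContentItem."""
    if document.get("@context") not in ("https://schema.org", "http://schema.org"):
        raise ValueError("document does not declare the Schema.org context")
    schema_type = document.get("@type")
    if schema_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported Schema.org type: {schema_type!r}")
    missing = [name for name in REQUIRED_PROPERTIES if name not in document]
    if missing:
        raise ValueError(f"missing required properties: {missing}")
    return ContentItem(
        schema_type=schema_type,
        title=document["headline"],
        body=document["articleBody"],
        source=source,
    )


# Two made-up source systems sending differently typed content.
incoming = [
    ("marketing-cms", {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": "Quarterly update",
        "articleBody": "Our department shipped three features.",
    }),
    ("newsroom-cms", {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": "Company acquires a partner",
        "articleBody": "The acquisition closes next month.",
    }),
]

items = [ingest(document, source) for source, document in incoming]
for item in items:
    print(item.source, item.schema_type, item.title)
```

Rejecting documents that fail validation, rather than guessing at missing fields, keeps the contract between the source and destination teams explicit.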

I want to pause again and state that the real value behind a Schema.org-First approach is not about technology or saving costs - it’s about making it easier for people to excel at their jobs. When I discuss how a Schema.org-First approach helps with acquiring a company's website, I think it also provides a way to onboard the acquired company's web content team. A successful and collaborative migration strategy provides an immediate and easy win for everyone involved and allows the acquired company's team to become part of the unified Content Management System.

Schema.org and Schema-first development's common goal is collaboration that makes it easier for people and organizations to be successful.

Conclusion

A Schema.org-First approach makes it easier to connect data and people.

Schema.org provides a standard to connect and structure web content. A Schema-First approach establishes an immediate connection between the people who are collaborating to build software. Using a Schema.org-First approach will create a single source of truth for an organization's web content and move its data and teams towards a unified Content Management System.

Getting organizations to leverage Schema.org while adopting a Schema-First approach is a big challenge. The "API Evangelist" has some helpful thoughts and contemplations about "Why Schema.org Does Not See More Adoption Across The API Landscape?". I also highly recommend reading Kristopher Sandoval's blog post, Using a Schema-First design as your single source of truth.

If you are thinking about structured data and migrating to a unified Content Management System, which becomes your organization's single source of truth, my hope is that you will think about using a Schema.org-First approach.