Exploring a Schema-First approach to Drupal and Content Management Systems

· Schema,Drupal,CMS,Schema-First

What is the future?

Our future user experience with computers is going to be significantly based on the intersection of personalization and voice assistants. Simply put, voice assistants are going to completely understand every question being asked, find the perfect personalized answer to every question, and empathetically return an appropriate response.

I have become slightly obsessed with voice assistants. In my limited free time, I am exploring both Alexa and Google Assistant. Both technologies have their strengths, weaknesses, and differences, but their underlying user interaction and even their back-end code amount to glorified if/then statements. For example, “if” an end-user asks a question like “What is the weather?”, “then” the back-end code looks up the weather for the user’s current location and returns a response.
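To make the “glorified if/then” point concrete, here is a minimal sketch in Python. It is not how Alexa or Google Assistant are actually implemented; the function names, the canned weather data, and the keyword matching are all assumptions made for illustration.

```python
def get_weather(location: str) -> str:
    """Hypothetical stand-in for a real weather lookup; returns canned data."""
    canned = {"Boston": "sunny and 72 degrees"}
    return canned.get(location, "unavailable")


def handle_utterance(utterance: str, user_location: str) -> str:
    """Route a spoken question to a response using a plain if/then rule."""
    text = utterance.lower()
    if "weather" in text:
        # "if" the user asks about the weather, "then" look it up and answer
        return f"The weather in {user_location} is {get_weather(user_location)}."
    # Anything the rules do not cover falls through to a generic reply
    return "Sorry, I didn't understand that."


print(handle_utterance("What is the weather?", "Boston"))
```

Real voice platforms add speech recognition and intent matching in front of this step, but the handler behind each intent tends to have this shape.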

One of the key challenges for building useful voice assistant applications is conversational design. Although we all know how to have a conversation, designing one for a machine is a new problem that designers and developers will need to work through together. A secondary challenge that I am noticing for creating engaging voice user experiences is providing the data behind the voice. Organizations are going to have to restructure their data to be more omnichannel and consumable by voice applications. For example, Mayo Clinic recently discussed how they had to rethink their editorial process to create content that is more distributable to voice channels.

As we begin to develop voice assistant applications and strive to build personalized user experience, everyone is going to come to the realization that we need to rethink how we structure, share, and consume data. If we collectively want to succeed, we need to collaborate and work together to define and implement standardized data structures.

Defining, standardizing, and structuring our data using Schema.org

The internet and modern computing exist because collectively we have become good at collaboratively creating and sharing open standards and open-source software. Schema.org is one of the more recent standards to emerge, and the blog post excerpt below establishes who is behind Schema.org and why these organizations are working together.

“On June 2nd (2011) we announced a collaboration between Bing, Google and Yahoo to create and support a standard set of schemas for structured data markup on web pages. Although our companies compete in many ways, it was evident to us that collaboration in this space would be good for each search engine individually and for the industry as a whole.”
-- http://blog.schema.org/2011/07/on-june-2-nd-we-announced-collaboration.html 

The homepage of Schema.org succinctly defines what Schema.org is.

“Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.”
-- https://schema.org/

Based on the popularity of Schema.org, there is little need for any further explanation or even examples because everyone is implementing it…we are just doing it wrong. “Wrong” is a very harsh and critical word, and I hesitate to ever use it. At the same time, in this context I am now required to justify its usage and illustrate the mistake that we are collectively making when thinking about Schema.org.

First off, for many existing websites and internet applications, schema is an afterthought that is being implemented on top of our existing web pages and content. Fortunately, adding schema to our web pages usually results in improved SEO and Google page ranking. If we spend the time implementing speakable schema, we can also have our content promoted in voice applications.
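As a rough illustration of what “implementing speakable schema” involves, the sketch below builds a JSON-LD description of an article and marks which parts of the page are suitable for text-to-speech. The `speakable` property and the `SpeakableSpecification` type with its `cssSelector` property come from Schema.org; the helper function names, the example URL, and the CSS selectors are my own assumptions.

```python
import json


def build_speakable_article(headline: str, url: str, css_selectors: list[str]) -> dict:
    """Describe an article and flag the page sections that read well aloud."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the headline and summary in the page markup
            "cssSelector": css_selectors,
        },
    }


def to_script_tag(data: dict) -> str:
    """Wrap JSON-LD in the script tag that is embedded in the page."""
    return '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"


article = build_speakable_article(
    "Exploring a Schema-First approach",
    "https://example.com/schema-first",  # placeholder URL
    [".article-headline", ".article-summary"],  # hypothetical selectors
)
print(to_script_tag(article))
```

Notice that the markup is generated from content that already exists, which is exactly the “afterthought” pattern described above.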

To address the growing need to include schema within our webpages, every enterprise CMS offers some mechanism to add schema to generated content. Drupal has a Schema Metatag module. WordPress has a Schema plugin. Adobe Experience Manager has the ability to create, add, and manage metadata schemas. Laying schema on top of our existing content architecture feels more like a workaround or band-aid to the challenge of collectively defining, standardizing, and structuring our data. Everyone is still defining their unique content architecture.

We are failing to address one of the biggest challenges in computer science... naming things

If we step back and think about it, there are many different ways to describe something as simple as an image object. For example, should the text associated with the image be called title, caption, label, or text, and then how do we want to describe the image’s URI and metadata?

Would it be possible for everyone to standardize on one canonical definition of an image object? What would happen if a content management system adopted a Schema-First approach?
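Schema.org already offers one candidate for that canonical definition: the ImageObject type, with properties such as `name`, `caption`, `contentUrl`, `encodingFormat`, `width`, and `height`. The sketch below normalizes a few ad hoc field names onto those properties. The legacy field names and the choice of which one maps to which property are assumptions for illustration, not a recommended mapping.

```python
# Hypothetical legacy field names mapped to Schema.org ImageObject properties.
FIELD_MAP = {
    "title": "name",
    "label": "name",
    "caption": "caption",
    "src": "contentUrl",
    "uri": "contentUrl",
    "mime": "encodingFormat",
    "width": "width",
    "height": "height",
}


def to_image_object(legacy: dict) -> dict:
    """Convert an ad hoc image record into a Schema.org ImageObject dict."""
    image = {"@context": "https://schema.org", "@type": "ImageObject"}
    for field, value in legacy.items():
        prop = FIELD_MAP.get(field)
        # Skip unknown fields; the first legacy field seen wins on a collision
        if prop is not None and prop not in image:
            image[prop] = value
    return image


print(to_image_object({"label": "Sunset", "src": "https://example.com/sunset.jpg"}))
```

In a Schema-First system this translation layer would not exist, because the content model would use the Schema.org names from the start.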

A Schema-First approach

Several people have begun talking about the benefit of a well-defined schema at the very beginning of a project.

“I think it would be fantastic if most vendors would be able to use schema.org as a starting point.”
-- Using schema.org as a starting point for (headless) WCM 

“What’s the solution, then? Establishing a single source of truth through the use of schema-first design can help align software development. Identifying a common goal, and establishing a source for teams and processes to align themselves with is incredibly important, and in this piece, we’ll give you the understanding, and some tools, to do exactly that.”
-- Using A Schema-First Design As Your Single Source of Truth

As we start creating decoupled, headless content management solutions, it becomes more critical that the front-end teams get the expected data in the expected format. “Communication” is the single word that best describes the overarching benefit to a Schema-First approach. The idea that our websites and applications communicate using the same data structures would make the concept “Omnichannel publishing” a thing of the past and change how we syndicate and aggregate information.

Besides “Communication,” there are several secondary benefits worth highlighting and using as arguments with stakeholders who need to understand the benefits of going Schema-First.

SEO

“Google uses structured data that it finds on the web to understand the content of the page, as well as to gather information about the web and the world in general.”
-- https://developers.google.com/search/docs/guides/intro-structured-data

Everyone, especially site owners and marketers, wants a website with excellent SEO, which increases the website's overall Google page ranking. Schema.org’s structured data is specifically designed to help improve how search engines understand shared content. SEO is the reason most organizations have added schema to their web pages. Now, we are just taking it one step further and building web pages on top of a well-defined shared schema.

Omnichannel

“Omnichannel is a cross-channel content strategy that organizations use to improve their user experience and drive better relationships with their audience across points of contact.”
-- https://en.wikipedia.org/wiki/Omnichannel

The COPE (Create Once, Publish Everywhere) approach to content management, which has been rebranded as “Omnichannel,” has helped people understand the importance of creating and distributing content. COPE and Omnichannel are still focused on an organization reaching users across that organization’s own contact points. A Schema-First approach would make the “everywhere” in COPE mean “everyone”. Everyone inside and outside an organization would be able to share and consume data.

Syndication & Aggregation

Any organization that has implemented schema on their web content has seen the benefit: their content is more accessible within search results and voice applications. Still, it is very hard for people to conceptualize what it would be like if websites and applications could seamlessly push and pull data from one website to another. It is hard to imagine that every single webpage with biographical or location information could have the data structured in the same way.

Even though we are creating API-First Content Management Systems, developers still have to document the data being syndicated, and the developers aggregating this data still need to understand, transform, and consume it. It pains me to admit that many organizations’ internal teams still struggle with sharing and consuming data.

Shouldn’t an organization’s press release be automatically consumable by any application?

An amazing proof-of-concept would be a Schema-First CMS being able to pull in any NewsArticle from a website, like The New York Times, which implements Schema.org for their articles. A content manager should be able to cut and paste a URI, and with absolutely no code to normalize or massage the data, the external content should be available within the CMS.
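The consuming half of that proof-of-concept can be sketched with nothing but the Python standard library, assuming the publisher embeds its Schema.org data as JSON-LD in a script tag. The class and function names and the sample HTML below are hypothetical, and fetching the page from a URI is left out so the example runs offline.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # ignore malformed blocks in this sketch
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def find_news_articles(html: str) -> list[dict]:
    """Return every top-level JSON-LD item whose @type is NewsArticle."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "NewsArticle"
    ]


SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle",
 "headline": "Example headline", "datePublished": "2019-10-01"}
</script>
</head><body><p>Story text.</p></body></html>
"""

for article in find_news_articles(SAMPLE_HTML):
    print(article["headline"], article["datePublished"])
```

This sketch only handles a plain string `@type` at the top level; pages that nest items under `@graph` or list several types would need more handling. The point is that, once the data is found, no site-specific normalization is needed because the property names are already shared.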

Limitations

“Now, one thing that schema.org won’t do for you is map to your product strategy, content strategy or layout.”
-- https://markdemeny.com/2019/09/using-schema-org-as-a-starting-point-for-headless-wcm/

A Schema-First approach is not a complete solution, but it provides a rock-solid foundation for building out an organization’s digital strategy and user experience. We also have to recognize that Schema.org is continually evolving, improving, and even deprecating some entities and properties. Fortunately, the Drupal community, through its upgrade from Drupal 8 to Drupal 9, is learning how to evolve and properly deprecate code and data structures. As one of my next steps, I want to explore prototyping a Schema-First Content Hub/Repository using Drupal.
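Schema.org records a deprecation by marking the old term as superseded by a newer one; for example, `actors` is superseded by `actor`, `awards` by `award`, and `blogPosts` by `blogPost`. A Schema-First repository would need some way to carry stored content across such changes. The following is a minimal sketch of that idea with a small hand-picked table; a real implementation would presumably read the full list from the published Schema.org vocabulary rather than hard-code it, and the function name is my own.

```python
# A few superseded Schema.org properties and their replacements (hand-picked sample).
SUPERSEDED = {
    "actors": "actor",
    "awards": "award",
    "blogPosts": "blogPost",
}


def upgrade_properties(item: dict) -> dict:
    """Return a copy of item with superseded property names replaced."""
    # Keep every property that is still current
    upgraded = {key: value for key, value in item.items() if key not in SUPERSEDED}
    for old, new in SUPERSEDED.items():
        if old in item:
            # Do not overwrite a value already stored under the current name
            upgraded.setdefault(new, item[old])
    return upgraded


print(upgrade_properties({"@type": "Blog", "blogPosts": ["First post"]}))
```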

Next steps

Analyze, Prototype, Evangelize

The concept of taking a Schema-First approach to the Information Architecture behind a Content Management System is not going to be difficult to sell; the challenge is going to be implementing it successfully.

Analyze

Schema.org is an evolving standard, and its shortcomings need to be analyzed and discussed. Examining how we are transforming our legacy content structures to conform to Schema.org’s specifications could show us what is missing from Schema.org’s specifications. We also need to determine what data should and should not be modeled via schema. For example, we should also think about how to manage presentation information and where it should live.

I am more of an implementation person and not necessarily a specification person. Everyone has their strengths and weaknesses. This is why I frequently use the word collaboration throughout my blog posts, to encourage everyone to contribute. The best feedback that I can provide is going to come from implementing and prototyping a Schema-First content management solution. Fortunately, there are people like Peter F. Patel-Schneider at Nuance Communications, Inc. analyzing Schema.org (Article - Video).

Prototype

Of course, I am guiltily optimistic in thinking that Drupal and its community can solve any problem. At the very least, Drupal is the open-source leader for enterprise content management and user experiences. And fortunately, I am not alone in thinking that Drupal could be a forerunner for a Schema-First CMS.

“So that’s why I’ve been thinking about how it would be if we just had one CMS which would, obviously, be perfect. The perfect CMS, or PCMS, for short...PCMS doesn’t exist, but many of the concepts mentioned in this article do. Drupal has something called the Content Construction Kit which brings ‘custom fields’ to a new level.”
-- A proposal for a perfect CMS

For me, I want to follow up this post with a proposal on how we can prototype a Schema-First implementation of a decoupled Drupal application, or maybe even more specifically, a “Content Hub/Repository.”

Evangelize

For open source and open standards to succeed, we collectively need to evangelize our thoughts and ideas. I know some organizations have invested time and resources in building better content models based on Schema.org. Organizations need to share their experience and get involved by encouraging clients to use and improve these standards.

For now, when your organization is layering Schema.org on top of your existing websites, you should think about the possibility of a single way for everyone to organize, structure, and share data. I believe there is one, and Schema.org might be the solution.