We often get asked what exactly we mean when we talk about creating a semantic framework for data intelligence in a firm or organization. It's a great question.
First, let's consider what it doesn't mean:
1. It doesn't mean replacing all of your relational databases with knowledge graphs.
2. It doesn't mean creating complex ontologies for every concept and process in your firm.
3. It doesn't mean putting all your data into one large repository.
4. It doesn't mean overhauling your data framework overnight. The beauty of a semantic approach is that it is not all or nothing. It is flexible and can evolve.
So what do we mean? We mean:
1. Standing up a knowledge graph in which we can deposit standardized and normalized data from around the firm.
2. Establishing a way to create, share, and extend key concepts as extensible models that can be reused in many places. This provides a framework for integrating data from different sources and linking it to related information. The library of models becomes a tool people can use to normalize their data to a shared standard before they deposit it in the graph (a minimal sketch of items 1 and 2 appears after this list).
3. Creating a semantics-based approach to tracking your data assets. This might start with deploying your firm's data catalogs and glossaries in a semantic framework, then linking those resources to data assets across the firm. The result is a metadata repository in which users can quickly see where data can be found and the shared meaning it carries.
4. Developing ways to share information about the ETL jobs used to move data from structured sources into your graph. This lets teams keep their data in legacy systems while still sharing and integrating it on demand (the second sketch after this list combines items 3 and 4).
5. Using the resources developed in (3) and (4), creating a semantic feature store: a store that lets machine learning specialists take advantage of the conceptual overlap others have already identified, along with the normalization tools already built, so they can quickly pull normalized/harmonized data into their ML environments on an as-needed basis to derive new insights (see the third sketch after this list).
6. Developing a process to update your graphs with insights derived from ML and other analyses. New insights then don't just live in a report or an analyst's notebook; they become fresh information that can be reused to derive further insights. In a semantic environment, metadata is data, and insights derived from data can become data too!
7. Finding ways to make the data in your graph accessible to all users in the firm. This may mean creating purpose-built APIs or query interfaces for different use cases so that different users can find and understand the datasets across your knowledge graphs (see the last sketch after this list).
8. Finding ways to equip all data practitioners in the firm to add new taxonomies and data sources to the framework. Yes, this means making your structured sources more shareable and much easier to find, but it also means fostering a culture in which your people are educated on the best practices and tools.
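To make items 1 and 2 concrete, here is a minimal sketch of what a shared model and a deposit into the graph could look like. It uses rdflib in Python; the example.com namespaces and the Customer/accountBalance terms are hypothetical illustrations, not a prescribed ontology.

```python
# Sketch of items 1 and 2: a small shared model plus a record normalized to it
# before being deposited in the graph. All names here are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS, XSD

EX = Namespace("http://example.com/model/")   # hypothetical shared-model namespace
DATA = Namespace("http://example.com/data/")  # hypothetical data namespace

g = Graph()
g.bind("ex", EX)

# A shared, extensible concept that many teams can reuse and extend.
g.add((EX.Customer, RDF.type, RDFS.Class))
g.add((EX.accountBalance, RDF.type, RDF.Property))
g.add((EX.accountBalance, RDFS.domain, EX.Customer))
g.add((EX.accountBalance, RDFS.range, XSD.decimal))

# A record from a source system, expressed in the shared terms before deposit.
customer = DATA["customer/12345"]
g.add((customer, RDF.type, EX.Customer))
g.add((customer, EX.accountBalance, Literal("1520.75", datatype=XSD.decimal)))

print(g.serialize(format="turtle"))
```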
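For items 3 and 4, a catalog entry, the glossary term that gives it shared meaning, and the ETL job that produced it can all live in the same graph and be queried together. Again a sketch: the dataset, term, and job identifiers are made up, and DCAT, SKOS, and PROV are just one reasonable choice of vocabularies.

```python
# Sketch of items 3 and 4: catalog metadata, a glossary term, and ETL job
# provenance, all queryable in one place. Identifiers are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCAT, DCTERMS, PROV, RDF, SKOS

CAT = Namespace("http://example.com/catalog/")  # hypothetical catalog namespace

g = Graph()

# Catalog entry and the glossary term it is linked to.
g.add((CAT.trade_history, RDF.type, DCAT.Dataset))
g.add((CAT.trade_history, DCTERMS.title, Literal("Trade history (settled trades)")))
g.add((CAT.trade_history, DCTERMS.subject, CAT.term_settled_trade))
g.add((CAT.term_settled_trade, RDF.type, SKOS.Concept))
g.add((CAT.term_settled_trade, SKOS.prefLabel, Literal("Settled trade")))

# Provenance of the ETL job that moved the data out of its structured source.
g.add((CAT.etl_run_42, RDF.type, PROV.Activity))
g.add((CAT.etl_run_42, PROV.used, CAT.trades_rdbms))
g.add((CAT.trade_history, PROV.wasGeneratedBy, CAT.etl_run_42))

# "Where can I find settled-trade data, and which job loaded it?"
query = """
    SELECT ?dataset ?job WHERE {
        ?dataset dcterms:subject ?term .
        ?term skos:prefLabel "Settled trade" .
        ?dataset prov:wasGeneratedBy ?job .
    }
"""
for row in g.query(query, initNs={"dcterms": DCTERMS, "skos": SKOS, "prov": PROV}):
    print(row.dataset, row.job)
```

The point is not these particular vocabularies but that catalog entries, glossary terms, and job provenance share one queryable fabric instead of living in separate tools.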
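For items 5 and 6, the same graph can feed an ML environment and receive insights back. The sketch below assumes the shared model from the first sketch, a hypothetical firm_graph.ttl export, and a made-up churnScore property; the scores are placeholders for whatever your models actually produce.

```python
# Sketch of items 5 and 6: pull harmonized data from the graph into an
# ML-friendly table, then write a derived insight back as new triples.
import pandas as pd
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import XSD

EX = Namespace("http://example.com/model/")

g = Graph()
g.parse("firm_graph.ttl", format="turtle")  # hypothetical export of the firm graph

# Pull normalized features on demand into a DataFrame.
rows = g.query(
    "SELECT ?customer ?balance WHERE { ?customer a ex:Customer ; ex:accountBalance ?balance }",
    initNs={"ex": EX},
)
features = pd.DataFrame(list(rows), columns=["customer", "balance"])

# ...train a model on `features`, then write the insight back to the graph...
scores = [0.12] * len(features)  # placeholder for real model output
for customer, score in zip(features["customer"], scores):
    g.add((customer, EX.churnScore, Literal(score, datatype=XSD.double)))

g.serialize("firm_graph.ttl", format="turtle")
```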
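For item 7, a thin API can translate everyday requests into graph lookups for users who will never write SPARQL. Flask, the route shape, and the namespaces here are illustrative assumptions, not a prescribed design.

```python
# Sketch of item 7: a small API that turns a plain REST call into a graph lookup.
from flask import Flask, jsonify
from rdflib import Graph, Namespace, URIRef

EX = Namespace("http://example.com/model/")   # shared model from the first sketch
DATA = "http://example.com/data/customer/"    # hypothetical data namespace

app = Flask(__name__)
g = Graph()
g.parse("firm_graph.ttl", format="turtle")    # hypothetical export of the firm graph

@app.route("/customers/<customer_id>/balance")
def customer_balance(customer_id):
    # Look up the shared accountBalance property for the requested customer.
    balance = g.value(URIRef(DATA + customer_id), EX.accountBalance)
    if balance is None:
        return jsonify({"error": "unknown customer"}), 404
    return jsonify({"customer": customer_id, "accountBalance": float(balance)})

if __name__ == "__main__":
    app.run(port=8080)
```

With a request like `curl http://localhost:8080/customers/12345/balance`, a business user or another application gets the graph's answer without ever touching SPARQL.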
At Reclassify AI we have experience with the tools, processes and people needed to realize the vision outlined above. Give us a call to discuss how we can kick it off in your firm.