Looking Ahead: Working with AI Data Steward Assistants

By: Malcolm Chisholm

Using GenAI, it is possible to develop AI assistants for specific domains of data. Let’s explore what an AI
Assistant for a Data Steward looks like. As most of us know, recruiting individuals to be Data
Stewards isn’t always an easy task. There is understandable resistance, as most people don’t want to take on
extra work and new, unfamiliar responsibilities. In Data Governance we can try to mitigate this
resistance with education, workflows, other forms of automation, Organizational Change Management
and more. As we have seen, assigning Data Stewards gets easier if more than one person shares the
stewardship. Unfortunately, budgets and time constraints mean this is not always an option, and this is
where an AI Data Steward Assistant comes in.

An AI Data Steward Assistant (ASDA) can be fine-tuned with knowledge about a specific domain. This
means documents are internalized, as are data models, metrics, standards, policies and more. The
ASDA is then further trained through reinforcement learning from human feedback, whereby the Data
Steward chats with the ASDA. After the initial fine-tuning, new knowledge must be fine-tuned into the
ASDA at an appropriate frequency (weekly, monthly, quarterly, etc.) to keep it up to date.

Now that your enterprise has an ASDA for a particular domain of data, what does this give you?

1) A chatbot that can answer questions 24/7. Always online, always available.

2) ASDA can be used in multi-agent systems when knowledge about multiple domains is needed for a
multi-domain solution. In our opinion at Data Millennium, this alone is a tremendous reason to develop
ASDA, as multi-agent systems will proliferate soon and enterprises need ASDA (and other AI Agents) in
place for this.

3) To take the effectiveness of an ASDA to the next level, it needs to be fully integrated into a Data
Catalog. The Data Catalog should have workflows, and other automation, that integrate directly with
the ASDA. The goal is to take as much pressure and load off the Data Steward as possible. The ASDA
should be able to answer questions other users may have. The ASDA should also be able to periodically
check content and suggest new content in the Data Catalog.
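
To make the periodic check in point 3 a little more concrete, here is a minimal sketch in Python of what such a check might look like. Everything in it is an assumption for illustration: the CatalogEntry record, the suggest_reviews function and the 90-day review window are hypothetical and do not come from any particular Data Catalog product. A real ASDA would read entries through the catalog's own interface and would likely use the model itself to draft the suggested content.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class CatalogEntry:
    """Hypothetical, simplified stand-in for one Data Catalog record."""
    name: str
    description: str
    last_reviewed: date


def suggest_reviews(entries, today, max_age_days=90):
    """Return (entry name, reason) pairs for the Data Steward to look at.

    The 90-day default is an assumed review window, not a standard.
    """
    suggestions = []
    for entry in entries:
        if not entry.description.strip():
            # No description at all: a candidate for suggested new content.
            suggestions.append((entry.name, "missing description"))
        elif today - entry.last_reviewed > timedelta(days=max_age_days):
            # Described, but not reviewed within the assumed window.
            suggestions.append((entry.name, "review overdue"))
    return suggestions


# Made-up sample entries, purely for illustration.
entries = [
    CatalogEntry("customer_id", "Unique customer key", date(2024, 1, 10)),
    CatalogEntry("churn_score", "", date(2024, 5, 1)),
    CatalogEntry("region_code", "Sales region code", date(2024, 5, 20)),
]

for name, reason in suggest_reviews(entries, today=date(2024, 6, 1)):
    print(f"{name}: {reason}")
```

With the sample entries above, the check flags customer_id as overdue for review and churn_score as missing a description, and leaves region_code alone. The point of the sketch is the division of labor: the ASDA surfaces what needs attention on a schedule, and the Data Steward only has to approve or correct it.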

In conclusion, if the ASDA is correctly implemented, the enterprise has an extremely valuable resource it
can utilize. For reasons of brevity, we have skipped over some of the finer details of ASDA development.
However, we strongly suggest using a Small Language Model (SLM) chatbot as the foundational model, for
a variety of reasons. Some of these reasons are listed in this posting from Splunk – LLMs vs. SLMs: The
Differences in Large & Small Language Models.

As GenAI initiatives launch, AI Agents such as ASDA may very well be created anyway, with or without
the support of Data Governance. To stay relevant and one step ahead, Data Governance should be
championing and pioneering AI Agents such as ASDA. If you or your organization are interested in being
one step ahead, then let’s have a discussion about the finer details of developing an ASDA, AI Agents
and/or Data Governance-centric GenAI.

Message us through LinkedIn or email at:
contact@datamillennium.com

Tagged data catalogs, Data Governance, Data Steward, GenAI
