The Four Pillars of AI Development

I’ve worked on AI projects in both academia and practice.

There are significant differences between these two contexts. In academia, the focus is often on developing new models and algorithms, with less emphasis on deploying them in real-world scenarios. In practice, however, the focus is on building AI systems that can solve real-world problems and deliver business value.

In my experience, the success of AI projects in practice depends on four key pillars. Whether you are a product manager or an ML engineer, everybody on the project team needs to have an understanding of these factors. It is even more critical if you are a full-stack AI practitioner, who has responsibility for all project phases and parts of the tech stack.

Now, let’s dive in.

Pillar #1: Well-Defined Business Case

To ensure the success of an AI project, it's crucial to align the expectations of developers and all stakeholders by defining the underlying business case. This process should be executed for each AI project, whether it's an AI-powered product or feature.

At a high level, defining the business case involves the following steps:

  1. Generate potential problems that can be solved (better) with AI.

  2. Conduct a feasibility study (research only, no coding!) to draft an AI-based solution for the problem.

  3. Estimate the ROI of each candidate project based on the desired change in its primary business metric. It's vital to assign monetary values here.

  4. Select the project with the highest ROI and complete the ML canvas for it. This forces you to specify both the business and technical aspects of the project.

By the end of this process, all stakeholders should have a shared understanding of what to expect in the given project.
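
To make steps 3 and 4 concrete, here is a minimal sketch of how candidate projects could be ranked by ROI. The project names, the numbers, and the simple formula (gain minus cost, divided by cost) are illustrative assumptions of mine, not a prescribed method. In a real project you would plug in your own estimates and probably a time horizon as well.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A candidate AI project. All fields are hypothetical estimates."""
    name: str
    metric_uplift: float   # expected yearly change in the primary business metric
    value_per_unit: float  # monetary value of one unit of that metric
    cost: float            # estimated cost to build and run over the same year


def roi(c: Candidate) -> float:
    """Simple ROI: (monetary gain - cost) / cost."""
    gain = c.metric_uplift * c.value_per_unit
    return (gain - c.cost) / c.cost


def rank_by_roi(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates sorted from highest to lowest ROI."""
    return sorted(candidates, key=roi, reverse=True)


# Made-up example portfolio
candidates = [
    Candidate("churn prediction", metric_uplift=400, value_per_unit=250.0, cost=60_000.0),
    Candidate("ticket routing", metric_uplift=5_000, value_per_unit=4.0, cost=8_000.0),
    Candidate("demand forecast", metric_uplift=120, value_per_unit=300.0, cost=45_000.0),
]

for c in rank_by_roi(candidates):
    print(f"{c.name}: ROI = {roi(c):.2f}")
```

The point of the exercise is less the exact number and more that every candidate gets a monetary estimate that stakeholders can challenge.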

Pillar #2: End-User Focus

In my opinion, a lack of end-user focus is the main failure point in AI projects, period. I make this mistake all the time.

AI as a technology has unique characteristics which need to be accounted for in UX design:

  • Probabilistic nature: AI is primarily built by training statistical machine learning algorithms on noisy and biased datasets. This can result in error-prone behavior and undesirable outputs.

  • Complexity: Due to huge datasets and model sizes, understanding model outputs is very difficult. This creates additional cognitive effort for the user.

  • Capabilities: AI can take over tasks that were previously reserved for humans (e.g., generate images). Thus, new design patterns for user interfaces might emerge.

To get the AI user experience right, start the design process early by running end-user studies with low-fidelity prototypes. These prototypes can be mockups or simple web apps, and the AI can be mocked by fixed rules or a naive baseline model. This way, AI practitioners can check that the system meets the needs of users and identify potential issues early in the process.
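
As a rough sketch of what "mocking the AI with fixed rules" can look like, the example below hides a keyword rule behind the same interface a trained model would later implement. The use case (routing support messages by sentiment), the class names, and the keyword list are all hypothetical and only serve as an illustration.

```python
from typing import Protocol


class SentimentModel(Protocol):
    """Interface the prototype depends on (hypothetical)."""
    def predict(self, text: str) -> str: ...


class RuleBasedMock:
    """Stands in for the real model during early user studies."""
    NEGATIVE_WORDS = {"refund", "broken", "cancel", "angry"}

    def predict(self, text: str) -> str:
        # Fixed rule: any negative keyword makes the message "negative".
        words = set(text.lower().split())
        return "negative" if words & self.NEGATIVE_WORDS else "positive"


def handle_message(model: SentimentModel, text: str) -> str:
    """Prototype logic that only knows about the interface, not the model."""
    if model.predict(text) == "negative":
        return "Escalate to a human agent"
    return "Send automated reply"


print(handle_message(RuleBasedMock(), "I want a refund"))
```

Because the prototype only talks to the interface, the mock can be swapped for a trained model later without touching the user-facing part.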

Pillar #3: Data-Centric Approach

Data-centric AI is a term coined by AI pioneer Andrew Ng. The discipline aims to "systematically engineer data to build an AI system". In practice, this means that in all phases of an AI project, from prototype to deployment, you should prioritize data above all else. The ultimate goal is to build a machine that can learn continuously, even when the engineers are on vacation.

Data-centric AI revolves around several key tasks:

  • Data labeling: Focus on creating diverse and high-quality datasets by evaluating label quality.

  • Error analysis: Identify common failure cases of your model and update your datasets accordingly.

  • Data monitoring: Continuously monitor the quality and distribution of incoming data.

Spend most of your time creating high-quality datasets, and reduce the time spent on model tuning. Use AutoML solutions to derive baselines quickly since ML models have become commoditized these days.
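
Error analysis does not need heavy tooling to get started. The sketch below groups predictions by a metadata field and reports the error rate per slice, so you can see where additional or better-labeled data is needed. The record layout (`label`, `prediction`, and a `source` field) is an assumption for illustration.

```python
from collections import defaultdict
from typing import List, Tuple


def error_rate_by_slice(records: List[dict], slice_key: str) -> List[Tuple[str, float, int]]:
    """Return (slice, error rate, sample count) tuples, worst slice first."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for r in records:
        s = r[slice_key]
        totals[s] += 1
        if r["label"] != r["prediction"]:
            errors[s] += 1
    rows = [(s, errors[s] / totals[s], totals[s]) for s in totals]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up evaluation records
records = [
    {"source": "mobile", "label": "cat", "prediction": "dog"},
    {"source": "mobile", "label": "dog", "prediction": "dog"},
    {"source": "mobile", "label": "cat", "prediction": "dog"},
    {"source": "web", "label": "cat", "prediction": "cat"},
    {"source": "web", "label": "dog", "prediction": "dog"},
]

for name, rate, count in error_rate_by_slice(records, "source"):
    print(f"{name}: {rate:.0%} errors over {count} samples")
```

Slices with a high error rate and few samples are usually the first candidates for targeted data collection or a label review.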

Pillar #4: Appropriate Operationalization

Once you have built a prototype based on a well-performing model and a user-friendly interface, it is time to put your product into production.

Fortunately, we can rely on proven software engineering principles. The field of MLOps is all about applying reproducibility, automation, versioning, and collaboration to ML-specific artifacts, namely data and models. While best practices are still emerging, there is a wealth of information available to build on.

Warning: As engineers, we often fall into the trap of over-engineering. The amount of required automation for an AI project depends heavily on the use case. If your input data distribution does not change often, you probably do not need an automated retraining pipeline. Manual retraining once a month will most likely suffice.

As a general guideline, it is best to focus primarily on setting up appropriate monitoring. This will enable you to detect issues quickly and collect additional data for improving your product.
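
One lightweight way to monitor input data is to compare the distribution of a feature in production against the training data. The sketch below uses the population stability index (PSI), which is my choice for illustration and not the only option. The bin count and the alert threshold of 0.25 (a commonly cited rule of thumb) are assumptions you should tune for your use case.

```python
import math
from typing import List, Sequence

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb value, adjust per use case


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index between two non-empty samples of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp outliers into the edge bins
            counts[idx] += 1
        # eps keeps empty bins from causing a division by zero or log(0)
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Made-up data: production values have shifted towards the upper half
training = [i / 100 for i in range(100)]
production = [0.5 + i / 200 for i in range(100)]

score = psi(training, production)
if score > ALERT_THRESHOLD:
    print(f"Input drift detected (PSI = {score:.2f}), consider retraining")
```

A check like this can run on a schedule and notify you when something changes, which is often enough to decide whether a manual retraining is due.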

Conclusion

If you focus on these four pillars, you will drastically increase your chances of success when building AI-powered products.

We will dive deeper into each of these pillars in the upcoming issues.
