Reinforcement Learning POC Process
1. Business Context
Any problem demands context from the business. A solution for one industry may not be applicable to another, nor is every business the same. Establishing shared context helps get the project off to the right start.
2. Domain Knowledge Transfer
Businesses are often experts in their own domain. This expertise is valuable for steering the design of any solution.
3. Problem Definition/Clarification
POCs usually start with only a vague idea of the problem to be solved. The problem definition often changes over time, becoming more concrete and adapting to what is possible given the data.
4. MDP Design
The formulation and definition of the Markov Decision Process is a crucial part of the solution design and is often refined over a number of iterations.
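As a rough illustration of what gets iterated on, an MDP can be written down as plain data: states, actions, transition probabilities, rewards and a discount factor. The sketch below is a minimal example under stated assumptions; the machine-maintenance problem, the MDP class, and every number in it are hypothetical and are not taken from a real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDP:
    """A small, explicit MDP description (hypothetical container)."""
    states: tuple
    actions: tuple
    transitions: dict  # (state, action) -> {next_state: probability}
    rewards: dict      # (state, action) -> expected immediate reward
    gamma: float = 0.95

    def validate(self):
        """Check that the definition is internally consistent."""
        for s in self.states:
            for a in self.actions:
                probs = self.transitions[(s, a)]
                # Next states must be known states.
                assert set(probs) <= set(self.states)
                # Probabilities for each (state, action) must sum to 1.
                assert abs(sum(probs.values()) - 1.0) < 1e-9
                assert (s, a) in self.rewards
        assert 0.0 <= self.gamma < 1.0
        return True


# Illustrative numbers only: a machine that earns less once it is worn.
toy_mdp = MDP(
    states=("healthy", "worn"),
    actions=("run", "repair"),
    transitions={
        ("healthy", "run"): {"healthy": 0.9, "worn": 0.1},
        ("healthy", "repair"): {"healthy": 1.0},
        ("worn", "run"): {"worn": 1.0},
        ("worn", "repair"): {"healthy": 1.0},
    },
    rewards={
        ("healthy", "run"): 10.0,
        ("healthy", "repair"): -5.0,
        ("worn", "run"): 2.0,
        ("worn", "repair"): -5.0,
    },
)
toy_mdp.validate()
```

Keeping the definition this explicit makes each design iteration a small, reviewable change to states, actions or rewards.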
5. Environment Definition/Creation
Defining the environment can take some time to get right because it must be representative of the real problem, and it is unusual to be allowed to experiment on a real system, so a simulated environment is usually built instead.
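A simulated environment usually exposes a reset/step interface that an agent can interact with. The following is a minimal sketch in plain Python, assuming a hypothetical machine-maintenance toy problem; the class name ToyMachineEnv, the state and action names, the rewards and the 10% wear probability are all illustrative assumptions.

```python
import random


class ToyMachineEnv:
    """Hypothetical simulator: a machine wears out unless it is repaired."""

    def __init__(self, horizon=50, seed=0):
        self.horizon = horizon          # episode length in steps
        self.rng = random.Random(seed)  # seeded for reproducible runs
        self.state = "healthy"
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.state = "healthy"
        self.t = 0
        return self.state

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        if action == "repair":
            reward, self.state = -5.0, "healthy"
        elif self.state == "healthy":
            reward = 10.0
            if self.rng.random() < 0.1:  # assumed chance of wearing out
                self.state = "worn"
        else:
            reward = 2.0  # a worn machine still runs, but earns less
        self.t += 1
        done = self.t >= self.horizon
        return self.state, reward, done


if __name__ == "__main__":
    env = ToyMachineEnv(horizon=5)
    state, done = env.reset(), False
    while not done:
        state, reward, done = env.step("run")
        print(state, reward, done)
```

Checking a simulator like this against domain experts' expectations is one way to judge whether it is representative enough.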
6. RL Agent Development
Development is taken in stages, first a baseline agent and then a more sophisticated RL agent, to ensure the MDP and environment actually represent the problem. It’s important to keep validating the solution at each stage.
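The staged approach can be sketched as follows: a hand-written baseline rule first, then a learned agent evaluated with the same rollout function. This is only an illustration under stated assumptions. The toy dynamics in step, the hyperparameters, and the choice of tabular Q-learning as the "sophisticated" agent are all hypothetical; a real POC may well use a different algorithm.

```python
import random

STATES = ("healthy", "worn")
ACTIONS = ("run", "repair")


def step(state, action, rng):
    """Hypothetical toy dynamics: returns (next_state, reward)."""
    if action == "repair":
        return "healthy", -5.0
    if state == "worn":
        return "worn", 2.0
    next_state = "worn" if rng.random() < 0.1 else "healthy"
    return next_state, 10.0


def baseline_policy(state):
    """Stage 1: a hand-written rule to sanity-check the MDP and simulator."""
    return "repair" if state == "worn" else "run"


def rollout(policy, steps=50, seed=0):
    """Total undiscounted reward of a policy over a fixed number of steps."""
    rng = random.Random(seed)
    state, total = "healthy", 0.0
    for _ in range(steps):
        state, reward = step(state, policy(state), rng)
        total += reward
    return total


def q_learning(steps=20000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Stage 2: tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = "healthy"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        next_state, reward = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update towards the one-step bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Turn a Q-table into a policy that picks the highest-valued action."""
    return lambda s: max(ACTIONS, key=lambda a: q[(s, a)])


if __name__ == "__main__":
    q = q_learning()
    print("baseline return:", rollout(baseline_policy))
    print("learned return: ", rollout(greedy_policy(q)))
```

If the learned agent cannot at least match the baseline on such a simple check, that usually points to a problem in the MDP or the environment rather than in the agent.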
7. Agent Evaluation and Analysis
Agents can take a while to train, especially in complex environments. In this phase we validate that the problem is viable and, ideally, produce promising results.
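One common way to make "promising" concrete is to compare the agent against a baseline over several random seeds. The sketch below is a minimal example under stated assumptions: the function names evaluate and compare are hypothetical, and rollout is assumed to be any callable that takes a policy and a seed and returns one episode's total reward.

```python
import statistics


def evaluate(rollout, policy, seeds):
    """Run one episode per seed and summarise the returns."""
    returns = [rollout(policy, seed) for seed in seeds]
    mean = statistics.mean(returns)
    # Standard error of the mean; requires at least two seeds.
    sem = statistics.stdev(returns) / len(returns) ** 0.5
    return {"mean": mean, "sem": sem, "n": len(returns)}


def compare(rollout, agent_policy, baseline_policy, seeds):
    """Paired comparison: both policies are run on the same seeds."""
    diffs = [
        rollout(agent_policy, seed) - rollout(baseline_policy, seed)
        for seed in seeds
    ]
    mean_diff = statistics.mean(diffs)
    sem_diff = statistics.stdev(diffs) / len(diffs) ** 0.5
    return mean_diff, sem_diff
```

A mean difference that is large relative to its standard error is a reasonable signal that the improvement is not just seed noise, although how large is large enough remains a judgment call for the project.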
8. Reporting
Once the models are validated, it’s time to report the results back to the stakeholders. After this phase we often either start looking at another problem or promote the POC to a fully-fledged reinforcement learning development project.