Reinforcement Learning Development Process
1. Problem Definition
A key phase where business problems are defined and prioritized. It is worth spending time to get this right, because a poorly defined problem can leave all subsequent effort ineffective and wasted.
2. Domain Knowledge Transfer and Infrastructure Setup
Businesses are often experts in their own domain, and this expertise helps direct future solutions. In this phase we also ensure any prerequisites are available, including working with our very own ML engineers and MLOps consultants to ensure the infrastructure meets our needs.
3. MDP Refinement
The definition of the MDP is crucial in RL projects. We often iterate over the MDP design to help improve performance.
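To make the idea of iterating over an MDP design concrete, here is a minimal sketch of a tabular MDP and a value-iteration solver. The `TabularMDP` class, the toy two-state problem, and the discount factor are illustrative assumptions, not part of any particular project; real MDPs are usually far larger and are rarely solved exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TabularMDP:
    """Hypothetical container for a small, fully specified MDP."""
    n_states: int
    n_actions: int
    # transitions[s][a] is a list of (probability, next_state, reward) tuples
    transitions: list
    gamma: float = 0.9


def value_iteration(mdp, tol=1e-8, max_iters=10_000):
    """Return approximate optimal state values via Bellman optimality backups."""
    values = [0.0] * mdp.n_states
    for _ in range(max_iters):
        new_values = [
            max(
                sum(p * (r + mdp.gamma * values[s2])
                    for p, s2, r in mdp.transitions[s][a])
                for a in range(mdp.n_actions)
            )
            for s in range(mdp.n_states)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


# Toy design (an assumption for illustration):
# state 0 = "idle", state 1 = "goal" (absorbing, no further reward)
# action 0 = stay (reward 0), action 1 = move to goal (reward 1)
toy = TabularMDP(
    n_states=2,
    n_actions=2,
    transitions=[
        [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # from state 0
        [[(1.0, 1, 0.0)], [(1.0, 1, 0.0)]],  # from state 1
    ],
)
values = value_iteration(toy)
```

Changing the reward values, the state set, or `gamma` and re-solving is a small-scale analogue of an MDP refinement iteration.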
4. Environment Development/Refinement
The environment, whether in simulation or in real life, needs refinement. Accurate simulations help narrow the sim2real gap, and updating environments to incorporate new information can significantly boost performance.
5. RL Data Analysis/Refinement
Like much of data science, understanding and appreciating the data is important. Refining what the agent can “see” significantly improves learning performance.
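One common way to refine what the agent "sees" is to normalize each observation feature so that features on very different scales become comparable. The sketch below keeps running statistics with Welford's algorithm; the `RunningNormalizer` class and the sample observations are hypothetical and only illustrate the idea.

```python
class RunningNormalizer:
    """Tracks a per-feature running mean and variance (Welford's algorithm)."""

    def __init__(self, size, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations
        self.eps = eps

    def update(self, obs):
        """Fold one observation (a sequence of floats) into the statistics."""
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs):
        """Return obs shifted and scaled to roughly zero mean, unit variance."""
        if self.count < 2:
            return list(obs)  # not enough data to estimate a variance
        out = []
        for i, x in enumerate(obs):
            variance = self.m2[i] / self.count  # population variance
            out.append((x - self.mean[i]) / (variance + self.eps) ** 0.5)
        return out


# Two features on very different scales (made-up numbers)
normalizer = RunningNormalizer(size=2)
normalizer.update([0.0, 100.0])
normalizer.update([2.0, 300.0])
scaled = normalizer.normalize([2.0, 300.0])
```

After normalization both features end up at about one standard deviation above their mean, even though their raw values differ by two orders of magnitude.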
6. RL Algorithm Development
Working on the actual RL algorithm takes a surprisingly small amount of development time, but it is often necessary, especially when improving policy models.
7. Agent Evaluation and Analysis
Thorough and robust evaluation practices are vital for directing development. These results are often shared with stakeholders as a representation of progress. Note how we often iterate back to the MDP refinement to apply new learnings.
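As a rough illustration of robust evaluation, the sketch below runs a policy over several fixed seeds and reports the mean return with its standard error, so that comparisons between agents are repeatable. The toy episode (reward 1 whenever the action matches `obs > 0.5`), the two policies, and all function names are assumptions made for this example only.

```python
import random
import statistics


def run_episode(policy, seed, horizon=20):
    """Toy episode: reward 1 when the action matches whether obs > 0.5."""
    rng = random.Random(seed)  # a seeded generator makes the episode repeatable
    total = 0.0
    for _ in range(horizon):
        obs = rng.random()
        action = policy(obs)
        total += 1.0 if action == (obs > 0.5) else 0.0
    return total


def evaluate(policy, seeds):
    """Return (mean return, standard error of the mean) across seeds."""
    returns = [run_episode(policy, s) for s in seeds]
    mean = statistics.mean(returns)
    if len(returns) > 1:
        stderr = statistics.stdev(returns) / len(returns) ** 0.5
    else:
        stderr = 0.0
    return mean, stderr


def threshold_policy(obs):
    return obs > 0.5  # optimal for this toy task


def constant_policy(obs):
    return True  # a weak baseline to compare against


seeds = list(range(10))
best_mean, best_stderr = evaluate(threshold_policy, seeds)
base_mean, base_stderr = evaluate(constant_policy, seeds)
```

Reporting the standard error alongside the mean helps show stakeholders whether a difference between two agents is larger than the seed-to-seed noise.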
8. Deployment and Monitoring
In the final phase we deploy and operate our agents. Do not underestimate this phase; there are a lot of pitfalls, especially when operating at scale. We collaborate with our very own expert team of MLOps consultants to help here.