This checklist is organized into two main categories:

System concerns: Address quality attribute concerns that apply to the system as a whole, without requiring special considerations for ML components compared to traditional components.

Process concerns: Focus on the effective management and execution of the development and maintenance process for ML-enabled systems, as well as concerns specific to architecting such systems.

Each check in the checklist is described using the following fields:

ID: A unique identifier that follows a naming convention: the capital initial of the main category (S for System, P for Process), followed by the capital initial of the subcategory (e.g., A for Availability) and an integer. Where a single initial would be ambiguous, a two-letter subcategory abbreviation is used instead (e.g., SMD for System–Modularity).

Name: A brief descriptive label summarizing the focus of the check.

Check: The specific question or statement to be considered during architectural decision-making.

ML Specificity: The degree to which the check is specific to ML-intensive systems, categorized as low (L), medium (M), or high (H).
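
The four fields above map naturally onto a small record type. The following is a minimal sketch rather than part of the checklist itself: the `Check` class, the `parse_check_id` helper, and the regular expression are illustrative assumptions derived from the naming convention described under ID.

```python
import re
from dataclasses import dataclass

# Assumed pattern: category initial (S or P), a one- or two-letter
# subcategory abbreviation, then a positive integer, e.g. "SA1" or "SMD2".
CHECK_ID_PATTERN = re.compile(r"^([SP])([A-Z]{1,2})([1-9][0-9]*)$")

CATEGORIES = {"S": "System", "P": "Process"}
SPECIFICITY_LEVELS = {"L": "low", "M": "medium", "H": "high"}


@dataclass(frozen=True)
class Check:
    """One checklist entry with the four fields described above."""
    id: str
    name: str
    check: str
    ml_specificity: str  # "L", "M", or "H"

    def __post_init__(self):
        parse_check_id(self.id)  # raises ValueError on a malformed ID
        if self.ml_specificity not in SPECIFICITY_LEVELS:
            raise ValueError(f"unknown ML specificity: {self.ml_specificity!r}")


def parse_check_id(check_id: str) -> tuple[str, str, int]:
    """Split an ID into (category name, subcategory abbreviation, number)."""
    match = CHECK_ID_PATTERN.match(check_id)
    if match is None:
        raise ValueError(f"malformed check ID: {check_id!r}")
    category, subcategory, number = match.groups()
    return CATEGORIES[category], subcategory, int(number)
```

For example, `parse_check_id("SMD2")` splits the ID into the System category, the `MD` subcategory abbreviation, and the number 2. Validating on construction is one possible design choice; it surfaces malformed IDs or specificity values as soon as the checklist is loaded.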

System
| Name | ID | Check | ML Specificity |
| --- | --- | --- | --- |
| Data visualization | SU1 | Do you have data visualization techniques in place? | L |
| Visualization techniques | SU2 | Have you considered visualization techniques to identify or highlight relationships between data and computing tasks? | M |
| Data preparation | SDQ1 | Do you have strategies for data preparation and for making statistics on data? | H |
| Data cleaning | SDQ2 | Is your dataset clean, of good quality, and free from potential bias? | H |
| Dataset size | SDQ3 | Are you concerned about dataset size in ML processing? | H |
| Concept drift | SDQ4 | Do you engineer your ML-based system to adapt to input data changes (concept drift)? | H |
| System correctness | SC1 | Do you have techniques for ensuring system correctness? | H |
| Model validation | SMV1 | Are you performing validation of the model to predict how a learning algorithm will behave on new data? | H |
| Model validation | SMV2 | Are you combining model validation with data validation to detect data errors? | H |
| Independent upgradeability | SMD1 | Are you building a component-based distributed system where parts may need to be upgraded? | M |
| High cohesion and low coupling | SMD2 | Are high cohesion and low coupling important? | L |
| Microservice | SMD3 | If you are interested in maintainability and modifiability, did you consider using a microservice architecture? | L |
| Discrete service | SMD4 | Can you decompose your system into discrete services? | L |
| Modeling intrinsic uncertainty | SMN1 | Can you explicitly model the intrinsic uncertainty of ML components and assess its impact at the design stage? | H |
| Time predictability | SMN2 | Do you have mechanisms for monitoring and post-analysis of time predictability? | H |
| Monitoring drift | SMN3 | Do you have tests that monitor changes in input distributions? | H |
| Continuous integration | SDE1 | Can you use continuous integration techniques for system development? | M |
| Infrastructure as code | SDE2 | Do you manage IT infrastructure like servers, databases, and networks through code for your ML system? | M |
| Blue/Green, Canary testing | SDE3 | Are you including Blue/Green or canary testing in your MLOps pipelines? | H |
| Failure recovery strategy | SA1 | Did you consider failure recovery strategies to avoid failure propagation? | L |
| Domain knowledge | SA2 | Do you have the required domain knowledge to handle availability decisions? | M |
| Layered/tiered architecture | SA3 | Can you split business logic from ML components using a layered/tiered architecture? | M |
| Uncertainty | SR1 | Do you have complete information on ML uncertainty at design time? | H |
| Fail safe | SS1 | Do you have techniques to quickly reach safe states when needed? | H |
| Safety evaluation | SS2 | Have you included evaluation processes for architectural safety choices? | L |
| Coding standards | SS3 | Do you use strict and certified coding standards for safety-critical ML components? | L |
| External certification | SS4 | Is your system safety-certified by external authorities? | L |
| Design to defend | SS5 | Are you designing your ML system to defend vulnerable code sections from cyberattacks? | H |
| Safety and fairness | SS6 | Do you ensure systematic fairness and safety in your ML system? | H |
| Data loss | SP1 | Do you handle data loss reduction and privacy preservation, e.g., using federated learning? | H |
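
Check SMN3 asks for tests that monitor changes in input distributions. One possible way to approach such a test, shown here only as a sketch, is to compare a reference sample of a numeric input feature with a recent sample using the two-sample Kolmogorov-Smirnov statistic. The function names and the `threshold` default are hypothetical and would need tuning for a real system; the checklist does not prescribe a specific technique.

```python
from bisect import bisect_right


def ks_statistic(reference: list[float], current: list[float]) -> float:
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    largest_gap = 0.0
    # The gap between two step functions can only change at an observed value,
    # so it is enough to evaluate both empirical CDFs at every data point.
    for value in ref_sorted + cur_sorted:
        ref_cdf = bisect_right(ref_sorted, value) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, value) / len(cur_sorted)
        largest_gap = max(largest_gap, abs(ref_cdf - cur_cdf))
    return largest_gap


def input_has_drifted(reference: list[float], current: list[float],
                      threshold: float = 0.3) -> bool:
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold
```

A monitoring job could call `input_has_drifted` per feature on a schedule, with the training data as the reference sample, and raise an alert when it returns `True`.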
Process
| Name | ID | Check | ML Specificity |
| --- | --- | --- | --- |
| Documentation | PD1 | Do you have proper documentation or a plan to document your ML system? | M |
| Team | PT1 | Do you have heterogeneous teams mixing ML developers, data engineers, and architects? | D |
| Test-driven | PT2 | Do you have a test-driven development strategy for your QA and testing process? | M |
| Separate pipelines | PSP1 | Do you separate the branches for training pipelines from model training? | H |
| Model customization and reuse | SML1 | Do you have expertise to customize and reuse models? | H |
| Versioning | SML2 | Do you manage and version ML models? | H |
| ML infrastructure for deployment | SMI3 | Have you defined ML infrastructure and deployment processes? | H |
| Model testing | SML4 | Are you testing the quality and performance of your models? | H |
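
Check SML2 asks whether ML models are managed and versioned. As a rough illustration only, one simple approach is to derive a version identifier from the serialized model together with its training metadata, so that any change to either yields a new version. The `model_version` function and the metadata keys below are hypothetical; dedicated model registries offer far more than this sketch.

```python
import hashlib
import json


def model_version(model_bytes: bytes, metadata: dict) -> str:
    """Derive a short, deterministic version ID from model content and metadata."""
    digest = hashlib.sha256()
    digest.update(model_bytes)
    # Sorting keys makes the hash independent of dictionary insertion order.
    digest.update(json.dumps(metadata, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()[:12]


# Hypothetical usage: the byte string stands in for a serialized model file.
version = model_version(b"serialized-model", {"dataset": "v3", "learning_rate": 0.01})
```

Because the identifier depends only on content, retraining with the same data and settings that produces identical bytes maps to the same version, while any change to weights or metadata produces a different one.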