March 16, 2026 · 10 min read

AI Readiness Assessment: 8 Questions Before Your First Custom Model

The 8 critical questions UAE enterprises must answer before investing in custom AI model development - covering data readiness, infrastructure, team capability, and use case selection.


Most failed AI projects do not fail because of bad algorithms. They fail because the organisation was not ready - the data was not there, the infrastructure could not support it, the team did not have the right skills, or the use case was wrong. An AI readiness assessment answers these questions before you spend six figures on model development.

At mlai.ae, we run AI readiness assessments for UAE enterprises across fintech, healthcare, real estate, and retail. These eight questions are what we evaluate. If you can answer them honestly before engaging any AI vendor, you will make better decisions and avoid the most common failure modes.

Question 1: Do You Have Enough Relevant Data?

This is the most important question and the one most often answered incorrectly. “We have lots of data” is not the same as “we have enough relevant, labelled data for this specific ML task.”

What “enough” means depends on the task:

  • Tabular classification (fraud detection, churn prediction, credit scoring): 5,000-50,000 labelled examples per class, depending on the number of features and class imbalance
  • NLP tasks (document classification, entity extraction, sentiment analysis): 2,000-10,000 labelled examples for fine-tuning; more for training from scratch
  • Computer vision (defect detection, document processing, medical imaging): 1,000-10,000 labelled images per class for fine-tuning; 10,000+ for training from scratch
  • Time-series forecasting (demand prediction, capacity planning): 2-3 years of historical data at the prediction granularity you need
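A quick way to sanity-check the first of these thresholds is to count labelled examples per class before any vendor conversation. The sketch below uses the lower end of the tabular range (5,000) as an illustrative floor - adjust it to your task:

```python
from collections import Counter

# Illustrative floor: the lower end of the 5,000-50,000 range for
# tabular classification quoted above. Adjust per task.
MIN_PER_CLASS = 5000

def readiness_gaps(labels, min_per_class=MIN_PER_CLASS):
    """Return {class: shortfall} for every class below the minimum count."""
    counts = Counter(labels)
    return {cls: min_per_class - n for cls, n in counts.items() if n < min_per_class}

# A fraud dataset that looks large overall but is short on the minority class
labels = ["legit"] * 48000 + ["fraud"] * 900
print(readiness_gaps(labels))  # {'fraud': 4100}
```

A dataset of 48,900 rows sounds ample until you see the minority class is 4,100 examples short - exactly the mid-project surprise this question is meant to prevent.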

UAE-specific data considerations:

Many UAE enterprises have operated for fewer years than their Western counterparts, producing shorter data histories. Expatriate workforce turnover means customer and employee datasets have higher churn rates. Regulatory changes (VAT introduction in 2018, corporate tax in 2023) create structural breaks in historical data that limit how far back training data remains relevant.

An honest assessment of your data readiness for AI prevents the scenario where you discover mid-project that you have 500 labelled examples when you need 5,000.

Question 2: What Is Your Data Quality?

Quantity is necessary but not sufficient. Data quality issues are the leading cause of model underperformance in UAE enterprise deployments. The most common problems we find:

Inconsistent labelling. Different staff members label the same data differently. In one UAE bank engagement, we found three different labelling conventions for the same transaction fraud categories across different branches - making the combined dataset internally contradictory.

Missing values. UAE datasets frequently have systematic missing data patterns. Customer records may have complete financial data but sparse demographic data. Clinical records may have diagnosis codes but missing procedure details. Missing data is manageable if you know the pattern; it is dangerous if you assume completeness.

Stale data. Data collected 3+ years ago may not represent current patterns. UAE markets move fast - a property valuation model trained on 2022 Dubai transaction data would perform poorly on 2026 market conditions.

Encoding issues. Arabic text data often has encoding inconsistencies - mixed UTF-8 and Windows-1256, inconsistent diacritic usage, multiple representations of the same Arabic character. These must be normalised before training.
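As a minimal sketch of that normalisation step, the helper below decodes legacy encodings, applies Unicode NFC, strips tashkeel diacritics, and collapses alef variants. The specific normalisation choices (which variants to merge, whether to drop diacritics) are illustrative - the right set depends on your downstream model:

```python
import unicodedata

# Arabic diacritics (tashkeel) occupy U+064B..U+0652.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}
# Collapse common alef variants to bare alef -- one typical normalisation choice.
ALEF_VARIANTS = str.maketrans({"\u0623": "\u0627", "\u0625": "\u0627", "\u0622": "\u0627"})

def normalise_arabic(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode, normalise to NFC, strip diacritics, and unify alef forms.

    Pass encoding="cp1256" for legacy Windows-1256 sources.
    """
    text = unicodedata.normalize("NFC", raw.decode(encoding))
    text = "".join(ch for ch in text if ch not in DIACRITICS)
    return text.translate(ALEF_VARIANTS)
```

Running both UTF-8 and Windows-1256 sources through one function like this guarantees the training corpus uses a single canonical representation for each character.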

Bias in historical data. If your historical data reflects biased human decisions (biased lending approvals, biased hiring, biased clinical triage), a model trained on that data will learn and reproduce those biases. UAE’s diverse demographic profile makes this a particularly important consideration.

A data quality audit takes 1-2 weeks and saves months of wasted model development on dirty data.

Question 3: Is Your Data Accessible?

Data often exists but cannot be accessed in a usable form. This is more common than enterprises expect. Accessibility problems include:

  • Data locked in SaaS platforms with no API or bulk export capability
  • Data spread across disconnected systems that have never been integrated (CRM, ERP, core banking, document management)
  • Data in unstructured formats (PDFs, scanned documents, email attachments) that require extraction before ML use
  • Data governance restrictions that prevent combining datasets across departments without compliance review
  • Legacy system dependencies where accessing historical data requires coordination with vendors or IT teams managing end-of-life systems

In UAE enterprises, we frequently encounter a split between modern cloud-native systems (recent SaaS deployments) and legacy on-premises systems (core banking, hospital information systems, ERP). Bridging this gap to create a unified training dataset is often the longest phase of an AI project.

Assess data accessibility early. If it takes three months to get data access, that is three months before model development even starts.

Question 4: What Problem Are You Actually Solving?

“We want to use AI” is not a problem statement. A clear problem statement for an ML project specifies:

  • What decision the model will make or support
  • What data is available as input at prediction time (not just for training)
  • What output the model produces (classification, score, ranking, generation)
  • What accuracy is required for the output to be useful (and what the current baseline is without ML)
  • What volume of predictions is needed (daily, hourly, real-time)

Good problem statement: “Classify incoming wire transfers as low, medium, or high risk for AML review within 100ms of receipt, using transaction metadata and sender/receiver profiles, achieving at least 90% recall on high-risk transactions while keeping false positive rate below 15%.”

Poor problem statement: “Use AI to improve our compliance processes.”

The difference between these two statements is the difference between a project that can be scoped, built, and measured - and a project that drifts for months without clear success criteria.

For UAE enterprises evaluating multiple potential AI use cases, use case prioritisation should rank by three factors: business impact (revenue or cost), data readiness (do you have the data today), and technical feasibility (is this a well-understood ML task or a research problem).

Question 5: Can You Measure Success?

Every ML project needs a measurable definition of success, established before development begins. This requires:

A baseline to beat. What is the current performance without ML? If human reviewers classify transactions with 85% accuracy, the model must exceed 85% to justify deployment. If there is no current process, define what “good enough” means for the business case.

A metric that aligns with business value. Accuracy is not always the right metric. For fraud detection, recall (catching fraud) and false positive rate (not blocking legitimate transactions) matter more than overall accuracy. For property valuation, median absolute percentage error (MdAPE) matters more than mean error. Choose the metric that measures what the business actually cares about.
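These three metrics need no ML library - they can be computed directly from predictions and ground truth. The `"high-risk"` label below is a hypothetical placeholder:

```python
def recall(y_true, y_pred, positive="high-risk"):
    """Share of actual positives the model catches."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

def false_positive_rate(y_true, y_pred, positive="high-risk"):
    """Share of actual negatives the model wrongly flags."""
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return fp / (fp + tn)

def mdape(actual, predicted):
    """Median absolute percentage error -- robust to a few wild valuations."""
    errs = sorted(abs(a - p) / a for a, p in zip(actual, predicted))
    mid = len(errs) // 2
    return errs[mid] if len(errs) % 2 else (errs[mid - 1] + errs[mid]) / 2
```

A model can score 97% accuracy on an imbalanced fraud dataset while missing half the fraud - which is why recall and false positive rate, not accuracy, belong in the success criteria.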

A validation methodology. How will you test the model before deployment? Holdout test sets must be representative of production data. For time-dependent predictions, use temporal validation (train on past, test on future) rather than random splits. For UAE models, ensure the test set includes the demographic, seasonal, and regulatory variation the model will encounter in production.
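Temporal validation is simple to implement: choose a cutoff date, train on everything before it, test on everything after. A minimal sketch, with hypothetical record fields:

```python
from datetime import date

def temporal_split(records, cutoff):
    """Train on the past, test on the future. Randomly shuffling
    time-dependent data would leak future information into training."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test

# Hypothetical transaction records; only the date field matters here
records = [
    {"date": date(2024, 1, 5), "label": 0},
    {"date": date(2024, 6, 1), "label": 1},
    {"date": date(2025, 2, 1), "label": 0},
]
train, test = temporal_split(records, cutoff=date(2025, 1, 1))
# train: the two 2024 records; test: the 2025 record
```

A model that looks strong under a random split but weak under a temporal split is telling you something important: its apparent skill came from seeing the future.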

An ongoing measurement plan. Model performance degrades over time. Define how you will monitor production accuracy and when you will trigger retraining. Without this, a model that was 95% accurate at launch quietly drops to 80% over 12 months.
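The simplest form of that plan is a drift threshold checked against each monitoring window. The 5-point tolerance below is illustrative, not prescriptive - tie it to your business case:

```python
def needs_retraining(production_accuracy, launch_accuracy, tolerance=0.05):
    """Flag a model whose production accuracy has drifted more than
    `tolerance` below its launch accuracy. The 0.05 default is
    illustrative; choose a threshold tied to your business case."""
    return launch_accuracy - production_accuracy > tolerance

# The silent-degradation scenario above: 95% at launch, 80% a year later
needs_retraining(0.80, 0.95)  # True -> trigger retraining
```

Even a check this crude, run monthly against a labelled production sample, catches the quiet 95%-to-80% slide before it costs a year of bad decisions.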

Question 6: Do You Have the Right Infrastructure?

AI infrastructure requirements depend on the model type and scale:

Training infrastructure: Fine-tuning a 7B-parameter LLM requires GPU compute - typically A100 or H100 instances. A traditional ML model (XGBoost, random forest) trains on standard CPU instances. Cloud GPU costs for a fine-tuning run range from $200 to $5,000 depending on model size and training duration.

Inference infrastructure: How will the model serve predictions? Real-time inference (sub-100ms response) requires dedicated GPU or optimised CPU serving infrastructure. Batch scoring (daily or hourly) can run on standard compute. The cost and complexity difference between real-time and batch inference is significant.

Data infrastructure: Can your data platform support ML workflows? You need the ability to extract training data, version datasets, run preprocessing pipelines, and feed production features to inference endpoints. If your data lives in a transactional database with no analytics layer, building the data infrastructure may take longer than building the model.

UAE data residency: Many UAE enterprises - particularly in financial services and healthcare - have data residency requirements. Training and inference infrastructure must be located in the UAE or approved jurisdictions. Cloud providers offer UAE and nearby Gulf regions (AWS Middle East (UAE), Azure UAE North, AWS Bahrain, GCP Doha), but not all GPU instance types are available in all regions.

Question 7: Does Your Team Have the Right Skills?

An honest team capability assessment prevents two failure modes: building a project your team cannot maintain, or hiring for skills you do not actually need.

To build and deploy a custom model, you need:

  • ML engineering: Someone who can select algorithms, design features, train models, and evaluate results. This is the core technical skill
  • Data engineering: Someone who can build data pipelines, manage datasets, and ensure reliable data flow from source systems to model training and inference
  • MLOps/DevOps: Someone who can containerise models, deploy to production infrastructure, set up monitoring, and manage the model lifecycle
  • Domain expertise: Someone who understands the business problem, can validate model outputs, and can define what “correct” means in your specific context

In UAE enterprises, the most common gap is MLOps capability - teams can build models in notebooks but cannot deploy and operate them in production. The second most common gap is domain-specific data engineering - connecting to UAE-specific data sources and handling Arabic text processing.

You do not need all skills in-house from day one. A common pattern is to engage an external team (like mlai.ae) for the initial build and deploy, then hire or upskill internal staff for ongoing operations with a managed support retainer during the transition.

Question 8: What Is Your Timeline and Budget Expectation?

Misaligned expectations on timeline and budget are the most common source of AI project dissatisfaction. Honest benchmarks:

Timeline benchmarks for UAE enterprise AI projects:

  • AI readiness assessment: 2 weeks
  • Data preparation and pipeline setup: 2-6 weeks
  • Model development and training: 4-8 weeks
  • Integration and deployment: 2-4 weeks
  • Total from start to production: 10-20 weeks for a typical first model

Add 4-8 weeks if data access is not ready at project start. Add 2-4 weeks if data quality requires significant cleaning. These are the most common delays.

Budget benchmarks:

  • AI readiness assessment: AED 18,000-35,000
  • First custom model (build and deploy): AED 75,000-250,000 depending on complexity
  • Ongoing model operations (monitoring, retraining, support): AED 15,000-40,000/month

These ranges cover mid-complexity UAE enterprise projects. Simpler use cases (single-model classification on clean tabular data) cost less. Complex use cases (multi-model systems, real-time inference at scale, regulatory compliance requirements) cost more.

If your budget expectation is AED 20,000 for a production AI system, you are not ready for custom model development. Consider starting with off-the-shelf AI tools or API-based solutions and revisit custom models when the business case justifies the investment.

Running Your Own Readiness Assessment

You can use these eight questions as a self-assessment framework. Score each question on a 1-5 scale:

  • 1: Not ready (no data, no infrastructure, no clear use case)
  • 3: Partially ready (some data, some infrastructure, general idea of use case)
  • 5: Fully ready (clean labelled data, production infrastructure, clear problem statement with success metrics)

An average score below 2.5 means you should invest in foundational data and infrastructure before model development. A score of 2.5-3.5 means a readiness assessment will identify specific gaps to close. A score above 3.5 means you are ready to scope a model development project.
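The scoring rubric above reduces to a few lines of Python. The verdict strings are paraphrases of the three bands, included for illustration:

```python
def readiness_verdict(scores):
    """Map the eight 1-5 self-assessment scores to the verdict bands above."""
    avg = sum(scores) / len(scores)
    if avg < 2.5:
        return avg, "invest in foundational data and infrastructure first"
    if avg <= 3.5:
        return avg, "run a readiness assessment to identify specific gaps"
    return avg, "ready to scope a model development project"

avg, verdict = readiness_verdict([3, 4, 2, 3, 4, 3, 2, 3])
# avg 3.0 -> "run a readiness assessment to identify specific gaps"
```

The value is less in the arithmetic than in forcing a number onto each question - a team that cannot agree on whether data quality is a 2 or a 4 has found its first gap.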

Get a Professional Assessment

Self-assessment has limits. mlai.ae’s AI Readiness Assessment is a structured two-week engagement that evaluates all eight dimensions with hands-on data inspection, infrastructure review, and use case scoring. The output is a prioritised AI roadmap - which use cases to pursue first, what preparation is needed, and realistic timelines and budgets.

Book a free AI discovery call to discuss your AI readiness. We will give you a candid assessment of where you stand and what it takes to get to your first production model.

Build It. Run It. Own It.

Book a free 30-minute AI discovery call with our Vertical AI experts in Dubai, UAE. We scope your first model, estimate data requirements, and show you the fastest path to production.

Talk to an Expert