I want AI/ML! Do you need it though...

Nish · July 22, 2023

Organisations are often quick to jump on the bandwagon when it comes to AI/ML. As a result, many fall into common pitfalls when actually trying to get things up and running.

📣 Here are 10 common pitfalls that organisations (both large and small) typically encounter when implementing, or thinking about implementing, AI/ML:

1️⃣ Unexpectedly high infrastructure costs

You may expect AI/ML to need less infrastructure than traditional software engineering, but this really isn’t the case. Serving AI/ML typically requires a whole software stack, especially for production-grade applications that must be robust and scalable. The AI/ML part can be thought of as adding data collection, governance, and model complexity on top of a more traditional application.

2️⃣ No data collected yet

If your company is just starting out, you need to ask yourself “Do I have enough data?” If the answer is no, I suggest pausing and circling back later. I am not saying the data has to be processed and in a perfect format (see the next point), but you need at least some raw data to get started; otherwise you’re in no position to consider using AI/ML.

3️⃣ Assuming data is ready for use when it’s not

Say you do have some data that has been logged for years in a storage account controlled by another team and has not been touched for a while. It’s highly likely this data won’t be suitable for modelling straight away and will take time to preprocess: it requires domain expertise to understand and technical skills to make something of it. If it hasn’t been touched for a while, you’d probably want to report on it, analyse it, and draw insights from it first to determine whether it’s even useful at all. Don’t underestimate how long this will take!

4️⃣ Keeping humans in the loop

AI/ML systems are great in that, if implemented correctly, they can help automate and improve core business processes. However, this means they become business critical, so decision makers might be more risk-averse about changing the process. Make sure humans are always in the loop to mitigate this risk, for example by checking the data over time and handling edge cases when necessary.

5️⃣ ML algorithm based value proposition

Yes, AI/ML is cool, and you may have decided to invest in it because you heard about a particular algorithm, BUT your users more than likely couldn’t care less. They are far more concerned about whether your product or service satisfies their need better than their existing solution (e.g. does the new feature work as intended and provide a better recommendation?).

6️⃣ ML optimising for the wrong thing

Your organisation may have built something successfully end-to-end; HOWEVER, if it’s not optimised for what you truly want, it might not be worth the time! For example, if Google Search were hypothetically optimised just for engagement, the engine could end up returning bad results just to increase the time spent engaging.

7️⃣ Is your ML improving things in reality?

Can you actually see an improvement relative to the prior process or other important baseline metrics? You definitely need to think about this, as ultimately the whole point of using AI is to make things better, and you can’t know whether it has unless you set up ways to detect it. This leads to statistical hypothesis testing, which requires additional technical work to execute properly.
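
To make the idea of "detecting an improvement" a bit more concrete, here is a minimal sketch of one possible approach: a one-sided permutation test comparing a metric under the old process against the same metric under the new ML-driven process. The function name `permutation_test`, the example numbers, and the choice of a permutation test are all my own assumptions for illustration; a real setup would need a proper experiment design (randomised groups, enough samples, a metric agreed up front).

```python
import random


def permutation_test(baseline, treatment, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(treatment) > mean(baseline).

    Returns (observed_difference, p_value). Hypothetical helper for
    illustration only, not a substitute for a designed experiment.
    """
    rng = random.Random(seed)
    n_treat = len(treatment)
    n_base = len(baseline)
    observed = sum(treatment) / n_treat - sum(baseline) / n_base

    pooled = list(baseline) + list(treatment)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the labels: if the new process made no difference,
        # any split of the pooled data is equally plausible.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_treat]) / n_treat - sum(pooled[n_treat:]) / n_base
        if diff >= observed:
            hits += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up daily conversion rates (%) before and after the ML feature.
old_process = [2.1, 2.4, 1.9, 2.2, 2.0, 2.3, 2.1]
new_process = [2.5, 2.6, 2.2, 2.8, 2.4, 2.7, 2.5]

diff, p = permutation_test(old_process, new_process)
print(f"observed lift: {diff:.3f}, p-value: {p:.4f}")
```

A small p-value suggests the lift is unlikely to be down to chance alone; it says nothing about whether the metric you picked is the right one (see the previous point).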

8️⃣ Using pre-trained models vs building your own

A lot of the recent breakthroughs involve large open-source or proprietary pre-trained models. However, if your use cases rely on structured data, these might not be very useful, in which case you’ll need to develop your own. There can be a massive difference between the two, and they are easy to conflate (it’s worse if you think you need a pre-trained model but actually require something custom).

9️⃣ Models must be trained frequently

It’s normal at first to think you can train a model once and it will be good to go for the next year, but this is hardly ever the case! It’s important to have retraining checkpoints in place so you can keep track of the model and retrain if necessary. If not, you’re likely to face major performance issues down the line, wasting a potentially worthwhile investment.
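
As a rough illustration of what a retraining checkpoint could look like, the sketch below compares recent evaluation scores against the score the model had at deployment and flags a retrain when the average has dropped too far. Everything here is an assumption on my part: the names `RetrainDecision` and `check_retraining`, the 5% relative-drop threshold, and the minimum of three recent evaluations. The right metric and threshold depend entirely on your use case.

```python
from dataclasses import dataclass


@dataclass
class RetrainDecision:
    retrain: bool
    reason: str


def check_retraining(baseline_score, recent_scores,
                     max_relative_drop=0.05, min_samples=3):
    """Flag a retrain when recent scores fall too far below the baseline.

    Assumes a higher score is better (e.g. accuracy) and that
    baseline_score is positive. Hypothetical helper for illustration.
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    if len(recent_scores) < min_samples:
        # Too few evaluations to tell noise from a real decline.
        return RetrainDecision(False, "not enough recent evaluations")

    recent_avg = sum(recent_scores) / len(recent_scores)
    relative_drop = (baseline_score - recent_avg) / baseline_score
    if relative_drop > max_relative_drop:
        return RetrainDecision(
            True, f"average score dropped {relative_drop:.1%} below baseline"
        )
    return RetrainDecision(False, "performance within tolerance")


# Made-up weekly accuracy scores since deployment at 0.90 accuracy.
decision = check_retraining(0.90, [0.86, 0.84, 0.83])
print(decision)
```

Running a check like this on a schedule (say, weekly) gives you an early warning rather than finding out from your users.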

🔟 Designing your own algorithms in-house

99.9% of the time you don’t need to hire a full research team to do this! Great algorithms and (sometimes) their implementations are already out there for free and backed by solid research institutions. Don’t waste time going down this route unless it’s absolutely necessary.

Citation Information

If you find this content useful & plan on using it, please consider citing it using the following format:

@misc{nish-blog,
  title = {I want AI/ML! Do you need it though...},
  author = {Nish},
  howpublished = {\url{https://www.nishbhana.com/ML-Pitfalls/}},
  note = {[Online; accessed]},
  year = {2023}
}
