Would you like to code artificial intelligence?
I’ll show you the essential basics and 3 methods for programming your first AI.
Let’s get started!
What is Artificial Intelligence?
Because computer scientists have not yet agreed on a definitive, standardised definition, you can sell almost any software as “artificial intelligence”. I (and other computer scientists) associate artificial intelligence with complicated terms such as neural networks, tensors, split training, LLMs, weighting and large data sets.
You could already call a programme an AI if it checks for a threshold value with an If statement.
If the person is 1.90 metres tall and weighs 108.3 kg, then they are overweight.
This principle is called a decision tree and the Fraunhofer Institute already calls this system AI.
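Such a threshold check fits in a few lines. The following is only a minimal sketch: it assumes the body mass index (weight divided by height squared) as the measure and a cut-off of 25, neither of which is stated above, and the function name is made up for illustration.

```python
def is_overweight(height_m, weight_kg, bmi_cutoff=25.0):
    """Return True if the body mass index reaches the cut-off.

    Assumption: BMI and the cut-off of 25 are chosen for illustration only.
    """
    bmi = weight_kg / height_m ** 2  # BMI = weight in kg / (height in m) squared
    return bmi >= bmi_cutoff


# The person from the example: 1.90 metres tall, 108.3 kg
print(is_overweight(1.90, 108.3))
```

The whole "intelligence" is a single comparison that a human wrote by hand; nothing is learned from data.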
The artificial intelligence that I will describe here is based on statistics and stochastics, the parts of mathematics that deal with data analysis and probabilities. The lectures that seem most boring at university turn out to be the most exciting and relevant subjects for AI. You don’t have to go to university to programme AI, but you should have a look at the basics so that you have the background to understand the tutorials on the internet.
So now you want to understand how a “real” AI works, which has dominated the news in recent years. Here we go:
Function of Classic Software
To understand the idea behind an AI, we need to turn the concept into a model.
Classical software works according to the following scheme:
Software + data → results
- Posting software: unprocessed transaction lists → posted transactions
- Newsletter software: mailing list → informed users
- Navigation systems: map material → calculated route
Function of Artificial Intelligence
Artificial intelligence turns the classic software idea on its head:
Data + results → software (AI), and then software (AI) + further inputs (new data) → results
Let’s turn the goal around!
- School essays + corrections by teachers → grammar software
- Images + labels added by humans → image recognition software
- Scans of texts + transcriptions typed by humans → text recognition software (OCR)
The resulting software (the model) can process new, unknown cases. A programmer does not have to code the logic by hand; the computer derives it from the examples. Strictly speaking, the computer does not learn a skill the way a human does: it recognises patterns in the training data and applies them to new information.
- The grammar software recognises errors in sentences that it has never processed before.
- The image recognition software labels unknown new images.
- Text recognition software can extract text from unknown scans.
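The scheme "data + results → model, model + new data → results" can be sketched in a few lines. The example below is a toy under stated assumptions: the data (hours studied, exam passed or not) is invented, the function names are hypothetical, and the "model" is nothing more than a single cut-off value that the program picks from the examples itself.

```python
def train_threshold(values, labels):
    """Pick the cut-off that classifies the most training examples correctly."""
    best_cut, best_correct = None, -1
    for cut in sorted(set(values)):
        # Count how many examples the rule "value >= cut" gets right
        correct = sum((v >= cut) == label for v, label in zip(values, labels))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut


def predict(cut, value):
    """Apply the learned cut-off to a new, unseen value."""
    return value >= cut


# Data + results (invented): hours studied and whether the exam was passed
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [False, False, False, False, True, True, True, True]

cut = train_threshold(hours, passed)        # data + results -> model
print(predict(cut, 5), predict(cut, 10))    # model + new data -> results
```

Nobody typed the cut-off into the code; it comes out of the labelled examples. Real methods are far more elaborate, but they follow the same two steps.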
If you want to learn more about the difference between artificial intelligence, machine learning and deep learning, have a closer look at the following articles:
The Basics of Artificial Intelligence
School and university teach theory stuff that you’ll never need again.
Theory is fun when you have a concrete use case, a challenge on the table.
If you set yourself the goal of programming artificial intelligence, then you have the necessary motivation to dive straight into the theoretical basics.
For your AI you need …
- Statistics
- Stochastics
- Programming language
- A computer
Employers now call the statisticians of the past data scientists. With the new name, boring, dry subjects are suddenly hip.
Methods and Models
You can programme artificial intelligence with many mathematical models (very simple to highly complex). You need to test which model works best for your use case.
The most popular “hip” methods are …
- Recurrent neural networks (Alexa)
- Large language models (ChatGPT)
- Tensor (FaceID)
For many use cases, these (sometimes complex) approaches do not produce better results than the “classic” methods …
- Regression
- Clustering
- Extreme value detection
Depending on the use case, the data scientist must experiment with all methods and not just limit themselves to the hip methods. A data scientist does not programme AI, but plays around with the parameters until they have found the right settings. You don’t find the optimum value with theory, but with trial and error.
Results and Data for Training
All AI methods need data. The data basis should meet two requirements:
- Quantity: AI systems create better models with more training data. The better the model, the better the results.
- Quality: Correctly labelled data points are essential because otherwise the AI learns something wrong. The training data should be valid, filtered and as free of confounding factors as possible.
Link tips:
- Kaggle Datasets
- Google Dataset Search
- Population projects: Boston Housing
3 Methods of AI
I will present three examples of what an AI can look like.
Simple Artificial Intelligence – Decision Tree
A few nested if statements (if-then branching) can already be considered artificial intelligence. This simple construct is the basis of an expert system:
Is the printer switched on? If yes → X, otherwise → Y
Are the cartridges full?
Is there a paper jam in the feeder?
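The three questions can be written down as a small decision tree. This is a sketch only: the function name and the advice strings are invented placeholders for the outcomes X and Y, which are not specified above.

```python
def diagnose_printer(switched_on, cartridges_full, paper_jam):
    """Walk through the questions in order and return the first matching advice.

    All advice texts are hypothetical placeholders.
    """
    if not switched_on:
        return "Switch the printer on."
    if not cartridges_full:
        return "Replace the empty cartridges."
    if paper_jam:
        return "Remove the jammed paper from the feeder."
    return "No obvious fault found."


print(diagnose_printer(switched_on=True, cartridges_full=False, paper_jam=False))
```

An expert system works the same way, only with many more questions, and the knowledge in it comes from human experts rather than from training data.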
Artificial Intelligence Based on Web Mining
An AI based on web mining can predict from the choice of words of the tweeting crowd of demonstrators whether riots are to be expected.
Stochastics simplified:
0 % (peaceful demonstration) – 50 % (riots) – 100 % (civil war)
If 100 of the 1000 protesters tweet words such as “punch”, “freak out”, “polish” or “throw”, then the probability increases.
If the majority post more factual comments on Twitter, the AI lowers the probability.
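A very crude version of this idea is to measure what share of the tweets contains one of the trigger words. The sketch below assumes a fixed, hand-written word list (taken from the example above) and simple substring matching; the function name is hypothetical, and the share is only an indicator, not yet a calibrated probability.

```python
# Trigger words from the example above; a real system would learn them from data
VIOLENT_WORDS = {"punch", "freak out", "polish", "throw"}


def violent_share(tweets):
    """Return the fraction of tweets that contain at least one trigger word."""
    if not tweets:
        return 0.0
    hits = sum(
        any(word in tweet.lower() for word in VIOLENT_WORDS)
        for tweet in tweets
    )
    return hits / len(tweets)


# Invented sample: 100 of 1000 tweets contain a trigger word
tweets = ["We will throw stones"] * 100 + ["Peaceful march today"] * 900
print(violent_share(tweets))
```

A rising share would push the estimate towards riots, and a falling share would lower it.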
Which words are relevant or irrelevant for such an analysis? A machine learning programme determines this. The AI looks at the following data set:
- Texts of all tweets under the hashtag Demo X (30,000 words)
- Riots occurred (true, false)
- Number of violations
The labels in fields 2 and 3 help the AI to find the relevant words. Which words do the tweets from, say, 60 demos where a riot occurred have in common? The algorithm should not return words that are demo-specific (the reason for the demo) or generic (stopwords).
Stopwords are filler words without much meaning (is, man, you, a, the, your, his, have).
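The intersection idea can be sketched with Python sets. The sketch assumes that each demo is given as a pair of tweet text and a riot label, uses the stopword list from the sentence above, and splits on whitespace only; the function name and the sample tweets are invented.

```python
STOPWORDS = {"is", "man", "you", "a", "the", "your", "his", "have"}


def common_riot_words(demos):
    """Return the words shared by all demos with a riot, minus stopwords.

    demos: list of (tweet_text, riot_occurred) pairs, one pair per demo.
    """
    word_sets = [
        set(text.lower().split()) - STOPWORDS
        for text, riot_occurred in demos
        if riot_occurred
    ]
    if not word_sets:
        return set()
    # Demo-specific words drop out because they do not occur in every demo
    return set.intersection(*word_sets)


demos = [
    ("the police throw stones climate", True),
    ("you throw bottles rent", True),
    ("the climate is nice", False),
]
print(common_riot_words(demos))
```

With real tweets a strict intersection is usually too harsh, so a practical variant would rather count in how many riot demos each word appears.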
Link tip: Common crawl for unfiltered web data for your analyses
Artificial Intelligence via Regression
Regression is a statistical model that attempts to establish the relationship between cause(s) and a result. The applications of regression are manifold:
- Predicting the gross national income of a country based on the number of people in employment
- Predicting the salary of a graduate based on the final grade
- Predicting the number of points in an exam based on the hours studied
- Predicting body weight based on calories consumed per day
Causes are the “independent” variables that influence the outcome, the dependent variable.
Types of variables are …
- Nominal: gender [female, diverse, male], addictions [smoker / non-smoker]
- Ordinal: school grades [1, 2, 3, 4, 5, 6]
- Metric: age, weight, height (continuous measure)
Regression is useful for predicting how old you will be if you don’t change your lifestyle. A common trap is to mistake correlation for causation. Someone who jogs in summer is more likely to get sunburnt than a couch potato, so you might conclude that jogging causes sunburn. But that’s not true. If the couch potato were to lie in the sun, she would get sunburnt in exactly the same way. Jogging is not causally related to sunburn; the time spent in the sun is.
Example of a Regression
Two independent (arbitrarily selected) factors can determine your lifespan:
- Sleep: How many hours / minutes a day do you sleep? 8 hours of sleep is optimal; the further you deviate from that in either direction, the sooner you die.
- Sport: How long do you exercise each day? The more sport you do, the longer you live. If you overdo it, you will pass away sooner.
For representative data you need about 1000 people. You have to observe them over their lifetime and note the time of death. Researchers collect this data over several generations to create a large data set.
First steps
The first thing you should do is get an overview of the data. To do this, display two scatter plots: sleep against age and sport against age.
Now you can decide whether the relationship between the variables is linear or non-linear. If the relationship is linear, you could draw a straight line through the scatter plot and use a linear equation to estimate the age from the daily habit.
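Drawing the straight line through a scatter plot can be done with the least-squares formulas for slope and intercept. The sketch below is illustrative only: the five data points are invented and lie exactly on a line, and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: minutes of exercise per day and age at death
minutes = [0, 10, 20, 30, 40]
ages = [74, 79, 84, 89, 94]

slope, intercept = fit_line(minutes, ages)
print(slope, intercept)
```

With real data the points scatter around the line, and the size of the remaining errors tells you how well a linear model fits at all.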
Analysing and creating the model
If you are good at thinking in three dimensions, you can create a 3D graph with Z = age, X = sport and Y = sleep. The plot should show you a peak (highest age) at the optimal sleep and sport habits.
If there is a linear relationship, the artificial intelligence model looks like this:
60 + 2 × (hours of sleep) + 0.5 × (minutes of exercise) = predicted age
With 7 hours of sleep and 30 minutes of exercise per day, the model predicts an age of 89 years (60 + 2 × 7 + 0.5 × 30 = 89).
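As code, the finished model is a single line of arithmetic. The sketch reads the formula as 60 + 2 × hours of sleep + 0.5 × minutes of exercise, which is the reading that reproduces the 89 years quoted above; the function and parameter names are made up.

```python
def predict_age(hours_of_sleep, minutes_of_exercise):
    """Linear model from the example: base value plus weighted daily habits."""
    return 60 + 2 * hours_of_sleep + 0.5 * minutes_of_exercise


print(predict_age(7, 30))  # the example from the text: 7 hours, 30 minutes
```

The three numbers 60, 2 and 0.5 are the parameters that a regression would estimate from the collected data; here they are simply taken from the example.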