
Augmented reality with Python and OpenCV (part 1)

You may (or may not) have heard of or seen the augmented reality Invizimals video game or the Topps 3D baseball cards. The main idea is to render on the screen of a tablet, PC or smartphone a 3D model of a specific figure on top of a card, according to the position and orientation of the card.


Figure 1: Invizimal augmented reality cards. Source: 

Well, this past semester I took a course in Computer Vision where we studied some aspects of projective geometry, and I thought it would be an entertaining project to develop my own implementation of a card-based augmented reality application. I warn you that we will need a bit of algebra to make it work, but I’ll try to keep it as light as possible. To make the most out of it you should be comfortable working with different coordinate systems and transformation matrices.


First, this post is not meant to be a tutorial, a comprehensive guide or an explanation of the Computer Vision techniques involved, and I will just mention the essentials required to follow the post. However, I encourage you to dig deeper into the concepts that will appear along the way.

Secondly, do not expect professional-looking results. I did this just for fun and there are plenty of decisions I made that could have been done better. The main idea is to develop a proof of concept application.


With that said, here goes my take on it.

Where do we start?

Looking at the project as a whole may make it seem more difficult than it really is. Luckily for us, we will be able to divide it into smaller parts that, when combined one on top of another, will allow us to have our augmented reality application working. The question now is, which are these smaller chunks that we need? 

Let’s take a closer look at what we want to achieve. As stated before, we want to project on a screen a 3D model of a figure whose position and orientation match the position and orientation of some predefined flat surface. Furthermore, we want to do it in real time, so that if the surface changes its position or orientation the projected model does so accordingly.

To achieve this we first have to be able to identify the flat reference surface in an image or video frame. Once identified, we can easily determine the transformation from the reference surface image (2D) to the target image (2D). This transformation is called a homography. However, if what we want is to project a 3D model placed on top of the reference surface onto the target image, we need to extend the previous transformation to handle cases where the height of the point to project, in the reference surface coordinate system, is different from zero. This can be achieved with a bit of algebra. Finally, we should apply this transformation to our 3D model and draw it on the screen. Bearing the previous points in mind, our project can be divided into:

          1.  Recognize the reference flat surface.

          2.  Estimate the homography.

          3.  Derive from the homography the transformation from the reference surface coordinate system to the target image coordinate system.

          4.  Project our 3D model in the image (pixel space) and draw it.


Figure 2: Overview of the whole process that brings to life our augmented reality application.

The main tools we will use are Python and OpenCV because they are both open source, easy to set up and use and it is fast to build prototypes with them. For the needed algebra bit I will be using numpy.

Recognizing the target surface

Of the many possible techniques that exist to perform object recognition, I decided to tackle the problem with a feature-based recognition method. These kinds of methods, without going into much detail, consist of three main steps: feature detection or extraction, feature description and feature matching.

Feature extraction

Roughly speaking, this step consists of first looking in both the reference and target images for features that stand out and, in some way, describe part of the object to be recognized. These features can later be used to find the reference object in the target image. We will assume we have found the object when a certain number of positive feature matches are found between the target and reference images. For this to work it is important to have a reference image where the only thing seen is the object (or surface, in this case) to be found. We don’t want to detect features that are not part of the surface. And, although we will deal with this later, we will use the dimensions of the reference image when estimating the pose of the surface in a scene.

For a region or point of an image to be labeled as a feature it should fulfill two important properties. First of all, it should present some uniqueness, at least locally. Good examples of this could be corners or edges. Secondly, since we don’t know beforehand what the orientation, scale or brightness conditions of this same object will be in the image where we want to recognize it, a feature should, ideally, be invariant to transformations; i.e., invariant against scale, rotation or brightness changes. As a rule of thumb, the more invariant the better.


Figure 3: On the left, features extracted from the model of the surface I will be using. On the right, features extracted from a sample scene. Note how corners have been detected as interest points in the rightmost image.

Feature description

Once features have been found we should find a suitable representation of the information they provide. This will allow us to look for them in other images and also to obtain a measure of how similar two detected features are when being compared. This is where descriptors come in. A descriptor provides a representation of the information given by a feature and its surroundings. Once the descriptors have been computed, the object to be recognized can be abstracted to a feature vector, which is a vector that contains the descriptors of the keypoints found in the image with the reference object.

This is for sure a nice idea, but how can it actually be done? There are many algorithms that extract image features and compute their descriptors and, since I won’t go into much more detail (a whole post could be devoted only to this), if you are interested in knowing more take a look at SIFT, SURF, or Harris. The one we will be using was developed at OpenCV Labs and is called ORB (Oriented FAST and Rotated BRIEF). The shape and values of the descriptor depend on the algorithm used and, in our case, the descriptors obtained will be binary strings.

With OpenCV, extracting features and their descriptors via the ORB detector is as easy as:

import cv2

# load the scene as a grayscale image
img = cv2.imread('scene.jpg', 0)

# Initiate ORB detector
orb = cv2.ORB_create()

# find the keypoints with ORB
kp = orb.detect(img, None)

# compute the descriptors with ORB
kp, des = orb.compute(img, kp)

# draw only the keypoint locations, not their size and orientation
img2 = cv2.drawKeypoints(img, kp, None, color=(0, 255, 0), flags=0)

Feature matching

Once we have found the features of both the object and the scene where the object is to be found, and computed their descriptors, it is time to look for matches between them. The simplest way of doing this is to take the descriptor of each feature in the first set, compute the distance to all the descriptors in the second set and return the closest one as the best match. (I should state here that it is important to choose a distance measure suited to the descriptors being used. Since our descriptors will be binary strings we will use the Hamming distance.) This is a brute force approach, and more sophisticated methods exist.

For example, and this is what we will be also using, we could check that the match found as explained before is also the best match when computing matches the other way around, from features in the second set to features in the first set. This means that both features match each other. Once the matching has finished in both directions we will take as valid matches only the ones that fulfilled the previous condition. Figure 4 presents the best 15 matches found using this method.
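To make the cross-check idea concrete, here is a minimal NumPy sketch of brute force matching with Hamming distance. It is only an illustration of the logic and not how OpenCV implements it; the function names and the toy one-byte descriptors are made up for the example.

```python
import numpy as np

def hamming_distances(des_a, des_b):
    """Pairwise Hamming distances between two sets of binary descriptors.

    des_a: (n, k) uint8 array, des_b: (m, k) uint8 array (k bytes each).
    Returns an (n, m) array with the number of differing bits.
    """
    # XOR every descriptor in des_a with every descriptor in des_b ...
    xor = des_a[:, None, :] ^ des_b[None, :, :]
    # ... and count the bits that are set in the result
    return np.unpackbits(xor, axis=2).sum(axis=2)

def cross_check_matches(des_a, des_b):
    """Return (i, j, distance) for pairs that are each other's best match."""
    dist = hamming_distances(des_a, des_b)
    best_b_for_a = dist.argmin(axis=1)   # closest descriptor in b for each a
    best_a_for_b = dist.argmin(axis=0)   # closest descriptor in a for each b
    matches = [(i, int(j), int(dist[i, j]))
               for i, j in enumerate(best_b_for_a)
               if best_a_for_b[j] == i]
    # Sort by distance so the most reliable matches come first
    return sorted(matches, key=lambda m: m[2])

# Toy descriptors of a single byte each (real ORB descriptors are 32 bytes)
des_a = np.array([[0b00001111], [0b11110000], [0b10101010]], dtype=np.uint8)
des_b = np.array([[0b11110001], [0b00001111]], dtype=np.uint8)
print(cross_check_matches(des_a, des_b))  # [(0, 1, 0), (1, 0, 1)]
```

The third descriptor of the first set is dropped: its closest candidate in the second set prefers a different descriptor, so the match is not mutual.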

Another option to reduce the number of false positives would be to compare the distance of the best match with the distance of the second-best match. If the best match is not clearly closer than the second-best one, the match is ambiguous and is discarded; otherwise it is considered valid.
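A common way to apply a check based on the second-best match is the so-called ratio test: a match is kept only when its distance is below a fraction of the distance to the second-best candidate. The sketch below works on plain lists of distances; the function name and the default ratio of 0.75 are assumptions chosen for illustration, not values used elsewhere in this post.

```python
def ratio_test(distance_rows, ratio=0.75):
    """Keep a match only if the best distance clearly beats the second best.

    distance_rows: one list of distances per query descriptor, holding the
    distance to every candidate descriptor.
    Returns a list of (query_index, candidate_index) pairs that pass the test.
    """
    good = []
    for i, row in enumerate(distance_rows):
        if len(row) < 2:
            continue  # the test needs at least two candidates
        # Indices of the candidates ordered from closest to farthest
        order = sorted(range(len(row)), key=row.__getitem__)
        best, second = order[0], order[1]
        if row[best] < ratio * row[second]:
            good.append((i, best))
    return good

# First query: 10 is much smaller than 50, so the match is kept.
# Second query: 30 and 32 are too similar, so the match is ambiguous.
print(ratio_test([[10, 50, 60], [30, 32, 90]]))  # [(0, 0)]
```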


Figure 4: Closest 15 brute force matches found between the reference surface and the scene

Finally, after matches have been found, we should define some criteria to decide if the object has been found or not. For this I defined a threshold on the minimum number of matches that should be found. If the number of matches is above the threshold, then we assume the object has been found. Otherwise we consider that there isn’t enough evidence to say that the recognition was successful.

With OpenCV all this recognition process can be done in a few lines of code:

import cv2

# minimum number of matches needed to consider the surface found
MIN_MATCHES = 15

cap = cv2.imread('scene.jpg', 0)
model = cv2.imread('model.jpg', 0)
# ORB keypoint detector
orb = cv2.ORB_create()
# create brute force matcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Compute model keypoints and their descriptors
kp_model, des_model = orb.detectAndCompute(model, None)
# Compute scene keypoints and their descriptors
kp_frame, des_frame = orb.detectAndCompute(cap, None)
# Match frame descriptors with model descriptors
matches = bf.match(des_model, des_frame)
# Sort them in the order of their distance
matches = sorted(matches, key=lambda x: x.distance)

if len(matches) > MIN_MATCHES:
    # draw first 15 matches.
    cap = cv2.drawMatches(model, kp_model, cap, kp_frame,
                          matches[:MIN_MATCHES], None, flags=2)
    # show result
    cv2.imshow('frame', cap)
    cv2.waitKey(0)
else:
    print("Not enough matches have been found - %d/%d" % (len(matches),
                                                          MIN_MATCHES))


On a final note, and before moving on to the next step of the process, I must point out that, since we want a real time application, it would have been better to implement a tracking technique and not just plain recognition. This is because object recognition will be performed in each frame independently, without taking into account previous frames that could add valuable information about the location of the reference object. Another thing to take into account is that the easier the reference surface is to find, the more robust detection will be. In this particular sense, the reference surface I’m using might not be the best option, but it helps to understand the process.

Homography estimation

Once we have identified the reference surface in the current frame and have a set of valid matches we can proceed to estimate the homography between both images. As explained before, we want to find the transformation that maps points from the surface plane to the image plane (see Figure 5). This transformation will have to be updated each new frame we process.


Figure 5: Homography between a plane and an image. Source: F. Moreno.

How can we find such a transformation? Since we have already found a set of matches between both images, we can certainly use any of the existing methods (spoiler: we will be using RANSAC) to directly find a homogeneous transformation that performs the mapping, but let’s get some insight into what we are doing here (see Figure 6). You can skip the following part (and continue reading after Figure 10) if desired, since I will only explain the reasoning behind the transformation we are going to estimate.

What we have is an object (a plane in this case) with known coordinates in the, let’s say, World coordinate system and we take a picture of it with a camera located at a certain position and orientation with respect to the World coordinate system. We will assume the camera works following the pinhole model, which roughly means that the rays passing through a 3D point p and the corresponding 2D point u intersect at c, the camera center. A good resource if you are interested in knowing more about the pinhole model can be found here.


Figure 6: Image formation assuming a camera pinhole model.  Source: F. Moreno.

Although not entirely true, the pinhole model assumption eases our calculations and works well enough for our purposes. The u, v coordinates (coordinates in the image plane) of a point p expressed in the Camera coordinate system if we assume a pinhole camera can be computed as (the derivation of the equation is left as an exercise to the reader):


Figure 7: Image formation assuming a pinhole camera model. Source: F. Moreno.

Where the focal length is the distance from the pinhole to the image plane, the projection of the optical center is the position of the optical center in the image plane and k is a scaling factor. The previous equation then tells us how the image is formed. However, as stated before, we know the coordinates of the point p in the World coordinate system and not in the Camera coordinate system, so we have to add another transformation that maps points from the World coordinate system to the Camera coordinate system. The transformation that tells us the coordinates in the image plane of a point p in the World coordinate system is then:


Figure 8: Computation of the projection matrix. Source: F. Moreno.
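As a rough numerical companion to the two equations above, the following NumPy sketch projects a point given in World coordinates onto the image. The layout of the calibration matrix is the standard textbook one and may use different symbols than the figures; the numeric values (focal length, optical center, camera pose) are invented purely for illustration.

```python
import numpy as np

def project_point(p_world, A, R, t):
    """Project a 3D point given in World coordinates onto the image plane.

    A: 3x3 calibration matrix, R: 3x3 rotation and t: translation vector of
    the World-to-Camera transformation, so that p_camera = R @ p_world + t.
    Returns the pixel coordinates (u, v).
    """
    p_camera = R @ np.asarray(p_world, dtype=float) + t
    ku, kv, k = A @ p_camera      # homogeneous image coordinates
    return ku / k, kv / k         # divide by the scaling factor k

# Illustrative values: focal length of 800 px, optical center at (320, 240)
A = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                     # camera axes aligned with the World axes
t = np.array([0.0, 0.0, 5.0])     # World origin 5 units in front of the camera

u, v = project_point([1.0, 0.5, 0.0], A, R, t)
print(u, v)  # 480.0 320.0
```

Note that the World origin itself projects onto the optical center, (320, 240), because with this pose it lies on the optical axis.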

Luckily for us, since the points in the reference surface plane always have their z coordinate equal to 0 (see Figure 5), we can simplify the transformation that we found above. It can be easily seen that the product of the z coordinate and the third column of the projection matrix will always be 0, so we can drop this column and the z coordinate from the previous equation. By renaming the calibration matrix as A and taking into account that the external calibration matrix is a homogeneous transformation:


Figure 9: Simplification of the projection matrix. Source: F. Moreno.

From Figure 9 we can conclude that the homography between the reference surface and the image plane, which is the matrix we will estimate from the matches we found previously, is:


Figure 10: Homography between the reference surface plane and the target image plane. Source: F. Moreno.
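The simplification can also be checked numerically. The sketch below assumes the homography has the form H = A [r1 r2 t], with r1 and r2 the first two columns of the rotation, which is what dropping the third column of the projection matrix yields; the camera values are invented for illustration. It then compares H against the full projection for a point with z = 0.

```python
import numpy as np

def homography_from_pose(A, R, t):
    """Return H = A [r1 r2 t], with r1, r2 the first two columns of R."""
    return A @ np.column_stack((R[:, 0], R[:, 1], t))

def apply_homography(H, x, y):
    """Map a point (x, y) of the reference plane to pixel coordinates."""
    ku, kv, k = H @ np.array([x, y, 1.0])
    return ku / k, kv / k

# Illustrative camera: calibration matrix, 30 degree tilt, small offset
A = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
angle = np.radians(30)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(angle), -np.sin(angle)],
              [0.0, np.sin(angle),  np.cos(angle)]])
t = np.array([0.1, -0.2, 5.0])

H = homography_from_pose(A, R, t)

# Full projection of the same plane point (z = 0), for comparison
x, y = 0.3, 0.7
ku, kv, k = A @ (R @ np.array([x, y, 0.0]) + t)
full = (ku / k, kv / k)
via_h = apply_homography(H, x, y)   # agrees with `full` up to rounding
```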

There are several methods that allow us to estimate the values of the homography matrix, and you might be familiar with some of them. The one we will be using is RANdom SAmple Consensus (RANSAC). RANSAC is an iterative algorithm used for model fitting in the presence of a large number of outliers, and Figure 12 illustrates the main outline of the process. Since we cannot guarantee that all the matches we have found are actually valid matches, we have to consider that there might be some false matches (which will be our outliers) and, hence, we have to use an estimation method that is robust against outliers. Figure 11 illustrates the problems we could have when estimating the homography if we considered that there were no outliers.


Figure 11: Homography estimation in the presence of outliers. Source: F. Moreno.


Figure 12: RANSAC algorithm outline. Source: F. Moreno.

As a demonstration of how RANSAC works and to make things clearer, assume we had the following set of points for which we wanted to fit a line using RANSAC:


Figure 13: Initial set of points. Source: F. Moreno

From the general outline presented in Figure 12 we can derive the specific process to fit a line using RANSAC (Figure 14).


Figure 14: RANSAC algorithm to fit a line to a set of points. Source: F. Moreno.

A possible outcome of running the algorithm presented above can be seen in Figure 15. Note that the first 3 steps of the algorithm are only shown for the first iteration (indicated by the bottom right number), and from that on only the scoring step is shown.


Figure 15: Using RANSAC to fit a line to a set of points. Source: F. Moreno.
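For readers who prefer code to diagrams, here is a small pure-Python sketch of RANSAC line fitting. It follows the generic sample, fit and score loop, so its steps may not map one to one onto Figure 14; the function names, the default threshold and the number of iterations are assumptions made for the example.

```python
import random

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through two points, with (a, b) of unit length."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return None  # degenerate sample: both points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, threshold=1.0, iterations=100, seed=None):
    """Return the line with most inliers and the inliers themselves."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        # 1. Randomly sample the minimum number of points (two for a line)
        # 2. Fit a candidate model to the sample
        line = fit_line(*rng.sample(points, 2))
        if line is None:
            continue
        a, b, c = line
        # 3. Score: count the points closer to the line than the threshold
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= threshold]
        # 4. Keep the candidate with the largest consensus set
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

# Ten points on y = 2x + 1 plus three outliers
line_pts = [(x, 2 * x + 1) for x in range(10)]
outliers = [(0, 20), (5, -10), (9, 40)]
line, inliers = ransac_line(line_pts + outliers, threshold=0.1,
                            iterations=200, seed=0)
```

With this data the outliers lie far from the true line, so the largest consensus set found should contain only the ten points on it.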

Now back to our use case, homography estimation. For homography estimation the algorithm is presented in Figure 16. Since it is mainly math, I won’t go into details on why 4 matches are needed or on how to estimate H. However, if you want to know why and how it’s done, this is a good explanation of it.


Figure 16: RANSAC for homography estimation. Source: F. Moreno.

Before seeing how OpenCV can handle this for us we should discuss one final aspect of the algorithm, which is what it means for a match to be consistent with H. What this mainly means is that if, after estimating a homography, we project into the target image the matches that were not used to estimate it, then the projected points from the reference surface should be close to their matches in the target image. How close they should be to be considered consistent is up to you.
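The consistency check can be sketched in a few lines of NumPy. This is an illustration of the idea rather than OpenCV's internal code; the function name is hypothetical and the threshold is measured in pixels.

```python
import numpy as np

def consistent_matches(H, src_pts, dst_pts, threshold=5.0):
    """Boolean mask of the matches that are consistent with H.

    src_pts, dst_pts: (n, 2) arrays with the matched points in the reference
    surface and in the target image, respectively.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Homogeneous coordinates: append a 1 to every source point
    src_h = np.hstack([src, np.ones((len(src), 1))])
    projected = src_h @ H.T
    # Back to pixel coordinates by dividing by the third component
    projected = projected[:, :2] / projected[:, 2:3]
    # Distance between each projected point and its match
    errors = np.linalg.norm(projected - dst, axis=1)
    return errors < threshold

# Toy homography: scale by 2 and translate by (10, 20)
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 20.0],
              [0.0, 0.0, 1.0]])
src = [[0, 0], [1, 1], [5, 5]]
dst = [[10, 20], [12, 22], [100, 100]]   # the last match is a false one
print(consistent_matches(H, src, dst))   # [ True  True False]
```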

I know it has been tough to reach this point, but thankfully there is a reward. In OpenCV estimating the homography with RANSAC is as easy as:

import numpy as np

# assuming matches stores the matches found and
# returned by bf.match(des_model, des_frame)
# differentiate between source points and destination points
src_pts = np.float32([kp_model[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst_pts = np.float32([kp_frame[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
# compute Homography
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)


Here 5.0 is the threshold distance used to determine whether a match is consistent with the estimated homography. If, after estimating the homography, we project the four corners of the reference surface onto the target image and connect them with lines, we should expect the resulting lines to enclose the reference surface in the target image. We can do this by:

# Draw a rectangle that marks the found model in the frame
h, w = model.shape
pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
# project corners into frame
dst = cv2.perspectiveTransform(pts, M)
# connect them with lines, drawing on the scene image
img2 = cv2.polylines(cap, [np.int32(dst)], True, 255, 3, cv2.LINE_AA)
cv2.imshow('frame', img2)
cv2.waitKey(0)


which results in:


Figure 17: Projected corners of the reference surface with the estimated homography.

I think this is enough for today. In the next post we will see how to extend the homography we already estimated to project not only points in the reference surface plane but any 3D point from the reference surface coordinate system to the target image. We will then use this method to compute in real time, for each video frame, the specific projection matrix and then project in a video stream a 3D model of our choice from an .obj file. What you can expect at the end of the next post is something similar to what you can see in the gif below:


As always, I will upload the whole code of the project as well as some 3D models to GitHub for you to test when publishing part 2.

Part 2 is already here!


Landscape generation using midpoint displacement

Today I will present how to implement in Python a simple yet effective algorithm for procedurally generating 2D landscapes. It is called Midpoint Displacement (or Diamond-square algorithm, which seems less intuitive to me) and, with some tweaking, it can also be used for creating rivers, lightning strikes or (fake) graphs. The final output may look like the following image.


Example terrain generated with the presented algorithm.

Algorithm overview

The main idea of the algorithm is as follows: Begin with a straight line segment, compute its midpoint and displace it by a bounded random value. This displacement can be done either by:

  1. Displacing the midpoint in the direction perpendicular to the line segment.
  2. Displacing only the y coordinate value of the midpoint.

Different methods for displacing the midpoint.
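The two displacement options can be sketched as follows. The function names are made up, and the displacement is drawn uniformly from [-bound, bound] here just to keep the example short; the implementation later in this post picks one of the two bounds instead.

```python
import math
import random

def displace_vertically(p, q, bound, rng=random):
    """Midpoint of segment p-q with only its y coordinate displaced."""
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    return [mx, my + rng.uniform(-bound, bound)]

def displace_perpendicularly(p, q, bound, rng=random):
    """Midpoint of segment p-q displaced along the segment's normal.

    Assumes p and q are different points.
    """
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    dx, dy = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dx, dy)
    # Unit vector perpendicular to the segment direction (dx, dy)
    nx, ny = -dy / length, dx / length
    d = rng.uniform(-bound, bound)
    return [mx + d * nx, my + d * ny]

# Same segment, both methods, displacement bounded by 3 units
print(displace_vertically([0, 0], [4, 2], 3))
print(displace_perpendicularly([0, 0], [4, 2], 3))
```

With the vertical method the x coordinate of the new point is always the x of the midpoint, which keeps the points ordered from left to right and makes the profile easy to draw.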

This first iteration will result in two straight line segments obtained from the displacement of the midpoint of the original segment. The same process of computing and displacing the midpoint can then be applied to each of these two new segments, which will result in four straight line segments. Then we can repeat the process for each of these four segments to obtain eight, and so on. The process can be repeated iteratively or recursively as many times as desired or until the segments cannot be reduced any further (for graphical applications this limit would be segments two pixels wide). The following image may help to clarify what I just said.


From top to bottom, successive iterations of the algorithm.

And that’s it! This is the core idea of the midpoint displacement algorithm. In pseudocode it looks like:

Initialize segment
While iterations < num_of_iterations and segments_length > min_length:
    For each segment:
        Compute midpoint
        Displace midpoint
        Update segments
    Reduce displacement bounds

However, before implementing the algorithm we should dig deeper in some of the concepts that have arisen so far. These are mainly:

  • How much should we displace the midpoint?
  • How much should the displacement bounds be reduced after each iteration?

How much should we displace the midpoint?

Sadly, there is no general answer for this question because it greatly depends on two aspects:

  1. The application the algorithm is being used for
  2. The desired effect

Since in this post our scope is terrain generation I will limit the explanation to the effects that this parameter has in this area. However, the ideas that I will explain now can be extrapolated to any other application where the Midpoint Displacement algorithm may be used. As I see it, there are two key considerations that should be taken into account when deciding the initial displacement value.

First of all, we should consider the desired type of terrain. Hopefully it makes sense to say that the bigger the mountains we want to generate, the bigger the initial displacement value should be, and vice versa. With a bit of trial and error it is easy to get an idea of the average profiles generated by different values and how they look. The point here is that bigger mountains need bigger initial displacement values.

Secondly, we should consider the overall dimensions (width and height) of the generated terrain. The initial displacement should be regarded as a value that depends on those dimensions. What I mean is that an initial displacement of 5 may be huge when dealing with a 5×5 image but will hardly be noticed in a 1500×1500 image.


The same initial displacement value may be too big for a certain image size but may suit another image size well.

How much should the bounds be reduced after each iteration?

Well, the answer again depends on the desired output. It should be intuitive that the smaller the displacement reduction, the more jagged the obtained profile will be, and vice versa. The two extremes are no displacement reduction at all and setting the displacement to 0 after the first iteration. These two cases can be observed in the figure below.


On the left: no displacement reduction. On the right: Displacement reduction to 0 after first iteration.

Somewhere in between is the displacement reduction that will yield the desired output. There are plenty of ways to reduce the displacement bounds each iteration (linearly, exponentially, logarithmically, etc.) and I encourage you to try different ones and see how the results vary.

What I did was define a standard displacement reduction of 1/2, which means that the displacement is reduced by half each new iteration, and a displacement decay power i such that the displacement reduction is

displacement_reduction = 1/(2^i)


displacement_bounds(k+1) = displacement_bounds(k)*displacement_reduction

where k is the current iteration and k+1 the next iteration. We can then talk about the obtained terrain profiles in terms of this decay power i. Below you can see how the algorithm performs for different decay powers.


Obtained profiles for different values of  the displacement decay powers.
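To get a feel for how quickly the bounds shrink, the short sketch below lists the bound used at each iteration for a given decay power, following the two formulas above. The function name is hypothetical and is not part of the implementation that follows.

```python
def displacement_bounds(initial_bound, decay_power, iterations):
    """Bound used at each iteration when it shrinks by 1/(2^i) every step."""
    reduction = 1 / (2 ** decay_power)
    bounds = [initial_bound]
    for _ in range(iterations - 1):
        # displacement_bounds(k+1) = displacement_bounds(k) * reduction
        bounds.append(bounds[-1] * reduction)
    return bounds

print(displacement_bounds(100, 0, 4))  # [100, 100.0, 100.0, 100.0]
print(displacement_bounds(100, 1, 4))  # [100, 50.0, 25.0, 12.5]
print(displacement_bounds(100, 2, 4))  # [100, 25.0, 6.25, 1.5625]
```

A decay power of 0 means no reduction at all (the jagged extreme), while larger powers make the later iterations contribute less and less detail.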

Bear in mind that the two factors we just saw, the bounds reduction and the initial displacement, are related to each other and do not affect the output independently. Smaller initial displacements may look good with smaller decay powers, and vice versa. Here we have talked about some guidelines that may help when deciding which values to use, but there will be some trial and error until the right parameters for the desired output are found. Finally, the number of iterations is another factor that also affects the output in relation with the initial displacement and the bounds reduction.

Python implementation

Finally it is time to, with all the ideas explained above, code our 2D terrain generator. For this particular implementation I have decided to:

  • Displace only the y coordinate of the midpoints (the second of the two displacement methods explained at the beginning).
  • Use symmetric bounds with respect to zero for the displacement (if b is the upper bound then –b will be the lower bound.)
  • Choose the displacement value to be either the upper bound or the lower bound, but never allow values in between.
  • Reduce the bounds after each iteration by multiplying the current bounds by 1/(2^i)

We will have three functions: one that will generate the terrain, one that will draw the generated terrain and one that will handle the above processes.

Before implementing the functions we should first import the modules that we will use:

import os                             # path resolving and image saving
import random                         # midpoint displacement
from PIL import Image, ImageDraw      # image creation and drawing
import bisect                         # working with the sorted list of points

Terrain generation

For the terrain generation we need a function that, given a straight line segment returns the profile of the terrain. I have decided to provide as inputs the initial segment and displacement, the rate of decay or roughness of the displacement and the number of iterations:

# Iterative midpoint vertical displacement
def midpoint_displacement(start, end, roughness, vertical_displacement=None,
                          num_of_iterations=16):
    """
    Given a straight line segment specified by a starting point and an endpoint
    in the form of [starting_point_x, starting_point_y] and [endpoint_x, endpoint_y],
    a roughness value > 0, an initial vertical displacement and a number of
    iterations > 0 applies the  midpoint algorithm to the specified segment and
    returns the obtained list of points in the form
    points = [[x_0, y_0],[x_1, y_1],...,[x_n, y_n]]
    """
    # NOTE: the default of 16 iterations is an assumed value; main() below
    # always passes the number of iterations explicitly.
    # Final number of points = (2^iterations)+1
    if vertical_displacement is None:
        # if no initial displacement is specified set displacement to:
        #  (y_start+y_end)/2
        vertical_displacement = (start[1]+end[1])/2
    # Data structure that stores the points is a list of lists where
    # each sublist represents a point and holds its x and y coordinates:
    # points=[[x_0, y_0],[x_1, y_1],...,[x_n, y_n]]
    #              |          |              |
    #           point 0    point 1        point n
    # The points list is always kept sorted from smallest to biggest x-value
    points = [start, end]
    iteration = 1
    while iteration <= num_of_iterations:
        # Since the list of points will be dynamically updated with the new computed
        # points after each midpoint displacement it is necessary to create a copy
        # of the state at the beginning of the iteration so we can iterate over
        # the original sequence.
        # A tuple is used for safety, since tuples are immutable in Python.
        points_tup = tuple(points)
        for i in range(len(points_tup)-1):
            # Calculate x and y midpoint coordinates:
            # [(x_i+x_(i+1))/2, (y_i+y_(i+1))/2]
            # Integer division keeps x an integer pixel column, which the
            # drawing function relies on when sampling with range().
            midpoint = [(points_tup[i][0]+points_tup[i+1][0])//2,
                        (points_tup[i][1]+points_tup[i+1][1])/2]
            # Displace midpoint y-coordinate
            midpoint[1] += random.choice([-vertical_displacement,
                                          vertical_displacement])
            # Insert the displaced midpoint in the current list of points
            bisect.insort(points, midpoint)
            # bisect allows to insert an element in a list so that its order
            # is preserved.
            # By default the maintained order is from smallest to biggest list first
            # element which is what we want.
        # Reduce displacement range
        vertical_displacement *= 2 ** (-roughness)
        # update number of iterations
        iteration += 1
    return points

The initial line segment is specified by the coordinates of the points where it begins and ends. Both are a list in the form:

point = [x_coordinate, y_coordinate]

And the output is a list of lists containing all the points that should be connected to obtain the terrain profile in the form:

points = [[x_0, y_0], [x_1, y_1], ..., [x_n, y_n]]

Terrain drawing

For the graphical output we need a function that returns an image of the drawn terrain and that takes as inputs at least the profile generated by the midpoint displacement algorithm. I have also included as inputs the desired width and height of the image and the colors it should use for painting. What I did for drawing several layers of terrain was start with the layer in the background and draw each new layer on top of the previous one.

For drawing each layer I first infer the y value of every x value in the range (0, image width), based on the assumption that the known points, the ones obtained from the midpoint displacement, are connected with straight lines. Once the y value of each x value in the range (0, image width) is known, I traverse all the x values iteratively and, for each x value, draw a line from its y value to the bottom of the image.

def draw_layers(layers, width, height, color_dict=None):
    # Default color palette
    if color_dict is None:
        color_dict = {'0': (195, 157, 224), '1': (158, 98, 204),
                      '2': (130, 79, 138), '3': (68, 28, 99), '4': (49, 7, 82),
                      '5': (23, 3, 38), '6': (240, 203, 163)}
    # len(color_dict) should be at least: # of layers +1 (background color)
    if len(color_dict) < len(layers)+1:
        raise ValueError("Num of colors should be bigger than the num of layers")

    # Create image into which the terrain will be drawn.
    # The last color of the palette is used as background.
    landscape = Image.new('RGBA', (width, height),
                          color_dict[str(len(color_dict)-1)])
    landscape_draw = ImageDraw.Draw(landscape)
    # Draw the sun
    landscape_draw.ellipse((50, 25, 100, 75), fill=(255, 255, 255, 255))
    # Sample the y values of all x in image
    final_layers = []
    for layer in layers:
        sampled_layer = []
        for i in range(len(layer)-1):
            sampled_layer += [layer[i]]
            # If x difference is greater than 1
            if layer[i+1][0]-layer[i][0] > 1:
                # Linearly sample the y values in the range x_[i+1]-x_[i]
                # This is done by obtaining the equation of the straight
                # line (in the form of y=m*x+n) that connects two consecutive
                # points
                m = float(layer[i+1][1]-layer[i][1])/(layer[i+1][0]-layer[i][0])
                n = layer[i][1]-m*layer[i][0]
                r = lambda x: m*x+n  # straight line
                for j in range(layer[i][0]+1, layer[i+1][0]):  # for all missing x
                    sampled_layer += [[j, r(j)]]  # Sample points
        final_layers += [sampled_layer]

    # The index of each layer selects its color in the palette
    for index, final_layer in enumerate(final_layers):
        # traverse all x values in the layer
        for x in range(len(final_layer)-1):
            # for each x value draw a line from its y value to the bottom
            landscape_draw.line((final_layer[x][0], height-final_layer[x][1],
                                 final_layer[x][0], height),
                                color_dict[str(index)])

    return landscape


The PIL module (like almost all modules that work with images) sets the origin of coordinates at the top left corner of the image. The x value increases when moving right and the y value increases when moving down. The values passed to the function that draws the lines have to be expressed in this reference system, which is why the y values of each line have to be transformed from our reference system (origin at the lower left) to PIL’s reference system.


Differences between PIL’s reference system and the one we have been using so far.
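The change of reference system boils down to flipping the y coordinate, which is what the expression height - y does in the drawing function. A tiny sketch with a hypothetical helper:

```python
def to_image_coords(point, image_height):
    """Convert a point from a lower-left origin to PIL's top-left origin."""
    x, y = point
    # x is unchanged; y is measured from the top instead of from the bottom
    return x, image_height - y

# In a 500 px tall image, ground level (y = 0) maps to the bottom edge
print(to_image_coords((10, 0), 500))    # (10, 500)
# and a peak 350 px high ends up 150 px below the top of the image
print(to_image_coords((10, 350), 500))  # (10, 150)
```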

With these two functions we are now able to actually compute and draw our procedurally generated 2D terrain.

Our main function

The final step is to define our main function. This function will compute the profiles of the desired number of layers, draw them and save the obtained terrain as a .png image:

def main():
    width = 1000  # Terrain width
    height = 500  # Terrain height
    # Compute different layers of the landscape
    layer_1 = midpoint_displacement([250, 0], [width, 200], 1.4, 20, 12)
    layer_2 = midpoint_displacement([0, 180], [width, 80], 1.2, 30, 12)
    layer_3 = midpoint_displacement([0, 270], [width, 190], 1, 120, 9)
    layer_4 = midpoint_displacement([0, 350], [width, 320], 0.9, 250, 8)

    landscape = draw_layers([layer_4, layer_3, layer_2, layer_1], width, height)
    # Save the image in the current working directory
    landscape.save(os.path.join(os.getcwd(), 'testing.png'))


To call the main() function when we run the program we finally add the lines:

if __name__ == "__main__":
    main()


And we’re done! Now it’s your turn to code your own terrain generator and play with its different parameters to modify the results (you can also change the colors). If you have any questions, do not hesitate to contact me. You can find the whole code on GitHub.

As a bonus, some more terrain images obtained with the previous code:


Midpoint Displacement 2D generated landscapes.