
Converting images to ASCII art (Part 2)

Today’s post is the second and last in the series on how to convert images to ASCII art (in case you missed it and want to quickly catch up, here’s part 1). We already have a grayscale version of the image we want to convert, and we have also developed a method that lets the user adjust its contrast. What we have to implement now is the mapping from the grayscale version of the input image to the ASCII art image. This is what, without further ado, we will discuss below.

flower

Execution flow of the program as explained in part 1.

Mapping pixels to ASCII characters

Since what defines the grey level of a pixel in a grayscale image is its intensity value, we have to find a way that allows us to relate or match ASCII characters and intensity values.

It is obvious then that the inputs of the mapping will be intensity values and the outputs ASCII characters. I hope you agree that a good way to do this mapping is to:

  1. Associate an intensity value with each of the 95 printable ASCII characters. The “darker” the character, the lower its associated intensity value (remember that 0 is black and 255 white). This means that the intensity value associated with the character “#” should be closer to 0 than the intensity value of the character “-”, since “#” is “darker” than “-”. We’ll see what “darker” means in a moment. Note that this step has to be done only once and that it can be done at the initialization of the program.
  2. Iterate over the pixels of the image and, for each pixel’s intensity, look up which intensity value from the table created in step 1 is closest to it.
  3. Once this match is found, retrieve the ASCII character associated with that intensity and substitute the ASCII character for the pixel.

A simplified version of this process can be seen in the Figure below.

lookup_2

Simple example of the desired mapping.

The pseudocode for the transformation is then:

For i <= Image Height
	For j <= Image Width
		Retrieve grayscale intensity at (i,j) -> G(i,j)
		Find min(abs(G(i,j) - ASCII intensity table))
		Get the associated ASCII character
		Draw the ASCII character at (i,j)
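
As a concrete illustration of steps 2 and 3, the lookup at the heart of this loop could be sketched in C# as follows. This is a minimal sketch, assuming the WeightedChar class introduced below (whose Weight values have already been mapped to the range 0-255) and System.Linq; the actual lookup used by the program appears later inside Convert2ASCII:

// Sketch: return the character whose associated intensity is closest
// to the given pixel intensity.
private static WeightedChar ClosestChar(double pixelIntensity, List<WeightedChar> characters)
{
    return characters
        .OrderBy(wc => Math.Abs(wc.Weight - pixelIntensity))
        .First();
}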

 

Bearing this in mind, what we have to do now, before implementing the actual mapping, is to develop a method by which we can associate an intensity with each printable ASCII character.

Assigning intensities to ASCII characters

The method I chose for assigning an intensity value to each printable ASCII character is explained below, but let me first introduce an important consideration: to obtain a good and aesthetic result, the proportions of the original image have to be maintained in the generated ASCII image. To achieve this I decided that each drawn character, independently of its size (“X” needs more space to be drawn than ”.”), would be assigned the same amount of space in the output image (a square patch of c×c pixels), and that this amount of space would be that of the character that needs the most space to be drawn. By doing this we guarantee that proportions are maintained through the transformation (although the image increases its size by a factor of c²; we’ll see how to solve this later). We can obtain the size of the patch via:

private static SizeF GetGeneralSize()
{
    SizeF generalsize = new SizeF(0, 0);
    // Create a dummy bitmap just to get a graphics object to measure with
    using (Image img = new Bitmap(1, 1))
    using (Graphics drawing = Graphics.FromImage(img))
    {
        for (int i = 32; i <= 126; i++) // Iterate through the printable characters, computing the necessary width
        {
            char c = Convert.ToChar(i);
            // Measure the string to see its dimensions using the graphics object
            SizeF textSize = drawing.MeasureString(c.ToString(), System.Drawing.SystemFonts.DefaultFont);
            // Update, if necessary, the max width and height
            if (textSize.Width > generalsize.Width)
                generalsize.Width = textSize.Width;
            if (textSize.Height > generalsize.Height)
                generalsize.Height = textSize.Height;
        }
    }
    // Make sure generalsize is made of integers
    generalsize.Width = (int)generalsize.Width;
    generalsize.Height = (int)generalsize.Height;
    // and that the size defines a square, to maintain image proportions
    if (generalsize.Width > generalsize.Height)
        generalsize.Height = generalsize.Width;
    else
        generalsize.Width = generalsize.Height;

    return generalsize;
}

 

Where generalsize will hold the size of the patch. Knowing the space any character will occupy in the output image also eases the task of associating an intensity with each character: we can define a ratio that measures the greyness of a character as the mean pixel intensity of its patch. The character is drawn in black over a white c×c patch, the intensities of all its pixels are summed and the result is divided by the patch area.

Character greyness = (sum of pixel intensities in the patch) / c²

Once we have the ratio computed for all printable ASCII characters we just have to remap the values to the range 0-255 so that they can be matched with pixel intensities. The following two functions compute this ratio and perform the above-mentioned mapping. The first one computes the ratio of each character:

private static double GetWeight(char c, SizeF size)
{
    var CharImage = HelperMethods.DrawText(c.ToString(), Color.Black, Color.White, size);

    Bitmap btm = new Bitmap(CharImage);
    double totalsum = 0;

    for (int i = 0; i < btm.Width; i++)
    {
        for (int j = 0; j < btm.Height; j++)
        {
            totalsum = totalsum + (btm.GetPixel(i, j).R
                                 + btm.GetPixel(i, j).G
                                 + btm.GetPixel(i, j).B) / 3.0;
        }
    }
    // Weight = (sum of (R+G+B)/3 for all pixels in the patch) / Area (where Area = Width*Height)
    return totalsum / (size.Height * size.Width);
}
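
A quick note: GetWeight relies on HelperMethods.DrawText, which is not listed in this post. As an illustration of what such a helper might look like (this sketch is an assumption for completeness, not necessarily the project's actual implementation), it could be something along these lines:

// Hypothetical sketch: render the given text in textColor over a patch of
// backColor with the requested size, using the same default system font
// that was used for measuring.
public static Image DrawText(string text, Color textColor, Color backColor, SizeF size)
{
    Image img = new Bitmap((int)size.Width, (int)size.Height);
    using (Graphics drawing = Graphics.FromImage(img))
    using (Brush textBrush = new SolidBrush(textColor))
    {
        drawing.Clear(backColor);
        drawing.DrawString(text, System.Drawing.SystemFonts.DefaultFont, textBrush, 0, 0);
    }
    return img;
}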

 

And this second one, once all the weights have been computed, maps them to the desired range:

private static List<WeightedChar> LinearMap(List<WeightedChar> characters)
{
    double max = characters.Max(c => c.Weight);
    double min = characters.Min(c => c.Weight);
    double range = 255;
    // y = m*x + n (where y is in the range 0-255)
    double slope = range / (max - min);
    double n = -min * slope;
    foreach (WeightedChar charactertomap in characters)
    {
        charactertomap.Weight = slope * charactertomap.Weight + n;
    }
    return characters;
}

 

Finally, we can define another function, called when the program starts, that performs all these tasks and returns a list of all the printable characters with their associated weights already mapped to the range 0-255. The class WeightedChar stores the character, its weight and the image of the character drawn over a white patch of c×c pixels.

public static List<WeightedChar> GenerateFontWeights() // Collect chars, their images and weights in a list of WeightedChar
{
    List<WeightedChar> WeightedChars = new List<WeightedChar>();

    SizeF commonsize = GetGeneralSize(); // Get the standard square size, which will be common to all character images

    for (int i = 32; i <= 126; i++) // Iterate through the printable characters
    {
        var forweighting = new WeightedChar(); // New object to hold the image, weight and char of the new character
        char c = Convert.ToChar(i);
        if (!char.IsControl(c))
        {
            forweighting.Weight = GetWeight(c, commonsize); // Get character weight
            forweighting.Character = c.ToString(); // Get character representation (the actual char)
            forweighting.CharacterImage = (Bitmap)HelperMethods.DrawText(c.ToString(), Color.Black, Color.White, commonsize); // Get character image
        }
        WeightedChars.Add(forweighting);
    }

    WeightedChars = LinearMap(WeightedChars); // Linearly map character weights from MinCalcWeight-MaxCalcWeight to the range 0-255
                                              // so that pixels can be directly mapped to characters
    return WeightedChars;
}

One character per pixel?

Now, there is at least one problem with the method we have developed so far: for an m×n image, the resulting ASCII image will measure m·c by n·c pixels, that is, c² times as many pixels as the original, which might be many times bigger. As a consequence, when the image gets squeezed to fit on the screen, the characters might not be clearly visible. This is, from my point of view, something that should not be ignored, since the whole point is that the characters are visible.

Luckily, this can be easily solved: instead of performing a one-to-one mapping, we can map a neighbourhood of p×q pixels to a single character, using the average intensity of the neighbourhood as the input for the mapping. This is admittedly a subjective choice, since the one-to-one mapping obviously preserves more detail of the original image. So, why not let the user play with these parameters and choose the ones they prefer?
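
To get a feel for the numbers: with a 2×2 neighbourhood, a 400×300 pixel image becomes a 200×150 grid of characters, so the output bitmap measures 200·c by 150·c pixels instead of 400·c by 300·c.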

updatedGUI

Updated program GUI to let the user select the p×q neighbourhood size. I’m by no means a graphical designer, but I tried my best to make it look good.

Taking this point into consideration, the final function that generates the ASCII image can be easily defined as shown below:

public static Image Convert2ASCII(Image BW_Image, int? w, int? h, List<WeightedChar> characters, out List<List<string>> ResultText)
{

	/*
	 * ALGORITHM:
	 * 
	 *  1- Get target Image size (w=Width,h=Height)
	 *  2- Create Result Image with white background and size W = w*character_image_width
	 *                                                        H = h*character_image_height
	 *  3- Create empty string to hold the text   
	 *  
	 *  4- for (int j=0; j<target_Image_Height; j++) --> ALL ROWS 
	 *       5- Create row text string
	 *       for (int i=0; i<target_Image_Width; i++) --> ALL COLUMNS
	 *          6- Get target pixel weight
	 *          7- Get closest weight from character list
	 *          8- Paste character image in position w = i*character_image_width
	 *                                               h = j*character_image_height
	 *            ¡¡ Be careful with coordinate system when placing Images !!
	 *          9- Add (string)character to row text string
	 *       10- Add row text string to text holding string
	 *  11 - return resulting Image & Text
	 */

	if (w == null & h == null)
	{
		w = 1;
		h = 1;
	}
	int width = (int)w;
	int height = (int)h;

	// Be careful when Image.Width/width or Image.Height/height are not integer values (use coherent approximations)
	Image ResultImage = new Bitmap(BW_Image.Width * characters[0].CharacterImage.Width/width, BW_Image.Height * characters[0].CharacterImage.Height/height);
	Graphics drawing = Graphics.FromImage(ResultImage);
	drawing.Clear(Color.White);
	ResultText = new List<List<string>> { };
	Bitmap BlackAndWhite = (Bitmap)BW_Image;

	for (int j = 0; j < BW_Image.Height; j+=height) // ROW
	{
		List<string> RowText = new List<string> { };
		for (int i=0; i <BW_Image.Width; i+=width) // COLUMN
		{
			double targetvalue = 0;

			for (int nj=j; nj<j+height; nj++)
			{
				for (int ni = i; ni < i+width; ni++)
				{
					// Sum all the pixels in neighbourhood and average
					try
					{
						targetvalue += BlackAndWhite.GetPixel(ni, nj).R;
					}
					catch (Exception)
					{
						targetvalue += 128;
					}
				}
			}
			targetvalue /= (width * height);
			WeightedChar closestchar = characters.Where(t=>Math.Abs(t.Weight-targetvalue)==characters.Min(e => Math.Abs(e.Weight - targetvalue))).FirstOrDefault();
			RowText.Add(closestchar.Character);
			drawing.DrawImage(closestchar.CharacterImage, (i/width) * closestchar.CharacterImage.Width, (j/height) * closestchar.CharacterImage.Height);
		}
		ResultText.Add(RowText);
	}
	drawing.Dispose();
	return (Image)ResultImage;    
}

 

The following Figure shows the obtained results for different neighbourhood sizes.

Comparison

The bigger the size of the neighbourhood, the more visible the characters are. However, less detail is preserved.

Final notes

There are many improvements that could be made to make the program more efficient, and more functionality could be implemented. However, I will leave those improvements to you; what has been explained here should be enough to get you started.

Finally, any positive/constructive criticism is welcome and, for those interested, this version of the project can be found here and the code can also be found here.

Converting images to ASCII art (Part 1)

In the last post we discussed a simple method for procedurally generating 2D terrain. In today’s post, we will also generate an image that will be the final output of our program, but with a totally different purpose and method. Today we will try to generate ASCII art from images.  For those of you who are not familiar with ASCII art, what we will achieve will look (more or less) like the image below.

c

Original image (left) and output image (right).

Not as impressive as you thought? If you are still not convinced (I promise this point will be revisited later on), take a look at the image at full resolution and stare at it after moving a few feet away from the screen. And yes, it is only made of ASCII characters! This can be quickly checked by zooming in on any section of the image.

zoomin

Zooming a region of the image until characters are visible.

It should be clear from the previous example that the output image has two main characteristics: it is composed solely of ASCII characters and it looks like the original image. This is our main goal. However, we will implement and discuss some extra features or alternative implementations along the way in order to improve the results. Finally, before we embark I would like to point out three things:

  1. The ideas that will be explained can be extended to any kind of process where the resulting image is a composition of other images and where the aim is to resemble a target or input image.
  2. The post is about the algorithms and ideas that make the program work, not a detailed explanation of the whole program.
  3. The example code presented is written in C#. However, the main ideas can be easily implemented in other programming languages. (Disclaimer: I am currently learning C# and this was my first project, done for learning purposes. Sorry if the code isn’t as clean or efficient as it could be or if the project architecture isn’t the ideal one. Any suggestions on how to improve it are welcome. I am not teaching how to code in C#.)

Methodology

One of the first things I tend to do when developing software is to work out the steps the program should take in order to achieve the desired outcome. It is not the only way to develop a general scheme of what the program has to do, but in this case we will work our way backwards: starting from the ASCII art image, we will try to guess where it came from and how it was obtained.

Clearly there has been a point in the process of generating the ASCII art image where each pixel or region of pixels from the original image has been mapped to a certain character. Since the final output is the ASCII image we won’t be too wrong if we assume that this is the last step of our process.

flow

Last step of the process.

Now we have to deduce if the image used as input for the mapping is the original image or if there has been some pre-processing applied to it before this mapping step. Since the ASCII art image is made only of black characters over a white background, which from a certain distance looks like a grayscale image, a good guess is that a grayscale version of the original image is the image used as a reference for the mapping.

n_full_flow

Getting closer to knowing the full process.

Since a grayscale version can be easily obtained from the original image, the final scheme of the process is:

full_flow

Overview of the process.

A more general way of defining the “Convert to grayscale” step would be “Image pre-processing”. This would include, in the case we wanted to perform more operations than just a grayscale conversion, all the different operations and transformations applied to the image before the final mapping of the image to ASCII characters.

Image pre-processing

Grayscale conversion

There are many methods that can be applied to obtain a grayscale version of an RGB image and I will only focus on the one I decided to use. What I applied is a conversion that preserves the luminance (or luminous intensity) of the original image in the grayscale image. Each pixel’s intensity or grey value is simply the weighted sum of the values of its RGB components computed as follows:

I = 0.299·R+0.587·G+0.114·B ≈ 0.3·R+0.59·G+0.11·B
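
For example, a pure green pixel with (R, G, B) = (0, 255, 0) gets an intensity of roughly 0.59·255 ≈ 150, while a pure blue pixel only gets 0.11·255 ≈ 28, reflecting how differently the eye perceives the brightness of each channel.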

It is not the most efficient way, but in C# this can be easily achieved with:

public static Bitmap Grayscale(Image image)
{
    Bitmap btm = new Bitmap(image);
    for (int i = 0; i < btm.Width; i++)
    {
        for (int j = 0; j < btm.Height; j++)
        {
            int ser = (int)(btm.GetPixel(i, j).R * 0.3 + btm.GetPixel(i, j).G * 0.59 + btm.GetPixel(i, j).B * 0.11);
            btm.SetPixel(i, j, Color.FromArgb(ser, ser, ser));
        }
    }
    return btm;
}

 

Here we are simply traversing all the pixels in the original image, computing the weighted sum and assigning the result as the pixel’s new intensity. Now, we could keep going from here, but since the final result depends largely on the grayscale version of the original image, wouldn’t it be nice if we allowed the user to have some control over the grayscale conversion? One way to do this is to let the user adjust the contrast of the original image and, when done, convert it to grayscale. We could let the user adjust the contrast of the grayscale image directly, but I preferred to build a simple interface that only displays two images, the original and the one made of ASCII characters, which is why I decided to implement the contrast adjustment on the original image.

preprocess

Updated program execution flow adding the contrast adjustment step.

Contrast adjustment

So, what is the contrast of an image and how can we modify it? Basically, the contrast of an image is the difference between the maximum and minimum pixel intensity, the “amount” of separation between its darkest and brightest areas. It is then determined by the colour and brightness of the different objects or shapes that appear in the image. This means that the higher the contrast, the bigger the difference between pixel intensities, and the easier it will be to recognize the different objects in the image (due to this bigger range of intensities). This can be appreciated in the image below.

contrast_change_photoshop

Contrast variation. The contrast increases from bottom to top on the left column and from top to bottom on the right column. Source: Wikipedia.

Knowing this, it’s time to tackle the big question: how can we modify (increase or decrease) the contrast of an image?

We will achieve this by applying a pixel-by-pixel operation to the image (as we did when converting to grayscale) that modifies each pixel’s intensity with the objective of expanding or contracting the range of intensity values, or luminance, of the image. The value of the output image at a particular pixel will depend only on the value of the pixel located at that same position in the original image. In Computer Vision, this kind of operation is called a point operation.

po

Schematic of a point operation.

Note that this implies that two pixels with the same intensity in the original image will be mapped to the same intensity value in the output image, which will usually differ from the original one. The transformation is then a single-variable function

g(x,y) = T( f(x,y) )

Where g(x,y) is the intensity of the pixel (x,y) in the output image, f(x,y) is the intensity of the original image at the same position, and T is the applied transformation. The position of the pixel plays no role at all; its intensity is the only property that matters when varying the contrast. Point operations can be plotted the same way single-variable functions are: on the x axis, the range of values the input property can take (I) and, on the y axis, the mapped values that result from the transformation (I’). We will make use of this to derive our contrast adjustment transformation intuitively. As an example, what would a transformation that didn’t modify the image at all look like?

equal

Point operations identity transformation.

Yep, that’s right. Each intensity value in the original image would be mapped to the same value in the output image. Recall now that what we want to achieve when increasing the contrast is to widen the range of intensities in the image. Any ideas on the shape of the transformation that would achieve this? You probably guessed right. If we increase the slope of the line presented above, say double it, any range of intensities in the input image will be mapped to a range twice as big in the output image. And if we decrease the slope, say halve it, any range of intensities in the input image will be halved in the output image. This can be appreciated in the image below.

ddriving

Deriving our contrast adjustment transformation.

We are getting close… Note that all the lines presented above have a point in common, in this case the origin. This is nice because, if we apply the transformation directly to the original image without creating a copy of it, having a fixed point allows us to recover the original image; this wouldn’t be possible otherwise. However, is the origin the best fixed point we can get? I don’t think so. Note that increasing the contrast in this way will result in many intensities being mapped to 255, the maximum allowed intensity in images that store each channel with 8 bits. I hope you agree that it would be better to split this region of “out of bounds” intensities between the maximum and minimum allowed intensities, which makes the loss of information symmetric. This results in choosing 127.5 as the fixed point of the transformation. Our contrast transformation so far then looks like

I’ = (I-127.5)·c + 127.5

where 127.5 can be rounded to 127 or 128.
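
For example, with c = 2 an intensity of 100 is mapped to (100 - 127.5)·2 + 127.5 = 72.5 and an intensity of 160 to 192.5, so values spread away from the midpoint and the contrast increases; with c = 0.5 the same intensities are pulled towards 127.5 instead, decreasing the contrast.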

final

Finally, our contrast adjustment transformation.

Not that difficult, eh? See how in the previous figure half of the “out of bounds” values get mapped to 0 and the other half to 255? The value of c is what decides how much we increase or decrease the contrast, and it can be chosen in many ways. In my case I am using a quadratic function to determine it: a scrollbar in the GUI of the program reads a value between -100 and 100, this value is then mapped to the range [0,2] and squared, so that c finally varies between 0 and 4. Finally, note that once the value of c is known, the mapping of every possible intensity can be computed in advance, so storing the transformation in a lookup table will improve the efficiency and speed of this step.
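
For instance, a scrollbar value of 50 gives (100 + 50)/100 = 1.5 and therefore c = 1.5² = 2.25, a moderate contrast increase, while a scrollbar value of 0 yields c = 1 and leaves the image untouched.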

comp_v

Example of the implemented contrast adjustment in action. For anyone who’s wondering, yeah, on the bottom right image you can see a nrf24L01+ module connected to a Pyboard.

 

The code below, which is in charge of varying the contrast, should now be easy to follow. It is based on this Stack Overflow answer by BitBank.

public unsafe static void AdjustContrast(Bitmap bmp, double contrast)
{
    byte[] contrast_lookup_table = new byte[256];
    double newValue = 0;
    double c = (100.0 + contrast) / 100.0;

    c *= c;

    for (int i = 0; i < 256; i++)
    {
        newValue = (double)i;
        newValue /= 255.0;
        newValue -= 0.5;
        newValue *= c;
        newValue += 0.5;
        newValue *= 255;

        if (newValue < 0)
            newValue = 0;
        if (newValue > 255)
            newValue = 255;
        contrast_lookup_table[i] = (byte)newValue;
    }

    var bitmapdata = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
        System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format32bppArgb);

    int PixelSize = 4;

    for (int y = 0; y < bitmapdata.Height; y++)
    {
        byte* destPixels = (byte*)bitmapdata.Scan0 + (y * bitmapdata.Stride);
        for (int x = 0; x < bitmapdata.Width; x++)
        {
            destPixels[x * PixelSize] = contrast_lookup_table[destPixels[x * PixelSize]];         // B
            destPixels[x * PixelSize + 1] = contrast_lookup_table[destPixels[x * PixelSize + 1]]; // G
            destPixels[x * PixelSize + 2] = contrast_lookup_table[destPixels[x * PixelSize + 2]]; // R
        }
    }
    bmp.UnlockBits(bitmapdata);
}

 

I think that’s enough for today. We have now finished the pre-processing of the image (contrast adjustment + grayscaling) and in the next post we’ll see how to finally get the ASCII image. If you enjoyed it so far, be sure to stay tuned for part 2!

PS: If you want to play with the code or take a look at it, it is available here. I wanted to upload the final program (.exe) somewhere so you can download it and play with it. However, I have no idea of the best way to achieve this so that it is easy for you to get it. If you have any ideas please tell me. In the meantime, you can reach me and ask for it via e-mail at juangallostra@gmail.com

Update: You can download the compiled program now here. (Thanks to Campbell for pointing out that this could be easily done via Github releases.)

PS2: I know I haven’t yet revisited the point I made at the beginning, but I will, for sure, revisit it in the next post.

Landscape generation using midpoint displacement

Today I will present how to implement in Python a simple yet effective algorithm for procedurally generating 2D landscapes. It is called Midpoint Displacement (or Diamond-square algorithm, which seems less intuitive to me) and, with some tweaking, it can also be used for creating rivers, lightning strikes or (fake) graphs. The final output may look like the following image.

testing

Example terrain generated with the presented algorithm.

Algorithm overview

The main idea of the algorithm is as follows: Begin with a straight line segment, compute its midpoint and displace it by a bounded random value. This displacement can be done either by:

  1. Displacing the midpoint in the direction perpendicular to the line segment.
  2. Displacing only the y coordinate value of the midpoint.
disp.png

Different methods for displacing the midpoint.

This first iteration will result in two straight line segments obtained from the displacement of the midpoint of the original segment. The same process of computing and displacing the midpoint can then be applied to each of these two new segments, resulting in four straight line segments. Then we can repeat the process for each of these four segments to obtain eight, and so on. The process can be repeated iteratively or recursively as many times as desired, or until the segments cannot be divided any further (for graphical applications this limit would be segments two pixels wide). The following image may help to clarify what I just said.

disp_iter.png

From top to bottom, successive iterations of the algorithm.

And that’s it! This is the core idea of the midpoint displacement algorithm. In pseudocode it looks like:

Initialize segment
While iterations < num_of_iterations and segments_length > min_length:
	For each segment:
		Compute midpoint
		Displace midpoint
		Update segments
	Reduce displacement bounds
	iterations + 1

However, before implementing the algorithm we should dig deeper into some of the concepts that have arisen so far. These are mainly:

  • How much should we displace the midpoint?
  • How much should the displacement bounds be reduced after each iteration?

How much should we displace the midpoint?

Sadly, there is no general answer for this question because it greatly depends on two aspects:

  1. The application the algorithm is being used for
  2. The desired effect

Since in this post our scope is terrain generation I will limit the explanation to the effects that this parameter has in this area. However, the ideas that I will explain now can be extrapolated to any other application where the Midpoint Displacement algorithm may be used. As I see it, there are two key considerations that should be taken into account when deciding the initial displacement value.

First of all, we should consider the desired type of terrain. Hopefully it makes sense to say that the bigger the mountains we want to generate, the bigger the initial displacement value should be, and vice versa. With a bit of trial and error it is easy to get an idea of the average profiles generated by different values and how they look. The point here is that bigger mountains need bigger initial displacement values.

Secondly, we should consider the overall dimensions (width and height) of the generated terrain. The initial displacement should be regarded as a value which depends on those dimensions: an initial displacement of 5 may be huge when dealing with a 5×5 image but will hardly be noticed in a 1500×1500 image.

sizes.png

The same initial displacement value may be too big for a certain image size but may suit another image size well.

How much should the bounds be reduced after each iteration?

Well, the answer again depends on the desired output. It should be intuitive that the smaller the displacement reduction, the more jagged the obtained profile will be, and vice versa. The two extremes are no displacement reduction at all and setting the displacement to 0 after the first iteration. These two cases can be observed in the figure below.

extremes.png

On the left: no displacement reduction. On the right: Displacement reduction to 0 after first iteration.

Somewhere in between is the displacement reduction that will yield the desired output. There are plenty of ways to reduce the displacement bounds each iteration (linearly, exponentially, logarithmically, etc.) and I encourage you to try different ones and see how the results vary.

What I did was define a standard displacement reduction of 1/2, which means that the displacement is halved on each new iteration, and a displacement decay power i such that the displacement reduction is

displacement_reduction = 1/(2^i)

and

displacement_bounds(k+1) = displacement_bounds(k)*displacement_reduction

where k is the current iteration and k+1 the next one. With a decay power i = 1, for example, an initial displacement bound of 100 shrinks to 50, 25, 12.5, … over successive iterations. We can then talk about the obtained terrain profiles in terms of this decay power i. Below you can see how the algorithm performs for different decay powers.

roughness_values.png

Obtained profiles for different values of the displacement decay power.

Bear in mind that the two factors we just saw, the bounds reduction and the initial displacement, are related to each other and do not affect the output independently. Smaller initial displacements may look good with smaller decay powers and vice versa. Here we have covered some guidelines that may help when deciding which values to use, but there will be some trial and error until the right parameters for the desired output are found. Finally, the number of iterations is another factor that also affects the output in relation to the initial displacement and the bounds reduction.

Python implementation

Finally it is time to, with all the ideas explained above, code our 2D terrain generator. For this particular implementation I have decided to:

  • Displace only the y coordinate of the midpoints (the second of the two displacement methods explained at the beginning).
  • Use bounds that are symmetric with respect to zero for the displacement (if b is the upper bound then -b will be the lower bound).
  • Choose the displacement value to be either the upper bound or the lower bound, but never allow values in between.
  • Reduce the bounds after each iteration by multiplying the current bounds by 1/(2^i)

We will have three functions: one that will generate the terrain, one that will draw the generated terrain and one that will handle the above processes.

Before implementing the functions we should first import the modules that we will use:

import os                             # path resolving and image saving
import random                         # midpoint displacement
from PIL import Image, ImageDraw      # image creation and drawing
import bisect                         # working with the sorted list of points

Terrain generation

For the terrain generation we need a function that, given a straight line segment, returns the profile of the terrain. I have decided to provide as inputs the initial segment and displacement, the rate of decay (or roughness) of the displacement and the number of iterations:

# Iterative midpoint vertical displacement
def midpoint_displacement(start, end, roughness, vertical_displacement=None,
                          num_of_iterations=16):
    """
    Given a straight line segment specified by a starting point and an endpoint
    in the form of [starting_point_x, starting_point_y] and [endpoint_x, endpoint_y],
    a roughness value > 0, an initial vertical displacement and a number of
    iterations > 0 applies the  midpoint algorithm to the specified segment and
    returns the obtained list of points in the form
    points = [[x_0, y_0],[x_1, y_1],...,[x_n, y_n]]
    """
    # Final number of points = (2^iterations)+1
    if vertical_displacement is None:
        # if no initial displacement is specified set displacement to:
        #  (y_start+y_end)/2
        vertical_displacement = (start[1]+end[1])/2
    # Data structure that stores the points is a list of lists where
    # each sublist represents a point and holds its x and y coordinates:
    # points=[[x_0, y_0],[x_1, y_1],...,[x_n, y_n]]
    #              |          |              |
    #           point 0    point 1        point n
    # The points list is always kept sorted from smallest to biggest x-value
    points = [start, end]
    iteration = 1
    while iteration <= num_of_iterations:
        # Since the list of points will be dynamically updated with the new computed
        # points after each midpoint displacement it is necessary to create a copy
        # of the state at the beginning of the iteration so we can iterate over
        # the original sequence.
        # Tuple type is used for security reasons since they are immutable in Python.
        points_tup = tuple(points)
        for i in range(len(points_tup)-1):
            # Calculate x and y midpoint coordinates:
            # [(x_i+x_(i+1))/2, (y_i+y_(i+1))/2]
            midpoint = list(map(lambda x: (points_tup[i][x]+points_tup[i+1][x])/2,
                                [0, 1]))
            # Displace midpoint y-coordinate
            midpoint[1] += random.choice([-vertical_displacement,
                                          vertical_displacement])
            # Insert the displaced midpoint in the current list of points         
            bisect.insort(points, midpoint)
            # bisect allows to insert an element in a list so that its order
            # is preserved.
            # By default the maintained order is from smallest to biggest list first
            # element which is what we want.
        # Reduce displacement range
        vertical_displacement *= 2 ** (-roughness)
        # update number of iterations
        iteration += 1
    return points

The initial line segment is specified by the coordinates of the points where it begins and ends. Both are lists of the form:

point = [x_coordinate, y_coordinate]

And the output is a list of lists containing all the points that should be connected to obtain the terrain profile in the form:

points = [[x_0, y_0], [x_1, y_1], ..., [x_n, y_n]]

Terrain drawing

For the graphical output we need a function that returns an image of the drawn terrain and that takes as inputs at least the profile generated by the midpoint displacement algorithm. I have also included as inputs the desired width and height of the image and the colors it should use for painting. What I did for drawing several layers of terrain was start with the layer in the background and draw each new layer on top of the previous one.

For drawing each layer I first infer the y value of every x in the range (0, image width), based on the assumption that the known points, the ones obtained from the midpoint displacement, are connected by straight lines. Once the y value of every x in the range (0, image width) is known, I traverse all the x values iteratively and for each one draw a vertical line from its y value to the bottom of the image.

def draw_layers(layers, width, height, color_dict=None):
    # Default color palette
    if color_dict is None:
        color_dict = {'0': (195, 157, 224), '1': (158, 98, 204),
                      '2': (130, 79, 138), '3': (68, 28, 99), '4': (49, 7, 82),
                      '5': (23, 3, 38), '6': (240, 203, 163)}
    else:
        # len(color_dict) should be at least: # of layers +1 (background color)
        if len(color_dict) < len(layers)+1:
            raise ValueError("Num of colors should be bigger than the num of layers")

    # Create image into which the terrain will be drawn
    landscape = Image.new('RGBA', (width, height), color_dict[str(len(color_dict)-1)])
    landscape_draw = ImageDraw.Draw(landscape)
    # Draw the sun
    landscape_draw.ellipse((50, 25, 100, 75), fill=(255, 255, 255, 255))
    # Sample the y values of all x in image
    final_layers = []
    for layer in layers:
        sampled_layer = []
        for i in range(len(layer)-1):
            sampled_layer += [layer[i]]
            # If x difference is greater than 1
            if layer[i+1][0]-layer[i][0] > 1:
                # Linearly sample the y values in the range x_[i+1]-x_[i]
                # This is done by obtaining the equation of the straight
                # line (in the form of y=m*x+n) that connects two consecutive
                # points
                m = float(layer[i+1][1]-layer[i][1])/(layer[i+1][0]-layer[i][0])
                n = layer[i][1]-m*layer[i][0]
                r = lambda x: m*x+n  # straight line
                for j in range(layer[i][0]+1, layer[i+1][0]):  # for all missing x
                    sampled_layer += [[j, r(j)]]  # Sample points
        final_layers += [sampled_layer]

    final_layers_enum = enumerate(final_layers)
    for final_layer in final_layers_enum:
        # traverse all x values in the layer
        for x in range(len(final_layer[1])-1):
            # for each x value draw a line from its y value to the bottom
            landscape_draw.line((final_layer[1][x][0], height-final_layer[1][x][1],
                                 final_layer[1][x][0], height),
                                 color_dict[str(final_layer[0])])

    return landscape

 

The PIL module (like most modules for working with images) sets the origin of coordinates at the top left corner of the image. Also, the x value increases when moving right and the y value increases when moving down. The values passed to the function that draws the lines have to be expressed in this system of reference, which is why, before drawing each line, its y values have to be transformed from our reference system (origin at the lower left) to PIL's reference system.
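
For example, in a 500-pixel-tall image a terrain point with y = 120 in our reference system has to be drawn at 500 - 120 = 380 in PIL coordinates; this is exactly the height - y conversion applied inside draw_layers.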

references.png

Differences between PIL’s reference system and the one we have been using so far.

With these two functions we are now able to actually compute and draw our 2D procedurally generated terrain.

Our main function

The final step is to define our main function. This function will compute the profiles of the desired number of layers, draw them and save the obtained terrain as a .png image:

def main():
    width = 1000  # Terrain width
    height = 500  # Terrain height
    # Compute different layers of the landscape
    layer_1 = midpoint_displacement([250, 0], [width, 200], 1.4, 20, 12)
    layer_2 = midpoint_displacement([0, 180], [width, 80], 1.2, 30, 12)
    layer_3 = midpoint_displacement([0, 270], [width, 190], 1, 120, 9)
    layer_4 = midpoint_displacement([0, 350], [width, 320], 0.9, 250, 8)

    landscape = draw_layers([layer_4, layer_3, layer_2, layer_1], width, height)

    landscape.save(os.getcwd()+'\\testing.png')

 

To call the main() function when we run the program we finally add the lines:

if __name__ == "__main__":
    main()

 

And we’re done! Now it’s your turn to code your own terrain generator and play with its different parameters to modify the results (you can also change the colors). If you have any doubts, do not hesitate to contact me. You can find the whole code on GitHub.

As a bonus, some more terrain images obtained with the previous code:

results.png

Midpoint Displacement 2D generated landscapes.