Simple HTTP mock JSON server in Python

In this post we will see how to create a simple JSON server in Python to mock external APIs. This can be useful when developing an application that consumes third-party APIs, since it provides a controlled development environment: you control the data that is returned, you are not subject to the third party's rate limits, the server will not be down when you need to test your application, and you control the code that is executed when calling the API. These are just some of the benefits a mock server provides.

Below you can find two gifs that show the server in use and what we can expect to end up with:

Running the mock server

Testing the server

The code I will present is far from a complete solution; in fact, it is really simple. It is mainly a proof of concept to play around with, so if you are looking for a full-featured solution please check typicode’s json-server, the inspiration behind this project. Alternatively, you can fork this project from GitHub and add the functionality you need. As a side note, I have no formal programming education, so feel free to suggest improvements to the code or point out any issues you find in it.

Getting back to our server, to get an idea of the different parts we will have to develop, the first question we should ask ourselves is what functionality we want our server to have. From the answer to this question we will extract our list of requirements.

Requirements

It goes without saying that we want it to handle HTTP requests, and we want the data being transferred to be in JSON format. Apart from that, we want to be able to easily define the available endpoints and the data the server will provide for each of these endpoints. If it is not too much to ask, an easy setup, no extra dependencies and a quick deploy would be ideal. And last but not least, I would like to be able to set the port the server listens on, set up a fake hostname and run multiple mock server instances at the same time.

First step: Server configuration and mock data

Before writing the actual server code, we will first define roughly the server setup process. We will also define the expected data format of the files that will be used to configure and load data into the server. By doing this, when we start with the actual server code, we will already have an idea about how to read and process the specified configuration and serve the data. This will, hopefully, make development easier.

Borrowing the idea from typicode’s json-server, we will use a single file, in JSON format, to define both the available endpoints of the server and the data that will be accessed through each endpoint. Each key of the JSON object will be an accessible endpoint and its value will be the data served through that endpoint. Below you can find a generic example showing this format.

{
    "endpoint": {
        [...]
    },

    "endpoint/:param": [
        {
            "field": "value",
            "param": "value",
            [...]
        },
        {
            "field": "value",
            "param": "value",
            [...]
        },
        [...]
    ],

    [...]
}

Both the data and the routing information will be loaded from this file. A complete example of this file can be found here. A simpler example might look something like:

{
    "test": {
        "id": 1,
        "name": "test name",
        "description": "Test endpoint"
    },

    "metadata": {
        "version": "0.1",
        "description": "Test configuration"
    }
}

The previous example defines two accessible endpoints for the server, /test and /metadata, and the data that will be returned when accessing each of these endpoints is

{
    "id": 1,
    "name": "test name",
    "description": "Test endpoint"
}

for /test and

{
    "version": "0.1",
    "description": "Test configuration"
}

for /metadata. I hope this makes things clearer.

Back to the Server

Server initialization

Having defined the data format that will be used to configure the server, we are ready to start its development. To handle HTTP requests we will use Python’s built-in http.server library and, more specifically, the HTTPServer and BaseHTTPRequestHandler classes. I’m by no means an expert in the matter, but I’ll share what I did and what I think I understood.

First we will define the bones of our server and request handler classes. We will fill them with code later. The class that will handle the requests will inherit from Python’s http.server BaseHTTPRequestHandler. From the documentation of BaseHTTPRequestHandler:

The handler will parse the request and the headers, then call a method specific to the request type. The method name is constructed from the request. For example, for the request method SPAM, the do_SPAM() method will be called with no arguments. All of the relevant information is stored in instance variables of the handler.

This means that in order to process POST and GET requests we should define do_POST() and do_GET() methods. Apart from that, we also want to store the data and the accessible endpoints so that we can use them when processing requests. With that in mind, our barebones HTTP handler is presented below:

from http.server import BaseHTTPRequestHandler, HTTPServer


class SimpleServerHandler(BaseHTTPRequestHandler):

    db = None      # path to the configuration file
    routes = []    # list of available endpoints
    data = {}      # data served under each endpoint

    def do_POST(self):
        pass

    def do_GET(self):
        pass

Once we have the handler in place, we can define our actual server class that will use this handler to process requests. This server class should inherit from Python’s http.server HTTPServer class and receive at least the handler and the server address as parameters, since they are required by HTTPServer. Since we want to define a specific set of available endpoints and data, we will add a third parameter which is the file from which to extract this information. Our server class will then look like:

class SimpleServer(HTTPServer):
    """
    Server class for which the routing and data are extracted from
    a configuration file, dbfile.
    """
    def __init__(self, server_address, handler, dbfile):
        handler.db = dbfile
        handler.routes, handler.data = load_routes_and_data(dbfile)
        super(SimpleServer, self).__init__(server_address, handler)

The server address and the handler class are passed directly to the init method of our server’s superclass, HTTPServer, since it is the class that knows how to deal with them. The third parameter, the config file (dbfile), is used to specify the file from which to load the routes and data that the handler class will have to consider when processing requests. The data from the file is loaded and assigned to the appropriate attributes of the handler class. This will make it possible for us, later on when implementing the do_GET() and do_POST() methods, to access and take into account the specified data and routes.

Note that these values are set at the class level, so all instances of the handler class will share the same values.

The function used to load the routing and the data is shown below. It loads and returns a list and a dictionary. The list contains the keys of the JSON object defined in the configuration file dbfile. We will use this list to check if the request endpoint is a valid endpoint. The dictionary keys are the available endpoints and the value for each key is the data under that endpoint.

import json


def load_routes_and_data(dbfile):
    """
    Load the list of supported routes and the data served under each route
    """
    # build routes and data from the configuration file
    with open(dbfile, 'r') as f:
        data = json.load(f)
    return list(data.keys()), data
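
As a quick illustration, this is what the function would return for the simple configuration file shown earlier (a hypothetical check, assuming that file is saved as db.json):

routes, data = load_routes_and_data('db.json')
print(routes)            # ['test', 'metadata']
print(data['metadata'])  # {'version': '0.1', 'description': 'Test configuration'}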

We will add some more functionality later, but this is enough to get the ball rolling. To get everything up and running, we define a run_server function that starts the server with the specified configuration:

def run_server(
    server_class=SimpleServer,
    handler_class=SimpleServerHandler,
    port=80,
    file='db.json',
    url=None
):
    """
    Run the server forever, listening on the specified port
    """
    log_url = url if url else 'localhost'
    log_port = f':{port}/' if port != 80 else SEPARATOR
    log_msg = f'Running at http://{log_url}{log_port}'
    server_address = ('', int(port))
    httpd = server_class(server_address, handler_class, file)

    print('Starting Server...')
    print(f'Listening to connections on port: {port}')
    print(f'Routing and data will be extracted from: {file}')
    print(log_msg)

    httpd.serve_forever()

This is the function to call when executing the script.

if __name__ == "__main__":
    run_server()

Request handlers

GET handler

We will start with the general outline of the GET handler and afterwards move on to the POST handler. As said before, the GET handler is the do_GET method, which will grow from the barebones version shown earlier. As I see it, this method should mainly do three things:

  1. Extract the target endpoint (and its parameters) from the request. If it doesn’t match any available route, return a 404 status code.
  2. If the endpoint matches an available route then get the data under that route. Here we also check if the route accepts a parameter and, if so, check if it was specified in the request. If it was specified, then the data is filtered.
  3. Send back the data or the appropriate error.

It is not that much, and what I ended up with is shown below.

# Handle HTTP requests
def do_GET(self):
    """
    Handle GET requests
    """
    # check if the request path matches any of the endpoints
    for endpoint in self.routes:
        endpoint_path, param = self._get_route_and_params(endpoint)

        # if the endpoint being tested is not part of the request url
        # there's no need for further processing, skip to the next one
        if endpoint_path not in self.path:
            continue

        # if there is no parameter but an endpoint was matched, return all the data
        if self.path.endswith(SEPARATOR + endpoint_path):
            data = self.data[self._get_data_key(endpoint_path, param)]
            self._send_response(code=OK, data=data)
            return

        # else a parameter value has been included in the request
        _, param_val = self.path.rsplit(SEPARATOR, 1)

        # try to get value
        data_to_send = [
            i for i in self.data[self._get_data_key(endpoint_path, param)]
            if str(i[param]) == str(param_val)
        ]
        if len(data_to_send) == 0:
            continue
        self._send_response(code=OK, data=data_to_send)
        return

    # Nothing matched the request
    self._send_response(code=NOT_FOUND)
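
The snippets above also rely on a few module-level constants (SEPARATOR, OK, NOT_FOUND, and so on) and on two small helper methods, _get_route_and_params and _get_data_key, that belong to the full project but are not shown in this post. A minimal sketch of what they might look like, assuming SEPARATOR is '/' and that parametrised endpoints are declared as 'endpoint/:param' (as in the generic configuration example), is shown below.

# Assumed values for the module-level constants used in the snippets
SEPARATOR = '/'
PARAM_MARKER = ':'
ID = 'id'
OK, BAD_REQUEST, NOT_FOUND, CONFLICT = 200, 400, 404, 409


# Possible helper methods of SimpleServerHandler
def _get_route_and_params(self, endpoint):
    """
    Split an endpoint definition into its path and its parameter name, if any
    """
    if PARAM_MARKER in endpoint:
        path, param = endpoint.rsplit(SEPARATOR, 1)
        return path, param.lstrip(PARAM_MARKER)
    return endpoint, None


def _get_data_key(self, endpoint_path, param):
    """
    Rebuild the configuration file key for a given path and parameter
    """
    if param:
        return endpoint_path + SEPARATOR + PARAM_MARKER + param
    return endpoint_path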

One thing that I would like to work on, and is not currently supported, is specifying multiple values or a range of values for a given parameter.
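
One possible way to support comma-separated values, purely as a hypothetical extension and not part of the current code, would be to replace the filter in do_GET with something along these lines:

        # accept requests such as /endpoint/1,3,7 by splitting the parameter value on commas
        param_values = set(param_val.split(','))
        data_to_send = [
            i for i in self.data[self._get_data_key(endpoint_path, param)]
            if str(i[param]) in param_values
        ]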

As a final note, I decided to standardise server responses, and this is done via the _API_response method, which receives the response code and the response data (optional) as parameters.

from http import HTTPStatus


def _API_response(self, code, data=None):
    """
    Prepare the API response
    """
    resp = {}
    resp_code = [status_code for status_code in HTTPStatus if status_code.value == code]
    if len(resp_code):
        resp_code = resp_code[0]
        resp = {
            'result': {
                'code': resp_code.value,
                'message': resp_code.phrase,
                'description': resp_code.description
            }
        }
    if data:
        resp['data'] = data
    return resp

This can be easily changed to suit your needs.
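
Note that _API_response only builds the response body; the _send_response method used in do_GET and do_POST is not shown in the post either. A minimal sketch, assuming it simply wraps _API_response and writes the result back as JSON, could be:

def _send_response(self, code, data=None):
    """
    Send the status code, headers and JSON body back to the client
    """
    self.send_response(code)
    self.send_header('Content-Type', 'application/json')
    self.end_headers()
    self.wfile.write(json.dumps(self._API_response(code, data)).encode('UTF-8'))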

POST Handler

Ok, now that we are able to handle GET requests, let’s hop on to the code we need to handle POST requests. The general scheme is similar in both situations. In this case we have to:

  1. Check if the endpoint is a valid route. If it is not, send a not found response.
  2. If the endpoint is valid, load the data in the request.
  3. Validate the data. If invalid, send a Bad Request response.
  4. If the data is valid, append it to the current data and write it to the file so that it persists. Answer with an OK status code.

def do_POST(self):
    """
    Handle POST requests
    """
    valid_path = False
    for endpoint in self.routes:

        endpoint_path, param = self._get_route_and_params(endpoint)

        if self.path.endswith(SEPARATOR + endpoint_path):
            valid_path = True
            status_code = OK
            # read the body of the POST request
            post_data = json.loads(
                self.rfile.read(
                    int(self.headers.get('Content-Length'))
                ).decode("UTF-8"))

            try:
                current_data = self.data[
                    self._get_data_key(endpoint, None)
                ]
                status_code = self._validate_request(
                    param,
                    post_data,
                    current_data
                )

                if status_code != OK:
                    self._send_response(code=status_code)
                    return

                # If valid, add object to list
                post_data[ID] = self._generate_next_id(current_data)
                self.data[endpoint] = current_data + [post_data]

                # persist the new data to the configuration file
                with open(self.db, 'w') as f:
                    f.write(json.dumps(self.data))
            except Exception:
                status_code = BAD_REQUEST

            self._send_response(code=status_code)
            return

    if not valid_path:
        self._send_response(code=NOT_FOUND)

I think the process is easy to follow and the only thing that might be worth stopping for a second is the data validation process. This is done via the _validate_request() method, whose contents are presented below.

    def _validate_request(self, param, post_data, current_data):
        """
        Validate the body of a POST request.
        """
        if type(current_data) != list or type(post_data) != dict:
            return CONFLICT
        # there is no parameter, no problem
        if not param or param == ID: 
            return OK
        # There is a parameter
        if not post_data.get(param, ''): # Check if post body contains the parameter
            return BAD_REQUEST
        return OK

This method validates the body of the POST request by performing a set of sanity checks. Ideally this will guarantee a consistent dataset. By consistent I mean that the allowed operations (such as searching data by parameter value) can still be performed after the POST. There are two main checks here that aim to achieve this:

  1. That the target endpoint accepts POST requests. This is checked by looking at the type of the data that is stored on the accessed endpoint. If it is a list, it means it accepts adding new items. If it is not, then no items can be added.
  2. If the target endpoint of the POST request accepts a parameter, make sure that the request body contains that parameter. This guarantees that later on searches by parameter value will not fail.

If these checks pass, then a new ID is generated for the item and it is added to the current data set. Default IDs are integers and are determined as the current largest ID + 1. This behaviour can easily be changed by overriding the _generate_next_id method.
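
The _generate_next_id method itself is not shown in the post; a minimal sketch, assuming IDs are integers stored under the ID key, might be:

def _generate_next_id(self, current_data):
    """
    Return the next ID: the current largest ID plus one
    """
    existing_ids = [int(item[ID]) for item in current_data if ID in item]
    return max(existing_ids, default=0) + 1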

The combination of these two methods and the initialization shown at the beginning is enough to get the server running. However, there are still a few features we can implement to make it more complete.
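
As a quick sanity check, assuming the server is running locally on port 8080 with the simple configuration file shown earlier, the endpoints can be queried with a couple of lines of Python (or any HTTP client of your choice):

import urllib.request

# prints the JSON response for each endpoint
print(urllib.request.urlopen('http://localhost:8080/test').read().decode('UTF-8'))
print(urllib.request.urlopen('http://localhost:8080/metadata').read().decode('UTF-8'))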

Hostname specification

This is a nice feature that will enable us to fake not only the server data and endpoints but also the URL the server can be accessed at. To do so we will add a new class whose responsibility is the addition and deletion of hostnames. This class will have two simple methods, add_host and remove_host. The code presented below is developed specifically for Windows machines. Its only task is to add and remove the specified hostname from the system hosts file.

class HostHandler():
    """
    Class that handles the addition of a host definition to the hosts file.
    This is useful to enable using a different URL than localhost to the mock
    server. Be aware though that, by default, any new host will be mapped to
    localhost, so the way to have multiple mock server instances serving different
    content is by specifying a different port.
    """
    def __init__(self, path_to_hosts=r'C:\Windows\System32\drivers\etc\hosts', address=LOCALHOST, hostname=None):
        self.default_path = path_to_hosts
        self.hostname = hostname
        self.address = address
        self.content = '{}\t{}'.format(self.address, self.hostname)

    def add_host(self, path=None):
        """
        Write the contents of this host definition to the provided path
        """
        if path is None:
            path = self.default_path
        with open(path, 'a') as hosts_file:
            hosts_file.write('\n' + self.content + '\n')

    def remove_host(self, path=None):
        """
        Remove this host from the hosts file
        """
        if path is None:
            path = self.default_path
        content_with_removed_host = []
        remove_next = False
        with open(path, 'r') as hosts_file:
            for line in hosts_file.read().split('\n'):
                if line == self.content:
                    # drop the blank line that add_host wrote before the entry
                    if content_with_removed_host and content_with_removed_host[-1] == '':
                        content_with_removed_host = content_with_removed_host[:-1]
                    remove_next = True  # the blank line after the entry goes too
                elif remove_next:
                    remove_next = False
                else:
                    content_with_removed_host += [line]
        with open(path, 'w') as hosts_file:
            hosts_file.write('\n'.join(content_with_removed_host))
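
Be aware that modifying the hosts file on Windows requires administrator privileges, so the script has to be run from an elevated prompt when a fake hostname is used. A quick illustration of the class on its own (with a hypothetical hostname) would be:

host = HostHandler(hostname='my.fake.api')
host.add_host()     # maps my.fake.api to the local address in the hosts file
# ... run and test the mock server ...
host.remove_host()  # clean up the hosts file entry when done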

Wrapping it up in a nice CLI

To make it easy to use, we will provide a simple CLI that lets us set the configuration parameters: hostname, port and configuration file. This can be easily achieved with argparse and a simple function:

import argparse


def parse_args():
    """
    Process command line arguments and return them as a dict
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-p', '--port', help='Specify the desired port, defaults to port 80')
    parser.add_argument('-f', '--file', help='File from which to extract routing and data, defaults to db.json')
    parser.add_argument('-u', '--url', help='Set a fake url for the server, defaults to localhost')

    return {k: v for k, v in vars(parser.parse_args()).items() if v is not None}

which will then be called at startup. The initialization code then looks like:

if __name__ == "__main__":
    args = parse_args()

    if args.get(URL, ''):
        host = HostHandler(hostname=args.get(URL))
        host.add_host()

    try:
        if args:
            run_server(**args)
        else:
            run_server()
    except KeyboardInterrupt:
        # gracefully end program and remove entries from hosts files
        if args.get(URL, ''):
            host.remove_host()
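
With the CLI in place the server can be launched directly from the command line; for example (assuming the script is saved as simple_server.py, adjust the name to your copy of the project):

python simple_server.py -p 8080 -f db.json -u my.fake.api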

Final notes and Discussion

The whole code of the project can be found here. It is far from complete and improvements are welcome. It can also be a good starting point if you are looking for something simple that you can get your hands on and make it suit your needs.
