Introduction to AI image generation (Stable Diffusion)

Futuristic city generated using InvokeAI

For saying that I work in technology, I feel embarrassingly late to this party. I was recently transfixed by posts on Mastodon that showed images generated by Midjourney. I’d never heard of Midjourney. This started me off down a rabbit hole.

A few metres down the rabbit hole, I read about InvokeAI, an open-source alternative to Midjourney. A few metres more and I discovered that I would be able to run InvokeAI on my PC, which is equipped with an NVIDIA GeForce RTX 3060 graphics card.

That’s how I found myself installing and running InvokeAI, despite still knowing virtually nothing about any facet of AI, let alone image generation. Faced with the InvokeAI web interface, the first question becomes “What do all these knobs and buttons do? What are “CFG Scale”, “Sampler” and “Steps”? What is the difference between the “Models”?

In case you find yourself in the same position, here’s a handy guide.

What is Midjourney?

Midjourney is an independent research lab that produces a proprietary artificial intelligence program under the same name that creates images from textual prompts. It is powered by AI and machine learning algorithms and uses text prompts and parameters to generate images. It can be used for both creative and practical applications, such as creating custom artwork and logos, or visualising data. Midjourney is constantly learning and improving, and can be accessed through the Discord chat app by messaging the Midjourney Bot.

What is Stable Diffusion?

Stable Diffusion is an image-generating model developed by Stability AI. It is powered by artificial intelligence and machine learning algorithms, and uses text prompts and parameters to generate images. It can be used for both creative and practical applications, such as creating custom artwork and logos, or visualising data. Stable Diffusion is open-source, meaning anyone can access and use the model without cost. The model has a relatively good understanding of contemporary artistic styles, making it a popular choice for creative applications.

Stable Diffusion is based on the concept of a CFG Scale (see below), which is a mathematical representation of the complexity of a system, which can be used to analyze the behavior of the system over time. The CFG Scale is used to measure the stability of a system, which is an important factor in understanding how the system evolves over time.

The Stable Diffusion algorithm uses a “sampler” (see below) to collect data from the system at different points in time. The algorithm then uses a model to analyse the data collected from the sampler.

What’s InvokeAI then?

InvokeAI is an implementation of Stable Diffusion. It is powered by artificial intelligence and machine learning algorithms, and uses text prompts and parameters to generate images. It is optimised for efficiency, and can generate images with relatively low VRAM requirements. It supports the use of custom models.

The InvokeAI web interface offers lots of parameters. The rest of this article is dedicated to explaining those parameters.

InvokeAI‘s web interface

CFG Scale

The CFG Scale, or Classifier-Free Guidance Scale, is a parameter in the Stable Diffusion image generation model. It controls how much the image generation process is guided by the initial input, and how much it is random. (It determines the amount of noise in the generated image.) A high CFG Scale value produces output more closely resembling the text prompt.

The CFG Scale in InvokeAI’s web interface

A low CFG Scale value in Stable Diffusion can result in the output image having a lower fidelity and quality. It can for example lead to the model generating extra arms and legs! But it also increases the diversity and creativity of the result.

Increasing the CFG Scale value can result in higher quality, as well as a more dynamic colour range. It can also lead to more detailed images with a higher resolution. That said, high CFG values can lead to unrealistic-looking images.

The best CFG Scale value range for Stable Diffusion is generally between 7 and 11. Higher values of 50 to 100 are recommended for good outpainting results.

The Steps parameter

The Steps parameter of Stable Diffusion defines the number of inference steps used in the model. It dictates the amount of detail that is produced when generating images from text descriptions.

Steps parameter in InvokeAI’s web interface

Generally, increasing the number of steps will produce higher quality images, but this will come at the expense of slower inference. The optimal number of steps will depend on the dataset and task at hand. In most cases, images will converge on 30-50 steps and will not change significantly with higher steps.

The Sampler parameter

InvokeAI provides several different samplers that can be used to generate samples from a given data distribution. The k_euler sampler uses the Euler-Maruyama algorithm to generate samples from a given distribution. The k_dmp_2 sampler uses a second-order differential equation to generate samples. Each of the samplers has its own advantages and disadvantages, and experimentation will be rewarded.

Sampler selection in InvokeAI’s web interface

The sampler is used to select regions of an image and generate a diffusion map that conditions the output of the model. The diffusion map is then used to generate a detailed image conditioned on text descriptions. The sampler also allows for a more efficient and stable training process, as it reduces the number of steps needed to generate a result. Additionally, the sampler can upscale samples from the diffusion model, allowing for better results when generating smaller images.

Prompts

Prompts in Stable Diffusion are phrases or words used to generate AI-generated images. They are used to provide the AI model with guidance on what type of image to create. Prompts can range from abstract concepts to specific objects or scenes. They can be weighted to emphasise certain keywords.

Example prompts include “a picture of a black cat on a kitchen top”, “Paintings of Landscapes”, “Style cue: Steampunk / Clockpunk”, “a beautiful sunset over a beach” or “a futuristic cityscape.” Using very specific prompts, helps the AI to generate more accurate and detailed images.

Different models*

Confusingly, you can use different models with InvokeIA. The most commonly used is the Stable Diffusion Model. InvokeAI can use a range of other models for image processing and manipulation tasks. InvokeAI also provides tools for creating custom models, allowing users to create their own models that can be used in combination with the existing models.

InvokeAI supports a variety of models, from classic Machine Learning algorithms such as Random Forest and Logistic Regression to deep learning models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). The main differences between these models are the type of data that they process and the types of result they produce. For example, Random Forest and Logistic Regression are good at predicting classifications, while CNNs and RNNs are better at recognizing patterns in images and text. Additionally, CNNs and RNNs can be used for more complex tasks such as image recognition, text generation, and language translation.

Sneaky disclaimer

So here’s my disclaimer. I’ve pulled a slightly sneaky trick with this blog post. I’ve recruited AI to explain AI. That seemed logical. I used AI text generation services YouChat and Perplexity to explain Stable Diffusion, using prompts like “Explain the CFG Scale in Stable Diffusion.”

That feels like cheating. But I did at least sanity-check the results and edit them before posting here. Some of the answers were in fact wrong. And it was no quicker writing the blog post this way. But roughly 80% of the text in this blog post was AI-generated. How does that make you feel? Tell me in the comments!

*The AI-based text-generators really struggled to explain the difference between the various models that work with InvokeAI. My suggestion is to install some and experiment!

Querying GitHub Projects V2 with GraphQL in Laravel

GitHub and GraphQL logos

Note: I previously wrote about using plain PHP to query GitHub Projects V2. In this post I offer some tips for querying using Laravel.

GitHub’s new Projects are not accessible via the older REST API. Working with them programmatically involves learning some GraphQL, which can be a headache, the first time you encounter it. Here’s my approach, using Laravel.

Authentication

Get a GitHub personal access token, limiting its permissions as appropriate to your app. Then add to your .env:

GH_TOKEN=[your token]

HTTP requests

We can standardise GitHub GraphQL queries by using an HTTP facade macro. In app/Providers/AppServiceProvider.php, add to the boot() method:

    /**
     * Make request to GitHub using the GraphQL API.
     *
     * $query - a GraphQL query string
     * $variables - an array of GraphQL variables
     *
     * Call like: $data = Http::githubGraphQL($query, $variables)->throw()->json()['data'];
     */
    Http::macro('githubGraphQL', function (string $query, array $variables) {
        return Http::withHeaders([
            'Accept' => 'application/vnd.github+json',
            'Authorization' => 'Bearer ' . env('GH_TOKEN')
        ])->post('https://api.github.com/graphql', [
            'query' => $query,
            'variables' => $variables,
        ]);
    });

GitHubProjects model

This sample app/Models/GitHubProjects.php shows you the sort of query you can run:

    <?php

namespace App\Models;

use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Support\Facades\Http;

class GitHubProjects extends Model
{
    use HasFactory;

    /**
     * getFirstN - get first N projects for the GitHub organisation, ordered by
     * title
     *
     * @param integer $count - number of projects to return
     * @return array
     */
    public static function getFirstN(int $count = 20): array
    {
        $projects = Http::githubGraphQL(
            <<<EOD
            query getFirstN(\$count: Int) {
              organization(login: "MyOrg") {
                projectsV2(first: \$count, orderBy: {field: TITLE, direction: ASC}) {
                  nodes {
                    number
                    id
                    title
                  }
                }
              }
            }
EOD,
            [ "count" => $count ]
        )->throw()->json()['data'];
        return $projects['organization']['projectsV2']['nodes'];
    }


    /**
     * getFirstNIssues
     *
     * @param integer $projectNumber
     * @param integer $count
     * @return array
     */
    public static function getFirstNIssues(int $projectNumber = 1, int $count = 20): array
    {
        $projectId = GitHubProjects::getIdByNumber($projectNumber);
        $issues = Http::githubGraphQL(
            <<<EOD
            # Exclamation mark since projectId is required
            query getFirstNIssues(\$projectId: ID!, \$count: Int) {
              node(id: \$projectId) {
                ... on ProjectV2 {
                  items(first: \$count) {
                    nodes {
                      id
                      fieldValues(first: 8) {
                        nodes {
                          ... on ProjectV2ItemFieldTextValue {
                            text
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                          ... on ProjectV2ItemFieldDateValue {
                            date
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                          ... on ProjectV2ItemFieldSingleSelectValue {
                            name
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                        }
                      }
                      content {
                        ... on DraftIssue {
                          title
                          body
                        }
                        ... on Issue {
                          title
                          assignees(first: 10) {
                            nodes {
                              login
                            }
                          }
                        }
                        ... on PullRequest {
                          title
                          assignees(first: 10) {
                            nodes {
                              login
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
EOD,
            [
              "projectId" => $projectId,
              "count" => $count
            ]
        )->throw()->json()['data'];
        return $issues['node']['items']['nodes'];
    }


    /**
     * getByTitle - search for projects by title
     *
     * @param string $title - search for
     * @return array
     */
    public static function getByTitle(string $title): array
    {
        $projects = Http::githubGraphQL(
            <<<EOD
            query getByName(\$title: String!) {
                organization(login: "MyOrg") {
                    projectsV2(
                        first: 20
                        orderBy: { field: TITLE, direction: ASC }
                        query: \$title
                    ) {
                        nodes {
                            number
                            id
                            title
                        }
                    }
                }
            }
              
EOD,
            [ "title" => $title ]
        )->throw()->json()['data'];
        return $projects['organization']['projectsV2']['nodes'];
    }


    /**
     * getIdByNumber
     *
     * @param integer $number - the integer identifier of the project
     * @return string - the internal GitHub project ID (e.g. PVT_kwDOBWQiz84AH53W)
     */
    private static function getIdByNumber(int $number = 1)
    {
        $project = Http::githubGraphQL(
            <<<EOD
            # Exclamation mark since number is required
            query getIdByNumber(\$number: Int!) {
              organization(login: "MyOrg") {
                projectV2(number: \$number) {
                  id
                }
              }
            }
EOD,
            ['number' => $number]
        )->throw()->json()['data'];
        return $project['organization']['projectV2']['id'];
    }


    /**
     * getProjectFields - get all the (custom) fields associated to a project
     *
     * @param string $projectID - GitHub's reference for the project
     * @return array
     */
    public static function getProjectFields(string $projectID): array
    {
          $fields = Http::githubGraphQL(
            <<<EOD
            query getProjectFields(\$node: ID!) {
              node(id: \$node) {
                ... on ProjectV2 {
                  fields(first: 20) {
                    nodes {
                      ... on ProjectV2Field {
                        id
                        name
                      }
                      ... on ProjectV2IterationField {
                        id
                        name
                        configuration {
                          iterations {
                            startDate
                            id
                          }
                        }
                      }
                      ... on ProjectV2SingleSelectField {
                        id
                        name
                        options {
                          id
                          name
                        }
                      }
                    }
                  }
                }
              }
            }
              
EOD,
            [ "node" => $projectID ]
        )->throw()->json()['data'];
        return $fields['node']['fields']['nodes'];
    }
}

Controller

This is a very basic sample controller method to get you started. In practice you’ll use views:

public function index()
{
    /*
        Search for "MyProject" in projects
        Project number is $projects[0]['number'];
        Project name is $projects[0]['title'];
        Project ID is $projects[0]['id'];
    */
    define('BR', "<br />\n");
    $projects = GitHubProjects::getByTitle('MyProject');

    if (isset($projects[0])) {
        $projectNumber = $projects[0]['number'];
        $issues = GitHubProjects::getFirstNIssues($projectNumber, 50);
        echo "<h1>First 50 issues in MyProject project</h1>\n";
        foreach ($issues as $issue) {
            echo '<b>ID:</b> ' . $issue['id'] . BR;
            echo '<b>Title:</b> ' . $issue['content']['title'] . BR;
            if (isset($issue['content']['assignees']['nodes'][0])) {
                echo '<b>Assignee:</b> ' . $issue['content']['assignees']['nodes'][0]['login'] . BR;
            }
            // Field types
            foreach ($issue['fieldValues']['nodes'] as $field) {
                if (isset($field['text']) && $field['field']['name'] != "Title") {
                    echo '<b>' . $field['field']['name'] . ':</b> ' . $field['text'] . BR;
                }
                if (isset($field['date'])) {
                    echo '<b>' . $field['field']['name'] . ':</b> ' . $field['date'] . BR;
                }
                if (isset($field['name'])) {
                    echo '<b>' . $field['field']['name'] . ':</b> ' . $field['name'] . BR;
                }
            }
            echo BR;
        }
    } else {
        echo "No projects found";
    }
}

Useful resources

The following are invaluable for working with GitHub’s GraphQL API:

Non-alcoholic dairy-free syllabub recipe

At last, thanks to Oatly, we can have a dairy-free syllabub that’s just as delicious! This version went down well with my son James. 😀

Serves: 6 (or can stretch to 8)

Ingredients

  • 250ml Whippable Creamy Oat by Oatly
  • 55g white sugar
  • Juice of one lemon
  • Juice of two oranges
  • 100ml sparkling grape juice

Method

  1. Add the sugar to the juices and warm until the sugar is dissolved completely
  2. Allow the juice to cool completely
  3. Meanwhile, whip the cream until it thickens and forms peaks (an electric whisk is best)
  4. Add the sparkling grape juice to the sugared juice
  5. Gradually fold the juice into the cream
  6. Pour into tall-stemmed glasses and chill well

Note: the longer you chill the dessert, the more juice will separate out – which you may or may not like! Top with grated orange rind, if desired.

Alcoholic alternative

For an alcoholic version, replace the orange juice and sparkling grape juice with 150ml sweet white wine or Madeira.

Nutritional information

Approximate figures (if divided into six portions):

  • 135kcal
  • 10g fat of which saturates, 9g
  • 21g carbohydrate of which sugars, 11g
  • 1.2g fibre
  • 0.7g protein
  • 0.1g salt

More recipes

If you liked this recipe, you might want to buy a dairy-free recipe book, and support this site in the process! (Affiliate links are no additional cost to you.)

Querying GitHub Projects v2 with GraphQL in PHP

Update: you may prefer my post about doing this in Laravel.

GitHub’s new Projects are not accessible via the older REST API. Working with them programmatically involves learning some GraphQL, which can be a headache, the first time you encounter it. Here’s my approach, using PHP.

Set up cURL

You can certainly use an HTTP request library, but sometimes it’s easiest to get your hands dirty with cURL. Here’s the setup:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Accept: application/vnd.github+json',
    'Authorization: Bearer '.$_ENV['GH_TOKEN']
));

// Fake user agent, to keep GitHub happy
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0');

// Dev settings (PHP built-in server can't validate)
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

The “dev settings” are there since I’ve been using PHP’s built-in webserver during development. Don’t take those into production!

Generic API class

This class provides a simple wrapper around REST API and GraphQL calls. I’m passing in the cURL handler and calling most methods statically, since I’m not yet modelling the underlying objects.

class GitHub
{
    /**
     * API
     * Make request to GitHub using the standard REST API.
     * See: https://docs.github.com/en/rest
     *
     * @param [CurlHandle Object] $ch - curl handle
     * @param [string] $endpoint - the REST endpoint
     * @return array|stdClass
     */
    public static function api(\CurlHandle $ch, string $endpoint)
    {
        curl_setopt($ch, CURLOPT_URL, 'https://api.github.com/'.$endpoint);
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');

        $content = curl_exec($ch);
        if (curl_errno($ch)) {
            die("Error:".curl_error($ch));
        }
        return json_decode($content);
    }

    /**
     * GraphQL
     * Make request to GitHub using the newer GraphQL API.
     * See: https://docs.github.com/en/graphql/reference
     * GraphQL Explorer: https://docs.github.com/en/graphql/overview/explorer
     * GraphQL Formatter: https://jsonformatter.org/graphql-formatter
     *
     * @param CurlHandle $ch
     * @param string $query - a GraphQL query string
     * @param array $variables - an array of GraphQL variables
     * @return StdClass
     */
    public static function graphQL(\CurlHandle $ch, string $query, array $variables)
    {
        curl_setopt($ch, CURLOPT_URL, 'https://api.github.com/graphql');
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
        curl_setopt(
            $ch,
            CURLOPT_POSTFIELDS,
            json_encode(['query' => $query, 'variables' => $variables])
        );

        $content = curl_exec($ch);
        if (curl_errno($ch)) {
            die("Error:".curl_error($ch));
        }
        return json_decode($content);
    }
}

Projects class

Various simple methods for obtaining information about projects. I say “simple” but the GraphQL queries take some mastering.

class GitHubProjects
{
    /**
     * getFirstN - get first N projects for the GitHub organisation, ordered by
     * title
     *
     * @param \CurlHandle $ch
     * @param integer $count - number of projects to return
     * @return array
     */
    public static function getFirstN(\CurlHandle $ch, int $count = 20): array
    {
        $projects = GitHub::graphQL(
            $ch,
            <<<EOD
            query getFirstN(\$count: Int) {
              organization(login: "MyOrg") {
                projectsV2(first: \$count, orderBy: {field: TITLE, direction: ASC}) {
                  nodes {
                    number
                    id
                    title
                  }
                }
              }
            }
EOD,
            [ "count" => $count ]
        );
        return $projects->data->organization->projectsV2->nodes;
    }


    /**
     * getFirstNIssues
     *
     * @param \CurlHandle $ch
     * @param integer $projectNumber
     * @param integer $count
     * @return array
     */
    public static function getFirstNIssues(\CurlHandle $ch, int $projectNumber = 1, int $count = 20): array
    {
        $projectId = GitHubProjects::getIdByNumber($ch, $projectNumber);
        $issues = GitHub::graphQL(
            $ch,
            <<<EOD
            # Exclamation mark since projectId is required
            query getFirstNIssues(\$projectId: ID!, \$count: Int) {
              node(id: \$projectId) {
                ... on ProjectV2 {
                  items(first: \$count) {
                    nodes {
                      id
                      fieldValues(first: 8) {
                        nodes {
                          ... on ProjectV2ItemFieldTextValue {
                            text
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                          ... on ProjectV2ItemFieldDateValue {
                            date
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                          ... on ProjectV2ItemFieldSingleSelectValue {
                            name
                            field {
                              ... on ProjectV2FieldCommon {
                                name
                              }
                            }
                          }
                        }
                      }
                      content {
                        ... on DraftIssue {
                          title
                          body
                        }
                        ... on Issue {
                          title
                          assignees(first: 10) {
                            nodes {
                              login
                            }
                          }
                        }
                        ... on PullRequest {
                          title
                          assignees(first: 10) {
                            nodes {
                              login
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
EOD,
            [
              "projectId" => $projectId,
              "count" => $count
            ]
        );
        return $issues->data->node->items->nodes;
    }

    
    /**
     * getByTitle - search for projects by title
     *
     * @param \CurlHandle $ch
     * @param string $title - search for
     * @return array
     */
    public static function getByTitle(\CurlHandle $ch, string $title): array
    {
        $projects = GitHub::graphQL(
            $ch,
            <<<EOD
            query getByName(\$title: String!) {
                organization(login: "MyOrg") {
                    projectsV2(
                        first: 20
                        orderBy: { field: TITLE, direction: ASC }
                        query: \$title
                    ) {
                        nodes {
                            number
                            id
                            title
                        }
                    }
                }
            }
              
EOD,
            [ "title" => $title ]
        );
        return $projects->data->organization->projectsV2->nodes;
    }


    /**
     * getIdByNumber
     *
     * @param \CurlHandle $ch
     * @param integer $number - the integer identifier of the project
     * @return string - the internal GitHub project ID (e.g. PVT_kwDOBWQiz84AH53W)
     */
    private static function getIdByNumber(\CurlHandle $ch, int $number = 1)
    {
        $project = GitHub::graphQL(
            $ch,
            <<<EOD
            # Exclamation mark since number is required
            query getIdByNumber(\$number: Int!) {
              organization(login: "MyOrg") {
                projectV2(number: \$number) {
                  id
                }
              }
            }
EOD,
            ['number' => $number]
        );
        return $project->data->organization->projectV2->id;
    }
}

Using it

I have not included namespacing or autoloading here. In broad terms, it’s simple to use the Projects class:

$projects = Model\GitHubProjects::getByTitle($ch, 'MyProject');
if (is_array($projects) && !empty($projects)) {
    $project = $projects[0];
    var_dump($project);
}

Excel – correctly sort IP addresses

This post is probably for pedants only, who care passionately about correctly sorting IP addresses in an Excel spreadsheet. This approach uses pure functions – no VBA. I prefer it to some other approaches because, frankly, they sail right over my head.

Let’s start with a column of IP addresses – like this one:

Excel tables are lovely, for working with data like this. If you convert your data to a table, you get to use named column references, which we’ll see in a moment. Go to Insert > Table and you get something like this:

You can’t sort this column meaningfully, as-is. We need an additional column, which we’ll use to transform the contents of the IP column.

And then in any of the rows in that column, we enter this formula:

=
IF(0,"##### FIRST OCTET #####","") &
TEXT(
  LEFT(
    [@IP],
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        1
      )
    ) - 1
  ),
  "000"
)
& "." &
IF(0,"##### SECOND OCTET #####","") &
TEXT(
  MID(
    [@IP],
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        1
      )
    ) + 1,
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        2
      )
    )
    -
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        1
      )
    )
  ),
  "000"
)
& "." &
IF(0,"##### THIRD OCTET #####","") &
TEXT(
  MID(
    [@IP],
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        2
      )
    ) + 1,
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        3
      )
    )
    -
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        2
      )
    )
  ),
  "000"
)
& "." &
IF(0,"##### FOURTH OCTET #####","") &
TEXT(
  MID(
    [@IP],
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        3
      )
    ) + 1,
    
    IF(
      ISERROR(
        FIND("/",[@IP])
      ),
      LEN([@IP]),
      FIND("/",[@IP]) - 1
    )    
    -
    FIND(
      CHAR(134),
      SUBSTITUTE(
        [@IP],
        ".",
        CHAR(134),
        3
      )
    )
  ),
  "000"
)
&
IF(0,"##### CIDR #####","") &
IF(
  ISERROR(FIND("/",[@IP])),
  "",
  RIGHT(
    [@IP],
    LEN([@IP]) - FIND("/",[@IP]) + 1
  )
)

You end up with this, on which you can now perform an alphabetical (A-Z) sort:

If you like, you can hide that column, so you don’t need to look at its hideousness. Then whenever you need to resort, go to Data > Sort.

Some things to mention about this formula:

  • [@IP] is the named column reference I referred to previously.
  • I edited this formula in a code editor (Notepad++), so I could nicely indent and keep track of opened and closed parenthesis. This makes life much easier, when writing long formulae! There’s one gotcha – Notepad++ by default uses tabs rather than spaces, which breaks Excel. Make sure there are no tab characters in your indentation.
  • The IF(0,"##### THIRD OCTET #####","") stuff is a hack, which allows you to insert a comment into a text-based formula. The 0 evaluates to FALSE, so it returns the function’s third parameter – an empty string. The second parameter is where I place my comment. Handy!
  • Excel doesn’t have a function to find the position of the nth occurrence of a string. So there’s a nifty two-step hack for this, which is not my original idea. First, we use the SUBSTITUTE() function, which can substitute a character for the nth occurrence of some text. We search for the nth occurrence of the full stop (“.”) and replace it with CHAR(134) – the dagger symbol (†). Then we find the position of that CHAR(134), to feed into the LEFT()/MID()/RIGHT() functions.
  • The formula handles CIDR notation.

Script to clone a VM with free VMware ESXi

clones

Many people run free versions of ESXi, particularly in lab environments. Unfortunately with the free version of ESXi, the VMware API is read-only. This limits (or complicates) automation.

I was looking for a way to clone guest VMs with the minimum of effort. This script, which took inspiration from many sources on the internet, is the result. It takes advantage of the fact that although the API is limited, there are plenty of actions you can take via SSH, including calls to vim-cmd.

<#

.SYNOPSIS
    Clones a VM,

.DESCRIPTION
    This script:

    - Retrieves a list of VMs attached to the host
    - Enables the user to choose which VM to clone
    - Clones the VM

    It must be run on a Windows machine that can connect to the virtual host.

    This depends on the Posh-SSH and PowerCLI modules, so from an elevated
    PowerShell prompt, run:

        Install-Module PoSH-SSH
        Install-Module VMware.PowerCLI

    For free ESXi, the VMware API is read-only. That limits what we can do with
    PowerCLI. Instead, we run certain commands through SSH. You will therefore
	need to enable SSH on the ESXi host before running this script.
	
	The script only handles simple hosts with datastores under /vmfs. And it
	clones to the same datastore as the donor VM. Your setup and requirements
	may be more complex. Adjust the script to suit.

.EXAMPLE
    From a PowerShell prompt:

      .\New-GuestClone.ps1 -ESXiHost 192.168.101.100

.COMPONENT
    VMware scripts

.NOTES
    This release:

        Version: 1.0
        Date:    8 July 2021
        Author:  Rob Pomeroy

    Version history:

        1.0 - 8 July 2021 - first release

#>
param
(
    [Parameter(Mandatory = $true, Position = 0)][String]$ESXiHost
)

####################
## INITIALISATION ##
####################

# Load necessary modules
Write-Host Loading PowerShell modules...
Import-Module PoSH-SSH
Import-Module VMware.PowerCLI

# Change to the directory where this script is running
Push-Location -Path ([System.IO.Path]::GetDirectoryName($PSCommandPath))


#################
## CREDENTIALS ##
#################

# Check for the creds directory; create it if it doesn't exist
If(-not (Test-Path -Path '.\creds' -PathType Container)) {
    New-Item -Path '.\creds' -ItemType Directory | Out-Null
}

# Looks for credentials file for the VMware host. Passwords are stored encrypted
# and will only work for the user and machine on which they're stored.
$credsFile = ('.\creds\' + $ESXiHost + '.creds')
If(-not (Test-Path -Path $credsFile)) {
    # Request credentials
    $creds = Get-Credential -Message "Enter root password for VMware host $ESXiHost" -User root
    $creds.Password | ConvertFrom-SecureString | Set-Content $credsFile
}
$ESXICredential = New-Object System.Management.Automation.PSCredential( `
    "root", `
    (Get-Content $credsFile | ConvertTo-SecureString)
)


#########################
## List VMs (PowerCLI) ##
#########################
#
# Disable HTTPS certificate check (not strictly needed if you use -Force) in
# later calls.
Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -Confirm:$false | Out-Null

# Connect to the ESXi server
Connect-VIServer -Server $ESXiHost -Protocol https -Credential $ESXICredential -Force | Out-Null
If(-not $?) {
    Throw "Connection to ESXi failed. If password issue, delete $credsFile and try again."
}

# Get all VMs, sorted by name
$guests = (Get-VM -Server $ESXiHost | Sort-Object)

# Work out how much we need to left-pad the array index, when outputting
$padWidth = ([string]($guests.Count - 1)).Length

# Output the list of VMs, with array index padded so it lines up nicely
Write-Host ("Existing VMs (" + $guests.Count + "), sorted by name:")
for ( $i = 0; $i -lt $guests.count; $i++)
{
    If($guests[$i].PowerState -eq "PoweredOn") {
        Write-Host -ForegroundColor Red ("[" + "$i".PadLeft($padWidth, ' ') + "](ON) : " + $guests[$i].Name) 
    } Else {
        Write-Host ("[" + "$i".PadLeft($padWidth, ' ') + "](off): " + $guests[$i].Name) 
    }
}
Write-Host


##########################
## Choose a VM to clone ##
##########################

$chosenVM = 0
do {
    $inputValid = [int]::TryParse((Read-Host 'Enter the [number] of the VM to clone (the donor)'), [ref]$chosenVM)
    if($chosenVM -lt 0 -or $chosenVM -ge $guests.Count) {
        $inputValid = $false
    }
    if (-not $inputValid) {
        Write-Host ("Must be a number in the range 0 to " + ($guests.Count - 1).ToString() + ". Try again.")
    }
} while (-not $inputValid)

# Check the VM is powered off
if($guests[$chosenVM].PowerState -ne "PoweredOff") {
    Throw "ERROR: VM must be powered off before cloning"
}

# Get VM's datastore, directory and VMX; we assume this is at /vmfs/volumes
If(-not ($guests[$chosenVM].ExtensionData.Config.Files.VmPathName -match '\[(.*)\] ([^\/]*)\/(.*)')) {
    Throw "ERROR: Could not calculate the datastore"
}
$VMdatastore = $Matches[1]
$VMdirectory = $Matches[2]
$VMXlocation = ("/vmfs/volumes/" + $VMdatastore + "/" + $VMdirectory + "/" + $Matches[3])
$VMdisks     = $guests[$chosenVM] | Get-HardDisk


###############################
## File test (PoSH-SSH SFTP) ##
###############################

# Clear any open SFTP sessions
Get-SFTPSession | Remove-SFTPSession | Out-Null

# Start a new SFTP session
(New-SFTPSession -Computername $ESXiHost -Credential $ESXICredential -Acceptkey -Force -WarningAction SilentlyContinue) | Out-Null

# Test that we can locate the VMX file
If(-not (Test-SFTPPath -SessionId 0 -Path $VMXlocation)) {
    Throw "ERROR: Cannot find donor VM's VMX file"
}


#################
## New VM name ##
#################

$validInput = $false
While(-not $validInput) {
    $newVMname = Read-Host "Enter the name of the new VM"
    $newVMdirectory = ("/vmfs/volumes/" + $VMdatastore + "/" + $newVMname)

    # Check if the directory already exists
    If(Test-SFTPPath -SessionId 0 -Path $newVMdirectory) {
        $ynTest = $false
        While(-not $ynTest) {
            $yn = (Read-Host "A directory already exists with that name. Continue? [Y/N]").ToUpper()
            if (($yn -ne 'Y') -and ($yn -ne 'N')) {
                Write-Host "ERROR: enter Y or N"
            } else {
                $ynTest = $true
            }
        }
        if($yn -eq 'Y') {
            $validInput = $true
        } else {
            Write-Host "You will need to choose a different VM name."
        }
    } else {
        If($newVMdirectory.Length -lt 1) {
            Write-Host "ERROR: enter a name"
        } else {
            $validInput = $true

            # Create the directory
            New-SFTPItem -SessionId 0 -Path $newVMdirectory -ItemType Directory | Out-Null
        }
    }
}


###################################
## Copy & transform the VMX file ##
###################################

# Clear all previous SSH sessions
Get-SSHSession | Remove-SSHSession | Out-Null

# Connect via SSH to the VMware host
(New-SSHSession -Computername $ESXiHost -Credential $ESXICredential -Acceptkey -Force -WarningAction SilentlyContinue) | Out-Null

# Replace VM name in new VMX file
Write-Host "Cloning the VMX file..."
$newVMXlocation = $newVMdirectory + '/' + $newVMname + '.vmx'
$command = ('sed -e "s/' + $VMdirectory + '/' + $newVMname + '/g" "' + $VMXlocation + '" > "' + $newVMXlocation + '"')
($commandResult = Invoke-SSHCommand -Index 0 -Command $command) | Out-Null

# Set the display name correctly (might be wrong if donor VM name didn't match directory name)
$find    = 'displayName \= ".*"'
$replace = 'displayName = "' + $newVMname + '"'
$command = ("sed -i 's/$find/$replace/' '$newVMXlocation'")
($commandResult = Invoke-SSHCommand -Index 0 -Command $command) | Out-Null

# Blank the MAC address for adapter 1
$find    = 'ethernet0.generatedAddress \= ".*"'
$replace = 'ethernet0.generatedAddress = ""'
$command = ("sed -i 's/$find/$replace/' '$newVMXlocation'")
($commandResult = Invoke-SSHCommand -Index 0 -Command $command) | Out-Null


#####################
## Clone the VMDKs ##
#####################

Write-Host "Please be patient while cloning disks. This can take some time!"
foreach($VMdisk in $VMdisks) {
    # Extract the filename
    $VMdisk.Filename -match "([^/]*\.vmdk)" | Out-Null
    $oldDisk = ("/vmfs/volumes/" + $VMdatastore + "/" + $VMdirectory + "/" + $Matches[1])
    $newDisk = ($newVMdirectory + "/" + ($Matches[1] -replace $VMdirectory, $newVMname))

    # Clone the disk
    $command = ('/bin/vmkfstools -i "' + $oldDisk + '" -d thin "' + $newDisk + '"')
    Write-Host "Cloning disk $oldDisk to $newDisk with command:"
    Write-Host $command
    # Set a timeout of 10 minutes/600 seconds for the disk to clone
    ($commandResult = Invoke-SSHCommand -Index 0 -Command $command -TimeOut 600) | Out-Null
    #Write-Host $commandResult.Output
   
}


########################
## Register the clone ##
########################

Write-Host "Registering the clone..."
$command = ('vim-cmd solo/register "' + $newVMXlocation + '"')
($commandResult = Invoke-SSHCommand -Index 0 -Command $command) | Out-Null
#Write-Host $commandResult.Output


##########
## TIDY ##
##########

# Close all connections to the ESXi host
Disconnect-VIServer -Server $ESXiHost -Force -Confirm:$false
Get-SSHSession | Remove-SSHSession | Out-Null
Get-SFTPSession | Remove-SFTPSession | Out-Null

# Return to previous directory
Pop-Location

You can download the latest version of the script from my GitHub repository.

Cover photo by Dynamic Wang on Unsplash

Using a Canon EOS 60D as a webcam

Canon EOS 60D
At the time of writing, the excellent EOS 90D is the modern equivalent of my 60D. I really want a Sony Alpha though! In the interests of transparency: these are affiliate links. See my affiliate disclosure page for an explanation.

Necessity is the mother of invention. In the midst of the trauma and struggles of coronavirus, one positive theme has consistently emerged: innovation. In particular, the explosive rise of home working, podcasting and vlogging has resulted in significant improvements in associated technology.

So when I recently started researching ways to raise my webcam game (for conference calls and church meetings), I was spoilt for choice. Since I’m a fan of both cost-efficiency and quality, I was particularly interested in seeing what could be achieved with my existing equipment, including my faithful DSLR camera – a Canon EOS 60D.

The last time I looked into this, the main way to take a feed from this camera model, was through its HDMI port and a separate video capture device (which I don’t own). But Canon has pulled an innovation blinder, releasing and improving its EOS Webcam Utility and ensuring that it not only works with the latest hardware, but also with such aging models as my ten-year-old 60D. Oh Canon, I love you.

“There must be a catch,” I thought, as I read reports of camera sensors overheating, or timing out after 30 minutes. So with no great expectations, I downloaded and tested the software.

Oh. My. Word. Did I mention Canon how I love you?

Screenshot of a Zoom session using the EOS webcam utility, my Canon EOS 60D and a 17-55 f/2.8 lens.

With zero effort and no tweaking of camera settings, the improvement was immediately visible. From the screenshot you can see I need to work on contrast and lighting. But for a first test, this made me very happy.

Did the camera overheat or timeout? No. I ran the session for about two hours. The camera was slightly warm at the end. Granted it’s not a hot summer’s day here in the UK (is it ever!) but it looks to me like this setup would work all day, every day. The only snag is battery life. So instead of being on the market for a superior webcam, I’m instead on the market for an external power supply (like Canon’s ACK-E6). Clones of the OEM adapter are available for about £23. Bargain.

Photo of EOS 60D courtesy of John Torcasio, CC BY-SA 3.0, via Wikimedia Commons.

SOLVED: first-time login problems when enforcing MFA with AWS

If you’re reading this page, you probably already have an MFA strategy sorted. But for those still making decisions: I love my YubiKey 5 NFC. I use it constantly to log into AWS, to secure my GitHub account, to protect my “I would be completely ruined if it were hacked” email account, etc. My wife was extremely puzzled when I asked for one for my birthday. She didn’t really know what it was she was buying me. What a great present though!

In the interests of transparency: this is an affiliate link. See my affiliate disclosure page for an explanation.

AWS has a tutorial about enforcing MFA for all users. The general thrust of the article is to create a policy that allows users without MFA to do nothing other than log in and set up MFA. Having enabled and logged in using MFA, other permissions become available to the user (according to whatever other permissions are assigned).

This works well apart from one snag: having created a user, and set the flag forcing the user to change password on first login, the user cannot log in. Instead the user is greeted with the following error:

Either user is not authorized to perform iam:ChangePassword or entered password does not comply with account password policy set by administrator

The problem lies in a policy statement called “DenyAllExceptListedIfNoMFA”. As its name suggests, for a user without MFA, this blocks all bar the specified actions. In AWS’s recommended policy, the section effectively allows the following actions:

"iam:CreateVirtualMFADevice",
"iam:EnableMFADevice",
"iam:GetUser",
"iam:ListMFADevices",
"iam:ListVirtualMFADevices",
"iam:ResyncMFADevice",
"sts:GetSessionToken"

You’ll notice that those actions don’t include anything about changing a password! So without MFA already enabled on your account, there’s no way to change your password when first logging on (if “force password change” is enabled). The trick is to add two more permissions:

"iam:ChangePassword",
"iam:CreateLoginProfile",

For a user that has not yet logged into the AWS console, this will allow creation of the user’s login profile and setting a new password.

SOLVED: Windows 10 forbidden port bind

Angry budgie - Photo by Егор Камелев on Unsplash

Ever have this problem, launching a Docker container (in this case, Nginx on port 8000)?

Error: Unable to start container: Error response from daemon: Ports are not available: listen tcp 0.0.0.0:8000: bind: An attempt was made to access a socket in a way forbidden by its access permissions.

or maybe this problem, trying to run PHP’s built-in webserver?

php -S localhost:8080
[Fri Sep 11 09:00:09 2020] Failed to listen on localhost:8080 (reason: An attempt was made to access a socket in a way forbidden by its access permissions.)

Who to trust?

Like me, you may already have read many “solutions”, on a whole bunch of spammy websites. The “fixes” are often no more than workarounds – and in some cases, pretty bad workarounds, at that. Such as:

  • Disable VPN
  • Disable Internet Connection Sharing
  • Disable third party firewall
  • Disable antivirus (for goodness’ sake!)

More sensibly, use (e.g.) netstat to find out if something has already bound to the port.

None of these helped in my case. (Well I didn’t try disabling my antivirus or firewall, because c’mon!) Nothing was bound to the ports in question. I couldn’t disable ICS because I’m using its capabilities to provide NAT routing for Hyper-V networks.

The cause

It turns out the problem is down to Docker and Hyper-V reserving a shed load of ports. You can verify if this is the case for you by running the following command (which despite advice elsewhere on the internet does not need to be in an elevated PowerShell prompt; plain old no-privileges cmd will do):

netsh interface ipv4 show excludedportrange protocol=tcp

In my case, I could see that a lot of ports were reserved, between 1128 and 55437:

Start Port End Port
---------- --------
1128 1227
1228 1327
1328 1427
1428 1527
1528 1627
1628 1727
1728 1827
1828 1927
1928 2027
[snip]
50000 50059 *
53610 53709
53710 53809
54210 54309
54610 54709
54710 54809
54910 55009
55113 55212
55214 55313
55338 55437

* - Administered port exclusions.

I confirmed that this is the issue by picking a port that hadn’t been reserved:

php -S localhost:50080
[Fri Sep 11 09:12:21 2020] PHP 7.4.8 Development Server (http://localhost:50080) started

(For me, the PHP web server would also start quite happily on port 80, incidentally. But you probably shouldn’t do that!)

People who have identified this issue tend to recommend disabling Hyper-V, excluding whatever ports you need and re-enabling Hyper-V. I’m nervous of that approach however, having spent a lot of time configuring Hyper-V networking and having seen this approach nuke networking in the past.

If you’re happy taking that approach, I suggest reading and understanding this Microsoft article. Personally, I prefer to approach this as follows.

Find gaps in port reservations

Hyper-V and Docker between them seem to reserve different sets of ports on each reboot. Helpful. You can look for gaps in the port reservations using the following method, but note that these gaps will not persist, without other measures. Here’s how to find the gaps:

  1. Run the netsh command above.
  2. Copy and paste the output into Notepad++ and use search and replace (in regular expression mode) to turn all the spaces into tabs – replace ( +) with \t.
  3. Copy and paste the result into Excel (which will now put all the ports nicely into cells.
  4. Use an Excel formula to identify gaps in the reserved ranges: =IF(A4=(B3+1), "continuous", "## " & TEXT(A4-B3-1, "0") & " PORT GAP ##")
    List of reserved port ranges, showing any gaps
  5. Where “PORT GAP” appears, there is a gap between the end port on that line and the start port on the next (this would be 2115-2379 in the example above, which is 265 ports, inclusive).

As you can see, this approach does find you an available port (unless something else has bound to it):

php -S localhost:2115
[Fri Sep 11 09:36:58 2020] PHP 7.4.8 Development Server (http://localhost:2115) started

The fix: reserve your own ports

Well, two can play that game. Once you’ve found a gap, you can permanently reserve it for your own use. I found the largest gap between 12970 and 49670, so decided to reserve a memorable slice of ports: 20000- 21000. The appropriate incantation follows, which does need to be elevated this time. Swap port numbers and range to suit your environment and requirements:

netsh int ipv4 add excludedportrange protocol=tcp startport=20000 numberofports=1000 store=persistent

You will see that the range is now showing as administratively reserved (indicated by the asterisk):

netsh interface ipv4 show excludedportrange protocol=tcp

Protocol tcp Port Exclusion Ranges
Start Port End Port
---------- --------
1215 1314
…
20000 20999 *
…
51490 51589
- Administered port exclusions.

And once again, I can use a port within my preferred range:

php -S localhost:20080
[Fri Sep 11 10:37:23 2020] PHP 7.4.8 Development Server (http://localhost:20080) started

This exclusion persists between reboots and protects your range from being stolen by Hyper-V or anything else.

Angry budgie featured photo by Егор Камелев on Unsplash

Hyper-V virtual switch creation woes

man rubbing his temples in frustration
UPDATE: You may wish to leap straight to the comment below from Craig S., in which he recommends a workaround for this problem. Many subsequent commenters have tried his suggestion, with success. Thanks, Craig!

For my own part, I moved to using an internal switch configured for NAT and run a pfSense VM in that network, for DHCP. Without doubt, Craig’s solution is easier to implement!

If you found this post helpful, you might also be interested in John Savill’s book, Mastering Windows Server 2016 Hyper-V. In the interests of transparency: this is an affiliate link. See my affiliate disclosure page for an explanation.

High-end laptop hardware does not approach enterprise-grade server quality. That’s my takeaway.

I’ve been wrangling with Hyper-V on a very nice ultrabook that has 32GB of RAM and a Core i7 processor (quad-core). Highly portable and useful for running multiple VMs, which indeed was the idea.

This ultrabook is also now sporting the 2004 release of Windows 10. Security-obsessed folks like me have taken a keen interest in the application sandbox features.

Since sandboxing is all about virtualisation, it made sense, I reasoned, to use Hyper-V rather than any other virtualisation platform. It’s native to Windows and this way there should be fewer potential conflicts between hypervisors.

And it was all going well until I attempted to create an “external” (bridged) switch. The creation process failed with the following unhelpful Virtual Switch Manager errors:

  • “Adding ports to the switch ‘[switch name]‘ failed. The operation failed because the object was not found.”
  • “Failed while adding virtual Ethernet switch connections. Ethernet port ‘{[insert long GUID here]}’ bind failed: Cannot create a file when that file already exists. (0x800700B7).”

Worse than that, whatever process had partially completed could not easily be reversed – clicking ‘cancel’ did not back out any changes. Rather, it left my laptop with networking utterly ruined.

I confirmed this by repeated attempts to create a vSwitch, following resetting the network stack and removing/reinstalling Hyper-V. Same results every time. If you find yourself in this mess, utter the incantation “netcfg -d” at an elevated command prompt and reboot. (You may also need to remove and re-add WAN miniport devices as described here, since this process can break existing L2TP VPN connections.)

Is there a way to fix this problem? I believe not, at present. It’s almost certainly connected to the attempted use of a WiFi adapter which Hyper-V can’t always support. My adapter is an Intel Wi-Fi 6 AX200, FWIW. No matter how expensive the laptop, you can’t assume that the hardware will provide across-the-board support for virtualisation. For that, you really need a decent-quality server-grade NIC.

All is not lost. Although it’s not possible to bridge the Wi-Fi adapter, the default network (which offers NAT) works fine. And internal virtual networks (which aren’t hardware bound) are also unaffected. Granted, this means you can’t route externally into your VMs, but that’s not the end of the world, particularly if you’re au fait with port forwarding (read also the comments at that link).

(Yes, you could also use an external network interface for this, though that reduces your laptop’s portability of course. Quality matters for that too – YMMV.)

If you have cracked this problem or can provide any further thoughts or guidance, please do let me know in the comments!

This article’s featured photo by Siavash Ghanbari on Unsplash