Speak Slowly, no Vowels, pls – Solutions

In my previous post, I described how I devised a programming problem for an internal company contest with the help of the ubiquitous ChatGPT. Also, ChatGPT provided a solution for the problem as part of the development process. Even more interestingly the language model provided a fictional context for justifying the problem.

The task was to write a function removeVowels( string text ) which takes an arbitrary text (arbitrary as long as it contains no uppercase letters) and returns the same string where vowels have been removed. Given the string “hello world”, the result should be “hll wrld”.

The implementation must not have:

  • loops
  • if statement
  • list comprehension

If you want to give it a try before reading the solution, stop here. Otherwise, follow me.

It’s not Cheating If It’s a Library

Library solutions usually are simple and very readable. It may feel like cheating since you are delegating the “hard work” to some hidden code written by someone else, but, in general, using a library is part of the correct mindset of the programmer – don’t try to reinvent the wheel when you can rely on mature and battle-proven code.

Matteo‘s C# solution and one of Nicola‘s python solutions use the repeated application of the “Replace” method onto the input string to put an empty string in place of a vowel. Here is the C# solution, the python one is very similar.

public class VowelRemover
{
    public string WithoutVowels(string input)
    {
        return input
            .Replace("a", String.Empty)
            .Replace("e", String.Empty)
            .Replace("i", String.Empty)
            .Replace("o", String.Empty)
            .Replace("u", String.Empty);

    }
}

Another useful library tool is the python translation table, as illustrated in another solution from Nicola (just a bit more verbose than Luigi‘s solution based on the same library functions):

def without_vowels_and_without_regex_3(s):
    vowels = "aeiouAEIOU"
    table = str.maketrans('', '', vowels)
    no_vowels_string = s.translate(table)
    return no_vowels_string

the maketrans function creates a translation table, each character in the first argument is replaced by the corresponding character in the second argument (in our case this functionality is not used and Nicola provides two empty strings); the third argument contains all the characters to remove from the input string.

Regular expressions are possibly one of the first options when it comes to searching and replacing text… it usually involves quite some headache and bad words shouted to the screen, before getting it proper.

def without_vowels_but_with_regex(s):
    no_vowels_string = re.sub('[aeiouAEIOU]', '', s)
    return no_vowels_string

This is Nicola‘s solution with regexp. The sub method search for a portion of the text contains in the third argument to match with the expression contained in its first argument and replace the matching text with the content of the second argument (possibly easier done than said). Paolo submitted the very same solution wrapped in a try/except statement (I’m not sure if it is needed or if it is a bit of defensive programming).

If you want to scream a bit less when developing regular expressions, you may use an online tool such as regex 101.

Before Jumping to an HLL…

Before jumping to a High-Level Language, Luca resorted to an ingenious shell script –

#!/bin/sh 
 
# Pass all arguments, take first 100 chars, delete unwanted stuff  
echo $@ | dd bs=1 count=100 2>/dev/null | tr -d 'aeiou'

Indeed I appreciate bash scripting – they are clean and tidy to the point. Possibly this is the shortest solution (including a length check for free).

Dissecting for those that are not fluent in bashism –

  1. echo $@ – takes all the command line arguments of the script and sends them to stdout. In this case, the stdout is captured by the pipe ‘|’ and sent to the next command:
  2. dd bs=1 count=100 2>/dev/null – dd is a very general and powerful data copying tool. In this case, Luca instructs dd to consider blocks of size 1, and take at most 100 blocks from standard input. Since no other options are provided the blocks are copied to stdout. The odd-looking “2>/dev/null” means that the shell should transfer all the errors printed by dd to the void. In other words, it makes dd print no error. The pipe transfers data to
  3. tr -d 'aeiou' – this is where the translation happens. tr can be used both by changing a character into another and for removing characters from its stdin, like in this case.

One Week in the Lab Saves 20′ in the Library

There is a dark side, a mermaid song that lures the programmers – it is the roll your own version, with the dubious promise of doing better or faster or with no bugs, in no time, but also to find new and unexplored paths and solution and boldly go where no programmer has gone before.

Contests like this make a safe opportunity to get on board for this kind of trek.

C++ Template Galore

Domenico proposed a compile-time solution based on template unrolling –

/**
 * @brief Solution to riddle "Elevators Everywhere"
 * 
 * @author Domenico Iezzi
 * @date 08.02.2023
 * 
 * This solution is based on the idea of compile-time string parsing.
 * Probably is hardly applicable in a real-world context, but I found it
 * interesting.
 * 
 * Since it is not possible to use a const char* as a template parameter,
 * the solution is to use a string container and recursively unpack all the 
 * characters.
 * 
 * The idea of this template string unpacking is taken from the following
 * stack overflow answer:
 * 
 * - https://stackoverflow.com/a/15863804
 * 
 * Then to filter out vowels I created a template specialization of the class
 * with the specific vowel to "skip" the recursion step.
 */

#include <array>
#include <cstring>
#include <string>
#include <string_view>
#include <type_traits>
#include <iostream>

constexpr size_t constexpr_strlen(const char *str)
{
    return *str == '\0' ? 0 : 1 + constexpr_strlen(str + 1);
}

struct StringHolder {
    constexpr static const char* get() {
        return "lamiastringa";
    }
};

template<char... chars>
struct PackString {
    static constexpr std::array<char, sizeof...(chars)> g_array{chars...};
};

template<typename container, size_t len, char... chars>
struct StringUnroll {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, size_t len, char... chars>
struct StringUnroll<container, len, 'a', chars...> {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, size_t len, char... chars>
struct StringUnroll<container, len, 'e', chars...> {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, size_t len, char... chars>
struct StringUnroll<container, len, 'i', chars...> {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, size_t len, char... chars>
struct StringUnroll<container, len, 'o', chars...> {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, size_t len, char... chars>
struct StringUnroll<container, len, 'u', chars...> {
    using result = typename StringUnroll<container, len - 1, container::get()[len - 1],
                     chars...>::result;
};
template<typename container, char... chars>
struct StringUnroll<container, 0, chars...> {
    using result = PackString<chars...>;
};

using start_unroll = StringUnroll<StringHolder, constexpr_strlen(StringHolder::get())>;

int main()
{
    std::string_view final_str{start_unroll::result::g_array.data(), start_unroll::result::g_array.size()};
    std::cout << "String without vowels: " <<  final_str << std::endl;
    return 0;
}

C++ was easy 25 years ago, but then they lost their sane mind. Nowadays C++ let the programmer express really powerful and complex concepts at the cost of sanity points.

Starting from the end – the final_str is the result of the computation and it is defined as a string view over the resulting string. The view is constructed by using a reference to the first character and the count of characters to use.

Both these arguments are computed by the StringUnroll template (which is aliased as start_unroll). StringUnroll is a variadic template, meaning that it may be instantiated with any number of arguments. Via partial specialization, the variadic parameter pack is processed one character at a time. There are five specializations dedicated to ignoring vowels, these specializations act as a pattern-matching engine, and if a character matches, the corresponding specialization is used.

If you are interested in this solution, I suggest taking a look at the StackOverflow question pointed to by Domenico.

I’ll Be Back – Charminator

This solution by Paolo deserves a nomination for the best name. Kudos! For sake of simplicity and brevity, I’m rearranging his solution a bit –

#include <string.h>
#include <stdint.h>

static char const *charsToDelPtr = NULL;

typedef void (*fn_handler_t)(char* src);

static void hastaLaVistaBaby(char* src);
static void obliterateString(char* src);

static const fn_handler_t fnHandler[] =
{
    hastaLaVistaBaby,
    obliterateString
};

static uint32_t getMax(uint32_t val1, uint32_t val2)
{
    return ((val1>val2)*val1 + (val1<=val2)*val2);
}

/**
 * Removes the first occurence of the selected set of characters from input
 * string, if found
 */
static void obliterateString(char * const src)
{
    size_t const charsToKeepNum = strcspn(src, charsToDelPtr);
    size_t const charsToDelNum  = strspn(src, charsToDelPtr);
    size_t const charsLeftNum   = strlen(&src[getMax(charsToDelNum, charsToKeepNum)]);
    size_t const charsToCopyNum = (0U != charsToDelNum)*(charsLeftNum+1);
    size_t const nextIndex = charsToKeepNum*(0U == charsToDelNum);

    /** Overwriting the characters to delete, if any, with the remaining elements.
     */  
    memmove(&src[0], &src[charsToDelNum], charsToCopyNum);

    /** if string is not empty, the remaining part of the string is processed;
     * else, the loop breaking function is called
     */
    (fnHandler[(0U < charsLeftNum)])(&src[nextIndex]);
}

/**
 * No operation. Dummy function used just to break the loop
 */
static void hastaLaVistaBaby(char * const src)
{
    (void)src;
}

void withoutVowels(char * const string)
{
    charsToDelPtr = "aeiou";
    (fnHandler[1])(&string[0]);
}

This solution is very interesting for several reasons –

  1. the use of library functions strspn and strcspn – to find the first occurrence of a vowel or a consonant in the string;
  2. the implementation of the getMax function which manages to compute the maximum of two integers without using the if statement / ternary operator (see also the contest on this subject);
  3. the use of a “computed” jump to select the proper function to complete the operation.

Note the use of char* const argument type which means that the function may change the content to which the argument is pointing, but cannot change the argument itself.

A Very Exclusive Or

Yet again in C, Luca‘s second submission follows –

#include <stdio.h> 
#include <string.h> 

#define NOTVOW(c,v) (((c)^v)>0) 
 
void done(char *, const char *, int); 
void next(char *, const char *, int); 
 
void done(char *d, const char *s, int sc) 
{ 
    /* Just end recursion */ 
    return; 
} 
 
void next(char *d, const char *s, int sc) 
{ 
    /* Put the current char on destination and move to next ... */ 
    *d=s[sc++]; 
    /* ... but keep it only if it is none of the vowels */ 
    d += (NOTVOW(*d,'a') & NOTVOW(*d,'e') & NOTVOW(*d,'i') & NOTVOW(*d,'o') 
& NOTVOW(*d,'u'))>0; 
 
    /* Let's recurse on the same function ... */ 
    void (*next_p)(char *d, const char *s, int sc) = &next; 
    /* ... but change it after 100 chars or EOS */ 
    next_p -= ((sc>=99) || s[sc]==0)*(&next-&done); 
 
    return next_p(d, s, sc); 
} 
 
int main(int argc, const char **argv) { 
    char d[101] = {0}; /* Worst case */ 
    next(d, (const char *) argv[1], 0); 
    printf("%s\n", d); 
    return 0; 
}

Luca approaches the loop using the recursion. In order to terminate the recursion a fearless subtraction between two function pointers (!) is employed. In other words, he sets a function target to next() and changes it by the delta between next() and done() addresses. I guess everything is fine until ptrdiff_t behaves properly when stirred with int.

The check for vowels is performed using the bit xor. Xoring two numbers yields 0 if and only if they are the same number. Comparing the result against zero is an effective way to discriminate whether the required character is met or not.

Good Old Karnaugh

This is my own submission. I started from the idea of finding a way to identify vowels in some exotic ways and remembered the Karnaugh Map we used to draw at the Uni. Consider a circuit with n inputs and one output. You have a truth table that defines the output value that corresponds to every possible input vector. Now the problem you (may) face is, what is the minimal circuit I need to implement this function?

Karnaugh maps, let you group inputs together so that a two levels network can be used to synthesize the function.

Back to the problem – the 8 inputs are the bits composing the character and the output must differentiate whether the character is a vowel or a consonant.

Here is the binary representation of the vowels – the corresponding truth table yields 1 for all these patterns and 0 for all the others.

ABCDE
01100001a
01100101e
01101001i
01101111o
01110101u

Since the differences are only in the lower five bits, I’ll ignore the upper 3.

The map of Kavanaugh is just a different layout of the truth table, with a special ordering of rows and columns:

AB/CDE000001011010110111101100
1000000010
0001000010
0101000100
1100000000

As you may notice, going from one column (or row) to the next one or to the previous one, only one bit changes in the inputs.

In order to simplify I reordered the rows (but you can read the map rolling over the edges like a toroid, or a PacMan world) so that there are 3 clusters of 1s:

  • column 001, rows 0x
  • column 111, row 01
  • column 101, row x0

So the equivalent function is defined as:

~A~C~D E | ~B C~D E | ~A B C D E

(where I used the ~ character to indicate the negation). Here is my complete solution –

#include <iostream>

bool isVowel( char c )
{
    return (c & 0b01110111) == 0b01100001 ||
           (c & 0b01101111) == 0b01100101 ||
           (c & 0b01111111) == 0b01101111;
}

bool removeVowels( std::string_view in, std::string& out )
{
    return in.size() == 0 ||
             (isVowel(in[0]) && removeVowels(in.substr(1), out)) ||
             (removeVowels(in.substr(1), out += in[0]));

}

std::string removeVowels( std::string_view in )
{
    std::string result;
    removeVowels(in, result);
    return result;
}


int main() {
    std::cout << removeVowels("elevators everywhere") << std::endl;
    return 0;
}

In function removeVowels(in, out) I used the logical expression shortcuts both for deciding whether append the character to the result and whether to terminate the recursion.

Lo and Behold – Haskell

Yes, I have a soft spot for functional programming, and receiving a Haskell submission was a heartwarmer. Luca strikes back with his third vowel remover –

import Data.Bits 
import Data.Char 
import Data.Bool 
 
filterVowel :: Char -> [Char] 
filterVowel c =  
    replicate ( fromEnum ( all ( /= c ) "aeiou" ) ) c 
 
withoutVowel100 :: [Char] -> [Char] 
withoutVowel100 s = filterVowel (head s) ++ withoutVowel (tail s) 
 
withoutVowel :: [Char] -> [Char] 
withoutVowel [] = [] 
withoutVowel s = withoutVowel100 ( take 100 s ) 

At first sight, Haskell is always a bit disorienting if you don’t know it. And I don’t know enough Haskell to feel at home.

Notably, the first line of a function defines the function signature – filterVowel, withoutVowel100 and withoutVowel are the three functions. As you may expect Char is a type to represent a character, while [Char] is a list of characters. So filterVowel :: Char -> [Char] describes a function that accepts a character and returns a list of characters.

From the line below the function is defined providing a name for the argument, the “body” of the function starts after the ‘=’ sign.

Let’s start from the outmost function – withoutVowel. It takes a list of characters and returns a list of characters. It has two definitions – one that applies when the list is empty (useful to terminate recursion), the other for non-empty lists. In Haskell functions are invoked without the parenthesis – i.e. to invoke f on argument x, you type f x (and not f(x) as you would in more traditional languages). So take 100 s is invoking the function take with arguments 100 and s (this is not really a precise description, but would do for this explanation). take 100 s – returns the first 100 items from s (or the entire s if it contains less than 100 items). The result is passed to the function withoutVowel100.

withoutVowel100 is the loop engine of the transformation – applying the filter to the head of the input sequence and recursively iterating on the tail (note that withoutVowel and withoutVowel100 are co-recursive, speaking of which… co-recursive is also a wonderful podcast speaking about programmers, programming, and even functional programming, highly recommended).

Now filterVowel is the core of the transformation and the more complex function of the pack. Let’s start from the outside: replicate ??? c. This means taking character c and replicate it ??? times. So it is enough to get 0 or 1 in place of ??? to perform the filtering. More precisely we need a 0 when c is a vowel and 1 when it is not.

fromEnum a b (in this case) converts a boolean value a into an integer – 0 for False and 1 for True.

Expression ( all ( /= c ) "aeiou" ) is the last step in understanding the code. all is a function that takes a predicate (i.e. a function that returns a boolean), applies it to the elements of a set, and returns true if the predicate is true for all of them or false if there is at least one element that doesn’t satisfy the predicate.

The predicate is /= c. This is a nice feature of Haskell – all functions accept just one argument. If you need more than a single argument, just make a function that returns a function that accepts the second argument. So the inequality operator /= accepts a single argument (c in this case) and returns a function that accepts an argument, compares it with c , and returns True if they are different. The name of this technique is called currying and in other languages, you would write in a very different way:

auto notEqual( char c ) {
  return [c]( char x ){ return c != x; }
}

...
if( notEqual( 'a' )( 'b' ) )...

If you like C++ or:

def notEqual( c: Char )( x: Char ): Boolean = c != x
...
if notEqual('a')('b') ...

If you prefer Scala.

As you can see, C++ code does not scale that well, always better than C, in which you can’t write code like this at all.

Conclusions

When I asked ChatGPT to invent a programming riddle I had very high expectations… It was quite the beginning of my experience with this AI and bought into the hype.

After a few interactions, I got something that could actually work as an interesting programming riddle, though I was a bit skeptical that many different solutions could be found. I had to say that human creativity still beats generative model languages and got several smart solutions to the problem.

Have you found an alternative approach to the solution?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.