From Perl to Go: A Developer's Journey to Mastering Data and Concurrency

The Evolution of a Programmer

I embarked on my programming journey around the age of 10. The year was 1985, and my weapon of choice was Commodore BASIC. I had received a Commodore VIC-20 for Christmas that year. At first, I just played cartridge games like Gorf and Galaxian. In the computer's box was a subscription card for Compute! magazine. My mom ordered the magazine, and a couple of months later, the first issue arrived. I fell in love with it instantly. Every issue had a program for the VIC-20, along with the same program for other machines like the Apple II+, the different iterations of Atari home computers, and a few others.

In most cases, my mom would type in the programs while I was at school, and I would debug them when I got home. There were almost always bugs by the time she finished, since those programs were long and often used PEEK and POKE statements to work directly with memory locations. Not to mention that the last 100 lines or more of code might be nothing but data for the program, so you were just typing something like 250 DATA "BLAH BLAH BLAH", or whatever method your language used. These were great times, saving our programs to cassette tape.

I was deeply interested in that for a few years, then laid off computers for a while. Then, one day, my uncle decided to upgrade the accounting system he used for his business and gave me an Epson QX-10 and a copy of Microsoft BASIC. I did all kinds of things with that computer. It was so far outdated by the time I got it that you couldn't buy software or peripherals for it. But with a copy of BASIC and a lot of time, writing my own programs is what I did with it the most.

Over the years, I dabbled in GW-BASIC, Turbo C, Turbo Pascal, Delphi, Visual Basic, enough ASM to be dangerous, and really whatever else I could get my hands on. I liked tinkering with any language but had yet to discover Perl. In 2001, that changed.

Enter the Professional World

When I hit the professional world of Linux in 2001, Perl was one of the first tools I learned; I learned it even before I wrote my first Bash script, and I still prefer Perl over Bash for simple jobs. With its motto, "There's more than one way to do it," Perl became my trusty sidekick. It was flexible, powerful, and perfect for my sysadmin tasks and web development projects. For years, Perl was my go-to language, and I wielded it with finesse, using its powerful text processing capabilities and somewhat quirky yet lovable syntax. Perl is so versatile that it is often called the "Swiss Army chainsaw" and the "duct tape that holds the Internet together". But with the ability to do speed-of-light prototyping and development in Perl come a few downsides that you don't notice until you need your program to do really heavy lifting.

As the complexity of my projects grew, I found myself dabbling in PHP and JavaScript: PHP for server-side scripting, and JavaScript, which evolved into Node.js, for more dynamic and interactive web applications. On a few rare occasions, I touched C#, along with tons of HTML, CSS, and browser JavaScript. These languages served me well, each adding a new layer to my understanding of programming paradigms.

The Go Challenge

Then came the need for something more efficient, something that could easily handle concurrent tasks and provide the robustness required for modern applications. Enter Go.

Go, with its promise of simplicity and performance, seemed like the perfect fit. But as I delved deeper into this new language, I encountered concepts that were both familiar and alien. Structs, pointers, and references in Go were not like anything I had dealt with in Perl or JavaScript. Though I said earlier that I liked to tinker with any language, I never said I was any good at them. I understood how they worked and could easily modify existing programs, but writing from scratch in a structured language such as C++ or even Java was never my strong suit. So pointers and structs were largely new to me. Perl references are a piece of cake by comparison. I knew what structs were, but having to slow down and figure out my data before throwing it at a program was frustrating when I had spent so many years writing Wild West code.

Structs: The Building Blocks

In Go, structs are the fundamental building blocks for creating complex data structures. Unlike Perl, where data structures are more fluid and less strictly defined, Go requires you to explicitly define the types of data you are working with. This was a significant shift from the dynamic typing I was accustomed to in Perl and JavaScript.

In Perl, you might use a hash to represent a complex data structure:

my %modem = (
    MACAddress => '00:11:22:33:44:55',
    IPAddress  => '192.168.1.1',
    Status     => 'active'
);

In Go, you use a struct, which provides a more rigid and clear definition:

type Modem struct {
    MACAddress string
    IPAddress  string
    Status     string
}

modem := Modem{
    MACAddress: "00:11:22:33:44:55",
    IPAddress:  "192.168.1.1",
    Status:     "active",
}

The explicit nature of structs ensures that each instance of a Modem has the same fields, reducing the likelihood of runtime errors due to missing or misspelled keys. Unlike Perl, Go will not let you assign a value to a field that is not defined in the struct. Additionally, you cannot assign a value of a different type than what the struct declares. If you define a field as an int64, you cannot put a string there, or anything else other than an int64.

At first, I hated this. I had to scrutinize my functions and my data to ensure everything matched what I was telling the struct it would be. Nearly all data is dirty, and with Perl, it was easy to skip over or overlook these discrepancies since the variable would accept whatever you threw at it. While you can use type checks in Go to skip over incorrect data, you cannot assign it if the type is incorrect. However, the more I worked with it, the more I started to appreciate this strictness. It forced me to look deeper at my data, understand it better, and build only the structs I needed. When moving from a loosely typed language to a strictly typed one, you start thinking differently about data management and structure.
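
For example, here is a minimal sketch of what that looks like in practice, using the Modem struct above; the raw map of incoming data is hypothetical, standing in for whatever dirty input you might be parsing:

// raw is hypothetical "dirty" input, such as you might get from
// decoding arbitrary JSON into a map[string]interface{}.
raw := map[string]interface{}{
    "MACAddress": "00:11:22:33:44:55",
    "IPAddress":  "192.168.1.1",
    "Status":     42, // wrong type: should be a string
}

var modem Modem
// The "comma ok" type assertion checks the type before assigning;
// a direct assignment of a non-string value to a string field would
// not even compile.
if mac, ok := raw["MACAddress"].(string); ok {
    modem.MACAddress = mac
}
if status, ok := raw["Status"].(string); ok {
    modem.Status = status // skipped here, because 42 is not a string
}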

And then we need to find and use our strictly engineered data; pointers to the rescue.

Pointers: The Direct Address

Pointers in Go were another paradigm shift. In Perl, we use references to create complex data structures like arrays of hashes or hashes of arrays. JavaScript variables, too, are references to objects or primitive values. But Go's pointers are a different beast altogether.

In Perl, you might pass a reference to a subroutine to modify a complex data structure:

sub update_status {
    my ($modem_ref, $status) = @_;
    $modem_ref->{Status} = $status;
}

update_status(\%modem, 'inactive');

In Go, you use pointers to achieve similar behavior but with more explicit control over memory management:

func updateStatus(modem *Modem, status string) {
    modem.Status = status
}

updateStatus(&modem, "inactive")

The *Modem parameter is a pointer to a Modem struct. This means that any changes made to modem.Status within the function affect the original Modem instance. Understanding pointers and their dereferencing was crucial to mastering Go's memory management and efficient data handling. One advantage of working with pointers is that you can also dereference one to create a new instance of the Modem structure that is separate from the original. In my current project, I use this method for in-memory calculations between interfaces, ensuring that both sets of data are structured exactly the same.
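
Here is a rough sketch of that copy-on-dereference idea, reusing the Modem struct from above; the snapshot function and variable names are purely illustrative, not code from my project:

// snapshot dereferences the pointer, which copies the struct value,
// so the returned Modem is independent of the original.
func snapshot(live *Modem) Modem {
    return *live
}

original := Modem{MACAddress: "00:11:22:33:44:55", Status: "active"}
working := snapshot(&original)
working.Status = "inactive" // original.Status is still "active"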

References: Dynamic and Fluid

Perl’s references are dynamically typed, allowing for flexible and rapid development. You can easily create complex nested data structures without much upfront definition. However, this flexibility can lead to less predictable code and harder-to-track bugs.

As mentioned above, any of the keys in the Perl hash reference below can hold any type of value you throw at it: an int, a string, hex, octal, another hash, another array. You can put whatever you want, whenever you want. This is a blessing and a curse.

my $modem_ref = {
    MACAddress     => '00:11:22:33:44:55',
    IPAddress      => '192.168.1.1',
    Status         => 'active',
    ModemDetails   => {
        Manufacturer => 'Netgear',
        Model        => 'CM500',
        Firmware     => 'V1.01.14',
        BWUsed       => 'Whatever I want to put here'
    }
};

In contrast, Go’s approach with pointers and structs enforces a more disciplined and predictable code structure. This rigidity might seem cumbersome at first, but it leads to more maintainable and error-resistant code.

package main

import "fmt"

// Define a nested struct for ModemDetails
type ModemDetails struct {
    Manufacturer string
    Model        string
    Firmware     string
    BWUsed       int64
}

// Define the Modem struct with a nested ModemDetails struct
type Modem struct {
    MACAddress string
    IPAddress  string
    Status     string
    Details    ModemDetails // Add the nested struct here
}

func main() {
    // Initialize a modem with nested struct details
    modem := &Modem{
        MACAddress: "00:11:22:33:44:55",
        IPAddress:  "192.168.1.1",
        Status:     "active",
        Details: ModemDetails{
            Manufacturer: "Netgear",
            Model:        "CM500",
            Firmware:     "V1.01.14",
            BWUsed:       2304829754,
        },
    }

    // Access the nested fields through the Details field
    fmt.Println(modem.Details.Manufacturer, modem.Details.BWUsed)
}

Explanation:

  • A ModemDetails struct is defined to hold additional details about the modem.
  • The Modem struct is updated to include a Details field of type ModemDetails.
  • The main function initializes a Modem instance with nested ModemDetails and demonstrates how to access the nested fields.

As you can see, Go enforces this structure at every level of nesting. In contrast, Perl is more flexible; it doesn't care what you put in hash keys. In the Perl example above, I put a gibberish string in the BWUsed key. Unless you've implemented your own type checks, Perl will proceed without complaint. Go, however, would immediately raise an error if you tried to assign a string to an int64 field. This strict type enforcement is crucial, especially when ingesting data into systems like OpenSearch, which expect specific types. If you try to ingest a Perl hash with a mismatched type into an OpenSearch index that expects an integer, it will reject the document or potentially crash your program. These are important considerations regardless of the language you are using. Always think about all aspects of your code, from input to output.
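
To make that concrete, here is a minimal sketch of the kind of guard I mean. The setBWUsed helper is purely illustrative (it is not the actual ingestion code), and it assumes the bandwidth value arrives as a string from dirty input; strconv.ParseInt from the standard library does the validation:

import (
    "fmt"
    "strconv"
)

// setBWUsed validates a raw string before it ever reaches the int64
// field. Anything that is not a clean integer is rejected up front
// instead of being discovered when the document hits the index.
func setBWUsed(details *ModemDetails, raw string) error {
    bw, err := strconv.ParseInt(raw, 10, 64)
    if err != nil {
        return fmt.Errorf("invalid BWUsed value %q: %w", raw, err)
    }
    details.BWUsed = bw
    return nil
}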

Goroutines vs. Perl Forking

One of the most significant advantages of Go over Perl is its concurrency model. Perl uses forking to handle concurrent tasks, which can be resource-intensive and complex to manage. Forking creates a separate process for each task, requiring significant overhead for process creation and communication.

In Perl, forking is done using the fork function. A fork call is processor-expensive because it makes a call to the system kernel to create a child process that is a duplicate of the parent process. The child process runs as a separate instance with its own memory space. This is a Unix-based approach, where the operating system handles creating and managing these processes.

When the operating system has to handle the forking process at the kernel, there are many context switches. These generally involve flushing pipelines, duplicating memory, and jumping between different kernel protection rings. More or less, your ring 3 program has to request permission from the kernel to execute a command in ring 0 and then copy the entire program into a new memory space. This is expensive.

Below is a simple Perl example.

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(10); # up to 10 processes

foreach my $modem (@modems) {
    $pm->start and next; # do the fork
    check_modem_status($modem);
    $pm->finish; # terminate the child process
}

$pm->wait_all_children;

While this works, it can quickly become unwieldy for large-scale applications. Managing child processes and inter-process communication adds complexity and overhead. Each forked process has its own memory space, so any communication between parent and child processes must go through Inter-Process Communication (IPC) mechanisms like pipes, message queues, or shared memory. These mechanisms can be slow and add significant overhead to the system.

As mentioned above, the OS has to handle the scheduling and context switching between processes, which further reduces efficiency. Forking and threading were not Perl's primary design goals; the language was originally created for text processing and system administration tasks. As a result, Perl's concurrency capabilities are not as advanced or efficient as those in languages specifically designed with concurrency in mind.

In contrast, Go's concurrency model is lightweight and built into the language. Goroutines allow you to run functions concurrently with minimal overhead. They are not OS-level threads; the Go runtime manages them, multiplexing thousands of goroutines onto a smaller number of OS threads and handling their scheduling itself.

func checkModemStatus(modem Modem, wg *sync.WaitGroup) {
    defer wg.Done()
    // Check modem status
}

var wg sync.WaitGroup

for _, modem := range modems {
    wg.Add(1)
    go checkModemStatus(modem, &wg)
}

wg.Wait()

Goroutines are much more efficient than creating new processes. Compared to OS-level threads, they consume less memory and have lower startup times. The Go runtime handles the scheduling of goroutines, optimizing their execution and resource usage. This makes Go particularly well-suited for applications that require high concurrency and scalability. Unlike Perl forking, which has to cross into the kernel (ring 0) for every new process, goroutine scheduling happens entirely in userspace, ring 3.
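
One practical note: the WaitGroup example above starts a goroutine for every modem with no upper bound, while the Perl example capped itself at 10 processes. If you want the same kind of cap in Go, a buffered channel works as a simple semaphore. This is a minimal sketch, not the code from my project:

sem := make(chan struct{}, 10) // at most 10 checks in flight, like ForkManager->new(10)
var wg sync.WaitGroup

for _, modem := range modems {
    wg.Add(1)
    sem <- struct{}{} // acquire a slot; blocks while 10 goroutines are already running
    go func(m Modem) {
        defer func() { <-sem }() // release the slot when this check finishes
        checkModemStatus(m, &wg) // same worker as above; it calls wg.Done()
    }(modem)
}

wg.Wait()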

In summary, while Perl’s forking model works for handling concurrent tasks, it is not as efficient or scalable as Go’s goroutines. The complexity and overhead of managing child processes and inter-process communication in Perl can become a bottleneck in large-scale applications. Go’s concurrency model, with its lightweight goroutines and efficient runtime scheduling, offers a more robust solution for modern, high-performance applications.

The Project in Action

Yesterday, I reached a significant milestone in my project: getting the pre-calculated CMTS interface and operational data written into OpenSearch. I built some Grafana dashboards around this data, providing real-time insights into the network's performance.

Today, I have started working on tracking about 2300 modems. This task is more challenging because we need to keep a memory object for each interface on each modem. The Go program calculates differences and operations per second for bandwidth, packets, power differences, and other dynamic data from the CMTS interfaces and modems. I chose to do the calculations in the software instead of writing complex queries. This program runs as a daemon with built-in cleanup and refresh timers to avoid tracking interfaces that are down or modems that have left the system.
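
As an illustration of the kind of per-interface math involved (a hedged sketch with made-up names, not the daemon's actual code), the rate calculation is essentially the delta between two polls divided by the elapsed time:

import "time"

// ifaceSample is a hypothetical snapshot of one counter from one poll.
type ifaceSample struct {
    OctetsIn  int64
    Timestamp time.Time
}

// perSecond returns the average octets-per-second rate between two
// polls. The caller keeps the previous sample in memory for each
// interface and replaces it after every poll.
func perSecond(prev, curr ifaceSample) float64 {
    elapsed := curr.Timestamp.Sub(prev.Timestamp).Seconds()
    if elapsed <= 0 {
        return 0 // same poll or clock skew; avoid dividing by zero
    }
    return float64(curr.OctetsIn-prev.OctetsIn) / elapsed
}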

A Comparative Reflection

Perl’s flexibility allowed for rapid development, but with that flexibility came the risk of less disciplined coding practices. Even with use strict;, the language did not enforce the same level of strictness that Go does. In Go, type safety and explicit data structures promote cleaner, more maintainable code.

JavaScript’s dynamic nature is powerful, especially for web development, but it can lead to unpredictable behavior if not carefully managed. Go’s statically typed nature brings predictability and reliability, essential for building scalable systems.

The Journey Continues

Digging more deeply into Go has been like rediscovering the essence of programming. The concepts of structs, pointers, and type safety have reinvigorated my approach to coding. They have challenged me to think more deeply about how I structure and manage data, leading to more efficient and robust solutions.

As I continue this journey, I look forward to sharing more insights and experiences. The landscape of programming languages is vast and varied, and each new language offers a unique perspective that enriches our understanding and skills.

--Bryan