Vikram Aggarwal

Sunday, July 28, 2024

Book Review: "On The Clock" by Emily Guendelsberger

Summary: "On The Clock" by Emily Guendelsberger tells you about the lives of hourly wage workers. It is a stunning book. Please read it.

Most of the books I read come recommended from friends or book reviews. Most of them get glowing reviews, so it is no surprise when the books are great.

Every so often I pick something up at random, and am pleasantly surprised at its quality. Very rarely, I pick something at random and it is profound and enlightening. This book was a random pick, I expected little.

"On The Clock" by Emily Guendelsberger is profound, enlightening and much more. Emily (I'm Californian, we're on first name basis with everyone) writes about the lives of front-line wage workers in three different companies. She works for a month at an Amazon fulfillment center, then a month at a customer call center, then a month at McDonalds, all in the US. She is honest about her past as a journalist, and observes her life, and the lives of her co-workers.

A lot of ink is spilled on the unfairness of capitalism, most of it from a safe distance, and from a high perch. Most people talking about low-paying jobs have never worked in those jobs themselves, or worked in those jobs decades ago, when conditions were better. CEOs who talk about how well their companies are run have never worked as wage-workers in any company, let alone their own companies. Most consumers have never done these jobs. Emily is the first person who braves these jobs to tell you how terrible they can be. Here's a sample of her message, via an article she wrote on the topic of fast-food worker burnout. The burnout is both metaphorical and literal. A 2015 survey found that 79% of fast-food employees had been physically burnt on the job in the prior year, some more than once.

Burnout is a trendy, rich-person luxury. In these jobs you see far worse. You see income insecurity up close. You see employees choosing to work when sick, you see employees support each other during difficult times, and the unending misery of the work. Workers pushed to their physical limit, on their feet, and walking all day. But hey, single-packages of painkillers are provided by the employer. For free! Or workers stretched to capacity, handling support call after support call, with short breaks in the middle, handling verbally abusive callers. Overworked, pushed, paid too little. No health insurance, no money and no time, with their life revolving around their job. No savings, no retirement, it is a bleak life, and one that most of us don't have an experience with.

Ok, so obvious question, "If these jobs are so bad, why don't these workers do something else...?" The short answer is that all jobs are like this at the bottom. No employer has to provide wholesome, nourishing jobs, so they don't. You can choose between a physically demanding job, or a verbally abusive job, but they are all terrible in some way. Attrition is high, workers are expected to

I'd bet you hate your cell-phone or Internet provider? Why haven't you switched providers? Because the other company is just as bad, or you don't have a choice of providers. Same concept.

This book made me ponder and lament, well past I had read it. I am reminded of Emily's idea of doing a tour-of-duty as a wage worker. Every manager, every CEO, every elected representative should have do a contemporary hourly job. Having flipped burgers in 1980 is not enough, the US employment market has come a long way since then. This book is similar to "Nickel and Dimed", by Barbara Ehrenreich, which is mentioned in the introduction. Things are different this time around, as more companies outsource their dirty work. So while Emily works for McDonalds, Amazon and AT&T, she is never directly employed by them. This way the companies can, correctly, claim that they treat their employees well. The Amazon delivery person is rarely employed by Amazon too. It's the invisible tier of jobs, too shitty to even mention in a company's press reports. Everyone wants a photo-op with "Real America", and then go back to their mansion, their enormous paycheck and their private jet.

This book has the potential to shape the national debate on capitalism, income disparity, and difference in compensation between front-line workers and CEOs.

Please read this book, it is a revelation.

Monday, July 08, 2024

The Oversubscribed American Child

Kids in the Bay Area are oversubscribed, both in summer holidays and on evenings and weekends of school-days. During the school year, children spend time in after-school activities, coming home shortly after 5pm, when the parents are done with their work. Sorry, I misspoke. Children don't even come home themselves, they are driven back by parents.

In the evenings, kids are busy with soccer, football, basketball, piano, chess, martial arts. Sometimes all of the above. The parents are shuttling the kid back and forth to these events, and are themselves unavailable for any social commitments of their own. As a result, the parents social life is restricted to meeting the parents of other kids at all these events.

All these activities are enriching. When you were a child, you did not have access to a tennis court, and your child loves tennis. There is no harm in this one activity. Then again, your child also has an interest in singing, dance, basketball, and swimming. Which activity can we deny them, possibly closing the door to a life-long joy?

Near the swimming pool, you see parents coaxing, encouraging, threatening, cheering as the kid swims laps. The 10 year old is getting ready for the swim team. This happens thrice a week, after school, or before school. There are swim meets, there are practices, and the parent tugs along for the ride.

Sure, the kids have friends, but how deep can a friendship be when you are only available on specific time slots, a few times a week. The swim kids are the kids' friends. Your child needs to make the swim team, to retain their social life.

No longer a tiger mom, there is now a tiger child. In fact, tigers all around: the parents are themselves ambitious, chasing a promotion at work, and a soccer victory for the child, sometimes from the same soccer field. Where do you draw a line between the ambition and desires of the child and that of the parent?

A family just moved in and their kid would play on a few occasions with other children in the neighborhood. A few weeks later, the school year starts. In the evening, this kid was invited to join in a game. Alas, the kid is busy, and is only available from 4:30-5:30pm. On Fridays. That was the only time in the entire week that this kid is free to be a child. My full-time job keeps me busy, and yet, I have more free time.

Ah, but you are independent, and think for yourself. Your kids are free-range. Tell me, how free-range can your children be, when all other children are at camp? During summer vacations, your children are bored stiff at home; as everyone around them is busy all day, unavailable, away at soccer camp, swimming camp, then music camp, week after week for the whole summer? You child begs you to enroll them in the same camp as their friend, that is the only time they can both be together. Parents mail out schedules to see if kids' friends can enroll in similar camps over the summer. Stay free-range and your child has nobody to play with. Enroll your child in some camp, and at least they'll have some social life. And all that enrichment.

Soon, there are no children to be seen. Can you can tell if school is in session from walking around your neighborhood? Does anything change from May to July?

Everyone knows the suburbs are criminally empty. Distances are vast, and every friend is twenty minutes away on foot. The library, the ice-cream shop, ... everything is far. Your child cannot cross that six-lane arterial road by themselves! No bike lanes, and anyway they'll get hit by an SUV that cannot see them. Soon enough, they have to be driven. There are some lucky cities and some fortunate locations, but the vast majority is not navigable by 12-year olds.

"When I grew up", ... starts every critique. This is the usual refrain because parents have access to just two childhoods: their own, and their kids. All conclusions are drawn from these two data-points, until a grandparent comes by to cast a tie-breaking third. Was your generation different from this? What about the grandparents?

I have no doubt that this will lead to many world-class artists, musicians, and athletes. The next Michael Phelps is swimming near you, as their parent yells from shore. But this small amount of Great comes at the cost of a lot of Good. That one Olympic winner will cost us a generation of lost childhoods. Thousands of children will grow up to be detached from others: lonely, anxious, uncomfortable in their skin. Cellphones, social media, TV, gaming will get blamed. Why are kids so lonely these days? Because they don't have friends, and have never learned to cultivate real life friendships. They were ferried around in the backs of SUVs, in comfortable car seats, from the chess club to the soccer meet, and then to ballet. Everything was regimented, sanitized, clinical.

The real world is messy. Being an adult requires comfort with this mess. True friendships grow over time, born in experimentation, nurtured by shared experiences and bad collective judgement. You make a friend when you help the kid with a sprained ankle. When you disagree and fight over how you were treated. You realize you have a true friend when he helps you jump the fence and lies about it. You don't need 'play dates', you need play. You grow to be an adult through the risk-free life of a child, discovering the contours of the world.

Sunday, June 23, 2024

Encrypted, private github repositories

Github allows unlimited private repositories. Sometimes, though, you need additional privacy. I store many of my passwords and other sensitive information inside plain text files, and would like to version them with git and then upload them to github. Even with private repositories, these text files are available to github employees, stored in cleartext in backups, available to law enforcement, etc. I'd like these files to be stored with encryption at Github, to ensure that only I have access to these files. While I could do this with my own git instance, I like the availability, backups and worldwide reach of github, and I want to avoid maintaining one more thing.

This tutorial shows you how to have encrypted repositories at Github. In an encrypted repository, the contents of the repository at Github are encrypted with GnuPG. On all machines outside your own, these files are encrypted and secure. Nobody, not even employees at Github can examine the contents of the files. Without your gpg secret key, the entire repository at github is opaque to people.

The repositories can be private or public. For an encrypted repository, there is little difference between a public and a private repository, as the contents are not visible without your secret gpg key. The benefits to having a private repository is that nobody can tell the repository exists, and you have an additional hurdle of an ssh key to download the repository. The benefit of a public repository is that you can have many contributors, and you can git clone the repository from anywhere, as far as you have the gpg secret key to decrypt the contents.

For this mini-tutorial, you should already have git working with ssh keys. If you don't, follow these instructions. This tutorial is for Linux.

1. Packages: On Linux, install the relevant packages:

apt install git-remote-gcrypt gnupg

2. Github repository: Create a private repository in github. I created an example public repository here though you might want to create a private repository.

export PROJECT=username/reponame.git

3. Create gpg keys: if you don't already have one using

gpg --full-generate-key

Remember the passphrase used here. If you lose/forget this passphrase, you will not be able to decrypt this repository, ever. The keyring is usually stored in .gnupg/pubring.kbx

List the keys, and write down the key-id:

gpg --list-keys

export KEY_ID=<what you got above>

For example, my KEY_ID is BAB59DAD24A79C69D4D3E5451482830DA6C99CAA.

4. Configure git: Go over to your existing git repository (or initialize one, and then add files to it)

Specify this ID as the one that you will encrypt with:

git config --global user.signingkey $KEY_ID

And as the participant allowed to contribute to this project

git config remote.origin.gcrypt-participants $KEY_ID

5. Remote: Add the upstream remote like this:

git remote add origin gcrypt::git@github.com:$PROJECT

git push origin master

That should prompt you for your gpg password you entered when creating your key. Now, you have a git repository that is hosted at github, and the contents are encrypted. This is an example of what you should see. At this point, the repository could be private or public, since nobody can read the contents unless they have your GPG key.

6. Clone: To clone this repository on another machine, remember to add gcrypt:: before the git@github.com, and the new machine should have both the ssh keys (to clone a private repository) and the gnupg keys (to decrypt the contents of the repository)

git clone gcrypt::git@github.com:$PROJECT

For example, I do:

git clone gcrypt::git@github.com:youngelf/example-encrypted-repository.git

Done right, it should request the gpg password for the private key, and show you this output from gpg

gpg: Good signature from "Your name <email@example.com>"

7. git pull, git add, git push

At this point, this is a normal git repository, you can use git commands. I find it more convenient to use the git clone'd directory, as it sets the configuration for branch.master.merge and branch.master.remote correctly.

Friday, August 12, 2022

Python, pip, venv, package managers

Upgrading an old computer from one Ubuntu LTS version to the next: 20.04 to 22.04,...

I came across an annoying error when running do-release-upgrade:

AttributeError: 'UbuntuDistroInfo' object has no attribute 'get_all'

This shows up in other forms, folks online complain about errors of the kind:

AttributeError: 'DistUpgradeController' object has no attribute 'tasks'

Now the usual reaction is to blame Ubuntu's release. Free Software, and it's lack of reliability, ...

Except that I've upgraded several computers from 20.04 to 22.04, so I know that the process works well. Quite unlike my usual nature, I decided to investigate.

The fix: pip uninstall distro_info

I'll spare you the full investigation, because life is short. The end-result is that this is caused by Python's incredibly brittle package management. On an Ubuntu system, you have both packages installed by the package manager (dpkg/apt) and by pip. And pip can install software for the whole system, when run by root. In this case, the package 'distro-info' was installed by root using pip, and that overrides whatever distro_info object exists in Python. Since the Ubuntu installer expects certain behavior from the class UbuntuDistroInfo, the package install fails early.

The fix, again: pip uninstall distro_info

Having solved it, let's reflect on the incredibly broken nature of Python packaging. Why was a user unable to upgrade their Ubuntu system, and why was pip at the bottom of the mess?

First, the package namespace should be unique. The package distro_info should either not conflict, or Ubuntu's package manager should create a unique name that doesn't conflict. It should be called 'ubuntu_distro_info'. If these are maintained by the same team, then new versions should be very careful removing methods from previous versions. Software versioning is complicated, but there are many lessons learned over the years.

Second, the parallel universes that exist between pip and dpkg/apt are a mess. root should not be allowed to pip install packages.

Third, venv is a crutch. Package maintenance is difficult, and dependencies are tied specific version of packages, and so you need virtual environments to create each parallel universe. This gets the job done, but pushes a lot of maintenance headache to the end-user of the packages. Now, in addition to the dpkg/pip mess, you individual pip messes in subdirectories scattered all over your filesystem.

Fourth, python versus python3. In Debian-based systems, python and python3 have different namespaces for packages, and this further confounds the issue. So if you install python-libraryname, you might also have to install python3-libraryname.

Finally, python errors are broken. This is a pet peeve of mine. Python's errors are the bottom of the stack, and the full stack trace. But these are completely unhelpful in explaining what might be involved. If package versions are in flux all the time, python software should start off verifying versions and sanity. Imagine if this failure had happened half-way in the install process, leading to a broken machine. Fragile systems need defensive programming.

All this leads to an incredibly brittle software setup: where packages are frozen in time (in venv directories), but also no way of querying what versions of software exist on the system. Once an environment is set-up, there is little confidence that it will continue working.

Sunday, February 27, 2022

Christone "Kingfish" Ingram

I recently heard an artist called Christone Ingram, who goes by the stage name "Kingfish". He plays the blues on guitar. It blew me away: the guitaring, the singing. After many years, I have come across a contemporary artist who is awe-inspiring.

It all began when I chanced upon an album called "662" by someone holding a Stratocaster. I wasn't expecting much. What could have been just another blues musician turned out to be a genius of our time, for each song was delightful.

If you like blues, or classic rock, give it a listen.
This Youtube video, for example, gives you some idea of his level of skill. You can also preview his music on his website: the album I heard is called "662", and that's his second album. His first album is called "Kingfish".

I find 'influencers' hollow. Many current influencers are popular solely because they're popular. Some of them might be good looking, but they have no skill beyond that accident of birth. For every influencer that takes our attention, there are real artists, folks with skill, folks working hard. We ought to devote our time on folks who have skill, who advance art.

Sunday, February 20, 2022

Book Review: Systems Performance 2nd ed, by Brendan Gregg

Summary: "Systems Performance", by Brenden Gregg covers end-to-end performance for Linux-based systems. If you run Linux software, you will learn a lot from this book.

From its rough and loose beginnings, Linux has become a force in the commercial world. Linux is the most pervasive, most readily available system that you can experiment with. Starting from the $10 Raspberry Pi to the multi-million dollar Top 500 supercomputers, Linux runs on everything: laptops, desktops, phones, cloud instances.

Despite widespread adoption, there is little documentation to get a thorough understanding of system performance. I routinely see veteran engineers struggle with performance bottlenecks. Folks revert to running 'top', and trying to infer everything from its limited output. The easy answer is to over-provision hardware or cloud instances to cover up sloppy performance. A better answer is to get a solid understanding of end-to-end performance; to find and eliminate bottlenecks.

"Systems Performance", by Brenden Gregg covers the entire area of end-to-end performance of all components: CPU, RAM, network, block devices. The second edition of this book is focussed on Linux, and covers many tools and utilities that are critical to understanding every level of the stack. If you have written any software on Linux, or intend to write any software on Linux, you need a copy.

First, the good:

There is an overview at the beginning, and then a deep-dive on specific system resources (CPU, RAM, block devices, network). You read the overview to understand the system at the top-level, and based on your system and bottlenecks, you can read the in-depth sections.
There's coverage of pre-BPF tools (perf, sar, ftrace) in addition to the newer BPF-era tools like bcc and bpftrace. 'perf' probes are easier to use, and available on more architectures, for instance. BPF-based tools can be a slog to install, or might not have good support on fringe architectures and older kernels. No single tool can cover every need, and good engineers need to understand the full tool landscape. This book provides a wide overview of most tools.
The book provides a methodical look of the full system, with tools targeting individual levels of the system components (example diagram). This process helps isolate the problem to the correct component.

The not-so-good:

The book is repetitive. Since it expects some readers will start reading a deep-dive, it repeats the USE methodology at the start of most chapters. Folks reading it cover-to-cover will find themselves wondering if they have seen the material already.
Print quality is worse than the previous edition. The fonts are thin and dim, the pages bleed through, and the graphs need more contrast. The first edition was a high quality printed book, and the second edition is worse in this department. Since this is a reference book, a physical copy is better than an ebook. You will mark pages, put sticky notes, and highlight tools that are more pertinent to your work. Luckily, the binding holds up to heavy use.
I really wish the third edition comes with better print quality, and is hard-bound.

Every software engineer should be familiar with end-to-end performance: how to think about it, how to locate trouble spots, and how to improve the system. This book will give you a firm foundation of performance that should help on most desktop, server, and cloud systems.

You will probably not get this understanding from a scattershot reading of online documentation and Stack Overflow articles. Online articles are limited in scope and accuracy, and don't provide a comprehensive view of how to think about performance. This topic deserves a book-length treatment.

Image Courtesy: Brenden Gregg