GitHub Copilot and Childlike Wonder

Around 9 months ago I was impressed when GitHub Copilot was making the rounds on HN and Reddit, so I decided to enroll in the technical preview on my personal computer. I suppose my application finally came to term, as I saw auto-completions for the first time today while fiddling with some JavaScript. Given that my personal projects have gathered an embarrassing amount of dust in the last few months (new project idea: dustoff, a tool that will periodically send you friendly reminders to work on personal stuff, ideally without pointing a blunderbuss at you… hmm yes, on second thought I think I know why these projects gather dust), I’m not exactly sure when the access kicked in. Now that I have it, my mind has been blown.

The first time I was truly impressed by the overloaded concept that is “artificial intelligence” came in the late 2000s when I learned about Cleverbot. Other than moving to a less serif-y font like Google did in 2015, the site’s design has mostly stayed the same. I’m fairly sure the same can be said about the data and algorithm working behind the scenes as well. An article referred to in the accompanying Wikipedia page serves as the best background I could find for how this system works, touting a jaw-dropping corpus of 3 billion conversational interactions. I imagine this data can be purchased by inquisitive researchers and supplements the website’s (polite) advertising revenue. Of course, when I first experienced Cleverbot as a kid, I hadn’t the faintest clue about how the algorithm actually worked or how the owners might make money to cover the cost of the website. It was magic and filled me with childlike wonder.

Now that I am less of a child and more of an adult (still a ways to go!), I am starting to understand more about what people mean when they say “childlike wonder”. Age sometimes colors early memories and experiences in a shroud of amazement and awe. I remember when I was a kid climbing up the rickety wooden stairs in my grandparents’ house; the space of the new environment seemed to stretch infinitely upwards. The novelty of bedrooms on the third floor or a tucked-away closet felt exciting and ripe for exploration. So too did talking to Cleverbot about the weather, its own sentience, and favorite colors. Even though I quickly reached the actual limits of these things (my grandparents did not own infinite bedrooms and Cleverbot can only parrot brief quips as it loses the conversation faster than an uninterested date), that wonder stays with me.

The next time I had a magical experience with AI was when someone on Reddit recommended AI Dungeon, a text adventure game driven by GPT-3. My childhood fantasy of being able to build and live in my own universe was finally realized. I fought ogres and bandits, cast spells, and collaborated with the AI to bend reality into a story of my liking. The text adventure format lends itself particularly well to GPT’s output, and I found myself amused by the video game’s return to its ancestral home in Zork. I am excited to see what future games and entertainment will be produced as these models continue to improve. As with most of these projects, the limits become obvious once you look closely at the superficial prose and ensemble of pre-seeded plots and worlds. Unlike some poetry and ghostwritten famous authors, this blog post features no machine-written words, out of my desire not to trick you (and because I can’t figure out how to easily test out GPT-3 outside of AI Dungeon, which is probably for the best considering its potential for abuse).

In its own words, GPT-3 seems poised to put a NYT columnist out to pasture. Today, while playing around with GitHub Copilot (which uses a modified version of GPT-3), I found within myself a strange tension between childlike wonder and adultlike nervousness. I admired the AI’s ability to write out real, mostly functional code from conversational comments. While the layers of technologies that constitute the modern computing experience rapidly devolve into magic even for me, I was astonished to see the layer I occupy as a programmer become abstracted away by a machine before my eyes with every press of the autocomplete tab. However, programming and software engineering are different things (after all, some Googlers say “software engineering is programming integrated over time”), and just as GPT-3 fails to produce a consistent storyline, so too does Copilot fail to produce complicated software. Copilot requires explicit instructions that break a complicated problem down into logical chunks, which it then fills in with code regurgitated from GitHub’s immense training material. Perhaps Copilot can simply be considered a natural language interface to Stack Overflow and code search?

Just as my grandparents’ home did not really stretch hundreds of floors into the sky, the AI in our lives today has its limits once you look closely. In spite of this, it fills me with delight and childlike wonder to be alive and part of the endless exploration of a new frontier.

Addendum: the “copiloted” sample that inspired this post

I wrote the prompt commented above each function and then “tabbed” until Copilot finished. This exact code produces the incorrect output below because it fetches the top posts rather than the top subreddits, and the display_name key doesn’t exist on posts.

(50) [undefined, undefined, undefined, ... undefined, undefined, undefined]

However, if we step in and change display_name to subreddit to get the subreddit each post belongs to, we get the correct output below, although correctness is debatable since the definition of “top subreddits” is ambiguous. Side note: Redditors really like to emphasize their communities with the F-word.

(50) ['movies', 'nextfuckinglevel', 'aww', 'memes', 'MadeMeSmile', 'Unexpected', 'meirl', 'oddlysatisfying', 'nextfuckinglevel', 'WatchPeopleDieInside', 'nextfuckinglevel', 'worldnews', 'WhitePeopleTwitter', 'MadeMeSmile', 'oddlyterrifying', 'politics', 'shitposting', 'memes', 'rareinsults', 'antiwork', 'pics', 'Damnthatsinteresting', 'meirl', 'BlackPeopleTwitter', 'HolUp', 'aww', 'shitposting', 'aww', 'ThatsInsane', 'HumansBeingBros', 'lotrmemes', 'technology', 'WhitePeopleTwitter', 'funny', 'place', 'shitposting', 'PoliticalHumor', 'dankmemes', 'tumblr', 'funny', 'dankmemes', 'Showerthoughts', 'OldSchoolCool', 'HolUp', 'memes', 'ukraine', 'clevercomebacks', 'CrazyFuckingVideos', 'gaming', 'AnimalsBeingDerps']

Copilot did a great job of stamping out the boilerplate that has always been an integral part of writing code. This particular code definitely needed some handholding and editing, and should not be run in production (it would be better to use the actual Reddit API, and the sample’s error handling is nonexistent). I see this becoming a useful addition to the programmer’s toolbox in the future. To quote some of the above subreddit names, Copilot is “nextfuckinglevel” but also “oddlyterrifying”.

Fetch the Top 100 Subreddits

// fetch the top 100 subreddits
const fetchSubreddits = async () => {
  // async function
  const response = await fetch(
    // await the fetch
    "https://www.reddit.com/r/all/top.json?sort=top&t=day&limit=100"
  ); // fetch the url  and wait for it to return   the response
  const data = await response.json(); // wait for the response to be json and store it in data
  return data.data.children; // return the data
};

// print the top 50 of the subreddits
const printSubreddits = async () => {
  // async function to print the subreddits
  const subreddits = await fetchSubreddits(); // await the fetchSubreddits function
  const topSubreddits = subreddits.map(
    (subreddit) => subreddit.data.display_name
  ); // map the subreddits to display_name
  console.log(topSubreddits.slice(0, 50)); // print the top 50 subreddits
};

// console.log the top 50 subreddits
printSubreddits();
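
For reference, here is a minimal sketch of the corrected sample, with the display_name to subreddit change applied and a basic status check added. It still uses the same public JSON endpoint rather than the official Reddit API. The extractSubreddits and fetchTopPosts helpers and the fetchImpl parameter are hypothetical additions, not part of what Copilot generated; they exist only so the parsing can be exercised without hitting the network. It assumes a runtime with a global fetch (a browser, or Node 18+).

```javascript
// Same public listing URL as the original sample (not the official Reddit API).
const TOP_POSTS_URL =
  "https://www.reddit.com/r/all/top.json?sort=top&t=day&limit=100";

// Pull the subreddit name out of each post in a listing response.
// Posts carry a `subreddit` key; `display_name` only exists on subreddit objects.
const extractSubreddits = (listing) =>
  listing.data.children.map((post) => post.data.subreddit);

// Fetch the day's top posts.
// `fetchImpl` is a hypothetical injection point so a fake fetch can be used in tests.
const fetchTopPosts = async (fetchImpl = fetch) => {
  const response = await fetchImpl(TOP_POSTS_URL);
  if (!response.ok) {
    // Surface HTTP failures instead of trying to parse an error page.
    throw new Error(`Reddit request failed with status ${response.status}`);
  }
  return response.json();
};

// Log (and return) the subreddits of the first 50 top posts.
const printSubreddits = async (fetchImpl = fetch) => {
  const listing = await fetchTopPosts(fetchImpl);
  const names = extractSubreddits(listing).slice(0, 50);
  console.log(names);
  return names;
};
```

Calling printSubreddits() with no arguments uses the global fetch and logs up to 50 subreddit names, duplicates included, since several top posts can come from the same community.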
