Working with ChatGPT on my blog

No, I am not asking ChatGPT to write blog posts for me, and I never will. I have added the “Written by Human not by AI” logo to my blog, which means: “Use this badge if your article, including blog posts, essays, research, emails, and other text-based content, contains less than 10% of AI output.”

I did ask for ChatGPT’s help to add that logo, though: I needed to figure out where to insert an HTML tag with a particular image asset. This is an example of how I am working with ChatGPT to improve my blog (hence the title says “working”, not “worked”). In this post, I will describe what this working relationship looks like, its benefits, and its catches.

It helps me add new features to my blog

Here “it” means “working with ChatGPT”, not ChatGPT itself.

My blog is built on Jekyll, a blogging framework, with a theme called Minimal Mistakes, and is hosted on GitHub Pages. In 2020 I wrote about the history and all the updates that led to this setup. Since then, I have not made any major changes, and not for lack of ideas. I’ll touch on the reasons later.

In the wake of ChatGPT’s success, I decided to ask it to help me with one such idea. Since my blog posts are written in either Chinese or English, I want the miscellaneous buttons and labels on each post page to match the post’s language. E.g. at the end of this page you will see “Tags” and “Updated” labels, “Previous” and “Next” buttons, etc. For a post written in Chinese, I would like to display “标签”, “更新时间”, “上一页”, “下一页” instead. So I asked:

Can I set locale at individual page level rather than site level? I tried to use zh_CN for one of the post but its HTML still has there

ChatGPT said:

To set the locale for an individual page, you can add a lang attribute to the HTML tag in the page’s layout file.

And they provided example code, explained the code, and kept going:

Additionally, you can set the Open Graph og:locale meta tag for an individual page by adding it to the page’s front matter.

Of course they gave example code and explanations.

But the code did not magically work. For the first idea, I didn’t know where to change an individual post’s HTML (it’s managed by Minimal Mistakes somewhere). For the second idea, I quickly tried it, and it didn’t work. So this was only the start of a conversation, which matches people’s general experience with ChatGPT: you can’t expect it to work magically; you need to iterate on your prompts.

As you can see, the feature is now implemented, not because ChatGPT ultimately gave me the answer, but because I leveraged ChatGPT’s help to figure out the solution myself. That brings me to my next point.

It serves as a pair programmer

Here “It” means ChatGPT.

Starting from that first conversation, I kept using ChatGPT as a pair programmer to find the proper next steps.

With my current blog setup, the framework already handles certain jobs for me, which keeps my maintenance effort minimal. The downside is that sometimes I do not know where a certain piece of the implementation lives. To figure that out, I would have to search online, read the documentation (if any), and deploy the site to try things out, with no guarantee that they would work. That’s the reason I had not updated my blog for a while. Back to the initial conversation: although the immediate answer didn’t work, it gave me some hints. Since ChatGPT mentioned the “page’s layout file”, I probably needed to figure out where the layout file is defined.

So first I asked ChatGPT about it directly, making sure to include as much information as I could:

You asked me to add a lang attribute to the HTML tag in the page’s layout file, I cannot find such layout file. I am using a plugin called jekyll-archives, and what I usually do it to just set some metadata like this: … Is there a way to tell this layout to use a different locale?

Then I got this answer:

If you’re using a layout file that’s provided by a plugin or theme, you may need to check the documentation for that plugin or theme to see if it supports customizing the lang attribute for individual pages.

So ChatGPT didn’t know the answer, but the “check the documentation” part was indeed the next step I wanted to try. I went ahead and checked the Minimal Mistakes documentation, because I believed it had some code somewhere that controls my page layout. This time, I found a critical piece of information that for some reason I had missed in the past:

When installing the theme as a Ruby Gem its layouts, includes, stylesheets, and other assets are all bundled in the gem. Meaning they’re not easily visible in your project. Each of these files can be modified, but you’ll need to copy the default version into your project first. For example, if you wanted to modify the default single layout, you’d start by copying it to _layouts/single.html.

Then I took a look at the source code of _layouts/single.html, and this line caught my eye:

<h2 class="page__related-title">{{ site.data.ui-text[site.locale].related_label | default: "You May Also Enjoy" }}</h2>

There is site.data.ui-text[site.locale]! This suggests that, in theory, I can pass either language as the key to the site.data.ui-text dictionary and get the matching text. So where is site.data.ui-text defined? This time I did not have to ask ChatGPT; I just found it in the Minimal Mistakes GitHub repository. It already has translations of the different UI components.
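To picture what that lookup does, here is a minimal Python sketch of a per-language label table with a default. It is only an analogy for the Liquid expression, not the theme's code: the UI_TEXT contents, the ui_label helper, and the fallback to the site locale when a page sets no language are all assumptions made for illustration.

```python
# Illustrative stand-in for site.data.ui-text. The keys and strings below are
# assumptions for this sketch, not a copy of the theme's actual data file.
UI_TEXT = {
    "en": {"tags_title": "Tags", "date_label": "Updated"},
    "zh": {"tags_title": "标签", "date_label": "更新时间"},
}


def ui_label(key, page_lang=None, site_locale="en", default=""):
    """Look up a UI label by language, roughly like the Liquid expression
    site.data.ui-text[locale].key | default: "..."."""
    # Hypothetical fallback: use the site locale when the page sets no language.
    lang = page_lang or site_locale
    # `or default` mimics Liquid's default filter for missing or empty values.
    return UI_TEXT.get(lang, {}).get(key) or default


print(ui_label("tags_title", page_lang="zh"))  # 标签
print(ui_label("tags_title"))                  # Tags
print(ui_label("related_label", page_lang="zh", default="You May Also Enjoy"))
```

The last call falls through to the default because this toy table has no related_label entry for zh, which is the same thing the default filter does in the layout line above.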

At this point my action items became much clearer:

  1. Copy the layout files over to my local repository and change them to use the page’s locale instead of site.locale.
  2. Figure out a way to know each blog post’s language.

At the end of the post I will summarize how it was done. But before that I have another point to make.

It misleads you if you do not ask in a proper way

I decided to follow up on the first action item myself and consult ChatGPT for the second. Since I already have a Python script that does certain manipulations on my blog posts, I asked ChatGPT:

I need a piece of Python code to check if a file is English or Chinese, basically: given a file path, check the content of file, if the content is mostly Chinese, return ‘zh’, otherwise return ‘en’, can you write it for me?

Of course ChatGPT is ready to help and quickly wrote some code:

import langid

def detect_language(file_path):
    with open(file_path, 'r') as f:
        text = f.read()
        lang, _ = langid.classify(text)
        if lang == 'zh':
            return 'zh'
        else:
            return 'en'

So far so good! I had not known about the langid library before, so I learned something new, yay! But when I embedded the code into my script it didn’t work, so I asked a follow-up:

It seems not working: for every file in my folder, it has same score even though the content is different: is it only reading the first line of file? How to force langid read the full file and classify?

Here is ChatGPT’s answer:

By default, langid reads only the first 1000 bytes of a file to classify its language. If you want to read the full file, you can use the classify method with the full=True parameter

I followed the suggestion and called langid.classify(content, full=True), and I immediately got an exception: TypeError: classify() got an unexpected keyword argument 'full'. The fact is that the classify function doesn’t have a full parameter, and ChatGPT just made it up, most likely because it went along with the hypothesis in my prompt (“is it only reading the first line of file? How to force langid read the full file and classify?”). langid’s source code confirms that no full parameter exists.
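Besides reading the source, one quick way to check whether a suggested keyword argument is real is to inspect the function's signature before calling it. The sketch below is only an illustration: the classify function here is a hypothetical stand-in with a single parameter (so the example runs without langid installed), and accepts_parameter is a helper name I made up.

```python
import inspect


def classify(instance):
    """Hypothetical stand-in for a library function with one parameter."""
    return ("en", 0.0)


def accepts_parameter(func, name):
    """Return True if `func` declares a parameter `name` or accepts **kwargs."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


print(accepts_parameter(classify, "full"))      # False
print(accepts_parameter(classify, "instance"))  # True
```

With the real library installed, the same helper could be pointed at langid.classify instead of the stand-in.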

The real issue, I later figured out, was my own mistake. I had embedded ChatGPT’s code into my script, and my code became:

with open(filePathStr) as f:
    post = frontmatter.load(f)
    text = f.read()
    lang, score = langid.classify(text)
    print(f"Detected file {filePath.name} lang is {lang}, score is {score}")

When post = frontmatter.load(f) was called, it read the entire file and left the file pointer at the end of the file, so the following f.read() returned an empty string for every file. The real fix is to add f.seek(0) (to move the file pointer back to the beginning) before text = f.read(). But I did not think of that. Instead I formed a hypothesis, gave the hypothesis to ChatGPT, and ended up with a made-up API. If I had just pasted my buggy code and the unexpected results I saw, I would probably have gotten a correct answer on the first try. That’s why proper prompting matters a lot.
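The file-pointer behavior is easy to reproduce with the standard library alone. In this minimal sketch the first f.read() stands in for frontmatter.load(f) consuming the stream; read_twice and the temporary demo file are hypothetical names used only for the illustration.

```python
import os
import tempfile


def read_twice(path, rewind):
    """Read a file twice through one handle, optionally rewinding in between."""
    with open(path, encoding="utf-8") as f:
        first = f.read()    # stands in for frontmatter.load(f) consuming the file
        if rewind:
            f.seek(0)       # move the file pointer back to the beginning
        second = f.read()   # stands in for the later text = f.read()
    return first, second


# Write a small demo post to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".md", encoding="utf-8", delete=False
) as tmp:
    tmp.write("---\ntitle: demo\n---\nHello, 你好\n")
    demo_path = tmp.name

without_rewind = read_twice(demo_path, rewind=False)
with_rewind = read_twice(demo_path, rewind=True)
os.remove(demo_path)

print(repr(without_rewind[1]))           # '' because the pointer is at end of file
print(with_rewind[1] == with_rewind[0])  # True once seek(0) rewinds the handle
```

Every file producing the same empty string is exactly why the classifier reported the same score for every post.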

Summary: ChatGPT is helpful for side projects

All in all, it is a pleasant experience to work with ChatGPT on side projects like a personal blog. For bloggers who have some code to write and maintain, ChatGPT helps you understand that code (so that you can make a stale blog active again) and helps you write new code to improve it.

For people who happen to be in a situation like mine and want the answer, I am also sharing my solution:

Situation: I have a Jekyll-based blog that is hosted on GitHub Pages, uses Minimal Mistakes, and has posts in different languages, and I want to show labels and buttons in each post’s language.

Solution:

  1. From the Minimal Mistakes gem installation path (run gem info minimal-mistakes-jekyll to get the installed path), copy several files, including _layouts/single.html, _includes/comments.html, _includes/post_pagination.html, _includes/footer.html, etc., over to the corresponding folders (_layouts, _includes) in your local blog folder.
  2. In those files, replace site.data.ui-text[site.locale] with site.data.ui-text[page.lang].
  3. For each blog post, add a front matter entry to indicate its language, like lang: en, lang: zh, etc. I can share my Python script that goes through each post to backfill this later.
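As a rough idea of what such a backfill could look like, here is a minimal standard-library sketch. It is not my actual script: it swaps langid for a simple heuristic that counts CJK characters, and the function names, the 0.3 threshold, the *.md glob, and the _posts default are all assumptions you would adjust for your own blog.

```python
import re
from pathlib import Path

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")          # CJK Unified Ideographs block
LANG_KEY = re.compile(r"^lang\s*:", re.MULTILINE)  # an existing lang entry


def detect_lang(text, threshold=0.3):
    """Return 'zh' if at least `threshold` of non-space characters are CJK."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return "en"
    cjk_count = sum(1 for c in chars if CJK_CHAR.match(c))
    return "zh" if cjk_count / len(chars) >= threshold else "en"


def add_lang(post_text):
    """Add a `lang: xx` entry to the YAML front matter if it has none."""
    if not post_text.startswith("---\n"):
        return post_text                  # no front matter, leave untouched
    end = post_text.find("\n---", 3)      # start of the closing delimiter
    if end == -1:
        return post_text                  # unterminated front matter
    header = post_text[4:end]
    if LANG_KEY.search(header):
        return post_text                  # already tagged
    lang = detect_lang(post_text[end + 4:])
    prefix = "---\n" + (header + "\n" if header else "")
    return prefix + f"lang: {lang}" + post_text[end:]


def backfill(posts_dir="_posts"):
    """Rewrite every *.md post under `posts_dir` that lacks a lang entry."""
    for path in sorted(Path(posts_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        updated = add_lang(text)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
```

Calling backfill() from the blog's root folder would then tag each untagged post once; running it again changes nothing, because posts that already have a lang entry are skipped.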
