Singing a New Tune in AI – Prompt Tuning

We are all well aware that AI is the hottest topic everywhere. You couldn’t turn around in 2023 without hearing someone talk about it, even if they didn’t know what the heck it was, or is. People were excited, or afraid, or some healthy combination of both.

From developers and engineers to kids and grandmas, everyone wanted to know a thing or two about AI and what it can or cannot do. In my line of work, people were either certain it would take away all of our jobs or certain that it was the pathway to thousands of new jobs.

Naturally I can’t say for certain what the future holds for us as tech writers, but I can say this – we as human beings are awful at predicting what new technologies can do. We nearly always get it wrong.

When the television first arrived, there were far more who claimed it was a fad than those who thought it would become a staple of our lives. The general consensus was that it was a mere flash in the pan, and it would never last more than a few years. People simply couldn’t believe that a square that brought images into our homes would become a thing that eventually brought those images to us in every room of our homes, twenty-four hours a day, offering news and entertainment, delivering everything we needed all day and all night. They couldn’t fathom that televisions would be so crystal clear and so inexpensive that every holiday season the purchase of a bigger, better, flatter, thinner television would be a mere afterthought.

And yet here we are.

So now that we’ve got that out of the way, on to total world domination!

But seriously.

If you aren’t already using AI, or at least Gen AI in the form of something like ChatGPT, where are you, even? At least have a little play around with the thing. Ask it to write a haiku. Let it make an outline for your next presentation. Geez, it’s not the enemy.

In fact, it’s so much not the enemy that it can help you outline your book (like I’ve done), revise a paragraph (like I’ve done), or tweak your speech (like I have done many, many times). The only thing you really need to understand here is that you are, indeed, smarter than the LLM. Well, mostly.

The LLM, or large language model, has access to a significantly grander corpus of text than you can recall at any given moment. That’s why you’d be less likely to win on Jeopardy if you were competing against one. It’s also why an LLM competitor might make some stuff up, or fill in some fuzzy details, if you ask it to write a cute story about your uncle Jeffrey for the annual Holiday story-off. (What? Your family does not actually have an annual story-off? Well, get crackin’ because those are truly fun times…fun times…). The LLM knows nothing specific about your uncle Jeffrey, but does know a fair bit about, say, the functioning of a carburetor if you need to draft a paragraph about that.

The very, very human part is that you must have expertise in how to “tune” the prompt you offer to the LLM in the first place. And the second place. And the third place!

Prompt tuning is a technique that lets you adapt LLMs to new tasks by training a small number of parameters rather than the whole model. Prompt text is added to guide the LLM toward the output you want, and the technique has gained quite a lot of attention in the LLM world because it is both efficient and flexible. So let’s talk more specifically about what it is, and what it does.

Prompt tuning offers a more efficient approach than fine-tuning the entirety of the LLM, which means faster adaptation as you move along. It’s also flexible: you can apply tuning to a wide variety of tasks, including NLP (natural language processing), image classification, and even code generation. And with prompt tuning, you can inspect the parameters of your prompt to better understand how the LLM is guided toward the intended outputs, which helps us understand how the model makes decisions along the way.
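
To make the “small number of parameters” idea concrete, here is a toy Python sketch. This is not a real training loop and the names are my own inventions: the point is only that the model’s weights stay frozen, and just a handful of “soft prompt” vectors get prepended to the embedded input and trained.

```python
import random

# Toy illustration of the core idea behind prompt tuning:
# instead of updating the billions of weights in the LLM itself,
# we learn a handful of "soft prompt" vectors and prepend them
# to the embedded input. Only these few vectors are trainable.

EMBED_DIM = 4        # toy embedding size (real models use thousands)
PROMPT_LENGTH = 3    # number of learnable soft-prompt vectors

def make_soft_prompt(length=PROMPT_LENGTH, dim=EMBED_DIM, seed=0):
    """Initialize the small set of trainable parameters."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(length)]

def embed_tokens(tokens, dim=EMBED_DIM):
    """Stand-in for the frozen model's embedding layer (deterministic toy)."""
    return [[sum(map(ord, t)) % 97 / 97.0] * dim for t in tokens]

def build_model_input(soft_prompt, tokens):
    """Prepend the learned vectors to the (frozen) token embeddings."""
    return soft_prompt + embed_tokens(tokens)

soft = make_soft_prompt()
model_input = build_model_input(soft, ["petite", "dresses"])
# The frozen LLM now sees PROMPT_LENGTH extra "virtual tokens" in front.
print(len(model_input))  # 3 soft vectors + 2 token embeddings = 5
```

In a real setup (for example, with a library like Hugging Face’s PEFT), gradient descent would update only those soft-prompt vectors while the base model stays untouched.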

The biggest obstacle when getting started is probably designing an effective prompt at the outset. To design an effective prompt, it is vital to consider the context and structure of the language in the first place. You must imagine a plethora of considerations before just plugging in a prompt willy-nilly, hoping to cover a lot of territory. Writing an overly complex prompt in hopes of winnowing it down later might seem like a good idea, but in reality what you’ll get is a lot of confusion, resulting in more work for yourself and less efficiency for the LLM.

For example, say you work for a dress designer that creates clothing for petite women, and you want to gather specific insights about waist size, without irrelevant details like shoulder width or arm length, and without information about competing companies. The challenge is to write a prompt broad enough to ask the AI model about your focus area (petite dresses) while filtering out unrelated information and avoiding details about competitors in the field.

Good Prompt/Bad Prompt

Bad prompt: “Tell me everything about petite women’s dresses, sizes 0 through 6, 4 feet tall to 5 feet 4 inches, 95 lbs to 125 lbs, slender build by American and European designers, and their products XYZ, made in ABC countries from X and Y materials.”

This prompt covers too many facets and is too long and complex for the model to handle efficiently or return valuable information from. It may not understand the nuances with so many variables.

A better prompt: “Give me insights about petite women’s dresses. Focus on sizes 0 to 6, thin body, without focusing on specific designers or fabrics.”

In the latter example, you are concise and explicit, while requesting information about your area of interest, setting clear boundaries (no focus on designers or fabrics), and making it easier for the model to filter.
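
If you assemble prompts like this programmatically, a tiny helper can enforce the pattern of the better prompt: one focus area, explicit inclusions, and clearly stated exclusions. This is an illustrative sketch with made-up names, not a real library:

```python
# Hypothetical helper that builds prompts the way the "better" example
# above does: concise, explicit, with clear boundaries for the model.

def build_prompt(focus, include, exclude):
    """Assemble a focused prompt with explicit inclusions and exclusions."""
    parts = [f"Give me insights about {focus}."]
    if include:
        parts.append("Focus on " + ", ".join(include) + ".")
    if exclude:
        parts.append("Do not focus on " + " or ".join(exclude) + ".")
    return " ".join(parts)

prompt = build_prompt(
    focus="petite women's dresses",
    include=["sizes 0 to 6", "slender builds"],
    exclude=["specific designers", "fabrics"],
)
print(prompt)
# Give me insights about petite women's dresses. Focus on sizes 0 to 6,
# slender builds. Do not focus on specific designers or fabrics.
```

Keeping the exclusions in their own sentence makes the boundaries easy for both the model and your teammates to see at a glance.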

Even with the second prompt, there is the risk of something called “overfitting,” where the prompt becomes too specific and the model’s responses grow too narrow. When that happens, you refine the prompt by adding or removing detail, depending on which direction you need to move.

You can begin a prompt tune with something like “Tell me about petite dresses. Provide information about sizes and fit.” From there, you can layer in detail, refining the parameters as the LLM learns the context you seek.

For example, “Tell me about petite dresses and their common characteristics.” This allows you to scale the prompt, gauge the available training data and its accuracy, and efficiently adapt your prompt without risking hallucination.
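
The gradual refinement described here can be sketched as a simple loop: start broad, then append one constraint at a time and compare responses. In the sketch below, `ask_llm` is just a placeholder for whatever model client you actually use:

```python
# Sketch of an iterative refinement loop: start with a broad prompt,
# then add one constraint per round and keep the full history so you
# can compare how each refinement changed the response.

def ask_llm(prompt):
    """Placeholder: swap in a real API call to your model of choice."""
    return f"(model response to: {prompt})"

def refine(base_prompt, refinements):
    """Return (prompt, response) pairs for each successively refined prompt."""
    prompt = base_prompt
    history = [(prompt, ask_llm(prompt))]
    for extra in refinements:
        prompt = f"{prompt} {extra}"          # add one constraint at a time
        history.append((prompt, ask_llm(prompt)))
    return history

rounds = refine(
    "Tell me about petite dresses.",
    ["Provide information about sizes and fit.",
     "Do not mention specific designers."],
)
for p, _response in rounds:
    print(p)
```

Keeping the history around is the useful part: it shows you exactly which added detail helped and which one sent the model off course.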

Overcoming Tuning Challenges

Although it can seem complex to train a model this way, it gets easier and easier. Trust me on this. There are a few simple steps to follow, and you’ll get there in no time.

  1. Identify the primary request. What is the most important piece of information you need from the model?
  2. Break it into small bites. If your initial prompt contains multiple parts or requests, break it into smaller components. Each of those components should address only one specific task.
  3. Prioritize. Identify which pieces of information are most important and which are secondary. Focus on the essential details in the primary prompt.
  4. Clarity is key. Avoid jargon or ambiguity, and definitely avoid overly technical language.
  5. As Strunk and White say, “omit needless words.” Any unnecessary context is just that – unnecessary.
  6. Avoid double negatives. Complex negations confuse the model. Use positive language to say what you want.
  7. Specify constraints. If you have specific constraints, such as avoiding certain references, state those clearly in the prompt.
  8. Human-test. Ask a person to see if what you wrote is clear. We can get pretty myopic about these things!
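
A few of the steps above can even be automated as a rough pre-flight check on a prompt. These heuristics are purely illustrative, my own made-up thresholds, and step 8 (the human test) stays human:

```python
# Illustrative pre-flight checks for a prompt, loosely following the
# steps above: break up compound requests (2), keep it clear (4),
# omit needless words (5), and avoid double negatives (6).

JARGON = {"utilize", "leverage", "synergize"}

def prompt_warnings(prompt):
    """Return a list of simple warnings for an overloaded prompt."""
    warnings = []
    # Step 2: too many requests crammed into one prompt?
    if prompt.count("?") + prompt.count(" and ") > 3:
        warnings.append("Consider breaking this into smaller prompts.")
    # Steps 4 and 5: wordiness and jargon.
    if len(prompt.split()) > 60:
        warnings.append("Prompt is long; omit needless words.")
    if any(word.lower().strip(".,?") in JARGON for word in prompt.split()):
        warnings.append("Avoid jargon; use plain language.")
    # Step 6: double negatives confuse the model.
    if "not un" in prompt.lower():
        warnings.append("Rephrase double negatives positively.")
    return warnings

print(prompt_warnings("Please utilize this data and tell me X and Y and Z and W?"))
```

A clean, focused prompt like the “better” example earlier would come back with no warnings at all.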

The TL;DR

Prompt tuning is all about making LLMs behave better on specific tasks. Creating soft prompts to interact with them is the starting point of what will be an evolving process: quickly teaching them to adapt and learn, which is what we want overall. The point of AI is to eliminate redundancies and allow us, the humans, to perform the tasks we enjoy and to be truly creative.

Prompt tuning is not without its challenges and limitations, as with anything. I could get into the really deep stuff here, but this is a blog with a beer pun in it, so I just won’t. Generally speaking (and that is what I do here), prompt tuning is a very powerful tool to improve the performance of LLMs on very specific (not general) tasks. We need to be aware of the challenges associated with it, like the hill we climb with interpretability, and the reality that organizations that need to fine-tune a whole lot should probably look deeply at vector databases and pipelines. That, my friends, I will leave to folks far smarter than I.

Cheers!