Deep Seeking – Writing in a Whole New Model

How Did DeepSeek Catch Up at a Fraction of the Cost?

Like everyone else, I was more than a little surprised by DeepSeek. I’d been plugging along using the other models and tools for artificial intelligence writing, planning, and crafting. I still think the best writing tool is my brain and my hands, but there you have it.

Still, I teach and explore the world of AI, and I was fascinated by what just happened in the world of Large Language Models: the boom-snap of DeepSeek and its ability to tank Nvidia’s stock overnight. So, as one of the limited pool of tech writers paying attention to how these things take place, I figured I’d give a brief explanation and share a few thoughts here, with a nod to the folks who taught me a thing or two along the way.

DeepSeek’s leap comes down to four major innovations (and some smaller ones). Here’s the lowdown:

  1. They distilled from a Leading Model, for starters
    DeepSeek likely distilled its model from an existing one—most likely Meta’s Llama 3, though they could have accessed OpenAI’s GPT-4 or Anthropic’s Claude. Distillation involves training a new model using an existing one, much like OpenAI’s GPT-4 Turbo, which provides solid performance at lower costs by leveraging GPT-4 as a teacher.
    This approach slashes the time and cost of creating a training set, but it has limits. Since your starting point is always someone else’s previous release, leapfrogging the competition becomes far more challenging. If DeepSeek used OpenAI or Anthropic’s models, it would violate their terms of service—but proving that is notoriously difficult.
  2. Inference via Cache Compression
    DeepSeek slashed the cost of inference (the process of generating its responses) by compressing the key-value cache the model consults while predicting each token. This breakthrough was clever, but it wasn’t entirely unexpected; it’s a technique others probably would have figured out soon enough. More importantly, DeepSeek published the method openly, so now the entire industry can benefit from their efforts.
  3. Mixture of Experts Architecture
    Unlike traditional LLMs, which load the entire model during training and inference, DeepSeek adopted a “Mixture of Experts” approach. This uses a guided predictive algorithm to activate only the necessary parts of the model for specific tasks.
    DeepSeek needs 95% fewer GPUs than Meta because, for each token, they train only 5% of the parameters. This innovation radically lowers costs and makes the model far more efficient.
  4. Reasoning Model Without Human Supervision
    DeepSeek didn’t just match leading LLMs like OpenAI’s GPT-4 in raw capability; they also developed a reasoning model on par with OpenAI’s o1. Reasoning models combine LLMs with techniques like chain-of-thought (CoT) prompting, enabling them to correct errors and make logical inferences, qualities purely predictive models lack.
    OpenAI’s approach relied on reinforcement learning guided by human feedback. DeepSeek trained their model on math, code, and logic problems, using two reward functions—one for correct answers and one for answers with a clear thought process. Instead of supervising every step, they encouraged the model to try multiple approaches and grade itself. This method allowed the model to develop reasoning capabilities independently.
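To make the Mixture of Experts idea concrete, here is a minimal NumPy sketch of top-k expert routing. This is a toy illustration of the general technique, not DeepSeek’s actual architecture; the expert count, layer sizes, and weights are all invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 20        # total experts in the layer (toy value)
top_k = 2             # experts activated per token
d_model = 8           # hidden size (toy value)

# Random stand-ins for a trained router and per-expert feed-forward weights.
router_w = rng.normal(size=(d_model, n_experts))
expert_w = rng.normal(size=(n_experts, d_model, d_model))

def moe_forward(x):
    """Route one token vector through only its top-k experts."""
    logits = x @ router_w                     # (n_experts,) router scores
    top = np.argsort(logits)[-top_k:]         # indices of the k best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                              # softmax over selected experts
    # Only these k experts run; the rest of the parameters stay idle.
    out = sum(wi * (x @ expert_w[i]) for wi, i in zip(w, top))
    return out, top

token = rng.normal(size=d_model)
out, used = moe_forward(token)
print(f"activated {top_k}/{n_experts} experts = "
      f"{top_k / n_experts:.0%} of expert parameters for this token")
```

The cost story falls out of the routing: every token touches only a small fraction of the expert weights, so both training and inference skip most of the model most of the time.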

The TL;DR 

  1. Innovation Beats Regulation
    DeepSeek’s success is a reminder that competition should focus on technological progress, not regulatory barriers. Innovation wins.
  2. Lower Costs All Around
    DeepSeek’s architecture dramatically reduces the cost of running advanced AI models—in both dollars and energy consumption. Their open-source, permissively licensed approach means the whole industry, including competitors, benefits.
  3. Models Are Becoming Commodities
    As models become cheaper and more widely available, the value will shift toward building smarter applications rather than simply integrating a flashy “AI button.” Product managers who can creatively apply AI will drive the next wave of real-world impact.
  4. NVIDIA’s Moat Is Shrinking
    DeepSeek’s GPU efficiency challenges assumptions about NVIDIA’s dominance. If others adopt similar techniques, the grip of chipmakers on the AI economy may loosen significantly.

*For some DeepSeek FAQs from folks who know a lot more than I do, go here

Tech Mentor: Or, How to Help Your Young Writer Grow Up

Photo by Star of the Sea on Unsplash

The difference between good writing and bad writing is more than simply reader enjoyment or understanding. It is career advancement or publication rejection.

Seem overstated? It’s not.

In the technical circles I move in, I watch decent writers get passed over for promotion or advancement year after year, all because they are just that: decent. They are not great or exceptional, visionary or learned. No one came along and nurtured them from decent to excellent.

And that’s a darn shame, because a decent writer has a fair-to-middling chance of becoming really, really good if they have the right teachers along the way.

My mother was an English teacher, and her mother before her. But they are not the people whom I credit with my exemplary grammar, although they contributed a fair amount of dinner-table chat about language and style. I can say with no fear of correction that I became a much better writer than my mother. In truth, my sixth grade grammar teacher, Marcia Saiers, taught me all about comma placement. Her husband Tom taught me in ninth, eleventh, and twelfth grade, and from him I learned nearly everything else about composition and style. He was an exacting instructor.

But then I went to college, and whoo daisy my options expanded.

Then I got to work, and my technical mentors hit me like a squid to the face.

Writing communicates. It holds power that no other means of transferring information has. I’ve published too many essays and articles to count. Some are technical, others personal, but each one conveys information that I believe to my very core is essential to the reader.

Bold statement, I know.

How did those writings become “essential”?

Those two teachers I mentioned, Mrs. and Mr. Saiers – their approach to teaching me writing was steeped in what I refer to as “classical redlining.” That is to say, they took red pen to white paper and made it appear as though they had opened a vein. I submitted a great deal of writing to them, and they marked it up, showing me error after error, and I corrected those errors until my writing met their exacting standards. They adhered to the Strunk and White school of thought, and I learned by making those copyedits time after time after time. No contractions; limited use of first and second person; rarely, if ever, beginning a sentence with a conjunction; and ever and always, “omit needless words.”

These two mentors consistently made local copyedits, those that were specific to my work, bringing my narrative into hyperfocus, showing me how to defend and support my writing.

This is what I do for younger writers now. This is how I help them synthesize their verbose thoughts into the crystallized, economical prose that readers are hungry for, and little more.

My mentors gave me structural edits. They taught me effective logical argument. In fact, when I took a philosophy course in undergrad, I was prepared more than ever to grasp fallacy and linear argument. New writers will make two recognizable errors: first, they will write start to finish, and second, they will include details of their journey. I learned to become the technical writer who understands that no one cares about the labor; everyone just wants to see the baby.

In mentoring a new writer, I cannot overstate the importance of side-by-side writing, even though most will balk at the idea. Writing closely together, handing off drafts as a collaborative act, works wonders for technical writers. A bottom-up approach like this saves countless sessions of redlining simply by letting the thought process unfold in view. A younger writer gets to see a more seasoned writer’s approach, and vice versa.

When I mentor younger writers, I love to see how they approach a process. By watching them write, I can note flaws in their structure, and rather than making corrections at the end, we can decide together how to implement a better flow or argument. We work through revisions in real time. I do some writing directly in front of the mentee, and they can see how I think and what choices I make, and often I articulate why as I go.

Side-by-side co-writing takes time. It’s not fast, and it isn’t pretty. But I have found that when I do this, I end up with a better writer in a few weeks rather than a few months, which is a pretty great rate.

Learning effective technical writing is difficult and takes practice, as does any skill. There are layers and nuances, perhaps more than in many other professions. There are ingredients and customizations, just as there are in a good, freshly baked pie.

When you are ready to dig in, though, the elements that blend together well are distinct and sharp and well-defined. And it all tastes pretty good.

Four Ways Gen AI Will Help Tech Writers in 2024

Image courtesy of Nick Morrison on Unsplash

We’re a few weeks into 2024 and now that we’re past the “best of” lists and the “year in review” lists, there’s a bit of time to take a look at what we might find most productive in the coming quarters. I always enjoy taking a few deep breaths once the frenzy of the year-start fades to look at what I might find truly useful as the daffodils peek their heads up in the park and I begin to feel the true momentum of the year take shape. I kick off the winter blues and roll my sleeves up to dig in.

This year, my company is more sure than ever that we will harness the ever-growing powers of AI as it enters a more mature phase, and I for one am grateful we’ve embraced that stance. Artificial intelligence, especially generative AI, is no longer just a buzzword. Companies are beginning to sift through the hype to discover what really provides value and what was just marketing hubris. I’m glad of it, and I suppose that’s because I work for a company that knows the difference and puts its weight behind the real thing. Sure, we have some applications that are neither artificial nor intelligent, but we own up to the ones that are plain data and machine learning, and we boast about the ones that have real weight to throw around. And that is fun and ambitious.

So I enjoy rolling up my sleeves to explore the ways that Generative AI might improve my work, and I enjoy tossing out the ways that it is a big, fat dud. So let’s look at the four ways I think it might actually improve what I do.

Drafting

First drafts take up the bulk of any writer’s time, whether tech writer or fiction writer. We know what we want to say; we just aren’t sure how to get started. It’s not writer’s block per se; the words in our brains just don’t spill onto the page as quickly as we’d like or as smoothly as they should. Our years of training are well suited to editing and refining, which we have likely already begun in our heads. It’s the typing that can be painful, because it simply isn’t as fast as our thought processes.

But a Generative AI tool is as fast. Faster, even. If we can spin up a good prompt, we can get some of the process started, and that is often enough to grease the wheels and we can take it from there.

Most likely the way we will be able to effectively harness what Gen AI really has to offer is to become adept at crafting and refining the prompts to get us started. I began writing chapters of my book recently by asking ChatGPT to help me organize the table of contents, then I asked it to revise that TOC because it wasn’t quite the way I liked it. The actual TOC looks nothing like it did when good ol’ AI first started it, but what was nice is that it gave me liftoff. I knew in my head what I wanted to do, but the way my creative brain was working was to think of the book as the whole, not its parts. That is awful for a writer. As the saying goes, “you can’t boil the ocean.” Indeed you can’t. And while AI wrote not a single word of this post, that’s because I have been able to think of its component parts. With my book, I was thinking of the whole thing, start to finish, and I needed a hand in visualizing what the chapters might look like. By asking AI to help me out with that, I was able to think of the pieces rather than the whole. It was only then that I could sit down and begin to write.

With that in mind, we as creators will be able to ask Gen AI not to write for us, but to draft for us, so that we can get out of the muck that is stuck in our brains and begin to refine and edit and revise after the foundation is laid. What is left behind as detritus will be much like what we left behind us when we were drafting ourselves – an unrecognizable skeleton of tossed aside word salads. It requires our fine touch to render the work useful and beautiful. Gen AI will have helped us get there faster, much the same way moving from horse to car allowed us to move from place to place more quickly, too.

Automating Q&A

As a tech writer, I cannot even begin to tell you how many times I have been asked to write an FAQ section for documentation.

My answer to this request is always the same. I will not.

I will not write an FAQ because if there are frequently asked questions, I should answer those within the documentation. Sheesh!

And yet.

There remain some questions that indeed, users will face time and again that are unlikely to be answered in the documentation. I get it. No matter how much of a purist I am, I understand that there are frequently asked questions that need frequent answers.

So a technical writer can collate those questions and help develop the chatbot that integrates with a product’s or company’s knowledge repository to answer those repetitive questions effectively. This bot can instantly provide responses in a way that no live writer could, through a call-and-response mechanism that is smoother than static documentation. By leveraging AI, we can ask our tools to ingest data and interpret our users’ needs and provide output that is both relevant and accurate. This can enhance the user’s experience by responding immediately whereas a human might require time to look something up, or might be off-hours or otherwise detained.

AI-powered chatbots are efficient, interactive, and fast. But they serve only after technical writers feed them all they need in terms of information. It’s vital to remember that our AI tools are only as good as the data they are given to start with.
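The paragraphs above describe the writer’s role in such a bot: curate the question-answer pairs and let retrieval do the matching. Here is a minimal, self-contained sketch of that idea using plain bag-of-words cosine similarity; the FAQ entries, threshold, and fallback message are all hypothetical, and a production chatbot would use real embeddings and an LLM on top.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ pairs a writer might curate from support tickets.
FAQ = {
    "How do I reset my password?":
        "Use the 'Forgot password' link on the sign-in page.",
    "Where can I download the installer?":
        "Installers are listed on the Downloads page of the docs site.",
    "How do I export my report to PDF?":
        "Open the report, then choose File > Export > PDF.",
}

def _vec(text):
    # Bag-of-words term counts for a naive similarity measure.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def _cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def answer(question, threshold=0.3):
    """Return the canned answer for the closest FAQ, or defer to a human."""
    score, best = max((_cosine(_vec(question), _vec(q)), q) for q in FAQ)
    if score < threshold:
        return "I'm not sure -- let me connect you with the docs team."
    return FAQ[best]

print(answer("how can I reset my password"))
```

The key design point is the threshold: below it, the bot defers rather than guesses, which keeps the writer-curated content authoritative.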

Data

Technical writers will also be very well served by the data that AI can give us. Back when I was in graduate school, we worked very hard to manually enter volumes of information to build dictionaries of phrases and what they meant in general syntax. It was difficult and time consuming. Today, AI-powered tools could complete that project in a fraction of the time.

AI-driven data will also let us gather truly valuable insights into our users’ behaviors, so we can stop making what we once knew were educated guesses about their pathways. Tools we have now measure user engagement, search patterns, and even the effectiveness of our content through click rates, bounce patterns, and more, and they serve it all up in nice, tidy numbers. We can use this to optimize our content, identify the gaps, and close them quickly.

Leveraging analytics data allows tech writers now to focus on the content our users access often, the content that interests them the most, and the keywords they search – and do or do not find – with greatest success. Plus, we can now see just what stymies our users and all of this will empower us to create the most targeted content that will greatly improve the user experience.

Optimization

More than just a buzzword, content optimization is probably the most powerful benefit of AI-enabled tech writing. Asking a good AI tool to analyze technical documentation and provide feedback on readability against a style guide and structural rules is probably the best thing we could ask for. Sure, we’ve got tools like Grammarly and Acrolinx to do spot-checking. That’s all well and good. But now AI can help us enhance accessibility, tailor content for specific audiences, and allow or disallow entire phrases and terms. We can advance our proofreading capabilities far beyond what we had before by loading in custom dictionaries and guides.

Tools like Grammarly are basic, but as NLP (natural language processing) tools advance, our arsenal just grows. We can pass our work through a domain-specific tool that offers a profession-specific set of guidelines and lets us examine under a microscope things like acronyms, codes, and more.
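The style-guide checking described here can be sketched in a few lines. This is a toy linter, not Grammarly or Acrolinx; the banned-phrase list and sentence-length limit are invented examples of the custom dictionaries and rules a team might load in.

```python
import re

# A hypothetical in-house style guide, expressed as simple rules.
BANNED = {"utilize": "use", "in order to": "to", "leverage": "use"}
MAX_SENTENCE_WORDS = 25

def lint(text):
    """Return a list of style findings for a passage of documentation."""
    findings = []
    # Flag banned phrases, case-insensitively, on word boundaries.
    for phrase, better in BANNED.items():
        if re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE):
            findings.append(f"avoid '{phrase}'; prefer '{better}'")
    # Flag sentences that run past the style guide's length limit.
    for sent in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = sent.split()
        if len(words) > MAX_SENTENCE_WORDS:
            findings.append(f"sentence over {MAX_SENTENCE_WORDS} words: "
                            f"'{' '.join(words[:6])}...'")
    return findings

sample = "In order to utilize the export feature, click Export."
for finding in lint(sample):
    print("-", finding)
```

An AI-backed version replaces the hand-written rules with a model, but the workflow is the same: the tool does the first pass, and the writer keeps the editor’s-eye review.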

The time savings won’t be total, of course. We’ll still need to copyedit. But imagine, if you will, that every first or second pass of our text is now done in mere seconds, with all the typos caught by AI, so that we can clean up and polish with an editor’s eye, not just a proofreader’s.

Sit back with a cup of coffee and really read, not just grab a pencil and mark up a piece of paper the way we used to.

My, that’s nice.

Leadership Language

Choose Your Words Wisely

Image courtesy of Brooke Lark on Unsplash

Two employees are asked to evaluate their work at their annual review. One describes their accomplishments as “outstanding,” noting that they have exceeded plan. The other gives a full report listing data points and gives the team credit for achieving success throughout the year.

Which employee was male and which was female?

Study after study shows that as women, we tend to describe our abilities less favorably than our equally proficient male counterparts. We describe ourselves less often as “proficient” or “skilled” on resumes, and we display less confidence on our resumes even when we have similar degrees and certifications.

Researchers Exley and Kessler found that women rate their own performance nearly 13 points lower than men do. They do this even when they know they performed exactly as well. How do we know? Exley and Kessler gave men and women the same math and science test, then asked those who earned the same score to rate how they thought they did on a 100-point scale. (On a scale of 1 to 100, how well do you think you did?) Women with the same score ranked themselves, on average, 13 points lower. They continued to do this even when they knew the score they got.

Fascinating.

We just don’t seem to be all that comfortable saying, “Hey, I’m pretty good at stuff, and I have the score to prove it.”

A new study by Van Epps, Hart, and Schweitzer provides us with some great insight into how to overcome this tendency, and how to level the field a bit. We would be wise to learn it, and use it. It’s called “Dual Promotion.” And I’m here for it 100%.

Dual promotion is all about not just talking about our own capabilities, but elevating each other while we are at it. It turns out that women aren’t altogether good at bragging – we know that, we get that. (And yes, I know this is not universal. Some women show off, some are great in the limelight. Others are wallflowers and won’t say a kind word about themselves if paid a million. Right? Nothing is standard. Moving on.)

Dual promotion benefits more than just ourselves. Nice, right? There are three primary ways we can do this. The first is easy and makes sense: if a team helped get you to a goal, or inspired you to get there, you give them a nod for the contribution they made to your success.

Allow me to demonstrate. My daughter studied classical vocal performance in undergrad. When she gave her senior recital, she gave a lovely thank-you speech at the conclusion and was sure to thank not only the musicians who accompanied her, but her student-friend who meticulously dedicated hours to arranging a Spanish folk song for her in operatic style that became her finale. This guy was not a part of the performance. He had already given his own concert, but she named him specifically and spoke about the fact that it was his talent that made her own talent shine brighter, then she went on to take several more bows. She made it clear that singing that song was her own thing. She made his music come alive. Without her voice, his arrangement was still folk music, but her operatic voice made a street song something it had never been before. And yet… you see how they both were vehicles for each other? She did not yield ground for her own achievement just because she acknowledged another person’s contribution to her arguably great success.

Another mechanism for dual promotion is complimenting a competitor. This happens often in sport. I am a triathlete, for example, and I find it very satisfying when, at the finish line, I can congratulate with total sincerity the person who has just crossed ahead of me whether by a slim or great margin. And I appreciate fully when they do the same. We are both out there trying to make the best of a tough sport, enduring challenges both within and without our control, and we are enjoying the results not just of a day’s work but of months and often years of training. It feels good to acknowledge that whatever came our way that day, we rose to greet the task. Professional athletes quite often give plaudits to their counterparts, admitting that it takes a lot of grit to get where they are, showing respect not only for the game but for the chops it takes to get there. The same should go for competitors in the market and in the workplace. When we have a colleague on a task or a competitor for a promotion or role, we do well to note that there are many factors that weave into who ends up at the finish first, both realistically and metaphorically.

The third and final dual promotion strategy is recognizing the field. By this I mean showing our respect for the shoulders on which we stand. Think of an award recipient who shows clear respect for the other nominees. All those who admit “It’s an honor just to be nominated” are likely not joking. It actually IS. Realistically, when our hat gets tossed in the ring for a prestigious award, it is a privilege and it feels good. To know you’ve gained the respect of someone, or a group, enough to be considered, let alone to win, feels pretty great. So the act of dual promotion in this instance is when we include ourselves by saying, “I’m honored to be among this great group of people, because it means that I, too, am pretty great.”

Each of these forms of dual promotion puts us in the spotlight, but the best part is that it doesn’t leave others in the dark. It can feel amazing to allow ourselves to shine, especially if we recognize that it is totally, completely fair to do so. Not one of us got here alone, but we can remember that our shine is not dimmed by the success of others, nor should we promote their contributions over ours. We don’t have to break the rung of the ladder we just stood on, we can extend a hand to the person standing there and bring them along with us, if that is the path they both want and have earned. That’s no problem at all!

Don’t fear dual promotion – it’s a fantastic way to be sure you are shining your own light, offering light to others, and being seen. Your warmth, consideration, and desire to grow will read as leadership and strength, and your honesty and sincerity will shine through.

No one likes a braggart, but nearly everyone likes a team leader. So go forth and promote – dual promote.

Source – https://pubmed.ncbi.nlm.nih.gov/37561455/. Because I definitely want to dual promote the team that taught me what dual promotion is all about!

I’m Your Writer, Not Your Everything

Or, Hey there, Have You Seen My SME?

I couldn’t even think of a cool graphic or photo to go with this, so here we are, graphic-less. That’s probably good, because I am not starting off the year on a happy, optimistic note to be honest.

Let’s get one thing clear. I am an SME. I am absolutely, without a doubt, a subject matter expert. One hundred percent. Although I spend most of my time writing about a software product in the area of clinical trials data analytics, I am decidedly not an expert in clinical trials data analytics. Nor was I hired to be.

I am a subject matter expert in technical writing, in grammar and syntax.

I am supposed to be just that. I deliver on that promise every day. I rarely make an error in my delivery of high-quality content. I know what readers need and how they need it to be organized. I understand where the reader’s eye is likely to go and what concepts should be placed where. More importantly, I can spot a typo in difficult language and if I am unfamiliar with a term I know where to go to look it up, when to insist on that word, and when to choose another. I build dictionaries and document frameworks. I organize tables of information. My spelling is almost perfect and so is my grammar.

What I am not is a subject matter expert in data analytics. Do you want to know a secret? I was not a subject matter expert in mainframe banking software when I was a technical writer for my former company, either. And I was not a subject matter expert in cybersecurity when I worked in that field. Are you sensing a pattern or is it just me?

Now is the moment when I sigh.

The burden on technical writers is becoming ever steeper. Our teams are becoming thinner and our demands are becoming heavier. We have always carried the weight of justifying our positions when budgets tighten, showing our value since our deliverable often doesn’t bring with it a 1:1 ROI, and showing mid- and upper-leadership that continued investment in upskilling is essential. As the asks of today’s software teams become greater and the speed with which we deliver becomes faster, demonstrating those values seems to be tougher and tougher.

I generally work with great people and teams, and on balance the work that I do is seen as necessary, if not critical, to the end product. But more and more I learn of teams that want their tech writers to be so fully embedded in end-to-end development that the lines between coder and UX designer and writer are not just comfortably blurred, they are erased.

At the end of the day, all I know for certain is this: I will remain a subject matter expert in grammar, syntax, and spelling and I will feel quite comfortable in that area. When I am called upon to be a soup to nuts pro in a complex piece of software, I think I’ll start asking my development teams to explain to me the differences between a gerund and a participle, and to recommend when I should use each in the installation manual I am working on today.

Singing a New Tune in AI – Prompt Tuning

We are all well aware that AI is the hottest topic everywhere. You couldn’t turn around in 2023 without hearing someone talk about it, even if they didn’t know what the heck it was, or is. People were excited, or afraid, or some healthy combination of both.

From developers and engineers to kids and grandmas, everyone wanted to know a thing or two about AI and what it can or cannot do. In my line of work, people were either certain it would take away all of our jobs or certain that it was the pathway to thousands of new jobs.

Naturally I can’t say for certain what the future holds for us as tech writers, but I can say this – we as human beings are awful at predicting what new technologies can do. We nearly always get it wrong.

When the television first arrived, far more people claimed it was a fad than thought it would become a staple of our lives. The consensus was that it was a mere flash in the pan that would never last more than a few years. People simply couldn’t believe that a box that brought images into our homes would eventually bring those images into every room, twenty-four hours a day, offering news and entertainment, delivering everything we needed all day and all night. They couldn’t fathom that televisions would become so crystal clear and so inexpensive that every holiday season the purchase of a bigger, better, flatter, thinner television would be a mere afterthought.

And yet here we are.

So now that we’ve got that out of the way, on to total world domination!

But seriously.

If you aren’t already using AI, or at least Gen AI in the form of something like ChatGPT, where are you, even? At least have a little play around with the thing. Ask it to write a haiku. Let it make an outline for your next presentation. Geez, it’s not the enemy.

In fact, it’s so much not the enemy that it can help you outline your book (like I’ve done), revise a paragraph (like I’ve done), or tweak your speech (like I have done many, many times). The only thing you really need to understand here is that you are, indeed, smarter than the LLM. Well, mostly.

The LLM, or large language model, has access to a significantly grander corpus of text than you can recall at any given moment. That’s why you would be less likely to win on Jeopardy if you were to compete against one. It’s also why an LLM might make some stuff up, or fill in some fuzzy details, if you ask it to write a cute story about your uncle Jeffrey for the annual Holiday story-off. (What? Your family does not actually have an annual story-off? Well, get crackin’, because those are truly fun times…fun times…). The LLM knows nothing specific about your uncle Jeffrey, but it does know a fair bit about, say, the functioning of a carburetor if you need to draft a paragraph about that.

The very, very human part is that you must have expertise in how to “tune” the prompt you offer to the LLM in the first place. And the second place. And the third place!

Prompt tuning is a technique that allows you to adapt LLMs to new tasks by training a small number of parameters. The prompt text is added to guide the LLM towards the output you want, and the technique has gained quite a lot of attention in the LLM world because it is both efficient and flexible. So let’s talk more specifically about what it is, and what it does.

Prompt tuning offers a more efficient approach than fine-tuning the entirety of the LLM. First, it results in faster adaptation as you move along. Second, it’s flexible, in that you can apply tuning to a wide variety of tasks, including NLP (natural language processing), image classification, and even generating code. With prompt tuning, you can inspect the parameters of your prompt to better understand how the LLM is guided towards the intended outputs. This helps us understand how the model is making decisions along the path.
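To make the “small number of parameters” idea concrete, here is a toy sketch of my own (an illustration, not any particular library’s API): the model’s weights stay frozen, and only a tiny soft-prompt vector prepended to the input is trained.

```python
import numpy as np

# Toy illustration of prompt tuning: the "model" weights W stay frozen;
# we train only a small soft-prompt vector p prepended to the input.
rng = np.random.default_rng(0)

W = rng.normal(size=(4, 6))          # frozen "model": maps a 6-dim sequence to 4 outputs
x = rng.normal(size=4)               # the actual input (4 dims)
target = np.array([1.0, 0.0, 0.0, 0.0])

p = np.zeros(2)                      # trainable soft prompt: the ONLY parameters we update

def forward(p):
    seq = np.concatenate([p, x])     # prepend the soft prompt to the input
    return W @ seq

lr = 0.02
for _ in range(500):
    out = forward(p)
    # Gradient of squared error with respect to p alone; W is never touched.
    grad = 2 * W[:, :2].T @ (out - target)
    p -= lr * grad
```

The frozen weights are the whole point: adapting two numbers instead of millions is why prompt tuning is cheap, and inspecting `p` afterward is the kind of interpretability the paragraph above mentions.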

The biggest obstacle when getting started is probably designing an effective prompt at the outset. To design an effective prompt, it is vital to consider the context and structure of the language in the first place. You must imagine a plethora of considerations before just plugging in a prompt willy-nilly, hoping to cover a lot of territory. Writing an overly complex prompt in hopes of winnowing it down later might seem like a good idea, but in reality what you’ll get is a lot of confusion, resulting in more work for yourself and less efficiency for the LLM.

For example, say you work for a dress designer that creates clothing for petite women, and you want to gather specific insights about waist size, without irrelevant details like shoulder width or arm length, and without information about competing companies. The challenge is to write a broad enough prompt, asking the AI model for information about your focus area (petite dresses), while filtering out unrelated information and avoiding details about competitors in the field.

Good Prompt/Bad Prompt

Bad prompt: “Tell me everything about petite women’s dresses, sizes 0 through 6, 4 feet tall to 5 feet 4 inches, 95 lbs to 125 lbs, slender build by American and European designers, and their products XYZ, made in ABC countries from X and Y materials.”

This prompt covers too many facets and is too long and complex for the model to handle efficiently or to return valuable information. It may not understand the nuances with so many variables.

A better prompt: “Give me insights about petite women’s dresses. Focus on sizes 0 to 6, thin body, without focusing on specific designers or fabrics.”

In the latter example, you are concise and explicit, while requesting information about your area of interest, setting clear boundaries (no focus on designers or fabrics), and making it easier for the model to filter.

Even with the second prompt, there is the risk of something called “overfitting,” where the prompt is too narrow or too specific. This will lead you to refine the prompt by adding or removing detail, pushing toward generalization or toward greater specificity, depending on which direction you need to go.

You can begin a prompt tune with something like “Tell me about petite dresses. Provide information about sizes and fit.” From there, you can add levels of detail, refining the prompt as the LLM learns the context you seek.

For example, “Tell me about petite dresses and their common characteristics.” This allows you to scale the prompt, gauge the training data available and its accuracy, and adapt your prompt efficiently without risking hallucination.
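That start-broad, refine-gradually loop can be sketched in a few lines. The `ask` function below is a hypothetical stand-in for whatever LLM client you use; it is an assumption, not a real API.

```python
# A minimal sketch of iterative prompt refinement. `ask` is a hypothetical
# stand-in for an LLM client call, not a real API.
def ask(prompt):
    # Placeholder: a real implementation would send the prompt to a model.
    return f"(model response to: {prompt!r})"

# Start broad, then add one level of detail per round, inspecting as you go.
prompt = "Tell me about petite dresses. Provide information about sizes and fit."
refinements = [
    "Focus on sizes 0 to 6.",
    "Do not focus on specific designers or fabrics.",
]

response = ask(prompt)             # start broad
for extra in refinements:
    prompt = f"{prompt} {extra}"   # add one level of detail
    response = ask(prompt)         # re-ask and review the new response
```

Each pass tells you whether the model needs generalization or more specificity before you commit to the next refinement.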

Overcoming Tuning Challenges

Although it can seem complex to train a model this way, it gets easier and easier. Trust me on this. There are a few simple steps to follow, and you’ll get there in no time.

  1. Identify the primary request. What is the most important piece of information you need from the model?
  2. Break it into small bites. If your initial prompt contains multiple parts or requests, break it into smaller components. Each of those components should address only one specific task.
  3. Prioritize. Identify which pieces of information are most important and which are secondary. Focus on the essential details in the primary prompt.
  4. Clarity is key. Avoid jargon or ambiguity, and definitely avoid overly technical language.
  5. As Strunk and White say, “omit needless words.” Any unnecessary context is just that – unnecessary.
  6. Avoid double negatives. Complex negations confuse the model. Use positive language to say what you want.
  7. Specify constraints. If you have specific constraints, such as avoiding certain references, state those clearly in the prompt.
  8. Human-test. Ask a person to see if what you wrote is clear. We can get pretty myopic about these things!
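The steps above can be sketched as a small helper that assembles a prompt from one primary request, prioritized details, and explicit constraints. The function and its structure are my own illustration, not a standard API.

```python
def build_prompt(primary_request, details=(), constraints=()):
    """Assemble a focused prompt: one primary request (step 1),
    prioritized details broken into small bites (steps 2-3),
    and explicitly stated constraints (step 7)."""
    lines = [primary_request.strip()]
    lines += [f"Focus on: {d}" for d in details]
    lines += [f"Avoid: {c}" for c in constraints]
    return "\n".join(lines)

prompt = build_prompt(
    "Give me insights about petite women's dresses.",
    details=["sizes 0 to 6", "fit for a slender build"],
    constraints=["specific designers", "fabrics"],
)
```

Keeping each detail and constraint on its own line makes the prompt easy to human-test (step 8) and easy to prune when a word turns out to be needless.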

The TL;DR

Prompt tuning is all about making LLMs behave better on specific tasks. Creating soft prompts is the starting point of what will be an evolving process: teaching models to adapt and learn quickly, which is what we want overall. The point of AI is to eliminate redundancies, allowing us, the humans, to perform the tasks we enjoy and to be truly creative.

Prompt tuning is not without its challenges and limitations, as with anything. I could get into the really deep stuff here, but this is a blog with a beer pun in it, so I just won’t. Generally speaking (and that is what I do here), prompt tuning is a very powerful tool to improve the performance of LLMs on very specific (not general) tasks. We need to be aware of the challenges associated with it, like the hill we climb with interpretability, and the reality that organizations that need to fine-tune a whole lot should probably look deeply at vector databases and pipelines. That, my friends, I will leave to folks far smarter than I.

Cheers!

Optimize! They said. It’ll be fun! They said.

Photo by Tianyi Ma on Unsplash.

Now that all of us write web-based content and can’t even SEE print matter in the rearview mirror, we’ve adapted (or started to adapt) to a whole different set of challenges. I used to organize documentation for release in a binder, so I had to think about tab structures and indexes and a list of other layout technicalities that would make my precious topics findable.

Today, though, I must lean in to Content Optimization. Sounds catchy, doesn’t it? Slick. Advertorial. In some ways, it is. But mostly it’s about the same thing: how to find the gorgeous stuff I’ve written, and essentially how to make that the best end-user experience it can be.

Why should we optimize in the first place? Half of us don’t know for sure what that word means, even. Let’s set some parameters:

Optimized content is written at the right reading level and tone. It is translatable if needed. It is consistent and reliable, not working too hard to say the same thing in two or three ways. It is balanced – between grammatical correctness and ease of reading. That’s pretty much it!

The fun thing about writing technical documentation is that we can optimize before, during, or after composing our prose. As we create new doc, we can roll along just writing stuff the way we want – so long as we go back later to clean it up! Or, we can vigorously adopt optimization principles step by step so that when it’s time to edit, we’ve got a pretty good draft already.

I would venture to say that the primary goal of content optimization is seeing to it that your documentation reaches the largest possible audience within your target. Nobody cares if I write an article all about fashion and the only people who see it are auto mechanics who wear uniforms. See what I mean?

I would also say that making sure your content is visible, especially to web-crawlers, is an important aspect of optimization. Consider the beautiful website that never shows up in Google Search – that writer and designer worked hard on content, but can’t get it off the ground! What a shame.

We know that great content attracts readers no matter what, but what IS “great content?”

For starters, write clearly and relatably with your reader in mind. That is step one no matter when or what you are writing. When my oldest son started college, I reminded him often that the most important factor in his college papers was that his professors were the audience. They were the ones to please, and no one else. “Know your audience” should be a phrase posted at every content creator’s desk.

Keep writing NEW content. If things get stale, you have missed the point of optimization altogether. Even if none of the information has changed, good writer-editors will revise again and again over time. It’s just good practice.

Organize content, not unlike the “olden days” when we organized a TOC and an index. Thoughtfully preparing well-organized content is at the heart of a pleasant experience for end users.

Don’t forget images! When we incorporate images and other media into our online content but do not optimize those elements, we’ve lost an important tool. Using title tags, descriptions, alt text, and more can make sure that the images you’ve worked to capture are useful and visible.

Don’t just write for search. All too often, in the pursuit of clicks, writers think too much about SEO and lose the natural, organic flow of writing and reading. Consumers are more savvy than ever, especially with web-based content, and they catch on quickly if your goal is just to sit at the top of the search results while overlooking the enjoyment of finding the content they need!

Optimization is all about making the experience easy and smooth for all who use the content you create. If we go at it from that point of view, it actually IS fun…like they said.

‘Appropriate Use’

And Other Strange Terms in Generative AI

Google has for quite some time now been the de facto word in all things web. Sure, Microsoft and Bing gave it a shot, but never quite broke through the hefty barriers set by behemoth Google. So when the folks at Google set out to put some guardrails around AI generated content, the world paid attention.

Google wrote search guidance about AI-generated content, focusing on ‘appropriateness’ rather than attribution. Their stance is that automation has long been used to generate content, and that AI can in fact assist in appreciable ways. One of the most interesting bits about their policy is the observation that it isn’t just some bot that can create propaganda or distribute misinformation. In fact, they are quick to point out the very human-ness of that capability. Since the beginning of information dissemination there’s been the capacity to distribute falsehood, after all.

Further, Google asserts that AI-generated content is given no search privilege over any other generated content. And yet, despite this assertion, we know that AI tools can be asked via well-designed queries to write content that is specifically designed to tick all of the SEO boxes, thus potentially rocketing it upward in search efforts. The human brain cannot log and file all of the best potential search terms. We just don’t have that sort of computing power. But AI does. And it has it in spades.

Creating relevant, easily findable content is the whole effort for tech writers. Our jobs depend on being able to place the content users need where the content consumers expect to find it. Many of us have trained for years to scratch the surface of this need, and most of us continue to refine that ability by monitoring our users’ journeys and mapping their pathways. But now AI can do this at a speed we never could. We rely on analytics to tell us what to write where.

Moreover, as humans we rely on our own inherent creativity to design engaging and timely documentation. Every single writer I have ever known, including myself of course, has experienced a degree of “writer’s block,” sometimes even when the prompt is clear and direct. It’s tough to just get started. But when a program has access to all of the ideas ever written (more or less), that block is easy to dismantle. Yet when we rely on AI to generate the basis for our content, even if we intend to polish, edit, and curate that content, where does the authorship belong? Is it ‘appropriate use’ to place an author byline as the sole creator if a large language model is the genesis of the work? Google’s guidance is merely to “make clear to readers when AI is part of the content creation process.” Their clarification is…unclear.

Google does recommend, in its appropriate-use guidance, remaining focused on the ‘why’ more than the ‘how.’ What is it that we, as content generators, are trying to achieve in our writing? If we go back to audience and purpose more than mechanics, we’ll be fine. Staying in tune with the reason or reasons for our writing will keep us in line with all appropriate-use guidelines. For now. If we are writing merely for clicks or views, we’ve lost our way. If we continue to write for user ease and edification, well, okay then.

Even Google acknowledges that trust is at the epicenter of its E-E-A-T guidelines (experience, expertise, authoritativeness, and trustworthiness), which are the basis of its relevant-content rankings. AI could certainly create content with a high level of expertise, noticeable experience, and authoritativeness, but we’ve found that the trustworthiness is suspect.

For now, our ‘Appropriate Use’ likely remains in the domain of those of us with conscience, which AI notably lacks. Avoiding content created merely for top rankings still nets humans, and human readers, the desired end result, even if it doesn’t make the top of the list.

Appropriate is not always Popular.

Why the Humanities Matter in STEM

Photo credit: Prateek Katyal, 2023.

A 2017 article in the Washington Post discussed how now, in the age of big data and STEM (Science, Technology, Engineering, and Mathematics), liberal arts and humanities degrees are perceived as far less valuable in the marketplace. I saw the same opinion held strong at both universities where I taught English. Many, many students believed wholeheartedly that the only thing they could do with a degree in English is teach.

I was hard-pressed to convince them otherwise since I was, in fact, teaching.

The Post article goes on to argue, however, for abundant evidence that humanities and liberal arts degrees are far from useless.

When I started graduate school in 2007 at a university that beautifully balances the arts and sciences (shout out to you, Carnegie Mellon!), my advisor recommended I take “the Rhetoric of Science.” I meekly informed her that I wasn’t really into science. I thought it would be a bad fit, that I would not fare well, and that my resulting grade would reveal my lack of interest. She pressed, saying there was a great deal to learn in the class and that it wasn’t “science-y.”

She was absolutely right. I was fascinated from the start. The course focused on science as argument, science as rebuttal, but most of all science as persuasive tool. Or, at least, the persuasiveness came from how we talk and write about science. My seminar paper, of which I remain proud, was titled: “The Slut Shot. Girls, Gardasil, and Godliness.” I got an A in the class, but more importantly I learned the fortified connection between language and science.

The National Academies of Sciences, Engineering, and Medicine urges a return to a broader conception of how to prepare students for a career in STEM. Arguing that the hyper-focus on specialization in college curricula is doing more harm than good, they contend that broad-based knowledge and examination of the humanities lead to better scientists. There is certainly the goal among academics to make students more employable upon graduation, and yet there is consensus that exposure to the humanities is a net benefit.

The challenge is that there’s no data. Or, limited data anyway. The value of an Art History course or a Poetry Workshop at university is hard to measure against the quantifiable exam scores often produced in a Chemistry or Statistics class.

In a weak economy, it’s easy to point to certifications and licenses over the emotional intelligence gained by reading Fitzgerald or Dickinson. We find, though, that students (and later employees) who rely wholly on the confidence that science and technology provide answers – who hold the uncritical belief that solutions to all things lie in the technology – are coming up short. Adherence to the power of science as the ultimate truth provides little guidance in the realm of real-world experiences.

In short, not all problems are tidy ones.

After all, being able to communicate scientific findings is the icing on the cake. We don’t get very far if we have results but do not know how to evangelize them.

In American universities right now, fewer than 5% of students major in the humanities. We’ve told them that’s no way to get a job. The more we learn about Sophocles, Plato, Kant, Freud, Welty, and others, the more prepared we are to take on life’s (and work’s) greatest challenges. It is precisely because the humanities are subversive that we need to keep them at the heart of the curriculum. Philosophical, literary, and even spiritual works are what pick at the underpinnings of every political, technological, and scientific belief.

While science clarifies and distills and teaches us a great deal about ourselves, the humanities remind us how easily we are fooled by science. The humanities remind us that although we are all humans, humans are each unique. Humans are unpredictable. Science is about answers and the humanities are about questions. Science is the what and the humanities are the why.

If we do our jobs well in the humanities, we will have generations to come of thinkers who question science, technology, engineering, and math.

And that is as it should be.

I welcome discussion about this or any other topic. I am happy to engage via comment or reply. Thanks for reading.