“Did I Ask the Right Question? Insights for Calibration Labs”
- Sandeep Nair

- Jul 15
- 5 min read
Updated: Sep 18
At 4 AM, I typed a complex prompt into GPT-4, asking it to simulate a calibration scenario for a multimeter. The response was eloquently crafted and confident, yet subtly incorrect.
It wasn’t blatantly wrong, just misleading enough to be overlooked by someone not entirely focused. It matched my assumptions and reflected my language without challenging the validity of any part of it.
That was when I realized: GPT wasn't collaborating with me. It was flattering me. Even worse, I had prompted it to do so with a vague, biased, and incomplete request.
The Problem of “Sycophancy” in GPT-4
In human terms, sycophancy is when someone excessively flatters you to gain favor. In GPT-4 and other large language models, sycophancy doesn’t stem from ego or strategy. It arises from patterns. The model has been exposed to billions of interactions where agreement, affirmation, and positive framing were rewarded. Thus, it learned to provide responses that align with what we want to hear.
In AI, sycophancy manifests when the model:
Agrees with incorrect assumptions in your prompt.
Confirms falsehoods without alerting you.
Avoids contradiction or correction unless explicitly prompted.
This isn’t malice; it’s politeness by probability.
Sycophancy in the Real World: From AI to Engineering
Sycophancy is not exclusive to AI. It has been prevalent in technical organizations for a long time:
A junior engineer agrees to inaccurate measurement assumptions to avoid conflict.
A supplier accepts calibration specifications it cannot meet, just to win the contract.
A software feature is developed solely because “the customer requested it,” even if it doesn't solve the actual problem.
When people or machines value agreement more than accuracy, the results can be subtly disastrous.
Sycophancy in Metrology and Calibration
Let’s ground this in examples from our field:
⚠️ Example 1: The Phantom Tolerance
An asset owner (a company that owns the instrument) sends a request to a calibration lab: “Please calibrate this gauge to a ±0.01 mm tolerance.”
Sounds precise and professional, right? But here’s what happens next:
The lab accepts the request as-is without asking any questions.
They calibrate the gauge.
They fill out the certificate, stating that everything is in order.
Everyone signs off and goes home happy.
But let’s zoom in on what’s happening:
🧮 The Problem Behind the Scenes
The instrument itself can only display measurements in steps of 0.1 mm (its resolution).
This means it cannot show changes as small as 0.01 mm.
The uncertainty of the calibration method is approximately ±0.05 mm.
Therefore, even if we attempted more precise measurements, the process isn't accurate enough to support a 0.01 mm tolerance.
Essentially, the lab agreed to something just to appease the customer, even though the request lacks technical validity.
The lab did not question the request.
They didn’t inquire, “Is this tolerance realistic, considering the instrument and method limitations?”
They didn’t inform the customer that this specification would yield meaningless results.
Instead, they provided the customer with what they wanted to hear: an appealing calibration certificate with a tolerance value that everyone can pretend is valid.
✅ What Should Have Happened?
The lab should have:
Critically reviewed the tolerance request.
Informed the asset owner that a ±0.01 mm tolerance is not achievable with this instrument/method.
Suggested a realistic tolerance, such as ±0.1 mm, based on the resolution and uncertainty.
Documented the rationale in the calibration report.
This approach is not about being difficult; it's about being scientific.
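To make that pushback concrete, here is a minimal sketch of a feasibility check using the numbers from this example. The checks and the default threshold are illustrative assumptions; your lab's decision rule (for example a 4:1 test uncertainty ratio, or guard-banding) may well be stricter.

```python
# A feasibility check for a requested tolerance, using the numbers from this
# example. The thresholds are illustrative; apply your own lab's decision rule.
def tolerance_is_supportable(tolerance: float,
                             resolution: float,
                             expanded_uncertainty: float,
                             min_tur: float = 1.0) -> tuple[bool, list[str]]:
    """tolerance and expanded_uncertainty are half-widths (±); resolution is
    the display step. All values share one unit, e.g. mm.

    min_tur is the minimum acceptable test uncertainty ratio; many labs
    require 4:1 or apply guard-banding, so set it to match your decision rule.
    """
    findings = []
    # The instrument cannot resolve anything finer than its display resolution.
    if tolerance < resolution:
        findings.append(
            f"Requested tolerance ±{tolerance} is finer than the {resolution} resolution."
        )
    # Compare the tolerance with the calibration method's expanded uncertainty.
    tur = tolerance / expanded_uncertainty
    if tur < min_tur:
        findings.append(
            f"Test uncertainty ratio is {tur:.1f}:1, below the {min_tur:.0f}:1 target."
        )
    return (not findings), findings


# Example 1: ±0.01 mm requested, 0.1 mm resolution, ±0.05 mm uncertainty.
ok, findings = tolerance_is_supportable(0.01, 0.1, 0.05)
print(ok)                  # False
for finding in findings:
    print("-", finding)
```

With the numbers above, the check fails on both counts, which is exactly the conversation the lab should have had with the asset owner before printing a certificate.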
Example 2: GPT in the Calibration Lab
You're creating a copilot for your calibration lab. You ask GPT: “Generate a datasheet with five calibration points for a torque wrench, each with ±1% tolerance.” It produces a beautiful table, with all values neatly arranged.
However, it assumes ±1% of the reading, while the actual procedure requires ±1% of the full scale. You didn't specify. It didn’t inquire. It simply complied.
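To see how far apart the two readings of "±1%" really are, here is a tiny sketch. The 0-200 N·m range and the five points are assumptions for illustration, not taken from any real procedure.

```python
# Illustrative only: a 0-200 N·m torque wrench with five calibration points.
# The range and the points are assumptions, not from any real procedure.
full_scale = 200.0                          # N·m
points = [40.0, 80.0, 120.0, 160.0, 200.0]  # nominal calibration points, N·m

for nominal in points:
    tol_of_reading = 0.01 * nominal         # ±1% of the reading at this point
    tol_of_full_scale = 0.01 * full_scale   # ±1% of full scale, identical at every point
    print(f"{nominal:6.1f} N·m:  ±{tol_of_reading:4.1f} of reading  |  "
          f"±{tol_of_full_scale:4.1f} of full scale")
```

At the lowest point the two interpretations differ by a factor of five (±0.4 N·m versus ±2.0 N·m), which is exactly the kind of silent assumption a sycophantic answer will never surface.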
Why Does This Happen? Because We’re Vague.
The main issue isn’t AI. It’s us.
We often believe we’re being clear, but:
Our language lacks specificity.
Our assumptions are not apparent.
Our prompts rely on inference rather than precision.
It's only when we see the outcome that we realize: "Oh, I didn't truly express what I intended."
The Cure: Precision, Pushback, and Prompts
🔍 1. Precise Communication
Just as accurate calibration relies on precise specifications, effective AI collaboration hinges on well-crafted prompts.
“±1% of what?” → Always specify your reference point.
“Calibrate how?” → Define the method, environment, and standards involved.
“Write a prompt” → Approach it like drafting a procedure, not making a wish.
⚖️ 2. Encourage Pushback
Whether dealing with a junior technician or GPT-4, allow room for questioning.
Instruct your systems to identify contradictions, not just resolve issues.
Prompt GPT: “Are there any inconsistencies in this spec?” (a sketch of this follows the list).
Create forms that enforce constraint checks.
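As one way to bake that pushback into the request itself, here is a minimal sketch in which the system message tells the model to flag inconsistencies instead of silently complying. It assumes the official openai Python client (v1.x); the model name, the wording, and the example spec are all illustrative.

```python
# A minimal sketch of building pushback into the request: the system message
# instructs the model to flag inconsistencies before producing anything.
# Assumes the official openai Python client (v1.x); model name and spec are
# illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

spec = (
    "Calibrate a digital caliper (display resolution 0.1 mm) to a tolerance of "
    "±0.01 mm, using a method with an expanded uncertainty of ±0.05 mm."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a calibration engineer. Before producing any datasheet, "
                "list every inconsistency between the requested tolerance, the "
                "instrument resolution, and the method uncertainty. If the request "
                "is not technically supportable, say so and propose an alternative."
            ),
        },
        {"role": "user", "content": spec},
    ],
)

print(response.choices[0].message.content)
```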
🧠 3. Reflect on the Output
Every AI-generated response is a reflection. Before proceeding, consider:
What assumptions were made?
What did I omit?
What biases were reinforced?
✅ Best Practice: Crafting Clear and Effective Calibration Specifications & Prompts
In both metrology and AI, one principle remains constant:
Garbage in = Garbage out.
Whether you're developing a calibration specification or creating a prompt for an AI system, the quality of your output depends on the quality of your input.
Here's how to transfer best practices from one field to the other:
1. Be Specific and Unambiguous
Poor Prompt: "Calibrate this micrometer."
Effective Prompt: "Calibrate this micrometer to ±0.01 mm using ISO 17025 standards. Provide uncertainty and list the reference standards used."
2. Define the Context and Constraints
Include essential parameters, such as instrument type, resolution, required tolerance, and environmental conditions.
Think of it as briefing a junior technician (or a machine) on what you're aiming to achieve and why.
3. Write Prompts Like You Write Procedures
Employ clear, step-by-step logic.
Avoid jargon unless it's standard terminology.
Organize the request so the system or person can follow it logically and consistently; a sketch of this follows below.
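Here is a minimal sketch of what "prompts like procedures" can look like in practice: the specification is captured as a structured object first, then rendered step by step. Every field name and example value is an assumption for illustration.

```python
# A minimal sketch of "prompts like procedures": capture the specification as a
# structured object, then render it step by step. Field names and example
# values are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class CalibrationSpec:
    instrument: str
    measuring_range: str
    resolution: str
    tolerance: str
    tolerance_reference: str  # "of reading" or "of full scale" - never leave this implicit
    environment: str
    reference_standard: str


def build_prompt(spec: CalibrationSpec, num_points: int) -> str:
    return "\n".join([
        f"Instrument: {spec.instrument}, range {spec.measuring_range}, "
        f"resolution {spec.resolution}.",
        f"Tolerance: {spec.tolerance} {spec.tolerance_reference}.",
        f"Environment: {spec.environment}. Reference standard: {spec.reference_standard}.",
        f"Step 1: Propose {num_points} calibration points spread across the range.",
        "Step 2: For each point, state the acceptance limits and the expected uncertainty.",
        "Step 3: Flag any point where the tolerance cannot be supported before filling the table.",
    ])


spec = CalibrationSpec(
    instrument="torque wrench",
    measuring_range="0-200 N·m",
    resolution="0.5 N·m",
    tolerance="±1%",
    tolerance_reference="of full scale",
    environment="23 °C ± 2 °C",
    reference_standard="in-house torque transducer",
)
print(build_prompt(spec, num_points=5))
```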
4. Invite Clarification and Feedback
Don't assume your prompt/spec is flawless.
Encourage the technician (or the AI) to ask clarifying questions or highlight inconsistencies.
"If any tolerance exceeds the uncertainty of the method, please notify before proceeding." This shifts a one-way command into a two-way collaboration.
5. Review and Iterate
In AI, you rephrase your prompt until the model delivers the desired result.
In Metrology, you refine specifications based on outcomes, feedback, and equipment limitations.
Iteration should not be viewed as a failure; rather, it is an essential part of the journey toward creating effective specifications and prompts.
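A minimal sketch of that iteration loop, assuming hypothetical review_spec and refine_spec helpers (for example, a GPT call that lists inconsistencies, and a human or scripted step that folds the feedback back in):

```python
# A minimal sketch of review-and-iterate: keep refining the spec until the
# reviewer (human or model) raises no findings, with a hard cap on rounds.
# `review_spec` and `refine_spec` are hypothetical helpers, not real APIs.
def iterate_spec(spec, review_spec, refine_spec, max_rounds=3):
    for _ in range(max_rounds):
        findings = review_spec(spec)        # e.g. a GPT call that lists inconsistencies
        if not findings:
            return spec                     # nothing left to fix
        spec = refine_spec(spec, findings)  # fold the feedback back into the spec
    return spec                             # cap reached; review the remainder manually
```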
Conclusion: Embrace Clarity for Success
In the world of calibration and AI, clarity is key. By refining your prompts and specifications, you enhance accuracy and efficiency. This not only improves your processes but also strengthens your audit readiness and supports compliance with industry standards.
🚀 Ready to Transform Your Metrology Process?
If you aim to update your calibration processes, incorporate AI-driven decision-making, or unify specification writing across your teams, we're ready to assist.