Skip to content
Back to Blog
Laravel Debugging Production Multi-Tenant

When ->first() Lies: Debugging a Silent Data Isolation Bug in Production

Nur Ikhwan Idris · · 8 min read

This bug had all the hallmarks of a nightmare: it was silent, it was intermittent, it only appeared under a specific combination of multi-tenant conditions, and it was happily writing wrong data to invoices that had already been submitted to a government API.

The root cause? A ->first() call that was returning the wrong record — quietly, without errors, without warnings — because the uniqueness constraint we assumed existed in the data simply didn't.


1. The System

We run an e-invoicing platform where multiple companies (issuers) can register buyers and submit invoices to Malaysia's LHDN MyInvois API. Each invoice must include both the supplier's and the buyer's identity information — specifically their TIN (Tax Identification Number), which LHDN calls a profile_id.

The buyers table stores buyer records per issuer. When a buyer is first encountered during invoice processing, we auto-register them — storing their profile_id and linking them to the issuer's company_id. In subsequent invoices, we look them up by profile_id.

The lookup looked like this:

// InvoiceHelper.php
$buyer = Buyer::where('profile_id', $profileId)
              ->where('company_id', $companyId)
              ->first();

Completely reasonable. Clean. And silently wrong.


2. The Bug

Under normal operating conditions, one buyer has one profile_id per issuer. But we had a race condition in our job queue. Two invoice processing jobs for the same issuer could run concurrently, both encounter the same new buyer, both check if (!$buyer), and both proceed to insert — resulting in two Buyer rows with the same profile_id and company_id.

Once the duplicate existed, the next invoice for that buyer would call ->first() and get whichever record the database returned first — which is not guaranteed to be consistent. MySQL will happily hand you either row depending on the query plan, buffer state, or the phase of the moon.

The result: invoices being submitted to LHDN with a buyer's profile pulled from the wrong Buyer record. In practice this meant incorrect registration details appearing on validated government invoices. The kind of data error that's embarrassing at best and legally problematic at worst.

The bug was invisible in development and staging because both environments had far lower concurrency. The race window was around 50–100ms — small enough to almost never happen in dev, common enough to be a real problem under production load.

3. Why We Couldn't Just Add a Unique Index

The obvious database fix — add a unique index on (profile_id, company_id) — was off the table. Our invoices table has over 100 million rows, many with foreign keys into buyers. A schema migration that alters constraint behaviour on a table this size requires:

  • Deduplication of existing violating rows before the index can be created
  • A maintenance window or an online DDL approach (pt-online-schema-change, gh-ost)
  • Careful coordination of which duplicate row gets kept and which gets its FK references updated

None of that is quick, and some of it is risky. We needed to stop the bleeding in production immediately while we planned the longer-term schema fix. So we chose Option B: no schema change, fix the application layer.


4. The Fix: Three Conflict Guards

The fix has three parts, all following the same pattern: detect ambiguity, throw immediately, log the buyer IDs for manual resolution. No silent fallback, no picking one arbitrarily — fail loudly so the problem surfaces and gets fixed.

Guard 1: Prevention at Registration

In BuyerHelper::autoRegisterBuyer(), before inserting a new buyer, we check whether a record with that profile_id already exists under the same issuer. If it does, we throw instead of inserting a second one.

// BuyerHelper.php
public static function autoRegisterBuyer(string $profileId, int $companyId, array $details): Buyer
{
    $existing = Buyer::where('profile_id', $profileId)
                     ->where('company_id', $companyId)
                     ->get();

    if ($existing->count() > 1) {
        // Already ambiguous — refuse to add another
        Log::critical('[autoRegisterBuyer] profile_id conflict', [
            'profile_id' => $profileId,
            'company_id' => $companyId,
            'buyer_ids'  => $existing->pluck('id'),
        ]);
        throw new \RuntimeException(
            "Ambiguous buyer: profile_id {$profileId} has {$existing->count()} records under company {$companyId}"
        );
    }

    if ($existing->count() === 1) {
        return $existing->first(); // Normal path — buyer already registered
    }

    // Safe to insert
    return Buyer::create(array_merge(['profile_id' => $profileId, 'company_id' => $companyId], $details));
}

Note the use of ->get() + ->count() instead of ->first(). We need to know how many records exist, not just whether one does. This pattern replaces every ambiguous lookup in the fix.

Guard 2: Prevention at Job Dispatch

In ProcessPendingInvoicesJob, before the job writes any buyer association to an invoice, it performs its own conflict check. This is the job that was racing with itself.

// ProcessPendingInvoicesJob.php
private function resolveOrAbortBuyer(string $profileId, int $companyId): Buyer
{
    $buyers = Buyer::where('profile_id', $profileId)
                   ->where('company_id', $companyId)
                   ->get();

    if ($buyers->count() !== 1) {
        Log::critical('[ProcessPendingInvoicesJob] profile_id conflict', [
            'profile_id' => $profileId,
            'company_id' => $companyId,
            'count'      => $buyers->count(),
            'buyer_ids'  => $buyers->pluck('id'),
        ]);
        throw new \RuntimeException(
            "Cannot process invoice: ambiguous buyer profile_id {$profileId}"
        );
    }

    return $buyers->first();
}

Throwing here means the job fails and goes to the failed jobs table, where it can be retried after the duplicate is manually resolved. No silent bad data written.

Guard 3: Detection at Submission

The third guard sits in InvoiceHelper::getSupplierAndBuyer() — the method that assembles the final invoice data for LHDN submission. This is the last line of defence before data leaves our system.

// InvoiceHelper.php  (before)
$buyer = Buyer::where('profile_id', $profileId)
              ->where('company_id', $companyId)
              ->first(); // 🐛 silently returns wrong record if duplicates exist

// InvoiceHelper.php  (after)
$buyers = Buyer::where('profile_id', $profileId)
               ->where('company_id', $companyId)
               ->get();

if ($buyers->count() !== 1) {
    Log::critical('[getSupplierAndBuyer] Ambiguous buyer', [
        'profile_id' => $profileId,
        'company_id' => $companyId,
        'buyer_ids'  => $buyers->pluck('id'),
    ]);
    throw new \RuntimeException(
        "Submission aborted: ambiguous buyer for profile_id {$profileId}"
    );
}

$buyer = $buyers->first(); // Now safe — exactly one result confirmed

5. Finding Existing Violations

The guards prevent new duplicates from being created and stop bad data from being submitted. But what about duplicates that already exist in the database? You need to find them before the next submission attempt surfaces them as errors.

SELECT
    profile_id,
    company_id,
    COUNT(*) AS duplicate_count,
    GROUP_CONCAT(id ORDER BY id) AS buyer_ids
FROM buyers
WHERE profile_id IS NOT NULL
GROUP BY profile_id, company_id
HAVING COUNT(*) > 1
ORDER BY duplicate_count DESC;

For each row returned: one of the buyer records needs to be designated the canonical one. The others should have their foreign key references in other tables re-pointed to the canonical record, then be deleted or soft-deleted. This is manual work — but the query tells you exactly which records need attention.

After the Fix: What to Monitor

With the three guards deployed, set up alerts on these log patterns:

# These should never fire after cleanup. If they do, a new duplicate was created.
[getSupplierAndBuyer] Ambiguous buyer
[ProcessPendingInvoicesJob] profile_id conflict
[autoRegisterBuyer] profile_id conflict

We monitor these with a simple log watcher. Any occurrence triggers a Slack alert. In the weeks since deployment, they've been silent — which means the guards are working and the race condition hasn't recurred.


6. The Actual Root Fix

The application-layer guards are the right immediate response, but they're not the permanent fix. The permanent fix is what we should have had from day one: a unique index on (profile_id, company_id) in the buyers table.

Once we've cleaned up all existing duplicates (identified by the query above), the migration is straightforward:

// Migration
Schema::table('buyers', function (Blueprint $table) {
    $table->unique(['profile_id', 'company_id'], 'buyers_profile_company_unique');
});

At that point, the database enforces the constraint at the storage layer, and the application guards become a second layer of protection rather than the only layer.


Takeaways

  1. ->first() doesn't guarantee uniqueness. It returns one row from however many match the query. If you're relying on business logic uniqueness that isn't enforced by a database constraint, you're one race condition away from silent data corruption.
  2. In multi-tenant systems, uniqueness is scoped. profile_id unique across the whole table isn't the constraint you need — (profile_id, company_id) unique per tenant is. Think in tenant scope when designing indexes.
  3. Fail loudly at every layer. When you detect ambiguity, throw and log. Don't silently pick one record and carry on — you'll never know there was a problem until someone gets the wrong invoice.
  4. Application guards are temporary; schema constraints are permanent. The three guards we deployed are correct and will catch future violations. But the goal is to make them unreachable by enforcing uniqueness at the database level.
  5. Race conditions don't show up in dev. Low-concurrency environments mask this entire class of bug. If your production load is significantly higher than dev, write tests that simulate concurrent execution for any code that checks-then-inserts.

The version of this bug that's hardest to catch is the one that doesn't throw, doesn't log, and writes wrong data confidently for weeks before anyone notices. ->first() is not that version — but it becomes that version the moment you assume uniqueness that the database doesn't enforce.

Questions or corrections? Reach out via the contact section of my portfolio.