bilig

I keep running into the same spreadsheet automation failure.

The screenshot looks fine. The total changed. The agent says it edited the right input. Then you ask the boring questions and the story falls apart.

That is the point where a screenshot stops being evidence.

I maintain bilig. The public package is @bilig/headless. It is a small TypeScript runtime for workbook-shaped business logic in Node services and agent tools.

It is not an Excel clone. The promise is smaller: build or load a workbook, write inputs, recalculate formulas, read the answer back, and save the state as JSON.

The bug

Spreadsheet UIs are good for people. They are a weak runtime boundary for code.

If an agent has to click cells and inspect pixels, it can easily produce a plausible-looking result without proving the workbook state is right. The grid does not tell you whether hidden formulas moved, whether a structural edit retargeted references, or whether the saved document still round-trips.

For backend jobs and agent tools, the rendered spreadsheet should be inspection only. The contract should be the workbook state.

The boundary I want

A useful workbook API should let code do this without opening a browser:

  1. build or load sheets
  2. write typed values and formulas
  3. recalculate after edits
  4. read exact values back
  5. export a workbook document
  6. restore that document and check the same result again

That gives an agent a loggable operation instead of a screenshot and a shrug.
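One way to picture a "loggable operation" is a single record per edit that names the cell written, the cell read back, and the values seen at each step. The sketch below is illustrative only: `EditLogEntry`, `buildEditLogEntry`, and every field name are hypothetical and not part of @bilig/headless, and the rule used for `verified` is an assumption.

```typescript
// Hypothetical log record for one workbook edit. None of these names come
// from @bilig/headless; they only illustrate what "loggable" could mean.
type CellRef = { sheet: string; row: number; col: number }
type Scalar = number | string | boolean | null

interface EditLogEntry {
  op: 'setCellContents'
  target: CellRef // input cell that was written
  written: Scalar // value that was written
  watched: CellRef // formula cell that was read back
  before: Scalar // watched value before the edit
  after: Scalar // watched value after recalculation
  afterRestore: Scalar // watched value after export and restore
  verified: boolean // assumed rule: the edit changed the value and the restore preserved it
}

function buildEditLogEntry(fields: Omit<EditLogEntry, 'op' | 'verified'>): EditLogEntry {
  const verified = fields.after === fields.afterRestore && fields.after !== fields.before
  return { op: 'setCellContents', ...fields, verified }
}

// Sample values only; a real entry would be filled from actual readbacks.
const entry = buildEditLogEntry({
  target: { sheet: 'Inputs', row: 1, col: 1 },
  written: 32,
  watched: { sheet: 'Summary', row: 1, col: 1 },
  before: 24000,
  after: 38400,
  afterRestore: 38400,
})

// One JSON line per edit is easy to append to a job log and to diff later.
console.log(JSON.stringify(entry))
```

A record like this can be stored next to the job output, which a screenshot cannot.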

Run the small proof

This starts from an empty directory and uses the published npm package:

mkdir bilig-workpaper-eval
cd bilig-workpaper-eval
npm init -y
npm pkg set type=module
npm install @bilig/headless
npm install -D tsx typescript @types/node
curl -fsSLo eval.ts https://proompteng.github.io/bilig/npm-eval.ts
npx tsx eval.ts

Expected shape:

{
  "before": 24000,
  "after": 38400,
  "afterRestore": 38400,
  "sheets": ["Inputs", "Summary"],
  "bytes": 1000,
  "verified": true
}

The byte count can move between releases. The important part is "verified": true: the script edits one input, reads the recalculated value, saves WorkPaper JSON, restores it, and gets the same value again.
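If the quickstart runs in CI, the printed JSON can be checked mechanically instead of by eye. A minimal sketch, assuming the output shape shown above: `checkEvalOutput` is a hypothetical helper, not part of @bilig/headless, and the rules it applies are inferred from the description here, so the script itself may compute `verified` differently.

```typescript
// Shape of the JSON printed by the quickstart, as shown above.
interface EvalOutput {
  before: number
  after: number
  afterRestore: number
  sheets: string[]
  bytes: number
  verified: boolean
}

// Hypothetical helper: returns a list of problems, empty when the output holds up.
function checkEvalOutput(raw: string): string[] {
  const out = JSON.parse(raw) as Partial<EvalOutput>
  const problems: string[] = []
  if (out.verified !== true) problems.push('verified is not true')
  if (out.after === out.before) problems.push('the edit did not change the result')
  if (out.afterRestore !== out.after) problems.push('restored value differs from recalculated value')
  const sheets = Array.isArray(out.sheets) ? out.sheets : []
  if (!sheets.includes('Inputs') || !sheets.includes('Summary')) problems.push('expected sheets are missing')
  // bytes is deliberately not compared: it can move between releases.
  return problems
}

const sample = `{
  "before": 24000,
  "after": 38400,
  "afterRestore": 38400,
  "sheets": ["Inputs", "Summary"],
  "bytes": 1000,
  "verified": true
}`

console.log(checkEvalOutput(sample)) // []
```

The numbers themselves follow from the workbook: 20 customers at 1200 gives 24000, and 32 customers at 1200 gives 38400.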

What the API looks like

Here is the whole shape without the surrounding quickstart script:

import {
  WorkPaper,
  createWorkPaperFromDocument,
  exportWorkPaperDocument,
  parseWorkPaperDocument,
  serializeWorkPaperDocument,
} from '@bilig/headless'

const workbook = WorkPaper.buildFromSheets({
  Inputs: [
    ['Metric', 'Value'],
    ['Customers', 20],
    ['Average revenue', 1200],
  ],
  Summary: [
    ['Metric', 'Value'],
    ['Revenue', '=Inputs!B2*Inputs!B3'],
  ],
})

const inputs = workbook.getSheetId('Inputs')
const summary = workbook.getSheetId('Summary')
if (inputs === undefined || summary === undefined) {
  throw new Error('Workbook did not create the expected sheets')
}

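// Addresses are zero-based: row 1, col 1 is cell B2.
const before = workbook.getCellValue({ sheet: summary, row: 1, col: 1 })
// Overwrite Inputs!B2 (Customers) with 32, then read the recalculated Summary!B2.
workbook.setCellContents({ sheet: inputs, row: 1, col: 1 }, 32)
const after = workbook.getCellValue({ sheet: summary, row: 1, col: 1 })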

const saved = serializeWorkPaperDocument(exportWorkPaperDocument(workbook, { includeConfig: true }))
const restored = createWorkPaperFromDocument(parseWorkPaperDocument(saved))
const restoredSummary = restored.getSheetId('Summary')
if (restoredSummary === undefined) {
  throw new Error('Restored workbook did not create the Summary sheet')
}

console.log({
  before,
  after,
  afterRestore: restored.getCellValue({ sheet: restoredSummary, row: 1, col: 1 }),
  sheets: restored.getSheetNames(),
})

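That is the loop I want exposed to agents: edit one input, read the recalculated value, save the document, restore it, and read the same value again.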
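For an agent tool, the edit-and-read part of that loop can sit behind one narrow function. The sketch below is written against a structural interface modelled on the calls in the listing; `WorkbookPort`, `setInputAndRead`, and `FakeWorkbook` are hypothetical names, and whether the real `WorkPaper` class satisfies this interface as written is an assumption I have not checked. The fake only exists so the sketch runs without the package installed.

```typescript
// Structural types modelled on the calls in the listing above. SheetId is
// generic because the listing does not show what getSheetId returns.
type CellAddress<SheetId> = { sheet: SheetId; row: number; col: number }
type NamedCell = { sheet: string; row: number; col: number }

interface WorkbookPort<SheetId> {
  getSheetId(name: string): SheetId | undefined
  getCellValue(address: CellAddress<SheetId>): unknown
  setCellContents(address: CellAddress<SheetId>, content: unknown): void
}

// Write one input and return the watched output before and after the edit.
function setInputAndRead<SheetId>(
  workbook: WorkbookPort<SheetId>,
  input: NamedCell,
  value: unknown,
  output: NamedCell,
): { before: unknown; after: unknown } {
  const resolve = (cell: NamedCell): CellAddress<SheetId> => {
    const sheet = workbook.getSheetId(cell.sheet)
    if (sheet === undefined) throw new Error(`Unknown sheet: ${cell.sheet}`)
    return { sheet, row: cell.row, col: cell.col }
  }
  // Resolve both addresses first so a bad sheet name fails before any write.
  const inputAddress = resolve(input)
  const outputAddress = resolve(output)
  const before = workbook.getCellValue(outputAddress)
  workbook.setCellContents(inputAddress, value)
  const after = workbook.getCellValue(outputAddress)
  return { before, after }
}

// In-memory stand-in, NOT the real engine. It hardcodes the one formula
// from the listing: Summary!B2 = Inputs!B2 * Inputs!B3.
class FakeWorkbook implements WorkbookPort<number> {
  private inputs = new Map<string, number>([['1,1', 20], ['2,1', 1200]])

  getSheetId(name: string): number | undefined {
    return name === 'Inputs' ? 0 : name === 'Summary' ? 1 : undefined
  }

  getCellValue({ sheet, row, col }: CellAddress<number>): unknown {
    if (sheet === 1) {
      const product = (this.inputs.get('1,1') ?? 0) * (this.inputs.get('2,1') ?? 0)
      return row === 1 && col === 1 ? product : undefined
    }
    return this.inputs.get(`${row},${col}`)
  }

  setCellContents({ sheet, row, col }: CellAddress<number>, content: unknown): void {
    if (sheet !== 0 || typeof content !== 'number') throw new Error('Fake only accepts numeric inputs')
    this.inputs.set(`${row},${col}`, content)
  }
}

const result = setInputAndRead(
  new FakeWorkbook(),
  { sheet: 'Inputs', row: 1, col: 1 },
  32,
  { sheet: 'Summary', row: 1, col: 1 },
)
console.log(result) // { before: 24000, after: 38400 }
```

Resolving sheet names inside the function means the agent passes names it can see in the workbook, and a misspelled sheet is rejected before anything is written.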

Where it fits

This is useful when a workbook is really business logic.

It is a bad fit when the job is human collaboration, macros, chart fidelity, or full desktop Excel behavior. For those jobs, use Excel, Google Sheets, or a mature spreadsheet UI product.

It is also not automatically better than HyperFormula, ExcelJS, or SheetJS. Those are the first tools to check when you mainly need a broad formula engine or spreadsheet file reading/writing. Bilig is for the narrower case where the Node code owns the workbook state and needs formula readback plus JSON persistence.

Current evidence

The benchmark claim is deliberately narrow. The checked artifact currently records 100/100 mean-latency wins against HyperFormula-style comparable workloads and 100/100 workloads winning both mean and p95. The worst p95 row is called out instead of hidden.
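To make "wins on mean" and "wins on both mean and p95" concrete, here is a small sketch of how such a tally could be computed from raw latency samples. It is not the project's benchmark harness: the function names, the nearest-rank percentile method, and the sample numbers are all assumptions made for illustration.

```typescript
// Latency samples in milliseconds for one workload, one array per engine.
interface WorkloadResult {
  name: string
  candidate: number[]
  baseline: number[]
}

function mean(samples: number[]): number {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100))
  return sorted[rank - 1] ?? NaN
}

// Count workloads where the candidate is faster on mean, and on both mean and p95.
function scoreWorkloads(results: WorkloadResult[]): { total: number; meanWins: number; bothWins: number } {
  let meanWins = 0
  let bothWins = 0
  for (const r of results) {
    const winsMean = mean(r.candidate) < mean(r.baseline)
    const winsP95 = percentile(r.candidate, 95) < percentile(r.baseline, 95)
    if (winsMean) meanWins += 1
    if (winsMean && winsP95) bothWins += 1
  }
  return { total: results.length, meanWins, bothWins }
}

// Invented numbers: the second workload wins on mean but has one slow outlier.
const score = scoreWorkloads([
  { name: 'steady', candidate: [1, 1, 1, 1, 1], baseline: [2, 2, 2, 2, 2] },
  { name: 'spiky', candidate: [1, 1, 1, 1, 9], baseline: [3, 3, 3, 3, 3] },
])
console.log(score) // { total: 2, meanWins: 2, bothWins: 1 }
```

The second workload shows why the two counts are reported separately: a single slow sample leaves the mean ahead while the tail is worse.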

Benchmark note: https://github.com/proompteng/bilig/blob/main/docs/what-workpaper-benchmark-proves.md

Compatibility limits: https://github.com/proompteng/bilig/blob/main/docs/where-bilig-is-not-excel-compatible-yet.md

Try it or reject it

The best feedback is a concrete rejection reason.
