Table Encoding - FromThePage Blog

This describes support for encoding, display, and export of tabular data appearing within documents (requested & funded by Anna Agbe-Davies and the UNC-Chapel Hill Anthropology Department). This feature allows Markdown encoding of tables appearing within documents, and enhances the semantic division of texts into sections.

Example from Volume 2, Notebook 2, Page 45 of the Jeremiah White Graves accounts, bottom:

==Mrs Hendricks Abram==
| Date | Note | Amount |
| ----| -------| ---- |
| Nov 13th 1845 | Cr by 179lb Hay @ 2/3 | 0.65 |
| " " | 53lb fodder @ 3/- | 0.26 |
| " | 1/2 bbl corn @ $3/- | 1.50 |

A table consists of an optional section title, a table header, a separator line, and table data. These are encoded as follows:

Section titles are surrounded by equals signs. The run of equals signs on each side (minimum of 2, maximum of 6) grows longer to indicate greater importance:

==section title==
===more important title===
====even more important title====
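
As a rough illustration of the title rule, a line can be recognized with a regular expression that requires the same run of two to six equals signs on both sides. This is only a sketch in Python, not FromThePage's own parser; the names `SECTION_TITLE` and `parse_section_title` are hypothetical.

```python
import re

# Hypothetical pattern (not taken from FromThePage): 2 to 6 equals signs,
# a title that neither starts nor ends with "=", then the same run again.
SECTION_TITLE = re.compile(r"^(={2,6})(?!=)\s*(.+?)\s*(?<!=)\1$")


def parse_section_title(line):
    """Return (marker_length, title) for a section title line, else None."""
    match = SECTION_TITLE.match(line.strip())
    if match is None:
        return None
    return len(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_section_title("==Mrs Hendricks Abram=="))
    print(parse_section_title("| Date | Note | Amount |"))
```

Run as a script, this prints `(2, 'Mrs Hendricks Abram')` for the title line and `None` for the table heading line.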

Table headings are words or phrases separated by a pipe ("|") sign. Headings live on a single line and are followed by a heading separator line:

| First Column | Second Column | Third Column |

Table heading separators are three or more dashes. Good form suggests that they also use pipe signs, mirroring the table headers and cells:

| -------- | -------------- | ------------- |

The function of the table heading separators is to differentiate a heading from a data row, and to give an additional hint to the parser that it is processing a table.
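
To make the separator's role concrete, here is a minimal Python sketch of how a parser might use it to locate a table heading. It assumes a line counts as a separator when, ignoring pipes, it consists only of runs of three or more dashes; `is_separator_line` and `find_table_header` are hypothetical names, not part of FromThePage.

```python
import re


def is_separator_line(line):
    """True when the line, ignoring pipes, holds only runs of 3+ dashes."""
    tokens = line.replace("|", " ").split()
    return bool(tokens) and all(re.fullmatch(r"-{3,}", t) for t in tokens)


def find_table_header(lines):
    """Return the index of the first pipe-delimited line that is followed
    directly by a separator line, or None when there is no such line."""
    for i in range(len(lines) - 1):
        if "|" in lines[i] and is_separator_line(lines[i + 1]):
            return i
    return None


if __name__ == "__main__":
    page = [
        "==Mrs Hendricks Abram==",
        "| Date | Note | Amount |",
        "| ----| -------| ---- |",
        "| Nov 13th 1845 | Cr by 179lb Hay @ 2/3 | 0.65 |",
    ]
    print(find_table_header(page))
```

For the Graves example, the sketch reports index 1, the `| Date | Note | Amount |` line, as the heading. A pipe-delimited line that is not followed immediately by a separator is not treated as a heading.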

Do not leave empty lines between the heading, separator, and data rows. Doing so will result in cell sizes that do not match the rest of the table.

Table data rows consist of cells separated by pipes, just as table headings do. They differ from headings only in that they occur after a separator, as the three data rows in the Graves example above do.

If there is no data in a cell, still type the pipes on either side of it, with only spaces in between. If the last cell in a row does not have data, make sure to put a closing pipe after the placeholder spaces.
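
The following Python sketch shows why the closing pipe matters when the last cell is empty. It assumes cells are obtained by dropping the outer pipes and splitting on the remaining ones; `split_cells` and `row_to_record` are hypothetical helpers, and padding short rows with blanks is an illustrative policy rather than documented FromThePage behavior.

```python
def split_cells(line):
    """Split one pipe-delimited row into trimmed cell strings."""
    text = line.strip()
    # Drop the outer pipes so they do not produce phantom cells.
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def row_to_record(headings, line):
    """Pair each heading with the cell in the same position."""
    cells = split_cells(line)
    # Pad short rows with blanks (an assumption made for this sketch).
    cells += [""] * (len(headings) - len(cells))
    return dict(zip(headings, cells))


if __name__ == "__main__":
    print(split_cells("| Nov 13th 1845 |  | 0.65 |"))
    print(split_cells("| a | b |  |"))
    print(split_cells("| a | b |"))
```

The first row yields `['Nov 13th 1845', '', '0.65']`, with the empty middle cell preserved. The second row yields `['a', 'b', '']`, while the third, which lacks the placeholder spaces and closing pipe, yields only `['a', 'b']`: the empty last cell is lost.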

When a page transcript containing tabular encoding is saved, the sections, headings, and data cells are recorded in the database, and the internal encoding used for transcript mark-up is updated with tabular mark-up. The page should then be displayed to researchers with the tables in the transcription rendered as HTML tables.

The Export feature has also been modified to extract all tables from a work into a single CSV file. This is a "sparse" table: a spreadsheet column exists for each heading used anywhere in the work, and table rows that have no entry for a particular heading show a blank cell in that column. If many different tables in a text include an "Amount" column, and that column is labeled "Amount" in each table heading, the Amount values from all of those tables will appear in the same column of the exported spreadsheet.
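
The sparse layout can be sketched in Python with the standard `csv` module. This is an illustration of the idea only, not the actual export code, and the real export may carry additional columns. The function name `export_sparse_csv` and the second table in the usage example (with an "Item" heading) are invented for the demonstration.

```python
import csv
import io


def export_sparse_csv(tables):
    """Merge tables into one CSV string.

    `tables` is a list of (headings, rows) pairs, where each row is a
    list of cell strings in the same order as its table's headings.
    """
    # Collect every heading in first-seen order, so that tables sharing
    # a heading such as "Amount" also share a spreadsheet column.
    columns = []
    for headings, _rows in tables:
        for heading in headings:
            if heading not in columns:
                columns.append(heading)

    buffer = io.StringIO()
    # restval="" leaves a blank cell wherever a row lacks a heading.
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for headings, rows in tables:
        for row in rows:
            writer.writerow(dict(zip(headings, row)))
    return buffer.getvalue()


if __name__ == "__main__":
    tables = [
        (["Date", "Note", "Amount"],
         [["Nov 13th 1845", "Cr by 179lb Hay @ 2/3", "0.65"]]),
        # Invented second table, used only to show the shared column.
        (["Item", "Amount"], [["corn", "1.50"]]),
    ]
    print(export_sparse_csv(tables), end="")
```

Both tables contribute to the single "Amount" column, while the "Date", "Note", and "Item" columns are blank for rows whose table does not use those headings.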