IsoFind - Schéma de base de données

Database Schema

IsoFind uses a local SQLite database (isofind.db) composed of 24 tables organised across 6 functional domains. This page documents the key columns, primary keys, foreign keys and main business fields. The schema is verified and migrated automatically at application startup via db_migration.py, with no loss of existing data.

IsoFind: Migration is non-destructive. Missing columns are added (ALTER TABLE ADD COLUMN), absent tables are fully recreated, and additional columns are ignored to ensure downward backward compatibility.

Samples & isotopic data

samples is the pivot table of the model (all measurements, yields and treatments are linked to it by foreign key). isotope_data stores the measured ratios per element and ratio. purification_yields records the purification yield in memory per element, avoiding manual editing of raw isotopic ratios. sample_methods and sample_pipelines instantiate analytical protocols on a given sample.

isofind.db - domain: samples
.
+-- samples                   -- main pivot table
|   +-- id                    PK
|   +-- name, material_type, sector
|   +-- project, classification
|   +-- collection_date, collection_location
|   +-- latitude, longitude
|   `-- owner_id
|
+-- isotope_data              -- raw isotopic measurements
|   +-- id                    PK
|   +-- sample_id             FK -> samples
|   +-- element, isotope_ratio
|   +-- ratio_value, uncertainty
|   +-- normalized_value, standard_used
|   `-- sample_type, collection_date
|
+-- purification_yields       -- in-memory yields per element
|   +-- id                    PK
|   +-- sample_id             FK -> samples  [UNIQUE (sample_id, element)]
|   +-- element, yield_percent
|   `-- operator, method_key, measured_at
|
+-- sample_methods            -- protocols applied to a sample
|   +-- id                    PK
|   +-- sample_id             FK -> samples
|   +-- method_id             FK -> method_library
|   +-- method_type, element, laboratory
|   `-- yield_percent, performed_date
|
`-- sample_pipelines          -- analytical pipelines applied
    +-- id                    PK
    +-- sample_id             FK -> samples
    +-- pipeline_id           FK -> method_pipelines
    +-- current_stage, status
    `-- started_at, completed_at

CRM & corrections

analyzed_crm represents a measurement session for a certified reference material (instrument, laboratory, quality score). analyzed_crm_values stores measured and certified values with their sigma deviations. crm_derived_shifts derives the corrections to apply to raw ratios. sample_crm_association explicitly links a CRM to the sample it controls.

isofind.db - domain: CRM & corrections
.
+-- analyzed_crm              -- CRM measurement session
|   +-- id                    PK
|   +-- crm_id, crm_name
|   +-- instrument, laboratory, operator
|   `-- quality_score, session_name, analysis_date
|
+-- analyzed_crm_values       -- measured vs certified values
|   +-- id                    PK
|   +-- analyzed_crm_id       FK -> analyzed_crm  [CASCADE]
|   +-- element, isotope_ratio
|   +-- measured_value, certified_value
|   `-- deviation, deviation_sigma, status
|
+-- crm_derived_shifts        -- corrections derived from CRM
|   +-- id                    PK
|   +-- analyzed_crm_id       FK -> analyzed_crm  [CASCADE]
|   +-- element, isotope_ratio
|   +-- shift_value, uncertainty
|   `-- is_active, source
|
`-- sample_crm_association    -- sample <-> CRM link
    +-- id                    PK
    +-- sample_id             FK -> samples  [CASCADE]
    +-- analyzed_crm_id       FK -> analyzed_crm  [CASCADE]
    `-- association_type        [UNIQUE (sample_id, analyzed_crm_id)]

Methods & pipelines

method_library is a library of reusable protocols (digestion, purification, separation, analysis), referenceable by local_key. method_pipelines chains methods into ordered sequences via a stages_json field. The two instantiation tables (sample_methods, sample_pipelines) link these protocols to a concrete sample.

isofind.db - domain: methods & pipelines
.
+-- method_library            -- protocol library
|   +-- id                    PK
|   +-- name, type               [digestion | purification | separation | analysis]
|   +-- element_target, equipment, laboratory
|   +-- doi, reference_text, author
|   `-- local_key               [UNIQUE] reference identifier
|
+-- method_pipelines          -- ordered method sequences
|   +-- id                    PK
|   +-- name, element
|   +-- stages_json             stage sequence
|   +-- total_duration_hours, is_template
|   `-- local_key               [UNIQUE]
|
+-- sample_methods            -- protocol instantiation on a sample
|   +-- id                    PK
|   +-- sample_id             FK -> samples  [CASCADE]
|   +-- method_id             FK -> method_library  [SET NULL]
|   +-- method_type, element, laboratory
|   `-- yield_percent, performed_date, performed_by
|
`-- sample_pipelines          -- pipeline instantiation on a sample
    +-- id                    PK
    +-- sample_id             FK -> samples  [CASCADE]
    +-- pipeline_id           FK -> method_pipelines  [CASCADE]
    +-- current_stage, status    [pending | in_progress | completed | failed]
    `-- started_at, completed_at

Isotopic standards & geographic references

isotope_standards centralises reference standards (NIST, IAEA, etc.) and custom standards. isotopic_shifts encodes shifts between standard pairs for inter-laboratory normalisation. reference_standards designates the active standard per element. geographic_zones and reference_data provide the regional geochemical background data used in isotopic correspondence.

isofind.db - domain: standards & references
.
+-- isotope_standards         -- analytical standards (NIST, IAEA, custom)
|   +-- id                    PK
|   +-- name, element, isotope_ratio
|   +-- reference_value, uncertainty
|   +-- is_reference, is_custom
|   +-- batch_number, is_default_batch
|   `-- reference_standard_id    self-ref (parent standard)
|
+-- isotopic_shifts           -- shifts between standard pairs
|   +-- id                    PK
|   +-- source_standard_id    FK -> isotope_standards
|   +-- target_standard_id    FK -> isotope_standards
|   +-- element, isotope_ratio
|   `-- shift_value, uncertainty, source
|
+-- reference_standards       -- active standard per element
|   +-- id                    PK
|   +-- element                 [UNIQUE]
|   `-- standard_id           FK -> isotope_standards
|
+-- geographic_zones          -- geochemical reference zones
|   +-- id                    PK
|   +-- name, country, region
|   +-- latitude, longitude, radius_km
|   `-- geological_context
|
`-- reference_data            -- regional geochemical background data
    +-- id                    PK
    +-- geographic_zone_id    FK -> geographic_zones
    +-- element, isotope_ratio
    +-- mean_value, std_deviation
    `-- sample_count, confidence_level, source

Publications & archive

publications centralises bibliographic references (DOI, authors, journal). Two junction tables connect it to methods (method_publication_link) and to samples (sample_publications). archive_samples and archive_isotope_data preserve a frozen snapshot of the data at the archiving date, isolated from the main graph, original_id, maintains traceability back to the source record.

isofind.db - domain: publications & archive
.
+-- publications              -- central bibliography
|   +-- id                    PK
|   +-- doi                     [UNIQUE]
|   +-- title, authors, journal, year
|   `-- category, keywords, url, pdf_path
|
+-- method_publication_link   -- method <-> publication
|   +-- id                    PK
|   +-- method_id             FK -> method_library  [CASCADE]
|   +-- publication_id        FK -> publications  [CASCADE]
|   `-- link_type               [source | modification | validation]
|
+-- sample_publications       -- sample <-> publication
|   +-- id                    PK
|   +-- sample_id             FK -> samples  [CASCADE]
|   +-- publication_id        FK -> publications  [CASCADE]
|   `-- relation_type           [source | published_in | reference | cited]
|
+-- archive_samples           -- frozen snapshot of samples
|   +-- id                    PK
|   +-- name, material_type, sector, project
|   +-- original_id             traceability back to samples.id
|   `-- archived_at             archiving timestamp
|
`-- archive_isotope_data      -- frozen snapshot of isotopic measurements
    +-- id                    PK
    +-- sample_id             FK -> archive_samples
    +-- element, isotope_ratio, ratio_value
    `-- archived_at, standard_used

Analyses & configuration

Standalone tables with no foreign keys to the rest of the model. analysis_history logs each analysis run (type, status, duration, number of samples involved), with soft-delete via deleted_at. custom_elements extends the configuration of available isotopic elements beyond the default values.

isofind.db - domain: analyses & configuration
.
+-- analysis_history          -- log of analysis runs
|   +-- id                    PK
|   +-- name, analysis_type
|   +-- samples_count, duration
|   +-- status                  [completed | ...]
|   +-- created_at, updated_at
|   `-- deleted_at              soft-delete
|
`-- custom_elements           -- custom isotopic elements
    +-- id                    PK
    +-- symbol                  [UNIQUE]
    +-- name, ratios            available ratios for this element
    `-- is_default              included in the base configuration

Legend & conventions

Notation Meaning
PK Primary key, unique row identifier
FK Foreign key, reference to another table
[CASCADE] Cascading deletion: child rows are deleted with the parent
[SET NULL] The foreign key is set to NULL if the parent is deleted
[UNIQUE] Uniqueness constraint on the column or column combination
a | b | c Accepted values, CHECK constraint in the DDL
IsoFind: The engine is SQLite 3. The declared types (TEXT, REAL, INTEGER, TIMESTAMP) are indicative; SQLite applies dynamic typing by column affinity. Timestamped fields are stored as TEXT in ISO 8601 format via datetime('now'), unless explicitly typed as TIMESTAMP.

Data type dictionary

SQLite applies dynamic typing by column affinity. IsoFind nevertheless enforces strict conventions to guarantee the precision of isotopic calculations and interoperability with third-party tools. The table below documents the conventions used throughout the schema.

Declared type SQLite affinity Use in IsoFind Example fields
TEXT TEXT Strings, identifiers, isotopic ratios, ISO 8601 dates name, isotope_ratio, created_at
REAL REAL Isotopic values, uncertainties, geographic coordinates ratio_value, uncertainty, latitude
FLOAT REAL Alias for REAL, inherited from the initial schema, identical behaviour mean_value, std_deviation
INTEGER INTEGER Primary keys, counters, boolean flags (0/1) id, samples_count, is_active
BOOLEAN NUMERIC Stored as INTEGER 0/1, never as literal true/false is_default, is_reference, is_custom
TIMESTAMP NUMERIC System timestamps, stored as ISO 8601 TEXT despite NUMERIC affinity archived_at, created_at (archive tables)
VARCHAR(n) TEXT ORM legacy, the length n is not enforced by SQLite but documents intent name, sector, standard_used
IsoFind: Isotopic ratio values are always stored as REAL (64-bit IEEE 754 float), which guarantees sufficient precision for mass spectrometry measurements. Do not substitute TEXT for these fields; normalisation and sigma deviation calculations would silently produce incorrect results.

Audit trail & traceability

Traceability is a central requirement for forensic and defence applications. IsoFind records the history of modifications at multiple levels (operation timestamps, preservation of archived data, and analysis logging), without ever irreversibly deleting data.

Mechanism Tables concerned How it works
Dual timestamp samples, analyzed_crm, method_library Each record carries created_at and updated_at, updated automatically via datetime('now') on every write.
Soft-delete analysis_history Deleted analyses are not erased — the deleted_at field is set, and the record remains queryable for audit purposes.
Frozen archive archive_samples, archive_isotope_data A complete snapshot of the data at the archiving date. The original_id field maintains the link to the source record in samples, even if that record is subsequently modified.
Analysis log analysis_history Each analysis run is recorded with its type, status, duration and the number of samples involved. Constitutes a complete activity log of the analytical session.
CRM quality control analyzed_crm, analyzed_crm_values Each certified reference material measurement session is fully preserved, including sigma deviations from certified values. Allows reconstruction of instrument state at any given date.
IsoFind: Cascading deletions (ON DELETE CASCADE) are limited to junction tables and child values directly dependent on a parent record. Primary data (samples, isotope_data) is never automatically deleted as a side effect.

Security & encryption

IsoFind distinguishes two protection perimeters: data integrity at export, and data confidentiality at rest. These two mechanisms are independent and complementary.

IsoFind: The isofind.db file itself is not encrypted at rest, it is protected by the filesystem permissions of the installation directory. Encryption applies to .isof exports and backups, not to the active SQLite database, unless the option is enabled (IsoFind Pro plan).
Mechanism Scope Technical detail
SHA-256 integrity .isof exports Each export file embeds a SHA-256 hash of its content. On import, IsoFind recomputes the hash and rejects altered files. Guarantees that data has not been modified in transit.
AES encryption Encrypted .isof exports Exports can be encrypted with a key managed via the system keyring (isofKeyringManage). The recipient must hold the corresponding key to decrypt the import.
Digital signature Certified .isof files Files from a certified laboratory carry a verifiable cryptographic signature via the ISOF standard. The signature attests to the origin and integrity of the complete analytical chain.
Filesystem permissions isofind.db database Access to the active database is controlled by filesystem permissions. On multi-user deployments, the data directory should be restricted to the user running IsoFind.