Methodology

Data Collection

SupLabel downloads supplement label data from the NIH Dietary Supplement Label Database (DSLD) API (v9). Our pipeline systematically retrieves product labels including ingredient lists, serving sizes, brand information, and label claims.

The pipeline is resumable: if interrupted, it picks up where it left off without re-downloading existing data.
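As a rough illustration of the resumable behavior, the sketch below skips any label that already exists on disk and writes each new file atomically. The function name `download_missing`, the one-JSON-file-per-label layout, and the injected `fetch_label` callable are assumptions made for this example, not a description of the actual pipeline code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def download_missing(
    label_ids: Iterable[int],
    out_dir: Path,
    fetch_label: Callable[[int], dict],
) -> int:
    """Fetch only labels not already on disk; return how many were fetched."""
    out_dir.mkdir(parents=True, exist_ok=True)
    fetched = 0
    for label_id in label_ids:
        target = out_dir / f"{label_id}.json"
        if target.exists():
            continue  # saved by an earlier run, so skip the download
        record = fetch_label(label_id)  # hypothetical: wraps the real API call
        # Write to a temporary file first, then rename it into place, so an
        # interruption never leaves a half-written file that looks complete.
        fd, tmp_name = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(record, handle)
        os.replace(tmp_name, target)
        fetched += 1
    return fetched
```

Running the function a second time with the same IDs fetches nothing, which is the property that makes a restart after an interruption cheap.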

Ingredient Normalization

Supplement labels use many names for the same ingredient. For example, "Vitamin D3," "Cholecalciferol," and "D3" all refer to the same compound. We maintain a curated synonym dictionary that maps these variations to canonical ingredient names.

This normalization enables meaningful comparisons across products from different brands that may use different naming conventions on their labels.
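A minimal sketch of the synonym lookup, using only the three names from the example above. The `SYNONYMS` table, the choice of "Vitamin D3" as the canonical name, and the `normalize_ingredient` function are illustrative assumptions; the curated dictionary is larger and may be structured differently.

```python
import re

# Illustrative entries only. Keys are lowercased label names,
# values are the canonical names they map to.
SYNONYMS = {
    "vitamin d3": "Vitamin D3",
    "cholecalciferol": "Vitamin D3",
    "d3": "Vitamin D3",
}


def normalize_ingredient(raw_name: str) -> str:
    """Map a label name to its canonical name, or return it unchanged."""
    # Collapse repeated whitespace and ignore case before the lookup.
    key = re.sub(r"\s+", " ", raw_name).strip().lower()
    # Unknown names pass through so no label data is silently dropped.
    return SYNONYMS.get(key, raw_name.strip())
```

Returning unknown names unchanged keeps unmatched ingredients visible rather than discarding them.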

Unit Standardization

Units are standardized to common forms: milligrams (mg), micrograms (mcg), grams (g), International Units (IU), Colony Forming Units (CFU), and others. This allows direct comparison of dosages across products even when labels use different unit formats.
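The sketch below shows one way unit spellings could be mapped to the common forms listed above. The alias table and the `standardize_unit` function are assumptions for illustration, and the set of spellings is not exhaustive. It only normalizes how a unit is written; it does not convert between units, since a conversion such as IU to mcg depends on the specific substance.

```python
# Illustrative aliases only. Keys are lowercased spellings found on labels.
UNIT_ALIASES = {
    "mg": "mg", "milligram": "mg", "milligrams": "mg",
    "mcg": "mcg", "ug": "mcg", "microgram": "mcg", "micrograms": "mcg",
    "\u00b5g": "mcg",  # micro sign followed by g
    "\u03bcg": "mcg",  # Greek small letter mu followed by g
    "g": "g", "gram": "g", "grams": "g",
    "iu": "IU", "i.u.": "IU", "international units": "IU",
    "cfu": "CFU", "colony forming units": "CFU",
}


def standardize_unit(raw_unit: str) -> str:
    """Return the common form of a unit, or the input if it is unknown."""
    key = raw_unit.strip().lower()
    return UNIT_ALIASES.get(key, raw_unit.strip())
```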

Data Storage

Processed data is stored in a SQLite database optimized for fast read access. The database schema tracks products, brands, ingredients, categories, serving sizes, and the relationships between them.
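To make the relationships concrete, here is a simplified sketch of what such a schema could look like, created through Python's built-in sqlite3 module. Every table name, column name, and the `open_database` helper are hypothetical; the real schema also covers categories and other details that are omitted here.

```python
import sqlite3

# Hypothetical, simplified schema: brands own products, and a join table
# links each product to its normalized ingredients with an amount and unit.
SCHEMA = """
CREATE TABLE IF NOT EXISTS brands (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS products (
    id INTEGER PRIMARY KEY,
    brand_id INTEGER NOT NULL REFERENCES brands(id),
    name TEXT NOT NULL,
    serving_size TEXT
);
CREATE TABLE IF NOT EXISTS ingredients (
    id INTEGER PRIMARY KEY,
    canonical_name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS product_ingredients (
    product_id INTEGER NOT NULL REFERENCES products(id),
    ingredient_id INTEGER NOT NULL REFERENCES ingredients(id),
    amount REAL,
    unit TEXT,
    PRIMARY KEY (product_id, ingredient_id)
);
-- Index the reverse direction so "which products contain X" stays fast.
CREATE INDEX IF NOT EXISTS idx_product_ingredients_ingredient
    ON product_ingredients(ingredient_id);
"""


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce REFERENCES constraints
    conn.executescript(SCHEMA)
    return conn
```

The index on `ingredient_id` reflects the read-heavy use described above: looking up all products that contain a given ingredient is likely a common query, so it is worth paying for at write time.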

Update Frequency

Our database is periodically refreshed to include new products added to the DSLD. Page content is revalidated with Next.js Incremental Static Regeneration (ISR) on a 24-hour interval, so a given page is regenerated at most once a day.

Limitations

  • We display label data as reported to the DSLD and do not independently verify ingredient amounts
  • Not all supplement products on the market are included in the DSLD
  • Products may be reformulated after their label was submitted to the DSLD
  • Our synonym dictionary covers common ingredients but may not capture all naming variations