Meridian

File Extension

File name extension or suffix (txt, pdf, docx, jpg, etc.). May include or exclude leading dot.

representation.file.extension

Domain: representation
Category: file
Casts to: VARCHAR
Scope: broad_words

Try it

CLI
$ finetype infer -i "txt" --mode column
→ representation.file.extension

DuckDB

Detect
SELECT ft_infer('txt');
-- → 'representation.file.extension'
Cast expression
LOWER(REGEXP_REPLACE({col}, '^\.*', ''))
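To illustrate what the cast expression above does, here is a minimal Python sketch that mirrors it outside DuckDB: it drops any leading dots and lowercases the result. The function name `normalise_extension` is hypothetical and not part of FineType; the sketch only assumes the expression behaves as written.

```python
import re


def normalise_extension(value: str) -> str:
    # Mirrors the SQL cast expression: remove any run of leading dots,
    # then lowercase. (Hypothetical helper, not a FineType API.)
    return re.sub(r"^\.*", "", value).lower()


if __name__ == "__main__":
    for raw in ["txt", ".pdf", ".PDF", "..Tar"]:
        print(repr(raw), "->", repr(normalise_extension(raw)))
```

Normalising first means that `.pdf`, `PDF` and `pdf` all compare equal in later steps.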
Safe cast pipeline
-- Normalise and cast in one step
SELECT TRY_CAST(ft_cast(my_column) AS VARCHAR) AS clean_value
FROM my_table
WHERE ft_infer(my_column) = 'representation.file.extension';

Struct Expansion

Expression
category:
  CASE
    WHEN {col} IN ('txt', 'doc', 'docx', 'pdf', 'rtf') THEN 'document'
    WHEN {col} IN ('jpg', 'jpeg', 'png', 'gif', 'bmp', 'svg') THEN 'image'
    WHEN {col} IN ('mp4', 'avi', 'mov', 'mkv', 'webm') THEN 'video'
    WHEN {col} IN ('mp3', 'wav', 'flac', 'aac', 'm4a') THEN 'audio'
    WHEN {col} IN ('zip', 'rar', '7z', 'tar', 'gz') THEN 'archive'
    ELSE 'other'
  END
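The category expression above can be sketched in Python as a lookup over the same five groups. The names `CATEGORIES` and `extension_category` are hypothetical. As with the SQL `IN` lists, the comparison is an exact string match, so a value such as `.pdf` or `PDF` falls through to `other` unless it has been normalised first.

```python
# Same groups as the CASE expression; anything unlisted maps to "other".
CATEGORIES = {
    "document": {"txt", "doc", "docx", "pdf", "rtf"},
    "image": {"jpg", "jpeg", "png", "gif", "bmp", "svg"},
    "video": {"mp4", "avi", "mov", "mkv", "webm"},
    "audio": {"mp3", "wav", "flac", "aac", "m4a"},
    "archive": {"zip", "rar", "7z", "tar", "gz"},
}


def extension_category(ext: str) -> str:
    # Exact match only, like the SQL IN lists (hypothetical helper).
    for category, members in CATEGORIES.items():
        if ext in members:
            return category
    return "other"


if __name__ == "__main__":
    for ext in ["pdf", "mkv", "7z", ".pdf", "exe"]:
        print(ext, "->", extension_category(ext))
```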

JSON Schema

finetype taxonomy representation.file.extension -o json-schema
{
  "$id": "https://meridian.online/schemas/representation.file.extension",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "File name extension or suffix (txt, pdf, docx, jpg, etc.). May include or exclude leading dot.",
  "examples": [
    "txt",
    ".pdf",
    "docx",
    "jpg",
    "xlsx"
  ],
  "pattern": "^\\.?[a-zA-Z0-9]{1,10}$",
  "title": "File Extension",
  "type": "string",
  "x-finetype-label": "representation.file.extension",
  "x-finetype-pii": false
}
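The `pattern` field in the schema above accepts an optional leading dot followed by 1 to 10 ASCII letters or digits. A minimal Python sketch of that check, using only the standard `re` module, might look like this; `matches_schema_pattern` is a hypothetical helper, not something FineType ships.

```python
import re

# Pattern copied from the schema's "pattern" field.
EXTENSION_PATTERN = re.compile(r"^\.?[a-zA-Z0-9]{1,10}$")


def matches_schema_pattern(value: str) -> bool:
    # fullmatch is used because Python's `$` would otherwise also
    # accept a value that ends in a trailing newline.
    return EXTENSION_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["txt", ".pdf", "docx", "jpg", "xlsx", "tar.gz", "..pdf", ""]:
        print(repr(candidate), matches_schema_pattern(candidate))
```

Under this pattern, compound suffixes such as `tar.gz` do not match, because a dot is only allowed in the first position.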

Examples

txt, .pdf, docx, jpg, xlsx

Aliases

file_type
