Meridian

Protein Sequence

Amino acid sequence in single-letter IUPAC notation (20 standard amino acids plus ambiguity codes).

representation.scientific.protein_sequence

Domain: representation
Category: scientific
Casts to: VARCHAR
Scope: Universal

Try it

CLI
$ finetype infer -i "MKVLLIVGS" --mode column
→ representation.scientific.protein_sequence

DuckDB

Detect
SELECT ft_infer('MKVLLIVGS');
-- → 'representation.scientific.protein_sequence'

Cast expression
UPPER(CAST({col} AS VARCHAR))

Safe cast pipeline
-- Normalise and cast in one step
SELECT TRY_CAST(ft_cast(my_column) AS VARCHAR) AS clean_value
FROM my_table
WHERE ft_infer(my_column) = 'representation.scientific.protein_sequence';

Struct Expansion

Expression
length: LENGTH({col})
molecular_weight_estimate: LENGTH({col}) * 110
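
The two struct fields can also be reproduced outside DuckDB. The Python sketch below mirrors the expressions above; the function name `expand_protein_sequence` is hypothetical and not part of finetype. The factor 110 comes straight from the expression and is a common rough figure for the average mass of one amino acid residue in daltons, so the result is an estimate, not an exact molecular weight.

```python
def expand_protein_sequence(seq: str) -> dict:
    """Mirror the struct expansion: sequence length and a rough mass estimate."""
    length = len(seq)
    return {
        "length": length,
        # Same arithmetic as LENGTH({col}) * 110; a coarse estimate only.
        "molecular_weight_estimate": length * 110,
    }


print(expand_protein_sequence("MKVLLIVGS"))
# → {'length': 9, 'molecular_weight_estimate': 990}
```

Because the estimate depends only on length, two sequences with the same number of residues get the same value regardless of composition.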

JSON Schema

$ finetype taxonomy representation.scientific.protein_sequence -o json-schema
{
  "$id": "https://meridian.online/schemas/representation.scientific.protein_sequence",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Amino acid sequence in single-letter IUPAC notation (20 standard amino acids plus ambiguity codes).",
  "examples": [
    "MKVLLIVGS",
    "ACDEFGHIKLMNPQRSTVWYFL",
    "MPKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVG"
  ],
  "pattern": "^[ACDEFGHIKLMNPQRSTVWXY*]+$",
  "title": "Protein Sequence",
  "type": "string",
  "x-finetype-label": "representation.scientific.protein_sequence",
  "x-finetype-pii": false
}
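
As a minimal sketch of how the `pattern` in this schema behaves, the Python snippet below checks candidate strings against the same character class. The helper name `is_protein_sequence` is hypothetical; only the character class itself is taken from the schema. Note that the pattern as published accepts the 20 standard letters plus `X` and `*`, so lowercase input and codes such as `B` do not match unless the value is normalised or the pattern is extended.

```python
import re

# Character class copied from the "pattern" field of the JSON Schema.
PROTEIN_PATTERN = re.compile(r"[ACDEFGHIKLMNPQRSTVWXY*]+")


def is_protein_sequence(value: str) -> bool:
    """Return True if the whole string matches the schema's pattern."""
    # fullmatch anchors both ends, which is the intent of ^...$ in the schema.
    return PROTEIN_PATTERN.fullmatch(value) is not None


print(is_protein_sequence("MKVLLIVGS"))          # True
print(is_protein_sequence("mkvllivgs"))          # False: lowercase is not in the class
print(is_protein_sequence("mkvllivgs".upper()))  # True after uppercasing
print(is_protein_sequence("MKVB"))               # False: B is not in the class
```

Uppercasing before the check corresponds to the `UPPER(...)` step in the cast expression listed on this page.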

Examples

MKVLLIVGS
ACDEFGHIKLMNPQRSTVWYFL
MPKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVG

Aliases

protein
peptide
amino_acid_sequence
