What is the .udt.json format?

The UDT JSON format is an open-source format for specifying human annotation tasks. For example, you might use the .udt.json format to store the labels and specification for transcribing audio or labeling images.

A UDT JSON file can also be converted to and from an equivalent CSV format.

The basic structure of a UDT JSON file is shown here. The comments are illustrative placeholders only, since JSON itself does not allow comments:

{
    "interface": {
        "type": "<some_interface>"
        // ... more interface details
    },
    "samples": [
        { /* sample json object */ },
        // ...
    ]
}
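As a rough illustration of that structure, here is a minimal Python sketch that checks only the two top-level keys shown above. The function name `check_udt_structure` is hypothetical and is not part of any official UDT tooling. It assumes nothing about the interface beyond a string `type`, and nothing about samples beyond each being an object.

```python
import json


def check_udt_structure(doc):
    """Return a list of problems found in the top-level structure (hypothetical helper)."""
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    problems = []

    # "interface" describes the task and must at least name its type.
    interface = doc.get("interface")
    if not isinstance(interface, dict):
        problems.append('"interface" must be an object')
    elif not isinstance(interface.get("type"), str):
        problems.append('"interface.type" must be a string')

    # "samples" is the list of items to annotate.
    samples = doc.get("samples")
    if not isinstance(samples, list):
        problems.append('"samples" must be an array')
    elif not all(isinstance(s, dict) for s in samples):
        problems.append("every sample must be an object")

    return problems


doc = json.loads('{"interface": {"type": "image_segmentation"}, "samples": [{}]}')
print(check_udt_structure(doc))  # prints []
```

An empty list means the two top-level keys have the expected shape. It says nothing about whether the interface-specific fields are valid.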

Here's an example of an image segmentation dataset:

{
  "interface": {
    "type": "image_segmentation",
    "labels": [
      {
        "id": "cat",
        "description": "Feline Mammal"
      },
      {
        "id": "dog",
        "description": "Canine Mammal"
      }
    ],
    "regionTypesAllowed": ["bounding-box"],
    "multipleRegions": true
  },
  "samples": [
    {
      "imageUrl": "https://media.gettyimages.com/photos/dog-and-cat-picture-id151350785"
    },
    {
      "imageUrl": "https://media.gettyimages.com/photos/guess-who-rules-the-roost-in-that-house-picture-id500927195"
    },
    {
      "imageUrl": "https://media.gettyimages.com/photos/she-simply-loves-animals-picture-id499806311"
    }
  ]
}
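A file like the one above could be generated from a list of image URLs. The following is a sketch only: the helper `make_image_segmentation_udt` and the `example.com` URLs are made up for illustration, and the field names are copied from the example above and not from a full specification.

```python
import json


def make_image_segmentation_udt(image_urls, labels):
    """Build a UDT-style dict (hypothetical helper).

    `labels` maps each label id to its description.
    """
    return {
        "interface": {
            "type": "image_segmentation",
            "labels": [
                {"id": label_id, "description": description}
                for label_id, description in labels.items()
            ],
            # These two settings mirror the example above.
            "regionTypesAllowed": ["bounding-box"],
            "multipleRegions": True,
        },
        "samples": [{"imageUrl": url} for url in image_urls],
    }


dataset = make_image_segmentation_udt(
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],  # placeholder URLs
    {"cat": "Feline Mammal", "dog": "Canine Mammal"},
)

# Serialize with indentation so the result stays easy to read by eye.
text = json.dumps(dataset, indent=2)
print(text)
```

Writing `text` to a file with a `.udt.json` extension would give a dataset in the same shape as the example.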

Principles

The principles that drive the UDT format are:

  • Complete Specificity such that no additional documents or conversations are required to start labeling or to perform the task.

  • Simplicity and Human Readability so that datasets can be easily examined and understood in the JSON format.
