OCR Documentation - AI Models | API Documentation

OCR Inference API

POST https://ocrdoc.infer.nt-ai.cloud/predict

Header

X-API-Key string required

Your API key

multipart/form-data

form-data body

Request Body

files File required

Raw image or document files, sent as multipart form data under the key name files.

optional

Sent as multipart form data.

box_threshold float

Default value: 0.4

The box_threshold value, between 0 and 1, controls how sensitively text is detected in a document. A lower value lets the model detect more bounding boxes, while a higher value reduces detection sensitivity. Start with the default of 0.4 and raise it in steps of 0.1 until the result suits the document being processed.
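The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_multipart and predict are hypothetical, the MIME type is guessed from the file name, and error handling (HTTP errors, timeouts) is left out.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://ocrdoc.infer.nt-ai.cloud/predict"


def build_multipart(file_name, file_bytes, box_threshold=0.4):
    """Encode one file and box_threshold as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    if not 0 <= box_threshold <= 1:
        raise ValueError("box_threshold must be between 0 and 1")
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="box_threshold"\r\n\r\n'
        f"{box_threshold}\r\n"
        f"--{boundary}\r\n"
        # The file part uses the key name "files", as the documentation requires.
        f'Content-Disposition: form-data; name="files"; filename="{file_name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def predict(path, api_key, box_threshold=0.4):
    """POST a single file to the endpoint and return the decoded JSON response."""
    path = Path(path)
    body, content_type = build_multipart(path.name, path.read_bytes(), box_threshold)
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as predict("document.pdf", "YOUR_API_KEY", box_threshold=0.5) would then return the parsed response. To follow the tuning advice, the same call can be repeated with 0.5, 0.6, and so on.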

Responses

application/json

Schema

Array [

object

filename string

File name

status string

success | failed

Status of request

result Array [

object

page number

The page number corresponding to the retrieved text.

full_text string

The full text content of the specified page. This includes all text present on the page, with newline characters (\n) representing line breaks.

image_size number[]

The size, in pixels, of the original image file or of the image of the specified page. The first value is the height and the second is the width (e.g. [1980, 1530]).

data Array [

object

bbox number[][]

The four corner points of the text box, each given as [x, y] pixel coordinates: [[x1, y1], [x2, y2], [x3, y3], [x4, y4]].

text string

The text extracted from the box.

]

]

]
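The schema above could be mirrored in Python roughly as follows. This is a sketch under assumptions: the class names TextBox, PageResult and FileResult and the helper normalized_rect are hypothetical, and coordinates are typed as float because the schema does not say whether they are always integers. The helper shows why the [height, width] order of image_size matters when scaling a box.

```python
from typing import List, Tuple, TypedDict


class TextBox(TypedDict):
    bbox: List[List[float]]  # four [x, y] corner points in pixels
    text: str


class PageResult(TypedDict):
    page: int
    full_text: str
    image_size: List[int]  # [height, width] in pixels
    data: List[TextBox]


class FileResult(TypedDict):
    filename: str
    status: str  # "success" or "failed"
    result: List[PageResult]


def normalized_rect(box: TextBox, page: PageResult) -> Tuple[float, float, float, float]:
    """Return the box's axis-aligned bounds as fractions of the page size.

    The result is (left, top, right, bottom), each between 0 and 1
    when the box lies inside the page.
    """
    height, width = page["image_size"]  # height comes first, then width
    xs = [point[0] for point in box["bbox"]]
    ys = [point[1] for point in box["bbox"]]
    return (min(xs) / width, min(ys) / height, max(xs) / width, max(ys) / height)
```

Taking the minimum and maximum of the corner coordinates avoids relying on any particular corner order, which the schema does not specify.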

Example (from schema)

[
   {
      "filename": "filename.pdf",
      "status": "success",
      "result": [
         {
            "page": 1,
            "data": [
               {
                  "bbox": [
                     [
                        32,
                        14
                     ],
                     [
                        196,
                        14
                     ],
                     [
                        196,
                        48
                     ],
                     [
                        32,
                        48
                     ]
                  ],
                  "text": "เอกสาร"
               },
               {
                  "bbox": [
                     [
                        80,
                        46
                     ],
                     [
                        150,
                        46
                     ],
                     [
                        150,
                        78
                     ],
                     [
                        80,
                        78
                     ]
                  ],
                  "text": "หน้าที่ 1"
               }
            ],
            "full_text": "เอกสาร\nหน้าที่ 1",
            "image_size": [1980, 1530]
         },
         {
            "page": 2,
            "data": [
               {
                  "bbox": [
                     [
                        32,
                        14
                     ],
                     [
                        196,
                        14
                     ],
                     [
                        196,
                        48
                     ],
                     [
                        32,
                        48
                     ]
                  ],
                  "text": "นกกำลังบินออก"
               }
            ],
            "full_text": "นกกำลังบินออก",
            "image_size": [1980, 1530]
         }
      ]
   }
]
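A response shaped like the example above could be reduced to plain text per file as sketched below. The function name collect_text is hypothetical, and skipping items whose status is not "success" is an assumption, since the documentation does not describe what a failed item contains.

```python
def collect_text(response):
    """Map each successfully processed filename to its text, pages in order."""
    texts = {}
    for item in response:
        if item.get("status") != "success":
            continue  # assumption: failed items carry no usable result
        pages = sorted(item.get("result", []), key=lambda page: page["page"])
        # Join pages with a blank line; lines within a page keep their "\n".
        texts[item["filename"]] = "\n\n".join(page["full_text"] for page in pages)
    return texts


# Trimmed version of the example response, with the bounding boxes omitted.
sample = [
    {
        "filename": "filename.pdf",
        "status": "success",
        "result": [
            {"page": 2, "data": [], "full_text": "นกกำลังบินออก", "image_size": [1980, 1530]},
            {"page": 1, "data": [], "full_text": "เอกสาร\nหน้าที่ 1", "image_size": [1980, 1530]},
        ],
    }
]

text_by_file = collect_text(sample)
```

Sorting by page keeps the output stable even if pages arrive out of order, as in the trimmed sample.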
