PDF Document

PDF document parsing & processing APIs

PDF Table Extraction

Parse tables in the PDF document

Security: api_key
Request
query Parameters
parser
string

An optional parameter that refers to the PDF Table parser.

Request Body schema:
required
string <binary> (Binary File Request)

Binary file, e.g. pdf, docx, html

Responses
200

The data was received successfully

400

Invalid request

403

The request is forbidden (Please input a valid API key)

POST /docs/parsers/pdf/table

Response sample (application/json)
{
  "status": { ... },
  "result": { ... }
}
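
As a rough illustration of the table extraction endpoint above, the following sketch builds (but does not send) the POST request using only the Python standard library. The host name, the way the API key is transmitted (a header named api_key), the Content-Type value, and the example parser value are all assumptions; the reference above does not specify them.

```python
import urllib.parse
import urllib.request

# Hypothetical host: the reference above does not name a base URL.
BASE_URL = "https://api.example.com"


def build_table_request(pdf_bytes, api_key, parser=None):
    """Build (without sending) a POST request for /docs/parsers/pdf/table."""
    url = BASE_URL + "/docs/parsers/pdf/table"
    if parser is not None:
        # "parser" is the optional query parameter listed above.
        url += "?" + urllib.parse.urlencode({"parser": parser})
    return urllib.request.Request(
        url,
        data=pdf_bytes,  # the raw binary file is the request body
        method="POST",
        headers={
            # Assumption: the api_key security scheme is a request header.
            "api_key": api_key,
            # Assumption: the body is sent as application/pdf.
            "Content-Type": "application/pdf",
        },
    )


# "my_parser" is a placeholder, not a documented parser name.
req = build_table_request(b"%PDF-1.4 placeholder", "YOUR_KEY", parser="my_parser")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here because the real host and authentication details need to be confirmed first.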

PDF Content Extraction

Parse PDF document

Security: api_key
Request
query Parameters
engine
string

An optional parameter that selects the parsing engine (for example, engine=v2).

y_mul
string

An optional hyper-parameter to control text clustering along the y-axis.

page_index
string

The page index to parse. The index of the first page is 1.

w_mul
string

An optional hyper-parameter to control text clustering along the x-axis.

y_mul_small
string

An optional hyper-parameter to control small font-text clustering along the y-axis.

y_mul_space
string

An optional hyper-parameter for engine=v2, to control text space clustering along the y-axis. Must be used in conjunction with y_mul.

Request Body schema:
required
string <binary> (Binary File Request)

Binary file, e.g. pdf, docx, html

Responses
200

The data was received successfully

400

Invalid request

403

The request is forbidden (Please input a valid API key)

POST /docs/parsers/pdf

Response sample (application/json)
{
  "result": { ... }
}
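
The query parameters of the content extraction endpoint carry two constraints stated above: page_index starts at 1, and y_mul_space must be used together with y_mul. A minimal sketch of a helper that enforces both before building the query string might look like this. The function name is hypothetical, and the numeric values in the example are illustrative only, since the reference gives no valid ranges.

```python
import urllib.parse


def build_pdf_query(engine=None, page_index=None, y_mul=None,
                    w_mul=None, y_mul_small=None, y_mul_space=None):
    """Return a URL-encoded query string for POST /docs/parsers/pdf."""
    if page_index is not None and int(page_index) < 1:
        # The first page has index 1, so 0 and negatives are rejected.
        raise ValueError("page_index is 1-based")
    if y_mul_space is not None and y_mul is None:
        # Documented constraint: y_mul_space requires y_mul.
        raise ValueError("y_mul_space must be used together with y_mul")
    params = {
        "engine": engine,
        "page_index": page_index,
        "y_mul": y_mul,
        "w_mul": w_mul,
        "y_mul_small": y_mul_small,
        "y_mul_space": y_mul_space,
    }
    # All parameters are documented as strings; unset ones are omitted.
    return urllib.parse.urlencode(
        {key: str(value) for key, value in params.items() if value is not None}
    )


query = build_pdf_query(engine="v2", page_index=1, y_mul=1.5, y_mul_space=0.5)
print(query)
```

Validating on the client side is a design choice rather than a requirement; it simply avoids a round trip that would presumably end in a 400 response.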

PDF-2-JPEG

Converts the pages of the input PDF file into JPEG images, with text clusters marked by bounding boxes.

Security: api_key
Request
query Parameters
engine
string

An optional parameter that selects the parsing engine (for example, engine=v2).

y_mul
string

An optional hyper-parameter to control text clustering along the y-axis.

draw_bb
string

Controls whether the returned image has the bounding boxes drawn.

y_mul_small
string

An optional hyper-parameter to control small font-text clustering along the y-axis.

w_mul
string

An optional hyper-parameter to control text clustering along the x-axis.

y_mul_space
string

An optional hyper-parameter for engine=v2, to control text space clustering along the y-axis. Must be used in conjunction with y_mul.

Request Body schema: application/pdf
required
string <binary> (Binary File Request)

Binary file, e.g. pdf, docx, html

Responses
200

The data was received successfully

400

Invalid request

403

The request is forbidden (Please input a valid API key)

POST /docs/parsers/pdf/image

Response sample (application/json)
{
  "status": { ... },
  "result": { ... }
}
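
The three documented status codes (200, 400, 403) and the top-level "status" and "result" keys of the response can be handled with a small helper such as the sketch below. The function name and the choice of exception types are assumptions made for illustration, and the contents of "status" and "result" are not described in the reference, so they are passed through untouched.

```python
import json


def interpret_response(status_code, body):
    """Map a documented HTTP status code to a result or an exception."""
    if status_code == 200:
        payload = json.loads(body)
        # Only the two top-level keys shown in the response sample are read.
        return payload.get("status"), payload.get("result")
    if status_code == 400:
        raise ValueError("Invalid request")
    if status_code == 403:
        raise PermissionError("The request is forbidden (check the API key)")
    # Any other code is outside what the reference documents.
    raise RuntimeError(f"Unexpected status code: {status_code}")


status, result = interpret_response(200, '{"status": {}, "result": {}}')
print(status, result)
```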