DocExtract Digitization API Documentation
Convert raw PDFs into fully editable, structured PDF and DOCX files using DocExtract’s AI-powered digitisation API.
Base URL
https://docextract.ai/api/v1/docextract/digitised/pdf/
All API requests should be made to this endpoint using the POST method.
Authentication
Include your API key in every request for authentication. Get your API keys at docextract.ai/manage-api
api_key = "ABCDE*****"
| Field | Location | Required | Description |
|---|---|---|---|
| api_key | Form Data | Required | Unique key issued per organization |
Request Parameters
Content-Type: multipart/form-data
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| file | File | Required | - | The PDF file to process |
| api_key | String | Required | - | Organization's API key |
| language | String | Optional | original | Language of the document. The value must be a single supported language name in lowercase, for example english, hindi, or original. |
| keep_records | Boolean | Optional | false | Whether to store the extracted data in DocExtract and also return a list of image URLs |
| output_format | String | Optional | pdf | Format of the final digitised document. Supported values are pdf and docx; defaults to pdf. |
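The constraints in the table can be checked on the client before any request is sent. The following is a minimal sketch, not part of the DocExtract API: the helper name `build_form_fields` is hypothetical, the separator check is only an approximation of "a single language name", and sending `false` for `keep_records` is an assumption (the documentation only shows `true`).

```python
SUPPORTED_OUTPUT_FORMATS = {"pdf", "docx"}


def build_form_fields(api_key, language="original", keep_records=False,
                      output_format="pdf"):
    """Return the text form fields for a request as a dict of strings."""
    if not api_key:
        raise ValueError("api_key is required")
    # The docs require one lowercase language name; the full list of
    # supported names is not published here, so only the shape is checked.
    if (not language or language != language.lower()
            or any(ch in language for ch in ", ;")):
        raise ValueError(f"language must be one lowercase name, got {language!r}")
    if output_format not in SUPPORTED_OUTPUT_FORMATS:
        raise ValueError(f"output_format must be pdf or docx, got {output_format!r}")
    return {
        "api_key": api_key,
        "language": language,
        # Form fields travel as strings; "false" is an assumed spelling.
        "keep_records": "true" if keep_records else "false",
        "output_format": output_format,
    }
```

Rejecting a value such as `Hindi` locally avoids spending a round trip on a request that would come back with a 400 response.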
Example: cURL Request
curl -X POST "https://docextract.ai/api/v1/docextract/digitised/pdf/" \
-F "file=@invoice.pdf" \
-F "api_key=ABCDE*****" \
-F "language=english" \
-F "keep_records=true" \
-F "output_format=pdf" \
-F "output_image_format=base64"
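For callers who prefer Python over cURL, the same multipart/form-data request can be assembled with the standard library alone. This is a sketch under stated assumptions: `encode_multipart` and `digitise_pdf` are hypothetical helper names, the `application/pdf` part type and the 120-second timeout are choices made for illustration, and the response is assumed to be JSON as in the samples in this document.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://docextract.ai/api/v1/docextract/digitised/pdf/"


def encode_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one 'file' part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The uploaded document goes in the part named "file".
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f'Content-Type: application/pdf\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def digitise_pdf(path, fields, timeout=120):
    """POST the file and form fields; return the decoded JSON response."""
    with open(path, "rb") as handle:
        body, content_type = encode_multipart(
            fields, os.path.basename(path), handle.read()
        )
    request = urllib.request.Request(
        API_URL, data=body, method="POST",
        headers={"Content-Type": content_type},
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `digitise_pdf("invoice.pdf", {"api_key": "ABCDE*****", "language": "english"})` would then correspond to the cURL request, minus any fields left out of the dictionary.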
Response Schema
Success Response (200 OK)
Returned when the file has been processed successfully.
{
"status": "success",
"pages": 1,
"credits_remaining": 97620,
"file_name": "rashan.jpg",
"file_extension": "jpg",
"file_size": 55008,
"processed_url": "https://de-intelliteam.s3.ap-south-1.amazonaws.com/113/733/pdf/8a4b9d11-49c3-425e-9a37-3b3bfcbbdfc1.pdf"
}
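A client will usually want typed access to these fields rather than a raw dictionary. The sketch below maps the success body onto a dataclass; `DigitiseResult` and `parse_success` are hypothetical names, and the field types are inferred from the single sample above rather than from a published schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DigitiseResult:
    status: str
    pages: int
    credits_remaining: int
    file_name: str
    file_extension: str
    file_size: int
    processed_url: str


def parse_success(payload):
    """Build a DigitiseResult from a decoded 200 response body."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    # Copy only the keys shown in the sample so that extra keys
    # (for example those returned with keep_records) are ignored.
    names = [field.name for field in fields(DigitiseResult)]
    return DigitiseResult(**{name: payload[name] for name in names})
```

A missing key raises `KeyError`, which makes a change in the response shape visible immediately instead of surfacing later as a `None` value.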
Common Error Responses
Invalid API Key (401 Unauthorized)
Returned when the API key fails server-side validation.
{
"error": "[APIKeyValidationError 401]: Unexpected server error during API key validation."
}
Missing Form Fields (400 Bad Request)
Returned if the file or api_key fields are missing from the request. This can also occur if the file field is present but has no data.
{
"file": ["This field is required."],
"api_key": ["This field is required."]
}
Unsupported File Format (400 Bad Request)
Returned when a file with an unsupported extension (e.g., .txt, .docx) is sent.
{
"error": "Uploaded file is not a PDF."
}
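This error can often be caught before uploading. PDF files begin with the marker `%PDF-`, so a quick local check is possible; the sketch below uses hypothetical helper names, and whether the server validates by extension, by content, or both is not stated in this documentation.

```python
def looks_like_pdf(data):
    """Return True if the bytes start with the PDF header marker."""
    # A PDF starts with "%PDF-" followed by a version number such as 1.7.
    return data[:5] == b"%PDF-"


def is_pdf_file(path):
    """Read only the first five bytes of the file and test the marker."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read(5))
```

Checking the content rather than the file name also catches a renamed file, such as a text file saved with a .pdf extension.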
Invalid Language (400 Bad Request)
Returned when an invalid language value is sent, such as a misspelled name (hindii) or a name that is not in lowercase (Hindi).
{
"language": "Invalid language selection."
}
No Credits Left
Returned when the available credits do not cover the number of pages in the file.
{
"error": "Not enough credits: PDF pages=1, available credits=0"
}
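The message carries both the page count and the available credits, which a client can extract to tell the user how many credits are missing. The sketch below assumes the message always follows the format of the single sample shown; the function name `parse_credit_shortfall` is hypothetical, and the function returns `None` when the text does not match.

```python
import re

# Pattern derived from the one sample message; the real format may vary.
_CREDITS_PATTERN = re.compile(r"PDF pages=(\d+), available credits=(\d+)")


def parse_credit_shortfall(message):
    """Extract page and credit counts from a 'Not enough credits' message."""
    match = _CREDITS_PATTERN.search(message)
    if match is None:
        return None
    pages, credits = int(match.group(1)), int(match.group(2))
    return {"pages": pages, "credits": credits, "missing": max(pages - credits, 0)}
```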
Usage Limit Exceeded
Returned when the monthly page limit for the API key has been exceeded.
{
"error": "Monthly page limit exceeded."
}
File Processing Error (500 Internal Server Error)
A server-side error returned when the backend fails to process the file, such as when sending a non-image file (e.g., a PDF) to the /image/ endpoint.
{
"error": "File processing failed"
}
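The error bodies above come in two shapes: a single `error` string, or per-field messages keyed by the form field name (a list for file and api_key, a plain string for language). A client can flatten both into one list of messages, as in this sketch; `error_messages` is a hypothetical helper and it only covers the shapes shown in this document.

```python
def error_messages(payload):
    """Flatten an error body into a list of human-readable strings."""
    if "error" in payload:
        return [str(payload["error"])]
    messages = []
    for field, detail in payload.items():
        # Field errors appear as a list of strings or as one plain string.
        details = detail if isinstance(detail, list) else [detail]
        messages.extend(f"{field}: {text}" for text in details)
    return messages
```

Combined with the HTTP status code, this gives a caller one uniform place to log or display whatever the API returned.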
Processing Workflow
- API Key Validation - Verify authentication credentials
- File Reading & Format Check - Validate uploaded file integrity
- Page Count & Quota Check - Verify available credits
- Concurrent Page Processing - Asynchronous AI parsing
- Data Aggregation - Compile results in JSON/DataFrame format
- S3 Upload - Store results if enabled
Quota Management
| Metric | Description |
|---|---|
| credits | Pages remaining in current billing cycle |
| monthly_page_processed | Total pages processed this month |
| storage_used | Bytes stored in S3 storage |
| processed_url | URL of the output file after processing |
| file_size | Size of the file in KB |
Security & Encryption
- Unique API keys per organization
- Sanitized file handling and validation
- End-to-end encryption for extracted data
- Built-in quota and abuse protection
Sample Output
{
"status": "success",
"pages": 1,
"credits_remaining": 97620,
"file_name": "rashan.jpg",
"file_extension": "jpg",
"file_size": 55008,
"processed_url": "https://de-intelliteam.s3.ap-south-1.amazonaws.com/113/733/pdf/8a4b9d11-49c3-425e-9a37-3b3bfcbbdfc1.pdf"
}