Client provides a high-level interface for OCR operations.
type Client struct {
	// contains filtered or unexported fields
}
func NewClient(service OCRService) *Client
NewClient creates a new OCR client with the given service.
func NewOCRHTTPClient(options OCROptions) *Client
NewOCRHTTPClient creates a new OCR client using HTTP service with the given options.
func (c *Client) BatchProcess(ctx context.Context, images []*model.Image) ([][]byte, []error)
BatchProcess processes multiple images concurrently.
func (c *Client) BatchProcessFiles(ctx context.Context, filePaths []string) ([][]byte, []error)
BatchProcessFiles processes multiple image files concurrently from a list of file paths.
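Both batch methods return a slice of results and a slice of errors. The sketch below shows one way to consume such a pair. It assumes that both slices are index-aligned with the input, so that errs[i] is non-nil when item i failed; the documentation above does not state this, so verify it against the library. The collect and sample helpers and the sample data are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// collect pairs each input path with its OCR output or records it as failed.
// Assumption: results and errs share the length and ordering of paths.
func collect(paths []string, results [][]byte, errs []error) (texts map[string]string, failed []string) {
	texts = make(map[string]string)
	for i, p := range paths {
		if i < len(errs) && errs[i] != nil {
			failed = append(failed, p)
			continue
		}
		if i < len(results) {
			texts[p] = string(results[i])
		}
	}
	return texts, failed
}

// sample returns stand-ins for the inputs and outputs of a call such as
// client.BatchProcessFiles(ctx, paths).
func sample() ([]string, [][]byte, []error) {
	paths := []string{"a.png", "b.png"}
	results := [][]byte{[]byte("hello"), nil}
	errs := []error{nil, errors.New("timeout")}
	return paths, results, errs
}

func main() {
	paths, results, errs := sample()
	texts, failed := collect(paths, results, errs)
	fmt.Println(len(texts), failed)
}
```

Keeping the failed paths separate makes it easy to retry only those items instead of the whole batch.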
func (c *Client) CallEndpoint(ctx context.Context, body io.Reader) ([]byte, error)
CallEndpoint executes an HTTP request using the underlying service's configured options. The URL, method, headers, and other settings are taken from the OCROptions used to create the service.
To call different endpoints, create separate service instances with different URLs:
statusService := NewHTTPOCRService(OCROptions{Url: "http://localhost:8080/status", Method: "GET"})
ocrService := NewHTTPOCRService(OCROptions{Url: "http://localhost:8080/ocr", Method: "POST"})
Parameters:
- body: Request body (can be nil for GET requests or when no body is needed)
Example usage:
result, err := client.CallEndpoint(ctx, nil) // GET request
result, err := client.CallEndpoint(ctx, requestBody) // POST request with body
func (c *Client) ExtractText(ctx context.Context, reader io.Reader, filename string) ([]byte, error)
ExtractText extracts text from an image reader.
func (c *Client) ExtractTextFromFile(ctx context.Context, filePath string) ([]byte, error)
ExtractTextFromFile extracts text from an image file.
func (c *Client) ExtractTextFromImage(ctx context.Context, image *model.Image) ([]byte, error)
ExtractTextFromImage extracts text from a UniPDF image.
func (c *Client) Service() OCRService
Service returns the underlying OCR service.
func (c *Client) WithService(service OCRService) *Client
WithService returns a new client with the given service.
HTTPOCRService implements OCRService using HTTP requests.
This service is mainly designed to work with https://github.com/unidoc/ocrserver but can be adapted to other HTTP-based OCR services with similar APIs.
type HTTPOCRService struct {
	// contains filtered or unexported fields
}
func NewHTTPOCRService(options OCROptions) *HTTPOCRService
NewHTTPOCRService creates a new HTTP-based OCR service.
func (h *HTTPOCRService) CallEndpoint(ctx context.Context, body io.Reader) ([]byte, error)
CallEndpoint executes an HTTP request to the configured URL. It gives callers a general way to issue requests with the service's configured options (URL, method, headers, timeout, etc.).
These settings are taken from the OCROptions used to create this service. To call different endpoints, create separate service instances with different URLs.
Parameters:
- body: Request body (can be nil for GET requests or when no body is needed)
func (h *HTTPOCRService) ExtractText(ctx context.Context, reader io.Reader, filename string) ([]byte, error)
ExtractText extracts text from an image reader. The filename parameter is used to set the filename in the multipart form data. If filename is empty, a default name based on content type will be used.
Parameters:
- reader: Image data reader (e.g., file, buffer)
- filename: Optional filename for the uploaded file (used in multipart form)
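To illustrate what a multipart upload with a file field, a filename, and extra form fields looks like, here is a minimal standard-library sketch. It is not the library's actual request-building code, which is not shown here. The buildUpload and roundTrip helpers, the "lang" form field, and the sample data are hypothetical; the "file" field name follows the documented FileFieldName default.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"strings"
)

// buildUpload writes the extra form fields and one file part into a multipart
// body. fieldName and filename play the roles of FileFieldName and the
// filename argument of ExtractText.
func buildUpload(fieldName, filename string, data io.Reader, fields map[string]string) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)
	for k, v := range fields {
		if err := w.WriteField(k, v); err != nil {
			return nil, "", err
		}
	}
	part, err := w.CreateFormFile(fieldName, filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := io.Copy(part, data); err != nil {
		return nil, "", err
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}

// roundTrip builds a body and parses it back, returning the uploaded filename
// and file contents as a receiving server would see them.
func roundTrip() (string, string, error) {
	body, contentType, err := buildUpload("file", "page.png",
		strings.NewReader("fake image bytes"), map[string]string{"lang": "eng"})
	if err != nil {
		return "", "", err
	}
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", "", err
	}
	form, err := multipart.NewReader(body, params["boundary"]).ReadForm(1 << 20)
	if err != nil {
		return "", "", err
	}
	defer form.RemoveAll()
	fh := form.File["file"][0]
	f, err := fh.Open()
	if err != nil {
		return "", "", err
	}
	defer f.Close()
	content, err := io.ReadAll(f)
	return fh.Filename, string(content), err
}

func main() {
	name, content, err := roundTrip()
	fmt.Println(name, content, err)
}
```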
OCROptions provides configuration for HTTP-based OCR services.
type OCROptions struct {
	// URL for the OCR service.
	Url string
	// HTTP method to use for the OCR request (default: POST).
	Method string
	// Custom headers to add to the request.
	Headers map[string]string
	// Form field name for the file upload (default: "file").
	FileFieldName string
	// Additional form fields to send with the request.
	FormFields map[string]string
	// HTTP client timeout in seconds (default: 30).
	TimeoutSeconds int
	// Maximum number of retry attempts on failure (default: 0, no retry).
	MaxRetries int
	// Custom HTTP client (optional - if provided, TimeoutSeconds is ignored).
	Client *http.Client
	// Custom request modifier function (called after request is created).
	RequestModifier func(*http.Request) error
}
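The following sketch shows how these fields might be filled in, including a RequestModifier that adds an authorization header. To keep the example self-contained, OCROptions is redeclared locally as a copy of the struct above; real code would use the type from this package. The exampleOptions and authHeader helpers, the Accept header, and the bearer token are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// OCROptions is a local copy of the documented struct so that this sketch
// compiles on its own.
type OCROptions struct {
	Url             string
	Method          string
	Headers         map[string]string
	FileFieldName   string
	FormFields      map[string]string
	TimeoutSeconds  int
	MaxRetries      int
	Client          *http.Client
	RequestModifier func(*http.Request) error
}

// exampleOptions returns options for a hypothetical local OCR server.
func exampleOptions(token string) OCROptions {
	return OCROptions{
		Url:           "http://localhost:8080/ocr",
		Method:        http.MethodPost,
		Headers:       map[string]string{"Accept": "application/json"},
		FileFieldName: "file",
		MaxRetries:    2,
		// A custom client is set, so per the field comments TimeoutSeconds
		// is ignored and the client's own timeout applies.
		Client: &http.Client{Timeout: 10 * time.Second},
		RequestModifier: func(req *http.Request) error {
			req.Header.Set("Authorization", "Bearer "+token)
			return nil
		},
	}
}

// authHeader creates a request and then applies the modifier, mimicking the
// documented order (the modifier runs after the request is created).
func authHeader(opts OCROptions) (string, error) {
	req, err := http.NewRequest(opts.Method, opts.Url, nil)
	if err != nil {
		return "", err
	}
	if err := opts.RequestModifier(req); err != nil {
		return "", err
	}
	return req.Header.Get("Authorization"), nil
}

func main() {
	h, err := authHeader(exampleOptions("secret-token"))
	fmt.Println(h, err)
}
```

A modifier like this is a natural place for per-request concerns such as authentication, since static values can already go in Headers.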
OCRService defines the interface for OCR services.
type OCRService interface {
	// ExtractText extracts text from an image reader.
	// The filename parameter is used to set the filename in the multipart form data.
	// If filename is empty, a default name based on content type will be used.
	//
	// Parameters:
	//   - reader: Image data reader (e.g., file, buffer)
	//   - filename: Optional filename for the uploaded file (used in multipart form)
	ExtractText(ctx context.Context, reader io.Reader, filename string) ([]byte, error)
	// CallEndpoint executes an HTTP request using the service's configured URL and options.
	CallEndpoint(ctx context.Context, body io.Reader) ([]byte, error)
}
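Because OCRService is an interface, an in-memory implementation can stand in for a real server, for example in tests. The sketch below redeclares the interface locally so it compiles on its own; stubService and describe are hypothetical names, and the returned text is invented for illustration.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// OCRService mirrors the interface documented above.
type OCRService interface {
	ExtractText(ctx context.Context, reader io.Reader, filename string) ([]byte, error)
	CallEndpoint(ctx context.Context, body io.Reader) ([]byte, error)
}

// stubService is a hypothetical implementation that performs no network I/O.
type stubService struct{}

// ExtractText reports the filename and input size instead of running OCR.
func (stubService) ExtractText(ctx context.Context, reader io.Reader, filename string) ([]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honor cancellation like a real service would
	}
	data, err := io.ReadAll(reader)
	if err != nil {
		return nil, err
	}
	return []byte(fmt.Sprintf("%s: %d bytes", filename, len(data))), nil
}

// CallEndpoint returns a fixed payload and ignores the body.
func (stubService) CallEndpoint(ctx context.Context, body io.Reader) ([]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return []byte("ok"), nil
}

// Compile-time check that stubService satisfies the interface.
var _ OCRService = stubService{}

// describe runs ExtractText on in-memory content through any OCRService.
func describe(svc OCRService, name, content string) (string, error) {
	out, err := svc.ExtractText(context.Background(), strings.NewReader(content), name)
	return string(out), err
}

func main() {
	got, err := describe(stubService{}, "scan.png", "12345")
	fmt.Println(got, err)
}
```

Since NewClient accepts any OCRService, a stub of this kind could presumably be passed to it to exercise client code without a running OCR server.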