Tutorial: How to Get Document Information with GroupDocs.Merger Cloud API
Learning Objectives
In this tutorial, you’ll learn how to use the GroupDocs.Merger Cloud API to retrieve essential information about documents stored in your cloud storage. This foundational skill will help you understand document structure before performing more complex operations.
Prerequisites
Before starting this tutorial, ensure you have:
- A GroupDocs.Merger Cloud account (or sign up for a free trial)
- Your Client ID and Client Secret from the GroupDocs Cloud Dashboard
- At least one document uploaded to your cloud storage
- Basic understanding of REST API concepts
- Development environment for your preferred language (optional for cURL examples)
Use Case Scenario
Imagine you’re building a document management system that needs to display document properties to users before they perform merge operations. You need to extract information such as:
- Document file extension
- File size
- File format
- Page dimensions
- Page visibility status
This tutorial will show you how to retrieve this information programmatically.
Step 1: Understanding the Document Information API
The GroupDocs.Merger Cloud API provides an endpoint that returns basic information about a document and its pages. The endpoint accepts the document's storage path in the request body.
Here’s what you can retrieve:
- Document properties: File extension, size in bytes, and file format
- Page properties: Width and height in pixels, page number, and visibility status
The API endpoint for this operation is:
HTTP POST ~/info
Step 2: Preparing Your Request
To retrieve document information, you’ll need to:
- Obtain an access token using your Client ID and Client Secret
- Make a POST request to the info endpoint with your document path
Let’s break this down:
Step 2.1: Get Your Access Token
First, you need to authenticate with the API to receive a JSON Web Token (JWT):
curl -v "https://api.groupdocs.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
Replace YOUR_CLIENT_ID and YOUR_CLIENT_SECRET with your actual credentials.
The response will contain an access token that you’ll use in the next step.
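If you would rather script this step than run cURL, the same token request can be sketched in Python with only the standard library. This is a minimal sketch, not the official SDK: the helper names build_token_request and fetch_access_token are hypothetical, and reading the token from an "access_token" field is an assumption based on the usual OAuth 2.0 client-credentials response shape.

```python
import json
import urllib.parse
import urllib.request

# Same token URL as the cURL call above.
TOKEN_URL = "https://api.groupdocs.cloud/connect/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the form-encoded token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the token string (needs network access)."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the token is returned in an "access_token" field.
    return payload["access_token"]
```

Building the request separately from sending it lets you inspect the URL, headers, and body before any network call is made.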
Step 2.2: Formulate Your Document Information Request
Now, construct a POST request to retrieve document information:
curl -v "https://api.groupdocs.cloud/v1.0/merger/info" \
-X POST \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
-d "{ \"FilePath\": \"/wordprocessing/four-pages.docx\" }"
Replace YOUR_ACCESS_TOKEN with the token you received in Step 2.1, and adjust the FilePath to match your document’s location in cloud storage.
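The cURL call above can also be expressed in Python using only the standard library. Treat this as a hedged sketch rather than the SDK's own method: build_info_request and get_document_info are hypothetical helper names, and the code assumes the endpoint responds with a JSON body.

```python
import json
import urllib.request

# Same endpoint URL as the cURL call above.
INFO_URL = "https://api.groupdocs.cloud/v1.0/merger/info"


def build_info_request(access_token: str, file_path: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for document information."""
    body = json.dumps({"FilePath": file_path}).encode("utf-8")
    return urllib.request.Request(
        INFO_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


def get_document_info(access_token: str, file_path: str) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    request = build_info_request(access_token, file_path)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as get_document_info(token, "/wordprocessing/four-pages.docx") would then return the response as a Python dictionary.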
Step 3: Understanding the Response
When you execute the request, you’ll receive a JSON response like this:
{
  "extension": ".docx",
  "pages": [
    {
      "width": 612,
      "height": 792,
      "pageNumber": 1,
      "visible": true
    },
    {
      "width": 612,
      "height": 792,
      "pageNumber": 2,
      "visible": true
    },
    {
      "width": 612,
      "height": 792,
      "pageNumber": 3,
      "visible": true
    },
    {
      "width": 612,
      "height": 792,
      "pageNumber": 4,
      "visible": true
    }
  ],
  "size": 8651,
  "fileFormat": "Microsoft Word Open XML Document"
}
This response shows:
- The document is a .docx file (Microsoft Word Open XML Document)
- It has 4 pages, each 612×792 pixels when converted to an image
- All pages are visible (important for page manipulation operations)
- The file size is 8,651 bytes
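The same reading of the response can be done in code. The sketch below rebuilds the sample response as a Python dictionary and condenses it into a short summary; the summarize helper and its output keys are hypothetical names chosen for illustration.

```python
# The sample response from this step, rebuilt as a Python dictionary.
sample = {
    "extension": ".docx",
    "pages": [
        {"width": 612, "height": 792, "pageNumber": n, "visible": True}
        for n in range(1, 5)
    ],
    "size": 8651,
    "fileFormat": "Microsoft Word Open XML Document",
}


def summarize(info: dict) -> dict:
    """Condense an info response into the facts listed above."""
    pages = info["pages"]
    return {
        "format": info["fileFormat"],
        "extension": info["extension"],
        "size_bytes": info["size"],
        "page_count": len(pages),
        # Distinct (width, height) pairs across all pages.
        "page_sizes": sorted({(p["width"], p["height"]) for p in pages}),
        "all_visible": all(p["visible"] for p in pages),
    }


print(summarize(sample))
```

For the sample response, the summary reports 4 pages, a single distinct page size of 612 by 792, and all pages visible.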
Step 4: Implementing with SDK in Your Application
While you can use direct REST API calls as shown above, GroupDocs provides SDKs for various programming languages to simplify integration. Let’s see how to get document information using these SDKs.
C# Example
// For complete examples and data files, please go to https://github.com/groupdocs-merger-cloud/groupdocs-merger-cloud-dotnet-samples
string MyClientSecret = "YOUR_CLIENT_SECRET"; // Get ClientId and ClientSecret from https://dashboard.groupdocs.cloud
string MyClientId = "YOUR_CLIENT_ID"; // Get ClientId and ClientSecret from https://dashboard.groupdocs.cloud
var configuration = new Configuration(MyClientId, MyClientSecret);
var apiInstance = new InfoApi(configuration);
var fileInfo = new FileInfo
{
    FilePath = "wordprocessing/four-pages.docx"
};
var request = new GetInfoRequest(fileInfo);
var response = apiInstance.GetInfo(request);
// Document information
Console.WriteLine("Document properties:");
Console.WriteLine($"File format: {response.FileFormat}");
Console.WriteLine($"Extension: {response.Extension}");
Console.WriteLine($"Size in bytes: {response.Size}");
// Page information
Console.WriteLine("\nPage information:");
foreach (var page in response.Pages)
{
    Console.WriteLine($"\nPage #{page.PageNumber}");
    Console.WriteLine($"Width: {page.Width}px");
    Console.WriteLine($"Height: {page.Height}px");
    Console.WriteLine($"Visible: {page.Visible}");
}
Python Example
# For complete examples and data files, please go to https://github.com/groupdocs-merger-cloud/groupdocs-merger-cloud-python-samples
import groupdocs_merger_cloud
client_id = "YOUR_CLIENT_ID" # Get your client ID and secret from https://dashboard.groupdocs.cloud
client_secret = "YOUR_CLIENT_SECRET" # Get your client ID and secret from https://dashboard.groupdocs.cloud
# Create instance of the API
configuration = groupdocs_merger_cloud.Configuration(client_id, client_secret)
api = groupdocs_merger_cloud.InfoApi(configuration)
# Prepare request
file_info = groupdocs_merger_cloud.FileInfo()
file_info.file_path = "wordprocessing/four-pages.docx"
request = groupdocs_merger_cloud.GetInfoRequest(file_info)
# Retrieve document information
result = api.get_info(request)
# Document information
print("Document properties:")
print(f"File format: {result.file_format}")
print(f"Extension: {result.extension}")
print(f"Size in bytes: {result.size}")
# Page information
print("\nPage information:")
for page in result.pages:
    print(f"\nPage #{page.page_number}")
    print(f"Width: {page.width}px")
    print(f"Height: {page.height}px")
    print(f"Visible: {page.visible}")
Java Example
// For complete examples and data files, please go to https://github.com/groupdocs-merger-cloud/groupdocs-merger-cloud-java-samples
import com.groupdocs.cloud.merger.client.*;
import com.groupdocs.cloud.merger.model.*;
import com.groupdocs.cloud.merger.api.InfoApi;
public class GetDocumentInformationExample {
    public static void main(String[] args) {
        String clientId = "YOUR_CLIENT_ID"; // Get ClientId and ClientSecret from https://dashboard.groupdocs.cloud
        String clientSecret = "YOUR_CLIENT_SECRET"; // Get ClientId and ClientSecret from https://dashboard.groupdocs.cloud
        // Create instance of the API
        Configuration configuration = new Configuration(clientId, clientSecret);
        InfoApi api = new InfoApi(configuration);
        try {
            // Prepare request
            FileInfo fileInfo = new FileInfo();
            fileInfo.setFilePath("wordprocessing/four-pages.docx");
            GetInfoRequest request = new GetInfoRequest(fileInfo);
            // Get document information
            InfoResult result = api.getInfo(request);
            // Document information
            System.out.println("Document properties:");
            System.out.println("File format: " + result.getFileFormat());
            System.out.println("Extension: " + result.getExtension());
            System.out.println("Size in bytes: " + result.getSize());
            // Page information
            System.out.println("\nPage information:");
            for (PageInfo page : result.getPages()) {
                System.out.println("\nPage #" + page.getPageNumber());
                System.out.println("Width: " + page.getWidth() + "px");
                System.out.println("Height: " + page.getHeight() + "px");
                System.out.println("Visible: " + page.isVisible());
            }
        } catch (Exception e) {
            System.out.println("Error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
Try It Yourself: Practice Exercises
Now that you understand how to retrieve document information, try these exercises to reinforce your learning:
Exercise 1: Retrieve information for different file types (PDF, PPTX, etc.) and compare the responses. Notice how different document formats may have different page properties.
Exercise 2: Create a simple application that displays document thumbnails based on the page dimensions retrieved from the API.
Exercise 3: Use the page visibility property to identify and list only visible pages in a document with hidden pages.
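As a possible starting point for Exercises 2 and 3, the sketch below works on the JSON response shape shown in Step 3. Both helpers, visible_page_numbers and thumbnail_size, are hypothetical names, and the 200-pixel default for the longest thumbnail side is an arbitrary assumption. Rendering the actual thumbnail images is left to you.

```python
def visible_page_numbers(info: dict) -> list:
    """Return the page numbers of all pages flagged as visible."""
    return [p["pageNumber"] for p in info["pages"] if p["visible"]]


def thumbnail_size(width: int, height: int, max_side: int = 200) -> tuple:
    """Scale page dimensions so the longer side equals max_side,
    preserving the aspect ratio."""
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A made-up three-page response in which page 2 is hidden.
info = {
    "pages": [
        {"width": 612, "height": 792, "pageNumber": 1, "visible": True},
        {"width": 612, "height": 792, "pageNumber": 2, "visible": False},
        {"width": 612, "height": 792, "pageNumber": 3, "visible": True},
    ]
}

print(visible_page_numbers(info))
print(thumbnail_size(612, 792))
```

With this input, the visible pages are 1 and 3, and a 612×792 page scales to a 155×200 thumbnail.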
Troubleshooting Tips
- Authentication errors: Ensure your Client ID and Client Secret are correct and that your account has appropriate permissions.
- File not found errors: Verify that the file path in your request matches the actual location in your cloud storage.
- Invalid format errors: Confirm that the document format is supported by GroupDocs.Merger Cloud by using the “Get Supported File Formats” API.
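When you call the REST endpoint directly, you could map failures to the tips above with a small helper like the one below. This is only a sketch: the status codes are assumptions based on general HTTP conventions (401 for failed authentication, 404 for a missing resource), not documented GroupDocs behavior, and hint_for_status and describe_error are hypothetical names.

```python
import urllib.error

# Assumed mapping from HTTP status codes to the troubleshooting tips above.
# The API may use different codes; check the response body to confirm.
HINTS = {
    401: "Check the Client ID and Client Secret, then request a fresh access token.",
    404: "Check that FilePath matches the document's location in cloud storage.",
}


def hint_for_status(status: int) -> str:
    """Return a troubleshooting hint for an HTTP status code."""
    return HINTS.get(
        status,
        f"Unexpected status {status}; inspect the response body for details.",
    )


def describe_error(err: urllib.error.HTTPError) -> str:
    """Format an HTTPError raised by urllib into a one-line message."""
    return f"HTTP {err.code}: {hint_for_status(err.code)}"
```

You would typically call describe_error inside an except urllib.error.HTTPError block around the request.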
What You’ve Learned
In this tutorial, you’ve learned how to:
- Retrieve comprehensive information about documents stored in the cloud
- Access detailed page properties for document manipulation
- Implement document information retrieval using different programming languages
- Interpret and utilize the API response to understand document structure
Next Steps
Now that you can retrieve document information, a logical next step is to learn about the file formats supported by GroupDocs.Merger Cloud. Continue to our Tutorial: Getting Supported File Formats to expand your knowledge.
Feedback
Did you find this tutorial helpful? Do you have questions about retrieving document information? Let us know in our support forum.