Tutorial: Converting Documents to Spreadsheet Formats

In this tutorial, you’ll learn how to convert various document types to Excel and other spreadsheet formats using GroupDocs.Conversion Cloud API. We’ll cover conversion with different output options, handling complex conversions, and both storage-based and stream-based conversion methods.

Learning Objectives

By the end of this tutorial, you will be able to:

  • Convert documents to XLSX, XLS, and other spreadsheet formats
  • Customize spreadsheet conversion options for optimal results
  • Process password-protected documents during conversion
  • Implement both storage-based and stream-based conversions
  • Handle common spreadsheet conversion challenges

Prerequisites

Before starting this tutorial, you need:

  1. A GroupDocs.Conversion Cloud account
  2. Your Client ID and Client Secret credentials
  3. Basic understanding of REST API concepts
  4. Development environment with your preferred programming language set up
  5. Sample documents to test conversion (we’ll use various formats including Word and PDF)

Implementation Steps

Step 1: Authentication with GroupDocs.Conversion Cloud API

Before performing any operations, we need to authenticate with the API using your Client ID and Client Secret.

Try it yourself

First, let’s obtain a JWT access token using cURL:

# First get JSON Web Token
curl -v "https://api.groupdocs.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"

Make sure to replace YOUR_CLIENT_ID and YOUR_CLIENT_SECRET with your actual credentials.

If using an SDK, authentication is handled when configuring the client:

// Java SDK example
String clientId = "YOUR_CLIENT_ID";
String clientSecret = "YOUR_CLIENT_SECRET";
Configuration configuration = new Configuration(clientId, clientSecret);
ConvertApi apiInstance = new ConvertApi(configuration);

Step 2: Basic Document to Spreadsheet Conversion

Let’s start with a basic conversion from a Word document to XLSX format:

Try it yourself

Using cURL:

curl -X POST "https://api.groupdocs.cloud/v2.0/conversion" \
-H "accept: application/json" \
-H "authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d "{  
      'FilePath': 'documents/sample.docx',  
      'Format': 'xlsx',  
      'OutputPath': 'converted'
    }"

Don’t forget to replace YOUR_JWT_TOKEN with the actual token received in Step 1.

The response will contain the URL to the converted file:

{
  "name": "sample.xlsx",
  "size": 7265,
  "url": "MyStorage:converted/sample.xlsx"
}

Step 3: Implementing Conversion with Specific Options

Now, let’s implement a complete example that converts a document to XLSX with specific spreadsheet conversion options:

// Java SDK Example
import com.groupdocs.cloud.conversion.api.*;
import com.groupdocs.cloud.conversion.client.ApiException;
import com.groupdocs.cloud.conversion.model.*;
import com.groupdocs.cloud.conversion.model.requests.*;
import java.util.List;

public class ConvertToSpreadsheetExample {
    public static void main(String[] args) {
        // Configure API client
        String clientId = "YOUR_CLIENT_ID";
        String clientSecret = "YOUR_CLIENT_SECRET";
        Configuration configuration = new Configuration(clientId, clientSecret);
        ConvertApi apiInstance = new ConvertApi(configuration);
        
        try {
            // Prepare convert settings with spreadsheet-specific options
            ConvertSettings settings = new ConvertSettings();
            settings.setFilePath("documents/sample.docx");
            settings.setFormat("xlsx");
            
            // Specify spreadsheet conversion options
            SpreadsheetConvertOptions convertOptions = new SpreadsheetConvertOptions();
            convertOptions.setFromPage(1);        // Start from first page
            convertOptions.setPagesCount(5);      // Convert up to 5 pages
            convertOptions.setZoom(150);          // Set zoom level to 150%
            
            settings.setConvertOptions(convertOptions);
            settings.setOutputPath("converted");
            
            // Execute conversion
            List<StoredConvertedResult> result = apiInstance.convertDocument(
                new ConvertDocumentRequest(settings));
            
            System.out.println("Document converted successfully: " + result.get(0).getUrl());
        } catch (ApiException e) {
            System.err.println("Exception when calling ConvertApi: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

Step 4: Converting from PDF to XLSX

PDF documents often contain tabular data that would be valuable in spreadsheet form. Here’s how to convert a PDF to XLSX:

# Python SDK Example
import groupdocs_conversion_cloud
from groupdocs_conversion_cloud.models.requests import ConvertDocumentRequest

# Configure API client
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"
api_instance = groupdocs_conversion_cloud.ConvertApi.from_keys(client_id, client_secret)

try:
    # Prepare conversion settings
    settings = groupdocs_conversion_cloud.ConvertSettings()
    settings.file_path = "documents/sample.pdf"
    settings.format = "xlsx"
    
    # Configure PDF-specific load options
    load_options = groupdocs_conversion_cloud.PdfLoadOptions()
    load_options.password = ""  # If PDF is protected, provide password
    load_options.hide_pdf_annotations = True
    load_options.flatten_all_fields = True
    
    settings.load_options = load_options
    
    # Configure Excel-specific convert options
    convert_options = groupdocs_conversion_cloud.SpreadsheetConvertOptions()
    convert_options.from_page = 1
    convert_options.pages_count = 10
    convert_options.password = "newPassword"  # Optional: Password-protect the result
    
    settings.convert_options = convert_options
    settings.output_path = "converted"
    
    # Execute conversion
    request = ConvertDocumentRequest(settings)
    result = api_instance.convert_document(request)
    
    print(f"Document converted successfully: {result[0].url}")
except groupdocs_conversion_cloud.ApiException as e:
    print(f"Exception when calling ConvertApi: {e}")

Step 5: Stream-Based Conversion to Spreadsheet

For applications that need to process the converted spreadsheet directly without storing it:

// C# SDK Example
using System;
using System.IO;
using GroupDocs.Conversion.Cloud.Sdk.Api;
using GroupDocs.Conversion.Cloud.Sdk.Client;
using GroupDocs.Conversion.Cloud.Sdk.Model;
using GroupDocs.Conversion.Cloud.Sdk.Model.Requests;

namespace ConversionToSpreadsheetTutorial
{
    class Program
    {
        static void Main(string[] args)
        {
            // Configure API client
            var configuration = new Configuration("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");
            var apiInstance = new ConvertApi(configuration);

            try
            {
                // Specify conversion settings - note OutputPath is null for stream output
                var settings = new ConvertSettings
                {
                    StorageName = "MyStorage",
                    FilePath = "documents/sample.docx",
                    Format = "xlsx",
                    ConvertOptions = new XlsxConvertOptions()
                    {
                        FromPage = 1, 
                        PagesCount = 5
                    },
                    // OutputPath = null indicates stream output
                    OutputPath = null
                };

                // Convert and get result as a stream
                Stream response = apiInstance.ConvertDocumentDownload(
                    new ConvertDocumentRequest(settings));
                
                Console.WriteLine("Document converted successfully, stream size: " + 
                    response.Length.ToString() + " bytes");
                
                // Process the stream as needed, for example, save to local file
                using (var fileStream = File.Create("converted-spreadsheet.xlsx"))
                {
                    response.CopyTo(fileStream);
                }
            }
            catch (Exception e)
            {
                Console.WriteLine("Error: " + e.Message);
            }
        }
    }
}

Step 6: Converting Password-Protected Documents

If you need to convert password-protected documents to spreadsheet formats:

// Node.js SDK Example
const { ConvertApi, Configuration } = require("groupdocs-conversion-cloud");

// Configure API client
const clientId = "YOUR_CLIENT_ID";
const clientSecret = "YOUR_CLIENT_SECRET";
const config = new Configuration(clientId, clientSecret);
const apiInstance = new ConvertApi(config);

// Prepare conversion settings
const settings = {
    filePath: "documents/protected.docx",
    format: "xlsx",
    loadOptions: {
        // DocxLoadOptions for password-protected Word document
        password: "documentPassword",
        hideWordTrackedChanges: true,
        defaultFont: "Arial"
    },
    convertOptions: {
        // XlsxConvertOptions for resulting spreadsheet
        fromPage: 1,
        pagesCount: 10,
        password: "resultPassword", // Optional: set password for the resulting file
        usePdf: false
    },
    outputPath: "converted"
};

// Execute conversion
apiInstance.convertDocument({ convertSettings: settings })
    .then((result) => {
        console.log(`Document converted successfully: ${result[0].url}`);
    })
    .catch((error) => {
        console.log(`Error: ${error.message}`);
    });

Spreadsheet-Specific Conversion Options

When converting to spreadsheet formats, you can take advantage of these specialized options:

OptionDescriptionDefault
FromPageFirst page number to convert1
PagesCountNumber of pages to convertAll pages
PasswordSet password for the resulting spreadsheetnull
ZoomZoom factor for conversion (in %)100
UsePdfWhether to use PDF as intermediate formatfalse
OnePagePerSheetPlace each page on a separate worksheetfalse
ShowGridLinesDisplay grid lines in the resulting spreadsheettrue
SetCustomHeightUse custom row heightfalse

Troubleshooting Common Issues

1. Table Structure Issues

If the table structure in the converted spreadsheet doesn’t match the original:

  • Try setting UsePdf to true, which may preserve layout better
  • For complex tables, consider using the intermediate PDF conversion

2. Formatting Problems

If formatting isn’t preserved properly:

  • Check if the source document uses custom fonts and consider specifying them
  • Try adjusting the zoom level to improve layout preservation

3. Performance Considerations

For large documents:

  • Use the FromPage and PagesCount parameters to convert only the needed pages
  • Consider using parallel processing for batch conversions

What You’ve Learned

In this tutorial, you’ve learned:

  • How to convert various document formats to spreadsheet formats
  • Customizing spreadsheet conversion options for optimal results
  • Handling password protection during conversion
  • Implementing both storage-based and stream-based approaches
  • Troubleshooting common spreadsheet conversion issues

Further Practice

To reinforce your learning, try these exercises:

  1. Convert a multi-page PDF document to XLSX with each page as a separate worksheet
  2. Create a batch conversion utility that converts multiple documents to spreadsheet format
  3. Implement a web form that allows users to upload and convert documents with custom options
  4. Add error handling that provides useful feedback for common conversion issues

Next Tutorial

Ready to learn more? Continue with our Tutorial: Converting Documents to Word Processing Formats to master specialized conversions to DOCX, DOC, and other text document formats.

Additional Resources

Have questions about this tutorial? Feel free to reach out on our forum for support.