Extract Excel Worksheet Data Without Tables - Power Automate
A reasonably sound method for coaxing unstructured data from Excel worksheets
Overview
Data Location | Power Automate Compatibility |
Excel table | Good functionality, with some thresholds and limitations |
Excel table (with key column) | Same as above, plus additional OData capabilities |
Excel CSV | Supported, but not through the Excel Online connector |
Excel worksheet | Data operations not supported |
Historically, that last item creates a lot of frustration. ๐ Excel is used extensively in the business world and this limitation puts its most commonly-used format outside the reach of Microsoft's own automation platform.
Unstructured Data - Example
The data in this worksheet looks like a table to you and me, but without a formal definition, Automate can't work with the information here.
It needs explicit structure before column relationships and the corresponding data inside the Excel document can be utilized.
The Scenario
You might not use Automate to work with Excel very often1, but like most juggernauts, it's sometime unavoidable. Here is a situation that presented itself recently:
- A partner organization emails a collection of Excel worksheets every morning
- A flow stores them in a SharePoint Online document library
- The worksheets have consistent structures but the data is not contained within tables
- The data needs to reach Azure SQL each day without manual intervention
The Solution
Preparing For Multiple Files/Formats
Our scenario involves multiple worksheets, each with a unique data structure. I used an array variable to store configuration details at the beginning of the flow:
Note: If you're working with one Excel file, feel free to skip to the next section
Each object contains information about a worksheet:
Name | Example | Usage |
filename | incomingWorksheet2.xlsx | Name of the Excel worksheet |
colWidth | BA | Identifier for the last expected column |
rowLength | 2500 | Number of table rows to create (see notes on "padding" in next section) |
colHeadings | CustomerName;Item;Price | OPTIONAL - column heading overrides |
tableName | Sales | SQL table where data will be stored. Also used to name each Excel table created. |
keyColumn | saleID | Column used to filter results - use one that has a value in every row |
Next, we will loop over the array, and the real party can begin:
Note: Pre-baked arrays are okay when quickly proving out a new method. For an actual solution, environment variables are a better choice.
Adding Tables into Excel Worksheets with Existing Data
The Excel connector within Power Automate has a Create Table action. You can use this action to insert a table into an Excel worksheet that already contains data. I was very surprised to see this work!
We're not out of the woods yet, though. The amount of worksheet data fluctuates each day in our scenario, which is a problem because Automate can't "see" how much data exists at each runtime.
We can mitigate some of the uncertainty by "padding" the Create Table action. For example, if you expect ~2,000 rows of data in a worksheet, have the flow create 2,500-3,000 rows when inserting the table.
Without surplus rows, you risk isolating data. Automate can't create tables twice in the same worksheet, so options for self-correction are slim if you undershoot the mark. Aim high, but don't go too far overboard!
Note: This "padding" technique only works if you can make solid predictions about the expected data
Here is the Create Table action, using dynamic values from the configuration array:
Note: Use fixed values for the parameters if you're operating on a single worksheet
If your setup is correct, the existing Excel data will remain in place while the new table is inserted. The action will continue until it hits the row limit, adding blank rows to the worksheet once it runs past the data.
Getting Data Back from Excel
Use List Table Rows to pull the data into Automate. Provide an OData filter query (under advanced options) to retrieve non-blank rows only:
items('for_each_Excel_document')?['keyColumn']
Unless you are dealing with very small data sets, don't forget to enable pagination for the List Table Rows action. This will allow the flow to pull all the data back from Excel in successive chunks, then proceed once complete.
Set the threshold equal to the highest row count you'll be using in the Create Table action(s).
Do Something With the Results
If you've made it this far, the possibilities are wide open! Extracting unstructured data from Excel worksheets using Power Automate is no small feat.
Possibilities From Here
- Loop over each row in the result set and store the data somewhere (SQL, SharePoint, Dataverse, etc.) line-by-line
- Use a switch statement to handle the results from unique worksheets in different ways
- Pass the result set into a SQL table (1,000 rows at a time) using the Execute a SQL query action
Thoughts on Excel as a Data Source
Excel is often a popular data source for users getting started with the Power Platform. It's a familiar, flexible and portable format.
I don't typically recommend it - there are more performant and less volatile options available. Sometimes you don't have a choice, so it's good to have a few tricks up your sleeve.
Disclosure and Final Thought
There are better, far more elegant ways to solve this problem than the approach described on this page. What's outlined above isn't a foolproof method. It's not designed for heavy usage across a broad spectrum of enterprise-grade scenarios.
But in a pinch, and for the right use cases, this method should get the job done. I hope this comes in handy for individuals who don't necessarily have the means, expertise or time to handle app registration, HTTP calls, Graph, and so forth. As you continue your journey along the Power Platform progression arc, I hope you learn those methods, too.