Reputation: 63
I'm using Power Query in Excel to reference a table within the same workbook. I want to select specific columns within that table. I know that can be accomplished by loading the table into Power Query and then choosing the columns I want to use. The resulting query is:
let
Source = Excel.CurrentWorkbook(){[Name="Legend_Data_Merged"]}[Content],
#"Removed Other Columns" = Table.SelectColumns(
Source,
{
"Observation number",
"First Sales Offer - Products",
"Middle Sales Offer(s) - Products",
"Last Sales Offer - Products"
}
)
in
#"Removed Other Columns"`
So, here's my question/issue:
I think this way is first pulling the entire table into Power Query, then stripping down from there. What I want to do is define the source table as the "Legend_Data_Merged" table, but choose which columns to pull from that table in the same operation. This way, it never has to load the entire table into Power Query. The reason is the table itself is about 120 columns long, and I only need three columns, and I have about 20 of these similar queries and it's starting to hog memory. Am I wrong in my logic here? And if not, anyone have an idea on what the query would be?
Could there maybe be a way to define the columns in the [content] part of the source operation ?
Thanks.
Upvotes: 2
Views: 3982
Reputation: 2922
You can refer to my question and answer here.
What you will want to do is use the Table.SelectColumns
method instead of Remove.
let
db = Sql.Databases("sqlserver.database.url"){[Name="DatabaseName"]}[Data],
Sales_vDimCustomer = Table.SelectColumns(
db{[Schema="Sales",Item="vDimCustomer"]}[Data],
{
"Name",
"Representative",
"Status",
"DateLastModified",
"UserLastModified",
"ExtractionDate"
}
)
in
Sales_vDimCustomer
When viewing the raw sql using Express Profiler it will be done in one statement where
SELECT
$Table.Name,
$Table.Representative,
$Table.Status,
$Table.DateLastModified,
$Table.UserLastModified,
$Table.ExtractionDate
FROM
Sales.vDimCustomer as $Table
PowerBi and Power Query will also now show an error/ warning message with this recommendation when trying to import a large number of columns.
Upvotes: 0
Reputation: 138
It may be a very simple attempt, but why not add a Worksheet "DataTransfer" where you set only references to the columns you need and read this small table with power query ?
If your columns are close together you could also set a named range and read only this range with powerquery.
But anyway, when the workbook is open, your big table is already in memory. There should not be much memory allocation, when reading the table with powerquery and selecting the three columns.
Upvotes: 1
Reputation: 6949
It's possible there's some problem in Excel or Power Query. How much memory are you seeing used by the excel.exe and Microsoft.Mashup.Container.NetFX40.exe process?
The only way to directly remove the columns from [Content]
is to modify the actual data of the Excel table. You could try that to see if it makes a difference, but Power Query generally tries to be smart about only loading columns it needs.
If your query is using a lot of memory, you might get performance saving your data in a more efficient format (I'd try CSV). In any case, try turning off the "load to worksheet" and instead just load to data model.
Upvotes: 0