Simon Hayward
Simon Hayward

Reputation: 724

Sorting Excel rows alphabetically in F# (Office.Interop)

I am using the Excel interop in Visual Studio 2010 to try to sort all of these rows of data alphabetically. Some are already in alphabetical order.

Accountancy Graduate, Trainees  Banking, Insurance, Finance
Accountancy Graduate, Trainees  Customer Services
Accountancy Graduate, Trainees  Education
Accountancy Graduate, Trainees  Health, Nursing
Accountancy Graduate, Trainees  Legal
Accountancy Graduate, Trainees  Management Consultancy
Accountancy Graduate, Trainees  Media, New Media, Creative
Accountancy Graduate, Trainees  Oil, Gas, Alternative Energy
Accountancy Graduate, Trainees  Public Sector & Services
Accountancy Graduate, Trainees  Recruitment Sales
Accountancy Graduate, Trainees  Secretarial, PAs, Administration
Accountancy Graduate, Trainees  Telecommunications
Accountancy Graduate, Trainees  Transport, Logistics

The current version of my code is as follows (I'm getting my code to work in interactive before putting it into an fs file).

#r "office.dll"
#r "Microsoft.Office.Interop.Excel.dll"

open System;;
open System.IO;;
open Microsoft.Office.Interop.Excel;;




let app = new ApplicationClass(Visible = true)
let inputBook = app.Workbooks.Open @"C:\Users\simon.hayward\Dropbox\F# Scripts\TotalJobsSort\SortData.xlsx" //work
//let inputBook = app.Workbooks.Open @"C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\SortData.xlsx"  //home
let outputBook = app.Workbooks.Add()
let inSheet = inputBook.Worksheets.[1] :?> _Worksheet
let outSheet = outputBook.Worksheets.[1] :?> _Worksheet

let rows = inSheet.UsedRange.Rows.Count;;

let toSeq (range : Range) =
      seq {
           for r in 1 .. range.Rows.Count do
                  for c in 1 .. range.Columns.Count do
                      let cell = range.Item(r, c) :?> Range
                      yield cell
              }
for i in 1 .. rows do 

      let mutable row = inSheet.Cells.Rows.[i] :?> Range

      row |> toSeq |> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort |> 

     (outSheet.Cells.Rows.[i] :?> Range).Value2 <- row.Value2;;



app.Quit();;

But there is a problem with types. The final line before the quit command

(outSheet.Cells.Rows.[i] :?> Range).Value2 <- row.Value2;;

Is red underlined by intellisense and the error I get is

"This expression is expected to have type seq -> 'a but here has type unit".

I get what VS is trying to tell me, but I have made several attempts to fix this now and i can't seem to get around the type issue.

Can anyone please advise how I can get the pipeline to the correct type so that the output will write to my output sheet?

EDIT 1: This is the full error message that I get with the sorted variable commented out as follows

let sorted = row |> toSeq //|> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort

The error message is:-

System.Runtime.InteropServices.COMException (0x800A03EC): Exception from HRESULT: 0x800A03EC at System.RuntimeType.ForwardCallToInvokeMember(String memberName, BindingFlags flags, Object target, Int32[] aWrapperTypes, MessageData& msgData) at Microsoft.Office.Interop.Excel.Range.get_Item(Object RowIndex, Object ColumnIndex) at [email protected](Int32 c) in C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\sortExcelScript.fsx:line 36 at [email protected](b& ) at Microsoft.FSharp.Collections.IEnumerator.MapEnumerator1.System-Collections-IEnumerator-MoveNext() at Microsoft.FSharp.Core.CompilerServices.RuntimeHelpers.takeOuter@651[T,TResult](ConcatEnumerator2 x, Unit unitVar0) at Microsoft.FSharp.Core.CompilerServices.RuntimeHelpers.takeInner@644[T,TResult](ConcatEnumerator2 x, Unit unitVar0) at <StartupCode$FSharp-Core>[email protected](IEnumerable1& next) at Microsoft.FSharp.Core.CompilerServices.GeneratedSequenceBase1.MoveNextImpl() at Microsoft.FSharp.Core.CompilerServices.GeneratedSequenceBase1.System-Collections-IEnumerator-MoveNext() at Microsoft.FSharp.Collections.SeqModule.ToArray[T](IEnumerable1 source) at Microsoft.FSharp.Collections.ArrayModule.OfSeq[T](IEnumerable1 source) at .$FSI_0122.main@() in C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\sortExcelScript.fsx:line 42 Stopped due to error

EDIT 2: Could this problem be due to the toSeq function being designed to turn a whole sheet into a sequence? Where I apply it I only want it to apply to one row.

I have tried limiting the r variable in toSeq to 1, but this didn't help.

Does the fact that my actual data is a jagged array matter? It does not always have 3 entries in each row, it varies between 1 and 4.

EDIT 3:

Here is the current iteration of my code, based on Tomas' suggestions

#r "office.dll"
#r "Microsoft.Office.Interop.Excel.dll"

open System;;
open System.IO;;

open Microsoft.Office.Interop.Excel;;



let app = new ApplicationClass(Visible = true);;
let inputBook = app.Workbooks.Open @"SortData.xlsx" //workbook
let outputBook = app.Workbooks.Add();;
let inSheet = inputBook.Worksheets.[1] :?> _Worksheet
let outSheet = outputBook.Worksheets.[1] :?> _Worksheet

let rows = inSheet.UsedRange.Rows.Count;;
let columns = inSheet.UsedRange.Columns.Count;;

// Get the row count and calculate the name of the last cell e.g. "A13"
let rangeEnd = sprintf "A%d" columns

// Get values in the range A1:A13 as 2D object array of size 13x1
let values = inSheet.Range("A1", rangeEnd).Value2 :?> obj[,]

// Read values from the first (and only) column into 1D string array
let data = [| for i in 1 .. columns -> values.[1, i] :?> string |]
// Sort the array and get a new sorted 1D array
let sorted1D = data |> Array.sort
// Turn the 1D array into 2D array (13x1), so that we can write it back
let sorted2D = Array2D.init 1 columns (fun i _ -> data.[i])

// Write the data to the output sheet in Excel
outSheet.Range("A1", rangeEnd).Value2 <- sorted2D

But because the actual data has a variable number of entries in each row I am getting the standard range exception error (this is an improvement on the HRESULT exception errors of the last few days at least).

So I need to define columns for each individual row, or just bind the length of the row to a variable in the for loop. (I would guess).

Upvotes: 2

Views: 1174

Answers (1)

Tomas Petricek
Tomas Petricek

Reputation: 243041

It looks like you have an additional |> operator at the end of the line with Seq.sort - this means that the list is sorted and then, the compiler tries to pass it to the expression that performs the assignment (which does not take any parameter and has a type unit).

Something like this should compile (though there may be some other runtime issues):

for i in 1 .. rows do 
  let row = inSheet.Cells.Rows.[i] :?> Range
  let sorted = row |> toSeq |> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort
  (outSheet.Cells.Rows.[i] :?> Range).Value2 <- Array.ofSeq sorted

Note that you do not need to mark row as mutable, because the code creates a copy (and - in my version - assigns it to a new variable sorted).

I also use Array.ofSeq to convert the sorted sequence to an array, because I think the Excel interop works better with arrays.

When setting the Value2 property on a range, the size of the range should be the same as the size of the array that you're assigning to it. Also, depending on the range you want to set, you might need a 2D array.

EDIT Regarding runtime errors, I'm not entirely sure what is wrong with your code, but here is how I would do the sorting (assuming you have just one column with string values and you want to sort the rows):

// Get the row count and calculate the name of the last cell e.g. "A13"
let rows = inSheet.UsedRange.Rows.Count
let rangeEnd = sprintf "A%d" rows

// Get values in the range A1:A13 as 2D object array of size 13x1
let values = inSheet.Range("A1", rangeEnd).Value2 :?> obj[,]
// Read values from the first (and only) column into 1D string array
let data = [| for i in 1 .. rows -> values.[i, 1] :?> string |]
// Sort the array and get a new sorted 1D array
let sorted1D = data |> Array.sort
// Turn the 1D array into 2D array (13x1), so that we can write it back
let sorted2D = Array2D.init rows 1 (fun i _ -> data.[i])

// Write the data to the output sheet in Excel
outSheet.Range("A1", rangeEnd).Value2 <- sorted

Upvotes: 4

Related Questions