This morning, forum user hip asked how a script could sort a CSV file. Seems CSV files, sorting and Excel has become a bit of a hot topic here lately. Since Excel can read and save to CSV and also has a Sort function we might as well make use of it. Here’s a little VBScript function I put together which will sort a CSV file by a given column:
VBSTART Sub SortCSVFile(file,sort_column) Set XLApp = CreateObject("Excel.Application") XLApp.Workbooks.Open(file) XLApp.ActiveWorkbook.ActiveSheet.Sort.SortFields.Clear Set SortCol = XLApp.Range(sort_column) XLApp.ActiveWorkbook.ActiveSheet.Sort.SortFields.Add(SortCol) Set Cell = XLApp.ActiveWorkbook.ActiveSheet.Cells.Find("*", XLApp.ActiveWorkbook.ActiveSheet.Cells(1, 1),,,,2) Set mYRange = XLApp.Range("A1", XLApp.ActiveWorkBook.ActiveSheet.Cells(Cell.Row, Cell.Column)) XLApp.ActiveWorkbook.ActiveSheet.Sort.SetRange(mYRange) XLApp.ActiveWorkbook.ActiveSheet.Sort.Apply XLApp.ActiveWorkBook.Save XLApp.ActiveWorkBook.Close false XLApp.quit End Sub VBEND
The function opens the CSV file in Excel – without making Excel visible – then performs the sort, saves the file and closes Excel. You won’t see Excel appear on the screen.
Here’s how to use it. If you want to sort your file on column B and don’t have a header row, use:
VBRun>SortCSVFile,C:\Documents\my.csv,B1
To sort on column A and you DO have a header row use:
VBRun>SortCSVFile,C:\Documents\my.csv,A2
UPDATE: Here’s a simplified version which should work in Excel XP/2003 and should be able to detect whether or not your data has a header row:
VBSTART Sub SortCSVFile(file,sort_column) Set XLApp = CreateObject("Excel.Application") XLApp.Workbooks.Open(file) XLApp.ActiveWorkbook.ActiveSheet.Range("A1").Sort XLApp.ActiveWorkbook.ActiveSheet.Range(sort_column),,,,,,,0 XLApp.ActiveWorkBook.Save XLApp.ActiveWorkBook.Close false XLApp.quit End Sub VBEND
What if you don’t have Microsoft Excel?
As mentioned in my reply on the forum another method you could use is to open the CSV file as a recordset using DBQuery and apply an “ORDER BY” clause to your SQL statement. That would sort it into memory. Then you would need to loop through the array and write the data back out to a CSV file using WriteLn.
Something like:
//new file to write to Let>new_file=%SCRIPT_DIR%\example_sorted.csv IfFileExists>new_file DeleteFile>new_file Endif //set WriteLn NOT to add line breaks (we'll add those ourselves) Let>WLN_NOCRLF=1 //The CSV delimiter Let>comma=, //Connect to and query the CSV using ORDER BY PRICE to sort on Price column Let>ConStr=Provider=Microsoft.Jet.OLEDB.4.0;Data Source=%SCRIPT_DIR%;Extended Properties='text;HDR=YES;FMT=Delimited' DBConnect>ConStr,dbH DBQuery>dbH,select * from example.csv ORDER BY PRICE,fields,num_recs,num_fields //If we have records, loop through records and fields, writing out to new file If>num_recs>0 //first need to copy the header record ... ReadLn>%SCRIPT_DIR%\example.csv,1,header_line WriteLn>new_file,wlnres,%header_line%%CRLF% //the loop Let>rec=0 Repeat>rec Let>rec=rec+1 Let>field=0 Repeat>field Let>field=field+1 Let>this_field=fields_%rec%_%field% //write the field out to the new file WriteLn>new_file,wlnres,this_field //add the comma delimiter if needed If>field<num_fields WriteLn>new_file,wlnres,comma Endif Until>field=num_fields //add CRLF at end of line WriteLn>new_file,wlnres,%CRLF% Until>rec=num_recs Endif DBClose>dbH //if you want, delete the input file and replace with the new sorted one: //DeleteFile>%SCRIPT_DIR%\example.csv //RenameFile>%SCRIPT_DIR%\example_sorted.csv,%SCRIPT_DIR%\example.csv