An interesting discussion arose in the forums yesterday on how to read data from a CSV file. The usual suggestion is to use Separate to split each line into an array using the comma as the delimiter, as explained in my post here.
The problem, as highlighted in the forum topic, is: what if the line has a field which itself contains a comma? In CSV if a field contains a comma it will be encased in double quotes. E.g.:
Sally,”2,500″,Blue
If we just use Separate using the comma as a delimiter we will end up with the following values:
Sally
“2
500”
Blue
Of course this isn’t what we want. We want to end up with:
Sally
2,500
Blue
In the forum discussion a number of suggestions were made such as to use regular expressions to find the fields inside quotes and replace the commas with something else, then use Separate and then put the commas back. Certainly we could resort to some kind of text parsing and I’ve updated the post with a RegEx solution.
However, we can avoid all this by using DBQuery and an ADO connection string that lets us connect to and query a CSV file and treat it like a database table. All the work is done for us:
Let>ConStr=Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\My Documents;Extended Properties='text;HDR=NO;FMT=Delimited' DBConnect>ConStr,dbH DBQuery>dbH,select * from test.csv,fields,num_recs,num_fields
In the connection string specify the path of the CSV file in Data Source. We can also tell it whether or not the CSV file has a header line by setting HDR to NO or Yes.
This reads all the data from the CSV file into an array.
Here’s some code to read through the array and display each field one by one:
Let>rec=0 Repeat>rec Let>rec=rec+1 Let>field=0 Repeat>field Let>field=field+1 Let>this_field=fields_%rec%_%field% MessageModal>this_field Until>field=num_fields Until>rec=num_recs
If the CSV file has a header line change HDR to Yes and then we could also tell DBQuery to return the field names:
Let>ConStr=Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\My Documents;Extended Properties='text;HDR=NO;FMT=Delimited' DBConnect>ConStr,dbH DBQuery>dbH,select * from test.csv,fields,num_recs,num_fields,1
Then we can access the data using the field names. E.g.:
Let>rec=0 Repeat>rec Let>rec=rec+1 Let>name=fields_%rec%_name Let>number=fields_%rec%_number Let>colour=fields_%rec%_colour MessageModal>%name% %number% %colour% Until>rec=num_recs
For more help with Macro Schedulers database functions see:
Using Macro Scheduler’s Database Functions