January 28, 2010

Force DBQuery to Read CSV Columns as Text

Filed under: Scripting — Marcus Tettmar @ 1:04 pm

In More on Reading and Writing CSV Files and various other posts I have demonstrated how CSV data can be read via a SQL SELECT statement using DBQuery.

In order to do this we used an ADO connection string in the DBConnect line:

Let>ConStr=Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\My Documents;Extended Properties='text;HDR=NO;FMT=Delimited'
DBConnect>ConStr,dbH

“Proper” CSV files use the double quote character to delimit strings. But there are many CSV files that do not contain these delimiters. When string delimiters aren’t used this should normally mean that the field is numeric, but that may not always be the case. So ADO sometimes has to take a guess as to whether a column is a string or a number. By default it does this by looking at the first 8 rows of data. This may not always be enough.

Watch what happens if we try to read in this file, which contains IP addresses:

abc.pluto.server.com,18.140.1.2
zxy.saturn.server.com,18.120.2.1
fyg.pluto.server.com,18.5.6.2

We’ll read it in with:

DBQuery>dbH,SELECT * FROM servers.csv,fields,num_recs,num_fields

Something weird happens. Look in the watch list and you’ll see the IP addresses have become:

18.1401
18.1202
18.562

ADO has decided that the IP addresses are numbers. It didn’t see any quote marks and after analysing the rows decided they are meant to be numbers. Bonus points go to anyone who can tell me why it has transformed the numbers in that way.

As mentioned before, ADO looks at the first 8 rows. If more of those 8 are more obviously strings, or contain string delimiters, it will treat them as strings. Try delimiting just one row of the above three. You’ll still get numbers. Delimit two and the problem is solved. Where the data type is mixed it will choose the type of the majority.

But we can’t go editing the CSV files to correct them.

What we can do is use a Schema.ini file. Using a Schema.ini file we can tell ADO what data types should be used for each column. So for our Servers.csv example above we could make a Schema.ini file as follows:

[Servers.csv]
Col1=ServerName Text
Col2=IPAddress Text

This file should be in the same folder as the CSV file.

Now, when we run our query we get the correct values, because we’ve forced ADO to treat the columns as text. Another advantage of this is that we have given our columns names, which is useful because our file doesn’t have a header column. This means we could set the Fieldnames parameter of our DBQuery line to 1 to return our array using the field names.

For more information on Schema.ini options for text files see:
http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx

Most of the time you’ll probably be working with CSV files that you know about, so manually creating a schema.ini file once is no issue. But you could have the macro create it for you using WriteLn and/or EditIniFile.

What if your macro needs to read in a CSV file and you don’t already know how many fields there are, but you want to force everything to text? Well, how about doing this:

//Create schema.ini file
DBQuery>dbH,select * from servers.csv,fields,num_recs,num_fields,0
WriteLn>%SCRIPT_DIR%\schema.ini,res,[servers.csv]
Let>k=0
Repeat>k
  Let>k=k+1
  EditIniFile>%SCRIPT_DIR%\schema.ini,servers.csv,Col%k%,Col%k% Text
Until>k=num_fields

//now requery using schema.ini file
DBQuery>dbH,select * from servers.csv,fields,num_recs,num_fields,1
...

So this code uses a DBQuery to get the column count, and creates a schema.ini with each column called Coln where n is the column number and setting it to Text.


Side note: There’s another option called MaxScanRows which is supposed to help by telling ADO how many rows to scan, so that instead of looking at only 8 rows you can tell it to look at more, or less. According to the docs setting it to zero will cause it to scan all rows. Given that in our IP address experiment it took the IP addresses as numbers and we only had 3 which is less than the default 8 rows, I would have expected that setting MaxScanRows to 0 would make no difference. In fact it causes the IP addresses to come back as Text. So it seems to solve our initial problem., but I’m not sure why. Googling MaxScanRows reveals a number of forum and news group posts where others have had problems and there seems to be some confusion over exactly what it does. If we use MaxScanRows without ColNameHeader it also causes ADO to see the first line as a header line. And then setting ColNameHeader to false, while ensuring the first line is not seen as a header, brings back the original problem, where the IP addresses are converted to numbers. All seems a bit strange. Hence my preference for explicitly setting the data type of each column.

January 27, 2010

Why Does My Script “Hang” and Never Continue?

Filed under: Automation — Marcus Tettmar @ 9:47 am

This is a variation of a question we get regularly. Usually it turns out that the script is waiting on a WaitWindowOpen line because of a typo in the window title.

WaitWindowOpen, given a window title, will cause the script to wait until a window with that title exists.

If you misspell the window title WaitWindowOpen will wait forever, because a matching window will never appear.

Rather than rely on your typing, you can select the window title from a drop down list. With the window you want to wait for open on the screen edit your WaitWindowOpen line in the Code Builder by right clicking on it and selecting “Edit in Code Builder”. Now, select the window title from the drop down box.

Remember you can also do a sub-string match using the asterisk:

WaitWindowOpen>Notepad*

This works for all window functions and will match the first window found that contains the given text (regardless of case). So the above will match “Untitled – Notepad” as well as something like “My Notepad – MyFile.txt”. It will stop at the first one found, so don’t be too general and try to find something unique.

More advanced users might be interested to know that we’re adding the ability to use Regular Expressions in the window functions for version 12.

January 26, 2010

Get Auto-Generated ID Created By INSERT SQL

Filed under: Scripting — Marcus Tettmar @ 2:32 pm

If you have a database table with an auto-generated numerical index field, you may need to retrieve the value of this field after an INSERT statement.

In Microsoft Access such a field uses an “AutoNumber” data type. In MySQL it’s known as an “auto_increment” field.

We often see these types of fields where a relationship exists between two tables, for example between a “customers” table and “customer_orders” table. The “customers” table may have an AutoNumber field which the system automatically increments when a record is inserted, and in the “customer_orders” table you’ll see a “CustomerID” field which uses this same value to map one or more orders to one customer.

It’s a convenient way to have the database itself automatically create a unique identifier.

If you ever need to use Macro Scheduler to transfer data into such a database you can use the built-in database functions to handle the INSERTS. But when it comes to inserting the data into the second table you’re going to need a way to get the ID that was automatically generated for the first.

Some database technologies give you a function to do just that. MySQL has a function called LAST_INSERT_ID() which you can call with a SELECT statement:

DBExec>dbID,INSERT into customers (name,address1,address2) VALUES ("Mr Customer","..",".."),res 
DBQuery>dbID,SELECT LAST_INSERT_ID(),last_id,nr,nf,0

This will return the ID in LAST_ID_1_1 and you can then use it in your next INSERT for the detail records.

See: MySQL: last_insert_id

Microsoft Access does the same with a function called @@identity:

DBExec>dbID,INSERT into customers (name,address1,address2) VALUES ("Mr Customer","..",".."),res 
DBQuery>dbID,SELECT @@identity,last_id,nr,nf,0

See: Retrieving Identity or Autonumber Values

Your particular flavour of database might have something similar. If not you could take a risk and do this:

DBQuery>mdbH,SELECT TOP 1 ID FROM customers ORDER BY ID DESC,last_id,nr,nf,0

This will return the largest ID – the latest one. But if you’re working on a database that has many users updating it simultaneously there’s a danger you could retrieve the ID generated by someone else’s INSERT. If it’s just you, then it’s probably OK, but you’ve been warned.

For more information on Macro Scheduler’s database functions see:
Using Macro Scheduler’s Database Functions

January 16, 2010

Haiti Disaster Relief – Donate and Get Macro Scheduler Lite Free

Filed under: Announcements — Marcus Tettmar @ 10:27 pm

Over the last few days, like many I’m sure, I’ve been wondering what else I can do to help the people of Haiti beyond sending a donation to one of the many charities.

Then today, I saw this from my friend Andy Brice of PerfectTablePlan. He’s giving licenses away to people who donate money to Haiti.

I think this is a great idea and I’m going to try the same thing.

If you donate at least $30 to Haiti disaster relief via the American Red Cross, the UK Disasters Emergency Committee or similar, send us an email and we’ll send you a free license for Macro Scheduler Lite.

Just send us an email and tell us what you donated.

For now the offer is available until end of Tuesday 19th Jan 2010 Sunday 31st January 2010. I may extend it. Please tell your friends, post on facebook, tweet it, email a link to this blog … whatever. Just make sure you donate.

January 15, 2010

January 14, 2010

Amazon Kindle T-Shirt Competition – Zazzle Problems and Instructions

Filed under: Announcements, General — Marcus Tettmar @ 2:19 pm

As you probably know, we’re running a T-Shirt design contest where you could win an Amazon Kindle worth $259.

Some people have had problems sending us a link to their T-Shirt design so I checked it out and found that the process for getting a public link to your shirt is not particularly obvious. So I thought I’d post instructions for the easiest way to do it:

  • Customize the design. Click here to view the T-shirt and then click the Customize button.
  • When you’re done click on “Email” which is beneath the right-hand box, underneath the Add to Cart section.
  • You’ll be prompted to log in or create an account (just a username and email address is required).
  • You’ll then be presented with an email form. Enter your email address as both the sender and recipient so that you can send your design to yourself. If you like add [email protected] to the To list, so that we also get a copy.
  • Make a cup of coffee.
  • Check your email, grab the link and go paste it in a comment here.

If you’ve already created a design, do not fear, it is not lost. Log into Zazzle (click on My Account at top right). Click on the “Products” tab. On the left under “My Products” you will see “In Progress”, “Private” and “Public”. Your T-Shirt will be under one of these – probably the “In Progress” one. Click on it and then do the Email thing as in the above instructions.

I hope that clears up any confusion. It’s worth the effort – you could win a $259 Kindle!

January 11, 2010

Remove Tags From HTML with RegEx

Filed under: Scripting — Marcus Tettmar @ 10:09 am

A quick and easy way to remove HTML tags from a string:

Let>Html=

Hello World

Test link Let>pattern=<(.|\n)*?> RegEx>pattern,Html,0,matches,numMatches,1,,plainText MessageModal>plainText

January 6, 2010

Amazon Kindle Competion Update

Filed under: Announcements, General — Marcus Tettmar @ 11:22 am

Happy New Year everyone!

What do you do when you’re snowed in and can’t get to work? Get creative that’s what! Enter our T-Shirt design contest and win an Amazon Kindle!

Yes, just in case you missed it, we’re running a T-Shirt design contest. Design a new MJT Net T-Shirt and you could win an Amazon Kindle. The competition closes 31st January. Take a look at the comments to see some of the entries so far, and then post your own.

December 29, 2009

Macro Scheduler 11.1.19 Update

Filed under: Announcements, Macro Recorder — Marcus Tettmar @ 11:17 am

Macro Scheduler 11.1.19 is now available with the following changes:

  • Fixed: “Not a valid date/time” error if regional short date format has a month name in it
  • Fixed: Variable Explorer reporting incorrect variable if parameters have spaces
  • Fixed: Macro Recorder causing hang or doing nothing or failing to stop on a minority of systems

Registered Downloads/Upgrades | Evaluation Downloads | New License Sales