Error when reading .txt (unicode)

mafiamoe
Junior Coder
Posts: 38
Joined: Thu Jul 06, 2006 8:28 am

Post by mafiamoe » Tue Sep 25, 2007 11:22 pm

When I try to read a line from a file using

Readln>C:\Documents and Settings\Owner\My Documents\file_20070924_175218.txt,%variable%,line

I get either:

Line=
(returns nothing)

OR

Line=##ERR## - Code : 32


It seems the file was created with WordPad rather than Notepad. When I open it in Notepad, there are some characters that Notepad cannot read. Is there any other way to read the file?

Thanks
Last edited by mafiamoe on Thu Sep 27, 2007 10:00 pm, edited 1 time in total.

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Wed Sep 26, 2007 2:30 am

You can try reading the whole file into a variable with ReadFile> but it would probably help to get a better answer if you could tell us what you ultimately need to do with the file.

JRL
Automation Wizard
Posts: 3532
Joined: Mon Jan 10, 2005 6:22 pm
Location: Iowa

Post by JRL » Wed Sep 26, 2007 5:37 am

WordPad can save to several file formats, including plain text, Rich Text and Unicode Text. WordPad gives Rich Text documents a .RTF extension, but it gives a .TXT extension to both plain text and Unicode Text documents, so there's no way to tell them apart by looking at the file name. The two are stored very differently on disk, though.

I created a unicode text file in WordPad by typing
Line 1
Line 2
Line 3

then saved the file as c:\test_document.txt

When I look at the file in a hex editor I see that the first four bytes in the file are 255, 254, 108 and 000. The leading 255, 254 pair is the byte order mark of a UTF-16 little-endian file, and in that encoding every plain ASCII character is followed by a 000 byte. The ReadLn> function in Macro Scheduler will not report any part of the file after the first 000 byte. ReadFile> with Separate> will not report anything after the first 000 byte, and neither will ReadFile> with PutClipboard>. However, ReadFile> followed by WriteFile> will recreate the file in its entirety.
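
To make the byte layout above concrete, here is a minimal Python sketch (not Macro Scheduler code). It assumes the "Unicode Text" format is UTF-16 little-endian with a byte order mark, which matches the 255, 254 lead bytes, and it assumes a sample line starting with a lowercase "l" so that the third byte comes out as 108. The function names are hypothetical and only illustrate why a reader that stops at byte 000 sees almost nothing.

```python
import codecs


def encode_as_unicode_text(text: str) -> bytes:
    """Encode text as UTF-16 little-endian with a leading byte order mark."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def read_until_nul(data: bytes) -> bytes:
    """Mimic a reader that treats the first 000 byte as the end of the data."""
    return data.split(b"\x00", 1)[0]


# Sample content (assumed); CRLF line endings as on Windows.
raw = encode_as_unicode_text("line 1\r\nline 2\r\nline 3\r\n")

first_four = list(raw[:4])      # byte values a hex editor would show first
visible = read_until_nul(raw)   # what survives if reading stops at byte 000

print(first_four)
print(visible)
```

Only the byte order mark and the first character come before the first 000 byte, which would explain an empty or meaningless result from a line read.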

I suspect that your file was saved in the Unicode Text format and will need to be resaved as plain text to be usable, unless perhaps there is a VBScript trick that could be employed. Another thought would be to open the file in WordPad, press Ctrl+A and then Ctrl+C to copy all of the text to the clipboard, and then use GetClipboard> and Separate> to acquire each line of text.

Hope this was helpful,
Dick

mafiamoe
Junior Coder
Posts: 38
Joined: Thu Jul 06, 2006 8:28 am

Post by mafiamoe » Thu Sep 27, 2007 7:10 am

After reading your reply and doing some more research on my end, I can confirm that the file is in fact Unicode. I searched the VBScript resources and thought I had found a way to convert the entire file from Unicode to plain text, but it turned out not to work at all. I also tried the ReadFile>/WriteLn> technique and was able to copy the file, but the new .txt still contains characters that are not recognized, so ReadLn> still fails. I am not sure opening WordPad is an option in the long run. I can use that for now to test the other areas of the code, but I will keep searching for a way to convert the file without running WordPad, or a way to ReadLn> directly from the Unicode file.

Thanks for the help so far :)

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Thu Sep 27, 2007 8:10 pm

DOS to the rescue :lol:

RunProgram>cmd /c type c:\test\unicodejunk.txt>c:\test\nicecleantext.txt
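
For anyone wondering what the redirection above accomplishes, here is a rough Python sketch of the same kind of conversion: decode the UTF-16 file and write it back out with one byte per character. It is an illustration, not what cmd does internally. The function name, the sample text and the cp1252 target encoding are assumptions; the encoding that `type` actually emits depends on the console code page.

```python
import codecs
import os
import tempfile


def utf16_to_plain(src: str, dst: str, encoding: str = "cp1252") -> None:
    """Rewrite a UTF-16 text file (with BOM) as single-byte plain text."""
    # newline="" keeps the original CRLF line endings untouched.
    with open(src, "r", encoding="utf-16", newline="") as fin:
        text = fin.read()
    # Characters the target encoding cannot represent become "?".
    with open(dst, "w", encoding=encoding, errors="replace", newline="") as fout:
        fout.write(text)


# Small demonstration with a temporary file standing in for the real one.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "unicodejunk.txt")
    dst = os.path.join(tmp, "nicecleantext.txt")
    with open(src, "wb") as f:
        f.write(codecs.BOM_UTF16_LE + "Line 1\r\nLine 2\r\n".encode("utf-16-le"))
    utf16_to_plain(src, dst)
    with open(dst, "rb") as f:
        converted = f.read()

print(converted)
```

The converted file has no byte order mark and no 000 bytes, so a line-by-line reader can handle it.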

mafiamoe
Junior Coder
Posts: 38
Joined: Thu Jul 06, 2006 8:28 am

Post by mafiamoe » Thu Sep 27, 2007 9:54 pm

It works!!!

The DOS command could not read the file when it was in a folder with a space in its name (e.g. C:\Program Files\), but that was remedied with a simple CopyFile> to a new directory. The section of code I used is below, with a couple of variable names changed so it makes a little more sense :-P


CopyFile>C:\Program Files\program\unicode.txt,C:\test\newunicode.txt
Run Program>cmd /c type C:\test\newunicode.txt>C:\test\text.txt


What I actually use has some variables for the file locations, like:
Let>location=C:\Program Files\program
and then:
CopyFile>%location%\unicode.txt,C:\test\newunicode.txt


Thank you!!! :-D

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Thu Sep 27, 2007 10:03 pm

Modified command from Me_again to handle spaces in path and/or filename:

RunProgram>cmd /c type "c:\test folder\unicode junk.txt">"c:\test folder\nice clean text.txt"

Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Thu Sep 27, 2007 10:17 pm

Thank you Bob :D

I'll never get used to these newfangled filename conventions, bring back 8.3 :lol:
