Read CSV files with different delimiters

Technical support and scripting issues

Moderators: JRL, Dorian (MJT support)

Post Reply
Heinz57
Junior Coder
Posts: 28
Joined: Wed Jul 22, 2015 6:10 pm
Location: A Berliner just east of Wien

Read CSV files with different delimiters

Post by Heinz57 » Tue Aug 18, 2015 4:53 pm

Hello,

In order to avoid code redundancies, I am about to develop a generic subroutine, which reads any CSV file and answers the file's contents to its caller in an Array with as many rows and columns as there are in this particular file.

CSV files in my case never ever use commata as delimiters, because commata are regularly part of the content. Instead, we use the caret sign, which is never a legitimate part of any content.

After spending about half of today on trying to handle this, I gave up, because I found no simple and general solution.

How can I solve this?

1) Change the system variable that defines the comma as CSV delimiter!

Something similar to DECIMAL_SEPARATOR would be the easiest as one could continue to use the nice CSVFileToArray command but I found no information on whether or not or how this is possible.

2) Write an very complex subroutine using Separate

I then found viewtopic.php?f=2&t=7818&p=34638&hilit= ... ter#p34638 and tried to use this but the code is neither sufficient nor generic in that it does not handle arrays of varying complexity. I would have to travers the file just to detect the max. number of columns and this grows to an unacceptable complexity.

Apart from that, I am most reluctant to use 20 or so lines of code where one simple statement CSVFileToArray should actually do the job. The more code, the more chances for bugs.

3) Is there any other way to handle this very simple task with one or two statements?

Thank you for any hint. I try to build a library of re-usable subroutines to efficiently handle repetitive tasks.

Heinz57

hagchr
Automation Wizard
Posts: 331
Joined: Mon Jul 05, 2010 7:53 am
Location: Stockholm, Sweden

Re: Read CSV files with different delimiters

Post by hagchr » Sun Aug 23, 2015 8:32 pm

Hello,

I agree it would be useful with a simple way to import text files with different delimiters but have not really found a good way with a few lines.

The following SRT - DSVtoArr - can be saved as a separate script and called with Include>. It will take a file, delimiter, array name and create an array. I have not find a way to create a multidimensional <array> so it is a normal array. If the last arguments is set to 1 then all columns will have the same size (determined by largest one) even if empty. If set to 0 then number of columns for each row will be determined by the individual lines in the file.

Code: Select all

SRT>DSVtoArr
    //Notes
    CodeBlock
        //DSVtoArr,file name,delimiter,"array name",Use max column count for all 1/0
        //Example
            //Let>File=C:\Test.txt
            //Let>Delimiter=^
            //Gosub>DSVtoArr,File,Delimiter,"myArray",1
            //myArray_count will return number of rows
            //myArray_3_count will return number of elements in row 3
            //myArray_1_1 will return element from row 1 and column 1
            //The final argument = 1 will cause every Row to have the same number of columns, even if empty
    EndCodeBlock

    //Assign local variables: File, Delimiter, Array, IndCol from call
    CODEBLOCK
    Let>LOCALVARS=1
    Let>File=DSVtoArr_VAR_1
    Let>Delimiter=DSVtoArr_VAR_2
    Let>tmpArr=DSVtoArr_VAR_3
    RegEx>[^"]+,tmpArr,0,m0,nm0,0
    Let>Array=m0_1
    Let>IndCol=DSVtoArr_VAR_4
    ENDCODEBLOCK

    ReadFile>File,strFileContents
    Separate>strFileContents,%CRLF%,strLines

    //Check rows for maximum number of columns -> variable maxcol
    CODEBLOCK
    Let>Row=0
    Let>maxcol=0
    While>Row<strLines_count
        Add>Row,1
        Separate>strLines_%Row%,Delimiter,tmpCol
        If>tmpCol_count>maxcol
            Let>maxcol=tmpCol_count
        Endif
    EndWhile
    ENDCODEBLOCK

    //Populate Array_Row_Column with name given by %Array%
    CODEBLOCK
    Let>Row=0
    While>Row<strLines_count
        Add>Row,1
        Separate>strLines_%Row%,Delimiter,tmpCol
        Let>Column=0
        While>Column<tmpCol_count
            Add>Column,1
            Let>LOCALVARS=0
            Let>%Array%_%Row%_%Column%=tmpCol_%Column%
            Let>LOCALVARS=1
        EndWhile
        //If IndCol=1 add empty elements so that all columns have same #elements = maxcol
        If>IndCol=1
            If>Column<maxcol
                While>Column<maxcol
                    Add>Column,1
                    Let>LOCALVARS=0
                    Let>%Array%_%Row%_%Column%=
                    Let>LOCALVARS=1
                EndWhile
            Endif
        EndIf
        Let>LOCALVARS=0
        Let>%Array%_%Row%_count=Column
        Let>LOCALVARS=1
    EndWhile
    Let>LOCALVARS=0
    Let>%Array%_count=strLines_count
    ENDCODEBLOCK
END>DSVtoArr

This is then how it can be used:

Code: Select all

Include>%SCRIPT_DIR%\DSVtoArr.scp
Let>file=C:\...\TestFile.txt
Let>delimiter=^
Gosub>DSVtoArr,file,delimiter,"resArr",1

/*
resArr_count    gives number of rows
resArr_1_count  gives number of columns in row 1
resArr_2_1      gives element in row 2 and column 1
*/

Post Reply
Sign up to our newsletter for free automation tips, tricks & discounts