Count files in folder based on criteria and select in order?


RNIB
Macro Veteran
Posts: 198
Joined: Thu Jan 10, 2008 10:25 am
Location: London, UK

Post by RNIB » Fri Jun 12, 2009 4:02 pm

Okay, this one is complicated. Well, I think it is anyway.

I'm importing a whole bunch of WAV files into a program by automating its file browsing function, and so far I have been selecting just individual files. I now need to be able to import lots of files all at once.

This is the problem I now have:

The folder containing all the WAV files contains a few WAV files that I have already imported, which are not prefixed by a number.

The folder also contains potentially hundreds of WAV files which are prefixed with a number to indicate the order in which they should be imported.

The program that I'm automating only lets you import up to 50 files at a time.

If I select files 1 to 10 then when they are imported they are imported in this order: 10,2,3,4,5,6,7,8,9,1. If I select files 10 to 1 then they are imported in the correct order i.e. 1,2,3,4,5,6,7,8,9,10

So what I need is for the macro to count the number of files in the folder and subtract the number of files that aren't prefixed with a number, to give the total number of files it needs to import.

If the number of files is over 50, it should find the 50th file, select backwards to the first file and import those.

Then it should go back into this browse window, select the 100th file and select back to the 51st and so on until it's imported all the files in the correct order.

I haven't the faintest idea how to do this or even if it's possible. Any ideas?
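For illustration only, the batching described above can be sketched as plain arithmetic. This is Python rather than Macro Scheduler script, the function name `reverse_batches` is made up, and it assumes the numbered files occupy rows 1 to `total` of the list.

```python
def reverse_batches(total, batch_size=50):
    """Return (click_first, shift_click_last) row pairs, 1-based.

    Each pair is one import batch: click the higher row first,
    then Shift-click the lower row, so the batch is selected backwards.
    """
    batches = []
    start = 1
    while start <= total:
        end = min(start + batch_size - 1, total)  # last batch may be short
        batches.append((end, start))
        start = end + 1
    return batches


for click_first, shift_click in reverse_batches(120):
    print(f"select row {click_first} back to row {shift_click}")
```

With 120 numbered files and a limit of 50, this gives three batches: 50 back to 1, 100 back to 51, and 120 back to 101.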

BTW: I'm off now for the weekend - have a good one!

gdyvig
Automation Wizard
Posts: 447
Joined: Fri Jun 27, 2008 7:57 pm
Location: Seattle, WA

Should be possible

Post by gdyvig » Fri Jun 12, 2009 9:06 pm

Hi RNIB,

You may not need to count the WAV files in advance; you can count them as you go.

I'm not sure exactly how your browse window works, but I will assume it works like the typical ones that allow multiple items to be selected.

You need a way of detecting when you have reached the end of the list. Most apps use a scroll bar. Capture an image of the slider touching the bottom scroll button. That indicates you have reached the end of the list.

Probably the most accurate way of selecting the files is to use the unshifted down-arrow key so you can count up to fifty (or until the slider bottoms out, whichever happens first). I recommend you don't page down to select multiple items. It may be faster, but some of the WAV files on the last page may be repeated from the second-to-last page.

Find the first row of a block of 50 files. Let's say we want rows 51-100.
Open browse window, click on first row.
Use a Repeat/Until loop to Press the down arrow 99 times.
(or until you bottom out, keep track of the press count)
You are now on row 100 (or the last row)
Press Shift
Press Up * n
(n = 49 for a full block, or your press count minus 50 if you bottomed out early)
Release Shift
Rows 51-100 should be highlighted.
(or rows 51 - end of list)
Import the selected rows.
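The key counts in those steps can be checked with a small calculation. The sketch below is Python, not Macro Scheduler script. It assumes the cursor starts on row 1, that rows are 1-based, and that the total row count is already known (the post counts it on the fly instead). `key_plan` is a hypothetical helper name.

```python
def key_plan(first_row, last_row, total_rows):
    """Return (down_presses, shift_up_presses) to select first_row..last_row backwards."""
    last_row = min(last_row, total_rows)      # the list may end before the block does
    if first_row > last_row:
        return (0, 0)                         # nothing left to select
    down_presses = last_row - 1               # move from row 1 to last_row
    shift_up_presses = last_row - first_row   # extend the selection up to first_row
    return (down_presses, shift_up_presses)


print(key_plan(51, 100, 300))  # full block
print(key_plan(51, 100, 73))   # list ends at row 73
```

For rows 51-100 in a 300-row list this gives 99 Down presses and 49 Shift+Up presses; if the list ends at row 73 it gives 72 and 22.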


Gale

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Fri Jun 12, 2009 10:42 pm

How about no counting.....

1. Use RegEx to capture the file names that start with numbers.
2. Read each file from the RegEx result.
3. Do the following to create a replacement prefix:
a. Get the leading number of the file.
b. Subtract the number of files returned by the RegEx.
c. Subtract 1.
d. Get the absolute value of the last step (c) (or multiply by -1).
e. Use that number to replace the existing beginning number.

Examples:
File names, in random order = 1name.wav, 3name.wav, 5name.wav, 6name.wav, 2name.wav, 4name.wav.
Count of names =6

Steps a,b,c,d,e:
a: 1name.wav = 1
b: 1 - 6 = -5
c: -5 - 1 = -6
d: @Abs(-6) = 6
e: 1name.wav becomes 6name.wav

a: 3name.wav = 3
b: 3 - 6 = -3
c: -3 - 1 = -4
d: @Abs(-4) = 4
e: 3name.wav becomes 4name.wav

a: 5name.wav = 5
b: 5 - 6 = -1
c: -1 - 1 = -2
d: @Abs(-2) = 2
e: 5name.wav becomes 2name.wav

Repeat for the other files. No need to count separately, no need to work in blocks of 50. Just read the beginning number, subtract the total count, subtract 1, and use the absolute value to replace the beginning number.

This will put all the numbers in the reverse order.
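As a quick check of the arithmetic in steps a to e, here is a minimal Python sketch (not Macro Scheduler script). The function name `reversed_name` is invented for illustration, and it assumes the prefix is simply the run of digits at the start of the name.

```python
import re


def reversed_name(filename, count):
    """Replace a leading number n with abs(n - count - 1), i.e. count + 1 - n."""
    match = re.match(r"(\d+)(.*)", filename)
    if match is None:
        return filename                      # no numeric prefix: leave untouched
    number, rest = int(match.group(1)), match.group(2)
    new_number = abs(number - count - 1)     # steps b, c and d
    return f"{new_number}{rest}"             # step e


for name in ["1name.wav", "3name.wav", "5name.wav", "name1.wav"]:
    print(name, "->", reversed_name(name, 6))
```

With a count of 6, prefixes 1, 3 and 5 become 6, 4 and 2, matching the worked examples, and a name without a leading number is returned unchanged.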
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

gdyvig
Automation Wizard
Posts: 447
Joined: Fri Jun 27, 2008 7:57 pm
Location: Seattle, WA

A few questions

Post by gdyvig » Sat Jun 13, 2009 4:34 am

Hi RNIB,


The folder you are reading the files from - is that a Windows Explorer file folder or a folder inside the import tool?

I thought the sequence the files were sorted in was odd, so I created some files like the ones you described: 1name, 2name, 3name, ... 10name. In Windows Explorer they no longer sort alphanumerically (alphabetically) as they used to. Instead, Explorer treats the numeric prefix as a real number and sorts the files numerically. When you copy and paste into another folder, it makes a difference whether you select from the top down or from the bottom up. Either way they pasted in an unknown sequence, but when I clicked the Name column in Details view, the top-down paste was in numeric order and the bottom-up paste was in reverse numeric order.

The image you posted in http://www.mjtnet.com/usergroup/viewtop ... 4389#24389 indicates you can switch to Details view which lets you reverse the sort order by clicking on the Name column. Not sure if you can do that in the screen you posted in http://www.mjtnet.com/usergroup/viewtop ... sc&start=0.

Windows Explorer "new" file sorting rules: http://support.microsoft.com/kb/319827/
To override with old rules: http://www.pctools.com/forum/showthread.php?t=26104

When you list the same files from a CMD prompt, they are sorted alphanumerically. I don't know if there is a simple bat file or VBScript query that will list the files in the same sequence Windows Explorer does.
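The difference between the two orderings can be shown with a short Python sketch. The `natural_key` function is only an approximation of numeric-aware sorting written for this example; it is not Explorer's actual algorithm.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that 10 sorts after 9."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


names = ["10name.wav", "2name.wav", "1name.wav", "9name.wav"]
print(sorted(names))                   # character-by-character order
print(sorted(names, key=natural_key))  # numeric prefix compared as a number
```

The plain sort puts 10name.wav first because it compares character by character, while the numeric-aware key puts it last. Zero-padded prefixes (001, 002, ... 010) come out the same either way.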

Are the numeric prefixes stripped after the WAV files are imported, or do they keep the same name?

If you get another batch of files to import, will the numeric prefix start where you left off, or with file 1name.wav?

Must you use the import tool to maintain the tool's database?

Are you allowed to rename the files prior to importing?

Your previous posts and the EasePublisher show the prefix to consist of 3 or 4 digits with leading zeroes. I don't see leading zeroes in your example in this post. If you follow Bob's suggestion you will need to add leading zeroes if using a tool that sorts alphanumerically, but not if you are using Windows Explorer.




Gale
Last edited by gdyvig on Sun Jun 14, 2009 1:18 pm, edited 3 times in total.

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Sun Jun 14, 2009 12:39 am

gdyvig wrote: If you follow Bob's suggestion you will need to add leading zeroes if using a tool that sorts alphanumerically, but not if you are using Windows Explorer.
This is not actually correct. The final replacement routine could insert any leading zeroes that might be needed, based on the length of the final prefix being inserted. E.g. if the file count is greater than 99 and less than 1000, insert 1 or 2 leading zeroes when the prefix is shorter than three digits.

The RegEx can find the filenames that start with any digit, one or more.
The RegEx result will provide the number of files in the result, needed for the calculation.
The replacement naming routine can also work from the initial numbers, replacing them with an expression that packs in the leading zeros.
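One way to express that padding rule is to make every prefix as wide as the file count itself. The Python sketch below is an illustration of that idea, not Macro Scheduler script; `pad_prefix` is a hypothetical name.

```python
def pad_prefix(number, file_count):
    """Left-pad number with zeros to the width of file_count."""
    width = len(str(file_count))   # 6 -> 1 digit, 21 -> 2, 158 -> 3
    return str(number).zfill(width)


for count in (6, 21, 158):
    print(count, pad_prefix(4, count))
```

With 158 files, prefix 4 becomes 004 and prefix 42 becomes 042; with 21 files, prefix 4 becomes 04; with 6 files no padding is added.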
Last edited by Bob Hansen on Sun Jun 14, 2009 6:52 am, edited 1 time in total.

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Sun Jun 14, 2009 6:49 am

Here is a sample of my suggestion above, using RegEx to get only the files with numbers at the front and to do the counting of files as well. The file names are then changed to reverse their position from the beginning.

Code:

// GetFileList>c:\windows\*.wav,vFileList,;
// The preceding line is the actual line to be used. Commented out for testing.
// The next line is a test line to simulate a directory listing. It includes names without leading numbers.
Let>vFileList=3name.wav;2name.wav;name1.wav;1name.wav;name2.wav;4name.wav;5name.wav;name3.wav;6name.wav
Let>vNeedle=[0-9]+[a-z]+.*?\.wav
Let>vHaystack=%vFileList%
RegEx>%vNeedle%,%vHaystack%,0,vFiles,vFileCount,0,,
Let>vCount=0

Label>Loop
Let>vCount=%vCount%+1

// Calculate file core name
Let>vNeedle=^([0-9]+).*
Let>vHaystack=vFiles_%vCount%
RegEx>%vNeedle%,%vHaystack%,0,vMatchString,vMatchCount,1,$1,vOriginal
Length>%vOriginal%,vLength
Let>vStart=%vLength%+1
MidStr>vFiles_%vCount%,%vStart%,20,vNewName

// Calculate new file prefix number
Let>vPrefix=%vOriginal%
Let>vPrefix=%vPrefix%-%vFileCount%
Let>vPrefix=%vPrefix%-1
Let>vPrefix={abs(%vPrefix%)}
Length>%vPrefix%,vLength

// Load zeros in front of prefix based on length of vFileCount and vLength
If>{(%vFileCount%>9) and (%vFileCount%<100) and (%vLength%=1)}
    Let>vPrefix=0%vPrefix%
EndIF
If>{(%vFileCount%>99) and (%vFileCount%<1000) and (%vLength%=1)}
    Let>vPrefix=00%vPrefix%
EndIF
If>{(%vFileCount%>99) and (%vFileCount%<1000) and (%vLength%=2)}
    Let>vPrefix=0%vPrefix%
EndIF

// Calculate complete new file name
Let>vNewName=%vPrefix%%vNewName%

// Rename old file to new file
// Renaming file may overwrite existing file, need to protect against that.
RenameFile>vFiles_%vCount%,%vNewName%
MessageModal>%vHaystack% was renamed to %vNewName%
If>%vCount%=%vFileCount%,End,Loop

Label>End

Testing can be done by setting vFileCount to some number other than 6, such as 21 or 158, to test the leading zeros. Also change the names of some of the files in vFileList to check that each new prefix is calculated from the number in the file name rather than from the loop count. This allows files to be read in any sequence: the numbers are calculated from their names, not from their positions.

gdyvig
Automation Wizard
Posts: 447
Joined: Fri Jun 27, 2008 7:57 pm
Location: Seattle, WA

Pre-processing the import folder

Post by gdyvig » Sun Jun 14, 2009 10:51 pm

Hi Bob,

It may not be necessary, or desirable, to reverse the numbering of the files, since the numbers indicate the ordering of the pages in the book. However, ensuring they are padded with zeroes will make the file folder compatible with tools that use standard alphanumeric sorting, which would include the EasePublisher database.

I believe RNIB is seeing an unsorted list when he captures it top to bottom because of this statement:
RNIB wrote: If I select files 1 to 10 then when they are imported they are imported in this order: 10,2,3,4,5,6,7,8,9,1. If I select files 10 to 1 then they are imported in the correct order i.e. 1,2,3,4,5,6,7,8,9,10
You get similar results if you create a folder and use Notepad to create numbered files in numerical order. Initially they appear to be mis-sorted. Try it.

Capturing the list bottom to top appears to force a sort but capturing top to bottom preserves the original sort. Sorting can also be done by refreshing the display. If that does not work a sort key can be selected by using Details view option or by selecting Arrange Icons By and selecting Name. This is available in Windows Explorer and most Windows style browse windows.

We will need to wait for RNIB to find out exactly what options are available.

Gale

RNIB
Macro Veteran
Posts: 198
Joined: Thu Jan 10, 2008 10:25 am
Location: London, UK

Post by RNIB » Mon Jun 15, 2009 9:47 am

Wow, thanks everyone for your replies. A lot of what has been said is waaaay over my head so it may take me a while to go through it all but as a few questions have been asked I thought I'd do my best to explain the problem in greater detail.

The program I am trying to automate is called Dolphin Publisher, formerly EasePublisher, which is used to create audio books in an accessible format called DAISY.

Part of the process of creating a project in EasePublisher is to import WAV files into it, which also partly creates the navigational headings at the same time. This import process is entirely internal to EasePublisher: you can't just drag and drop files from, say, Windows Explorer into EasePublisher but instead have to use its Import Routine. This is a two-stage process.

Stage 1:
This is the window you first get when you create your EasePublisher project
[Image: screenshot of the "Create new project wizard" window]
The white box in the middle is where files that have already been selected for importing will arrive. Once you have selected the files for importing you can then select one of the files that appear in this box and then, using the up and down arrows, move it higher or lower in the list.

Therefore it is possible to import all the files needed for the project in any order and then sort them in this window. However as this requires a lot of mouse clicks or key presses it is preferable to select them in the correct order in the first place.

In order to select files to import you have to click on the Browse button (the folder icon - there isn't a keyboard shortcut) which then opens up a new window.

Stage 2:
This is the window that opens
[Image: screenshot of the "Select audio file(s) for your project" browse window]
In this window we then browse to the folder containing the WAV files we need, which are shown in a list as seen in the above screen grab. This folder will contain two 'types' of WAV files. One type is not prefixed with a number; I have already got the macro to import these by searching for specific file names, so by this stage there will already be some files listed in the previous window. The other 'type' of WAV file is prefixed by a number, and these will always be presented as 001-009, 010-099, 100-999 etc. We name them like this so that they appear in the correct order in any Browse Window, Windows Explorer, My Computer etc. So these files DO appear in the correct order in this Browse Window.

This is where it then gets a little complicated. EasePublisher will only let you import 100 (not 50 as previously stated) files at a time. That is to say, whilst you can select any number of files in this window and click on Open, if you've selected more than 100 files they won't appear in the previous window. This means you have to select up to 100 files in this window and click on Open; they then appear in the previous window, where you click the Browse button again to re-open this window (which closed after you clicked Open) and select the next 100 files.

Just to make things even harder, if you select files 001-100 in this window then, when they are listed in the previous window, file 100 will appear first, followed by files 002, 003 etc., with file 001 appearing at the end of the list after file 099. To stop this happening, you have to select file 100 first in this window and then Shift-click file 001. Then they are listed in the correct order in the previous window.

Unfortunately, if I sort this window by file name in reverse order so that, for example, file 100 appears at the top of the list and file 001 at the bottom, then select file 100 and Shift-click file 001, they are still imported in the wrong order: file 001 appears at the top, followed by files 099, 098 down to 002, and finally file 100.
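The three observations above are consistent with one simple rule: the file clicked last comes first, the file clicked first comes last, and everything in between keeps its on-screen order. That rule is inferred from this post alone, not from any EasePublisher documentation. The Python sketch below (the name `import_order` is made up) only checks that the rule reproduces all three cases on a five-file list.

```python
def import_order(display_order, clicked_first, shift_clicked):
    """Assumed model: last-clicked file, then the middle as displayed, then first-clicked file."""
    i = display_order.index(clicked_first)
    j = display_order.index(shift_clicked)
    low, high = min(i, j), max(i, j)
    selected = display_order[low:high + 1]
    if clicked_first == shift_clicked:
        return selected                      # single file: nothing to reorder
    middle = [f for f in selected if f not in (clicked_first, shift_clicked)]
    return [shift_clicked] + middle + [clicked_first]


files = ["001", "002", "003", "004", "005"]
print(import_order(files, "001", "005"))        # top-down selection
print(import_order(files, "005", "001"))        # bottom-up selection
print(import_order(files[::-1], "005", "001"))  # reversed list, top-down
```

Under this model, only clicking the last file and then Shift-clicking the first file in an ascending list gives the files back in ascending order.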

With me so far?

So in simple terms, all I want to do is import only the files prefixed by a number (as I don't want to re-import the non-numbered files), in the order that they should be in, 100 files at a time, in a way that makes them appear in the correct order in the first window, i.e. by selecting the last file first.

I guess that there isn't a need to count the files first if there is a way of getting it not to select any files that are not prefixed by a number. However, because the second window closes after you select the first batch of files, it must be able to 'remember' where it got up to. Equally, there might not always be more than 100 files to import, although this would be the exception rather than the rule, with most projects having 200+.

I hope that makes a bit more sense.

I will now go through everyone's suggestions in more detail, but in the meantime, if anyone knows of a better way of doing this from what I've just written, I'd love to hear it.

gdyvig
Automation Wizard
Posts: 447
Joined: Fri Jun 27, 2008 7:57 pm
Location: Seattle, WA

Dummy files

Post by gdyvig » Mon Jun 15, 2009 2:53 pm

Hi RNIB,

You create the files prefixed with leading zeroes from the get go, so there is no need to add them. Simply reversing the order of the prefixes will not help either.

The "Create new project wizard" window has Nbr and Name column headings. Clicking on these changes the sort order but does not fix the problem?

The "Select audio file(s) for your project" window has no column headings. But it does have the View icon, indicating you can display column headings and sort on them. Clicking on these changes the sort order but does not fix the problem either?

Have you tried prefixing your files 000-099 instead of 001-100? Probably won't work, but give it a try.

I think the simplest solution is to place dummy files in the first and last positions and delete them after the import. Or they could contain advertisements.

Your list would look like:
001_dummy1.wav
002_title_page.wav
003_dedication.wav
.
.
099_page_094.wav
100_dummy2.wav


To make sure I understand, after the import the software automatically drops the prefixes, this is how the filenames change:

002_title.wav becomes title.wav
099_page_094.wav becomes page_094.wav


Gale

RNIB
Macro Veteran
Posts: 198
Joined: Thu Jan 10, 2008 10:25 am
Location: London, UK

Re: Dummy files

Post by RNIB » Mon Jun 15, 2009 3:52 pm

gdyvig wrote:Hi RNIB,

You create the files prefixed with leading zeroes from the get go, so there is no need to add them. Simply reversing the order of the prefixes will not help either.
Correct, we produce all the files, some of which are prefixed with leading zeros that we have added and some of which are not. For reasons that are too complicated to explain, it is not possible to add a numerical prefix to all files.
gdyvig wrote: The "Create new project wizard" window has Nbr and Name column headings. Clicking on these changes the sort order but does not fix the problem?
Unfortunately these don't do anything. I've no idea why they even exist, as clicking on them doesn't do anything. What you can do, though, is select an individual file and change its position in the list by using the up and down arrows.
gdyvig wrote: The "Select audio file(s) for your project" window has no column headings. But it does have the View icon, indicating you can display column headings and sort on them. Clicking on these changes the sort order but does not fix the problem either?
No, the problem only seems to occur after you have made the selection and the list of files appears in the "Create New Project Wizard" window.
gdyvig wrote: Have you tried prefixing your files 000-099 instead of 001-100? Probably won't work, but give it a try.
Unfortunately that doesn't make a difference.
gdyvig wrote: I think the simplest solution is to place dummy files in the first and last positions and delete them after the import. Or they could contain advertisements.

Your list would look like:
001_dummy1.wav
002_title_page.wav
003_dedication.wav
.
.
099_page_094.wav
100_dummy2.wav
I understand what you are saying but I'd prefer to avoid adding other files that are then deleted as this can cause potential issues with the DAISY encoding process.
gdyvig wrote: To make sure I understand, after the import the software automatically drops the prefixes, this is how the filenames change:

002_title.wav becomes title.wav
099_page_094.wav becomes page_094.wav


Gale
That is a separate stage to this particular process and will be handled by a different part of the macro later. At present the names of the WAV files will generate headings which, after we have done other work to, we then rename to become things like Title Page instead of 002_title.wav etc. At this stage though we want to preserve the file names in the WAV files.

Hope that makes sense

gdyvig
Automation Wizard
Posts: 447
Joined: Fri Jun 27, 2008 7:57 pm
Location: Seattle, WA

Swapping first and last?

Post by gdyvig » Mon Jun 15, 2009 5:56 pm

Would swapping the first and last files in your prep folder work?

Change 001_title.wav to 100_001_title.wav
Change 100_page_xxx.wav to 001_100_page_xxx.wav

Would that mess up the encoding?

RNIB
Macro Veteran
Posts: 198
Joined: Thu Jan 10, 2008 10:25 am
Location: London, UK

Post by RNIB » Fri Jun 19, 2009 9:47 am

Sorry been away for a few days and only now catching up.

With regard to renaming files: whilst that wouldn't mess up the encoding, it would cause problems if we had to go back to the original audio to re-edit it, as we sometimes do, because the program we use for that would still be looking for the unrenamed file.

I'm starting to think that the easiest option will be simply to get the script to count the number of files in the selected folder that start with a number, then import one file at a time, looping once for each file that matches the criteria.

Whilst this is probably the easiest solution, it is also the slowest, as it will mean importing 300+ files one by one rather than just 3 lots of 100.

The other problem I have with this is that I can get it to count the number of files in the folder, but I don't know how to get it to count only files that start with a number.

I've tried:

CountFiles>F:\Bethlehem Murders_The\[0-9]_*.wav,nCount,0
and
CountFiles>F:\Bethlehem Murders_The\\d_*.wav,nCount,0

But neither works, so I assume that I can't use Regular Expressions?
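If the file mask only accepts wildcards and not regular expressions (an assumption based on the two failed attempts above), the usual workaround is to list every .wav file and filter the names afterwards. A minimal Python sketch of that filter, not Macro Scheduler script, with made-up helper names and sample names in the style used elsewhere in this thread (`intro.wav` and `notes.txt` are hypothetical):

```python
import os


def starts_with_digit(name, extension=".wav"):
    """True if the file name begins with a digit and has the given extension."""
    return name[:1].isdigit() and name.lower().endswith(extension)


def count_numbered(names, extension=".wav"):
    """Count the names that pass starts_with_digit."""
    return sum(1 for name in names if starts_with_digit(name, extension))


def count_numbered_in_folder(folder, extension=".wav"):
    """Same count, taken from a real folder's directory listing."""
    return count_numbered(os.listdir(folder), extension)


sample = ["002_title_page.wav", "003_dedication.wav", "intro.wav",
          "099_page_094.WAV", "notes.txt"]
print(count_numbered(sample))
```

The same test on the first character can also drive the import loop, so the count and the list of files to import come from one pass.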

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Fri Jun 19, 2009 2:00 pm

See my posting about using RegEx. The first example returns a variable with the count of files that start with a number. Here is an edited excerpt of the relevant code:

Code:

GetFileList>c:\windows\*.wav,vFileList,;
Let>vNeedle=[0-9]+[a-z]+.*?\.wav
Let>vHaystack=%vFileList%
RegEx>%vNeedle%,%vHaystack%,0,vFiles,vFileCount,0,,
MessageModal> There are %vFileCount% filenames that start with a number

JRL
Automation Wizard
Posts: 3532
Joined: Mon Jan 10, 2005 6:22 pm
Location: Iowa

Post by JRL » Fri Jun 19, 2009 2:09 pm

Someday I'm going to have to look into Regular Expressions. It looks like fun.

Here's another way to get the count. This also provides a list of file names in the specified directory that have names that start with a number and a second list of names that do not start with a number.

Code:

VBSTART
VBEND
Let>INPUT_BROWSE=2
Let>Extension=wav
Input>dir,Enter or Browse to the directory containing the files to count (No trailing backslash)

GetFileList>%dir%\*.%Extension%,files,;

Separate>files,;,file
If>File_Count=0
  MDL>No %Extension% files found
  Exit>0
EndIf
Let>kk=0
Let>NumberedFileCount=0
Let>NotNumberedFileCount=0
Repeat>kk
  Add>kk,1
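  Let>file=file_%kk%
  // Tests the first character of the list entry. Assumption: if GetFileList
  // returns full paths, strip the folder part first, otherwise this tests the drive letter.
  MidStr>file,1,1,digit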
  VBEval>IsNumeric("%digit%"),result
  If>result=True
    WriteLn>%temp_dir%numberedfilelist.txt,wres,file
    add>NumberedFileCount,1
  Else
    WriteLn>%temp_dir%notnumberedfilelist.txt,wres,file
    Add>NotNumberedFileCount,1
  EndIF
Until>kk,%file_count%


MDL>numbered=%NumberedFileCount%%CRLF%NotNumbered=%NotNumberedFileCount%

RNIB
Macro Veteran
Posts: 198
Joined: Thu Jan 10, 2008 10:25 am
Location: London, UK

Post by RNIB » Fri Jun 19, 2009 2:27 pm

Hi Bob,

I have tried that but all I get is a message box saying:

There are %vFileCount% filenames that start with a number

i.e. the variable (think that's what you call it anyway - Newbie Alert!) isn't being populated
