RegEx-Removal of Leading & Trailing Spaces

armsys
Automation Wizard
Posts: 1108
Joined: Wed Dec 04, 2002 10:28 am
Location: Hong Kong

Post by armsys » Mon Apr 01, 2013 11:16 pm

How can I remove all leading and trailing spaces from a multi-line variable, say 'Text', with the RegEx> command?
/*
FullText:
.......................Line 1
....................Line 2
..................Line3
*/
LabelToVar>FullText,Text
MDL>Text
The dots represent spaces.

Post by armsys » Tue Apr 02, 2013 11:16 am

Hope some RegEx gurus can help.
Thanks.

olllllliii
Pro Scripter
Posts: 60
Joined: Tue Dec 22, 2009 9:51 am
Location: Mannheim ( Germany )

Post by olllllliii » Wed Apr 03, 2013 8:37 am

I think this is what you are looking for.

Code: Select all

/*
FullText:
.......................Line 1
....................Line 2
..................Line 3
*/
LabelToVar>FullText,strText
// Before
MDL>strText

// This removes all dots. To remove spaces instead, change the pattern below to {" "}.
// Note that this removes every occurrence, not only leading and trailing ones.

Let>pattern={"."}
Regex>pattern,strText,1,ArrMatches,NumMatches,1,,strTextoutput
// After replace
MDL>strTextoutput
Kind regards
Oliver Hilger
Oliver Hilger Mannheim
alias Olllllliii

Post by armsys » Wed Apr 03, 2013 8:55 am

Oliver,
Thanks for your help.
I notice you're using EasyPatterns.
I use the following RegEx to accomplish the same task:
RegEx>[oneormore space],ClipText,1,Matches,num,1,,ClipText

Alas, after extensive testing, I have found that RegEx is troublesome when it comes to Unicode support, so I now use the time-honored StringReplace instead.
StringReplace doesn't corrupt Unicode characters.
BTW, are you from Italy?
Thanks again.
Best regards,
Armstrong

Post by olllllliii » Wed Apr 03, 2013 9:27 am

Hi,

Yes! There is a second way, with StringReplace.
I think this is faster, too.

Code: Select all

/*
FullText:
.......................Line 1
....................Line 2
..................Line 3
*/
LabelToVar>FullText,strText
// Before
MDL>strText

Let>pattern={"."}
Stringreplace>strText,%pattern%,,strText
// After
MDL>strText
I will check the speed of both with a big file...
testing now... results will come soon...

No, I am not from Italy... I am from Germany (west)... Mannheim, that's
near Heidelberg.
Oliver Hilger Mannheim
alias Olllllliii

Post by armsys » Wed Apr 03, 2013 10:07 am

>.i am from Germany ( west )...Mannheim thats
Sorry for my poor observation. You must be a banker/financer.
Let's get back to Macro Scheduler.
So far, StringReplace is my favorite tool for replacing characters because it never corrupts Unicode characters in any language. But I pay a price for that safety: a single RegEx could have accomplished what the following verbose code does:

SRT>DelExcessChars
/* Delete excess characters: tabs, blank lines, leading spaces */
Let>Pattern1=%CRLF%%CRLF%%CRLF%
Let>Pattern2=%Space%%Space%
Let>Pattern3=%CRLF%%Space%

/* Delete tabs */
Label>DelTab
If>{Pos(%Tab%,%Cliptext%)>0}
Stringreplace>cliptext,%Tab%,,Cliptext
Goto>DelTab
Endif

/* Collapse runs of three line breaks into one */
Label>DelCRLF
If>{pos(%Pattern1%,%ClipText%)>0}
StringReplace>ClipText,%Pattern1%,%CRLF%,ClipText
Goto>DelCRLF
Endif

/* Delete extra spaces */
Label>DelSpace
If>{pos(%Pattern2%,%ClipText%)>0}
StringReplace>ClipText,%Pattern2%,%Space%,ClipText
Goto>DelSpace
Endif

/* Delete leading spaces */
Label>DelLeadSpace
If>{pos(%Pattern3%,%ClipText%)>0}
StringReplace>ClipText,%Pattern3%,%CRLF%,ClipText
Goto>DelLeadSpace
Endif

End>DelExcessChars
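
For comparison, here is a rough sketch of the same cleanup done with regular expressions. It is written in Python rather than Macro Scheduler so that it can be run and checked on its own; the function name del_excess_chars is made up for this illustration. It is not guaranteed to match the StringReplace loops above byte for byte (for example, a run of four line breaks ends up as one here, where the loop above would leave two).

```python
import re


def del_excess_chars(text: str) -> str:
    """Rough regex counterpart of the DelExcessChars subroutine (illustrative only)."""
    # Delete tabs.
    text = text.replace("\t", "")
    # Collapse any run of three or more CRLF line breaks into a single one.
    text = re.sub(r"(?:\r\n){3,}", "\r\n", text)
    # Collapse any run of two or more spaces into a single space.
    text = re.sub(r" {2,}", " ", text)
    # Delete spaces that directly follow a line break (leading spaces).
    text = re.sub(r"\r\n +", "\r\n", text)
    return text


print(repr(del_excess_chars("a\t  b\r\n\r\n\r\n   c")))  # 'a b\r\nc'
```

Python strings are Unicode, so non-ASCII text passes through these substitutions unchanged. Whether the same holds for the RegEx> command depends on the Macro Scheduler version in use.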

hoangvo81
Pro Scripter
Posts: 69
Joined: Tue Feb 07, 2012 8:02 pm

Post by hoangvo81 » Thu Apr 04, 2013 5:33 pm

Not sure if this will help. I copied the FullText you have into a file.

readfile>c:\test.txt,Text
let>pattern=(\.{1,}?)
regex>pattern,Text,0,m,n,1,,Text
mdl>Text


The result in MDL shows:
Line1
Line2
Line3

Post by armsys » Thu Apr 04, 2013 10:56 pm


jpuziano
Automation Wizard
Posts: 1085
Joined: Sat Oct 30, 2004 12:00 am

Post by jpuziano » Fri Apr 05, 2013 2:15 am

Hi armsys,

Here are a few methods, all using RegEx. This first one uses two separate RegEx commands, one to trim leading spaces on the lines within the variable... and a second RegEx command to trim trailing spaces.

Code: Select all

/*
FullText:
     Line 1     
     Line 2     
     Line 3     
*/
LabelToVar>FullText,strText
//lines have leading and trailing spaces
MDL>strText

//Strip leading spaces from all lines
//(?m) turns on multi-line mode
Let>pattern=(?m)^ +
RegEx>pattern,strText,0,matches,num,1,,strText

//lines no longer have leading spaces
MDL>strText

//Strip trailing spaces from all lines
//(?m) turns on multi-line mode
Let>pattern=(?m) +$
RegEx>pattern,strText,0,matches,num,1,,strText

//lines no longer have leading or trailing spaces
MDL>strText
Next, here's a different approach in which we use only one RegEx command and manage to trim both leading and trailing spaces on text lines within a variable... by taking advantage of grouping, i.e. backreferences. We actually match each complete line in 3 separate parts: the leading spaces, the stuff in the middle and the trailing spaces... then we replace the whole line with just the stuff in the middle ($2), effectively getting rid of the leading and trailing spaces. Note that because both outer groups use +, this pattern only matches lines that have at least one leading and one trailing space.

Code: Select all

/*
FullText2:
     Line 1     
     Line 2     
     Line 3     
*/
LabelToVar>FullText2,strText2
//lines have leading and trailing spaces
MDL>strText2

//Strip leading and trailing spaces from all lines
//(?m) turns on multi-line mode
Let>pattern=(?m)(^ +)(.*?)( +$)
RegEx>pattern,strText2,0,matches,num,1,$2,strText2

//lines no longer have leading or trailing spaces
MDL>strText2
In either of the above examples, we are just trimming spaces, but you can easily replace the space char in the patterns above with the following character class:

[ \t]

There is a single space before the \ backslash above. This allows us to match both spaces and tabs.
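
As a quick way to see what the character class changes, here is a small sketch using Python's re module (not Macro Scheduler) with the grouped pattern and [ \t] swapped in for the space. The names STRICT and LENIENT are made up for this illustration, and the LENIENT variant with * is an assumption of mine, not something from the patterns above. It shows that the grouped pattern with + only touches lines that have whitespace on both ends.

```python
import re

# Grouped pattern with the space replaced by the [ \t] class.
# Both outer groups use +, so a line needs whitespace on BOTH ends to match.
STRICT = re.compile(r"^([ \t]+)(.*?)([ \t]+)$", re.MULTILINE)

# Hypothetical variant: * makes the whitespace on either end optional.
LENIENT = re.compile(r"^[ \t]*(.*?)[ \t]*$", re.MULTILINE)

# Line 1 has whitespace on both ends, line 2 only leading, line 3 only trailing.
text = " \tLine 1\t \n  Line 2\nLine 3  "

strict_result = STRICT.sub(r"\2", text)
lenient_result = LENIENT.sub(r"\1", text)

print(repr(strict_result))   # 'Line 1\n  Line 2\nLine 3  '
print(repr(lenient_result))  # 'Line 1\nLine 2\nLine 3'
```

The sample text uses plain "\n" line breaks. In Python, $ in multi-line mode matches just before "\n", so text with "\r\n" line breaks would need the "\r" handled separately.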


The simplest approach of all though might be the one below. This combines two patterns, one to match spaces or tabs at the front of the lines... and another to match spaces or tabs at the end of the lines... using the alternation operator | which is the vertical bar or pipe symbol. And since we can match them, we can also replace them with nothing... to remove them... here we go.

Code: Select all

/*
FullText3:
     Line 1     
     Line 2     
     Line 3     
*/
LabelToVar>FullText3,strText3
//lines have leading and trailing spaces
MDL>strText3

//let's add some tabs for fun
StringReplace>strText3,Line,%TAB%Line%TAB%,strText3

//lines have leading and trailing spaces and tabs
MDL>strText3

//Strip leading and trailing spaces and tabs from all lines
//(?m) turns on multi-line mode
Let>pattern=(?m)^[ \t]+|[ \t]+$
RegEx>pattern,strText3,0,matches,num,1,,strText3

//lines no longer have leading or trailing spaces or tabs
MDL>strText3
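
The same alternation pattern can be tried outside Macro Scheduler. Below is a minimal sketch in Python, assuming "\n" line breaks; the helper name trim_lines is made up for this illustration. It cross-checks the regex result against a plain per-line strip of spaces and tabs, and includes a non-ASCII line to show that only the outer whitespace is affected.

```python
import re

# Same alternation as above: a leading OR a trailing run of spaces/tabs.
TRIM = re.compile(r"^[ \t]+|[ \t]+$", re.MULTILINE)


def trim_lines(text: str) -> str:
    """Strip leading and trailing spaces/tabs from every line of text."""
    return TRIM.sub("", text)


sample = "     \tLine\t 1     \n     Line 2     \n  行 3  "
trimmed = trim_lines(sample)

# Cross-check against stripping the same two characters line by line.
expected = "\n".join(line.strip(" \t") for line in sample.split("\n"))
assert trimmed == expected

print(repr(trimmed))  # 'Line\t 1\nLine 2\n行 3'
```

Whitespace inside a line (such as the tab after "Line" in the first sample line) is left alone, because neither alternative can match in the middle of a line.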
I hope this was helpful... take care.
Last edited by jpuziano on Fri Apr 05, 2013 3:22 pm, edited 1 time in total.
jpuziano

Post by armsys » Fri Apr 05, 2013 4:24 am

