String Manipulation

Before version 8.0, Tcl was 8-bit clean (not limited to the 7-bit ASCII subset) and applied no interpretation to characters outside the ASCII subset. However, it stored strings with a terminating null (zero) character, so a string could not contain zero characters. To represent binary data in those versions, convert it to a form that includes no zero characters, for example by translating each byte to its hexadecimal value.

Note: As of Tcl 8.0, binary strings are fully supported; the restriction above applies only to versions before Tcl 8.0.
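
The following is a minimal sketch, assuming Tcl 8.0 or later, of how a string with an embedded zero character behaves and how the hexadecimal workaround mentioned above could be expressed with binary scan and binary format. The variable names are illustrative only.

```tcl
# A string with an embedded zero (NUL) character; assumes Tcl 8.0 or later.
set data "ab\0cd"
set len [string length $data]        ;# the NUL counts as one character

# Hexadecimal workaround: carry the bytes as hex digits, then convert back.
binary scan $data H* hex             ;# hex digits of each byte, high nibble first
set restored [binary format H* $hex]
set same [expr {[string compare $data $restored] == 0}]

puts "$len $hex $same"               ;# prints: 5 6162006364 1
```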

   

Glob-style pattern matching

Glob-style matching is the simplest form of Tcl pattern matching.

   

Syntax:

   

string match pattern string

   

string match returns 1 if the string matches the pattern and 0 if it does not. The pattern must match the entire string. The following special characters are used in patterns:

   

*        Matches any sequence of zero or more characters.
?        Matches any single character.
[chars]  Matches any single character in chars. If chars contains a sequence of the form a-b, any character between a and b inclusive will match.
\x       Matches the single character x. This provides a way to avoid special interpretation for any of the characters *?[]\ in the pattern.
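
As a short illustration of the special characters above, the sketch below runs string match on a few made-up inputs. The patterns are written in braces so that the brackets and backslash reach string match unchanged; the inputs and variable names are examples only.

```tcl
set r1 [string match {*.tcl} "main.tcl"]     ;# * matches "main"
set r2 [string match {a?c} "abc"]            ;# ? matches the single character b
set r3 [string match {[0-9]*} "7 dwarfs"]    ;# first character must be a digit
set r4 [string match {\*} "*"]               ;# \* matches a literal asterisk
set r5 [string match {abc} "xabcy"]          ;# fails: the whole string must match

puts "$r1 $r2 $r3 $r4 $r5"                   ;# prints: 1 1 1 1 0
```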

   

Pattern matching with regular expressions
   

Regular expression patterns can have several layers of structure. The basic building blocks are called atoms, and the simplest form of regular expression consists of one or more atoms. For a regular expression to match an input string, there must be a substring of the input where each of the regular expression's atoms (or other components) matches the corresponding part of the substring. For example, the regular expression abc matches any string containing abc, such as abcdef or xabcy.

   

For example, the following pattern matches any string that is either a hexadecimal number or a decimal number.

   

^((0x)?[0-9a-fA-F]+|[0-9]+)$

   

Syntax:

   

regexp ?-nocase? ?-indices? {pattern} input_string ?variable ...?

   

regexp returns 1 if there is a match and 0 if there is none.

   

Note: enclose the pattern in braces so that the characters $, [, and ] are passed through to the regexp command instead of triggering variable or command substitution.
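
A small sketch of the command in use, applying the hexadecimal-or-decimal pattern shown earlier to a few sample values. The sample values and the array name ok are illustrative only.

```tcl
# The pattern is braced so that $, [ and ] are not substituted by Tcl.
set pattern {^((0x)?[0-9a-fA-F]+|[0-9]+)$}

foreach value {0x1F 255 beef 12g} {
    set ok($value) [regexp $pattern $value]
}

puts "$ok(0x1F) $ok(255) $ok(beef) $ok(12g)"   ;# prints: 1 1 1 0
```

The last value fails because g is neither a decimal nor a hexadecimal digit, and the ^ and $ anchors require the whole string to match.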

   

If regexp is invoked with arguments after the input string, each argument is treated as the name of a variable. The first variable is filled in with the substring that matched the entire regular expression. The second variable is filled in with the portion of the substring that matched the leftmost parenthesized subexpression within the pattern; the third variable is filled in with the match for the next parenthesized subexpression, and so on. If there are more variable names than parenthesized subexpressions, the extra variables are set to empty strings.

   

Example:

   

regexp {([0-9]+) *([a-z]+)} "Walk 10 km" a b c

   

Variable a will have the value "10 km", b will have "10", and c will have "km".

   

The switch -nocase specifies to match without case sensitivity. The switch -indices specifies that the additional variables should not be filled in with the values of the matching substrings, but with a list giving the first and last indices of the substring's range within the input string.

   

Example:

   

regexp -indices {([0-9]+) *([a-z]+)} "Walk 10 km" a b c

   

Variable a will have the value "5 9", b will have "5 6", and c will have "8 9".
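
One way the index pairs might be used is to pull the matching text back out of the input with string range. The sketch below also shows -nocase; the variable names are chosen for illustration.

```tcl
set input "Walk 10 km"

# With -indices, each variable receives a two-element list: first and last index.
regexp -indices {([0-9]+) *([a-z]+)} $input whole number unit

set numberText [string range $input [lindex $number 0] [lindex $number 1]]
set unitText   [string range $input [lindex $unit 0] [lindex $unit 1]]

# -nocase lets the lowercase pattern match the capitalized word.
set found [regexp -nocase {walk} $input]

puts "$whole | $numberText | $unitText | $found"   ;# prints: 5 9 | 10 | km | 1
```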

   

Characters         Meaning
.                  Matches any single character.
^                  Matches the null string at the start of the input string.
$                  Matches the null string at the end of the input string.
\x                 Matches the character x.
[chars]            Matches any single character from chars. If the first character of chars is ^, the pattern matches any single character not in the remainder of chars. A sequence in the form of a-b in chars is treated as shorthand for all of the ASCII characters between a and b inclusive. If the first character in chars (possibly following a ^) is ], it is treated literally (as part of chars instead of a terminator). If a - appears first or last in chars, it is treated literally.
(regexp)           Matches anything that matches the regular expression regexp. Used for grouping and for identifying pieces of the matching substring.
*                  Matches a sequence of 0 or more matches of the preceding atom.
+                  Matches a sequence of 1 or more matches of the preceding atom.
?                  Matches either a null string or a match of the preceding atom.
regexp1 | regexp2  Matches anything that matches either regexp1 or regexp2.
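
To make the table concrete, here is a brief sketch that exercises several of these constructs with regexp. Each pattern is anchored with ^ and $ so that it must match the whole input; the inputs are made up for illustration.

```tcl
set r1 [regexp {^a.c$} "abc"]            ;# . matches b
set r2 [regexp {^[^0-9]+$} "abc"]        ;# [^0-9] matches any non-digit
set r3 [regexp {^ab*c$} "ac"]            ;# b* allows zero b characters
set r4 [regexp {^ab+c$} "ac"]            ;# b+ needs at least one b
set r5 [regexp {^colou?r$} "color"]      ;# u? makes the u optional
set r6 [regexp {^(cat|dog)s?$} "dogs"]   ;# grouping with alternation
set r7 [regexp {^3\.14$} "3x14"]         ;# \. matches only a literal dot

puts "$r1 $r2 $r3 $r4 $r5 $r6 $r7"       ;# prints: 1 1 1 0 1 1 0
```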

   

Syntax:
   

regsub ?-nocase? ?-all? pattern input_string replacement_value new_string

   

The first argument to regsub is the regular expression pattern. If a match is found in the input string, regsub returns a nonzero value (in current Tcl releases, the number of substitutions made, which is 1 unless -all is given); otherwise it returns 0. When the pattern matches, the matching substring of the input string is replaced by the third argument, and the new string is stored in the variable named by the fourth argument. If no match is found, that variable receives the original input string. Two switches can be used: -nocase is equivalent to the -nocase switch of the regexp command; -all causes every matching substring in the input string to be replaced.
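
The behavior described above can be sketched as follows. The input sentence and variable names are illustrative, and the return values shown assume a Tcl release in which regsub returns the number of substitutions made.

```tcl
set input "the cat sat on the mat"

set n1 [regsub {the} $input "a" once]         ;# replaces the first match only
set n2 [regsub -all {the} $input "a" every]   ;# replaces every match
set n3 [regsub {dog} $input "a" unchanged]    ;# no match: variable gets the input

puts "$n1: $once"        ;# prints: 1: a cat sat on the mat
puts "$n2: $every"       ;# prints: 2: a cat sat on a mat
puts "$n3: $unchanged"   ;# prints: 0: the cat sat on the mat
```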

   

Formatted output
   

The format command provides facilities like sprintf in ANSI C.

   

Example:

   

format "The square root of 10 is %.3f" [expr {sqrt(10)}]

=> The square root of 10 is 3.162

   

Other format specifiers:

   

Format  Meaning
%s      String
%d      Decimal integer
%f      Real number
%e      Real number in mantissa-exponent form
%x      Hexadecimal
%c      Character

The format command can also be used to change the representation of a value. For example, formatting an integer with %c generates the ASCII character represented by the integer.
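
A few of these specifiers in use, as a rough sketch with made-up values. The last three lines show format changing the representation of a value.

```tcl
set s1 [format "%s has %d items" cart 3]   ;# string and decimal integer
set s2 [format "%.2f" 3.14159]             ;# real number, two decimal places
set s3 [format "%x" 255]                   ;# integer shown in hexadecimal
set s4 [format "%d" 0x1F]                  ;# hexadecimal input shown in decimal
set s5 [format "%c" 65]                    ;# integer to its ASCII character

puts $s1                     ;# prints: cart has 3 items
puts "$s2 $s3 $s4 $s5"       ;# prints: 3.14 ff 31 A
```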

   

Parsing strings with scan

Syntax:

   

scan parse_string format_string ?variable ...?

   

Example:

   

scan "16 units, 24.2 margin" "%d units, %f" a b

=> 2
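
The value returned by scan is the number of conversions performed; the parsed values themselves land in the named variables. A minimal sketch that inspects them, using the same input as above:

```tcl
set count [scan "16 units, 24.2 margin" "%d units, %f" a b]

# a holds the integer parsed by %d, b the real number parsed by %f.
set next [expr {$a + 1}]

puts "$count $a $b $next"   ;# prints: 2 16 24.2 17
```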

   

Character functions
   

String manipulation commands are options of the string command.

   

string index "See Spot run." 5

=> p

   

string range "See Spot run." 4 7

=> Spot

   

string range "See Spot run." 4 end

=> Spot run.
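
Indices count from 0, and end refers to the last character. The sketch below, which assumes a Tcl version that accepts the end-N index form, shows a few more index expressions on the same sentence.

```tcl
set s "See Spot run."

set lastChar [string index $s end]          ;# last character
set head     [string range $s 0 2]          ;# first three characters
set tail     [string range $s end-3 end]    ;# last four characters

puts "$lastChar|$head|$tail"                ;# prints: .|See|run.
```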

   

Searching and comparison
   

Searching for a substring with first or last returns the position of the first character of the first or last occurrence of the substring (counting from 0 for the first character of the input string). It returns -1 if no match is found.

   

string first th "The trains were thirty minutes late this past week"

=> 16

   

string last th "The trains were thirty minutes late this past week"

=> 36

   

string compare returns 0 if the strings are equal, -1 if the first string sorts before the second, and 1 if the first string sorts after the second.

   

string compare twelve thirteen

=> 1

   

string compare twelve twelve

=> 0
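
The remaining cases, a failed search and a string that sorts earlier, can be sketched in the same way. The search text is the sentence used above; the other values are made up.

```tcl
set text "The trains were thirty minutes late this past week"

set firstPos [string first th $text]        ;# "The" is skipped: the search is case-sensitive
set lastPos  [string last th $text]         ;# start of "this"
set missing  [string first xyz $text]       ;# substring not present
set order    [string compare apple banana]  ;# apple sorts before banana

puts "$firstPos $lastPos $missing $order"   ;# prints: 16 36 -1 -1
```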

   

Length, case conversion, and trimming
   

string length "not too long"

=> 12

   

string toupper "Hello, World!"

=> HELLO, WORLD!

   

string tolower "You are lucky winner 13!"

=> you are lucky winner 13!

   

string trim abracadabra abr

=> cad

   

string trim takes a string to trim and an optional set of trim characters and removes all instances of the trim characters from both the beginning and the end of its argument string, returning the trimmed string as its result. The trimleft and trimright options work in the same way, except that they remove the trim characters only from the beginning or only from the end of the string. The trim commands are most commonly used to remove excess white space; if no trim characters are specified, they default to the white space characters (space, tab, newline, carriage return, and form feed).
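
A short sketch of the three trim options, first with the default white space set and then with an explicit trim character. The input strings are made up for illustration.

```tcl
set padded "  \t hello world \n"

set both  [string trim $padded]             ;# default: strips white space at both ends
set left  [string trimleft "xxhixx" x]      ;# strips x from the beginning only
set right [string trimright "xxhixx" x]     ;# strips x from the end only

puts "<$both> <$left> <$right>"             ;# prints: <hello world> <hixx> <xxhi>
```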

   


Copyright © 1998-2014

Deepak Kumar Tala - All rights reserved

Do you have any Comment? mail me at:deepak@asic-world.com